Multilingual is ruined

#8
by sovetboga - opened

In short, multilingual is ruined. I tested Russian in particular, and the model even started thinking in English. Almost half the words in the response were truncated; the rest were Chinese characters or got replaced with English words. So that's the cost of the pruning. The 2-bit quant has no such problems: UD-Q2_K_XL is better. I tested the Q4_K_S from bartowski.

Cerebras org

hey @sovetboga, our calibration data mix for pruning is focused on coding + tool calling, so multilingual capabilities might be affected. if you're interested in conversational Russian for this model, you can curate this specific data mix and prune the original GLM4.5-Air model using our code: https://github.com/CerebrasResearch/reap
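For anyone trying that, here's a minimal sketch of what curating such a mix could look like with the Hugging Face `datasets` library. The Russian dataset name and the 50/50 mixing ratio are placeholders, not anything the REAP code prescribes; the only real identifier below is the default calibration set:

```python
# Sketch: mix Russian conversational data with the default coding set,
# then push the result to the Hub so the pruning script can load it.
from datasets import load_dataset, concatenate_datasets

# the default REAP calibration set (coding-focused)
code = load_dataset("theblackcat102/evol-codealpaca-v1", split="train")

# placeholder: any Russian chat dataset converted to the same
# instruction/output schema as the default set
ru_chat = load_dataset("your-username/russian-chat", split="train")

# keep some coding/tool-calling data in the mix so those capabilities
# don't degrade the way multilingual did here
mix = concatenate_datasets([
    code.shuffle(seed=0).select(range(2000)),
    ru_chat.shuffle(seed=0).select(range(2000)),
]).shuffle(seed=0)

mix.push_to_hub("your-username/ru-code-calibration-mix")
```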

@lazarevich what kind of hardware is needed to run your tools on a model this size?

at least 2 x RTX 6000 Pro or 192 GB VRAM for a semi-production-grade deploy
the 4-bit version runs on 1 x RTX 6000 Pro
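Rough weights-only arithmetic behind those numbers (assuming the pruned checkpoint is around 82B parameters; KV cache, activations, and runtime overhead come on top):

```python
# Back-of-the-envelope VRAM needed for the weights alone.
# 82B parameters is an assumption about the pruned checkpoint.
params_b = 82  # billions of parameters

for fmt, bytes_per_param in [("bf16", 2.0), ("fp8", 1.0), ("4-bit", 0.5)]:
    print(f"{fmt}: ~{params_b * bytes_per_param:.0f} GB of weights")

# bf16:  ~164 GB -> 2 x 96 GB RTX 6000 Pro (192 GB total)
# 4-bit: ~41 GB  -> fits on a single 96 GB card with headroom for KV cache
```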

@lazarevich in order to prune it for a specific topic, are you referring to this part in the readme:

DATASET_NAME: (default: theblackcat102/evol-codealpaca-v1)

So basically you would replace that with a dataset on a different topic, right?
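If that's how it works, it's probably worth a quick sanity check that the replacement dataset exposes the same columns as the default before pointing DATASET_NAME at it (the replacement name below is a placeholder):

```python
# Check that a candidate calibration dataset matches the default schema
# before swapping DATASET_NAME over to it. The candidate name is made up.
from datasets import load_dataset

default = load_dataset("theblackcat102/evol-codealpaca-v1", split="train")
candidate = load_dataset("your-username/russian-chat", split="train")

missing = set(default.column_names) - set(candidate.column_names)
if missing:
    raise ValueError(f"candidate dataset is missing columns: {missing}")
print("schema matches:", default.column_names)
```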
