---
base_model:
- Moraliane/NekoMix-12B
library_name: exllamav2
tags:
- mergekit
- merge
- rp
- russian
- role-play
language:
- ru
- en
---
# NekoMix-12B-exl2

Original model: [NekoMix-12B](https://huggingface.co/Moraliane/NekoMix-12B) by [Moraliane](https://huggingface.co/Moraliane)

## Quants

[4bpw h6 (main)](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/main)
[4.5bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/4.5bpw-h6)
[5bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/5bpw-h6)
[6bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/6bpw-h6)
[8bpw h8](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/8bpw-h8)

## Quantization notes

Made with Exllamav2 0.2.8 with the default dataset. It appears to be primarily a Russian RP model; I haven't evaluated how it performs.

The model can be used with TabbyAPI or Text-Generation-WebUI, with an RTX GPU on Windows or RTX/ROCm on Linux. Exllamav2 doesn't support offloading to RAM, so make sure the model fits your GPU; otherwise use GGUF quants instead. For example, with 12GB VRAM it can be used at 6bpw with Q6 cache and 16k context.

# Original model card

# NekoMix-12B

![NekoMix-12B](./remix.webp)

# GGUF: https://huggingface.co/mradermacher/NekoMix-12B-GGUF
# GGUF imatrix: Soon...
# Presets: https://huggingface.co/Moraliane/NekoMix-12B/blob/main/pres/NekoMixRUS.json

![NekoMix-12B](./pres/import.png)
![NekoMix-12B](./pres/scr.png)

I also recommend using Mistral V3-Tekken as the Context Template and Instruct Template (!!debatable!!)
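As a rough sanity check for the VRAM guidance in the quantization notes above, the weight footprint of an EXL2 quant can be estimated from the parameter count and bits per weight. This is a back-of-envelope sketch (the ~12.2B parameter count and the formula are approximations; actual usage adds KV cache and runtime overhead on top):

```python
def exl2_weight_gib(n_params: float, bpw: float) -> float:
    """Approximate weight-only footprint of an EXL2 quant in GiB.

    Ignores KV cache, activations, and per-tensor overhead, so treat
    the result as a lower bound on required VRAM.
    """
    return n_params * bpw / 8 / 1024**3

# Mistral-Nemo-based 12B models have roughly 12.2B parameters.
print(f"6bpw weights: ~{exl2_weight_gib(12.2e9, 6.0):.1f} GiB")  # ~8.5 GiB
```

At ~8.5 GiB for the 6bpw weights, roughly 3.5 GiB remain on a 12GB card for the Q6 cache and overhead, which is consistent with the 16k-context example above.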
# Sampler:

![NekoMix-12B](./pres/samp.png)
![NekoMix-12B](./pres/topa.png)

To start, I recommend using the stock simple-1 preset, as well as Parameters_Top(A)Kek from https://huggingface.co/MarinaraSpaghetti/SillyTavern-Settings/tree/main/Parameters

```
Temp - 0.7–1.25
TopA - 0.1
DRY - 0.8 / 1.75 / 2 / 0
```

# Testmrg

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the della_linear merge method, using E:\Programs\TextGen\text-generation-webui\models\IlyaGusev_saiga_nemo_12b as the base.

### Models Merged

The following models were included in the merge:
* E:\Programs\TextGen\text-generation-webui\models\MarinaraSpaghetti_NemoMix-Unleashed-12B
* E:\Programs\TextGen\text-generation-webui\models\Vikhrmodels_Vikhr-Nemo-12B-Instruct-R-21-09-24
* E:\Programs\TextGen\text-generation-webui\models\TheDrummer_Rocinante-12B-v1.1

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: E:\Programs\TextGen\text-generation-webui\models\IlyaGusev_saiga_nemo_12b
    parameters:
      weight: 0.5 # Main emphasis on the Russian language
      density: 0.4
  - model: E:\Programs\TextGen\text-generation-webui\models\MarinaraSpaghetti_NemoMix-Unleashed-12B
    parameters:
      weight: 0.2 # RP model; slightly lower weight due to its English focus
      density: 0.4
  - model: E:\Programs\TextGen\text-generation-webui\models\TheDrummer_Rocinante-12B-v1.1
    parameters:
      weight: 0.2 # Increased weight to strengthen the RP aspects
      density: 0.5 # Higher density for stronger influence
  - model: E:\Programs\TextGen\text-generation-webui\models\Vikhrmodels_Vikhr-Nemo-12B-Instruct-R-21-09-24
    parameters:
      weight: 0.25 # Russian-language support and balance
      density: 0.4
merge_method: della_linear
base_model: E:\Programs\TextGen\text-generation-webui\models\IlyaGusev_saiga_nemo_12b
parameters:
  epsilon: 0.05
  lambda: 1
dtype: float16
tokenizer_source: base
```
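To reproduce a merge like this one, a configuration in the format above can be fed to mergekit's command-line tool. A minimal sketch, assuming mergekit is installed, the config is saved as `nekomix.yaml`, and the model paths are adjusted to your own machine (the config filename and output directory here are illustrative):

```shell
pip install mergekit

# Run the merge described by the YAML config; --cuda performs the
# merge arithmetic on the GPU if one is available.
mergekit-yaml nekomix.yaml ./NekoMix-12B-merge --cuda
```

The output directory will contain the merged weights plus the tokenizer copied from the base model (per `tokenizer_source: base`), ready for quantization or direct loading.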