---
base_model:
- Moraliane/NekoMix-12B
library_name: exllamav2
tags:
- mergekit
- merge
- rp
- russian
- role-play
language:
- ru
- en
---
# NekoMix-12B-exl2

Original model: [NekoMix-12B](https://huggingface.co/Moraliane/NekoMix-12B) by [Moraliane](https://huggingface.co/Moraliane)

## Quants

[4bpw h6 (main)](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/main)
[4.5bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/4.5bpw-h6)
[5bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/5bpw-h6)
[6bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/6bpw-h6)
[8bpw h8](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/8bpw-h8)

## Quantization notes

Made with Exllamav2 0.2.8 with the default dataset. It appears to be primarily a Russian RP model; I haven't evaluated how it performs.

The model can be used with TabbyAPI or Text-Generation-WebUI, with an RTX GPU on Windows or RTX/ROCm on Linux. Exllamav2 doesn't support offloading to RAM, so make sure the model fits your GPU; otherwise use GGUF quants instead. For example, with 12GB VRAM it can be used at 6bpw with Q6 cache and 16k context.

# Original model card

# NekoMix-12B

![NekoMix-12B](./remix.webp)

# GGUF: https://huggingface.co/mradermacher/NekoMix-12B-GGUF
# GGUF imatrix: Soon...
# Presets: https://huggingface.co/Moraliane/NekoMix-12B/blob/main/pres/NekoMixRUS.json

![NekoMix-12B](./pres/import.png)
![NekoMix-12B](./pres/scr.png)

I also recommend using Mistral V3-Tekken as the Context Template and Instruct Template (!!debatable!!)
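As a rough sanity check for the VRAM guidance in the quantization notes above, the weight footprint of an EXL2 quant can be estimated from the parameter count and bits per weight. This is a back-of-envelope sketch (the ~12.2B parameter count and the formula are approximations; actual usage adds KV cache and runtime overhead on top):

```python
def exl2_weight_gib(n_params: float, bpw: float) -> float:
    """Approximate weight-only footprint of an EXL2 quant in GiB.

    Ignores KV cache, activations, and per-tensor overhead, so treat
    the result as a lower bound on required VRAM.
    """
    return n_params * bpw / 8 / 1024**3

# Mistral-Nemo-based 12B models have roughly 12.2B parameters.
print(f"6bpw weights: ~{exl2_weight_gib(12.2e9, 6.0):.1f} GiB")  # ~8.5 GiB
```

At ~8.5 GiB for the 6bpw weights, roughly 3.5 GiB remain on a 12GB card for the Q6 cache and overhead, which is consistent with the 16k-context example above.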
# Sampler:

![NekoMix-12B](./pres/samp.png)
![NekoMix-12B](./pres/topa.png)

To start, I recommend using the stock simple-1 preset, as well as Parameters_Top(A)Kek from https://huggingface.co/MarinaraSpaghetti/SillyTavern-Settings/tree/main/Parameters

```
Temp - 0.7–1.25
TopA - 0.1
DRY - 0.8 / 1.75 / 2 / 0
```

# Testmrg

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the della_linear merge method, using E:\Programs\TextGen\text-generation-webui\models\IlyaGusev_saiga_nemo_12b as the base.

### Models Merged

The following models were included in the merge:
* E:\Programs\TextGen\text-generation-webui\models\MarinaraSpaghetti_NemoMix-Unleashed-12B
* E:\Programs\TextGen\text-generation-webui\models\Vikhrmodels_Vikhr-Nemo-12B-Instruct-R-21-09-24
* E:\Programs\TextGen\text-generation-webui\models\TheDrummer_Rocinante-12B-v1.1

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: E:\Programs\TextGen\text-generation-webui\models\IlyaGusev_saiga_nemo_12b
    parameters:
      weight: 0.5 # Main emphasis on the Russian language
      density: 0.4
  - model: E:\Programs\TextGen\text-generation-webui\models\MarinaraSpaghetti_NemoMix-Unleashed-12B
    parameters:
      weight: 0.2 # RP model; slightly lower weight due to its English focus
      density: 0.4
  - model: E:\Programs\TextGen\text-generation-webui\models\TheDrummer_Rocinante-12B-v1.1
    parameters:
      weight: 0.2 # Increased weight to strengthen the RP aspects
      density: 0.5 # Higher density for stronger influence
  - model: E:\Programs\TextGen\text-generation-webui\models\Vikhrmodels_Vikhr-Nemo-12B-Instruct-R-21-09-24
    parameters:
      weight: 0.25 # Russian-language support and balance
      density: 0.4
merge_method: della_linear
base_model: E:\Programs\TextGen\text-generation-webui\models\IlyaGusev_saiga_nemo_12b
parameters:
  epsilon: 0.05
  lambda: 1
dtype: float16
tokenizer_source: base
```
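To reproduce a merge like this one, a configuration in the format above can be fed to mergekit's command-line tool. A minimal sketch, assuming mergekit is installed, the config is saved as `nekomix.yaml`, and the model paths are adjusted to your own machine (the config filename and output directory here are illustrative):

```shell
pip install mergekit

# Run the merge described by the YAML config; --cuda performs the
# merge arithmetic on the GPU if one is available.
mergekit-yaml nekomix.yaml ./NekoMix-12B-merge --cuda
```

The output directory will contain the merged weights plus the tokenizer copied from the base model (per `tokenizer_source: base`), ready for quantization or direct loading.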