	---
	base_model:
	- Moraliane/NekoMix-12B
	library_name: exllamav2
	tags:
	- mergekit
	- merge
	- rp
	- russian
	- role-play
	language:
	- ru
	- en
	---
	# NekoMix-12B-exl2
	Original model: [NekoMix-12B](https://huggingface.co/Moraliane/NekoMix-12B) by [Moraliane](https://huggingface.co/Moraliane)

	## Quants
	* [4bpw h6 (main)](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/main)
	* [4.5bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/4.5bpw-h6)
	* [5bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/5bpw-h6)
	* [6bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/6bpw-h6)
	* [8bpw h8](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/8bpw-h8)
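
Each quant lives on its own branch of this repository, so a specific one can be fetched by passing the branch name as the revision. The sketch below assumes the `huggingface_hub` package is installed; `branch_for` and `download_quant` are hypothetical helper names introduced only for this example.

```python
def branch_for(bpw: str) -> str:
    """Map a bits-per-weight label to the branch listed under Quants."""
    branches = {
        "4": "main",
        "4.5": "4.5bpw-h6",
        "5": "5bpw-h6",
        "6": "6bpw-h6",
        "8": "8bpw-h8",
    }
    return branches[bpw]


def download_quant(bpw: str, local_dir: str) -> str:
    """Download one quant branch and return the local folder path."""
    # Imported lazily so branch_for() works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="cgus/NekoMix-12B-exl2",
        revision=branch_for(bpw),
        local_dir=local_dir,
    )


# Example (needs network access; the target folder name is an assumption):
# download_quant("6", "models/NekoMix-12B-exl2-6bpw")
```

Point TabbyAPI or Text-Generation-WebUI at the resulting folder.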

	## Quantization notes
	Quantized with ExLlamaV2 0.2.8 using the default calibration dataset.
	It appears to be primarily a Russian role-play model. I haven't evaluated how well it performs.
	It can be run with TabbyAPI or Text-Generation-WebUI, on an Nvidia RTX GPU under Windows or on RTX/ROCm under Linux.
	ExLlamaV2 doesn't support offloading to system RAM, so the model must fit entirely in VRAM. If it doesn't, use GGUF quants instead.
	For example, with 12GB of VRAM the 6bpw quant can be used with Q6 cache at 16k context.

	Эта модель может использоваться с TabbyAPI или Text-Generation-WebUI.
	Для работы с ней требуется Nvidia RTX (Windows) или RTX/ROCm (Linux).
	Exl2 формат требует, чтобы модель полностью помещалась в видеопамяти.
	Например, с 12ГБ видеопамяти можно использовать 6bpw версию с Q6 кэшем с 16k контекстом.
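
As a rough guide for picking a quant, the size of the weights can be estimated from the bits-per-weight figure. This is a back-of-the-envelope sketch, not a measurement: it assumes "12B" means 12 billion parameters, and it ignores the KV cache, the higher-precision head layer (h6/h8), and runtime overhead, all of which need additional VRAM.

```python
def weights_gib(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    bytes_total = n_params * bpw / 8  # bits per weight -> bytes
    return bytes_total / 1024**3


# Assumption: the "12B" in the model name is taken as 12e9 parameters.
ASSUMED_PARAMS = 12e9

for bpw in (4.0, 4.5, 5.0, 6.0, 8.0):
    print(f"{bpw}bpw: ~{weights_gib(ASSUMED_PARAMS, bpw):.1f} GiB of weights")
```

Under these assumptions the 6bpw weights come to a little under 8.4 GiB, which leaves room for a quantized cache on a 12GB card, while the 8bpw weights alone are already above 11 GiB.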

	# Original model card
	# NekoMix-12B

	![NekoMix-12B](./remix.webp)
	# GGUF:
	https://huggingface.co/mradermacher/NekoMix-12B-GGUF

	# GGUF imatrix:
	Soon...

	# Presets:
	https://huggingface.co/Moraliane/NekoMix-12B/blob/main/pres/NekoMixRUS.json
	![NekoMix-12B](./pres/import.png)
	![NekoMix-12B](./pres/scr.png)
	I also recommend using Mistral V3-Tekken as both the Context Template and the Instruct Template (this is debatable!).
	# Sampler:
	![NekoMix-12B](./pres/samp.png)

	![NekoMix-12B](./pres/topa.png)


	To start, I recommend the stock simple-1 preset from SillyTavern, as well as Parameters_Top(A)Kek from https://huggingface.co/MarinaraSpaghetti/SillyTavern-Settings/tree/main/Parameters

	```
	Temp - 0.7 - 1.25 ~
	TopA - 0.1
	DRY - 0.8 1.75 2 0
	```
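
For readers setting these values by hand, the sampler line above can be written out as named fields. This is only a sketch: the key names are hypothetical labels, not the schema of any particular backend, and reading the four DRY numbers as multiplier, base, allowed length, and penalty range is an assumption based on the order of the DRY fields in SillyTavern.

```python
# Hypothetical key names; adapt them to whatever your frontend or backend expects.
SAMPLER = {
    "temperature_range": (0.7, 1.25),  # "Temp - 0.7 - 1.25 ~"
    "top_a": 0.1,                      # "TopA - 0.1"
    # "DRY - 0.8 1.75 2 0", field order assumed:
    "dry_multiplier": 0.8,
    "dry_base": 1.75,
    "dry_allowed_length": 2,
    "dry_penalty_range": 0,
}

for name, value in SAMPLER.items():
    print(f"{name}: {value}")
```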





	# Testmrg

	This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

	## Merge Details
	### Merge Method

	This model was merged using the della_linear merge method, with E:\Programs\TextGen\text-generation-webui\models\IlyaGusev_saiga_nemo_12b as the base.

	### Models Merged

	The following models were included in the merge:
	* E:\Programs\TextGen\text-generation-webui\models\MarinaraSpaghetti_NemoMix-Unleashed-12B
	* E:\Programs\TextGen\text-generation-webui\models\Vikhrmodels_Vikhr-Nemo-12B-Instruct-R-21-09-24
	* E:\Programs\TextGen\text-generation-webui\models\TheDrummer_Rocinante-12B-v1.1

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	models:
	  - model: E:\Programs\TextGen\text-generation-webui\models\IlyaGusev_saiga_nemo_12b
	    parameters:
	      weight: 0.5 # Main emphasis on the Russian language
	      density: 0.4
	  - model: E:\Programs\TextGen\text-generation-webui\models\MarinaraSpaghetti_NemoMix-Unleashed-12B
	    parameters:
	      weight: 0.2 # RP model; slightly lower weight because it is English-oriented
	      density: 0.4
	  - model: E:\Programs\TextGen\text-generation-webui\models\TheDrummer_Rocinante-12B-v1.1
	    parameters:
	      weight: 0.2 # Increased weight to strengthen the RP aspects
	      density: 0.5 # Higher density for a stronger influence
	  - model: E:\Programs\TextGen\text-generation-webui\models\Vikhrmodels_Vikhr-Nemo-12B-Instruct-R-21-09-24
	    parameters:
	      weight: 0.25 # Russian-language support and balance
	      density: 0.4

	merge_method: della_linear
	base_model: E:\Programs\TextGen\text-generation-webui\models\IlyaGusev_saiga_nemo_12b
	parameters:
	  epsilon: 0.05
	  lambda: 1
	dtype: float16
	tokenizer_source: base

	```
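
The per-model weights in the configuration above add up to more than 1. The sketch below only does the arithmetic to show each model's relative share. Whether mergekit rescales the weights for this merge method is not stated in this card, so the normalized shares are illustrative rather than a description of what the merge actually did.

```python
# Weights copied from the configuration above (model folder names shortened).
weights = {
    "IlyaGusev_saiga_nemo_12b": 0.5,
    "MarinaraSpaghetti_NemoMix-Unleashed-12B": 0.2,
    "TheDrummer_Rocinante-12B-v1.1": 0.2,
    "Vikhrmodels_Vikhr-Nemo-12B-Instruct-R-21-09-24": 0.25,
}

total = sum(weights.values())

# Relative share of each model if the weights were normalized to sum to 1.
shares = {name: w / total for name, w in weights.items()}

print(f"sum of weights: {total:.2f}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```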