---
base_model: Qwen/Qwen2.5-Coder-14B-Instruct
abliterated_from: huihui-ai/Qwen2.5-Coder-14B-Instruct-abliterated
quantization: nf4
license: apache-2.0
library_name: transformers
tags:
- code
- programming
- uncensored
- 4bit
- nf4
- qwen
- abliteration
- instruct
---

# Qwen2.5-Coder-14B-Instruct-Abliterated-NF4

**A 4-bit (NF4) quantized, abliteration-uncensored version of Qwen2.5-Coder-14B-Instruct**

This model is a **pre-quantized 4-bit NormalFloat4 (NF4)** version of the uncensored [huihui-ai/Qwen2.5-Coder-14B-Instruct-abliterated](https://huggingface.co/huihui-ai/Qwen2.5-Coder-14B-Instruct-abliterated), optimized for **low VRAM** and **fast local inference**.

Ideal for **local deployment**, **edge devices**, and other **low-VRAM environments**, while maintaining strong coding and reasoning capabilities.

---

## 🚀 Features

- **Base Model**: `Qwen/Qwen2.5-Coder-14B-Instruct`
- **Abliteration**: [huihui-ai](https://huggingface.co/huihui-ai) (uncensored)
- **Quantization**: **NF4 (4-bit)**, **pre-quantized** (no on-the-fly quantization needed; see the sketch below)
- **Efficient**: ~8–9 GB of VRAM required for inference
- **Safetensors**: secure, modern weight format
- **Framework**: compatible with `transformers`, `vLLM`, `Oobabooga`, etc.
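
Because the weights are stored pre-quantized, no quantization arguments are needed at load time. For reference, here is a minimal sketch of how an equivalent NF4 checkpoint can be produced from the original abliterated model with `bitsandbytes`; the double-quantization and compute-dtype settings are assumptions, not necessarily the exact ones used for this repo.

```python
# Sketch: reproducing an NF4 quantization of the abliterated base model.
# Requires the bitsandbytes package; the exact settings here are assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls at runtime
)

model = AutoModelForCausalLM.from_pretrained(
    "huihui-ai/Qwen2.5-Coder-14B-Instruct-abliterated",
    quantization_config=bnb_config,
    device_map="auto",
)

# Serializing 4-bit weights requires a recent transformers/bitsandbytes.
model.save_pretrained("Qwen2.5-Coder-14B-Instruct-Abliterated-NF4")
```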

---

## 📥 Installation & Usage

### 1. Install Dependencies

```bash
# bitsandbytes is required to load the pre-quantized NF4 weights
pip install transformers torch accelerate bitsandbytes
```

### 2. Load the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "ikarius/Qwen2.5-Coder-14B-Instruct-Abliterated-NF4"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,  # or torch.float16
    trust_remote_code=True,
)
```
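
To sanity-check the ~8–9 GB figure from the Features list on your own hardware, `transformers` can report the model's memory footprint directly (a small sketch, reusing the `model` loaded above):

```python
# Prints the memory occupied by the model's parameters and buffers, in GB.
print(f"Memory footprint: {model.get_memory_footprint() / 1024**3:.2f} GB")
```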

### 3. Run an Example

```python
prompt = "Write a Python function to calculate Fibonacci numbers using memoization."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
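
Since this is an Instruct model, Qwen2.5's chat template usually gives better results than a raw prompt string. A minimal sketch, reusing the `model` and `tokenizer` from above:

```python
# Sketch: prompting through the chat template instead of a raw string.
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to calculate Fibonacci numbers using memoization."},
]

# Renders the conversation with the model's chat template and appends the
# assistant turn marker so the model starts answering.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```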

---

### Using with text-generation-webui (Oobabooga)

```bash
python server.py \
  --model ikarius/Qwen2.5-Coder-14B-Instruct-Abliterated-NF4 \
  --bf16 \
  --trust-remote-code
```
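
The Features list also mentions `vLLM`; a minimal sketch of offline inference with vLLM's bitsandbytes loader follows. This assumes a vLLM build with bitsandbytes support, and the `quantization`/`load_format` arguments have changed across vLLM releases, so check the docs for your version.

```python
# Sketch: offline inference with vLLM (argument names vary by release).
from vllm import LLM, SamplingParams

llm = LLM(
    model="ikarius/Qwen2.5-Coder-14B-Instruct-Abliterated-NF4",
    quantization="bitsandbytes",   # load the pre-quantized NF4 weights
    load_format="bitsandbytes",
    trust_remote_code=True,
)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(["Write a Python function to reverse a linked list."], params)
print(outputs[0].outputs[0].text)
```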

---

## 🤝 Credits

- **Original Model**: Qwen Team
- **Abliteration**: huihui-ai
- **Quantization**: ikarius (NF4 via `bitsandbytes` / `transformers`)
- **Hosting**: Hugging Face Hub

---

## 📄 License

Apache 2.0 (same as the base model)

---

## 🙌 Support

- ⭐ Star this repo if you find it useful!
- 🐛 Report issues on the Discussions tab.