---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: transformers
tags:
- rust
- rust-programming
- code-generation
- qlora
- lora
- peft
- llama
- meta-llama-3.1
- instruction-tuned
- text-generation
- sigilderg
datasets:
- ammarnasr/the-stack-rust-clean
license: mit
language:
- en
pipeline_tag: text-generation
model-index:
- name: llama8b-rust-qlora-phase1-step-2000
results:
- task:
type: text-generation
dataset:
name: rust-code-evaluation
type: code-generation
metrics:
- name: Compilation Rate
type: compilation_rate
value: 0.4545
- name: Clippy Warnings (avg)
type: clippy_warnings
value: 0.0
- name: Idiomatic Score
type: idiomatic_score
value: 0.1523
- name: Documentation Rate
type: doc_comment_rate
value: 0.0
- name: Avg Functions
type: avg_functions
value: 1.38
- name: Avg Structs
type: avg_structs
value: 0.3091
- name: Avg Traits
type: avg_traits
value: 0.1091
- name: Test Rate
type: test_rate
value: 0.0
- name: Prompt Match Score
type: prompt_match
value: 0.1746
source:
name: SigilDERG Evaluation
url: https://github.com/Superuser666-Sigil/SigilDERG-Finetuner
---
# llama8b-rust-qlora-phase1 (checkpoint 2000 / 12000)
> This card describes **checkpoint 2000** of the Phase 1 Rust QLoRA run.
> For the full training plan, governance details, and final recommended checkpoints, see the **root model card** in the repository.
## Model Description
This is a QLoRA fine-tuned version of **meta-llama/Meta-Llama-3.1-8B-Instruct** specifically trained on Rust code. The model uses 4-bit quantization with LoRA (Low-Rank Adaptation) adapters for efficient training and inference.
The primary modality is **Rust code with English comments and explanations**.
## Training Details
### Training Configuration
- **Base Model**: `meta-llama/Meta-Llama-3.1-8B-Instruct`
- **Training Steps**: 2,000 / 12,000 (this checkpoint)
- **Learning Rate**: ≈9.47e-05 (peak)
- **Batch Size**: 16 × 4 (effective: 64)
- **Sequence Length**: 4096
- **Optimizer**: `paged_adamw_8bit`
- **LR Scheduler**: cosine
- **Warmup Steps**: 250
- **Weight Decay**: 0.0
- **Gradient Checkpointing**: True
- **BF16**: True
### LoRA Configuration
- **Rank (r)**: 16
- **Alpha**: 16
- **Dropout**: 0.05
- **Target Modules**: `q_proj, k_proj, v_proj, o_proj, up_proj, down_proj, gate_proj`
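Assuming the standard PEFT API, the settings above correspond roughly to the following `LoraConfig`. This is a sketch for orientation, not the exact training script (which lives in the SigilDERG-Finetuner repository); `bias` and `task_type` are typical defaults assumed here.

```python
from peft import LoraConfig

# Sketch of a LoraConfig matching the values listed above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "up_proj", "down_proj", "gate_proj",
    ],
    bias="none",            # assumed default
    task_type="CAUSAL_LM",  # assumed default for causal LM fine-tuning
)
```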
### Quantization
- **Method**: 4-bit NF4 (BitsAndBytes)
- **Compute Dtype**: bfloat16
- **Double Quantization**: True
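In `transformers`/`bitsandbytes` terms, the quantization settings above map to roughly this `BitsAndBytesConfig` (a sketch of the stated config, not necessarily the exact object used in training):

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with bfloat16 compute and double quantization,
# matching the values listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
```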
### Datasets
The model was trained on the following dataset:
- `ammarnasr/the-stack-rust-clean`
**Dataset Configuration:**
- **Min Length**: 64
- **Max Length**: 200000
- **Exclude Tests**: True
- **Exclude Examples**: False
- **Exclude Benches**: True
- **Prefer Idiomatic**: False
- **Prefer Documented**: False
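The actual preprocessing code is in the SigilDERG-Finetuner repository; a minimal sketch of the length and path filters described above might look like the following. The function name and record fields (`content`, `path`, following the-stack conventions) are illustrative assumptions, not the repo's API.

```python
# Illustrative sketch of the dataset filters listed above; field names
# ("content", "path") follow the-stack conventions but are assumptions here.
MIN_LEN, MAX_LEN = 64, 200_000

def keep_sample(sample: dict) -> bool:
    """Return True if a Rust source sample passes the Phase 1 filters."""
    content = sample["content"]
    path = sample.get("path", "")
    # Min Length: 64, Max Length: 200000
    if not (MIN_LEN <= len(content) <= MAX_LEN):
        return False
    # Exclude Tests: True, Exclude Benches: True, Exclude Examples: False
    if "/tests/" in path or "/benches/" in path:
        return False
    return True
```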
## Training Metrics
Latest logged training metrics around this checkpoint:
- **loss**: 0.764700
- **grad_norm**: 0.125130
- **learning_rate**: 0.000095
- **entropy**: 0.780214
- **num_tokens**: 352709516
- **mean_token_accuracy**: 0.816762
- **epoch**: 0.614862
- **log_step**: 1,992
- **checkpoint_step**: 2,000
- **step**: 1,992
(Logging is done every few steps, so `log_step` reflects the nearest logged step to the checkpoint.)
## Evaluation Results
- **Compilation Rate**: 45.45% (55 samples evaluated)
- **Average Clippy Warnings**: 0.00
- **Idiomatic Score**: 0.1523
- **Documentation Rate**: 0.00%
- **Test Rate**: 0.00%
**Functionality Coverage:**
- Average Functions: 1.38
- Average Structs: 0.31
- Average Traits: 0.11
- Average Impls: 0.16
**Detailed Evaluation Data:**
- [Metrics (JSONL)](https://huggingface.co/Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora/blob/main/checkpoint-2000/metrics.jsonl) - Full evaluation metrics
- [Error Logs (JSONL)](https://huggingface.co/Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora/blob/main/checkpoint-2000/errors.jsonl) - Compilation and runtime errors
*Evaluation completed: 2025-11-20T00:41:47.995913*
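Aggregate rates such as the compilation rate above can be recomputed directly from `metrics.jsonl`. The sketch below assumes one JSON object per line with a boolean `compiled` field; check the file itself for the actual schema, as the field name is an assumption here.

```python
import json

def compilation_rate(jsonl_path: str) -> float:
    """Fraction of evaluated samples whose generated code compiled.

    Assumes one JSON object per line with a boolean "compiled" field;
    consult metrics.jsonl for the actual schema.
    """
    with open(jsonl_path) as f:
        records = [json.loads(line) for line in f if line.strip()]
    return sum(r["compiled"] for r in records) / len(records)
```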
## Governance and Intended Use
This checkpoint is part of the **SigilDERG** ecosystem and follows **Rule Zero** principles.
- Intended primarily for **Rust code generation, explanation, refactoring, and review**.
- Not intended as a general-purpose advisor for medical, legal, financial, or other high-stakes domains.
## Usage
### Loading the Model
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in bfloat16
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Attach the LoRA adapter from this checkpoint
model = PeftModel.from_pretrained(base_model, "out/llama8b-rust-qlora-phase1/checkpoint-2000")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
```
### Generation
```python
# Format the prompt for the instruct model
messages = [
    {"role": "system", "content": "You are a helpful Rust programming assistant."},
    {"role": "user", "content": "Write a function that calculates fibonacci numbers"},
]

# Apply the Llama 3.1 chat template
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate (temperature only takes effect when sampling is enabled)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
```
## Limitations
- This model is fine-tuned specifically for Rust code generation and may not perform well on other programming languages or general text tasks.
- The model inherits any limitations and biases from the base model.
- Generated code should always be reviewed and tested before use in production.
## Citation
If you use this model, please cite:
```bibtex
@software{sigilderg_finetuner,
title = {SigilDERG Rust Code Fine-tuned Model},
author = {Tofflemire, Dave (Superuser666-Sigil)},
year = {2025},
url = {https://github.com/Superuser666-Sigil/SigilDERG-Finetuner}
}
```
## License
This model is released under the MIT License.