---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: transformers
tags:
- rust
- rust-programming
- code-generation
- qlora
- lora
- peft
- llama
- meta-llama-3.1
- instruction-tuned
- text-generation
- sigilderg
datasets:
- ammarnasr/the-stack-rust-clean
license: mit
language:
- en
pipeline_tag: text-generation
model-index:
- name: llama8b-rust-qlora-phase1-step-2000
  results:
  - task:
      type: text-generation
    dataset:
      name: rust-code-evaluation
      type: code-generation
    metrics:
    - name: Compilation Rate
      type: compilation_rate
      value: 0.4545
    - name: Clippy Warnings (avg)
      type: clippy_warnings
      value: 0.0
    - name: Idiomatic Score
      type: idiomatic_score
      value: 0.1523
    - name: Documentation Rate
      type: doc_comment_rate
      value: 0.0
    - name: Avg Functions
      type: avg_functions
      value: 1.38
    - name: Avg Structs
      type: avg_structs
      value: 0.3091
    - name: Avg Traits
      type: avg_traits
      value: 0.1091
    - name: Test Rate
      type: test_rate
      value: 0.0
    - name: Prompt Match Score
      type: prompt_match
      value: 0.1746
    source:
      name: SigilDERG Evaluation
      url: https://github.com/Superuser666-Sigil/SigilDERG-Finetuner
---

# llama8b-rust-qlora-phase1 (checkpoint 2000 / 12000)

> This card describes **checkpoint 2000** of the Phase 1 Rust QLoRA run.
> For the full training plan, governance details, and final recommended checkpoints, see the **root model card** in the repository.

## Model Description

This is a QLoRA fine-tuned version of **meta-llama/Meta-Llama-3.1-8B-Instruct** trained on Rust code. The model uses 4-bit quantization with LoRA (Low-Rank Adaptation) adapters for memory-efficient training and inference.

The primary modality is **Rust code with English comments and explanations**.

## Training Details

### Training Configuration

- **Base Model**: `meta-llama/Meta-Llama-3.1-8B-Instruct`
- **Training Steps**: 2,000 / 12,000 (this checkpoint)
- **Learning Rate**: 9.468e-05 (peak)
- **Batch Size**: 16 × 4 (effective: 64)
- **Sequence Length**: 4096
- **Optimizer**: `paged_adamw_8bit`
- **LR Scheduler**: cosine
- **Warmup Steps**: 250
- **Weight Decay**: 0.0
- **Gradient Checkpointing**: True
- **BF16**: True

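The warmup-plus-cosine schedule above can be sketched numerically. This is a minimal stand-alone approximation, not the exact `transformers` scheduler implementation; the peak rate, warmup length, and 12,000-step horizon are taken from this card:

```python
import math

PEAK_LR = 9.468e-05   # peak learning rate from this card
WARMUP_STEPS = 250
TOTAL_STEPS = 12_000

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay toward zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

In this approximation, step 2,000 is still early in the decay, so the learning rate remains near its peak.
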
### LoRA Configuration

- **Rank (r)**: 16
- **Alpha**: 16
- **Dropout**: 0.05
- **Target Modules**: `q_proj, k_proj, v_proj, o_proj, up_proj, down_proj, gate_proj`

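These settings map directly onto a `peft` `LoraConfig`. A sketch of how the adapter would be declared; the actual training script lives in the SigilDERG-Finetuner repository and may differ in details:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                 # LoRA rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "up_proj", "down_proj", "gate_proj",
    ],
    task_type="CAUSAL_LM",
)
```
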
### Quantization

- **Method**: 4-bit NF4 (BitsAndBytes)
- **Compute Dtype**: bfloat16
- **Double Quantization**: True

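Expressed as a `transformers` `BitsAndBytesConfig`, the quantization settings above look like the following sketch:

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # NF4 4-bit quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,       # double quantization
)
```
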
### Datasets

The model was trained on the following dataset:

- `ammarnasr/the-stack-rust-clean`

**Dataset Configuration:**

- **Min Length**: 64
- **Max Length**: 200000
- **Exclude Tests**: True
- **Exclude Examples**: False
- **Exclude Benches**: True
- **Prefer Idiomatic**: False
- **Prefer Documented**: False

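The filters above amount to a per-file predicate. A hypothetical sketch follows; the path heuristics here are illustrative, and the actual filtering logic lives in the SigilDERG-Finetuner repository:

```python
def keep_file(path: str, text: str) -> bool:
    """Illustrative filter mirroring the dataset configuration above."""
    if not (64 <= len(text) <= 200_000):   # Min/Max Length bounds
        return False
    if "/tests/" in path or path.endswith("_test.rs"):
        return False                       # Exclude Tests: True
    if "/benches/" in path:
        return False                       # Exclude Benches: True
    return True                            # examples/ are kept (Exclude Examples: False)
```
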
## Training Metrics

Latest logged training metrics around this checkpoint:

- **loss**: 0.764700
- **grad_norm**: 0.125130
- **learning_rate**: 0.000095
- **entropy**: 0.780214
- **num_tokens**: 352,709,516
- **mean_token_accuracy**: 0.816762
- **epoch**: 0.614862
- **log_step**: 1,992
- **checkpoint_step**: 2,000
- **step**: 1,992

(Logging occurs every few steps, so `log_step` reflects the nearest logged step to the checkpoint.)

## Evaluation Results

- **Compilation Rate**: 45.45% (55 samples evaluated)
- **Average Clippy Warnings**: 0.00
- **Idiomatic Score**: 0.1523
- **Documentation Rate**: 0.00%
- **Test Rate**: 0.00%

**Functionality Coverage:**

- Average Functions: 1.38
- Average Structs: 0.31
- Average Traits: 0.11
- Average Impls: 0.16

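Metrics such as "Average Functions" count item definitions per generated sample. A rough regex-based stand-in is shown below; the actual evaluation harness in SigilDERG-Finetuner may count differently (for example, with a real Rust parser):

```python
import re

def count_items(rust_code: str) -> dict:
    """Naive regex counts of Rust item definitions in a generated sample."""
    return {
        "functions": len(re.findall(r"\bfn\s+\w+", rust_code)),
        "structs": len(re.findall(r"\bstruct\s+\w+", rust_code)),
        "traits": len(re.findall(r"\btrait\s+\w+", rust_code)),
        "impls": len(re.findall(r"\bimpl\b", rust_code)),
    }
```
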
**Detailed Evaluation Data:**

- [Metrics (JSONL)](https://huggingface.co/Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora/blob/main/checkpoint-2000/metrics.jsonl) - Full evaluation metrics
- [Error Logs (JSONL)](https://huggingface.co/Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora/blob/main/checkpoint-2000/errors.jsonl) - Compilation and runtime errors

*Evaluation completed: 2025-11-20T00:41:47*

## Governance and Intended Use

This checkpoint is part of the **SigilDERG** ecosystem and follows **Rule Zero** principles.

- Intended primarily for **Rust code generation, explanation, refactoring, and review**.
- Not intended as a general-purpose advisor for medical, legal, financial, or other high-stakes domains.

## Usage

### Loading the Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in bfloat16
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Attach the LoRA adapter from this checkpoint
model = PeftModel.from_pretrained(base_model, "out/llama8b-rust-qlora-phase1/checkpoint-2000")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
```

### Generation

```python
# Format prompt for the instruct model
messages = [
    {"role": "system", "content": "You are a helpful Rust programming assistant."},
    {"role": "user", "content": "Write a function that calculates fibonacci numbers"},
]

# Apply the Llama 3.1 chat template
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate (do_sample=True is required for temperature to take effect)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```

## Limitations

- This model is fine-tuned specifically for Rust code generation and may not perform well on other programming languages or general text tasks.
- The model inherits any limitations and biases of the base model.
- Generated code should always be reviewed and tested before use in production.

## Citation

If you use this model, please cite:

```bibtex
@software{sigilderg_finetuner,
  title  = {SigilDERG Rust Code Fine-tuned Model},
  author = {Superuser666-Sigil/Dave Tofflemire},
  year   = {2025},
  url    = {https://github.com/Superuser666-Sigil/SigilDERG-Finetuner}
}
```

## License

This model is released under the MIT License.