---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: transformers
tags:
  - rust
  - rust-programming
  - code-generation
  - qlora
  - lora
  - peft
  - llama
  - meta-llama-3.1
  - instruction-tuned
  - text-generation
  - sigilderg
datasets:
  - ammarnasr/the-stack-rust-clean
license: mit
language:
  - en
pipeline_tag: text-generation
model-index:
  - name: llama8b-rust-qlora-phase1-step-2000
    results:
      - task:
          type: text-generation
        dataset:
          name: rust-code-evaluation
          type: code-generation
        metrics:
          - name: Compilation Rate
            type: compilation_rate
            value: 0.4545
          - name: Clippy Warnings (avg)
            type: clippy_warnings
            value: 0
          - name: Idiomatic Score
            type: idiomatic_score
            value: 0.1523
          - name: Documentation Rate
            type: doc_comment_rate
            value: 0
          - name: Avg Functions
            type: avg_functions
            value: 1.38
          - name: Avg Structs
            type: avg_structs
            value: 0.3091
          - name: Avg Traits
            type: avg_traits
            value: 0.1091
          - name: Test Rate
            type: test_rate
            value: 0
          - name: Prompt Match Score
            type: prompt_match
            value: 0.1746
        source:
          name: SigilDERG Evaluation
          url: https://github.com/Superuser666-Sigil/SigilDERG-Finetuner
---

# llama8b-rust-qlora-phase1 (checkpoint 2000 / 12000)

This card describes checkpoint 2000 of the Phase 1 Rust QLoRA run.
For the full training plan, governance details, and final recommended checkpoints, see the root model card in the repository.

## Model Description

This is a QLoRA fine-tune of meta-llama/Meta-Llama-3.1-8B-Instruct trained specifically on Rust code. The frozen base weights are quantized to 4-bit, and LoRA (Low-Rank Adaptation) adapters are trained on top, keeping both training and inference memory-efficient.

The primary modality is Rust code with English comments and explanations.

## Training Details

### Training Configuration

- Base Model: meta-llama/Meta-Llama-3.1-8B-Instruct
- Training Steps: 2,000 / 12,000 (this checkpoint)
- Learning Rate: 9.468e-05 (peak)
- Batch Size: 16 × 4 (effective: 64)
- Sequence Length: 4096
- Optimizer: paged_adamw_8bit
- LR Scheduler: cosine
- Warmup Steps: 250
- Weight Decay: 0.0
- Gradient Checkpointing: True
- BF16: True
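
As a sanity check on these numbers, the token budget implied by the batch and sequence settings can be computed directly. This is a minimal sketch; it assumes every sequence filled the full 4096-token context, which real samples do not, so the logged token count comes in below this upper bound.

```python
# Upper bound on tokens processed by step 2000, assuming every
# sequence were packed to the full 4096-token context (an assumption;
# real samples are shorter, so the logged count is lower).
effective_batch = 16 * 4          # per-step batch after accumulation
seq_len = 4096
steps = 2000

max_tokens = steps * effective_batch * seq_len   # 524,288,000

# The run logged ~352.7M tokens at this step, which implies an
# average of roughly 2,756 tokens per training sequence:
logged_tokens = 352_709_516
avg_seq_len = logged_tokens / (steps * effective_batch)
print(max_tokens, round(avg_seq_len))
```

The gap between the bound and the logged count is expected whenever samples are not padded or packed to the maximum length.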

### LoRA Configuration

- Rank (r): 16
- Alpha: 16
- Dropout: 0.05
- Target Modules: q_proj, k_proj, v_proj, o_proj, up_proj, down_proj, gate_proj
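
A rough count of the trainable adapter parameters follows from this configuration and the published Llama-3.1-8B projection shapes (hidden size 4096, MLP intermediate size 14336, grouped-query KV dimension 1024, 32 layers). A sketch of the arithmetic, under those assumed shapes:

```python
# Estimate trainable LoRA parameters for r=16 on the listed target
# modules, using Llama-3.1-8B's projection shapes (hidden=4096,
# intermediate=14336, kv dim=1024 with GQA, 32 decoder layers).
r = 16
layers = 32
shapes = {  # (in_features, out_features) per target module
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 14336),
    "up_proj": (4096, 14336),
    "down_proj": (14336, 4096),
}
# Each adapter adds an (in x r) A matrix and an (r x out) B matrix.
per_layer = sum(r * (i + o) for i, o in shapes.values())
total = per_layer * layers
print(f"{total:,} trainable parameters")  # ~41.9M, well under 1% of 8B
```

This is why QLoRA checkpoints stay small: only the adapter matrices are saved and updated.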

### Quantization

- Method: 4-bit NF4 (BitsAndBytes)
- Compute Dtype: bfloat16
- Double Quantization: True
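
Back-of-envelope memory for the quantized base weights, assuming roughly 8.03B parameters and the ~0.127 extra bits per parameter that double quantization adds for block-wise scaling constants (figures from the QLoRA paper; the exact footprint depends on which modules stay in bf16):

```python
# Rough memory estimate for the NF4-quantized base weights
# (a sketch; exact numbers vary with which layers skip quantization).
params = 8.03e9          # approximate Llama-3.1-8B parameter count
bits_per_param = 4       # NF4 weight storage
overhead_bits = 0.127    # double-quantized scaling constants

gib = params * (bits_per_param + overhead_bits) / 8 / 2**30
print(f"~{gib:.1f} GiB for quantized base weights")
```

Adapter weights, optimizer state, and activations come on top of this, but the quantized base dominates the static footprint.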

### Datasets

The model was trained on the following dataset:

- ammarnasr/the-stack-rust-clean

Dataset Configuration:

- Min Length: 64
- Max Length: 200000
- Exclude Tests: True
- Exclude Examples: False
- Exclude Benches: True
- Prefer Idiomatic: False
- Prefer Documented: False
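
A hypothetical filter that mirrors this configuration might look like the following. The function name and path heuristics are illustrative assumptions; the actual SigilDERG pipeline's field names and rules may differ.

```python
# Illustrative sample filter matching the configuration above:
# length bounds on the source text, tests and benches excluded,
# examples kept. Path conventions are assumptions.
def keep_sample(path: str, source: str) -> bool:
    n = len(source)
    if n < 64 or n > 200_000:        # Min/Max Length bounds
        return False
    p = path.lower()
    if "tests/" in p or p.endswith("_test.rs"):  # Exclude Tests
        return False
    if "benches/" in p:              # Exclude Benches
        return False
    return True                      # examples/ files are kept

print(keep_sample("src/lib.rs", "fn main() {}" * 10))        # kept
print(keep_sample("benches/bench.rs", "fn main() {}" * 10))  # dropped
```

With no idiomatic or documentation preference enabled, every sample passing these checks is weighted equally.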

## Training Metrics

Latest logged training metrics around this checkpoint:

- loss: 0.764700
- grad_norm: 0.125130
- learning_rate: 0.000095
- entropy: 0.780214
- num_tokens: 352709516
- mean_token_accuracy: 0.816762
- epoch: 0.614862
- log_step: 1,992
- checkpoint_step: 2,000
- step: 1,992

(Logging happens every few steps, so log_step reflects the logged step nearest to the checkpoint.)

## Evaluation Results

- Compilation Rate: 45.45% (55 samples evaluated)
- Average Clippy Warnings: 0.00
- Idiomatic Score: 0.1523
- Documentation Rate: 0.00%
- Test Rate: 0.00%
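
These aggregates fall out of straightforward averaging over per-sample results; a 45.45% rate over 55 samples corresponds to 25 compiling generations. A sketch of the aggregation, with an assumed (illustrative) per-sample schema:

```python
# Illustrative aggregation over per-sample evaluation results.
# The real evaluator's record schema may differ; 25 of 55 samples
# compiling reproduces the reported 45.45% rate.
samples = [{"compiles": i < 25, "clippy_warnings": 0} for i in range(55)]

compilation_rate = sum(s["compiles"] for s in samples) / len(samples)
avg_clippy = sum(s["clippy_warnings"] for s in samples) / len(samples)
print(f"{compilation_rate:.4f}")  # 0.4545
print(f"{avg_clippy:.2f}")        # 0.00
```

The zero Clippy average is measured only over compiling samples' lint output, so it should be read alongside the compilation rate rather than in isolation.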

Functionality Coverage:

- Average Functions: 1.38
- Average Structs: 0.31
- Average Traits: 0.11
- Average Impls: 0.16

Detailed Evaluation Data:

Evaluation completed: 2025-11-20T00:41:47.995913

## Governance and Intended Use

This checkpoint is part of the SigilDERG ecosystem and follows Rule Zero principles.

- Intended primarily for Rust code generation, explanation, refactoring, and review.
- Not intended as a general-purpose advisor for medical, legal, financial, or other high-stakes domains.

## Usage

### Loading the Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in bfloat16
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load the LoRA adapter from this checkpoint
model = PeftModel.from_pretrained(base_model, "out/llama8b-rust-qlora-phase1/checkpoint-2000")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
```

### Generation

```python
# Format the prompt for the instruct model
messages = [
    {"role": "system", "content": "You are a helpful Rust programming assistant."},
    {"role": "user", "content": "Write a function that calculates Fibonacci numbers"},
]

# Apply the chat template
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate (do_sample=True is required for temperature to take effect)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```

## Limitations

- This model is fine-tuned specifically for Rust code generation and may not perform well on other programming languages or general text tasks.
- The model inherits any limitations and biases of the base model.
- Generated code should always be reviewed and tested before use in production.

## Citation

If you use this model, please cite:

```bibtex
@software{sigilderg_finetuner,
  title = {SigilDERG Rust Code Fine-tuned Model},
  author = {Superuser666-Sigil (Dave Tofflemire)},
  year = {2025},
  url = {https://github.com/Superuser666-Sigil/SigilDERG-Finetuner}
}
```

## License

This model is released under the MIT License.