RustMentor-1.7B-LiteRT

RustMentor-1.7B-LiteRT is a 1.7B-parameter Qwen3-based model fine-tuned for Rust programming education and code review. This repository hosts the LiteRT (.tflite) format for on-device Android inference with GPU/NPU acceleration.

For the LoRA adapter, see rust-mentor-1.7b. For GGUF (llama.cpp/Ollama), see rust-mentor-1.7b-GGUF.

Model Description

  • Base Model: Qwen/Qwen3-1.7B
  • Model Type: Causal LM (code tutoring + review)
  • Parameters: 1.7B
  • Context Length: 2048 tokens
  • Fine-tuning: QLoRA (r=16, alpha=16) with Unsloth optimization
  • Format: LiteRT .tflite (dynamic INT8 quantization)
  • License: Apache 2.0
  • Language: English, Rust code

Why LiteRT?

LiteRT (formerly TFLite) is Google's on-device ML framework. Compared to GGUF/llama.cpp, it offers:

  • GPU/NPU acceleration via NNAPI on Android (Tensor G3, Snapdragon, etc.)
  • 2-3x faster inference on Pixel 8 Pro vs CPU-only GGUF
  • Native Android SDK (no JNI wrapper needed)
  • KV cache optimized for mobile memory constraints

What It Is Good At

  • Explaining Rust ownership, borrowing, and lifetimes with Go/Python/TS comparisons
  • Code review with borrow checker explanations
  • Error handling patterns (Result, Option, ?, thiserror, anyhow)
  • Async/await and Tokio patterns
  • Smart pointers (Box, Rc, Arc, RefCell)
  • Pattern matching and enum-based design
  • Trait-based architecture and generics
  • Type conversions (From, Into, AsRef, Deref)
  • Serde & JSON serialization
  • CLI tooling with clap
  • Cargo project structure, modules, and workspaces
  • Testing patterns and documentation
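As a flavor of the error-handling patterns listed above, here is a small self-contained sketch (illustrative, not model output) combining Result, the ? operator, and a From conversion for a custom error type:

```rust
use std::num::ParseIntError;

// A custom error type with a From impl so `?` can convert automatically.
#[derive(Debug, PartialEq)]
enum ConfigError {
    BadPort(ParseIntError),
}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadPort(e)
    }
}

// `?` propagates the parse failure, converting ParseIntError
// into ConfigError via the From impl above.
fn parse_port(raw: &str) -> Result<u16, ConfigError> {
    let port: u16 = raw.trim().parse()?;
    Ok(port)
}

fn main() {
    assert_eq!(parse_port(" 8080 "), Ok(8080));
    assert!(parse_port("not-a-port").is_err());
}
```

This is roughly the Rust idiom the model contrasts with Go's `if err != nil` style: the error path is explicit in the return type, but propagation is a single character.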

Intended Uses

Primary: Offline Rust programming tutor on Android (Pixel 8 Pro tested) via RustSensei app or Google AI Edge Gallery, with GPU/NPU-accelerated on-device inference.

Out-of-scope: General-purpose chat, non-Rust programming, safety-sensitive or factual tasks outside Rust development.

Prompt Examples

"In Go, I just pass values or pointers. What's this ownership thing in Rust?"

"Review this Rust code and explain what the borrow checker is doing:

fn get_longest(a: String, b: String) -> String {
    if a.len() > b.len() { a } else { b }
}"

"How do I handle errors in Rust? I'm used to Go's if err != nil pattern."

"How does async work in Rust? In Go I just use goroutines and it's simple."
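The code-review prompt above is a classic lifetimes exercise: the version shown takes ownership of both strings. The standard borrowing rewrite a tutor would walk through looks like this (a common Rust idiom, not model output):

```rust
// Borrow instead of taking ownership. The shared lifetime 'a tells the
// borrow checker that the returned reference is valid only as long as
// both input borrows are.
fn get_longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() > b.len() { a } else { b }
}

fn main() {
    let s1 = String::from("ownership");
    let s2 = String::from("borrow");
    // Callers keep ownership of s1 and s2; only references move.
    assert_eq!(get_longest(&s1, &s2), "ownership");
}
```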

How to Use

Google AI Edge Gallery (Android)

  1. Install Google AI Edge Gallery from Play Store
  2. Import the .tflite model from this repo
  3. Chat offline with GPU/NPU acceleration

LiteRT-LM (Programmatic β€” Android/Kotlin)

// Add to build.gradle.kts:
// implementation("com.google.ai.edge:litert-lm:latest")

import com.google.ai.edge.litert.lm.LlmInference

val options = LlmInference.Options.builder()
    .setModelPath("/path/to/rust_mentor_1.7b.tflite")
    .setMaxTokens(512)
    .setTemperature(0.7f)
    .setTopP(0.9f)
    .build()

val llm = LlmInference.createFromOptions(context, options)
val response = llm.generateResponse("Explain Rust's ownership model to a Go developer")

MediaPipe LLM Inference (Alternative)

import mediapipe as mp

model_path = "rust_mentor_1.7b_q8_ekv2048.tflite"
llm = mp.tasks.genai.LlmInference.create_from_options(
    mp.tasks.genai.LlmInferenceOptions(model_path=model_path, max_tokens=512)
)
response = llm.generate_response("How do I handle errors in Rust?")

Training Data (Summary)

  • Strandset-Rust-v1: 3,000 samples of Rust code generation, review, refactoring, and bug detection tasks
  • Synthetic tutor conversations: 46 unique hand-crafted Rust tutoring dialogues across 28 topics, covering ownership, error handling, traits, async, smart pointers, macros, serde, testing, and more
  • Style: All conversations draw parallels to Go/Python/TypeScript equivalents

Training Configuration (QLoRA)

| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3-1.7B |
| Method | QLoRA via Unsloth |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 16 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Epochs | 3 |
| Batch Size | 2 x 4 (effective 8) |
| Learning Rate | 2e-4 (cosine schedule) |
| Max Sequence Length | 2048 |
| Hardware | NVIDIA A100 40GB (Google Colab) |

Export Configuration (LiteRT)

| Parameter | Value |
|---|---|
| Conversion Tool | litert-torch (re-authored Qwen3) |
| Quantization | Dynamic INT8 |
| KV Cache Length | 2048 |
| Prefill Lengths | 8, 64, 128, 256, 512, 1024 |
| Output Format | .tflite (TFLite FlatBuffers) |

Safety & Limitations

  • May generate incorrect code or hallucinate crate APIs; review before production use.
  • Not a replacement for the Rust compiler or clippy; always compile and test generated code.
  • Optimized for tutoring, not production code generation at scale.
  • Training data focuses on CLI/systems patterns; web framework coverage (Axum, Actix) is limited.

License

Apache 2.0 for the fine-tuned model; base model (Qwen/Qwen3-1.7B) license also applies.
