# RustMentor-1.7B-LiteRT
RustMentor-1.7B-LiteRT is a 1.7B-parameter Qwen3-based model fine-tuned for Rust programming education and code review. This repository hosts the LiteRT (.tflite) export of the model for on-device Android inference with GPU/NPU acceleration.
For the LoRA adapter, see rust-mentor-1.7b. For GGUF (llama.cpp/Ollama), see rust-mentor-1.7b-GGUF.
## Model Description
- Base Model: Qwen/Qwen3-1.7B
- Model Type: Causal LM (code tutoring + review)
- Parameters: 1.7B
- Context Length: 2048 tokens
- Fine-tuning: QLoRA (r=16, alpha=16) with Unsloth optimization
- Format: LiteRT .tflite (dynamic INT8 quantization)
- License: Apache 2.0
- Language: English, Rust code
## Why LiteRT?
LiteRT (formerly TFLite) is Google's on-device ML framework. Compared to GGUF/llama.cpp:
- GPU/NPU acceleration via NNAPI on Android (Tensor G3, Snapdragon, etc.)
- 2-3x faster inference on Pixel 8 Pro vs CPU-only GGUF
- Native Android SDK, so no JNI wrapper is needed
- KV cache optimized for mobile memory constraints
## What It Is Good At
- Explaining Rust ownership, borrowing, and lifetimes with Go/Python/TS comparisons
- Code review with borrow checker explanations
- Error handling patterns (Result, Option, ?, thiserror, anyhow)
- Async/await and Tokio patterns
- Smart pointers (Box, Rc, Arc, RefCell)
- Pattern matching and enum-based design
- Trait-based architecture and generics
- Type conversions (From, Into, AsRef, Deref)
- Serde & JSON serialization
- CLI tooling with clap
- Cargo project structure, modules, and workspaces
- Testing patterns and documentation
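As an illustration of the error-handling patterns listed above, here is a minimal sketch of the `Result`/`?` style the model is trained to explain to developers coming from Go's `if err != nil` idiom. The `parse_port` helper is hypothetical, written for this card rather than taken from the training data:

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse a port number, propagating failures with `?`
// instead of Go-style explicit error checks after every call.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    let port: u16 = s.trim().parse()?; // `?` returns early with the Err variant
    Ok(port)
}

fn main() {
    assert_eq!(parse_port("8080"), Ok(8080));
    assert!(parse_port("not-a-port").is_err());
}
```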
## Intended Uses
Primary: Offline Rust programming tutor on Android (Pixel 8 Pro tested) via RustSensei app or Google AI Edge Gallery, with GPU/NPU-accelerated on-device inference.
Out-of-scope: General-purpose chat, non-Rust programming, safety-sensitive or factual tasks outside Rust development.
## Prompt Examples
"In Go, I just pass values or pointers. What's this ownership thing in Rust?"
"Review this Rust code and explain what the borrow checker is doing:\n\nfn get_longest(a: String, b: String) -> String {\n if a.len() > b.len() { a } else { b }\n}"
"How do I handle errors in Rust? I'm used to Go's if err != nil pattern."
"How does async work in Rust? In Go I just use goroutines and it's simple."
## How to Use

### Google AI Edge Gallery (Android)
1. Install Google AI Edge Gallery from the Play Store
2. Import the .tflite model from this repo
3. Chat offline with GPU/NPU acceleration
### LiteRT-LM (Programmatic, Android/Kotlin)

```kotlin
// Add to build.gradle.kts:
// implementation("com.google.ai.edge:litert-lm:latest")
import com.google.ai.edge.litert.lm.LlmInference

val options = LlmInference.Options.builder()
    .setModelPath("/path/to/rust_mentor_1.7b.tflite")
    .setMaxTokens(512)
    .setTemperature(0.7f)
    .setTopP(0.9f)
    .build()

val llm = LlmInference.createFromOptions(context, options)
val response = llm.generateResponse("Explain Rust's ownership model to a Go developer")
```
### MediaPipe LLM Inference (Alternative)

```python
import mediapipe as mp

model_path = "rust_mentor_1.7b_q8_ekv2048.tflite"
llm = mp.tasks.genai.LlmInference.create_from_options(
    mp.tasks.genai.LlmInferenceOptions(model_path=model_path, max_tokens=512)
)
response = llm.generate_response("How do I handle errors in Rust?")
```
## Training Data (Summary)
- Strandset-Rust-v1: 3,000 samples of Rust code generation, review, refactoring, and bug detection tasks
- Synthetic tutor conversations: 46 unique hand-crafted Rust tutoring dialogues across 28 topics, covering ownership, error handling, traits, async, smart pointers, macros, serde, testing, and more
- Style: All conversations draw parallels to Go/Python/TypeScript equivalents
## Training Configuration (QLoRA)
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3-1.7B |
| Method | QLoRA via Unsloth |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 16 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Epochs | 3 |
| Batch Size | 2 x 4 (effective 8) |
| Learning Rate | 2e-4 (cosine schedule) |
| Max Sequence Length | 2048 |
| Hardware | NVIDIA A100 40GB (Google Colab) |
## Export Configuration (LiteRT)
| Parameter | Value |
|---|---|
| Conversion Tool | litert-torch (re-authored Qwen3) |
| Quantization | Dynamic INT8 |
| KV Cache Length | 2048 |
| Prefill Lengths | 8, 64, 128, 256, 512, 1024 |
| Output Format | .tflite (TFLite Flatbuffers) |
## Safety & Limitations
- May generate incorrect code or hallucinate crate APIs; review before production use.
- Not a replacement for the Rust compiler or clippy; always compile and test generated code.
- Optimized for tutoring, not production code generation at scale.
- Training data focuses on CLI/systems patterns; web framework coverage (Axum, Actix) is limited.
## License
Apache 2.0 for the fine-tuned model; base model (Qwen/Qwen3-1.7B) license also applies.
## Contact
- Maintainer: Sylvester Francis (@sylvester-francis)
- Repository: github.com/sylvester-francis/slm-rust-model
- Issues/feedback: Open a discussion on the model repo