TestMaster-7B-GGUF: Resource-Efficient Polyglot Unit Test Generator


TestMaster-7B is a specialized Large Language Model (LLM) fine-tuned to generate robust, industrial-grade unit tests across multiple programming languages. Built on Qwen2.5-Coder-7B-Instruct and fine-tuned with Unsloth, it is optimized for logical reasoning, edge-case detection, and mocking of dependencies.

This repository contains the 4-bit quantized (GGUF, Q4_K_M) version, designed to run efficiently on consumer hardware (e.g., NVIDIA RTX 3070, 8 GB VRAM) with minimal performance loss compared to the FP16 baseline.

🚀 Key Features

  • Resource Efficient: Runs on ~4.5 GB of VRAM/RAM.
  • Polyglot Capabilities: Strong performance in C, Java, Go, Python, C++, and JavaScript, and also on ROS2 code.
  • Advanced Reasoning: Capable of handling complex scenarios like:
    • Memory Safety (Null pointer checks in C).
    • Concurrency (Goroutines/WaitGroups in Go).
    • Reflection & Retry Logic (Java).
    • Async/Promise Mocking (JavaScript/Jest).
  • ASTER Methodology: Trained with principles inspired by Automated Software Testing & Error Remediation, focusing on compilability and behavioral correctness.

📊 Performance Evaluation

The model has been rigorously tested across various programming languages and complex testing scenarios. Below is the updated summary of its performance:

Language | Test Category | Difficulty | Success Rate | Key Observation
Python | Mocking & AsyncIO | 🔥🔥🔥 | 100% | Flawless usage of AsyncMock, IsolatedAsyncioTestCase and assert_awaited.
Java | JUnit 5 & Mockito | 🔥🔥 | 100% | Correctly migrated to JUnit 5 (ExtendWith); perfect Negative Testing & Verification logic.
JavaScript | Jest & API Mocking | 🔥🔥 | 100% | Proactively used axios-mock-adapter instead of manual mocks; clean async/await flow.
C# | xUnit & Moq | 🔥🔥 | 100% | Clean "Arrange-Act-Assert" structure; correct usage of It.IsAny and Attribute injection.
Go | Interface Mocking | 🔥🔥🔥 | 95% | Correct usage of testify/mock embedding; handled struct-interface relationships well.
C++ | Google Test/Mock | ⚙️ | 98% | Updated to modern MOCK_METHOD syntax; correctly managed memory (pointers) in SetUp/TearDown.
Rust | Traits & Mockall | 🔥🔥🔥 | 95% | Successfully navigated ownership rules; correct usage of Box<dyn> and #[automock].
PHP | Backend Testing | 🔥🔥 | 100% | Chose industry-standard Mockery library over basic PHPUnit methods for better readability.
C | Memory Safety | 🔥🔥 | 100% | Proactively prevents SegFaults using sizeof and null checks.
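
For readers unfamiliar with the APIs named in the Python row, the following is a hand-written illustration of that test pattern, not output captured from the model. The function `fetch_username` and the `get_user` client method are hypothetical names invented for this sketch; `AsyncMock`, `IsolatedAsyncioTestCase`, and the `assert_awaited*` methods come from the standard library.

```python
import unittest
from unittest.mock import AsyncMock


async def fetch_username(client, user_id: int) -> str:
    """Hypothetical function under test: awaits an async client call."""
    payload = await client.get_user(user_id)
    return payload["name"]


class FetchUsernameTest(unittest.IsolatedAsyncioTestCase):
    async def test_returns_name_from_client(self):
        # Arrange: child attributes of an AsyncMock are themselves AsyncMocks,
        # so client.get_user can be awaited and yields return_value.
        client = AsyncMock()
        client.get_user.return_value = {"name": "ada"}

        # Act
        result = await fetch_username(client, 7)

        # Assert: check both the result and that the call was awaited.
        self.assertEqual(result, "ada")
        client.get_user.assert_awaited()
        client.get_user.assert_awaited_once_with(7)


if __name__ == "__main__":
    unittest.main()
```

Checking `assert_awaited` rather than only `assert_called` matters here: a coroutine that is created but never awaited would satisfy the latter and still leave the code under test broken.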

📋 Training Log

Step Loss
5 0.850300
10 0.857600
15 0.889200
20 0.849300
25 0.845600
30 0.801700
35 0.776000
40 0.811000
45 0.744900
50 0.742400
55 0.701500
60 0.728600
65 0.695400
70 0.631200
75 0.668300
80 0.602800
85 0.628700
90 0.657700
95 0.595500
100 0.592900
105 0.634900
110 0.665500
115 0.616500
120 0.615300
125 0.584500
130 0.622000
135 0.606700
140 0.577100
145 0.611500
150 0.579300
155 0.562100
160 0.595000
165 0.599200
170 0.547900
175 0.598000
180 0.566500
185 0.576300
190 0.543700
195 0.533000
200 0.575800
205 0.585400
210 0.555400
215 0.599300
220 0.528900
225 0.560100
230 0.579700
235 0.557400
240 0.518200
245 0.541800
250 0.534200
255 0.538100
260 0.570400
265 0.518400
270 0.527300
275 0.550300
280 0.536100
285 0.550300
290 0.551200
295 0.558500
300 0.529000
305 0.567800
310 0.530300
315 0.545600
320 0.529100
325 0.511600
330 0.538000
335 0.569400
340 0.524100
345 0.535100
350 0.573300
355 0.544000
360 0.547900
365 0.544900
370 0.533300
375 0.540300
380 0.543000
385 0.563800
390 0.514600
395 0.549900
400 0.562600
405 0.539200
410 0.580100
415 0.557300
420 0.555000
425 0.525200

💻 Usage

Prompt Format (ChatML)

This model uses the ChatML template. Strict adherence to this format is recommended for optimal results.

<|im_start|>system
You are an expert software tester. Your goal is to write a comprehensive unit test.<|im_end|>
<|im_start|>user
Instruction:
Write a unit test for the following [Language] code...

Input:
[Source Code Here]
<|im_end|>
<|im_start|>assistant
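
The template above can also be assembled in code. This is a minimal sketch: `build_prompt` is a hypothetical helper, not part of any library, and because the instruction line in the template is elided after "code...", the sketch uses only the visible wording.

```python
SYSTEM_PROMPT = (
    "You are an expert software tester. "
    "Your goal is to write a comprehensive unit test."
)


def build_prompt(language: str, source_code: str) -> str:
    """Return a ChatML prompt asking for a unit test of `source_code`.

    Hypothetical helper: the Instruction/Input layout mirrors the template,
    the exact instruction wording is an assumption.
    """
    user_message = (
        "Instruction:\n"
        f"Write a unit test for the following {language} code.\n\n"
        "Input:\n"
        f"{source_code.strip()}\n"
    )
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        # End with an open assistant turn so the model writes the reply.
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_prompt("Python", "def add(a, b):\n    return a + b"))
```

Building the string in one place keeps the special tokens and newlines consistent across calls, which is where hand-written ChatML prompts tend to drift.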

Running with LM Studio / Ollama

  1. Download the .gguf file from this repository.
  2. Load it into your preferred GGUF runner (LM Studio, Ollama, etc.).
  3. System Prompt: Set the system prompt to: "You are an expert programmer and unit test generator."
  4. Context Window: Set to 4096 or 8192 (recommended).
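
Once the model is loaded, both runners can serve it over a local OpenAI-compatible HTTP endpoint. The sketch below rests on several assumptions: that the LM Studio local server is enabled at its default address, that `MODEL_NAME` (a placeholder) matches whatever identifier your runner shows, and that the user message wording suits your task. `build_request` and `generate_tests` are hypothetical helpers.

```python
import json
import urllib.request

# Assumption: LM Studio's default local server address.
BASE_URL = "http://localhost:1234/v1"
# Placeholder: replace with the model identifier your runner reports.
MODEL_NAME = "test-master-unit-test-model"


def build_request(source_code: str, language: str) -> urllib.request.Request:
    """Build a chat-completions request using the system prompt from step 3."""
    payload = {
        "model": MODEL_NAME,
        "messages": [
            {
                "role": "system",
                "content": "You are an expert programmer and unit test generator.",
            },
            {
                "role": "user",
                "content": (
                    f"Write a unit test for the following {language} code:\n\n"
                    f"{source_code}"
                ),
            },
        ],
        "max_tokens": 1024,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_tests(source_code: str, language: str) -> str:
    """Send the request to the local server and return the generated text."""
    request = build_request(source_code, language)
    with urllib.request.urlopen(request, timeout=300) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

For Ollama, the same request shape should work after pointing `BASE_URL` at `http://localhost:11434/v1` and importing the .gguf file under a model name of your choice.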

Running with Python (llama-cpp-python)

from llama_cpp import Llama

llm = Llama(
    model_path="./unit-test-model-q4_k_m.gguf",
    n_ctx=4096,
    n_gpu_layers=-1, # Offload all layers to GPU
    verbose=False
)

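# ChatML prompt. It ends with a newline after "assistant" so the model
# starts its reply on a fresh line, as the template expects.
prompt = """<|im_start|>system
You are an expert unit test generator.<|im_end|>
<|im_start|>user
Write a Python unit test for a function that divides two numbers.<|im_end|>
<|im_start|>assistant
"""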

output = llm(
    prompt,
    max_tokens=1024,
    stop=["<|im_end|>"],
    echo=False
)

print(output['choices'][0]['text'])

⚠️ Limitations & Bias

Python Imports: In very complex Python scenarios involving obscure libraries, the model might occasionally miss an import statement or hallucinate a mock path.

Context Window: Very large source files can exceed the context window. Generate tests for one function or class at a time rather than for a whole file.
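
One way to follow this advice for Python sources is to split a module into its top-level functions and classes and prompt for each separately. The sketch below uses only the standard `ast` module; `split_into_units` is a hypothetical helper, and other languages would need their own parser.

```python
import ast


def split_into_units(source: str) -> list[tuple[str, str]]:
    """Return (name, source_text) for each top-level function or class.

    Imports and module-level statements are skipped, and decorator lines
    are not part of the returned text.
    """
    tree = ast.parse(source)
    units = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                units.append((node.name, segment))
    return units


if __name__ == "__main__":
    example = (
        "import os\n\n"
        "def a():\n    return 1\n\n"
        "class B:\n    def m(self):\n        return 2\n"
    )
    for name, text in split_into_units(example):
        print(f"--- {name} ---\n{text}\n")
```

Each unit may still depend on imports or helpers defined elsewhere in the file, so the relevant import lines often need to be included in the prompt alongside the unit.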

Self-Healing: For dynamic languages (Python, JS), we recommend running this model in a "Self-Healing" loop (Execution -> Error Capture -> Repair) so that failing tests are caught and repaired before use.

📜 License

This model is a fine-tune of Qwen2.5-Coder and is licensed under Apache 2.0.

🤝 Acknowledgments

  • Fine-tuned using Unsloth (2x faster training).
  • Base model by Qwen Team.
  • Dataset curated for ASTER (Automated Software Testing & Error Remediation) research.
