MiroThinker-1.7-mini Q3_K_M GGUF
This repository contains a Q3_K_M GGUF quantization of miromind-ai/MiroThinker-1.7-mini.
Files
MiroThinker-1.7-mini.Q3_K_M.gguf
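To fetch the file from the command line, one option is the Hugging Face CLI (a minimal sketch; the repo id is the one listed in the model tree below, and downloading through the web UI works just as well):
huggingface-cli download kai-os/MiroThinker-1.7-mini-Q3_K_M-GGUF \
MiroThinker-1.7-mini.Q3_K_M.gguf \
--local-dir .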
Quantization details
- Quantization: Q3_K_M
- Quantized with: Unsloth + llama.cpp
- GGUF size: 14.71 GB (13.70 GiB)
- Source model license: apache-2.0
Recommended use
This quantization targets 16 GB GPUs, keeping better quality than lower-bit options while still fitting in VRAM. In local testing it loaded successfully with llama.cpp on an AMD Radeon RX 9060 XT via the Vulkan backend.
llama.cpp example
llama-cli \
-m MiroThinker-1.7-mini.Q3_K_M.gguf \
--device Vulkan1 \
-ngl 999 \
-c 128
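To serve the model over an HTTP API instead of chatting in the terminal, llama.cpp's llama-server accepts the same model and offload flags (a sketch; the port is arbitrary):
llama-server \
-m MiroThinker-1.7-mini.Q3_K_M.gguf \
--device Vulkan1 \
-ngl 999 \
-c 128 \
--port 8080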
If VRAM is tight on your setup, reduce context length first, then reduce GPU offload.
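For example, to offload only part of the model and keep the remaining layers in system RAM (the -ngl value below is a placeholder; use the highest layer count that still fits your free VRAM):
llama-cli \
-m MiroThinker-1.7-mini.Q3_K_M.gguf \
--device Vulkan1 \
-ngl 24 \
-c 128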
Notes
- This repo contains only the GGUF quantized artifact, not the original safetensors weights.
- For deeper lineage details, refer to the upstream miromind-ai/MiroThinker-1.7-mini model card.
Model tree for kai-os/MiroThinker-1.7-mini-Q3_K_M-GGUF
- Base model: Qwen/Qwen3-235B-A22B-Thinking-2507
- Finetuned: miromind-ai/MiroThinker-1.7-mini