FP8-Block Quantized Models - a nm-testing Collection

Updated 25 days ago

Collection of state-of-the-art FP8 block-quantized models

RedHatAI/Qwen3-8B-FP8-block
Text Generation • 8B • Updated Nov 7 • 91

RedHatAI/Qwen3-32B-FP8-block
Text Generation • 33B • Updated Oct 24 • 34

RedHatAI/Qwen3-14B-FP8-block
Text Generation • 15B • Updated Oct 24 • 26

RedHatAI/Llama-3.1-8B-Instruct-FP8-block
Text Generation • 8B • Updated Oct 29 • 308

nm-testing/Qwen3-VL-235B-A22B-Instruct-FP8-BLOCK
Text Generation • Updated Oct 27

nm-testing/Llama-4-Scout-17B-16E-Instruct-BLOCK-FP8
Text Generation • 109B • Updated Oct 27 • 6

RedHatAI/Llama-3.3-70B-Instruct-FP8-block
Text Generation • 71B • Updated Oct 24 • 17

nm-testing/Llama-4-Maverick-17B-128E-Instruct-block-FP8
Text Generation • Updated Oct 27 • 39

nm-testing/Qwen3-30B-A3B-FP8-block
Text Generation • 3B • Updated Oct 27 • 30

nm-testing/granite-4.0-h-small-FP8-block
Text Generation • 32B • Updated 25 days ago • 71