GGUF quants?
#17
by fsaudm - opened
https://github.com/ggml-org/llama.cpp/issues/16331
Until support for DeepSeek V3.2 is implemented in llama.cpp, there will be no GGUFs.
There is now a working GGUF for the Thinking version (not the Speciale extra-thinking variant so far) here: https://huggingface.co/sszymczyk/DeepSeek-V3.2-nolight-GGUF
It does not use the new sparse attention, so it basically runs the same as the earlier version.
Don't know