Thinking mode
Thanks for posting a working DeepSeek V3.2 GGUF after all the wait!
Were you able to get it working in thinking mode? I've run a couple of quick tests with and without --jinja and --chat-template-kwargs '{"enable_thinking": true }', but the model goes straight to output without thinking.
@anikifoss Like I wrote in README.md, you have to explicitly set the chat template when using this model. Try the DeepSeek V3.2-Exp chat template: there are pastebin links in README.md for both llama.cpp and ik_llama.cpp. Save those jinja templates as text files and pass --jinja --chat-template-file <saved-chat-template-file> when running llama-cli or llama-server. Thinking should then work without any problems.
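For reference, the invocation would look something like this (the model filename and the name of the saved template file are placeholders here; substitute your own GGUF path and whatever you named the template downloaded from the pastebin link in README.md):

```shell
# Placeholder paths: point these at your actual GGUF and saved jinja template
llama-server \
  -m DeepSeek-V3.2-Exp-Q4_K_M.gguf \
  --jinja \
  --chat-template-file deepseek-v3.2-exp.jinja
```

The same two flags (--jinja --chat-template-file) work with llama-cli as well.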
Of course! Now I remember seeing that in the README when this repo was published, but I'd forgotten by the time I had a chance to play with the model.
Thank you for your reply despite me not paying attention to the readme!