view post Post 1725 Mini-QwQ an edge device friendly reasoning model distilled from QwQ-32B 🤗: kz919/QwQ-0.5B-Distilled-SFT🇬 🇬 🇺 🇫: kz919/QwQ-0.5B-Distilled-SFT-gguf🤖: kz919/Mini-QwQ See translation 👍 7 7 + Reply
Running Featured 273 Qwen2.5 Coder Artifacts 🐢 273 Generate and preview code from your app description
Cautious Optimizers: Improving Training with One Line of Code Paper • 2411.16085 • Published Nov 25, 2024 • 19
view post Post 1668 Just for the meme.But the clear lesson I learnt from building these demos are, the more powerful the underlying base model is, the closer you will get to GPT4o1. CoT is nothing more than simply inducing the latent reasoning capability from the model. kz919/GPT4-O1-Proximas 🚀 6 6 🔥 2 2 😎 1 1 + Reply