# IB-Physics-Mini-GPT (from scratch)

- **Model type:** small GPT-2–style decoder-only LM
- **Params:** ~30M (n_layer=6, n_head=6, n_embed=384)
- **Context length:** 256
- **Training:** tiny pretrain on physics notes → SFT on instruction pairs

## Intended Use

Educational demo and concept explainer for IB Physics HL topics.

## Limitations

Small context window, tiny dataset, and no guarantee of factual accuracy. Double-check anything it tells you.

## How Trained

1. Tokenizer: BPE (vocab 16k) trained on `corpus_raw.txt`.
2. Pretrain: next-token prediction on the physics notes.
3. Finetune: short instruction-style Q&A pairs.

## Eval

- Perplexity on held-out notes (see `eval/` scripts).
- Manual Q&A sanity checks.

## License

MIT for code. Dataset licensing is your responsibility.
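For readers new to the architecture, the "decoder-only" part of the model above comes down to causal self-attention: each position may attend only to itself and earlier positions. Below is a minimal single-sequence NumPy sketch of one attention layer; the function name, shapes, and lack of batching are illustrative choices for clarity, not this repo's actual training code.

```python
import numpy as np

def causal_self_attention(x, W_qkv, W_o, n_head):
    """Single-head-split causal attention for one sequence.
    x: (T, d) token activations; W_qkv: (d, 3d); W_o: (d, d).
    """
    T, d = x.shape
    hd = d // n_head                                  # per-head dimension
    q, k, v = np.split(x @ W_qkv, 3, axis=-1)         # each (T, d)
    # reshape to (n_head, T, hd)
    q = q.reshape(T, n_head, hd).transpose(1, 0, 2)
    k = k.reshape(T, n_head, hd).transpose(1, 0, 2)
    v = v.reshape(T, n_head, hd).transpose(1, 0, 2)
    att = q @ k.transpose(0, 2, 1) / np.sqrt(hd)      # (n_head, T, T)
    # causal mask: position t may not look at positions > t
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    att[:, future] = -np.inf
    # numerically stable softmax over the key axis
    att = np.exp(att - att.max(axis=-1, keepdims=True))
    att /= att.sum(axis=-1, keepdims=True)
    y = att @ v                                       # (n_head, T, hd)
    return y.transpose(1, 0, 2).reshape(T, d) @ W_o
```

A quick way to convince yourself the mask works: perturb the last token of the input and check that the outputs at all earlier positions are unchanged.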
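The pretraining objective (step 2 under "How Trained") is next-token prediction: score the model's logits at position t against the actual token at position t+1 with cross-entropy. A minimal sketch of that shifted loss follows; `next_token_loss` and its shapes are hypothetical names for illustration, not the repo's implementation.

```python
import numpy as np

def next_token_loss(logits, tokens):
    """Mean cross-entropy for next-token prediction.
    logits: (T, V) model outputs at positions 0..T-1
    tokens: (T,)  input token ids; the target at position t is tokens[t+1]
    """
    preds = logits[:-1]      # positions 0..T-2 make predictions
    targets = tokens[1:]     # ...scored against tokens 1..T-1
    # numerically stable log-softmax over the vocab axis
    z = preds - preds.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()
```

A useful sanity check: with uniform logits the loss equals ln(V), which is also what "perplexity on held-out notes" is exponentiating — perplexity is exp of this loss.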