---
license: apache-2.0
datasets:
- allenai/MADLAD-400
language:
- am
base_model:
- allenai/OLMo-2-1124-7B-Instruct
---
# OLMo 2 1124 7B Instruct for Amharic: SSU-Wanda

This model is built on top of OLMo 2 1124 7B Instruct and adapted to Amharic using 200M target-language tokens sampled from MADLAD-400. Adaptation uses the SSU-Wanda approach, i.e., selecting the parameters to update column-wise based on aggregated Wanda scores.
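The exact selection procedure is defined in the paper and repository linked below; as a rough illustrative sketch only (the function names, the per-column sum aggregation, and the choice to mark the *lowest*-scoring columns as trainable are assumptions made here, not taken from the official code), column selection via Wanda scores might look like:

```python
import numpy as np

def wanda_column_scores(weight: np.ndarray, activations: np.ndarray) -> np.ndarray:
    """Aggregate Wanda importance scores per weight-matrix column.

    weight:      (out_features, in_features) weight matrix of a linear layer.
    activations: (num_tokens, in_features) calibration inputs to that layer.

    The Wanda score of weight W[i, j] is |W[i, j]| * ||X[:, j]||_2, where the
    norm is taken over the calibration tokens. Scores are aggregated per
    column by summing over the output dimension.
    """
    feat_norms = np.linalg.norm(activations, axis=0)   # (in_features,)
    scores = np.abs(weight) * feat_norms[None, :]      # (out_features, in_features)
    return scores.sum(axis=0)                          # (in_features,)

def select_update_columns(col_scores: np.ndarray, update_ratio: float) -> np.ndarray:
    """Pick the columns with the lowest aggregated scores as trainable,
    shielding the high-importance (source-language) columns from updates."""
    k = max(1, int(update_ratio * col_scores.size))
    return np.argsort(col_scores)[:k]

# Toy demonstration with random data.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
X = rng.normal(size=(32, 16))
cols = select_update_columns(wanda_column_scores(W, X), update_ratio=0.25)
print(sorted(cols.tolist()))
```

In this sketch, only the selected columns would receive gradient updates during Amharic fine-tuning, while all other columns stay frozen at their source-model values.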
## Model Description

- **Language:** Amharic
- **License:** Apache 2.0
- **Fine-tuned from model:** [allenai/OLMo-2-1124-7B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-7B-Instruct)
## Model Sources

- **Repository:** https://github.com/gucci-j/ssu
- **Paper:** https://arxiv.org/abs/2512.04844
## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "ssu-project/OLMo-2-1124-7B-Instruct-am-ssu"
)
tokenizer = AutoTokenizer.from_pretrained(
    "ssu-project/OLMo-2-1124-7B-Instruct-am-ssu"
)
```
## Citation

```
@misc{yamaguchi2025mitigatingcatastrophicforgettingtarget,
      title={Mitigating Catastrophic Forgetting in Target Language Adaptation of LLMs via Source-Shielded Updates},
      author={Atsuki Yamaguchi and Terufumi Morishita and Aline Villavicencio and Nikolaos Aletras},
      year={2025},
      eprint={2512.04844},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.04844},
}
```