# BART Emoji Translator
A fine-tuned BART model that translates English text to emoji sequences using curriculum learning and LoRA.
## Model Description
This model converts natural language text into appropriate emoji representations. It was trained using a 6-stage curriculum learning approach with custom data retention strategies.
- Base Model: facebook/bart-large
- Training Method: LoRA (Low-Rank Adaptation)
## Usage
```python
from transformers import BartTokenizer, BartForConditionalGeneration
from peft import PeftModel

# Load the base model and apply the LoRA adapter
base_model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")
model = PeftModel.from_pretrained(base_model, "mohamedmostafa259/bart-emoji-translator")
tokenizer = BartTokenizer.from_pretrained("mohamedmostafa259/bart-emoji-translator")

# Translate text to emojis
def translate(
    text: str,
    max_length: int = 32,
    num_beams: int = 1,
    do_sample: bool = True,
    temperature: float = 1.0,
    top_p: float = 0.4,
    top_k: int = 50,
) -> str:
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        num_beams=num_beams,
        do_sample=do_sample,
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Examples
print(translate('I am happy.'))                          # ππ€
print(translate('I feel misunderstood.'))                # π€¬π¬
print(translate('My parents want to have a new baby.'))  # πΆπ€°πͺ
print(translate('I eat dinner with my family.'))         # π₯ͺπ₯πͺ
```
## Output Variability
This model uses sampling-based decoding (temperature and nucleus sampling) rather than deterministic beam search. As a result, the same input may occasionally produce slightly different emoji sequences across runs. This behavior is expected and reflects the model choosing between multiple valid emoji interpretations for a given sentence.
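For reproducible outputs, the sampling parameters of the `translate` helper above can be overridden at call time. A minimal sketch (the specific settings below are illustrative choices, not values prescribed by this card):

```python
# Deterministic decoding: disable sampling and use beam search so repeated
# calls on the same input return the same emoji sequence.
print(translate("I am happy.", do_sample=False, num_beams=4))

# Lower-temperature nucleus sampling keeps some variety while concentrating
# probability mass on the most likely emoji interpretations.
print(translate("I am happy.", do_sample=True, temperature=0.7, top_p=0.4))
```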
## Training Details
### Training Hyperparameters
- Learning Rate: 0.0001
- Batch Size: 8
- Gradient Accumulation Steps: 4
- Max Epochs per Phase: 20
- Early Stopping Patience: 3
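For reference, these values map onto a Hugging Face `Seq2SeqTrainingArguments` object roughly as follows. This is a sketch only: `output_dir`, the evaluation/saving strategy, and the checkpoint-selection fields are assumptions, not taken from the original training script.

```python
from transformers import Seq2SeqTrainingArguments

# Illustrative mapping of the listed hyperparameters onto training arguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-emoji-phase",     # hypothetical output path
    learning_rate=1e-4,                # Learning Rate: 0.0001
    per_device_train_batch_size=8,     # Batch Size: 8
    gradient_accumulation_steps=4,     # effective batch of 8 * 4 = 32 examples
    num_train_epochs=20,               # Max Epochs per Phase: 20
    eval_strategy="epoch",             # evaluate after every epoch (assumed)
    save_strategy="epoch",
    load_best_model_at_end=True,       # keep the lowest-validation-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
```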
### LoRA Configuration
- Rank (r): 128
- Alpha: 256
- Dropout: 0.1
- Target Modules: q_proj, v_proj, k_proj, out_proj
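An adapter matching this configuration can be set up with `peft` along these lines (a sketch assuming a sequence-to-sequence task type; the original training script is not included on this card):

```python
from peft import LoraConfig, get_peft_model, TaskType
from transformers import BartForConditionalGeneration

# LoRA adapter mirroring the configuration listed above.
lora_config = LoraConfig(
    r=128,
    lora_alpha=256,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj", "k_proj", "out_proj"],
    task_type=TaskType.SEQ_2_SEQ_LM,  # assumption: seq2seq generation with BART
)

base_model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # only the adapter weights are trainable
```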
### Curriculum Learning Strategy
The model was trained using a 6-phase curriculum learning approach with strategic data retention:
#### Phase Composition
| Phase | Current Stage | Previous Stages Included |
|---|---|---|
| Phase 1 (bootstrap) | Stage 1 (100%) | None |
| Phase 2 | Stage 2 (100%) | None (Stage 1 dropped) |
| Phase 3 | Stage 3 (100%) | Stage 2 (33%) |
| Phase 4 | Stage 4 (100%) | Stage 3 (50%), Stage 2 (33%) |
| Phase 5 | Stage 5 (100%) | Stage 4 (50%), Stage 3 (50%), Stage 2 (33%) |
| Phase 6 | Stage 6 (100%) | Stage 5 (50%), Stage 4 (50%), Stage 3 (50%), Stage 2 (33%) |
#### Stage Retention Strategy
- Stage 1 (Bootstrap): Used only in Phase 1, then completely dropped to prevent the model from over-relying on basic patterns
- Stage 2 (Foundation): Retained at 33% in all subsequent phases (3-6) to maintain core vocabulary
- Stages 3-6 (Progressive Complexity): Each retained at 50% in subsequent phases to balance learning new patterns while preventing catastrophic forgetting
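A minimal sketch of how this retention schedule could be assembled per phase (the stage datasets, helper name, and sampling seed are hypothetical; the original data pipeline is not part of this card):

```python
import random

# Fraction of each earlier stage carried forward into later phases.
# Stage 1 is used only in Phase 1 and then dropped entirely.
RETENTION = {2: 0.33, 3: 0.5, 4: 0.5, 5: 0.5}

def build_phase_mix(phase: int, stage_data: dict, seed: int = 42) -> list:
    """Combine 100% of the current stage with retained samples from earlier
    stages. `stage_data` maps stage number -> list of (text, emoji) pairs."""
    rng = random.Random(seed)
    mix = list(stage_data[phase])      # current stage, in full
    for stage in range(2, phase):      # earlier stages; Stage 1 is excluded
        keep = int(len(stage_data[stage]) * RETENTION[stage])
        mix.extend(rng.sample(stage_data[stage], keep))
    rng.shuffle(mix)
    return mix
```

For Phase 4, for example, this yields all of Stage 4 plus 50% of Stage 3 and 33% of Stage 2, matching the phase composition table above.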
#### Complexity Progression
The curriculum is structured around emoji count, progressively increasing output complexity:
Phase 1 (Single Emoji Foundation): Simple phrases mapping to one emoji
- Example: "I feel very happy" → π
- Example: "You pour some wine" → π·

Phase 2 (Two Emojis): Basic two-concept expressions
- Example: "They move to the rhythm" → π΅ π
- Example: "I like to drink wine" → β€οΈ π·

Phase 3 (Three Emojis): Short sentences with three distinct concepts
- Example: "He makes money selling his Islamic art" → π€ βͺοΈ π¨
- Example: "We looked for the eagle and llama on the map" → π¦ π¦ πΊοΈ

Phase 4 (Four Emojis): Longer phrases with multiple related concepts
- Example: "We took the car to see the nature scenery" → π β°οΈ π² ποΈ
- Example: "He made breakfast with eggs and toast" → π³ π π₯ π

Phase 5 (Five Emojis): Complex sentences requiring sequential emoji representations
- Example: "The breakfast included eggs cheese bread and fresh milk" → π₯ π§ π π₯ π
- Example: "The bar served wine champagne beer and whiskey all night" → π· π₯ πΎ πΊ π₯

Phase 6 (Six+ Emojis): Complex narratives with action sequences and multiple events
- Example: "Board plane pack luggage reach beach swim drink coconut milk nap" → βοΈ π§³ ποΈ π π₯₯ π΄
- Example: "Man in suit and woman in dress drink wine eat noodles listen to music get engaged" → π€΅ π π· π π΅ π
This emoji-count-based curriculum allows the model to:
- Master single-concept mappings before handling multiple concepts
- Gradually increase sequence length and complexity (progressive difficulty without overwhelming the model)
- Avoid catastrophic forgetting through strategic data retention (Stage 2 at 33%, Stages 3-6 at 50%)
- Learn compositional patterns (how emojis combine to represent complex ideas)
### Training Dynamics
- Each phase uses stage-specific validation sets to prevent metric contamination across difficulty levels
- Early stopping with patience=3 halts a phase once validation loss stops improving, limiting overfitting
- The model automatically loads the best checkpoint (lowest validation loss) from each phase
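Put together, one phase of this loop might look roughly like the following. This is a hedged sketch using `Seq2SeqTrainer`; it assumes training arguments with per-epoch evaluation and `load_best_model_at_end=True` (as sketched earlier), and the dataset arguments are placeholders.

```python
from transformers import Seq2SeqTrainer, EarlyStoppingCallback

def train_phase(model, training_args, train_dataset, eval_dataset):
    """Train a single curriculum phase and return the best checkpoint.

    `train_dataset` is the phase mix (current stage plus retained samples);
    `eval_dataset` is that phase's stage-specific validation split.
    """
    trainer = Seq2SeqTrainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        # Stop the phase after 3 evaluations without validation-loss improvement.
        callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
    )
    trainer.train()
    # With load_best_model_at_end=True, the lowest-validation-loss weights
    # are restored here before the next phase begins.
    return trainer.model
```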
## Limitations
- Works best with English input text
- May not generate very rare or recently introduced emojis
- Performance varies with text complexity and length
- Optimal for text under 32 tokens
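A quick way to check the last point before calling `translate` (a small illustrative helper, not part of the released code):

```python
def within_token_budget(text: str, limit: int = 32) -> bool:
    # Count BART subword tokens (special tokens included) for the input text.
    return len(tokenizer(text)["input_ids"]) <= limit

print(within_token_budget("I eat dinner with my family."))  # True for short inputs
```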
## Training Infrastructure
- Experiment: bart-large_custom_curriculum_lr0.0001_r128_20251121_115133
- Date: 2025-11-21
- Framework: Transformers + PEFT
- Hardware: Kaggle, 2× NVIDIA T4 GPUs
## Citation
If you use this model, please cite:

```bibtex
@misc{bart-emoji-translator,
  author       = {mohamedmostafa259},
  title        = {BART Emoji Translator},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/mohamedmostafa259/bart-emoji-translator}}
}
```