🎭 BART Emoji Translator

A fine-tuned BART model that translates English text to emoji sequences using curriculum learning and LoRA.

Model Description

This model converts natural language text into appropriate emoji representations. It was trained using a 6-phase curriculum learning approach with a strategic data retention scheme.

Base Model: facebook/bart-large
Training Method: LoRA (Low-Rank Adaptation)

Usage

from transformers import BartTokenizer, BartForConditionalGeneration
from peft import PeftModel

# Load model
base_model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")
model = PeftModel.from_pretrained(base_model, "mohamedmostafa259/bart-emoji-translator")
tokenizer = BartTokenizer.from_pretrained("mohamedmostafa259/bart-emoji-translator")

# Translate text to emojis
def translate(
    text: str,
    max_length: int = 32,
    num_beams: int = 1,
    do_sample: bool = True,
    temperature: float = 1.0,
    top_p: float = 0.4,
    top_k: int = 50
) -> str:
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        num_beams=num_beams,
        do_sample=do_sample,
        temperature=temperature,
        top_p=top_p,
        top_k=top_k
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Examples
print(translate('I am happy.'))  # 😁🤘
print(translate('I feel misunderstood.'))  # 🤬😬
print(translate('My parents want to have a new baby.'))  # 👶🤰👪
print(translate('I eat dinner with my family.'))  # 🥪🥛👪
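
A minimal batched variant of the same call, for translating several sentences at once. The translate_batch helper and the use of padding here are illustrative additions, not part of the original card:

def translate_batch(texts: list[str]) -> list[str]:
    # Pad to a common length so the batch can be encoded as one tensor
    inputs = tokenizer(texts, return_tensors="pt", padding=True)
    outputs = model.generate(
        **inputs,
        max_length=32,
        do_sample=True,
        top_p=0.4,
        top_k=50
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

print(translate_batch(["I am happy.", "I eat dinner with my family."]))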

Output Variability

This model uses sampling-based decoding (temperature and nucleus sampling) rather than deterministic beam search. As a result, the same input may occasionally produce slightly different emoji sequences across runs. This behavior is expected and reflects the model choosing between multiple valid emoji interpretations for a given sentence.
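
If reproducible outputs are needed, one option (a sketch, assuming the translate helper from the Usage section) is to seed the sampler or disable sampling entirely:

from transformers import set_seed

set_seed(42)  # fixes the RNG used by sampling, so repeated runs match
print(translate("I eat dinner with my family."))

# Deterministic alternative: beam search instead of sampling
# (the sampling-specific arguments may trigger a harmless warning)
print(translate("I eat dinner with my family.", do_sample=False, num_beams=4))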

Training Details

Training Hyperparameters

  • Learning Rate: 0.0001
  • Batch Size: 8
  • Gradient Accumulation Steps: 4
  • Max Epochs per Phase: 20
  • Early Stopping Patience: 3
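
As a rough illustration, these hyperparameters could be expressed with transformers' Seq2SeqTrainingArguments as below. The exact arguments used in training (output directory, eval/save strategies) are not published, so those parts are assumptions:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-emoji-phase",      # hypothetical output directory
    learning_rate=1e-4,                 # Learning Rate: 0.0001
    per_device_train_batch_size=8,      # Batch Size: 8
    gradient_accumulation_steps=4,      # Gradient Accumulation Steps: 4
    num_train_epochs=20,                # Max Epochs per Phase: 20
    eval_strategy="epoch",              # assumption: evaluate once per epoch
    save_strategy="epoch",
    load_best_model_at_end=True,        # restore the lowest-validation-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)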

LoRA Configuration

  • Rank (r): 128
  • Alpha: 256
  • Dropout: 0.1
  • Target Modules: q_proj, v_proj, k_proj, out_proj
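
A minimal sketch of the corresponding peft setup; the task type and the get_peft_model wrapping are assumptions consistent with a BART seq2seq adapter, not a published training script:

from peft import LoraConfig, TaskType, get_peft_model

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=128,                 # Rank
    lora_alpha=256,        # Alpha
    lora_dropout=0.1,      # Dropout
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj"],
)
peft_model = get_peft_model(base_model, lora_config)   # base_model as loaded in Usage
peft_model.print_trainable_parameters()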

Curriculum Learning Strategy

The model was trained using a 6-phase curriculum learning approach with strategic data retention:

Phase Composition

Phase               | Current Stage  | Previous Stages Included
Phase 1 (bootstrap) | Stage 1 (100%) | None
Phase 2             | Stage 2 (100%) | None (Stage 1 dropped)
Phase 3             | Stage 3 (100%) | Stage 2 (33%)
Phase 4             | Stage 4 (100%) | Stage 3 (50%), Stage 2 (33%)
Phase 5             | Stage 5 (100%) | Stage 4 (50%), Stage 3 (50%), Stage 2 (33%)
Phase 6             | Stage 6 (100%) | Stage 5 (50%), Stage 4 (50%), Stage 3 (50%), Stage 2 (33%)

Stage Retention Strategy

  • Stage 1 (Bootstrap): Used only in Phase 1, then completely dropped to prevent the model from over-relying on basic patterns
  • Stage 2 (Foundation): Retained at 33% in all subsequent phases (3-6) to maintain core vocabulary
  • Stages 3-6 (Progressive Complexity): Each retained at 50% in subsequent phases to balance learning new patterns while preventing catastrophic forgetting
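
A minimal sketch of how such retention could be implemented with the datasets library; the stage_datasets mapping and this exact sampling routine are assumptions for illustration, not the published pipeline:

from datasets import concatenate_datasets

RETENTION = {2: 0.33, 3: 0.50, 4: 0.50, 5: 0.50}  # fraction of each earlier stage kept

def build_phase_dataset(phase: int, stage_datasets: dict, seed: int = 42):
    # The current stage is always used in full; Stage 1 is dropped after Phase 1
    parts = [stage_datasets[phase]]
    for stage in range(2, phase):
        retained = stage_datasets[stage].shuffle(seed=seed)
        keep = int(RETENTION[stage] * len(retained))
        parts.append(retained.select(range(keep)))
    return concatenate_datasets(parts).shuffle(seed=seed)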

Complexity Progression

The curriculum is structured around emoji count, progressively increasing output complexity:

  1. Phase 1 (Single Emoji Foundation): Simple phrases mapping to one emoji

    • Example: "I feel very happy" → 😄
    • Example: "You pour some wine" → 🍷
  2. Phase 2 (Two Emojis): Basic two-concept expressions

    • Example: "They move to the rhythm" → 🎵 💃
    • Example: "I like to drink wine" → ❤️ 🍷
  3. Phase 3 (Three Emojis): Short sentences with three distinct concepts

    • Example: "He makes money selling his Islamic art" → 🤑 ☪️ 🎨
    • Example: "We looked for the eagle and llama on the map" → 🦅 🦙 🗺️
  4. Phase 4 (Four Emojis): Longer phrases with multiple related concepts

    • Example: "We took the car to see the nature scenery" → 🚗 ⛰️ 🌲 🏞️
    • Example: "He made breakfast with eggs and toast" → 🍳 🍞 🥛 😋
  5. Phase 5 (Five Emojis): Complex sentences requiring sequential emoji representations

    • Example: "The breakfast included eggs cheese bread and fresh milk" → 🥚 🧀 🍞 🥛 😋
    • Example: "The bar served wine champagne beer and whiskey all night" → 🍷 🥂 🍾 🍺 🥃
  6. Phase 6 (Six+ Emojis): Complex narratives with action sequences and multiple events

    • Example: "Board plane pack luggage reach beach swim drink coconut milk nap" → ✈️ 🧳 🏝️ 🏊 🥥 😴
    • Example: "Man in suit and woman in dress drink wine eat noodles listen to music get engaged" → 🤵 👗 🍷 🍜 🎵 💍

This emoji-count-based curriculum allows the model to:

  • Master single-concept mappings before handling multiple concepts
  • Gradually increase sequence length and complexity (progressive difficulty without overwhelming the model)
  • Avoid catastrophic forgetting through strategic data retention (Stage 2 at 33%, Stages 3-6 at 50%)
  • Learn compositional patterns (how emojis combine to represent complex ideas)

Training Dynamics

  • Each phase uses stage-specific validation sets to prevent metric contamination across difficulty levels
  • Early stopping with patience=3 prevents overfitting when validation loss increases
  • The model automatically loads the best checkpoint (lowest validation loss) from each phase
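
As a sketch of that loop (reusing the assumed training_args and peft_model from the earlier sketches, with hypothetical per-phase datasets), early stopping and best-checkpoint restoration map onto the standard Trainer callbacks:

from transformers import Seq2SeqTrainer, EarlyStoppingCallback

trainer = Seq2SeqTrainer(
    model=peft_model,
    args=training_args,                  # includes load_best_model_at_end=True
    train_dataset=phase_train_dataset,   # hypothetical: build_phase_dataset(phase, ...)
    eval_dataset=phase_val_dataset,      # hypothetical: stage-specific validation split
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()  # stops early if validation loss fails to improve for 3 evaluations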

Limitations

  • Works best with English input text
  • May not generate very rare or recently introduced emojis
  • Performance varies with text complexity and length
  • Optimal for text under 32 tokens

Training Infrastructure

  • Experiment: bart-large_custom_curriculum_lr0.0001_r128_20251121_115133
  • Date: 2025-11-21
  • Framework: Transformers + PEFT
  • Hardware: Kaggle, 2× NVIDIA T4 GPUs

Citation

If you use this model, please cite:

@misc{bart-emoji-translator,
  author = {mohamedmostafa259},
  title = {BART Emoji Translator},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/mohamedmostafa259/bart-emoji-translator}}
}