Model Card for DannyAI/phi4_african_history_lora_ds2

This is a LoRA fine-tuned version of microsoft/Phi-4-mini-instruct for African history question answering, trained on the DannyAI/African-History-QA-Dataset dataset. It achieves a loss of 1.5099 on the validation set.

Model Details

Model Description

  • Developed by: Daniel Ihenacho
  • Funded by: Daniel Ihenacho
  • Shared by: Daniel Ihenacho
  • Model type: Text Generation
  • Language(s) (NLP): English
  • License: mit
  • Finetuned from model: microsoft/Phi-4-mini-instruct

Uses

This model is intended for question answering about African history.

Out-of-Scope Use

The model can generate text on topics beyond African history, but it has not been trained or evaluated for such use and should not be relied on outside that domain.

How to Get Started with the Model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel

model_id = "microsoft/Phi-4-mini-instruct"

tokeniser = AutoTokenizer.from_pretrained(model_id)

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=False,
)

# Load the fine-tuned LoRA adapter on top of the base model
lora_id = "DannyAI/phi4_african_history_lora_ds2"
lora_model = PeftModel.from_pretrained(model, lora_id)

generator = pipeline(
    "text-generation",
    model=lora_model,
    tokenizer=tokeniser,
)

def generate_answer(question: str) -> str:
    """Generates an answer for the given question using the fine-tuned LoRA model."""
    messages = [
        {"role": "system", "content": "You are a helpful AI assistant specialised in African history which gives concise answers to questions asked."},
        {"role": "user", "content": question},
    ]
    output = generator(
        messages,
        max_new_tokens=2048,
        do_sample=False,  # greedy decoding; temperature is ignored when sampling is off
        return_full_text=False,
    )
    return output[0]["generated_text"].strip()

question = "What is the significance of African feminist scholarly activism in contemporary resistance movements?"
print(generate_answer(question))

Example output:

African feminist scholarly activism is significant in contemporary resistance movements as it provides a critical framework for understanding and addressing the specific challenges faced by African women in the context of global capitalism, neocolonialism, and patriarchal structures.

Training Details

Training Data

The model was fine-tuned on the DannyAI/African-History-QA-Dataset dataset.

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 1.6515        | 0.3784 | 100  | 1.6736          |
| 1.5844        | 0.7569 | 200  | 1.6175          |
| 1.6068        | 1.1325 | 300  | 1.5855          |
| 1.6075        | 1.5109 | 400  | 1.5679          |
| 1.5188        | 1.8893 | 500  | 1.5525          |
| 1.4248        | 2.2649 | 600  | 1.5423          |
| 1.5465        | 2.6433 | 700  | 1.5363          |
| 1.454         | 3.0189 | 800  | 1.5331          |
| 1.5759        | 3.3974 | 900  | 1.5275          |
| 1.4626        | 3.7758 | 1000 | 1.5268          |
| 1.4861        | 4.1514 | 1100 | 1.5230          |
| 1.4863        | 4.5298 | 1200 | 1.5232          |
| 1.4312        | 4.9082 | 1300 | 1.5185          |
| 1.5311        | 5.2838 | 1400 | 1.5193          |
| 1.5135        | 5.6623 | 1500 | 1.5179          |
| 1.4092        | 6.0378 | 1600 | 1.5144          |
| 1.5621        | 6.4163 | 1700 | 1.5145          |
| 1.485         | 6.7947 | 1800 | 1.5147          |
| 1.4301        | 7.1703 | 1900 | 1.5109          |
| 1.5346        | 7.5487 | 2000 | 1.5156          |
| 1.4597        | 7.9272 | 2100 | 1.5124          |
| 1.4548        | 8.3027 | 2200 | 1.5118          |
| 1.4485        | 8.6812 | 2300 | 1.5108          |
| 1.4466        | 9.0568 | 2400 | 1.5116          |
| 1.4672        | 9.4352 | 2500 | 1.5132          |
| 1.4881        | 9.8136 | 2600 | 1.5099          |

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
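
A minimal sketch of how these hyperparameters might map onto transformers.TrainingArguments; the output directory is a placeholder, and the 100-step evaluation interval is inferred from the results table above:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi4-african-history-lora",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    seed=42,
    bf16=True,              # matches the DeepSpeed configuration below
    eval_strategy="steps",
    eval_steps=100,         # evaluation interval inferred from the results table
)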

DeepSpeed Configuration

{
    "fp16": { "enabled": false },
    "bf16": { "enabled": true },
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": { 
            "device": "cpu", 
            "pin_memory": true 
        },
        "overlap_comm": true,
        "contiguous_gradients": true,
        "reduce_bucket_size": "auto"
    },
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto"
}
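
The same configuration can be handed to the Hugging Face Trainer through the deepspeed argument of TrainingArguments, either as a path to a JSON file or as an already loaded dict. A minimal sketch (output directory is a placeholder):

from transformers import TrainingArguments

# The DeepSpeed settings above, expressed as a Python dict.
ds_config = {
    "fp16": {"enabled": False},
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
        "reduce_bucket_size": "auto",
    },
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

training_args = TrainingArguments(
    output_dir="phi4-african-history-lora",  # placeholder path
    bf16=True,
    deepspeed=ds_config,  # a path to an equivalent JSON file also works
)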

LoRA Configuration

  • r: 8
  • lora_alpha: 16
  • target_modules: ["q_proj", "v_proj", "k_proj", "o_proj"]
  • lora_dropout: 0.05 (kept low because the dataset is small)
  • bias: "none"
  • task_type: "CAUSAL_LM"
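
A minimal sketch of how this adapter configuration could be constructed with peft and attached to the base model (variable names are illustrative):

import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-4-mini-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.05,  # kept low because the dataset is small
    bias="none",
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # only the LoRA adapter weights are trainable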

Evaluation

Metrics

| Model            | BERTScore | TinyMMLU | TinyTruthfulQA |
|------------------|-----------|----------|----------------|
| Base model       | 0.88868   | 0.6837   | 0.49745        |
| Fine-tuned model | 0.90726   | 0.67788  | 0.43822        |
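
For reference, BERTScore can be computed with the evaluate library along these lines; the prediction and reference strings below are placeholders for illustration, not the actual evaluation data:

import evaluate

bertscore = evaluate.load("bertscore")

# Placeholder model output and gold answer, purely for illustration.
predictions = ["The Mali Empire rose to prominence in West Africa in the 13th century."]
references = ["The Mali Empire emerged in West Africa during the 13th century."]

results = bertscore.compute(predictions=predictions, references=references, lang="en")
print(sum(results["f1"]) / len(results["f1"]))  # mean F1, comparable to the scores above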

Compute Infrastructure

Runpod.

Hardware

Runpod A40 GPU instance

Framework versions

  • PEFT 0.18.1
  • Transformers 4.57.6
  • Pytorch 2.4.1+cu124
  • Datasets 4.5.0
  • Tokenizers 0.22.2

Citation

If you use this model, please cite:

@misc{Ihenacho2026phi4_african_history_lora_ds2,
  author    = {Daniel Ihenacho},
  title     = {phi4_african_history_lora_ds2},
  year      = {2026},
  publisher = {Hugging Face Models},
  url       = {https://huggingface.co/DannyAI/phi4_african_history_lora_ds2},
  urldate   = {2026-01-27},
}

Model Card Authors

Daniel Ihenacho

Model Card Contact
