train_qqp_42_1767887023

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qqp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1292
  • Num Input Tokens Seen: 227720032

Model description

More information needed

Intended uses & limitations

More information needed
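
Pending author documentation, here is a minimal inference sketch. It assumes this repo hosts a standard PEFT adapter on top of the base model; the repo id rbelanec/train_qqp_42_1767887023 is taken from this card, and the prompt format is purely illustrative (the actual QQP prompt template used in training is not recorded here):

```python
# Minimal usage sketch (assumption: this repo is a standard PEFT adapter
# for meta-llama/Meta-Llama-3-8B-Instruct).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)

# Repo id from this card; adapter layout assumed standard.
model = PeftModel.from_pretrained(base, "rbelanec/train_qqp_42_1767887023")

# Illustrative prompt only; the training prompt template is not documented.
prompt = "Are these two questions duplicates? Q1: ... Q2: ..."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```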

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
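
As a sketch, the settings above expressed as transformers/peft objects. The LoRA rank and alpha are illustrative assumptions, since this card does not record the adapter configuration:

```python
# Sketch of the reported hyperparameters as transformers/peft objects.
# The LoRA settings are ASSUMPTIONS; the card does not record them.
from peft import LoraConfig
from transformers import TrainingArguments

peft_config = LoraConfig(
    r=8,              # assumed; not stated in this card
    lora_alpha=16,    # assumed; not stated in this card
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="train_qqp_42_1767887023",
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10,
)
```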

Training results

| Training Loss | Epoch  | Step    | Validation Loss | Input Tokens Seen |
|--------------:|-------:|--------:|----------------:|------------------:|
| 0.0845        | 0.5000 | 81866   | 0.2008          | 11379856          |
| 0.0388        | 1.0000 | 163732  | 0.1560          | 22767592          |
| 0.0055        | 1.5000 | 245598  | 0.1423          | 34157208          |
| 0.1723        | 2.0000 | 327464  | 0.1374          | 45539168          |
| 0.0081        | 2.5000 | 409330  | 0.1292          | 56933264          |
| 0.7173        | 3.0000 | 491196  | 0.1305          | 68313856          |
| 0.0042        | 3.5000 | 573062  | 0.1361          | 79696608          |
| 0.2785        | 4.0000 | 654928  | 0.1426          | 91087176          |
| 0.3417        | 4.5000 | 736794  | 0.1397          | 102453768         |
| 0.0026        | 5.0000 | 818660  | 0.1313          | 113858720         |
| 0.1722        | 5.5000 | 900526  | 0.1378          | 125249408         |
| 0.0015        | 6.0000 | 982392  | 0.1377          | 136629904         |
| 0.0093        | 6.5000 | 1064258 | 0.1392          | 148010560         |
| 0.2748        | 7.0000 | 1146124 | 0.1487          | 159400080         |
| 0.1673        | 7.5000 | 1227990 | 0.1518          | 170788128         |
| 0.0641        | 8.0000 | 1309856 | 0.1485          | 182174680         |
| 0.1738        | 8.5001 | 1391722 | 0.1510          | 193567608         |
| 0.1104        | 9.0001 | 1473588 | 0.1503          | 204949648         |
| 0.2339        | 9.5001 | 1555454 | 0.1556          | 216330720         |

Validation loss bottoms out at 0.1292 around epoch 2.5 and drifts upward afterward, which matches the evaluation loss reported at the top of this card.

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4