# BartTayFinal-test
This model is a fine-tuned version of [FiveC/BartTay](https://huggingface.co/FiveC/BartTay) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 0.1129
- SacreBLEU: 31.5508
- chrF++: 41.2951
- BERTScore F1: 0.8234
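The card does not include the evaluation script, so here is a minimal sketch of how SacreBLEU, chrF++, and BERTScore F1 can be computed with the Hugging Face `evaluate` library; the example strings and the `lang="en"` setting are assumptions, not details taken from this model's actual evaluation.

```python
# Hypothetical metric computation with the `evaluate` library.
import evaluate

sacrebleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
bertscore = evaluate.load("bertscore")

predictions = ["the cat sat on the mat"]          # model outputs (placeholder)
references = [["the cat is sitting on the mat"]]  # one list of references per prediction

bleu = sacrebleu.compute(predictions=predictions, references=references)
# word_order=2 turns chrF into chrF++ (adds word bigram statistics)
chrfpp = chrf.compute(predictions=predictions, references=references, word_order=2)
# bertscore takes flat reference strings and needs a language (assumed English here)
bert = bertscore.compute(
    predictions=predictions,
    references=[r[0] for r in references],
    lang="en",
)

print(f"SacreBLEU:    {bleu['score']:.4f}")
print(f"chrF++:       {chrfpp['score']:.4f}")
print(f"BERTScore F1: {sum(bert['f1']) / len(bert['f1']):.4f}")
```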
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a sketch of the equivalent `Seq2SeqTrainingArguments` follows the list:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
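These settings map onto `transformers` `Seq2SeqTrainingArguments` roughly as follows. The field values mirror the list above; the output directory and anything not listed are assumptions, not the exact training script.

```python
# Minimal sketch of the listed hyperparameters as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="BartTayFinal-test",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",       # AdamW (torch fused)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,                       # "Native AMP" mixed precision
)
```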
### Training results
| Training Loss | Epoch | Step | Validation Loss | SacreBLEU | chrF++ | BERTScore F1 |
|---|---|---|---|---|---|---|
| 0.2708 | 0.0999 | 548 | 0.1762 | 6.1958 | 14.3337 | 0.7402 |
| 0.1967 | 0.1998 | 1096 | 0.1543 | 13.0595 | 22.6490 | 0.7681 |
| 0.1653 | 0.2997 | 1644 | 0.1433 | 16.4281 | 26.5054 | 0.7790 |
| 0.148 | 0.3996 | 2192 | 0.1372 | 18.6916 | 29.1575 | 0.7880 |
| 0.1334 | 0.4995 | 2740 | 0.1309 | 20.7037 | 30.9321 | 0.7929 |
| 0.1234 | 0.5995 | 3288 | 0.1291 | 21.8427 | 31.8394 | 0.7953 |
| 0.1153 | 0.6994 | 3836 | 0.1260 | 23.2862 | 33.1552 | 0.7983 |
| 0.1123 | 0.7993 | 4384 | 0.1231 | 24.3244 | 34.1894 | 0.8022 |
| 0.1043 | 0.8992 | 4932 | 0.1210 | 25.3951 | 35.1031 | 0.8037 |
| 0.0982 | 0.9991 | 5480 | 0.1201 | 25.6618 | 35.4972 | 0.8048 |
| 0.0869 | 1.0990 | 6028 | 0.1193 | 25.8156 | 35.9535 | 0.8083 |
| 0.0857 | 1.1989 | 6576 | 0.1179 | 26.9340 | 36.8392 | 0.8107 |
| 0.0815 | 1.2988 | 7124 | 0.1179 | 27.6491 | 37.4053 | 0.8114 |
| 0.08 | 1.3987 | 7672 | 0.1172 | 28.0729 | 37.7781 | 0.8126 |
| 0.0781 | 1.4986 | 8220 | 0.1158 | 28.3941 | 38.2541 | 0.8146 |
| 0.0751 | 1.5985 | 8768 | 0.1145 | 28.9190 | 38.6033 | 0.8150 |
| 0.0743 | 1.6985 | 9316 | 0.1133 | 29.5192 | 39.0347 | 0.8163 |
| 0.0712 | 1.7984 | 9864 | 0.1131 | 29.9176 | 39.4411 | 0.8181 |
| 0.0714 | 1.8983 | 10412 | 0.1122 | 30.1874 | 39.6889 | 0.8190 |
| 0.069 | 1.9982 | 10960 | 0.1115 | 30.7540 | 40.5206 | 0.8205 |
| 0.0591 | 2.0981 | 11508 | 0.1148 | 30.3703 | 40.1852 | 0.8208 |
| 0.059 | 2.1980 | 12056 | 0.1139 | 30.3753 | 40.3092 | 0.8220 |
| 0.0583 | 2.2979 | 12604 | 0.1140 | 30.8041 | 40.6839 | 0.8216 |
| 0.058 | 2.3978 | 13152 | 0.1129 | 31.5508 | 41.2951 | 0.8234 |
| 0.0577 | 2.4977 | 13700 | 0.1126 | 30.9483 | 40.6855 | 0.8231 |
| 0.0564 | 2.5976 | 14248 | 0.1123 | 30.8206 | 40.7765 | 0.8235 |
| 0.0571 | 2.6975 | 14796 | 0.1118 | 31.1163 | 41.0993 | 0.8230 |
### Framework versions
- Transformers 4.57.1
- PyTorch 2.8.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1
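The card ships no usage snippet, so the following is a hypothetical loading and inference sketch; the repo id `FiveC/BartTayFinal-test`, the input text, and the generation settings are assumptions, not details confirmed by this card.

```python
# Hypothetical inference sketch for this seq2seq checkpoint.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "FiveC/BartTayFinal-test"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Example source text", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```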