# BartTayFinal-test
This model is a fine-tuned version of [FiveC/BartTay](https://huggingface.co/FiveC/BartTay) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 0.1129
- SacreBLEU: 31.5508
- chrF++: 41.2951
- BERTScore F1: 0.8234
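The card does not include the evaluation script, so here is a minimal sketch of how SacreBLEU, chrF++, and BERTScore F1 can be computed with the Hugging Face `evaluate` library; the example strings and the `lang="en"` setting are assumptions, not details taken from this model's actual evaluation.

```python
# Hypothetical metric computation with the `evaluate` library.
import evaluate

sacrebleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
bertscore = evaluate.load("bertscore")

predictions = ["the cat sat on the mat"]          # model outputs (placeholder)
references = [["the cat is sitting on the mat"]]  # one list of references per prediction

bleu = sacrebleu.compute(predictions=predictions, references=references)
# word_order=2 turns chrF into chrF++ (adds word bigram statistics)
chrfpp = chrf.compute(predictions=predictions, references=references, word_order=2)
# bertscore takes flat reference strings and needs a language (assumed English here)
bert = bertscore.compute(
    predictions=predictions,
    references=[r[0] for r in references],
    lang="en",
)

print(f"SacreBLEU:    {bleu['score']:.4f}")
print(f"chrF++:       {chrfpp['score']:.4f}")
print(f"BERTScore F1: {sum(bert['f1']) / len(bert['f1']):.4f}")
```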
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a sketch of the equivalent `Seq2SeqTrainingArguments` follows the list:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
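These settings map onto `transformers` `Seq2SeqTrainingArguments` roughly as follows. The field values mirror the list above; the output directory and anything not listed are assumptions, not the exact training script.

```python
# Minimal sketch of the listed hyperparameters as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="BartTayFinal-test",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",       # AdamW (torch fused)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,                       # "Native AMP" mixed precision
)
```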
### Training results
| Training Loss | Epoch | Step | Validation Loss | SacreBLEU | chrF++ | BERTScore F1 |
|---|---|---|---|---|---|---|
| 0.2708 | 0.0999 | 548 | 0.1762 | 6.1958 | 14.3337 | 0.7402 |
| 0.1967 | 0.1998 | 1096 | 0.1543 | 13.0595 | 22.6490 | 0.7681 |
| 0.1653 | 0.2997 | 1644 | 0.1433 | 16.4281 | 26.5054 | 0.7790 |
| 0.148 | 0.3996 | 2192 | 0.1372 | 18.6916 | 29.1575 | 0.7880 |
| 0.1334 | 0.4995 | 2740 | 0.1309 | 20.7037 | 30.9321 | 0.7929 |
| 0.1234 | 0.5995 | 3288 | 0.1291 | 21.8427 | 31.8394 | 0.7953 |
| 0.1153 | 0.6994 | 3836 | 0.1260 | 23.2862 | 33.1552 | 0.7983 |
| 0.1123 | 0.7993 | 4384 | 0.1231 | 24.3244 | 34.1894 | 0.8022 |
| 0.1043 | 0.8992 | 4932 | 0.1210 | 25.3951 | 35.1031 | 0.8037 |
| 0.0982 | 0.9991 | 5480 | 0.1201 | 25.6618 | 35.4972 | 0.8048 |
| 0.0869 | 1.0990 | 6028 | 0.1193 | 25.8156 | 35.9535 | 0.8083 |
| 0.0857 | 1.1989 | 6576 | 0.1179 | 26.9340 | 36.8392 | 0.8107 |
| 0.0815 | 1.2988 | 7124 | 0.1179 | 27.6491 | 37.4053 | 0.8114 |
| 0.08 | 1.3987 | 7672 | 0.1172 | 28.0729 | 37.7781 | 0.8126 |
| 0.0781 | 1.4986 | 8220 | 0.1158 | 28.3941 | 38.2541 | 0.8146 |
| 0.0751 | 1.5985 | 8768 | 0.1145 | 28.9190 | 38.6033 | 0.8150 |
| 0.0743 | 1.6985 | 9316 | 0.1133 | 29.5192 | 39.0347 | 0.8163 |
| 0.0712 | 1.7984 | 9864 | 0.1131 | 29.9176 | 39.4411 | 0.8181 |
| 0.0714 | 1.8983 | 10412 | 0.1122 | 30.1874 | 39.6889 | 0.8190 |
| 0.069 | 1.9982 | 10960 | 0.1115 | 30.7540 | 40.5206 | 0.8205 |
| 0.0591 | 2.0981 | 11508 | 0.1148 | 30.3703 | 40.1852 | 0.8208 |
| 0.059 | 2.1980 | 12056 | 0.1139 | 30.3753 | 40.3092 | 0.8220 |
| 0.0583 | 2.2979 | 12604 | 0.1140 | 30.8041 | 40.6839 | 0.8216 |
| 0.058 | 2.3978 | 13152 | 0.1129 | 31.5508 | 41.2951 | 0.8234 |
| 0.0577 | 2.4977 | 13700 | 0.1126 | 30.9483 | 40.6855 | 0.8231 |
| 0.0564 | 2.5976 | 14248 | 0.1123 | 30.8206 | 40.7765 | 0.8235 |
| 0.0571 | 2.6975 | 14796 | 0.1118 | 31.1163 | 41.0993 | 0.8230 |
### Framework versions
- Transformers 4.57.1
- PyTorch 2.8.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.1
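The card ships no usage snippet, so the following is a hypothetical loading and inference sketch; the repo id `FiveC/BartTayFinal-test`, the input text, and the generation settings are assumptions, not details confirmed by this card.

```python
# Hypothetical inference sketch for this seq2seq checkpoint.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "FiveC/BartTayFinal-test"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Example source text", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```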