indic-whisper-reverse-ml-mft-1-0-1-7

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set at the final training step (580):

Loss: 0.0418
WER: 24.1165
CER: 4.3588

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 64
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 128
optimizer: OptimizerNames.ADAMW_TORCH_FUSED (fused PyTorch AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 116
training_steps: 580
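
As a rough illustration of how these settings interact, the sketch below recomputes the effective batch size and the learning rate at a few steps. It assumes a single training device (which is what 64 × 2 = 128 implies) and the usual meaning of the cosine scheduler type: linear warmup followed by a half-cosine decay to zero. The function name `lr_at` and the constant names are illustrative and do not come from the training script.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 64
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 116
TRAINING_STEPS = 580

# Assumption: one device, which is what 64 * 2 = 128 implies.
NUM_DEVICES = 1
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps.

    Assumed shape: linear warmup to LEARNING_RATE, then half-cosine decay to 0.
    """
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TRAINING_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print(f"effective batch size: {effective_batch}")
    for step in (0, 58, 116, 348, 580):
        print(f"step {step:>3}: lr = {lr_at(step):.3e}")
```

Under these assumptions the warmup covers the first 116 of 580 steps (20% of training), the peak rate of 1e-05 is reached at step 116, and the rate falls back to half the peak at step 348.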

Training Loss	Epoch	Step	Validation Loss	WER	CER
0.4154	0.9915	58	0.0491	25.3272	4.5829
0.1158	1.9744	116	0.0447	25.0	4.7898
0.0856	2.9573	174	0.0433	24.9673	4.6277
0.0846	3.9402	232	0.0428	24.7055	4.5726
0.0576	4.9231	290	0.0423	24.2801	4.3726
0.0579	5.9060	348	0.0420	24.2474	4.3760
0.0516	6.8889	406	0.0414	24.0183	4.3519
0.043	7.8718	464	0.0421	24.1492	4.4002
0.0454	8.8547	522	0.0420	23.9856	4.3691
0.0422	9.8376	580	0.0418	24.1165	4.3588
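
To make the table easier to compare across checkpoints, the sketch below copies its rows into a list and looks up the step with the lowest value in each evaluation column. The names `RESULTS` and `best_by` are illustrative; nothing here is taken from the original training or evaluation code.

```python
# Rows copied from the table above: (step, validation loss, WER, CER).
RESULTS = [
    (58, 0.0491, 25.3272, 4.5829),
    (116, 0.0447, 25.0, 4.7898),
    (174, 0.0433, 24.9673, 4.6277),
    (232, 0.0428, 24.7055, 4.5726),
    (290, 0.0423, 24.2801, 4.3726),
    (348, 0.0420, 24.2474, 4.3760),
    (406, 0.0414, 24.0183, 4.3519),
    (464, 0.0421, 24.1492, 4.4002),
    (522, 0.0420, 23.9856, 4.3691),
    (580, 0.0418, 24.1165, 4.3588),
]

# Column indices within each row.
LOSS, WER, CER = 1, 2, 3


def best_by(column: int) -> tuple:
    """Return the row with the lowest value in the given column."""
    return min(RESULTS, key=lambda row: row[column])


if __name__ == "__main__":
    for name, column in (("validation loss", LOSS), ("WER", WER), ("CER", CER)):
        row = best_by(column)
        print(f"lowest {name}: {row[column]} at step {row[0]}")
```

On these rows, validation loss and CER are lowest at step 406, while WER is lowest at step 522. The final checkpoint at step 580 is within about 0.13 WER of the best row, so the last four evaluations are close to a plateau.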

Safetensors

Model size: 0.8B params
Tensor type: BF16
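
For a rough sense of what the parameter count and tensor type imply for storage, the sketch below estimates the size of the weights alone. It assumes the rounded figure of 0.8B parameters and 2 bytes per BF16 value; activations, optimizer state, and file metadata are not included, and the helper name `weight_bytes` is illustrative.

```python
# Rounded parameter count from the card; the exact count is not stated.
PARAMS = 0.8e9

# Bytes per parameter for common tensor types.
BYTES_PER_PARAM = {"bf16": 2, "fp16": 2, "fp32": 4}


def weight_bytes(dtype: str) -> float:
    """Approximate size of the weights alone, in bytes."""
    return PARAMS * BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    for dtype in ("bf16", "fp32"):
        size = weight_bytes(dtype)
        print(f"{dtype}: ~{size / 1e9:.1f} GB ({size / 2**30:.2f} GiB)")
```

With these assumptions the BF16 weights come to roughly 1.6 GB, and about twice that if they are upcast to FP32 at load time.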