whisper-small-fa

This model is a fine-tuned version of openai/whisper-small on an unknown dataset. It was trained as an experiment; more data and longer training would likely improve the results. It achieves the following results on the evaluation set:

Loss: 0.1537
Wer: 19.2460
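
The Wer value is a word error rate, which appears to be reported as a percentage. The evaluation script is not part of this card, so the following is only a minimal sketch of how such a figure is commonly derived: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` and the sample strings are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Real evaluations usually normalize text (punctuation, casing, Unicode forms) before scoring; this sketch skips that step.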

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 4000
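
To make the schedule concrete, here is a minimal sketch of the learning rate implied by the values above, assuming the "linear" scheduler means linear warmup from zero followed by linear decay to zero (the behavior of `get_linear_schedule_with_warmup` in Transformers). The function name `linear_schedule_lr` is hypothetical, and the defaults are copied from the list above.

```python
def linear_schedule_lr(step, base_lr=1e-5, warmup_steps=500, total_steps=4000):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 250, 500, 2250, 4000):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at 1e-05 at step 500 and reaches zero at step 4000, so the later checkpoints in the results table were trained with progressively smaller updates.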

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
0.2216	0.1935	1000	0.2209	28.1653
0.1947	0.3871	2000	0.1808	24.9731
0.1465	0.5806	3000	0.1621	20.7613
0.1290	0.7741	4000	0.1537	19.2460

Note: Training was stopped at 4,000 steps because the gap between training and validation loss was widening, which suggests the onset of overfitting.
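
The gap mentioned in the note can be read directly off the results table. The sketch below simply subtracts training loss from validation loss at each logged step; the values are copied from the table, and the helper name `loss_gaps` is hypothetical.

```python
# (step, training loss, validation loss) copied from the results table.
rows = [
    (1000, 0.2216, 0.2209),
    (2000, 0.1947, 0.1808),
    (3000, 0.1465, 0.1621),
    (4000, 0.1290, 0.1537),
]


def loss_gaps(rows):
    """Return (step, validation loss - training loss) for each logged step."""
    return [(step, round(val - train, 4)) for step, train, val in rows]


gaps = loss_gaps(rows)
for step, gap in gaps:
    print(f"step {step}: gap = {gap:+.4f}")
```

The gap is slightly negative at steps 1000 and 2000, then turns positive and grows at steps 3000 and 4000. Validation loss and WER were still improving at step 4000, so the widening gap is the signal here rather than a rise in validation loss itself.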

How to use

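from transformers import pipeline

# Build a speech-recognition pipeline from the fine-tuned checkpoint.
asr = pipeline(
    task="automatic-speech-recognition",
    model="kiarashQ/fa-ir-stt-whisper-small-v1",
    chunk_length_s=30,        # split long audio into 30-second chunks
    stride_length_s=(5, 5),   # 5 seconds of overlap on each side of a chunk
    return_timestamps=False,
)

# "example.wav" is a placeholder path to a local audio file.
out = asr("example.wav")
print(out["text"])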
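
For more than one file, a small wrapper can collect the transcripts. This is a minimal sketch rather than part of the model's API: `transcribe_files` is a hypothetical helper that accepts any callable behaving like the pipeline above (it takes a path and returns a dict with a "text" key), which also makes it easy to try without downloading the model.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def transcribe_files(
    asr: Callable[..., Dict[str, Any]],
    paths: Iterable[str],
    generate_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, str]:
    """Run `asr` on each path and return {path: stripped transcript}."""
    results: Dict[str, str] = {}
    for path in paths:
        if generate_kwargs:
            # Forwarded untouched; the ASR pipeline passes these to generate().
            out = asr(path, generate_kwargs=generate_kwargs)
        else:
            out = asr(path)
        results[path] = out["text"].strip()
    return results


# Stand-in for the real pipeline so the sketch runs offline.
def fake_asr(path, **kwargs):
    return {"text": f" transcript of {path} "}


print(transcribe_files(fake_asr, ["a.wav", "b.wav"]))
```

With the real pipeline, the call would be `transcribe_files(asr, ["a.wav", "b.wav"])`, where the file names are placeholders. Whisper's generation accepts `language` and `task` arguments, so something like `generate_kwargs={"language": "fa", "task": "transcribe"}` could be passed, though whether forcing the language helps this checkpoint has not been verified here.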

Framework versions

Transformers 4.56.2
Pytorch 2.8.0+cu128
Datasets 4.1.1
Tokenizers 0.22.1

Model size: 0.2B params (Safetensors)
Tensor type: F32
