# Khowar Automatic Speech Recognition (ASR) Model

This is a low-resource Automatic Speech Recognition (ASR) model for the Khowar language, fine-tuned from OpenAI's Whisper-base model. It is designed to transcribe spoken Khowar into text.
## Model Overview
- Language: Khowar (primarily spoken in Chitral, Pakistan)
- Base Model: OpenAI Whisper-base
- Pipeline: Automatic Speech Recognition (ASR)
- Intended Use: Transcription of Khowar audio, research in low-resource ASR, multilingual speech processing
- Tags: ASR, Khowar, Chitrali, low-resource, multilingual
## Features
- Supports Khowar speech recognition with reasonable accuracy despite low-resource data.
- Can be integrated into applications for real-time or batch transcription.
- Provides a foundation for building multilingual ASR systems including regional languages of Pakistan.
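Whisper-family models process audio in fixed 30-second windows, so batch transcription of longer recordings typically means splitting the audio into chunks and transcribing each one. A minimal sketch of the chunking step (the helper name and chunk length parameter are illustrative, not part of this model's API):

```python
import numpy as np

def chunk_audio(audio: np.ndarray, sr: int = 16000, chunk_s: float = 30.0) -> list:
    """Split a mono waveform into fixed-length chunks matching Whisper's 30 s window."""
    n = int(sr * chunk_s)  # samples per chunk
    return [audio[i:i + n] for i in range(0, len(audio), n)]
```

Each chunk can then be fed to the processor and model independently, and the transcriptions concatenated.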
## Model Training
- Data: Collected from local speakers of Khowar, covering a range of accents and contexts.
- Preprocessing: Audio normalized, resampled to 16 kHz.
- Fine-tuning: Whisper-base model fine-tuned on Khowar dataset.
- Evaluation: Model evaluated using Word Error Rate (WER).
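Word Error Rate, the metric mentioned above, is the word-level Levenshtein distance (substitutions, insertions, deletions) between the reference and hypothesis transcripts, divided by the number of reference words. A minimal pure-Python sketch for illustration (evaluation toolkits such as `jiwer` provide a production version):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits to turn the first i reference words into the first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```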
## Intended Use
- Transcribing Khowar audio recordings.
- Research in multilingual and low-resource ASR.
- Applications requiring integration of Khowar speech-to-text capabilities.
## How to Use

You can use the model with Hugging Face Transformers. The processor expects a waveform array sampled at 16 kHz, not a file path, so load the audio first (here with `librosa`):

```python
import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor

processor = WhisperProcessor.from_pretrained("Aizazayyubi/khowar-whisper-asr")
model = WhisperForConditionalGeneration.from_pretrained("Aizazayyubi/khowar-whisper-asr")

# Load the audio and resample it to the 16 kHz rate Whisper expects
audio, sr = librosa.load("path/to/audio.wav", sr=16000)

input_features = processor(audio, sampling_rate=sr, return_tensors="pt").input_features
predicted_ids = model.generate(input_features)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(transcription)
```
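If an input file is not already at 16 kHz, it must be resampled before being passed to the processor. In practice `librosa.load(..., sr=16000)` or `torchaudio` handles this; the sketch below shows the idea with plain NumPy linear interpolation plus a simple peak normalization, matching the preprocessing described above (function names are illustrative):

```python
import numpy as np

def resample_linear(audio: np.ndarray, orig_sr: int, target_sr: int = 16000) -> np.ndarray:
    """Resample a mono waveform by linear interpolation (illustrative only)."""
    duration = len(audio) / orig_sr
    n_target = int(round(duration * target_sr))
    old_t = np.arange(len(audio)) / orig_sr   # original sample timestamps
    new_t = np.arange(n_target) / target_sr   # target sample timestamps
    return np.interp(new_t, old_t, audio)

def peak_normalize(audio: np.ndarray) -> np.ndarray:
    """Scale the waveform so its peak amplitude is 1.0."""
    peak = np.max(np.abs(audio))
    return audio / peak if peak > 0 else audio
```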
## Model Tree

- Model: Aizazayyubi/khowar-whisper-asr
- Base model: openai/whisper-base