Automatic Speech Recognition for Languages of Ethiopia 🇪🇹

🍇 Model Description

This is a multilingual Automatic Speech Recognition (ASR) model for Amharic, Tigrinya, Afaan Oromo, Sidama, and Wolaytta. It is fine‑tuned from Wav2Vec2‑BERT 2.0 using the Ethio speech corpus.

Developed by: Badr al-Absi
Model type: Speech Recognition (ASR)
Languages: Amharic, Tigrinya, Afaan Oromo, Sidama, and Wolaytta
License: CC-BY-4.0
Finetuned from: facebook/w2v-bert-2.0

🎧 Direct Use

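import torch
import torchaudio
from transformers import Wav2Vec2BertProcessor, Wav2Vec2BertForCTC

model_id = "badrex/w2v-bert-2.0-ethiopian-asr"
processor = Wav2Vec2BertProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)

# Load the audio, mix it down to mono, and resample to the 16 kHz
# that the feature extractor expects.
audio, sr = torchaudio.load("audio.wav")
audio = audio.mean(dim=0)
if sr != 16_000:
    audio = torchaudio.functional.resample(audio, orig_freq=sr, new_freq=16_000)

inputs = processor(audio.numpy(), sampling_rate=16_000, return_tensors="pt")

# Run inference without tracking gradients.
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: pick the most likely token for each frame,
# then let the processor collapse repeats and remove blanks.
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]

print(transcription)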
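
The last two steps of the snippet above (per-frame argmax followed by batch_decode) amount to greedy CTC decoding. The sketch below illustrates that collapse rule in plain Python, with no model required. The toy vocabulary, the frame IDs, and the choice of 0 as the blank ID are assumptions made for illustration; the real vocabulary and blank token come from the model's processor.

```python
from itertools import groupby

def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Collapse per-frame argmax IDs into a label sequence (greedy CTC)."""
    # groupby merges runs of identical consecutive IDs;
    # the filter then drops the blank token.
    return [tok for tok, _ in groupby(frame_ids) if tok != blank_id]

# Hypothetical toy vocabulary (not the model's real one).
toy_vocab = {0: "<pad>", 1: "s", 2: "e", 3: "l", 4: "a", 5: "m"}

# Hypothetical per-frame predictions, as torch.argmax would produce them.
frames = [0, 1, 1, 0, 2, 3, 3, 0, 0, 4, 5, 5, 0]

ids = ctc_greedy_collapse(frames)
text = "".join(toy_vocab[i] for i in ids)
print(text)
```

A blank between two identical IDs keeps them as separate characters, which is how CTC represents doubled letters: `[1, 0, 1]` collapses to `[1, 1]`, while `[1, 1]` collapses to `[1]`.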

🔧 Downstream Use

Voice assistants
Accessibility tools
Research baselines

🚫 Out‑of‑Scope Use

Languages other than Amharic, Tigrinya, Afaan Oromo, Sidama, and Wolaytta
High‑stakes deployments without human review
Noisy audio without further tuning

⚠️ Risks & Limitations

Performance varies with accents, dialects, and recording quality.
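
One way to see how much performance varies on your own recordings is to compute the word error rate (WER) between reference transcripts and model output. The following is a minimal sketch of the standard word-level edit-distance definition, not the evaluation script used for this model. The function name and the placeholder strings are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)

# Placeholder strings for illustration only (not real model output).
wer = word_error_rate("a b c d", "a x c")
print(f"WER: {wer:.2f}")
```

Computing WER separately per language, speaker group, or recording condition makes it easier to spot where the model falls short before deploying it.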

📌 Citation

@misc{w2v_bert_ethiopian_asr,
  author = {Badr M. Abdullah},
  title = {Fine-tuning Wav2Vec2-BERT 2.0 for Ethiopian ASR},
  year = {2025},
  url = {https://huggingface.co/badrex/w2v-bert-2.0-ethiopian-asr}
}
