We share, support, and encourage everyone to contribute to Arabic speech resources. Numerous efforts have gone into producing spoken Arabic datasets. From the CallHome task (1996/97 NIST benchmark) to the Global Autonomous Language Exploitation (GALE) program (2006-2009), many resources have been created.


ASR Resources

QASR Dataset: 2000 hours

MGB-2 Dataset: 1200 hours

MGB-3 Dataset: 16 hours

MGB-5 Dataset: 62 hours

ESCWA-CS Dataset: 2.8 hours

DACS Dataset: 2 hours

Arabic Dialect Identification Resources

MGB-3 [ADI-5]: 50 hours

MGB-5 [ADI-17]: 3000 hours