COMPARISON TABLE OF DIFFERENT CORPORA TAKEN FROM THE LDC CATALOG



Search criteria used on the LDC website:

Language(s): Hindi

DCMI Type(s): Sound

Application(s): speech recognition



No. | LDC Catalog No. | Item Name | Author(s) | Release Date | Member Year(s) | DCMI Type(s) | Sample Type | Sample Rate (Hz) | Data Source(s) | Application(s) | Language(s)
1 | LDC2012S01 | 2006 NIST Speaker Recognition Evaluation Test Set Part 2 | NIST Multimodal Information Group | January 19, 2012 | 2012 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Urdu, Thai, Spanish, Russian, Korean, Hindi, Persian, English, Mandarin Chinese, Bengali, Standard Arabic, Dari, Iranian Persian, Chinese, Arabic
2 | LDC2011S05 | 2008 NIST Speaker Recognition Evaluation Training Set Part 1 | NIST Multimodal Information Group | August 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Wu Chinese, Vietnamese, Uzbek, Urdu, Tigrinya, Thai, Tagalog, Spanish, Russian, Panjabi, Min Nan Chinese, Lao, Korean, Central Khmer, Georgian, Japanese, Italian, Hindi, Persian, English, Mandarin Chinese, Bengali, Egyptian Arabic, Moroccan Arabic, Northern Khmer, Dari, Iranian Persian, Chinese, Arabic
3 | LDC94S17 | OGI Multilanguage Corpus | Ronald Cole, Yeshwant Muthusamy | Not Specified | 1994 | Sound | 1-channel pcm compressed | 8000 | telephone speech | speech recognition | Vietnamese, Tamil, Korean, Japanese, Hindi, French, English, German, Spanish, Mandarin Chinese, Persian, Dari, Iranian Persian
4 | LDC2006S31 | 2003 NIST Language Recognition Evaluation | Alvin Martin, Mark Przybocki | June 15, 2006 | 2006 | Sound | ulaw | 8000 | telephone conversations | speech recognition | Vietnamese, Tamil, Spanish, Iranian Persian, Korean, Japanese, Hindi, French, English, German, Mandarin Chinese, Egyptian Arabic
5 | LDC2008S05 | 2005 NIST Language Recognition Evaluation | Audrey Le, Alvin Martin, Hannah Hadfield, Jacques de Villiers, John-Paul Hosom, Jan van Santen | June 16, 2008 | 2008 | Sound | ulaw | 8000 | telephone conversations | speech recognition, language identification | Tamil, Korean, Japanese, Hindi, English, Spanish, Mandarin Chinese
6 | LDC2011S07 | 2008 NIST Speaker Recognition Evaluation Training Set Part 2 | NIST Multimodal Information Group | September 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Wu Chinese, Vietnamese, Uzbek, Urdu, Tigrinya, Thai, Tagalog, Spanish, Russian, Panjabi, Min Nan Chinese, Lao, Korean, Central Khmer, Georgian, Japanese, Italian, Hindi, Persian, English, Mandarin Chinese, Bengali, Egyptian Arabic, Moroccan Arabic, Northern Khmer, Dari, Iranian Persian, Chinese, Arabic
7 | LDC2011S08 | 2008 NIST Speaker Recognition Evaluation Test Set | NIST Multimodal Information Group | October 21, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Wu Chinese, Vietnamese, Uzbek, Urdu, Thai, Tagalog, Tamil, Russian, Panjabi, Min Nan Chinese, Lao, Korean, Japanese, Italian, Hindi, Persian, Mandarin Chinese, Bengali, Egyptian Arabic, Moroccan Arabic, Dari, Iranian Persian, English, Chinese, Arabic
8 | LDC2011S09 | 2006 NIST Speaker Recognition Evaluation Training Set | NIST Multimodal Information Group | November 16, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech | speech recognition | Yue Chinese, Urdu, Thai, Russian, Korean, Hindi, English, Mandarin Chinese, Bengali, Standard Arabic, Chinese, Arabic
9 | LDC2011S10 | 2006 NIST Speaker Recognition Evaluation Test Set Part 1 | NIST Multimodal Information Group | December 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Urdu, Thai, Spanish, Russian, Korean, Hindi, Persian, English, Mandarin Chinese, Bengali, Standard Arabic, Dari, Iranian Persian, Chinese, Arabic
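
The query at the top of this document (Language: Hindi, DCMI Type: Sound, Application: speech recognition) can be reproduced against locally stored rows. The following is a minimal sketch, not the LDC website's actual search logic: the CatalogEntry class, the matches function, and the "HYPOTHETICAL-001" record are illustrative assumptions introduced here. The two real records copy their values from rows 3 and 5 of the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One catalog row, reduced to the fields used by the query (hypothetical structure)."""
    catalog_no: str
    dcmi_types: frozenset
    applications: frozenset
    languages: frozenset


def matches(entry, language, dcmi_type, application):
    """Return True when the entry lists all three requested values."""
    return (
        language in entry.languages
        and dcmi_type in entry.dcmi_types
        and application in entry.applications
    )


ENTRIES = [
    # Row 3 of the table.
    CatalogEntry(
        "LDC94S17",
        frozenset({"Sound"}),
        frozenset({"speech recognition"}),
        frozenset({
            "Vietnamese", "Tamil", "Korean", "Japanese", "Hindi", "French",
            "English", "German", "Spanish", "Mandarin Chinese", "Persian",
            "Dari", "Iranian Persian",
        }),
    ),
    # Row 5 of the table.
    CatalogEntry(
        "LDC2008S05",
        frozenset({"Sound"}),
        frozenset({"speech recognition", "language identification"}),
        frozenset({
            "Tamil", "Korean", "Japanese", "Hindi", "English", "Spanish",
            "Mandarin Chinese",
        }),
    ),
    # Made-up record, NOT a real LDC item: it only shows the filter rejecting a row.
    CatalogEntry(
        "HYPOTHETICAL-001",
        frozenset({"Sound"}),
        frozenset({"speech recognition"}),
        frozenset({"English"}),
    ),
]

# Apply the same three criteria as the query described above.
hits = [e.catalog_no for e in ENTRIES if matches(e, "Hindi", "Sound", "speech recognition")]
print(hits)
```

Storing each multi-valued column as a set keeps the membership test exact, so "Hindi" cannot accidentally match inside a longer language name the way a substring search on the raw row text could.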