COMPARISON TABLE OF DIFFERENT CORPORA TAKEN FROM LDC



Characteristics of the query on the LDC website:

Language(s): Spanish

DCMI Type(s): Sound

Application(s): speech recognition



No. | LDC Catalog No. | Item Name | Author(s) | Release Date | Member Year(s) | DCMI Type(s) | Sample Type | Sample Rate (Hz) | Data Source(s) | Application(s) | Language(s)
1 | LDC2001S91 | 1997 HUB4 Broadcast News Evaluation Non-English Test Material | Jonathan Fiscus, John Garofolo, Mark Przybocki, William Fisher, David Pallett | Not Specified | 2001 | Sound | Not Specified | Not Specified | broadcast news | speech recognition | Spanish, Mandarin Chinese
2 | LDC2002S25 | 1997 HUB5 Spanish Evaluation | Not Specified | December 22, 2002 | 2002 | Sound | ulaw | 8000 | telephone conversations | speech recognition | Spanish
3 | LDC2012S01 | 2006 NIST Speaker Recognition Evaluation Test Set Part 2 | NIST Multimodal Information Group | January 19, 2012 | 2012 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Urdu, Thai, Spanish, Russian, Korean, Hindi, Persian, English, Mandarin Chinese, Bengali, Standard Arabic, Dari, Iranian Persian, Chinese, Arabic
4 | LDC2011S05 | 2008 NIST Speaker Recognition Evaluation Training Set Part 1 | NIST Multimodal Information Group | August 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Wu Chinese, Vietnamese, Uzbek, Urdu, Tigrinya, Thai, Tagalog, Spanish, Russian, Panjabi, Min Nan Chinese, Lao, Korean, Central Khmer, Georgian, Japanese, Italian, Hindi, Persian, English, Mandarin Chinese, Bengali, Egyptian Arabic, Moroccan Arabic, Northern Khmer, Dari, Iranian Persian, Chinese, Arabic
5 | LDC94S17 | OGI Multilanguage Corpus | Ronald Cole, Yeshwant Muthusamy | Not Specified | 1994 | Sound | 1-channel pcm compressed | 8000 | telephone speech | speech recognition | Vietnamese, Tamil, Korean, Japanese, Hindi, French, English, German, Spanish, Mandarin Chinese, Persian, Dari, Iranian Persian
6 | LDC95S28 | LATINO-40 Spanish Read News | Jared Bernstein, Bill Grundy, Elizabeth Rosenfeld, Amir Najmi, Psi Mankoski | Not Specified | 1995 | Sound | 1-channel pcm | 16000 | microphone speech | speech recognition | Spanish
7 | LDC96S35 | CALLHOME Spanish Speech | Alexandra Canavan, George Zipperlen | Not Specified | 1996, 1997 | Sound | 2-channel ulaw | 8000 | telephone conversations | speech recognition | Spanish
8 | LDC96S41 | VAHA (POLYPHONE II) | Yeshwant Muthusamy | Not Specified | 1996 | Sound | 1-channel ulaw | 8000 | telephone speech | speech recognition | Spanish
9 | LDC98S70 | HUB5 Spanish Telephone Speech Corpus | Not Specified | Not Specified | 1998 | Sound | 2-channel ulaw | 8000 | telephone conversations | speech recognition | Spanish
10 | LDC98S74 | 1997 Spanish Broadcast News Speech (HUB4-NE) | Not Specified | Not Specified | 1998 | Sound | 1-channel pcm | 16000 | broadcast news | speech recognition | Spanish
11 | LDC2006S37 | West Point Heroico Spanish Speech | John Morgan | October 25, 2006 | 2006 | Sound | pcm | 22050 | microphone speech | speech recognition | Spanish
12 | LDC2006S31 | 2003 NIST Language Recognition Evaluation | Alvin Martin, Mark Przybocki | June 15, 2006 | 2006 | Sound | ulaw | 8000 | telephone conversations | speech recognition | Vietnamese, Tamil, Spanish, Iranian Persian, Korean, Japanese, Hindi, French, English, German, Mandarin Chinese, Egyptian Arabic
13 | LDC2008S05 | 2005 NIST Language Recognition Evaluation | Audrey Le, Alvin Martin, Hannah Hadfield, Jacques de Villiers, John-Paul Hosom, Jan van Santen | June 16, 2008 | 2008 | Sound | ulaw | 8000 | telephone conversations | speech recognition, language identification | Tamil, Korean, Japanese, Hindi, English, Spanish, Mandarin Chinese
14 | LDC2010S01 | Fisher Spanish Speech | David Graff, Shudong Huang, Ingrid Cartagena, Kevin Walker, Christopher Cieri | February 22, 2010 | 2010 | Sound | 2-channel μ-law | 8000 | telephone conversations | speech recognition | Spanish
15 | LDC2011S01 | 2005 NIST Speaker Recognition Evaluation Training Data | NIST Multimodal Information Group | May 24, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech | speech recognition | Spanish, Russian, English, Mandarin Chinese, Arabic
16 | LDC2011S04 | 2005 NIST Speaker Recognition Evaluation Test Data | NIST Multimodal Information Group | July 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech | speech recognition | Spanish, Russian, English, Mandarin Chinese, Arabic
17 | LDC2011S07 | 2008 NIST Speaker Recognition Evaluation Training Set Part 2 | NIST Multimodal Information Group | September 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Wu Chinese, Vietnamese, Uzbek, Urdu, Tigrinya, Thai, Tagalog, Spanish, Russian, Panjabi, Min Nan Chinese, Lao, Korean, Central Khmer, Georgian, Japanese, Italian, Hindi, Persian, English, Mandarin Chinese, Bengali, Egyptian Arabic, Moroccan Arabic, Northern Khmer, Dari, Iranian Persian, Chinese, Arabic
18 | LDC2011S10 | 2006 NIST Speaker Recognition Evaluation Test Set Part 1 | NIST Multimodal Information Group | December 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Urdu, Thai, Spanish, Russian, Korean, Hindi, Persian, English, Mandarin Chinese, Bengali, Standard Arabic, Dari, Iranian Persian, Chinese, Arabic
19 | LDC2014S05 | Hispanic-English Database | William Byrne, Eva Knodt, Jared Bernstein, Farzhad Emami | May 15, 2014 | 2014 | Sound | pcm | 16000 | microphone speech | spoken dialogue modeling, speech recognition, speech activity detection, speaker identification | Spanish, English
20 | LDC2014S08 | United Nations Proceedings Speech | Kevin Chay, Cecilia Elizalde, Michal Ziemski | October 15, 2014 | 2014 | Sound | flac | 22050 | microphone speech | speech recognition, language identification | English, Mandarin Chinese, Standard Arabic, French, Russian, Spanish
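
The query above returns every corpus that includes Spanish, so many rows are multilingual evaluation sets. As a rough illustration of how the list could be narrowed further, the following is a minimal Python sketch. The `Corpus` record and the `spanish_only` helper are hypothetical names introduced here (they are not part of any LDC tool), and only a handful of rows are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Corpus:
    """Hypothetical record holding a subset of the table's columns."""
    catalog_no: str
    name: str
    sample_rate_hz: int
    data_source: str
    languages: tuple


# A few rows copied from the table (not the full list).
CORPORA = [
    Corpus("LDC95S28", "LATINO-40 Spanish Read News", 16000,
           "microphone speech", ("Spanish",)),
    Corpus("LDC96S35", "CALLHOME Spanish Speech", 8000,
           "telephone conversations", ("Spanish",)),
    Corpus("LDC2006S37", "West Point Heroico Spanish Speech", 22050,
           "microphone speech", ("Spanish",)),
    Corpus("LDC2010S01", "Fisher Spanish Speech", 8000,
           "telephone conversations", ("Spanish",)),
    Corpus("LDC2014S05", "Hispanic-English Database", 16000,
           "microphone speech", ("Spanish", "English")),
]


def spanish_only(corpora, min_rate_hz=0):
    """Keep corpora whose only language is Spanish and whose
    sample rate is at least min_rate_hz."""
    return [
        c for c in corpora
        if c.languages == ("Spanish",) and c.sample_rate_hz >= min_rate_hz
    ]


# Spanish-only corpora sampled at 16 kHz or higher.
wideband = spanish_only(CORPORA, min_rate_hz=16000)
print([c.catalog_no for c in wideband])
```

With the rows listed, the filter keeps LDC95S28 and LDC2006S37. The 8000 Hz telephone corpora and the bilingual Hispanic-English Database are excluded.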