Language(s): Mandarin
DCMI Type(s): Sound
Application(s): speech recognition
No. | LDC Catalog No. | Item Name | Author(s) | Release Date | Member Year(s) | DCMI Type(s) | Sample Type | Sample Rate (Hz) | Data Source(s) | Application(s) | Language(s) |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | LDC2001S91 | 1997 HUB4 Broadcast News Evaluation Non-English Test Material | Jonathan Fiscus, John Garofolo, Mark Przybocki, William Fisher, David Pallett | Not Specified | 2001 | Sound | Not Specified | Not Specified | broadcast news | speech recognition | Spanish, Mandarin Chinese |
2 | LDC2001S93 | TDT2 Mandarin Audio Corpus | David Graff | Not Specified | 2001 | Sound | Not Specified | Not Specified | broadcast news | topic detection and tracking, speech recognition | Mandarin Chinese |
3 | LDC2001S95 | TDT3 Mandarin Audio | David Graff | Not Specified | 2001 | Sound | 1-channel pcm | 16000 | broadcast news | topic detection and tracking, speech recognition | Mandarin Chinese |
4 | LDC2002S12 | 2001 HUB5 Mandarin Evaluation | David Graff, Alvin Martin, David Miller, Mark Przybocki, Kevin Walker | April 30, 2002 | 2002 | Sound | 2-channel ulaw | 8000 | telephone conversations | speech recognition | Mandarin Chinese |
5 | LDC2012S01 | 2006 NIST Speaker Recognition Evaluation Test Set Part 2 | NIST Multimodal Information Group | January 19, 2012 | 2012 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Urdu, Thai, Spanish, Russian, Korean, Hindi, Persian, English, Mandarin Chinese, Bengali, Standard Arabic, Dari, Iranian Persian, Chinese, Arabic |
6 | LDC2011S05 | 2008 NIST Speaker Recognition Evaluation Training Set Part 1 | NIST Multimodal Information Group | August 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Wu Chinese, Vietnamese, Uzbek, Urdu, Tigrinya, Thai, Tagalog, Spanish, Russian, Panjabi, Min Nan Chinese, Lao, Korean, Central Khmer, Georgian, Japanese, Italian, Hindi, Persian, English, Mandarin Chinese, Bengali, Egyptian Arabic, Moroccan Arabic, Northern Khmer, Dari, Iranian Persian, Chinese, Arabic |
7 | LDC94S17 | OGI Multilanguage Corpus | Ronald Cole, Yeshwant Muthusamy | Not Specified | 1994 | Sound | 1-channel pcm compressed | 8000 | telephone speech | speech recognition | Vietnamese, Tamil, Korean, Japanese, Hindi, French, English, German, Spanish, Mandarin Chinese, Persian, Dari, Iranian Persian |
8 | LDC96S34 | CALLHOME Mandarin Chinese Speech | Alexandra Canavan, George Zipperlen | Not Specified | 1996, 1997 | Sound | 2-channel ulaw | 8000 | telephone conversations | speech recognition | Mandarin Chinese |
9 | LDC98S69 | HUB5 Mandarin Telephone Speech Corpus | Not Specified | Not Specified | 1998 | Sound | 2-channel ulaw | 8000 | telephone conversations | speech recognition | Mandarin Chinese |
10 | LDC98S72 | Taiwanese Putonghua Speech and Transcripts | San Duanmu, Gregory Wakefield, Yi-ping Hsu, Shan-ping Qui, Guevara Rowena Cristina | Not Specified | 1998 | Sound | 1-channel pcm | 16000 | microphone speech | speech recognition | Mandarin Chinese |
11 | LDC98S73 | 1997 Mandarin Broadcast News Speech (HUB4-NE) | Shudong Huang, Jing Liu, Xuling Wu, Lei Wu, Yongmin Yan, Zhoakai Qin | Not Specified | 1998 | Sound | 1-channel pcm | 16000 | broadcast news | speech recognition | Mandarin Chinese |
12 | LDC2007S09 | Mandarin Affective Speech | Yingchun Yang, Zhaohui Wu, Tian Wu, Dongdong Li | July 17, 2007 | 2007 | Sound | pcm | 22050 | microphone speech | prosody, pronunciation modeling, speech recognition | Mandarin Chinese |
13 | LDC2006S31 | 2003 NIST Language Recognition Evaluation | Alvin Martin, Mark Przybocki | June 15, 2006 | 2006 | Sound | ulaw | 8000 | telephone conversations | speech recognition | Vietnamese, Tamil, Spanish, Iranian Persian, Korean, Japanese, Hindi, French, English, German, Mandarin Chinese, Egyptian Arabic |
14 | LDC2008S05 | 2005 NIST Language Recognition Evaluation | Audrey Le, Alvin Martin, Hannah Hadfield, Jacques de Villiers, John-Paul Hosom, Jan van Santen | June 16, 2008 | 2008 | Sound | ulaw | 8000 | telephone conversations | speech recognition, language identification | Tamil, Korean, Japanese, Hindi, English, Spanish, Mandarin Chinese |
15 | LDC2011S01 | 2005 NIST Speaker Recognition Evaluation Training Data | NIST Multimodal Information Group | May 24, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech | speech recognition | Spanish, Russian, English, Mandarin Chinese, Arabic |
16 | LDC2011S04 | 2005 NIST Speaker Recognition Evaluation Test Data | NIST Multimodal Information Group | July 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech | speech recognition | Spanish, Russian, English, Mandarin Chinese, Arabic |
17 | LDC2011S07 | 2008 NIST Speaker Recognition Evaluation Training Set Part 2 | NIST Multimodal Information Group | September 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Wu Chinese, Vietnamese, Uzbek, Urdu, Tigrinya, Thai, Tagalog, Spanish, Russian, Panjabi, Min Nan Chinese, Lao, Korean, Central Khmer, Georgian, Japanese, Italian, Hindi, Persian, English, Mandarin Chinese, Bengali, Egyptian Arabic, Moroccan Arabic, Northern Khmer, Dari, Iranian Persian, Chinese, Arabic |
18 | LDC2011S08 | 2008 NIST Speaker Recognition Evaluation Test Set | NIST Multimodal Information Group | October 21, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Wu Chinese, Vietnamese, Uzbek, Urdu, Thai, Tagalog, Tamil, Russian, Panjabi, Min Nan Chinese, Lao, Korean, Japanese, Italian, Hindi, Persian, Mandarin Chinese, Bengali, Egyptian Arabic, Moroccan Arabic, Dari, Iranian Persian, English, Chinese, Arabic |
19 | LDC2011S09 | 2006 NIST Speaker Recognition Evaluation Training Set | NIST Multimodal Information Group | November 16, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech | speech recognition | Yue Chinese, Urdu, Thai, Russian, Korean, Hindi, English, Mandarin Chinese, Bengali, Standard Arabic, Chinese, Arabic |
20 | LDC2011S10 | 2006 NIST Speaker Recognition Evaluation Test Set Part 1 | NIST Multimodal Information Group | December 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Urdu, Thai, Spanish, Russian, Korean, Hindi, Persian, English, Mandarin Chinese, Bengali, Standard Arabic, Dari, Iranian Persian, Chinese, Arabic |
21 | LDC2013S04 | GALE Phase 2 Chinese Broadcast Conversation Speech | Kevin Walker, Christopher Caruso, Kazuaki Maeda, Denise DiPersio, Stephanie Strassel | April 15, 2013 | 2013 | Sound | pcm | 16000 | broadcast conversation | speech recognition | Mandarin Chinese, Chinese |
22 | LDC2013S08 | GALE Phase 2 Chinese Broadcast News Speech | Kevin Walker, Christopher Caruso, Kazuaki Maeda, Denise DiPersio, Stephanie Strassel | October 16, 2013 | 2013 | Sound | pcm | 16000 | broadcast news | speech recognition | Mandarin Chinese, Chinese |
23 | LDC2014S08 | United Nations Proceedings Speech | Kevin Chay, Cecilia Elizalde, Michal Ziemski | October 15, 2014 | 2014 | Sound | flac | 22050 | microphone speech | speech recognition, language identification | English, Mandarin Chinese, Standard Arabic, French, Russian, Spanish |
24 | LDC2014S09 | GALE Phase 3 Chinese Broadcast Conversation Speech Part 1 | Kevin Walker, Christopher Caruso, Kazuaki Maeda, Denise DiPersio, Stephanie Strassel | December 15, 2014 | 2014 | Sound | pcm | 16000 | broadcast conversation | speech recognition | Mandarin Chinese, Chinese |