COMPARISON TABLE OF DIFFERENT CORPORA TAKEN FROM LDC

Characteristics of the query on the LDC website:

Language(s): English

DCMI Type(s): Sound

Application(s): speech recognition
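
The three query criteria above can be read as a filter over catalog entries. The following is a minimal sketch, not the LDC website's actual search logic: it assumes that an entry matches when each requested value appears among the values listed for that field, which would explain why multilingual corpora also show up in the results. The `CorpusEntry` class, the `matches_query` function, and the last sample record are hypothetical names and data introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorpusEntry:
    """Hypothetical record holding only the fields used by the query."""
    catalog_no: str
    item_name: str
    dcmi_types: tuple
    applications: tuple
    languages: tuple


def matches_query(entry, language="English", dcmi_type="Sound",
                  application="speech recognition"):
    """Assumed semantics: each criterion must appear in the entry's list."""
    return (language in entry.languages
            and dcmi_type in entry.dcmi_types
            and application in entry.applications)


SAMPLE_ENTRIES = [
    CorpusEntry("LDC93S1", "TIMIT Acoustic-Phonetic Continuous Speech Corpus",
                ("Sound",), ("speech recognition",), ("English",)),
    CorpusEntry("LDC2002S13", "2001 HUB5 English Evaluation",
                ("Sound", "Text"), ("speech recognition",), ("English",)),
    CorpusEntry("LDC2006S13", "N4 NATO Native and Non-Native Speech",
                ("Sound",),
                ("speaker verification", "sociolinguistics",
                 "cross-lingual information retrieval", "speech recognition"),
                ("Dutch", "English", "German")),
    # Hypothetical non-matching record, not taken from the catalog.
    CorpusEntry("HYPOTHETICAL-0001", "Hypothetical text-only corpus",
                ("Text",), ("language modeling",), ("English",)),
]

selected = [e.catalog_no for e in SAMPLE_ENTRIES if matches_query(e)]
print(selected)  # ['LDC93S1', 'LDC2002S13', 'LDC2006S13']
```

Under this reading, an entry tagged "Sound, Text" or listing several languages still satisfies the query, because only membership is tested, not equality.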

No. | LDC Catalog No. | Item Name | Author(s) | Release Date | Member Year(s) | DCMI Type(s) | Sample Type | Sample Rate (Hz) | Data Source(s) | Application(s) | Language(s)
1 | LDC2000S86 | 1998 HUB4 Broadcast News Evaluation English Test Material | Not Specified | Not Specified | 2000 | Sound | Not Specified | Not Specified | broadcast news | speech recognition | English
2 | LDC2000S87 | Speech in Noisy Environments (SPINE) Training Audio | Astrid Schmidt-Nielsen, Elaine Marsh, John Tardelli, Paul Gatewood, Elizabeth Kreamer, Thomas Tremain, Christopher Cieri, Jonathan Wright | Not Specified | 2000 | Sound | Not Specified | Not Specified | microphone conversation | speech recognition | English
3 | LDC2000S88 | 1999 HUB4 Broadcast News Evaluation English Test Material | Not Specified | Not Specified | 2000 | Sound | Not Specified | Not Specified | broadcast news | speech recognition | English
4 | LDC2000S92 | TDT2 Careful Transcription Audio | John Garofolo, David Graff | Not Specified | 2000 | Sound | Not Specified | Not Specified | broadcast news | topic detection and tracking, speech recognition | English
5 | LDC2000S96 | Speech in Noisy Environments (SPINE) Evaluation Audio | Astrid Schmidt-Nielsen, Elaine Marsh, John Tardelli, Paul Gatewood, Elizabeth Kreamer, Thomas Tremain, Christopher Cieri, Jonathan Wright | Not Specified | 2000 | Sound | Not Specified | Not Specified | microphone speech, microphone conversation | speech recognition | English
6 | LDC2001S04 | Speech in Noisy Environments (SPINE2) Part 1 Audio | Astrid Schmidt-Nielsen, Elaine Marsh, John Tardelli, Paul Gatewood, Elizabeth Kreamer, Thomas Tremain, Christopher Cieri, Stephanie Strassel, Nii Martey, David Graff, Cristina Tofan | Not Specified | 2001 | Sound | Not Specified | Not Specified | microphone speech | speech recognition | English
7 | LDC2001S06 | Speech in Noisy Environments (SPINE2) Part 2 Audio | Astrid Schmidt-Nielsen, Elaine Marsh, John Tardelli, Paul Gatewood, Elizabeth Kreamer, Thomas Tremain, Christopher Cieri, Stephanie Strassel, Nii Martey, David Graff, Cristina Tofan | Not Specified | 2001 | Sound | Not Specified | Not Specified | microphone speech | speech recognition | English
8 | LDC2001S08 | Speech in Noisy Environments (SPINE2) Part 3 Audio | Astrid Schmidt-Nielsen, Elaine Marsh, John Tardelli, Paul Gatewood, Elizabeth Kreamer, Thomas Tremain, Christopher Cieri, Stephanie Strassel, Nii Martey, David Graff, Cristina Tofan | January 14, 2002 | 2001, 2002 | Sound | Not Specified | 16000 | microphone speech | speech recognition | English
9 | LDC2001S94 | TDT3 English Audio | David Graff | Not Specified | 2001 | Sound | Not Specified | Not Specified | broadcast news | topic detection and tracking, speech recognition | English
10 | LDC2001S97 | 2000 NIST Speaker Recognition Evaluation | Mark Przybocki, Alvin Martin | Not Specified | 2001 | Sound | Not Specified | Not Specified | telephone speech | speaker verification, speaker segmentation and tracking, speaker identification, speech recognition | English
11 | LDC2001S99 | Speech in Noisy Environments 1 (SPINE1 CODED) Coded Audio | Astrid Schmidt-Nielsen, Elaine Marsh, John Tardelli, Paul Gatewood, Elizabeth Kreamer, Thomas Tremain, Christopher Cieri, David Graff, Cristina Tofan, Kai Shun Soo | Not Specified | 2001 | Sound | Not Specified | 16000 | microphone speech | speech recognition | English
12 | LDC2002S04 | Translanguage English Database (TED) Speech | A Kipp, L Lamel, J Mariani, F Schiel | Not Specified | 2002 | Sound | Not Specified | Not Specified | microphone speech | speech recognition | English
13 | LDC2002S10 | 1998 HUB5 English Evaluation | David Graff, Alvin Martin, David Miller | March 27, 2002 | 2002 | Sound | 2-channel ulaw | 8000 | telephone conversations | speech recognition | English
14 | LDC2002S11 | 1997 HUB4 English Evaluation Speech and Transcripts | David Graff, Jonathan Fiscus, John Garofolo | May 29, 2002 | 2002 | Sound | 2-channel ulaw | 16000 | broadcast news | speech recognition | English
15 | LDC2002S13 | 2001 HUB5 English Evaluation | David Graff, Alvin Martin, David Miller, Mark Przybocki, Kevin Walker | April 16, 2002 | 2002 | Sound, Text | 2-channel ulaw | 8000 | telephone conversations | speech recognition | English
16 | LDC2002S23 | 1997 HUB5 English Evaluation | NIST Multimodal Information Group | Not Specified | 2002 | Sound | 2-channel ulaw | 8000 | telephone conversations | speech recognition | English
17 | LDC2002S28 | Emotional Prosody Speech and Transcripts | Mark Liberman, Kelly Davis, Murray Grossman, Nii Martey, John Bell | July 23, 2002 | 2002 | Sound | 2-channel pcm | 22050 | microphone speech | speech recognition, prosody, pronunciation modeling | English
18 | LDC2002S34 | 2001 NIST Speaker Recognition Evaluation Corpus | Mark Przybocki, Alvin Martin | September 26, 2002 | 2002 | Sound | ulaw | 8000 | telephone speech | speech recognition, speaker verification, speaker segmentation and tracking, speaker identification | English
19 | LDC2002S35 | Voicemail Corpus Part II | Mukund Padmanabhan, Brian Kingsbury, Bhuvana Ramabhadran, Jing Huang, Stanley Chen, George Saon, Lidia Mangu | November 8, 2002 | 2002 | Sound | ulaw | 8000 | telephone speech | speech recognition | English
20 | LDC2004S02 | ICSI Meeting Speech | Adam Janin, Jane Edwards, Dan Ellis, David Gelbart, Nelson Morgan, Barbara Peskin, Thilo Pfau, Elizabeth Shriberg, Andreas Stolcke, Chuck Wooters | January 30, 2004 | 2004 | Sound | pcm | 16000 | microphone conversation | speech recognition, speaker segmentation and tracking, discourse analysis | English
21 | LDC2004S05 | ISL Meeting Speech Part 1 | Susanne Burger, Victoria MacLaren, Alex Waibel | May 21, 2004 | 2004 | Sound | pcm | 16000 | microphone conversation | speech recognition, speaker identification, meeting summarization | English
22 | LDC2004S09 | NIST Meeting Pilot Corpus Speech | John Garofolo, Martial Michel, Vincent Stanford, Elham Tabassi, Jonathan Fiscus, Christophe Laprun, Nicolas Pratz, Jerome Lard | July 12, 2004 | 2004 | Sound | pcm | 16000 | meeting speech | speaker verification, speaker identification, language modeling, information retrieval, discourse analysis, automatic content extraction, speech recognition | English
23 | LDC2004S11 | 2002 Rich Transcription Broadcast News and Conversational Telephone Speech | John Garofolo, Jonathan Fiscus, Audrey Le | November 19, 2004 | 2004 | Sound | Not Specified | Not Specified | telephone conversations, broadcast news | speaker verification, speaker identification, language modeling, information retrieval, discourse analysis, automatic content extraction, speech recognition | English
24 | LDC2012S01 | 2006 NIST Speaker Recognition Evaluation Test Set Part 2 | NIST Multimodal Information Group | January 19, 2012 | 2012 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Urdu, Thai, Spanish, Russian, Korean, Hindi, Persian, English, Mandarin Chinese, Bengali, Standard Arabic, Dari, Iranian Persian, Chinese, Arabic
25 | LDC2011S05 | 2008 NIST Speaker Recognition Evaluation Training Set Part 1 | NIST Multimodal Information Group | August 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Wu Chinese, Vietnamese, Uzbek, Urdu, Tigrinya, Thai, Tagalog, Spanish, Russian, Panjabi, Min Nan Chinese, Lao, Korean, Central Khmer, Georgian, Japanese, Italian, Hindi, Persian, English, Mandarin Chinese, Bengali, Egyptian Arabic, Moroccan Arabic, Northern Khmer, Dari, Iranian Persian, Chinese, Arabic
26 | LDC2011S06 | 2005 Spring NIST Rich Transcription (RT-05S) Evaluation Set | NIST Multimodal Information Group | August 15, 2011 | 2011 | Sound | pcm | 16000 | meeting speech | speech recognition, speaker verification, speaker identification, metadata extraction, discourse analysis, diarization | English
27 | LDC2004S13 | Fisher English Training Speech Part 1 Speech | Christopher Cieri, David Graff, Owen Kimball, Dave Miller, Kevin Walker | December 15, 2004 | 2004 | Sound | ulaw | 8000 | telephone conversations | speech recognition | English
28 | LDC2005S13 | Fisher English Training Part 2, Speech | Christopher Cieri, David Graff, Owen Kimball, Dave Miller, Kevin Walker | April 15, 2005 | 2005 | Sound | 2-channel ulaw | 8000 | telephone conversations | speech recognition | English
29 | LDC2013S09 | CSC Deceptive Speech | Columbia University, SRI International, University of Colorado Boulder | November 15, 2013 | 2013 | Sound | pcm | 16000 | microphone conversation | speech recognition, anomaly analysis | English
30 | LDC93S1 | TIMIT Acoustic-Phonetic Continuous Speech Corpus | John Garofolo, Lori Lamel, William Fisher, Jonathan Fiscus, David Pallett, Nancy Dahlgren, Victor Zue | Not Specified | 1993 | Sound | 1-channel pcm | 16000 | microphone speech | speech recognition | English
31 | LDC93S10 | TIDIGITS | R. Gary Leonard, George Doddington | Not Specified | 1993 | Sound | 1-channel pcm compressed | 20000 | microphone speech | speech recognition | English
32 | LDC93S2 | NTIMIT | William Fisher, George Doddington, Kathleen Goudie-Marshall, Charles Jankowski, Ashok Kalyanswamy, Sara Basson, Judith Spitz | Not Specified | 1993 | Sound | 1-channel pcm | 16000 | telephone speech | speech recognition | English
33 | LDC93S4 | ATIS0 Complete | Charles Hemphill, John Godfrey, George Doddington, John Garofolo, Jonathan Fiscus | Not Specified | 1993 | Sound | 1-channel pcm | 16000 | microphone speech | speech recognition, spoken dialogue systems | English
34 | LDC93S4 | ATIS0 Pilot | Charles Hemphill, John Godfrey, George Doddington, John Garofolo, Jonathan Fiscus, Nancy Dahlgren, William Fisher, Brett Tjaden, David Pallett | Not Specified | 1993 | Sound | 1-channel pcm | 16000 | microphone speech | spoken dialogue systems, speech recognition | English
35 | LDC93S4 | ATIS0 Read | Charles Hemphill, John Godfrey, George Doddington, John Garofolo, Jonathan Fiscus, Nancy Dahlgren, William Fisher, Brett Tjaden, David Pallett | Not Specified | 1993 | Sound | 1-channel pcm | 16000 | microphone speech | spoken dialogue systems, speech recognition | English
36 | LDC93S4 | ATIS0 SD Read | Charles Hemphill, John Godfrey, George Doddington, John Garofolo, Jonathan Fiscus, Nancy Dahlgren, William Fisher, Brett Tjaden, David Pallett | Not Specified | 1993 | Sound | 1-channel pcm | 16000 | microphone speech | spoken dialogue systems, speech recognition | English
37 | LDC93S5 | ATIS2 | John Garofolo, Jonathan Fiscus, Kate Hunicke-Smith, Denise Danielson, Elizabeth Shriberg, Enrico Bocchieri, Bruce Buntschuh, Beverly Schwartz, Sandra Peters, Robert Ingria, Robert Weide, Yuzong Chang, Eric Thayer, Lynette Hirschman, Joe Polifroni, Bruce Lund, Goh Kawai, Tom Kuhn, Lew Norton, Deborah Dahl, Madeleine Bates, Michael Brown, Alexander Rudnicky, David Pallett | Not Specified | 1993 | Sound | 1-channel pcm compressed | 16000 | microphone speech | spoken dialogue systems, speech recognition | English
38 | LDC93S6 | CSR-I (WSJ0) Complete | John Garofolo, David Graff, Doug Paul, David Pallett | May 30, 2007 | 1993, 1996 | Sound | 1-channel pcm compressed | 16000 | microphone speech | speech recognition | English
39 | LDC93S6 | CSR-I (WSJ0) Sennheiser | John Garofolo, David Graff, Doug Paul, David Pallett | Not Specified | 1993, 1996 | Sound | 1-channel pcm compressed | 16000 | microphone speech | speech recognition | English
40 | LDC93S6 | CSR-I (WSJ0) Other | John Garofolo, David Graff, Doug Paul, David Pallett | Not Specified | 1993, 1996 | Sound | 1-channel pcm compressed | 16000 | microphone speech | speech recognition | English
41 | LDC93S8 | Switchboard Credit Card | John Godfrey, Ed Holliman | Not Specified | 1993 | Sound | 2-channel ulaw | 8000 | telephone conversations | speech recognition | English
42 | LDC93S9 | TI 46-Word | Mark Liberman, Robert Amsler, Ken Church, Ed Fox, Carole Hafner, Judy Klavans, Mitch Marcus, Bob Mercer, Jan Pedersen, Paul Roossin, Don Walker, Susan Warwick, Antonio Zampolli | Not Specified | 1993 | Sound | 1-channel 12-bit pcm | 12500 | microphone speech | speech recognition | English
43 | LDC94S13 | CSR-II (WSJ1) Complete | Not Specified | Not Specified | 1994, 1997 | Sound | 1-channel pcm compressed | 16000 | microphone speech | speech recognition | English
44 | LDC94S13 | CSR-II (WSJ1) Sennheiser | Not Specified | Not Specified | 1994, 1997 | Sound | 1-channel pcm compressed | 16000 | microphone speech | speech recognition | English
45 | LDC94S13 | CSR-II (WSJ1) Other | Not Specified | Not Specified | 1994 | Sound | 1-channel pcm compressed | 16000 | microphone speech | speech recognition | English
46 | LDC94S14 | Air Traffic Control Complete | John Godfrey | Not Specified | 1994, 1997 | Sound | 1-channel pcm | 8000 | field recordings | speech recognition | English
47 | LDC94S14 | Air Traffic Control BOS | John Godfrey | Not Specified | 1994 | Sound | 1-channel pcm | 8000 | field recordings | speech recognition | English
48 | LDC94S14 | Air Traffic Control DCA | John Godfrey | Not Specified | 1994 | Sound | 1-channel pcm | 8000 | field recordings | speech recognition | English
49 | LDC94S17 | OGI Multilanguage Corpus | Ronald Cole, Yeshwant Muthusamy | Not Specified | 1994 | Sound | 1-channel pcm compressed | 8000 | telephone speech | speech recognition | Vietnamese, Tamil, Korean, Japanese, Hindi, French, English, German, Spanish, Mandarin Chinese, Persian, Dari, Iranian Persian
50 | LDC94S18 | OGI Spelled and Spoken Word | Ronald Cole, Yeshwant Muthusamy | Not Specified | 1994 | Sound | 1-channel pcm compressed | 8000 | telephone speech | speech recognition | English
51 | LDC94S19 | ATIS3 Training Data | Deborah Dahl, Madeleine Bates, Michael Brown, William Fisher, Kate Hunicke-Smith, David Pallett, Christine Pao, Alexander Rudnicky, Elizabeth Shriberg, John Garofolo, Jonathan Fiscus, Denise Danielson, Enrico Bocchieri, Bruce Buntschuh, Beverly Schwartz, Sandra Peters, Robert Ingria, Robert Weide, Yuzong Chang, Eric Thayer, Lynette Hirschman, Joe Polifroni, Bruce Lund, Goh Kawai, Tom Kuhn, Lew Norton | Not Specified | 1994 | Sound | 1-channel pcm compressed | 16000 | microphone speech | spoken dialogue systems, speech recognition | English
52 | LDC94S21 | MACROPHONE | Jared Bernstein, Kelsey Taussig, Jack Godfrey | Not Specified | 1994 | Sound | 1-channel ulaw compressed | 8000 | telephone speech | speech recognition | English
53 | LDC95S23 | CSR-III Speech | Not Specified | Not Specified | 1995, 1998 | Sound | 1-channel pcm compressed | 16000 | microphone speech | speech recognition | English
54 | LDC95S24 | WSJCAM0 Cambridge Read News | Tony Robinson, Jeroen Fransen, David Pye, Jonathan Foote, Steve Renals, Phil Woodland, Steve Young | Not Specified | 1995 | Sound | 1-channel pcm compressed | 16000 | microphone speech | speech recognition | English
55 | LDC95S25 | TRAINS Spoken Dialog Corpus | James Allen, Peter Heeman | Not Specified | 1995 | Sound | 1-channel pcm compressed | 16000 | microphone conversation | spoken dialogue systems, speech recognition, discourse analysis | English
56 | LDC95S26 | ATIS3 Test Data | Deborah Dahl, Madeleine Bates, Michael Brown, William Fisher, Kate Hunicke-Smith, David Pallett, Christine Pao, Alexander Rudnicky, Elizabeth Shriberg, John Garofolo, Jonathan Fiscus, Denise Danielson, Enrico Bocchieri, Bruce Buntschuh, Beverly Schwartz, Sandra Peters, Robert Ingria, Robert Weide, Yuzong Chang, Eric Thayer, Lynette Hirschman, Joe Polifroni, Bruce Lund, Goh Kawai, Tom Kuhn, Lew Norton | Not Specified | 1995 | Sound | 1-channel pcm compressed | 16000 | microphone speech | spoken dialogue systems, speech recognition | English
57 | LDC95S27 | PhoneBook: NYNEX Isolated Words | John Pitrelli, Cynthia Fong | Not Specified | 1995 | Sound | 1-channel ulaw | 8000 | telephone speech | speech recognition | English
58 | LDC96S30 | CTIMIT | E. Bryan George, Kathy Brown, Martha Birnbaum, Michael Macon | Not Specified | 1996 | Sound | 1-channel pcm | 8000 | telephone speech | speech recognition | English
59 | LDC96S31 | CSR-IV HUB4 | John Garofolo, Jonathan Fiscus, William Fisher, David Pallett | Not Specified | 1996 | Sound | 1-channel pcm | 16000 | broadcast news | speech recognition | English
60 | LDC96S32 | FFMTIMIT | John Garofolo, Lori Lamel, William Fisher, Jonathan Fiscus, David Pallett, Nancy Dahlgren, Victor Zue | Not Specified | 1996 | Sound | 1-channel pcm | 16000 | microphone speech | speech recognition | English
61 | LDC96S33 | CSR-IV HUB3 | Jonathan Fiscus, John Garofolo, David Pallett | Not Specified | 1996 | Sound | 1-channel pcm | 16000 | microphone speech | speech recognition | English
62 | LDC96S36 | Boston University Radio Speech Corpus | Mari Ostendorf, Patti Price, Stefanie Shattuck-Hufnagel | Not Specified | 1996, 1997 | Sound | 1-channel pcm | 16000 | microphone speech | speech synthesis, speech recognition, prosody | English
63 | LDC96S38 | DCIEM/HCRC | Martin Taylor, Ellen Gurman Bard, Cathy Sotillo, David McKelvie, Anne Anderson | Not Specified | 1996 | Sound | 2-channel pcm | 20000 | microphone speech | speech recognition | English
64 | LDC96S39 | RM Isolated and Spelled Word Data | Not Specified | Not Specified | 1996 | Sound | 1-channel pcm | 16000 | microphone speech | speech recognition | English
65 | LDC94S14 | Air Traffic Control DFW | John Godfrey | Not Specified | 1994 | Sound | 1-channel pcm | 8000 | field recordings | speech recognition | English
66 | LDC97S42 | CALLHOME American English Speech | Alexandra Canavan, David Graff, George Zipperlen | Not Specified | 1997 | Sound | 2-channel ulaw | 8000 | telephone conversations | speech recognition | English
67 | LDC97S44 | 1996 English Broadcast News Speech (HUB4) | David Graff, John Garofolo, Jonathan Fiscus, William Fisher, David Pallett | Not Specified | 1997, 1998 | Sound | 1-channel pcm | 16000 | broadcast news | speech recognition | English
68 | LDC97S62 | Switchboard-1 Release 2 | John Godfrey, Edward Holliman | Not Specified | 1993, 1997 | Sound | 2-channel ulaw | 8000 | telephone conversations | speech recognition, speaker identification | English
69 | LDC97S63 | The CMU Kids Corpus | Maxine Eskenazi, Jack Mostow, David Graff | Not Specified | 1997 | Sound | 1-channel pcm | 16000 | microphone speech | speech recognition | English
70 | LDC97S66 | 1996 English Broadcast News Dev and Eval (HUB4) | David Graff, Jennifer Alabiso, Jonathan Fiscus, John Garofolo, William Fisher, David Pallett | Not Specified | 1997, 1998 | Sound | 1-channel pcm | 16000 | broadcast news | speech recognition | English
71 | LDC98S67 | HTIMIT | Douglas Reynolds | Not Specified | 1998 | Sound | 1-channel pcm | 8000 | telephone speech | speech recognition, speaker identification | English
72 | LDC98S68 | LLHDB | Douglas Reynolds | Not Specified | 1998 | Sound | 1-channel pcm | 8000 | telephone speech | speech recognition, speaker identification | English
73 | LDC98S71 | 1997 English Broadcast News Speech (HUB4) | Jonathan Fiscus, John Garofolo, Mark Przybocki, William Fisher, David Pallett | Not Specified | 1998 | Sound | 1-channel pcm | 16000 | broadcast news | speech recognition | English
74 | LDC98S77 | Voicemail Corpus Part I | M Padmanabhan, G Ramaswamy, B Ramabhadran, P Gopalakrishnan, C Dunn | Not Specified | 1998 | Sound | 1-channel ulaw | 8000 | telephone speech | speech recognition | English
75 | LDC99S78 | SUSAS | John H. L. Hansen | Not Specified | 1999 | Sound | 1-channel pcm | 8000 | microphone speech | speech recognition | English
76 | LDC99S82 | USC Marketplace Broadcast News Speech | Alexandra Canavan, David Miller, Paul Morgovsky | Not Specified | 1999 | Sound | Not Specified | Not Specified | broadcast news | speech recognition | English
77 | LDC99S83 | Tactical Speaker Identification Speech Corpus (TSID) | David Graff, Douglas Reynolds, Gerald C. O'Leary | Not Specified | 1999 | Sound | Not Specified | Not Specified | microphone speech | speech recognition, speaker identification | English
78 | LDC99S84 | TDT2 English Audio | David Graff | Not Specified | 1999 | Sound | Not Specified | Not Specified | broadcast news | topic detection and tracking, speech recognition | English
79 | LDC2007S08 | CSLU: Foreign Accented English Release 1.2 | T Lander | May 17, 2007 | 2007 | Sound | ulaw | 8000 | telephone speech | speech recognition | English
80 | LDC2007S05 | CSLU: Yes/No Version 1.2 | Mike Noel | July 17, 2007 | 2007 | Sound | pcm | 8000 | telephone speech | speech synthesis, speech recognition, speaker verification, speaker identification, pronunciation modeling | English
81 | LDC2007S15 | Nationwide Speech Project | Cynthia Clopper, David Pisoni | September 17, 2007 | 2007 | Sound | pcm | 44100 | microphone speech | speech recognition, sociolinguistics, natural language processing, linguistic analysis, language modeling | English
82 | LDC2007S13 | CSLU: Apple Words and Phrases | Mike Noel | September 17, 2007 | 2007 | Sound | Not Specified | Not Specified | telephone speech | speech recognition, speaker verification, speaker identification | English
83 | LDC2005S30 | West Point Company G3 American English Speech | John Morgan, Stephen LaRocca, Sherri Bellinger, Charles (Chip) Ruscelli | November 29, 2005 | 2005 | Sound | pcm | 22050 | microphone speech | speech recognition | English
84 | LDC2006S01 | CSLU: Voices | Alexander Kain | January 19, 2006 | 2006 | Sound | Not Specified | 22050 | microphone speech | speech recognition, speaker verification, speaker identification, speech synthesis | English
85 | LDC2006S15 | CSLU: Spelled and Spoken Words | Ronald Cole, Mark Fanty, K Roginski | March 24, 2006 | 2006 | Sound | 16 bit linear pcm | 8000 | telephone speech | speaker identification, speech recognition, pronunciation modeling | English
86 | LDC2006S26 | CSLU: Speaker Recognition Version 1.1 | CSLU | May 18, 2006 | 2006 | Sound | ulaw | 8000 | telephone speech, telephone conversations | speech recognition | English
87 | LDC2006S13 | N4 NATO Native and Non-Native Speech | John Grieco, Laurent Benarousse, Edouard Geoffrois, Robert Series, Herman Steeneken, Hans Stumpf, Carl Swail, Dieter Thiel | April 17, 2006 | 2006 | Sound | pcm | 16000 | microphone speech | speaker verification, sociolinguistics, cross-lingual information retrieval, speech recognition | Dutch, English, German
88 | LDC2006S31 | 2003 NIST Language Recognition Evaluation | Alvin Martin, Mark Przybocki | June 15, 2006 | 2006 | Sound | ulaw | 8000 | telephone conversations | speech recognition | Vietnamese, Tamil, Spanish, Iranian Persian, Korean, Japanese, Hindi, French, English, German, Mandarin Chinese, Egyptian Arabic
89 | LDC2006S44 | 2004 NIST Speaker Recognition Evaluation | Alvin Martin, Mark Przybocki | October 25, 2006 | 2006 | Sound | ulaw | 8000 | telephone speech | speaker verification, speaker segmentation and tracking, speaker identification, speech recognition | English
90 | LDC2007S11 | 2004 Spring NIST Rich Transcription (RT-04S) Development Data | Jonathan Fiscus, John Garofolo, Audrey Le, Alvin Martin, Greg Sanders, Mark Przybocki, David Pallett | December 20, 2007 | 2007 | Sound | Not Specified | Not Specified | meeting speech | speaker verification, speaker identification, metadata extraction, discourse analysis, diarization, speech recognition | English
91 | LDC2007S12 | 2004 Spring NIST Rich Transcription (RT-04S) Evaluation Data | Jonathan Fiscus, John Garofolo, Audrey Le, Alvin Martin, Greg Sanders, Mark Przybocki, David Pallett | October 17, 2007 | 2007 | Sound | Not Specified | Not Specified | meeting speech | speech recognition, speaker verification, speaker identification, metadata extraction, discourse analysis, diarization | English
92 | LDC2007S18 | CSLU: Kids' Speech Version 1.1 | Khaldoun Shobaki, John-Paul Hosom, Ronald Cole | November 20, 2007 | 2007 | Sound | Not Specified | 16000 | microphone speech | spoken dialogue modeling, speech recognition, sociolinguistics | English
93 | LDC2008S02 | CSLU: National Cellular Telephone Speech Release 2.3 | Ronald Cole, M Noel, T Lander, T Durham | March 19, 2008 | 2008 | Sound | ulaw | 8000 | telephone speech | speech recognition, speaker identification | English
94 | LDC2008S03 | STC-TIMIT 1.0 | Nicolas Morales | March 19, 2008 | 2008 | Sound | ulaw | 8000 | telephone conversations | speech recognition, speech synthesis | English
95 | LDC2008S05 | 2005 NIST Language Recognition Evaluation | Audrey Le, Alvin Martin, Hannah Hadfield, Jacques de Villiers, John-Paul Hosom, Jan van Santen | June 16, 2008 | 2008 | Sound | ulaw | 8000 | telephone conversations | speech recognition, language identification | Tamil, Korean, Japanese, Hindi, English, Spanish, Mandarin Chinese
96 | LDC2008S06 | CSLU: Alphadigit Version 1.3 | Ronald Cole, M Noel, T Lander, T Durham | July 16, 2008 | 2008 | Sound | ulaw | 8000 | telephone conversations | speech recognition | English
97 | LDC2009S01 | CSLU: Numbers Version 1.3 | Ronald Cole, M Noel, T Lander, T Durham | January 16, 2009 | 2009 | Sound | Signed 16 bit PCM, 1 Channel | 8000 | telephone speech | speech recognition | English
98 | LDC2008S09 | CHAracterizing INdividual Speakers (CHAINS) | Fred Cummins, Marco Grimaldi, Thomas Leonard, Juraj Simko | November 18, 2008 | 2008 | Sound | 16 bit linear PCM | Not Specified | microphone speech | speech recognition | English
99 | LDC2009S03 | CSLU: S4X Release 1.2 | Ronald Cole, M Noel, T Lander, T Durham | September 15, 2009 | 2009 | Sound | 8 bit ulaw | 8000 | telephone speech | speech recognition | English
100 | LDC2010S02 | WTIMIT 1.0 | Patrick Bauer, Tim Fingscheidt | March 17, 2010 | 2010 | Sound | 1-channel signed linear PCM (raw) | 16000 | telephone speech | speech recognition, speaker identification | English
101 | LDC2010S03 | 2003 NIST Speaker Recognition Evaluation | NIST Multimodal Information Group | May 14, 2010 | 2010 | Sound | 8 bit u-law | 8000 | telephone conversations | speech recognition, speaker verification, speaker identification | English
102 | LDC93S1 | TIMIT Acoustic-Phonetic Continuous Speech (MS-WAV version) | John Garofolo, Lori Lamel, William Fisher, Jonathan Fiscus, David Pallett, Nancy Dahlgren, Victor Zue | Not Specified | 1993 | Sound | 1-channel pcm | 16000 | microphone speech | speech recognition | English
103 | LDC2011S01 | 2005 NIST Speaker Recognition Evaluation Training Data | NIST Multimodal Information Group | May 24, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech | speech recognition | Spanish, Russian, English, Mandarin Chinese, Arabic
104 | LDC2011S04 | 2005 NIST Speaker Recognition Evaluation Test Data | NIST Multimodal Information Group | July 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech | speech recognition | Spanish, Russian, English, Mandarin Chinese, Arabic
105 | LDC2011S07 | 2008 NIST Speaker Recognition Evaluation Training Set Part 2 | NIST Multimodal Information Group | September 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Wu Chinese, Vietnamese, Uzbek, Urdu, Tigrinya, Thai, Tagalog, Spanish, Russian, Panjabi, Min Nan Chinese, Lao, Korean, Central Khmer, Georgian, Japanese, Italian, Hindi, Persian, English, Mandarin Chinese, Bengali, Egyptian Arabic, Moroccan Arabic, Northern Khmer, Dari, Iranian Persian, Chinese, Arabic
106 | LDC2011S08 | 2008 NIST Speaker Recognition Evaluation Test Set | NIST Multimodal Information Group | October 21, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Wu Chinese, Vietnamese, Uzbek, Urdu, Thai, Tagalog, Tamil, Russian, Panjabi, Min Nan Chinese, Lao, Korean, Japanese, Italian, Hindi, Persian, Mandarin Chinese, Bengali, Egyptian Arabic, Moroccan Arabic, Dari, Iranian Persian, English, Chinese, Arabic
107 | LDC2011S09 | 2006 NIST Speaker Recognition Evaluation Training Set | NIST Multimodal Information Group | November 16, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech | speech recognition | Yue Chinese, Urdu, Thai, Russian, Korean, Hindi, English, Mandarin Chinese, Bengali, Standard Arabic, Chinese, Arabic
108 | LDC2011S10 | 2006 NIST Speaker Recognition Evaluation Test Set Part 1 | NIST Multimodal Information Group | December 15, 2011 | 2011 | Sound | ulaw | 8000 | telephone speech, microphone speech | speech recognition | Yue Chinese, Urdu, Thai, Spanish, Russian, Korean, Hindi, Persian, English, Mandarin Chinese, Bengali, Standard Arabic, Dari, Iranian Persian, Chinese, Arabic
109 | LDC2011S11 | 2008 NIST Speaker Recognition Evaluation Supplemental Set | NIST Multimodal Information Group | December 15, 2011 | 2011 | Sound | ulaw | 8000 | microphone speech | speech recognition | English
110 | LDC2012S02 | TORGO Database of Dysarthric Articulation | Frank Rudzicz, Graeme Hirst, Pascal van Lieshout, Gerald Penn, Fraser Shein, Aravind Namasivayam, Talya Wolff | January 19, 2012 | 2012 | Sound | pcm | 16000 | microphone speech | speech recognition | English
111 | LDC2012S05 | USC-SFI MALACH Interviews and Transcripts English | Bhuvana Ramabhadran, Samuel Gustman, William Byrne, Jan Hajic, Douglas Oard, J. Scott Olsson, Michael Picheny, Josef Psutka | April 20, 2012 | 2012 | Sound | MPEG audio | 44100 | field recordings | speech recognition, sociolinguistics | English
112 | LDC2014S03 | Multi-Channel WSJ Audio | Mike Lincoln, Erich Zwyssig, Iain McCowan | April 15, 2014 | 2014 | Sound | pcm | 16000 | newswire | speech recognition, speaker segmentation and tracking, speaker identification | English
113 | LDC2013S03 | Mixer 6 Speech | Linda Brandschain, David Graff, Kevin Walker | August 19, 2013 | 2013 | Sound | 1-channel pcm | 16000 | telephone speech, microphone speech | speech recognition | English
114 | LDC2014S05 | Hispanic-English Database | William Byrne, Eva Knodt, Jared Bernstein, Farzhad Emami | May 15, 2014 | 2014 | Sound | pcm | 16000 | microphone speech | spoken dialogue modeling, speech recognition, speech activity detection, speaker identification | Spanish, English
115 | LDC2014S08 | United Nations Proceedings Speech | Kevin Chay, Cecilia Elizalde, Michal Ziemski | October 15, 2014 | 2014 | Sound | flac | 22050 | microphone speech | speech recognition, language identification | English, Mandarin Chinese, Standard Arabic, French, Russian, Spanish
116 | LDC2006S30 | Speech Controlled Computing | Christopher Cieri, David Miller, Nii Martey, Kazuaki Maeda | March 24, 2006 | 2006 | Sound | pcm | 48000 | microphone speech | machine learning, speech recognition | English