CMU Pronouncing Dictionary

The CMU Pronouncing Dictionary is an open-source pronouncing dictionary originally created by the Speech Group at Carnegie Mellon University (CMU) for use in speech recognition research.

Database format
The database is distributed as a plain text file with one entry per line in the format "WORD  <pronunciation>", with a two-space separator between the parts. If multiple pronunciations are available for a word, the variants are identified by a numbered suffix (e.g. WORD(1)). The pronunciation is encoded using a modified form of the ARPABET system, with stress marks of levels 0, 1, and 2 added to vowels. A line-initial ;;; token indicates a comment. A derived format, directly suitable for speech recognition engines, is also available as part of the distribution; this format collapses stress distinctions (typically not used in ASR).
The following is a table of phonemes used by the CMU Pronouncing Dictionary.
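The entry format described in this section can be read with a few lines of Python. The following is a minimal sketch, not the distribution's own tooling: the function names (parse_line, strip_stress, load_entries) are hypothetical, and the sample lines are illustrative entries written in the described format rather than an excerpt of the distributed file. It assumes the two-space separator, the numbered variant suffix, and the ;;; comment marker behave exactly as stated above.

```python
import re
from collections import defaultdict

# Matches a numbered variant suffix such as "(1)" at the end of a headword.
_VARIANT = re.compile(r"^(?P<word>.+?)\((?P<n>\d+)\)$")


def parse_line(line):
    """Return (word, phonemes) for an entry line, or None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line.strip() or line.startswith(";;;"):
        return None
    # The headword and the pronunciation are separated by two spaces.
    word, _, pronunciation = line.partition("  ")
    match = _VARIANT.match(word)
    if match:
        # Drop the variant number so all pronunciations share one headword.
        word = match.group("word")
    return word, pronunciation.split()


def strip_stress(phonemes):
    """Remove the trailing stress digit (0, 1, or 2) from each phoneme."""
    return [p.rstrip("012") for p in phonemes]


def load_entries(lines):
    """Group every pronunciation variant under its headword."""
    entries = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            word, phonemes = parsed
            entries[word].append(phonemes)
    return dict(entries)


# Illustrative sample in the described format (an assumption, not the real file).
SAMPLE = """\
;;; illustrative sample lines
HELLO  HH AH0 L OW1
HELLO(1)  HH EH0 L OW1
WORLD  W ER1 L D
"""

entries = load_entries(SAMPLE.splitlines())
```

Here entries["HELLO"] holds both pronunciation variants, and strip_stress(entries["WORLD"][0]) gives a stress-free phoneme sequence similar in spirit to the derived format mentioned above.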
Applications
• The Unifon converter is based on the CMU Pronouncing Dictionary.
• The Natural Language Toolkit contains an interface to the CMU Pronouncing Dictionary.
• The Carnegie Mellon Logios tool incorporates the CMU Pronouncing Dictionary.
• PronunDict, a pronunciation dictionary of American English, uses the CMU Pronouncing Dictionary as its data source. Pronunciation is transcribed in IPA symbols. This dictionary also supports searching by pronunciation.
• Some singing voice synthesizer software, such as CeVIO Creative Studio and Synthesizer V, uses a modified version of the CMU Pronouncing Dictionary to synthesize English singing voices.
• Transcriber, a tool for full-text phonetic transcription, uses the CMU Pronouncing Dictionary.
• 15.ai, a real-time text-to-speech tool using artificial intelligence, uses the CMU Pronouncing Dictionary.