Grapheme-to-phoneme conversion and its application to transliteration

dc.contributor.advisor: Grzegorz Kondrak (Computing Science)
dc.contributor.author: Jiampojamarn, Sittichai
dc.contributor.other: Harald Baayen (Linguistics)
dc.contributor.other: Randy Goebel (Computing Science)
dc.contributor.other: Anoop Sarkar (School of Computing Science, Simon Fraser University)
dc.contributor.other: Dale Schuurmans (Computing Science)
dc.date.accessioned: 2025-05-29T06:53:20Z
dc.date.available: 2025-05-29T06:53:20Z
dc.date.issued: 2011-06
dc.description.abstract: Grapheme-to-phoneme conversion (G2P) is the task of converting a word, represented by a sequence of graphemes, to its pronunciation, represented by a sequence of phonemes. The G2P task plays a crucial role in speech synthesis systems, and is an important part of other applications, including spelling correction and speech-to-speech machine translation. G2P conversion is a complex task, for which a number of diverse solutions have been proposed. In general, the problem is challenging because the source string does not unambiguously specify the target representation. In addition, the training data include only example word pairs without the structural information of subword alignments. In this thesis, I introduce several novel approaches for G2P conversion. My contributions can be categorized into (1) new alignment models and (2) new output generation models. With respect to alignment models, I present techniques including many-to-many alignment, phonetic-based alignment, alignment by integer linear programming and alignment-by-aggregation. Many-to-many alignment is designed to replace the one-to-one alignment that has been used almost exclusively in the past. The new many-to-many alignments are more precise and accurate in expressing grapheme-phoneme relationships. The other proposed alignment approaches attempt to advance the training method beyond the use of Expectation-Maximization (EM). With respect to generation models, I first describe a framework for integrating many-to-many alignments and language models for grapheme classification. I then propose joint processing for G2P using online discriminative training. I integrate a generative joint n-gram model into the discriminative framework. Finally, I apply the proposed G2P systems to name transliteration generation and mining tasks. Experiments show that the proposed system achieves state-of-the-art performance in both the G2P and name transliteration tasks.
dc.identifier.doi: https://doi.org/10.7939/R3NX5Z
dc.language.iso: en
dc.rights: This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
dc.subject: Computational linguistics
dc.subject: NLP
dc.subject: String transduction
dc.subject: Grapheme to phoneme
dc.subject: Text to speech
dc.subject: Online large margin training
dc.subject: Speech synthesis
dc.subject: Transliteration
dc.subject: Natural language processing
dc.subject: Alignments
dc.title: Grapheme-to-phoneme conversion and its application to transliteration
dc.type: http://purl.org/coar/resource_type/c_46ec
thesis.degree.grantor: http://id.loc.gov/authorities/names/n79058482
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy
ual.date.graduation: Spring 2011
ual.department: Department of Computing Science
ual.jupiterAccess: http://terms.library.ualberta.ca/public

Files

Original bundle

Name: Jiampojamarn_Sittichai_Spring-202011.pdf
Size: 912.12 KB
Format: Adobe Portable Document Format