Wikidepia/g2p-id

Indonesian Grapheme-to-Phoneme

This module converts Indonesian graphemes (spelling) into phonemes (pronunciation). Fortunately, the pronunciation of most Indonesian words can be inferred from their spelling. Most of the remaining work comes down to two tasks: finding the glottal stop /ʔ/ (as in baʔso) and determining which of the two 'e' sounds to use, /e/ (pensil) or /ə/ (têman). There may be other cases as well.
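To make the two ambiguities concrete, here is a minimal sketch that uses only the three example words from the paragraph above. The names `TOY_LEXICON` and `toy_g2p` are hypothetical and are not part of g2p-id; a plain lookup table is assumed purely for illustration and says nothing about how the module actually predicts pronunciations.

```python
# Toy lexicon built only from the three example words mentioned above.
# Hypothetical: this is NOT how g2p-id stores or predicts pronunciations.
TOY_LEXICON = {
    "bakso": "baʔso",    # the 'k' is realised as a glottal stop
    "pensil": "pensil",  # 'e' pronounced /e/
    "teman": "təman",    # 'e' pronounced /ə/
}


def toy_g2p(word: str) -> str:
    """Return the lexicon entry if the word is known, else the lowercased spelling."""
    key = word.lower()
    return TOY_LEXICON.get(key, key)


for word in ("bakso", "pensil", "teman"):
    print(word, "->", toy_g2p(word))
```

The point of the sketch is that "pensil" and "teman" share the same letter 'e' in writing, so the spelling alone does not say which vowel to produce.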

Installation

pip install git+https://github.com/Wikidepia/g2p-id

Example usage

from g2p_id import G2P

g2p = G2P()
phonemes, syllables = g2p.to_phoneme("Tak seorang pun boleh ditangkap, ditahan atau dibuang dengan sewenang-wenang.")

print(phonemes) # taʔ seoraŋ pun boleh ditaŋkap, ditahan ataʊ dibuaŋ deŋan sewenaŋ-wenaŋ.
print(syllables) # ['taʔ', ' ', 'se', 'o', 'raŋ', ' ', 'pun', ..., 'we', 'naŋ', ' ']
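The syllable list above is flat, with `' '` entries appearing between words. If per-word syllables are more convenient, they could be regrouped roughly as follows. This is a sketch under assumptions: `group_syllables` is a hypothetical helper, not part of g2p-id, and it assumes that whitespace-only entries are the only word separators, which is inferred from the example output alone. The sample list is copied from the start of that output.

```python
from itertools import groupby


def group_syllables(syllables):
    """Split a flat syllable list into per-word lists, dropping separators.

    Assumption: an entry made up only of whitespace marks a word boundary.
    """
    words = []
    for is_separator, chunk in groupby(syllables, key=lambda s: s.strip() == ""):
        if not is_separator:
            words.append(list(chunk))
    return words


# First few entries of the example output shown above.
sample = ["taʔ", " ", "se", "o", "raŋ", " ", "pun"]
print(group_syllables(sample))  # [['taʔ'], ['se', 'o', 'raŋ'], ['pun']]
```

Punctuation and hyphens are not treated specially here, since the elided part of the example output does not show how they are represented.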

TODO

  • Add test cases
  • Add a BERT-based homograph resolver (?)
  • Proper versioning

About

Indonesian Grapheme-to-Phoneme (IPA notation)
