Skip to content

Why Learn Jyutping?

Speaking (oral proficiency) comes naturally in an immersive environment. In a non-immersive, structured learning environment, learners need a way to systematically register the sounds they heard.

In phonetic scripts, such as Latin script for Spanish, the pronunciation can be perfectly inferred from manzana. The Chinese script (漢字 honzi or hanzi), on the other hand, is ideographic 表意文字: glyphs 字符 primarily specifies the idea. 蘋果 is not meant to inform the reader of the pronunciation.

Literacy in Chinese (and Japanese) is a long process and does not come naturally. Children, with full-time schooling, master 5,000 characters over six years; this learning is aided by phonetic symbols such as bopomofo 注音 in Taiwan, Pinyin 拼音 in the mainland China, and hiragana/katagana 假名 for Japanese Kanji. PRC’s use of Pinyin is relatively modern, and it swept schools nationwide because studies show that even under-privileged children can often acquire Grade 6 literacy by Grade 2 with phonics assistance.

Not being able to rely on Chinese script to show the pronunciation means that learners need to register the sounds in a different way. Learners who were not taught (or did not learn) a standard phonics system invariably invent their own scheme based on their native language. Designing a standard phonics scheme requires intimate knowledge of the phonology, and for Cantonese, took generations of linguists a century of iterations. Learner-invented schemes invariably end up with clashes (one symbol ambiguously pointing to multiple sounds), holes (sounds that cannot be represented), inconsistency (letters pronounced differently in different contexts), or difficult to transmit digitally.

Invented schemes also tend to be unstable. As learners find out about new sounds that they did not hear before, they adapt the previous scheme, and the meaning of what was written drifts over time.

By learning one standard phonics scheme, you can unambiguously write down a sound you’ve heard, and read text that had been annotated with phonics. Standard phonics lets you re-use your (quick, natural to acquire) oral proficiency.

Standards are like toothbrushes. Everybody needs one but nobody wants to use anybody elses’.

Jyutping 粵拼 is the phonics scheme for Cantonese, standardized in 1993 by the Linguistics Society of Hong Kong (LSHK). Jyutping absorbed the lessons from over a century of attempts to prepare phonics for Cantonese, and had been stable since its inception.

Jyutping have been the standard that linguists, creators, and teachers unify around. When you learn Jyutping, you open up a whole realm that we’ve been building for you in the last two decades:

  • dictionaries 字典/詞典, which lets you search by Jyutping, may be arranged by Jyutping, and shows how something is read with Jyutping
  • books and multimedia annotated with Jyutping
    • a graded reader collection called 冚唪唥 Hambaanglaang
    • canto.hk collections
    • modern textbooks, both physical and digital
  • Google Translate’s hints
  • Keyboard Input Methods, which let you write emails / WhatsApp messages if you know Jyutping
    • TypeDuck which accepts partially right input, and provides English hint to disambiguate identically sounding terms
  • Automated Jyutping annotations, which converts Chinese characters back to Jyutping
    • the Cantonese Font 粵語字體 from canto.hk
  • Jyutping-based pronunciation feedback
  • Examinations, for example, the Cantonese Read-Aloud Test that uses Jyutping to designate accepted pronunciations
  • games, typing exercises, and other interactive learning material

The vast majority of language teachers and Cantonese learners are using Jyutping, and Jyutping is the lingua franca for seeking help.

Institutionally, while Jyutping is not used in Hong Kong for mainstream primary/secondary schooling, it is used by special educational schools and for L2 of ethnic minorities. Jyutping is adopted by

  • the Yale Center of Chinese University (the leading intensive L2 learning programme),
  • UBC Cantonese, the largest Cantonese-as-L2 programme overseas, and
  • NGOs such as the Cantonese Language Teachers’ Association and CantoGather (ethnic minority children)

These institutional support, in a virtuous cycle, creates a larger, richer set of learning materials every day.

Comparison with alternative phonics schemes

Section titled “Comparison with alternative phonics schemes”

The first missionaries annotated the Cantonese sound with the Latin alphabet and funny symbols in the mid-1800s, and since then, over forty phonics schemes have been published. In the Internet age, every year we see half-dozen of social media and Reddit threads claiming invention of a new scheme.

The most significant contenders are Yale and Gwongping 廣拼 variations. Yale was the major Cantonese phonics system from 1950s to mid-1990s, and uses a combination of diacritics (not easily typed on standard QWERTY keyboard) and h to designate tones. Yale remains popular in North America, particularly in older institutions that had much material prepared. The major pedagogical advantage it had is that learners find diacritics more digestable than arbitrary tone-numbers. Teachers and schools that have tried the Cantonese Font are all replacing Yale with Jyutping+.

Gwongping 廣拼 took inspirations from Jyutping, blending some letters in trying to make it closer to Mandarin Pinyin, and is popular with Guangzhou textbook publishers.

Linguistically, Jyutping, Yale, and Gwongping shares 99% of their analysis,and divergence are superficial. However, strength of a standard phonics scheme derives from its ecosystem. The more content and tools are available for a phonics system, the more valuable it is for users; the more users there is, creations have a larger market, and attracts more content and tools to be built.

Jyutping is linguistically complete, and its ecosystem / network advantage is overwhelming and growing every year. Jyutping is likely to remain the dominant phonics system for the next fifty years. If you are shopping for a Cantonese phonics system, Jyutping(+) should be your first choice.

Native speakers may declare phonics unnecessary because “the pronunciation can usually be inferred”, citing examples such as

rootderived characters
baau1 baau1 baau1
maa5 maa5 maa5

In reality, exact matches happens for about 15% of the characters. Partial matches (e.g., 軍 gwan1 -> 運 wan6 , 乍 zaa3 -> 作 zok3 ) and characters that cannot be decomposed (e.g., 為、中、西) are the norm. The following figure shows the exact matches in green and (generous) partial matches in yellow.

Unreliable phonetic components

However, what demolishes the “inferred pronunciation” approach are the copious false friends (highlighted red in previous image). These glyphs deceptively look like combination of a radical 部首 and phonetic component 聲旁. The next table shows examples where the “phonetic component” recurs but derived glyphs are all pronounced differently.

rootderived characters
heoi3 faat3 hip3 keoi1
cyun3 fu6 se6 si4 tou2
mui5 hoi2 fui3 mou5
jaa5 dei6 ci4 taa1
toi4 ji4 je5 zi6
him3 ci3 jyun5 ham2 hei1
sik1 ze3 cou3 zoek3 co3
jat6 coeng1 zing1
muk6 lam4 sam1

Why does this inference fail so often? Chinese glyphs that are composites may be idea composite or sound composite. For example, ⿰人木 = 休 (to rest) is an idea composite, showing a man 人 leaning against ⿰ a tree 木; ⿰日月 = 明 illustrates the idea of “brilliance” by showing the sun 日 appearing with the moon 月 simultaneously. Learners encountering the glyph for the first time do not know its etymology.

Even when a character is a true sound composite, pronunciations of the standalone phonetic root and each character may have diverged. In Ancient Chinese 上古漢語, 海 is dervied from 每 and have the same pronunciation, and all of 怡 冶 治 were read as 台. Two thousand years later this is no longer true.

“Pronunciation by inference” does have its place. The method is more frequently applicable with rarely encountered characters. By the time a learner is familiar with 5,000 characters, inference becomes reasonably reliable.

The other suggestion native speakers often give is to use an identically-sounding character to note the sound of a new character. For example, seeing 討 and annotating it with 土, since they are both tou2 .

This requires you to first learn to associate 土 with tou2 (somehow, without reliance on any phonics), and know at least one unique character for each plausible sound. A large number of these sounds are singletons: 𢝵 is read as fit1 , and fit1 is only for 𢝵. To use same-sounding-glyphs as a phonics system, you’ll need to memorize 5000+ glyphs instead of 23 alphabets and 6 numbers.

And then, there are sounds that are in actual use but have no accepted characters, such as he3 , kam5 , and beu6 . Yikes!

上字取聲母,下字取韻母; 上字分陰陽,下字辨平仄。

請君試以反切標注「掉」白讀音 deu6