Understanding Corruption in Kanji (part 2) – Outlier Linguistics

Understanding Corruption in Kanji (part 2)

Note: Before we get started here, I’d like to mention our motivation for making posts like these. It is not to show how we explain kanji in our dictionary, or to say that learners need to understand kanji corruption on a deep level in order to learn kanji. In the dictionary, explanations will be much shorter and more concise. But hey! This is a blog! Here we’d like to have a little fun and show how paleography really works to people who normally wouldn’t get the chance. This can be difficult stuff (it’s a graduate-level class in Taiwan, after all!), so please ask us if anything is unclear. We love talking about this stuff! We’re here to help. Get ready for a wild ride. Here are some of the main ingredients for this post: tattooing people’s faces, bloody battlefields with rotting corpses, ghost fire and rain dances!


Looking at the Etymologies of 黒, 粦, 無, and 舞

By Ash Henson

First, a note on our use of the term “corruption.”

In the previous post in this series, the notion of corruption was introduced as well as one of the main reasons it occurs: the process of writing itself (杜忠誥 2002:32)[1]. Now, we’ll start looking at some of the types of corruption that occur. Specifically, we’ll be looking at the first of the eight types of corruption identified by Prof. Tu Chung-kao (杜忠誥) by exploring the etymologies of some common kanji.

Type 1 Corruption: Disintegration

Corruption by way of Disintegration is defined as a kanji form, or the form of one of its components, that was originally a single piece being broken into two pieces (2002:53)[2]. Let’s take a look at the etymologies of 黒 & 粦 to get a better understanding of what this means.

Example 1: 黒 (コク) “black”

Above are two early forms of 黒. (1a) is from Shang dynasty (1700 to 1100 BCE) oracle bone inscriptions (甲骨文, コウコツブン), while (1b) is from the early Zhou dynasty[3] (1059 to 255 BCE). (1a) and (1b) are pictures of the front view of a person whose face had been tattooed as punishment[4] (墨刑, ボクケイ) for committing a crime (2002:54)[5]. The body is the same form as 大 (ダイ, a picture of an adult person from the front), with the head exaggerated. According to Chinese History: A New Manual by Endymion Wilkinson[6], this was one of the “five punishments” (刑, ケイ)[7]. A person receiving 墨刑 had the name of their crime tattooed on their body. If the tattoo was on the face, as it is here with 黒, it was called 黥面 (ケイメン). Forms from the Spring and Autumn and Warring States periods sometimes added dots, as can be seen in (1c) and (1d)[8] below (note that in pre-Han dynasty scripts, kanji change and variation were extremely common and also varied by geographic area):

According to Prof. Tu, the addition of dots here was to reinforce the notion of being marked by the 墨刑 tattoo (2002:55).

The bottom of 黒, as stated above, was originally 大. During the Eastern Zhou, 大 became more stylized and was written as follows[9]:


Forms (1e) to (1h) are very typical ways of writing 大 during the Warring States period, and as can be clearly seen, 大 has already broken into two parts. Forms (1e) and (1g) are still almost a single piece, while forms (1f) and (1h) show a significant amount of separation. What caused this change?


Form (1i) is already very close to the modern form. (1k) is the typical Warring States form similar to forms (1e) to (1h) above. (1j) is a superimposition of (1i) in red, onto (1k) in black, showing that these forms are still essentially the same. The most likely cause for this change is convenience and ease of writing. Though (1k) has one more stroke, it’s actually easier to write and can be written in a more fluid motion (because the way the brush is lifted between strokes is less laborious). At the same time, we can also see how forms can get broken into smaller pieces.

The addition of the extra dots as seen in forms (1c) and (1d), combined with writing 大 in two separate parts, like above, caused the bottom of 黒 to look very similar to 炎 (エン) “the top of a burning flame.” This is readily seen in (1l), which is the form that appears in the Setsumon Kaiji (説文解字, セツモンカイジ).


In forms (1m) and (1n), the top “fire” gets straightened out, while the bottom retains its 火-ness. In (1n), the top part looks like 里, while the bottom looks like a taller version of 灬, the form that 火 usually takes in modern kanji when it appears at the bottom of a kanji. (1o) looks mostly like the modern form. Had 黒’s form not become corrupted, it would probably look like [image] now. The original form [image], a single “piece,” being corrupted into [image], which has multiple parts, is an example of corruption by way of Disintegration.

So, 黒 originally depicted a person with a tattooed face. Dots were added to emphasize the tattoo. The dots, along with the person’s body, separated over time and corrupted into 炎. The top 火 “straightened out” and combined with the face to resemble 里, while the bottom 火 became 灬. Thus, the 灬 in the modern form is simply a corruption of an earlier form, and actually has nothing to do with fire.

And if you’re wondering how we boil this information down in our dictionary, here’s a screen shot:

黒 entry

Example 2: 粦 (リン) “friar’s lantern, ignis fatuus, jack-o’-lantern”

Another example of corruption by way of disintegration is 粦, which is phonetic in kanji such as 隣 (リン) “neighbor,” 憐 (レン) “pitiful,” and 麟 (リン) “(Chinese) unicorn; giraffe.” If you aren’t familiar with any of those terms given as the definition for 粦 above (I wasn’t), Dictionary.com gives this definition: “A pale flame or phosphorescence sometimes seen over marshy ground at night. It is believed to be due to the spontaneous combustion of methane or other hydrocarbons originating from decomposing organic matter.” In Japanese this type of fire is called リン.


The earliest forms of 粦 are the oracle bone (甲骨文, コウコツブン) form (2a) and the bronze inscription (金文, キンブン) form (2b). This is a picture of a person, represented by 大 (ダイ), with dots (or small lines) above and below each arm. Notice the difference at the bottom. Form (2a) doesn’t have the feet explicitly drawn, while (2b) does.

Side note: Getting off on the right foot:

An interesting thing about the feet is that, in the oracle bone and bronze inscription scripts, there are many kanji which have variants with feet, but the meaning expressed is the same whether the feet are there or not. We know this because of the meaning they are used to express in a given context. A good example is 無 “not, none” & 舞 “dance.”


無 is the original form of 舞. Its earliest forms (2c) and (2d)[10] are pictures of a person doing a rain dance, holding either ox tails or bird feathers as ornaments (季旭昇 2004:470)[11]. It originally meant “rain dance” and later came to mean just dancing in general. It wasn’t until after the Western Zhou dynasty that 無 was borrowed by way of sound loan to mean “not, none” (2004:493). The meaning “dance” was then represented by adding feet (舛) to the 無 form, producing 舞, as seen below in (2e):


So, these two variant forms (at one time or another) both represented the meaning “rain dance” or “dance.” As such, it can be seen that the addition of feet didn’t affect the meaning.

A quick detour en route back to the dots:

As to the explanation for the dots (or little lines), first we must take a rather gruesome detour. Rotting corpses produce phosphine gas. According to Wikipedia: “Phosphine gas is more dense than air and hence may collect in low-lying areas. It can form explosive mixtures with air and also self-ignite.” The Setsumon Kaiji’s definition for 㷠 (粦) is “㷠 (粦) is the blood of dead soldiers as well as that of their cows and horses. 㷠 is ghost fire.”[12] So, basically, the rotting corpses of soldiers, cows and horses on a battlefield release phosphine gas, and under hot conditions the phosphine gas may self-ignite. When it does, it gives off a green-colored flame, hence the name 鬼火 or ghost fire.

Now back to the dots (for real):

Returning to the explanation for the dots, Shirakawa Shizuka (白川静) believes the dots to represent blood[13], while Chi Hsiu-Sheng (季旭昇) sees them as the actual flames of the ghost fire. In the images below, you can see that dots have been used to represent both fire and blood (血, ケツ):


From a kanji form analysis perspective, both explanations are reasonable, since each has a precedent. However, from a meaning perspective, the fire explanation has the upper hand because it is a more direct explanation that is tied directly into the original meaning, basically “pale green flame.” Blood is also related in the sense of the role it plays in gruesome, bloody battlefields, which are one of the places where the “ghost fire” appears, but its role is indirect, whereas the fire explanation is direct. As such, the dots representing fire is the better explanation.

In the form that appears in the Setsumon Kaiji, the top part has already been corrupted into 炎 (エン) “flame” by roughly the same process that corrupted the bottom of 黒 into 炎. The fact that the meaning of 㷠 has to do with fire most likely also played a role (even though flames don’t usually have feet).


Forms (2h) to (2k) are Han dynasty forms of 鄰 & 隣 – variant forms, both pronounced lín (杜忠誥 2002:67). Interestingly, the left half of (2h) is still very close to the original form (2b) above. The left half of (2i) is close to the version that appears in the Setsumon Kaiji, while the top of the right half of (2j) has been corrupted into with dots above it. The (2k) form is basically the same as the modern form: the top part has been corrupted into 米 (ベイ, “rice”), while the feet (舛) on the bottom are still intact.


So, to recap, the original form (on the left above) was a single, unified form. By the time of the Setsumon Kaiji (middle form), it had already been broken into several pieces and the modern form (on the right) remains in separate pieces, with the further corruption of 炎 into 米. So, what was originally a picture of a person on fire became corrupted into a flame with feet, then into uncooked rice with feet! In Outlier terms, 米 is an empty component (i.e., doesn’t give a sound or meaning; it’s like a placeholder for an older form) for the original form of a burning body.

In a future corruption post, we’ll discuss the 2nd type of corruption: Corruption by way of Connection, where we’ll visit the etymologies for 折 & 制.

Before we begin, let's define the special use of the word "corruption" as it is used in this article. The use of "corruption" is reserved for kanji form changes that result in a degradation of the ability of a kanji to express sound and meaning. It is a translation of the Chinese term 訛變.

Important points:

1. It does not describe all kanji form changes, only ones that result in a degradation of meaning/sound representation.

2. It is not meant to convey the idea of getting back to a previous perfect state. It is merely saying that an unhindered ability to represent a sound or meaning is better than the lack of such ability.

Take a specific example, 做:

做 is derived from 作+攵 or 亻+𢼎. Let's take a look at the sounds of the parts:

乍 サ、 サク

作 サ、 サク

做 サ、 サク、 ソ

故, 古 コ

In the modern form, the middle part 古 was originally the sound component 乍. It changed from 乍 to 古 as the result of graphical confusion. Now, you have 做 with a structure that makes very little sense. It is not related to 故 or 古 in any meaningful way (i.e., it does not give the sound コ, and it does not give a meaning related to 故 or 古). Yes, corruption has a negative connotation, but that's an accurate description. 乍 gives a sound in 作. 做 is a corruption of 作+攵. 古 does not give sound or meaning in 做. 做 has lost part of its ability to express sound.
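For readers who like to see the reasoning spelled out mechanically, here is a minimal Python sketch of the comparison above. It uses only the readings listed in this post. The function name `gives_sound` and the "shares at least one reading" test are our own simplifying assumptions for illustration, not a method from the paleographic literature; real sound components often give a similar rather than identical sound.

```python
# On'yomi readings exactly as listed in the post above (katakana).
READINGS = {
    "乍": {"サ", "サク"},
    "作": {"サ", "サク"},
    "做": {"サ", "サク", "ソ"},
    "古": {"コ"},
    "故": {"コ"},
}


def gives_sound(component: str, kanji: str) -> bool:
    """Hypothetical helper: True if the component shares at least one
    listed reading with the kanji (a crude proxy for 'sound component')."""
    return bool(READINGS[component] & READINGS[kanji])


# 乍 works as a sound component in 作, and would have in 做.
print(gives_sound("乍", "作"))  # True
print(gives_sound("乍", "做"))  # True

# 古, the corrupted form now sitting in 做, shares no reading with it.
print(gives_sound("古", "做"))  # False
```

Under this rough test, 古 behaves like an empty component in 做, while the original 乍 would still have pointed to the sound.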

Prescriptivism vs. Descriptivism

We are not prescriptivist. I, myself, come from Texas and pronounce the word "get" as if it rhymed with "sit" (not "set"). I like my pronunciation even though it is not standard. When I learn foreign languages, I'm okay with making mistakes as long as they are mistakes that a native speaker would make. We are seeking to describe linguistic phenomena, not tell everyone what to do.

Since written records are only incomplete reflections of spoken language and since they appear late in history, it makes no sense to talk about "the original meaning" of a spoken word. It also doesn't make sense to say that semantic, syntactic, or phonological change is necessarily bad. Language change is necessary and happens in all languages at all times (though at varying speeds). The use of "corruption" to describe a character form that has lost some or all of its ability to record sound and meaning is in no way similar to describing regular language change as being a "corruption."

The notion of corruption is relevant to language learning

According to memory experts, the number one rule for effective memorization is understanding the thing you are trying to remember. Kanji corruption is one of the major reasons for empty components (components that express neither sound nor meaning in a kanji). If you go giving a meaning to every empty component, you add noise to your learning system. Any kanji has an infinite number of possible stories, but only the real story will help you see the overall semantic and sound patterns that kanji express. Knowing those patterns is useful for both learning and recall.

  1. 杜忠誥,《說文篆文訛形釋例》,台北市:文史哲出版社,2002年。

  2. My definition here is a more general version of Tu Chung-kao’s definition. His original definition: 凡商、周以來一脈相承的古篆文字,其形體或部件,本當連合為一體的,在《說文》篆文中,則離析或斷裂為二,因而導致字形與原義之乖離者,是為「離析之訛」。And my (quite literal) translation: If any kanji or component forms for any ancient Seal kanji since the Shang and Zhou dynasties, derived from the same origin, that used to form a single unit, but within the Seal kanji forms in the Shuōwén have disintegrated or broken into two, and consequently cause the kanji form and its original meaning to become separated, this is Corruption by Way of Disintegration (2002: 53).

  3. Images taken from 季旭昇2004年《說文新證‧下冊》,藝文印書館印行,第113頁。 季旭昇 labels them as 1 and 2.

  4. This explanation was first put forth by Tang Lan (唐蘭).

  5. Prof. Tu’s original description: 象顔面被墨刑之人的正面形(2002:54)。

  6. Endymion Wilkinson, 2012. Chinese History: A New Manual. Cambridge: Harvard University Press.

  7. The other four punishments were: 劓 “cutting off the nose,” 刖 “cutting off one or both feet,” 宫 “castration,” and 大辟 “the death penalty” (Wilkinson 2013:311). Cheery bunch, the ancients.

  8. (c) and (d) are taken from 杜忠誥:2002, page 53. Our (c) is his 7 and our (d) is his 9.

  9. These images come from 杜忠誥:2002, page 53. Our (e) is his 1, (f) his 2, (g) his 3 and (h) his 5.

  10. Images taken from 季旭昇2004年《說文新證‧上冊》,藝文印書館印行,第470頁。 季旭昇 labels them as 2 and 3.

  11. The original description:「人持牛尾、鳥羽等舞具跳舞求雨;引伸為一切跳舞」(English translation is mine).

  12. This is my own translation. 《說文》「兵死及牛馬之血為㷠,㷠、鬼火也。」

  13. “(They) represent the image of being dripping wet with blood”「表示鮮血淋漓之象」(my translation) (杜忠誥2002:67).


  • @Nathan, yes we want to do a paper copy, but at the moment we don’t have any concrete plans.

    Henshall is a historian, not a paleographer, and it shows in his work. He tends to stick with Japanese paleographers from the 70s, and even in his updated version from a few years ago, the only recent Chinese paleographic work he cites is 裘錫圭《文字學概要》, which, while it is excellent, is sort of a survey/intro-level book. The result is that a lot of Henshall’s explanations are quite dated at best, and very controversial at worst (Shirakawa, who he cites often, was controversial even when he first published—even more so now). Not only that, but Henshall has a tendency to cite several competing theories, and then just recommend a mnemonic that isn’t based on any of them. On the one hand, it’s good that he’s not “picking a winner,” because he probably doesn’t have the academic training to do so, but on the other hand, it’s confusing for the learner to have contradictory information presented.

    All that being said, at the moment it’s probably the closest thing there is in print to an authoritative English-language dictionary of character etymology for learners. Unfortunately, it still leaves much to be desired.

    Hope that helps!

    John Renfroe
  • After reading a couple entries here, I was immediately excited to get my hands on the dictionary. I hope you will consider doing a paper copy someday.

    I have a copy of Henshall’s book, but it doesn’t seem to be as thorough as yours. I’m curious to hear what you think of the Henshall book.

    Nathan Glenn
