Date: Mon, 10 Jul 1995 18:22:32 EDT
From: "Randy Nichols, ACA Pres." <75542.1003@compuserve.com>

CHINESE CRYPTOGRAPHY Part II

ENCIPHERING

Dr. August suggests that the Four Corner System and the Chinese Phonetic Alphabet System lend themselves to manual cryptographic treatment. His treatment of these two systems is easier to understand than some military texts on the subject. (16)

Let a message in Chinese be X1, X2, X3, ..., Xn, where Xi represents a character. The code for Xi is the vector union of three sets, v1, v2, and v3: v1 is a single-digit code for tone, v2 is a four- or five-digit Four Corner representation code, and v3 is a six-digit phonetic code representing three phonetic symbols by two digits each. (17)

    X_i = \bigcup_{j=1}^{3} v_j        (eq. 1)

This union is called an asymmetric code.

The Four Corner System encodes characters into several generic shapes. Each character is broken into four (4) quadrants, and each quadrant is assigned the digit of the generic shape that best corresponds to its actual shape. (18)

The Chinese Phonetic Alphabet is Pinyin with symbols instead of English letters. Each symbol corresponds to one of 37 ordered phonetic sounds. The 21 initials, 3 medials, and 13 finals form a unique ordered set - a true alphabet. (19)

The strength of encryption of Chinese depends on the specific character encoding scheme. Three cases are:

1) Phonetic Alphabet Only: The cipher must include both a transposition (to hide cohesion and positional limitations) and a substitution (to hide the frequency patterns).

2) Four Corner System: The cipher can be based on ring operations [performed on codewords rather than characters, either on an individual basis or over the whole message; the name comes from the algebraic operations involving integers mod 10 or mod 37] which super-encipher the encoded text.

3) Combination of Methods 1) and 2): A text encoded by a combination of both methods will need a cipher employing both transposition and substitution.
The transposition needs to mix up the symbols both within codewords and across the message itself. This prevents a bifurcated analysis. (20)

CRYPTANALYSIS

A) Phonetic Alphabet:

Symbol frequencies (percent):

12.6  7    5.7  4.8    4.2    3.8  3.4   3    2.9  2.8  2.4  2.2
I     U    D    ENG/E  an/en  SH   X/ZH  J/u  G    O    ao   H

2.1   2           1.9  1.8  1.6   1.4  1.3  1.2  1.1
ang   a/b/ai/B/z  ei   Q    ou/M  ie   L    F    R

0.8  0.7  0.6  0.3  0.1
t    n/c  ch   k/s  p/el

Initials: sh, d
Medials:  i
Finals:   e, en, eng, in, un, ing, ong

Phi for monoalphabetic substitution = 0.051 (random text = 0.027)

Common Digraphs: ji, ieng, ueng, gu, de, ian, iie, li, ien, qi, xi, uo, izh, zu, shi

Positional Limitations:

1. Initials follow a medial or final.
2. Finals follow an initial or medial.
3. [zh, ch, sh] do not combine with i or u'.
4. [j, q, x] do not combine with a or e finals.
5. qa, qan = no, but quan, qian, qia = yes.
6. No double phonetics in a single codeword.
7. Medials double frequently.
8. 13 limits on combinations within a codeword.

Approximately 63% of characters require two phonetic symbols, about 1/3 require three, and about 4% require one. Tone indicator digits were about 22-23% likely.

B) Four Corner

Digital frequencies:

Digit:  0    1    2    3    4    5    6    7    8    9
Freq:   .30  .14  .15  .07  .10  .03  .07  .08  .04  .02

Phi value = 0.160, compared to a random text value of 0.100.

Dr. August presents a table of digraphs. (22) Combinations Xn - Ym, where n = 0-9 and m = 0, 1, 2, 3, 4, 7, showed the highest frequencies in text encoded with the 5-digit scheme.

DEPENDENCE

In Chinese there is more dependence between encoding and enciphering operations than in English. The choice of the encoding system influences the type of enciphering operations. Dr. August provides solved examples of the above systems. (21)

============================================================

(16) D. A. August, "Cryptography and Exploitation of Chinese Manual Cryptosystems - Part II: The Encrypting Problem", Cryptologia, Vol XIV, No. 1, August 1990.
(17) D. A. August, "Cryptography and Exploitation of Chinese Manual Cryptosystems - Part I: The Encoding Problem", Cryptologia, Vol XIII, No. 4, October 1989.
(18) Ibid., Part I, p287.
(19) Ibid., Part I, p299.
(20) Ibid., Part II, p69.
(21) Ibid., Part II, p73-78.
(22) Ibid., Part II, p72.
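
The Four Corner phi value quoted in the cryptanalysis section can be checked directly from the digit frequencies given there. What follows is only a minimal sketch in Python, assuming phi is taken as the sum of squared symbol probabilities (the usual expected-coincidence form of the phi test); the function and variable names are illustrative and do not come from Dr. August's papers. With the rounded frequencies from the table, the result comes out near the quoted 0.160, and a uniform 10-digit or 37-symbol alphabet gives the quoted random values.

```python
from collections import Counter

# Four Corner digit frequencies as quoted in the cryptanalysis section (rounded).
FOUR_CORNER_FREQ = {
    0: 0.30, 1: 0.14, 2: 0.15, 3: 0.07, 4: 0.10,
    5: 0.03, 6: 0.07, 7: 0.08, 8: 0.04, 9: 0.02,
}


def phi_expected(probabilities):
    """Expected coincidence rate: the sum of squared symbol probabilities."""
    return sum(p * p for p in probabilities)


def phi_observed(symbols):
    """Observed coincidence rate of a sample: sum f(f-1) / (N(N-1))."""
    counts = Counter(symbols)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two symbols")
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


if __name__ == "__main__":
    # Four Corner plaintext digits versus uniformly random digits.
    print(f"{phi_expected(FOUR_CORNER_FREQ.values()):.3f}")  # 0.161
    print(f"{phi_expected([1 / 10] * 10):.3f}")              # 0.100
    # Uniform baseline for the 37-symbol phonetic alphabet.
    print(f"{phi_expected([1 / 37] * 37):.3f}")              # 0.027
```

In practice, phi_observed would be run on intercepted digits: a value near 0.16 suggests the Four Corner frequency pattern is still showing through, while a value near 0.10 suggests the super-encipherment has flattened it.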