From 75542.1003@compuserve.com Fri Sep  8 14:52:46 1995
Date: Fri, 8 Sep 1995 14:49:23 EDT
From: "Randy Nichols, ACA Pres." <75542.1003@compuserve.com>
Reply to: ACA-L <ACA-L@VM1.NoDak.EDU>
To: Multiple recipients of list ACA-L <ACA-L@VM1.NoDak.EDU>
Newsgroups: bit.listserv.aca-l
Subject: XENO's

NORTH DECODER has asked that I repost the short XENO's
data.

Here it is:

LANAKI


                     XENOS by LANAKI




As a prelude to the thread on Chinese crypto (which is the
exception to the rule),  I thought I might point out the
cryptographic common ground for most languages.

I used to think that Xenos (aka Xenocrypts: non-English
cryptograms) were very difficult to solve.  The 'aha' light went
on several years ago, when I realized that most languages share
the common framework of mathematics and statistics.  To solve
Xenos, it is only necessary to learn the basic mathematical
structure of the language, to use a bidirectional translation
dictionary, and to recognize the underlying cipher construct.
(1)

Many challenge ciphers start with the problem of recognizing the
language and then the distribution of characters within the
particular language. (2)   The legendary W. F. Friedman once
remarked: "treating the frequency distribution as a statistical
curve, when such treatment is possible, is one of the most
useful and trustworthy methods in cryptography." (3)
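
As a rough illustration of the frequency distribution Friedman
refers to, here is a minimal Python sketch that tallies letter
percentages for any text.  The function name letter_frequencies
is my own, and the sketch assumes the text has already been
reduced to the 26 unaccented letters A-Z (anything else is
ignored).

```python
from collections import Counter


def letter_frequencies(text):
    """Return (letter, percent) pairs, most frequent letter first."""
    # Assumption: only unaccented A-Z count; other characters are skipped.
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total == 0:
        return []
    counts = Counter(letters)
    # Sort by descending count, then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(letter, 100.0 * n / total) for letter, n in ranked]


if __name__ == "__main__":
    sample = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"
    for letter, pct in letter_frequencies(sample)[:5]:
        print(f"{letter} {pct:5.1f}")
```

Run on a ciphertext, the same tally gives the curve to compare
against a language's expected distribution.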

Table 1 gives the frequency distributions of ten of my favorite
languages (sans Russian and Japanese, which require character
sets that will not transfer via my e-mail).  The frequencies in
Table 1 have been developed from various sources.  (4), (5),
(6), (7)  They may differ from other published data that are
based on text derived solely from literature or military
sources, because I have included the practical text from my
solved Xenos over the years.  The letters used in cryptograms
tend to shift the frequency distribution.  Frequencies of
letters, and their order, are not fixed quantities in any
language.  Group frequencies, however, are fairly constant in
every language.


                           TABLE 1
    Partial Frequency Distribution For Cracking Xenocrypts


            16   8   7  6    5    4     2       <1

NORWEGIAN:  E   RNS  T  AI  LDO  GKM  UVFHPA'  JBO' YAECWXZQ



            10  9    7    6    4   3      <2

LATIN:      I   E   UTA  SRN  OM  CPL    (bal)


            18   8    7    6   5  4   3   2    <1

FRENCH:     E    AN  RSIT  UO  L  D  CMP  VB   F-Y



            14  13  12   8  6    5     4   3   2   <1

PORTUGUESE: A   E   O   RS  IN  DMT   UCL  P  QV   (bal)



            18  11  8  7    5     4    3    2     <1

GERMAN:     E   N   I  RS  ADTU  GHO  LBM  CW    (bal)



            15  12  8    7    5   4   3    1      <1

CATALAN:    E   A   S  ILRNT  OC  DU  MP  BVQGF   (bal)



            16  13  8   6    5      4    3    <2

HUNGARIAN:  E   A   T   OS   LNZ   KIM  RGU  (bal)



            13  12  11  9  7    6   5    3     2   <1

ITALIAN:    E   A   I   O  L   NRT  SC  DMO'U  VG   (bal)



            20  10   7   6  5   4   3      2       <1

DUTCH:      E   N   IAT  O  DL  S  GKH  UVWBJMPZ   (bal)




            13   9  8   7   5    4   3    1    <1

SPANISH:    EA   O  S  RNI  DL  CTU  MP   GYB  (bal)



             [ note: ' = accent not on my keyboard ]
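
One possible way to put Table 1 to work on the language-
recognition problem: a simple substitution hides which letter is
which, but it does not change the rank-ordered list of
percentages.  The sketch below compares that ordered profile with
a few rows transcribed from Table 1.  The names TABLE1,
table_profile, text_profile and rank_languages, the choice of
five languages, the depth of 10 and the squared-difference score
are all my assumptions for illustration, not a tested method, and
short cryptograms will give noisy results.

```python
from collections import Counter

# Rows transcribed from Table 1: (percent, letters sharing that percent).
# Only languages whose rows contain no accent marks are included here.
TABLE1 = {
    "FRENCH": [(18, "E"), (8, "AN"), (7, "RSIT"), (6, "UO"),
               (5, "L"), (4, "D"), (3, "CMP"), (2, "VB")],
    "PORTUGUESE": [(14, "A"), (13, "E"), (12, "O"), (8, "RS"),
                   (6, "IN"), (5, "DMT"), (4, "UCL"), (3, "P"), (2, "QV")],
    "GERMAN": [(18, "E"), (11, "N"), (8, "I"), (7, "RS"),
               (5, "ADTU"), (4, "GHO"), (3, "LBM"), (2, "CW")],
    "DUTCH": [(20, "E"), (10, "N"), (7, "IAT"), (6, "O"), (5, "DL"),
              (4, "S"), (3, "GKH"), (2, "UVWBJMPZ")],
    "SPANISH": [(13, "EA"), (9, "O"), (8, "S"), (7, "RNI"),
                (5, "DL"), (4, "CTU"), (3, "MP"), (1, "GYB")],
}


def table_profile(rows):
    """Expand (percent, letters) rows into a descending list of percents."""
    return [pct for pct, letters in rows for _ in letters]


def text_profile(text, length):
    """Percentages of the `length` most frequent letters, highest first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    counts = sorted(Counter(letters).values(), reverse=True)
    total = len(letters) or 1
    profile = [100.0 * c / total for c in counts[:length]]
    # Pad with zeros if the text uses fewer distinct letters than asked for.
    return profile + [0.0] * (length - len(profile))


def rank_languages(ciphertext, depth=10):
    """Return (language, score) pairs, best match (lowest score) first."""
    scores = {}
    for name, rows in TABLE1.items():
        ref = table_profile(rows)[:depth]
        obs = text_profile(ciphertext, len(ref))
        scores[name] = sum((o - r) ** 2 for o, r in zip(obs, ref))
    return sorted(scores.items(), key=lambda kv: kv[1])
```

The score only suggests which languages to try first; the
dictionary and the cipher construct still have to confirm it.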

English has its characteristic frequencies and sequence data
(based on 10,000 letters):

%        12   10 8   8 7 7 7 6 5   4-3    2      1      <1

ENGLISH: E  / T  A / O N I S R H / LDCU / PFMW / YBGV / KQXJZ



Group Percentages:

A E I O U          38.58%

L N R S T          33.43%

J K Q X Z           1.11%

E T A O N          45.08%

E T A O N I S R H  70.02%
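
The group percentages above can be recomputed for any sample of
plaintext.  Here is a small sketch; the function name
group_percentages and the GROUPS list are mine, the groups are
the English ones listed above, and for another language you would
substitute that language's own high- and low-frequency groups.

```python
# Letter groups taken from the English group percentages listed above.
GROUPS = ["AEIOU", "LNRST", "JKQXZ", "ETAON", "ETAONISRH"]


def group_percentages(text, groups=GROUPS):
    """Return {group: percent of letters in text that fall in the group}."""
    # Assumption: only unaccented A-Z are counted as letters.
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    result = {}
    for group in groups:
        hits = sum(1 for ch in letters if ch in group)
        result[group] = 100.0 * hits / total if total else 0.0
    return result


if __name__ == "__main__":
    sample = "Group frequencies are fairly constant in every language"
    for group, pct in group_percentages(sample).items():
        print(f"{' '.join(group):<18} {pct:6.2f}%")
```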


Digram Order:  TH / HE / AN / IN / ER / RE / ES / ON / EA / TI
                / AT / ST / EN / ND / OR

Trigram Order: THE / AND / THA / ENT / ION / TIO / FOR / NDE

Reversals:   ER RE / ES SE / AN NA / TI IT / ON NO / IN NI
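
Digram, trigram and reversal lists like these can be rebuilt from
sample text in any language.  A minimal sketch follows; the names
ngram_counts and reversal_pairs are my own, and I assume here
that sequences are counted inside words only (whether the figures
above were counted across word breaks is not stated).

```python
from collections import Counter
import re


def ngram_counts(text, n):
    """Count n-letter sequences inside words (none spans a word break)."""
    counts = Counter()
    for word in re.findall(r"[A-Z]+", text.upper()):
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


def reversal_pairs(digrams, top=10):
    """List frequent digrams whose reverse also occurs.

    Each entry is (XY, YX, count of XY, count of YX), ordered by the
    frequency of the more common member of the pair.
    """
    pairs = []
    seen = set()
    for dg, n in digrams.most_common():
        rev = dg[::-1]
        # Skip doubled letters, pairs already reported, and one-way digrams.
        if dg == rev or dg in seen or rev not in digrams:
            continue
        seen.update((dg, rev))
        pairs.append((dg, rev, n, digrams[rev]))
        if len(pairs) == top:
            break
    return pairs


if __name__ == "__main__":
    sample = "THE OTHER READER ENTERED THE ERA OF REASON"
    print(ngram_counts(sample, 2).most_common(5))
    print(ngram_counts(sample, 3).most_common(3))
    print(reversal_pairs(ngram_counts(sample, 2), top=3))
```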

Initials:  T A O   S H I W C   B P F D M R

Finals:    E S T D N R O Y

Vowel %    40%   (y included)
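
Initials, finals and the vowel percentage can be tallied the same
way when word divisions are kept.  Another sketch, with
hypothetical names initials_and_finals and vowel_percentage; the
default vowel set includes Y, matching the note above, and should
be changed for other languages.

```python
from collections import Counter
import re


def initials_and_finals(text):
    """Return (initial-letter counts, final-letter counts) over all words."""
    words = re.findall(r"[A-Z]+", text.upper())
    initials = Counter(word[0] for word in words)
    finals = Counter(word[-1] for word in words)
    return initials, finals


def vowel_percentage(text, vowels="AEIOUY"):
    """Percent of letters that are vowels (Y counted by default)."""
    letters = re.findall(r"[A-Z]", text.upper())
    if not letters:
        return 0.0
    return 100.0 * sum(ch in vowels for ch in letters) / len(letters)


if __name__ == "__main__":
    sample = "THE CAT SAT ON THE MAT BY THE DOOR"
    initials, finals = initials_and_finals(sample)
    print("Initials:", " ".join(l for l, _ in initials.most_common()))
    print("Finals:  ", " ".join(l for l, _ in finals.most_common()))
    print(f"Vowel %   {vowel_percentage(sample):.0f}%")
```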

We can develop a similar mathematical picture for all the
languages in Table 1 (and hence, an entry into a Xeno).  I leave
this as a homework assignment.

------------------------------------------------------

(1)  R. K. Nichols, quote from Keynote Speech to  A.C.A.
     Convention, New Orleans, La., 1993.

(2)  Ibid., from presentation "Breaking Ciphers in Other
     Languages," 1993.

(3)  W. F. Friedman, Riverbank Publications, No. 22, 1922.

(4)  H. F. Gaines, "Cryptanalysis," Dover, New York, 1956.

(5)  W. F. Friedman, "Elements of Cryptanalysis," Aegean Park
     Press, Laguna Hills, CA, 1976.

(6)  W. F. Friedman, "Military Cryptanalysis, Vols I - IV,"
     Aegean Park Press, Laguna Hills, CA, 1990.

(7)  Anonymous, "The Cryptogram," data taken from various
     issues, American Cryptogram Association,  1929 - 1995.

