Last updated February 11, 2005

About Konton 2.1

Description of Konton2's PRNG

Konton2's Strength

Konton2's Speed

Legal Issues

Konton2 Source Code

The Tests in `KonStat2.c`

Cryptography Links

Books

Other Links

Results of Statistical Tests of Randomness on Konton2,
RC4, and Snow2

Results of Statistical Tests of Randomness on
Some Weak PRNGs

Analysis of Yasmarang (a PRNG devised by Ilya O.
Levin)

Introduction to Pseudo-Random Number Generators (PRNGs)

Randomness

Konton (pronounced "kone-tone") is a Japanese word which means "primordial
chaos." It appears, written in Japanese *kanji*, at
the top of this page.

Konton 2.1 is a stream cipher based on a cipher I developed called Konton 2.0, which was based on a cipher called Konton. Rather than continue to detail the history of the changes, I will just refer the reader to Description of Konton2's PRNG below. Konton 2.1 will be referred to simply as Konton2 in the rest of this document.

Konton2:

- does not resemble RC4 and does not use shift registers
- encrypts byte-by-byte
- can encrypt with feedback from the encryption and decryption functions so that different plaintexts encrypted with the same key are encrypted with different keystreams.
- can encrypt in synchronous mode
- can encrypt only text and leave other bytes untouched
- can use indexed encryption, where each block of data of data is stored or sent along with an index value that, along with key, is used to initialize Konton2 before each encryption and decryption.

In the explanation that follows:

"`x mod y`

" means "x modulo y," and

"`x rol y`

" means "x rotated y bits to the left."

Bit rotation is sometimes called "circular shifting."

The variables in the PRNG are: | |
---|---|

`n` |
a constant: `n = 32` |

`C` through `C` |
a series of constants: `C` |

`S` through `S` |
an array of 32-bit integers: the main state data.
It is initialized to an image of the encryption/decryption key. |

`P` |
a 32-bit integer: the highest-order 16 bits of `P` are the output
of the PRNG.
`P` is initialized to zero and left intact as part of the state
data between calls to the PRNG. |

`R` |
an integer: the number of bits to rotate left.
`R` is initialized to zero and left intact as part of the state data
between calls to the PRNG. |

The steps in one call to the PRNG are as follows:

`Let S`

_{i}= (S_{i}+ C_{i}) mod 2^{32}`If i > 0 and S`

_{i-1}= 0

Let S_{i}= (S_{i}+ C_{i}) mod 2^{32}`Let R = (R + S`

_{i}) mod 32`Let P = P rol R`

`Let R = (R + P) mod 32`

`Let P = (P + (S`

_{i}rol R)) mod 2^{32}

**For i = 0 to n-1:**

`If P = 0 and R = 0`

Let R = 1

**Final Step (after n iterations of steps 1 through 6):**

- Steps 1 and 2 cause
`S`

to assume all 2^{1024}possible states before any state is repeated, ensuring a cycle length of at least 2^{1024}. `n`

is actually equal to`8 × KONCRYPTSTRENGTH`

, where`KONCRYPTSTRENGTH`

is a macro defined in the source code. The standard definition of`KONCRYPTSTRENGTH`

is 4.- The final step deals with a small, but real, problem. After many (something
on the order of 2
^{37}) calls to the PRNG,`P`

and`R`

will eventually both return to zero, their original state. From that point on, the keystream generated would (without the final step) be the same as the keystream that would have been generated had the original state of`S`

been the state of`S`

at the time that`P`

and`R`

returned to zero. In other words, after about 2^{37}calls, one key would be equivalent to another. If keys are chosen randomly, this is not likely to become a problem. However, because the problem is easy to eliminate, it is eliminated. - Two bytes are encrypted or decrypted with each call to the
PRNG. Bits 16 through 23 of
`P`

are used to encrypt or decrypt the first byte, and bits 24 through 31 are used to encrypt or decrypt the second (where bit 0 is the lowest-order bit, and bit 31 is the highest-order bit). The lowest-order 16 bits of`P`

are not used for encryption or decryption. The combining function for encryption and decryption is simple bitwise XORing. After each pair of bytes is encrypted or decrypted, the quantity`(a + (b × 256))`

is added (mod`2`

) to^{32}`P`

(where`a`

is the first plaintext byte encrypted or decrypted and`b`

is the second). Note that it is always the**plaintext**that is added to`P`

(not the ciphertext), and that the plaintext is added to`P`

**after**encryption or decryption. - When an odd number of bytes are encrypted or decrypted, the last byte is
encrypted or decrypted with bits 16 through 23 of
`P`

, and the last plaintext byte is added (mod`2`

) to^{32}`P`

. - Konton2 can encrypt and decrypt in synchronous mode, where the state data
is unaffected by the plaintext. In this case the plaintext is not added to
`P`

. An advantage of this mode is the lack of forward error propagation. A disadvantage is that every plaintext encrypted with a given key is encrypted with the same keystream. This problem can be dealt with by using indexed encryption, where`S`

is initialized with the key and with an index value up to 128 bits long. Note that there are only 2^{1024}possible initial states of`S`

, so a given key-index combination will be equivalent to others. Even so, 2^{1024}virtual keys should be far more than enough. - When Konton2 is used to encrypt text and leave other bytes untouched, subtraction (rather than bitwise XORing) is used to combine the keystream with the plaintext.

The PRNG could be seen as having two blocks of state data with two separate
next-state functions. The elements of `S`

could be seen as one block,
and `P`

and `R`

together could be seen as the other.
The next-state function for `S`

(steps 1 and 2) is completely linear,
but the next-state function for `P`

and `R`

(steps 3
through 6) is very non-linear. Moreover, while the next state of `S`

is a function only of its previous state, the next state of `P`

and
`R`

is a function the previous state of `S`

, `P`

,
and `R`

. With each call to the PRNG, every bit of `S`

affects the value of `P`

, and these effects cascade in
complex, chaotic ways as `P`

affects `R`

which affects
`P`

again.

Another way to view the PRNG would be to see `S`

as an essentially
infinite supply of unique 1024-bit integers to feed into the non-linear
function which updates `P`

and `R`

. Each new state of
`P`

and `R`

is a function of their previous states and
the integer supplied by `S`

. Because each 1024-bit integer is easily
predictable from the previous 1024-bit integer, it remains only to analyze the
non-linear function, looking for correlations between its 1061-bit input
(`S`

, `P`

, and `R`

) and its 37-bit output
(`P`

, and `R`

).

If the lowest-order 16 bits of `P`

were used for encryption,
Konton2 would have weak keys. Consider what happens when every element of
`S`

is a multiple of 32. If the previous `P`

was also a multiple of 32, and the previous `R`

was zero,
`P`

will be a multiple of 32 again. `S`

has
2^{1024} possible states, which it always visits in the same order.
Within this series, there are 2^{864} states in which every element of
`S`

is a multiple of 32. This is a very small fraction of the
possible states: one in 2^{160}. However, 2^{27} of these
states occur, spaced 32 states apart, in a segment of states which is
2^{32} states long, beginning with the state where every element of
`S`

is zero. The special condition, where multiples of 32 follow
multiples of 32, occurs too often in some parts of this segment.

The combination of modular addition and pseudo-random bit rotation is also
used in other ciphers. The ciphers discussed in
"Mod *n* Cryptanalysis,
with Applications Against RC5P and M6" by Kelsey, Schneier, and Wagner
are vulnerable because they expose the lower-order bits of the result of
pseudo-random bit rotation and modular addition.

I have done some experiments with altered versions of this PRNG. In one
experiment I reset `P`

and `R`

to zero before each call
to the PRNG. The output appeared random. In another experiment, the algorithm
was just as above except that steps 1 and 2 were eliminated. When the state
of `S`

was pseudo-random, the output appeared random. I have also
experimented with smaller values of `KONCRYPTSTRENGTH`

. The output
appeared random, but I have two concerns about these "cut-down" versions:
first, the avalanche effect (where a given bit of `S`

eventually
affects every bit of `P`

) may not be complete after fewer than 32
iterations of steps 1 through 6; and second, a simpler equation is, in
general, more likely to have a feasibly calculable (possibly even linear)
approximation.

Konton2 (to the best of my knowledge) has not yet received independent analysis. As outlined above (in Description of Konton2's PRNG), the most likely attack against Konton2 seems to be an attack on the non-linear function which updates the parts of the state data used for internal feedback and output. Perhaps some distinguishing characteristic of this function can be found. Perhaps it can even be deduced from the design. Perhaps the equation can be simplified or approximated.

The paper mentioned above,
"Mod *n* Cryptanalysis, with Applications Against RC5P and M6",
describes an attack that might be fruitful, if Konton2 were slightly modified.
Fortunately I noticed the issue when I determined what made the output of Konton
2.0's PRNG apparently non-random. This present version, Konton 2.1, was designed
with a deeper understanding of results of pseudo-random bit rotation and modular
addition.

Every stream cipher PRNG must produce a keystream which is apparently random,
though apparent randomness, alone, doesn't guarantee strength. Konton2's keystream
has passed all the statistical tests of randomness I've subjected it to. In
particular, it has passed all the tests performed by the program created by
compiling and linking `KonStat2.c`

and `Konton2.c`

together
(see Konton2 Source Code below). For information about
the tests this program performs, see The Tests in
`KonStat2.c`

below. To see sample output from this program, look
at Results of Statistical Tests of Randomness on Konton2,
RC4, and Snow2. Konton2 has also passed Marsaglia's and Tsang's GCD, Birthday
Spacings, Gorilla, and Collision Tests. See
"Some difficult-to-pass
tests of randomness" by Marsaglia and Tsang.

Konton2 is slower than some other stream ciphers. On my computer, with its Pentium II running at 231.782 MHz, Konton2 encrypts (in synchronous mode) about 1600 kilobytes per second. Snow 2.0 (the reference implementation) encrypts about 13000 kilobytes per second.

Notice that I have revised the implementation of RC4 in
`KonStat2.c`

to make it as fast as it should be.
See Results of Statistical Tests of Randomness on Konton2,
RC4, and Snow2.

Konton2, nevertheless, should be more than fast enough for most applications. In addition, its small state, just 2085 bits (including a copy of the key), may give it an advantage over some new ciphers which require a few kilobytes.

If you're going to download the Konton2 source code, be aware that U.S.
government regulation of cryptographic software has changed during the last
twenty years and may change again. To see the relevant statutes and regulations,
go to the Legal Information
Institute at Cornell to look at the U.S. Code, and to the
Code of Federal Regulations Online
to look at the U.S. regulations. To stay up to date on cryptography regulation
worldwide, look at
Bert-Jaap Koops' web site.
The book *Cracking DES* (listed below in Books) is an
interesting source of information on the history of U.S. government regulation
of cryptographic software.

If I understand current law correctly, Konton2 is now upatentable (at least in the United States) because a complete (though not beautifully written) specification of it was published more than one year ago.

RC4 (mentioned often in these documents) should, arguably, be called
*alleged* RC4. RC4 is supposed to be a trade secret, but the secret leaked
out some years ago. To the best of my knowledge, though, RC4 is still a
registered tradename of RSA Data Security, Inc.

I have implemented Konton2 in three source code files:

`Konton2.h`

`KonTyps2.h`

`Konton2.c`

The following two files are programs that call Konton2 functions:

There is also a configure script for GNU/Linux (or UNIX) users:

`configure`

I have compressed these six files into the following ZIP archive.

`konton2.zip`

(29 KB--29497 bytes)

To help C programmers use Konton2 correctly, `Konton2.h`

includes
comments which explain and demonstrate how to call Konton2 functions.

`TestKon2.c`

compiled and linked with `Konton2.c`

creates a program which tests to see if Konton2 has compiled correctly.

`KonStat2.c`

compiled and linked with `Konton2.c`

produces a program that runs statistical tests of randomness on Konton2, RC4,
and (optionally) Snow2. Alternatively, a file can be evaluated. This file
should have no special format; it should simply be a sequence of random or
pseudo-random bits. Invoke KonStat2 with a command line argument of "-h" to
find out how to specify the file. For information about the tests performed, see
The Tests in `KonStat2.c`

below.
Results of Statistical Tests of Randomness on Konton2,
RC4, and Snow2 contains sample output from this program.

Before you compile `KonStat2.c`

, you may want to read the first
part of the file to make sure that the definition of `EAT_THE_ENTER`

is appropriate for your implementation of C.

To evaluate Snow2 along with Konton2 and RC4, define the macro
`TEST_SNOW2`

(in `KonStat2.c`

) as 1 and compile and link
`snow2.c`

with `KonStat2.c`

and `Konton2.c`

.

The Snow2 cipher is documented at
the Snow cipher web site, and the file `snow2.c`

is available there
(along with the necessary headers) in the ZIP archive
snow2.0.zip
(approximately 21 KB).

I have included a "configure" script for GNU/Linux (or UNIX) users. To build
KonStat2 and TestKon2, unzip `konton2.zip`

. This will create a new
directory called `Konton2`

. If you want to evaluate Snow2, copy the
Snow2 files (`snow2.c`

, `snow2.h`

, and `snow2tab.h`

)
into this new `Konton2`

directory along with the six files
already there. Then "cd" to the `Konton2`

directory and run the
configure script by typing

`./configure`

When the configure script has finished, type

`make`

That's all there is to it. The configure script takes care of defining the
`TEST_SNOW2`

macro in the makefile it creates, so you don't have to
edit `KonStat2.c`

to change it.

`KonStat2.c`

is a work-in-progress. I am constantly investigating new
statistical tests and trying them out on various PRNGs.

`KonStat2.c`

`KonStat2.c`

runs statistical tests of randomness on Konton2, RC4,
and (optionally) Snow2. Alternatively, a file can be evaluated. This file
should have no special format; it should simply be a sequence of random or
pseudo-random bits. Invoke KonStat2 with a command line argument of "-h" to
find out how to specify the file.
Results of Statistical Tests of Randomness on Konton2, RC4,
and Snow2 contains sample output from this program. The source code is
available above at Konton2 Source Code.

I am constantly experimenting with new tests of randomness. I don't expect that this program will ever be finished.

In the reports created by `KonStat2.c`

:

"P" is the tail probability [i.e., a P-value less than 0.1 indicates failure
at the 0.1 level of significance]

"MAD" is the mean absolute deviation of the P-values from their theoretical
median of 0.5

"tidbit" is a 2-bit unsigned integer [4 possible values]

"nibble" is a 4-bit unsigned integer [16 possible values]

"byte" is an 8-bit unsigned integer [256 possible values]

In the explanation that follows:

">=" means "greater than or equal to"

"<=" means "less than or equal to"

The P-value is a percentile; it is the probability of a statistic equal to or greater than the one reported. P-values for a given test should be more-or-less evenly distributed between 0 and 1. One asterisk (*) appears to the right of a P-value less than 0.1, two to the right of a P-value less than 0.01, three to the right of a P-value less than 0.001, and four to the right of a P-value less than 0.0001. One caret (^) appears to the right of a P-value greater than 0.9, two to the right of a P-value greater than 0.99, three to the right of a P-value greater than 0.999, and four to the right of a P-value greater than 0.9999.

All the statistics calculated are chi-squared statistics. An effort was made to keep all the expected counts in each chi-squared calculation high. Expected counts in the following tests (with a minimum test sequence length of one megabyte) are never lower than 2048.

The "cumulative sum" chi-squared statistic for each test is calculated by summing the statistics for all the keys (or sequences) tested. The degrees-of-freedom for the resulting chi-squared statistic is the degrees-of-freedom for the test multiplied by the number of statistics summed.

When more than one key or sequence is tested, the Mean Absolute Deviation (MAD) of the P-values for each test is calculated. This is the mean absolute deviation of the P-values from their theoretical median of 0.5. If a large number of keys or sequences is tested, the MAD should be very close to 0.25. A value much higher than 0.25 indicates that too many P-values are far from 0.5. A value much lower than 0.25 indicates that too many P-values are close to 0.5.

This MAD has two weaknesses. First, it will look very good if all the P-values are very close to 0.25 and 0.75 (I'm not sure if this is even possible, but it is a weakness); and, second, it is difficult to calculate just how probable a given MAD is. I can give some general guidelines. If the number of sequences or keys tested is at least 256, any MADs greater than 0.30 or less than 0.20 should be viewed with suspicion. If at least 1024 keys or sequences are tested, the MADs should all be 0.24, 0.25, or 0.26.

I'd like to replace the MAD, in the future, with something better. It might present some difficulties, but perhaps some variation of the KS test could be devised.

The tests are mostly byte-oriented because the PRNGs I was most interested in testing produce their output as bytes. The tests devised by Marsaglia and Tsang, which are often used to test stream cipher PRNGs, are mostly oriented toward 32-bit words.

A run is one or more consecutive bits with the same value (either 0 or 1). Whenever a bit has a value different from the preceding bit, a new run is begun. For example, 01001011010100 consists of eleven runs: 0, 1, 00, 1, 0, 11, 0, 1, 0, 1, and 00. In a random sequence, approximately half the runs should be one bit long, a fourth should be two bits long, an eighth should be three bits long, etc. Runs of 1 to 10 bits are counted. Runs of 11 bits and longer are lumped together. A chi-squared statistic is calculated. The degrees-of-freedom is 10.

A run is one or more consecutive bits that alternate in value between 0 and 1 in a regular pattern. The pattern is specified by a parameter passed to the function. If the parameter is 1, the pattern is 010101... or its complement 101010...; if the parameter is 2, the pattern is 001100110011... or its complement 110011001100...; if the parameter is 3, the pattern is 000111000111... or its complement 111000111000...; and so on. Whenever the pattern is not followed, a new run is begun.

For example, if the parameter is 1, the sequence 01001011010100 consists of the following four runs: 010, 0101, 101010, and 0. If the parameter is 2, the same sequence consists of the following nine runs: 0, 1, 001, 0, 110, 1, 0, 1, and 00.

In a random sequence, approximately half the runs should be one bit long, a fourth should be two bits long, an eighth should be three bits long, etc. Runs of 1 to 10 bits are counted. Runs of 11 bits and longer are lumped together. A chi-squared statistic is calculated. The degrees-of-freedom is 10.

The sequence is divided into 256 blocks of equal length. The total number of bits of each possible value in each block is counted. In a random sequence the probability of each possible bit value is 1/2. A chi-squared statistic is calculated. The degrees-of-freedom is 256.

The total number of bits of each possible value is counted. In a random sequence the probability of each possible bit value is 1/2. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

The total number of tidbits of each possible value is counted. In a random sequence the probability of each possible tidbit value is 1/4. A chi-squared statistic is calculated. The degrees-of-freedom is 3.

The total number of nibbles of each possible value is counted. In a random sequence the probability of each possible nibble value is 1/16. A chi-squared statistic is calculated. The degrees-of-freedom is 15.

The total number of bytes of each possible value is counted. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

One bit of each byte of the sequence is evaluated in this test. The bit is specified by a parameter passed to the function. This parameter will be a number from 0 to 7: where 0 specifies the lowest-order bit, and 7 specifies the highest-order bit. The value of the specified bit of each byte in the sequence will be either 0 or 1. In a random sequence the probability of each possible bit value is 1/2. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

The bit test described above is performed for each of the 8 bit positions. The resulting chi-squared statistics for each of the 8 tests are totaled; the total is an overall chi-squared statistic. The degrees-of-freedom is 8.

The bits of each byte of the sequence are summed. The bit sum will be either 0, 1, 2, 3, 4, 5, 6, 7, or 8. In a random sequence, the probabilities of 0, 1, 2, 3, 4, 5, 6, 7, and 8 are 1/256, 8/256, 28/256, 56/256, 70/256, 56/256, 28/256, 8/256, and 1/256 respectively. The total number of bytes with each possible bit sum is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 8.

The sequence is divided into segments of 16 bits. The bits of each segment are
summed. The bit sum (`s`

) will fall within one of three ranges:
`[s<7]`

, `[s>9]`

, or `[7<=s<=9]`

.
In a random sequence, the probabilities of `[s<7]`

and
`[s>9]`

are both `14893/65536`

, and the probability
of `[7<=s<=9]`

is `35750/65536`

. The total number
of segments with bit sums in each of the 3 ranges is counted. A chi-squared
statistic is calculated. The degrees-of-freedom is 2.

The sequence is divided into segments of 32 bits. The bits of each segment are
summed. The bit sum (`s`

) will fall within one of three ranges:
`[s<15]`

, `[s>17]`

, or `[15<=s<=17]`

.
In a random sequence, the probabilities of `[s<15]`

and
`[s>17]`

are both `1281220733/4294967296`

, and the
probability of `[15<=s<=17]`

is
`1732525830/4294967296`

. The total number of segments with bit
sums in each of the 3 ranges is counted. A chi-squared statistic is calculated.
The degrees-of-freedom is 2.

The sequence is divided into 6-byte segments. For each segment, the bits in each byte are summed. The bit sums will be either even or odd. In a random sequence the probability of both even and odd bit sums is 1/2. In each segment, there are 64 possible permutations of even and odd bit sums, each with a probability of 1/64. The number of segments with each possible permutation is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 63.

The sequence is divided into 64-bit segments. Eight bytes are built from these 64 bits. The first byte is built from the 1st, 9th, 17th, 25th, etc. bits in the segment; the second is built from the 2nd, 10th, 18th, 26th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

The sequence is divided into 128-bit segments. Sixteen bytes are built from these 128 bits. The first byte is built from the 1st, 17th, 33th, 49th, etc. bits in the segment; the second is built from the 2nd, 18th, 34th, 50th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

The sequence is divided into 256-bit segments. Thirty-two bytes are built from these 256 bits. The first byte is built from the 1st, 33th, 65th, 97th, etc. bits in the segment; the second is built from the 2nd, 34th, 66th, 98th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

The sequence is divided into 512-bit segments. Sixty-four bytes are built from these 512 bits. The first byte is built from the 1st, 65th, 129th, 193th, etc. bits in the segment; the second is built from the 2nd, 66th, 130th, 194th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

The sequence is divided into 1024-bit segments. One hundred and twenty-eight bytes are built from these 1024 bits. The first byte is built from the 1st, 129th, 257th, 385th, etc. bits in the segment; the second is built from the 2nd, 130th, 258th, 386th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

The sequence is divided into 2048-bit segments. Two hundred and fifty-six bytes are built from these 2048 bits. The first byte is built from the 1st, 257th, 513th, 769th, etc. bits in the segment; the second is built from the 2nd, 258th, 514th, 770th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

An algorithm is used to predict the value of each bit of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/2. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

The algorithm is as follows: the numbers of zeros and ones in all the previous bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted. Otherwise the prediction is the same as for the previous bit.

The interesting part of this test is the prediction made when the numbers of zeros and ones are equal.

An algorithm is used to predict the value of each bit of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/2. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

The algorithm is as follows: the numbers of zeros and ones in the previous 9 bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted.

An algorithm is used to predict the value of each bit of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/2. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

The algorithm is as follows: the numbers of zeros and ones in the previous 17 bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted.

The algorithm is as follows: the numbers of zeros and ones in the previous 33 bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted.

The algorithm is as follows: the numbers of zeros and ones in the previous 65 bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted.

An algorithm is used to predict the value of each byte of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/256. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

The algorithm is as follows: the next byte is predicted to be equal to all the previous bytes bitwise XORed together. The first byte of the sequence is predicted to equal zero.

An algorithm is used to predict the value of each byte of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/256. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

The algorithm is as follows: the next byte is predicted to be equal to the sum of all the previous bytes, modulo 256. The first byte of the sequence is predicted to equal zero.

An algorithm is used to predict the value of each byte of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/256. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

The algorithm is as follows: the next byte value is predicted to be zero until the first zero is found. From that point on, the next byte value is predicted to be the byte value whose last appearance was furthest back in the sequence.

The algorithm is as follows: a given byte value is predicted to be followed by the same byte value it was followed by the last time it appeared in the sequence. A byte value that has not previously appeared in the sequence is predicted to be followed by the byte value of the first byte in the sequence. The first byte of the sequence is predicted to equal zero.

Each byte in the sequence either is or is not identical to its preceding byte. In a random sequence the probability that a byte will be identical to its preceding byte is 1/256. The total number of bytes identical to their preceding bytes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

This test is equivalent to a byte prediction test where each byte is predicted to be equal to its preceding byte. The first byte of the sequence is predicted to equal the last byte of the sequence.

The sequence is divided into two-byte segments. The two bytes in each segment
bitwise ANDed together. The result (`d`

) will fall into one of 7
ranges: `[d<37]`

, `[37<=d<74]`

,
`[74<=d<111]`

, `[111<=d<148]`

,
`[148<=d<185]`

, `[185<=d<222]`

,
or `[d>=222]`

. In a random sequence the probabilities of these
ranges are `32265/65536`

, `10755/65536`

,
`5355/65536`

, `8985/65536`

, `3969/65536`

,
`3171/65536`

, and `1036/65536`

, respectively. The total
number of segments in each of these 7 ranges is counted. A chi-squared statistic
is calculated. The degrees-of-freedom is 6.

The sequence is divided into four-byte segments. The four bytes in each segment
bitwise ANDed together. The result (`d`

) will fall into one of 3
ranges: `[d<73]`

, `[73<=d<146]`

,
or `[d>=146]`

. In a random sequence the probabilities of these
ranges are `3993624225/4294967296`

,
`266241615/4294967296`

, and `35101456/4294967296`

,
respectively. The total number of segments in each of these 3 ranges is counted.
A chi-squared statistic is calculated. The degrees-of-freedom is 2.

The sequence is divided into four-byte segments. The first two bytes and the
last two bytes in each segment are bitwise ANDed together. The results of these
two AND operations are then bitwise ORed together. The result (`d`

)
will fall into one of 3 ranges: `[d<73]`

,
`[73<=d<146]`

, or `[d>=146]`

. In a random
sequence the probabilities of these ranges are
`1573112097/4294967296`

,
`1223531631/4294967296`

, and
`1498323568/4294967296`

, respectively. The total number of
segments in each of these 3 ranges is counted. A chi-squared statistic is
calculated. The degrees-of-freedom is 2.

The sequence is divided into two-byte segments. The second byte in each segment is either equal to, greater than, or less than the first. The total number of segments in each of these three categories is counted. In a random sequence the probability that the second byte will be equal to the first is 1/256. The probabilities that second byte will be greater than or less than the first are both 255/512. A chi-squared statistic is calculated. The degrees-of-freedom is 2.

The sequence is divided into four-byte segments. If the bytes of each segment
are called `a`

, `b`

, `c`

, and `d`

;
then comparing these four bytes will classify the segments into 8 categories:
[`a`

<=`b`

<=`c`

<=`d`

],
[`a`

>`b`

<=`c`

<=`d`

],
[`a`

<=`b`

>`c`

<=`d`

],
[`a`

>`b`

>`c`

<=`d`

],
[`a`

<=`b`

<=`c`

>`d`

],
[`a`

>`b`

<=`c`

>`d`

],
[`a`

<=`b`

>`c`

>`d`

], and
[`a`

>`b`

>`c`

>`d`

].
In a random sequence, the probabilities that a segment will
belong to each of these categories are `2862209/67108864`

,
`8454015/67108864`

, `14046335/67108864`

,
`8322945/67108864`

, `8454015/67108864`

,
`13915265/67108864`

, `8322945/67108864`

, and
`2731135/67108864`

, respectively. The total number of segments in each
category is counted. A chi-squared statistic is calculated. The degrees-of-freedom
is 7.

The sequence is divided into four-byte segments. For each segment, the absolute
difference between the first and third bytes of the segment is added to the absolute
difference between the second and fourth bytes of the segment. The result
(`d`

) will fall into one of 7 ranges: `[d<64]`

,
`[64<=d<128]`

, `[128<=d<192]`

,
`[192<=d<256]`

, `[256<=d<320]`

,
`[320<=d<384]`

, or `[d>=384]`

. In a random sequence
the probabilities of these ranges are `6935371/67108864`

,
`15991947/67108864`

, `18225611/67108864`

,
`14683915/67108864`

, `7696309/67108864`

,
`2865781/67108864`

, and `709930/67108864`

,
respectively. The total number of segments in each of these 7 ranges is counted.
A chi-squared statistic is calculated. The degrees-of-freedom is 6.

The sequence is divided into segments. The length of the segments, in bytes, is a
parameter passed to the function. If any byte value occurs more than once in a
segment, a collision is said to have occurred. In a random sequence, the
probability that no collisions will occur in a segment is
```
(255/256) × (254/256) × (253/256) × ... ×
((257 - x)/256)
```

, where `x`

is the segment length in bytes.
The total number of segments with and without collisions is counted. A chi-squared
statistic is calculated. The degrees-of-freedom is 1.

The sequence is divided into pairs of bytes. The offset in the sequence between the bytes in each pair is a parameter passed to the function. The two bytes in each pair are bitwise XORed together. The total number of resulting bytes of each possible value is counted. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

The divisor (`d`

) is a parameter passed to the function. For each
byte of the sequence (`x`

), `(x mod d)`

is calculated. In
a random sequence, the probability of a given remainder (`r`

) is
`(q + 1)/256`

, if `r < (256 mod d)`

, and
`q/256`

otherwise (where `q = 256/d`

, rounded down). The
total number of byte values that are congruent to each possible remainder
`(mod d)`

is counted. A chi-squared statistic is calculated. The
degrees-of-freedom is `d - 1`

.

sci.crypt news group

sci.crypt news group via Google

International Association for Cryptologic
Research (IACR)

ePrint IACR Cryptology Archive

The Ciphers Page of the Open Directory Project

Search for "stream
cipher" with Google

Bert-Jaap Koops' web site
about cryptography regulation

Prof. George Marsaglia's
famous DIEHARD suite of statistical tests of randomness

"Some difficult-to-pass tests of randomness" by Marsaglia and Tsang

The source code (in C) for Marsaglia's and Tsang's difficult-to-pass tests

The NIST's statistical tests for
cryptographic PRNGs

"Mod *n* Cryptanalysis,
with Applications Against RC5P and M6" by Kelsey, Schneier, and Wagner.

Useful Stuff from
Qualcomm Australia

The Snow cipher web site

The VMPC web site

Terry Ritter's web site

Counterpane's web site

The president of Counterpane is (or, at least, was) Bruce Schneier,
the author of *Applied Cryptography* and *Secrets and Lies* (listed
below in Books).

Bruce Schneier's Site

**Electronic Frontier Foundation**. 1998.
*Cracking DES: Secrets of Encryption Research, Wiretap Politics & Chip Design*.
Electronic Frontier Foundation: [Sebastopal, CA?].

This is the book that couldn't be published electronically in the USA, but
could be published on paper. However, on the cover of the book, the phrase
"Scan this book" appears. It has been scanned and uploaded to the internet,
though the scans I saw were on a domain apparently not in the USA (see
Legal Issues above).

This book shows, in detail, how to build a very expensive machine that can decrypt data encrypted with DES, without the key. The machine does this by brute force, trying every possible key until the right one is discovered. Unfortunately, DES had too small a key space (too few possible keys). DES (the Data Encryption Standard) has now been superseded by AES (the Advanced Encryption Standard).

**Flannery, Sarah**. 2001. *In Code*. Workman Publishing: New York.

This book is a fun, easy introduction to cryptography written by a very
bright high school student to be read by the general public.

**Gaines, Helen**. 1956. *Cryptanalysis*. Dover Publications Inc.: New
York.

This is a good introduction to classic cryptography and cryptanalysis. This
happens to be the first book (for adults) about cryptography I ever read. I
first read it (and did the exercises) back in high school.

**Kahn, David**. 1996. *The Codebreakers*. Scribner: New York.

This is a good history of cryptography, but it was written before the
Bletchley Park stuff was declassified. On the other hand, Kahn's account of
December 7, 1941 is remarkably detailed and completely fascinating.

**Menezes, et al**. 1997. *Handbook of Applied
Cryptography*. CRC Press Inc.: Boca Raton, Florida.

This is extremely useful, but it's not as readable as Schneier's book below.

**Schneier, Bruce**. 1996.
*Applied Cryptography*. John Wiley & Sons:
New York.

This is a very readable book for those who need (or just want) to know about
cryptography.

**Schneier, Bruce**. 2000.
*Secrets and Lies: Digital Security in a Networked World*.
John Wiley & Sons: New York.

In this book Schneier explains why cryptography alone is never enough to
ensure security.

**Singh, Simon**. 1999. *The
Code Book*. Fourth Estate: London.

This is a fun history of cryptography which **does** cover the Bletchley
Park stuff. This book was written for the general public.

**Stephenson, Neal**. 1999. *Cryptonomicon*. Harper Collins:
New York.

This is Stephenson's fifth novel, and it's probably his best. His third
novel, *Snow Crash*, and his fourth novel, *The Diamond Age*, are also
excellent. Both have been published in paperback by Bantam Spectra. Stephenson has recently
completed a trilogy, the *
Baroque Cycle*, published by
Harper Collins. I found the first novel of the trilogy, *Quicksilver*,
disappointing. The second novel, *The Confusion*, was better, I thought. The
third, called *The System of the World*, is (I think) quite good.

Wikipedia

MathWorld

CIA Factbook

Slashdot

Hubble Telescope

Chandra X-Ray Observatory

Neal Stephenson's Home Page

Stephenson is the author of *Cryptonomicon* (listed above in Books).

Note: If you're looking at the local copy of this page created by unzipping the zip file below, the link to the zip file won't work; the zip file can't contain itself. In any case, it shouldn't need to contain itself; it is itself.

This Konton 2.1 Site Unaltered by Yahoo (in a zip file)

Note: The links below are non-local; they refer to directory, http://www.oocities.com/da5id65536/. This will only matter if you're looking at the local copy of this page created by unzipping the zip file above.

The Original Documents

The Infamous HR 3920

Sudoku

Logarithms and Antilogarithms in C

Math Functions

Dave's MIDI