Konton 2.1: A Stream Cipher by J. David Sexton

Last updated February 11, 2005

About Konton 2.1
Description of Konton2's PRNG
Konton2's Strength
Konton2's Speed
Legal Issues
Konton2 Source Code
The Tests in KonStat2.c
Cryptography Links
Books
Other Links

Appendices

Results of Statistical Tests of Randomness on Konton2, RC4, and Snow2
Results of Statistical Tests of Randomness on Some Weak PRNGs
Analysis of Yasmarang (a PRNG devised by Ilya O. Levin)

Appendices for Novices

Introduction to Pseudo-Random Number Generators (PRNGs)
Randomness

About Konton 2.1

Konton (pronounced "kone-tone") is a Japanese word which means "primordial chaos." It appears, written in Japanese kanji, at the top of this page.

Konton 2.1 is a stream cipher based on a cipher I developed called Konton 2.0, which was based on a cipher called Konton. Rather than continue to detail the history of the changes, I will just refer the reader to Description of Konton2's PRNG below. Konton 2.1 will be referred to simply as Konton2 in the rest of this document.

Konton2:

does not resemble RC4 and does not use shift registers
encrypts byte-by-byte
can encrypt with feedback from the encryption and decryption functions so that different plaintexts encrypted with the same key are encrypted with different keystreams.
can encrypt in synchronous mode
can encrypt only text and leave other bytes untouched
can use indexed encryption, where each block of data of data is stored or sent along with an index value that, along with key, is used to initialize Konton2 before each encryption and decryption.

[Table of Contents]

Description of Konton2's PRNG

In the explanation that follows:

"x mod y" means "x modulo y," and
"x rol y" means "x rotated y bits to the left."

Bit rotation is sometimes called "circular shifting."

The variables in the PRNG are:
`n`	a constant: `n = 32`
`C₀` through `C_n-1`	a series of constants: `C_i = (3,989,547,399 × 3ⁱ) mod 2³²`
`S₀` through `S_n-1`	an array of 32-bit integers: the main state data. It is initialized to an image of the encryption/decryption key.
`P`	a 32-bit integer: the highest-order 16 bits of `P` are the output of the PRNG. `P` is initialized to zero and left intact as part of the state data between calls to the PRNG.
`R`	an integer: the number of bits to rotate left. `R` is initialized to zero and left intact as part of the state data between calls to the PRNG.

The steps in one call to the PRNG are as follows:

For i = 0 to n-1:

Let S_i = (S_i + C_i) mod 2³²
If i > 0 and S_i-1 = 0 Let S_i = (S_i + C_i) mod 2³²
Let R = (R + S_i) mod 32
Let P = P rol R
Let R = (R + P) mod 32
Let P = (P + (S_i rol R)) mod 2³²

Final Step (after n iterations of steps 1 through 6):

If P = 0 and R = 0 Let R = 1

Notes:

Steps 1 and 2 cause S to assume all 2¹⁰²⁴ possible states before any state is repeated, ensuring a cycle length of at least 2¹⁰²⁴.
n is actually equal to 8 × KONCRYPTSTRENGTH, where KONCRYPTSTRENGTH is a macro defined in the source code. The standard definition of KONCRYPTSTRENGTH is 4.
The final step deals with a small, but real, problem. After many (something on the order of 2³⁷) calls to the PRNG, P and R will eventually both return to zero, their original state. From that point on, the keystream generated would (without the final step) be the same as the keystream that would have been generated had the original state of S been the state of S at the time that P and R returned to zero. In other words, after about 2³⁷ calls, one key would be equivalent to another. If keys are chosen randomly, this is not likely to become a problem. However, because the problem is easy to eliminate, it is eliminated.
Two bytes are encrypted or decrypted with each call to the PRNG. Bits 16 through 23 of P are used to encrypt or decrypt the first byte, and bits 24 through 31 are used to encrypt or decrypt the second (where bit 0 is the lowest-order bit, and bit 31 is the highest-order bit). The lowest-order 16 bits of P are not used for encryption or decryption. The combining function for encryption and decryption is simple bitwise XORing. After each pair of bytes is encrypted or decrypted, the quantity (a + (b × 256)) is added (mod 2³²) to P (where a is the first plaintext byte encrypted or decrypted and b is the second). Note that it is always the plaintext that is added to P (not the ciphertext), and that the plaintext is added to P after encryption or decryption.
When an odd number of bytes are encrypted or decrypted, the last byte is encrypted or decrypted with bits 16 through 23 of P, and the last plaintext byte is added (mod 2³²) to P.
Konton2 can encrypt and decrypt in synchronous mode, where the state data is unaffected by the plaintext. In this case the plaintext is not added to P. An advantage of this mode is the lack of forward error propagation. A disadvantage is that every plaintext encrypted with a given key is encrypted with the same keystream. This problem can be dealt with by using indexed encryption, where S is initialized with the key and with an index value up to 128 bits long. Note that there are only 2¹⁰²⁴ possible initial states of S, so a given key-index combination will be equivalent to others. Even so, 2¹⁰²⁴ virtual keys should be far more than enough.
When Konton2 is used to encrypt text and leave other bytes untouched, subtraction (rather than bitwise XORing) is used to combine the keystream with the plaintext.

The PRNG could be seen as having two blocks of state data with two separate next-state functions. The elements of S could be seen as one block, and P and R together could be seen as the other. The next-state function for S (steps 1 and 2) is completely linear, but the next-state function for P and R (steps 3 through 6) is very non-linear. Moreover, while the next state of S is a function only of its previous state, the next state of P and R is a function the previous state of S, P, and R. With each call to the PRNG, every bit of S affects the value of P, and these effects cascade in complex, chaotic ways as P affects R which affects P again.

Another way to view the PRNG would be to see S as an essentially infinite supply of unique 1024-bit integers to feed into the non-linear function which updates P and R. Each new state of P and R is a function of their previous states and the integer supplied by S. Because each 1024-bit integer is easily predictable from the previous 1024-bit integer, it remains only to analyze the non-linear function, looking for correlations between its 1061-bit input (S, P, and R) and its 37-bit output (P, and R).

If the lowest-order 16 bits of P were used for encryption, Konton2 would have weak keys. Consider what happens when every element of S is a multiple of 32. If the previous P was also a multiple of 32, and the previous R was zero, P will be a multiple of 32 again. S has 2¹⁰²⁴ possible states, which it always visits in the same order. Within this series, there are 2⁸⁶⁴ states in which every element of S is a multiple of 32. This is a very small fraction of the possible states: one in 2¹⁶⁰. However, 2²⁷ of these states occur, spaced 32 states apart, in a segment of states which is 2³² states long, beginning with the state where every element of S is zero. The special condition, where multiples of 32 follow multiples of 32, occurs too often in some parts of this segment.

The combination of modular addition and pseudo-random bit rotation is also used in other ciphers. The ciphers discussed in "Mod n Cryptanalysis, with Applications Against RC5P and M6" by Kelsey, Schneier, and Wagner are vulnerable because they expose the lower-order bits of the result of pseudo-random bit rotation and modular addition.

I have done some experiments with altered versions of this PRNG. In one experiment I reset P and R to zero before each call to the PRNG. The output appeared random. In another experiment, the algorithm was just as above except that steps 1 and 2 were eliminated. When the state of S was pseudo-random, the output appeared random. I have also experimented with smaller values of KONCRYPTSTRENGTH. The output appeared random, but I have two concerns about these "cut-down" versions: first, the avalanche effect (where a given bit of S eventually affects every bit of P) may not be complete after fewer than 32 iterations of steps 1 through 6; and second, a simpler equation is, in general, more likely to have a feasibly calculable (possibly even linear) approximation.

[Table of Contents]

Konton2's Strength

Konton2 (to the best of my knowledge) has not yet received independent analysis. As outlined above (in Description of Konton2's PRNG), the most likely attack against Konton2 seems to be an attack on the non-linear function which updates the parts of the state data used for internal feedback and output. Perhaps some distinguishing characteristic of this function can be found. Perhaps it can even be deduced from the design. Perhaps the equation can be simplified or approximated.

The paper mentioned above, "Mod n Cryptanalysis, with Applications Against RC5P and M6", describes an attack that might be fruitful, if Konton2 were slightly modified. Fortunately I noticed the issue when I determined what made the output of Konton 2.0's PRNG apparently non-random. This present version, Konton 2.1, was designed with a deeper understanding of results of pseudo-random bit rotation and modular addition.

Every stream cipher PRNG must produce a keystream which is apparently random, though apparent randomness, alone, doesn't guarantee strength. Konton2's keystream has passed all the statistical tests of randomness I've subjected it to. In particular, it has passed all the tests performed by the program created by compiling and linking KonStat2.c and Konton2.c together (see Konton2 Source Code below). For information about the tests this program performs, see The Tests in KonStat2.c below. To see sample output from this program, look at Results of Statistical Tests of Randomness on Konton2, RC4, and Snow2. Konton2 has also passed Marsaglia's and Tsang's GCD, Birthday Spacings, Gorilla, and Collision Tests. See "Some difficult-to-pass tests of randomness" by Marsaglia and Tsang.

[Table of Contents]

Konton2's Speed

Konton2 is slower than some other stream ciphers. On my computer, with its Pentium II running at 231.782 MHz, Konton2 encrypts (in synchronous mode) about 1600 kilobytes per second. Snow 2.0 (the reference implementation) encrypts about 13000 kilobytes per second.

Notice that I have revised the implementation of RC4 in KonStat2.c to make it as fast as it should be. See Results of Statistical Tests of Randomness on Konton2, RC4, and Snow2.

Konton2, nevertheless, should be more than fast enough for most applications. In addition, its small state, just 2085 bits (including a copy of the key), may give it an advantage over some new ciphers which require a few kilobytes.

[Table of Contents]

Legal Issues

If you're going to download the Konton2 source code, be aware that U.S. government regulation of cryptographic software has changed during the last twenty years and may change again. To see the relevant statutes and regulations, go to the Legal Information Institute at Cornell to look at the U.S. Code, and to the Code of Federal Regulations Online to look at the U.S. regulations. To stay up to date on cryptography regulation worldwide, look at Bert-Jaap Koops' web site. The book Cracking DES (listed below in Books) is an interesting source of information on the history of U.S. government regulation of cryptographic software.

If I understand current law correctly, Konton2 is now upatentable (at least in the United States) because a complete (though not beautifully written) specification of it was published more than one year ago.

RC4 (mentioned often in these documents) should, arguably, be called alleged RC4. RC4 is supposed to be a trade secret, but the secret leaked out some years ago. To the best of my knowledge, though, RC4 is still a registered tradename of RSA Data Security, Inc.

[Table of Contents]

Konton2 Source Code

I have implemented Konton2 in three source code files:

Konton2.h
KonTyps2.h
Konton2.c

The following two files are programs that call Konton2 functions:

TestKon2.c
KonStat2.c

There is also a configure script for GNU/Linux (or UNIX) users:

configure

I have compressed these six files into the following ZIP archive.

konton2.zip (29 KB--29497 bytes)

To help C programmers use Konton2 correctly, Konton2.h includes comments which explain and demonstrate how to call Konton2 functions.

TestKon2.c compiled and linked with Konton2.c creates a program which tests to see if Konton2 has compiled correctly.

KonStat2.c compiled and linked with Konton2.c produces a program that runs statistical tests of randomness on Konton2, RC4, and (optionally) Snow2. Alternatively, a file can be evaluated. This file should have no special format; it should simply be a sequence of random or pseudo-random bits. Invoke KonStat2 with a command line argument of "-h" to find out how to specify the file. For information about the tests performed, see The Tests in KonStat2.c below. Results of Statistical Tests of Randomness on Konton2, RC4, and Snow2 contains sample output from this program.

Before you compile KonStat2.c, you may want to read the first part of the file to make sure that the definition of EAT_THE_ENTER is appropriate for your implementation of C.

To evaluate Snow2 along with Konton2 and RC4, define the macro TEST_SNOW2 (in KonStat2.c) as 1 and compile and link snow2.c with KonStat2.c and Konton2.c.

The Snow2 cipher is documented at the Snow cipher web site, and the file snow2.c is available there (along with the necessary headers) in the ZIP archive snow2.0.zip (approximately 21 KB).

I have included a "configure" script for GNU/Linux (or UNIX) users. To build KonStat2 and TestKon2, unzip konton2.zip. This will create a new directory called Konton2. If you want to evaluate Snow2, copy the Snow2 files (snow2.c, snow2.h, and snow2tab.h) into this new Konton2 directory along with the six files already there. Then "cd" to the Konton2 directory and run the configure script by typing

./configure

When the configure script has finished, type

make

That's all there is to it. The configure script takes care of defining the TEST_SNOW2 macro in the makefile it creates, so you don't have to edit KonStat2.c to change it.

KonStat2.c is a work-in-progress. I am constantly investigating new statistical tests and trying them out on various PRNGs.

[Table of Contents]

The Tests in `KonStat2.c`

KonStat2.c runs statistical tests of randomness on Konton2, RC4, and (optionally) Snow2. Alternatively, a file can be evaluated. This file should have no special format; it should simply be a sequence of random or pseudo-random bits. Invoke KonStat2 with a command line argument of "-h" to find out how to specify the file. Results of Statistical Tests of Randomness on Konton2, RC4, and Snow2 contains sample output from this program. The source code is available above at Konton2 Source Code.

I am constantly experimenting with new tests of randomness. I don't expect that this program will ever be finished.

In the reports created by KonStat2.c:

"P" is the tail probability [i.e., a P-value less than 0.1 indicates failure at the 0.1 level of significance]
"MAD" is the mean absolute deviation of the P-values from their theoretical median of 0.5
"tidbit" is a 2-bit unsigned integer [4 possible values]
"nibble" is a 4-bit unsigned integer [16 possible values]
"byte" is an 8-bit unsigned integer [256 possible values]

In the explanation that follows:

">=" means "greater than or equal to"
"<=" means "less than or equal to"

The P-value is a percentile; it is the probability of a statistic equal to or greater than the one reported. P-values for a given test should be more-or-less evenly distributed between 0 and 1. One asterisk (*) appears to the right of a P-value less than 0.1, two to the right of a P-value less than 0.01, three to the right of a P-value less than 0.001, and four to the right of a P-value less than 0.0001. One caret (^) appears to the right of a P-value greater than 0.9, two to the right of a P-value greater than 0.99, three to the right of a P-value greater than 0.999, and four to the right of a P-value greater than 0.9999.

All the statistics calculated are chi-squared statistics. An effort was made to keep all the expected counts in each chi-squared calculation high. Expected counts in the following tests (with a minimum test sequence length of one megabyte) are never lower than 2048.

The "cumulative sum" chi-squared statistic for each test is calculated by summing the statistics for all the keys (or sequences) tested. The degrees-of-freedom for the resulting chi-squared statistic is the degrees-of-freedom for the test multiplied by the number of statistics summed.

When more than one key or sequence is tested, the Mean Absolute Deviation (MAD) of the P-values for each test is calculated. This is the mean absolute deviation of the P-values from their theoretical median of 0.5. If a large number of keys or sequences is tested, the MAD should be very close to 0.25. A value much higher than 0.25 indicates that too many P-values are far from 0.5. A value much lower than 0.25 indicates that too many P-values are close to 0.5.

This MAD has two weaknesses. First, it will look very good if all the P-values are very close to 0.25 and 0.75 (I'm not sure if this is even possible, but it is a weakness); and, second, it is difficult to calculate just how probable a given MAD is. I can give some general guidelines. If the number of sequences or keys tested is at least 256, any MADs greater than 0.30 or less than 0.20 should be viewed with suspicion. If at least 1024 keys or sequences are tested, the MADs should all be 0.24, 0.25, or 0.26.

I'd like to replace the MAD, in the future, with something better. It might present some difficulties, but perhaps some variation of the KS test could be devised.

The tests are mostly byte-oriented because the PRNGs I was most interested in testing produce their output as bytes. The tests devised by Marsaglia and Tsang, which are often used to test stream cipher PRNGs, are mostly oriented toward 32-bit words.

Bit Runs Test

A run is one or more consecutive bits with the same value (either 0 or 1). Whenever a bit has a value different from the preceding bit, a new run is begun. For example, 01001011010100 consists of eleven runs: 0, 1, 00, 1, 0, 11, 0, 1, 0, 1, and 00. In a random sequence, approximately half the runs should be one bit long, a fourth should be two bits long, an eighth should be three bits long, etc. Runs of 1 to 10 bits are counted. Runs of 11 bits and longer are lumped together. A chi-squared statistic is calculated. The degrees-of-freedom is 10.

Alternate Bit Runs Test

A run is one or more consecutive bits that alternate in value between 0 and 1 in a regular pattern. The pattern is specified by a parameter passed to the function. If the parameter is 1, the pattern is 010101... or its complement 101010...; if the parameter is 2, the pattern is 001100110011... or its complement 110011001100...; if the parameter is 3, the pattern is 000111000111... or its complement 111000111000...; and so on. Whenever the pattern is not followed, a new run is begun.

For example, if the parameter is 1, the sequence 01001011010100 consists of the following four runs: 010, 0101, 101010, and 0. If the parameter is 2, the same sequence consists of the following nine runs: 0, 1, 001, 0, 110, 1, 0, 1, and 00.

In a random sequence, approximately half the runs should be one bit long, a fourth should be two bits long, an eighth should be three bits long, etc. Runs of 1 to 10 bits are counted. Runs of 11 bits and longer are lumped together. A chi-squared statistic is calculated. The degrees-of-freedom is 10.

Block Bit Frequency Test

The sequence is divided into 256 blocks of equal length. The total number of bits of each possible value in each block is counted. In a random sequence the probability of each possible bit value is 1/2. A chi-squared statistic is calculated. The degrees-of-freedom is 256.

Bit Frequency Test

The total number of bits of each possible value is counted. In a random sequence the probability of each possible bit value is 1/2. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

Tidbit Frequency Test

The total number of tidbits of each possible value is counted. In a random sequence the probability of each possible tidbit value is 1/4. A chi-squared statistic is calculated. The degrees-of-freedom is 3.

Nibble Frequency Test

The total number of nibbles of each possible value is counted. In a random sequence the probability of each possible nibble value is 1/16. A chi-squared statistic is calculated. The degrees-of-freedom is 15.

Byte Frequency Test

The total number of bytes of each possible value is counted. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

Bit Test

One bit of each byte of the sequence is evaluated in this test. The bit is specified by a parameter passed to the function. This parameter will be a number from 0 to 7: where 0 specifies the lowest-order bit, and 7 specifies the highest-order bit. The value of the specified bit of each byte in the sequence will be either 0 or 1. In a random sequence the probability of each possible bit value is 1/2. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

Overall Bit Test

The bit test described above is performed for each of the 8 bit positions. The resulting chi-squared statistics for each of the 8 tests are totaled; the total is an overall chi-squared statistic. The degrees-of-freedom is 8.

8-Bit-Sum Test

The bits of each byte of the sequence are summed. The bit sum will be either 0, 1, 2, 3, 4, 5, 6, 7, or 8. In a random sequence, the probabilities of 0, 1, 2, 3, 4, 5, 6, 7, and 8 are 1/256, 8/256, 28/256, 56/256, 70/256, 56/256, 28/256, 8/256, and 1/256 respectively. The total number of bytes with each possible bit sum is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 8.

16-Bit-Sum Test

The sequence is divided into segments of 16 bits. The bits of each segment are summed. The bit sum (s) will fall within one of three ranges: [s<7], [s>9], or [7<=s<=9]. In a random sequence, the probabilities of [s<7] and [s>9] are both 14893/65536, and the probability of [7<=s<=9] is 35750/65536. The total number of segments with bit sums in each of the 3 ranges is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 2.

32-Bit-Sum Test

The sequence is divided into segments of 32 bits. The bits of each segment are summed. The bit sum (s) will fall within one of three ranges: [s<15], [s>17], or [15<=s<=17]. In a random sequence, the probabilities of [s<15] and [s>17] are both 1281220733/4294967296, and the probability of [15<=s<=17] is 1732525830/4294967296. The total number of segments with bit sums in each of the 3 ranges is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 2.

Odd/Even-Bit-Sum Test

The sequence is divided into 6-byte segments. For each segment, the bits in each byte are summed. The bit sums will be either even or odd. In a random sequence the probability of both even and odd bit sums is 1/2. In each segment, there are 64 possible permutations of even and odd bit sums, each with a probability of 1/64. The number of segments with each possible permutation is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 63.

64-bit Matrix Test

The sequence is divided into 64-bit segments. Eight bytes are built from these 64 bits. The first byte is built from the 1st, 9th, 17th, 25th, etc. bits in the segment; the second is built from the 2nd, 10th, 18th, 26th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

128-bit Matrix Test

The sequence is divided into 128-bit segments. Sixteen bytes are built from these 128 bits. The first byte is built from the 1st, 17th, 33th, 49th, etc. bits in the segment; the second is built from the 2nd, 18th, 34th, 50th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

256-bit Matrix Test

The sequence is divided into 256-bit segments. Thirty-two bytes are built from these 256 bits. The first byte is built from the 1st, 33th, 65th, 97th, etc. bits in the segment; the second is built from the 2nd, 34th, 66th, 98th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

512-bit Matrix Test

The sequence is divided into 512-bit segments. Sixty-four bytes are built from these 512 bits. The first byte is built from the 1st, 65th, 129th, 193th, etc. bits in the segment; the second is built from the 2nd, 66th, 130th, 194th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

1024-bit Matrix Test

The sequence is divided into 1024-bit segments. One hundred and twenty-eight bytes are built from these 1024 bits. The first byte is built from the 1st, 129th, 257th, 385th, etc. bits in the segment; the second is built from the 2nd, 130th, 258th, 386th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

2048-bit Matrix Test

The sequence is divided into 2048-bit segments. Two hundred and fifty-six bytes are built from these 2048 bits. The first byte is built from the 1st, 257th, 513th, 769th, etc. bits in the segment; the second is built from the 2nd, 258th, 514th, 770th, etc. bits, and so on. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

Bit Prediction A Test

An algorithm is used to predict the value of each bit of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/2. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

The algorithm is as follows: the numbers of zeros and ones in all the previous bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted. Otherwise the prediction is the same as for the previous bit.

The interesting part of this test is the prediction made when the numbers of zeros and ones are equal.

Bit Prediction B Test

The algorithm is as follows: the numbers of zeros and ones in the previous 9 bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted.

Bit Prediction C Test

The algorithm is as follows: the numbers of zeros and ones in the previous 17 bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted.

Bit Prediction D Test

The algorithm is as follows: the numbers of zeros and ones in the previous 33 bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted.

Bit Prediction E Test

The algorithm is as follows: the numbers of zeros and ones in the previous 65 bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted.

Byte Prediction A Test

An algorithm is used to predict the value of each byte of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/256. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

The algorithm is as follows: the next byte is predicted to be equal to all the previous bytes bitwise XORed together. The first byte of the sequence is predicted to equal zero.

Byte Prediction B Test

The algorithm is as follows: the next byte is predicted to be equal to the sum of all the previous bytes, modulo 256. The first byte of the sequence is predicted to equal zero.

Byte Prediction C Test

The algorithm is as follows: the next byte value is predicted to be zero until the first zero is found. From that point on, the next byte value is predicted to be the byte value whose last appearance was furthest back in the sequence.

Byte Prediction D Test

The algorithm is as follows: a given byte value is predicted to be followed by the same byte value it was followed by the last time it appeared in the sequence. A byte value that has not previously appeared in the sequence is predicted to be followed by the byte value of the first byte in the sequence. The first byte of the sequence is predicted to equal zero.

Byte Repetition Test

Each byte in the sequence either is or is not identical to its preceding byte. In a random sequence the probability that a byte will be identical to its preceding byte is 1/256. The total number of bytes identical to their preceding bytes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

This test is equivalent to a byte prediction test where each byte is predicted to be equal to its preceding byte. The first byte of the sequence is predicted to equal the last byte of the sequence.

Two-Byte AND Test

The sequence is divided into two-byte segments. The two bytes in each segment bitwise ANDed together. The result (d) will fall into one of 7 ranges: [d<37], [37<=d<74], [74<=d<111], [111<=d<148], [148<=d<185], [185<=d<222], or [d>=222]. In a random sequence the probabilities of these ranges are 32265/65536, 10755/65536, 5355/65536, 8985/65536, 3969/65536, 3171/65536, and 1036/65536, respectively. The total number of segments in each of these 7 ranges is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 6.

Four-Byte AND Test

The sequence is divided into four-byte segments. The four bytes in each segment bitwise ANDed together. The result (d) will fall into one of 3 ranges: [d<73], [73<=d<146], or [d>=146]. In a random sequence the probabilities of these ranges are 3993624225/4294967296, 266241615/4294967296, and 35101456/4294967296, respectively. The total number of segments in each of these 3 ranges is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 2.

Four-Byte AND-OR Test

The sequence is divided into four-byte segments. The first two bytes and the last two bytes in each segment are bitwise ANDed together. The results of these two AND operations are then bitwise ORed together. The result (d) will fall into one of 3 ranges: [d<73], [73<=d<146], or [d>=146]. In a random sequence the probabilities of these ranges are 1573112097/4294967296, 1223531631/4294967296, and 1498323568/4294967296, respectively. The total number of segments in each of these 3 ranges is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 2.

Two-Byte-Up/Down Test

The sequence is divided into two-byte segments. The second byte in each segment is either equal to, greater than, or less than the first. The total number of segments in each of these three categories is counted. In a random sequence the probability that the second byte will be equal to the first is 1/256. The probabilities that second byte will be greater than or less than the first are both 255/512. A chi-squared statistic is calculated. The degrees-of-freedom is 2.

Four-Byte-Up/Down Test

The sequence is divided into four-byte segments. If the bytes of each segment are called a, b, c, and d; then comparing these four bytes will classify the segments into 8 categories: [a<=b<=c<=d], [a>b<=c<=d], [a<=b>c<=d], [a>b>c<=d], [a<=b<=c>d], [a>b<=c>d], [a<=b>c>d], and [a>b>c>d]. In a random sequence, the probabilities that a segment will belong to each of these categories are 2862209/67108864, 8454015/67108864, 14046335/67108864, 8322945/67108864, 8454015/67108864, 13915265/67108864, 8322945/67108864, and 2731135/67108864, respectively. The total number of segments in each category is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 7.

Rectangular Distance Test

The sequence is divided into four-byte segments. For each segment, the absolute difference between the first and third bytes of the segment is added to the absolute difference between the second and fourth bytes of the segment. The result (d) will fall into one of 7 ranges: [d<64], [64<=d<128], [128<=d<192], [192<=d<256], [256<=d<320], [320<=d<384], or [d>=384]. In a random sequence the probabilities of these ranges are 6935371/67108864, 15991947/67108864, 18225611/67108864, 14683915/67108864, 7696309/67108864, 2865781/67108864, and 709930/67108864, respectively. The total number of segments in each of these 7 ranges is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 6.

Byte Collision Test

The sequence is divided into segments. The length of the segments, in bytes, is a parameter passed to the function. If any byte value occurs more than once in a segment, a collision is said to have occurred. In a random sequence, the probability that no collisions will occur in a segment is (255/256) × (254/256) × (253/256) × ... × ((257 - x)/256), where x is the segment length in bytes. The total number of segments with and without collisions is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

Offset Byte XOR Test

The sequence is divided into pairs of bytes. The offset in the sequence between the bytes in each pair is a parameter passed to the function. The two bytes in each pair are bitwise XORed together. The total number of resulting bytes of each possible value is counted. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

Mod n Test

The divisor (d) is a parameter passed to the function. For each byte of the sequence (x), (x mod d) is calculated. In a random sequence, the probability of a given remainder (r) is (q + 1)/256, if r < (256 mod d), and q/256 otherwise (where q = 256/d, rounded down). The total number of byte values that are congruent to each possible remainder (mod d) is counted. A chi-squared statistic is calculated. The degrees-of-freedom is d - 1.

[Table of Contents]

Cryptography Links

sci.crypt news group
sci.crypt news group via Google
International Association for Cryptologic Research (IACR)
ePrint IACR Cryptology Archive
The Ciphers Page of the Open Directory Project
Search for "stream cipher" with Google
Bert-Jaap Koops' web site about cryptography regulation
Prof. George Marsaglia's famous DIEHARD suite of statistical tests of randomness
"Some difficult-to-pass tests of randomness" by Marsaglia and Tsang
The source code (in C) for Marsaglia's and Tsang's difficult-to-pass tests
The NIST's statistical tests for cryptographic PRNGs
"Mod n Cryptanalysis, with Applications Against RC5P and M6" by Kelsey, Schneier, and Wagner.
Useful Stuff from Qualcomm Australia
The Snow cipher web site
The VMPC web site
Terry Ritter's web site
Counterpane's web site
The president of Counterpane is (or, at least, was) Bruce Schneier, the author of Applied Cryptography and Secrets and Lies (listed below in Books).
Bruce Schneier's Site

[Table of Contents]

Books

Electronic Frontier Foundation. 1998. Cracking DES: Secrets of Encryption Research, Wiretap Politics & Chip Design. Electronic Frontier Foundation: [Sebastopal, CA?].
This is the book that couldn't be published electronically in the USA, but could be published on paper. However, on the cover of the book, the phrase "Scan this book" appears. It has been scanned and uploaded to the internet, though the scans I saw were on a domain apparently not in the USA (see Legal Issues above).

This book shows, in detail, how to build a very expensive machine that can decrypt data encrypted with DES, without the key. The machine does this by brute force, trying every possible key until the right one is discovered. Unfortunately, DES had too small a key space (too few possible keys). DES (the Data Encryption Standard) has now been superseded by AES (the Advanced Encryption Standard).

Flannery, Sarah. 2001. In Code. Workman Publishing: New York.
This book is a fun, easy introduction to cryptography written by a very bright high school student to be read by the general public.

Gaines, Helen. 1956. Cryptanalysis. Dover Publications Inc.: New York.
This is a good introduction to classic cryptography and cryptanalysis. This happens to be the first book (for adults) about cryptography I ever read. I first read it (and did the exercises) back in high school.

Kahn, David. 1996. The Codebreakers. Scribner: New York.
This is a good history of cryptography, but it was written before the Bletchley Park stuff was declassified. On the other hand, Kahn's account of December 7, 1941 is remarkably detailed and completely fascinating.

Menezes, et al. 1997. Handbook of Applied Cryptography. CRC Press Inc.: Boca Raton, Florida.
This is extremely useful, but it's not as readable as Schneier's book below.

Schneier, Bruce. 1996. Applied Cryptography. John Wiley & Sons: New York.
This is a very readable book for those who need (or just want) to know about cryptography.

Schneier, Bruce. 2000. Secrets and Lies: Digital Security in a Networked World. John Wiley & Sons: New York.
In this book Schneier explains why cryptography alone is never enough to ensure security.

Singh, Simon. 1999. The Code Book. Fourth Estate: London.
This is a fun history of cryptography which does cover the Bletchley Park stuff. This book was written for the general public.

Stephenson, Neal. 1999. Cryptonomicon. Harper Collins: New York.
This is Stephenson's fifth novel, and it's probably his best. His third novel, Snow Crash, and his fourth novel, The Diamond Age, are also excellent. Both have been published in paperback by Bantam Spectra. Stephenson has recently completed a trilogy, the Baroque Cycle, published by Harper Collins. I found the first novel of the trilogy, Quicksilver, disappointing. The second novel, The Confusion, was better, I thought. The third, called The System of the World, is (I think) quite good.

[Table of Contents]

Konton 2.1: A Stream Cipher by J. David Sexton

Appendices

Appendices for Novices

Bit Runs Test

Alternate Bit Runs Test

Block Bit Frequency Test

Bit Frequency Test

Tidbit Frequency Test

Nibble Frequency Test

Byte Frequency Test

Bit Test

Overall Bit Test

8-Bit-Sum Test

16-Bit-Sum Test

32-Bit-Sum Test

Odd/Even-Bit-Sum Test

64-bit Matrix Test

128-bit Matrix Test

256-bit Matrix Test

512-bit Matrix Test

1024-bit Matrix Test

2048-bit Matrix Test

Bit Prediction A Test

Bit Prediction B Test

Bit Prediction C Test

Bit Prediction D Test

Bit Prediction E Test

Byte Prediction A Test

Byte Prediction B Test

Byte Prediction C Test

Byte Prediction D Test

Byte Repetition Test

Two-Byte AND Test

Four-Byte AND Test

Four-Byte AND-OR Test

Two-Byte-Up/Down Test

Four-Byte-Up/Down Test

Rectangular Distance Test

Byte Collision Test

Offset Byte XOR Test

Mod n Test

Generally Useful Links

Interesting Documents

Other Documents in this Directory (http://www.oocities.org/da5id65536/) Unrelated to Konton2