For those who enjoy or survive that finely dated blend of overly written prose, crude humor, and dense theoretical topics which constitutes Neal Stephenson's Cryptonomicon (1999), there comes a moment near the end where the reader is presented with an explanation of a unique encryption algorithm by the author's stand-in character.
Such an algorithm is presented as remarkably robust, ingenious even, in the universe of the book. But what about in real life?
If you want the short answer, it is in fact rather biased but not in a way that makes it immediately breakable.
An appendix can be found in most editions by the technologist who actually concocted the "Pontifex" algorithm, Bruce Schneier, explaining how it works, himself calling it "Solitaire". One can find this appendix freely distributed on his website, including most of the information I present below.
I merely wish to try to present it in even more succinct and clear terms than the various authors who have participated in this dialogue, more as a way of introduction to someone at least superficially curious about this algorithm.
Pontifex - Solitaire Algorithm
Cryptographic Pre-Requisites
First some basics.
Cryptography is concerned with the transmission of information in a way that only the intended parties can discern its meaning.
If my plaintext message "RUN" is unencrypted and I send it in a transmission, anyone who intercepts my message and knows its intended recipient can discern its intent.
One method of cryptography to prevent this is encryption. The message itself will still be transmitted through the same potentially interceptable channel but the plaintext will be converted into a ciphertext message: "NUR".
The encryption algorithm is the set of instructions by which plaintext is converted into ciphertext by the sender. The decryption algorithm is the set of instructions by which the received ciphertext is converted into plaintext by the intended recipient. The algorithms are paired with a key or keys to convert it properly.
The fewer discernible patterns there are to the ciphertext, the more resistant it can be to being deciphered or its algorithm broken by a malicious actor. Randomness, or the appearance of randomness, are key here. Something both humans and machines are bad at, in a pure mathematical sense.
Long since Turing, many encryption algorithms are digitally calculated.
The Solitaire algorithm is distinct in that it is intentionally analog in its nature. "Solitaire may be low-tech, but its security is intended to be high-tech."
If an agent in the field has no digital devices but does have a deck of card, they can participate in communication with other parties. Of course, this is significantly slower, and that must be taken into account.
Solitaire is analogous to an output-feedback mode stream cipher. It is of course not literally an OFB cipher as it does not use a block cipher.
Let us consider each of these individually.
A stream cipher uses a keystream sequence to encrypt the plaintext data character-by-character rather than in blocks, converting it into ciphertext.
Output-feedback mode (OFB) is a mode of operation for block ciphers that generates a keystream independent of the plaintext. Solitaire does not use a block cipher, but its structure is similar in that it evolves internal state to produce a keystream.
OFB creates its own independent keystream; nothing about the keystream can be surmised from the plaintext or vice versa.
Furthermore, if the initialization value is unique and unpredictable, it makes keystream reuse attacks more difficult.
OFB has largely been replaced by CTR in more recent years, but this analogy is useful for understanding the structure of Solitaire.
Solitaire Algorithm
Let us go into the execution of Solitaire Algorithm. First we will define our conventions, then enumerate how (1) the initialization vector is formed by keying the deck, (2) the keystream is generated given the IV, (3) how encryption and decryption occurs given the keystream.
Important Conventions
- Letters are mapped to numbers: A = 1, ..., Z = 26
- Modulo 26 arithmetic:
- If x + y > 26, subtract 26
- If x − y < 1, add 26
- The keystream must be the same length as the plaintext
The Deck
Solitaire generates its keystream using a 54-card deck.
It follows these rules:
- Two cards must be uniquely identifiable jokers: Joker A and Joker B
- Card values use bridge order:
- Clubs → Diamonds → Hearts → Spades
- When converting cards to numerical values, count sequentially through the bridge-ordereds suits.
- When moving jokers:
- If a joker is at the bottom, it wraps around to the second position from the top
1. Initialization Vector - Keying the Deck
As an OFB stream-mode cipher, one of our first steps is to identify a robust, secure initialization vector.
Before beginning, one must "key" the deck. That is arrange it in a precise order. Here are three suggested methods to key the deck:
- Random shuffle: Shuffle one deck then use another deck to make an identical copy. The sender and receiver each have one identical, ordered deck.
- Bridger ordered: A fully bridge-ordered deck corresponds to approximately a 95-bit key. The challenge is securely communicating what this ordering is.
- Passphrase-based keyring: With the deck in order from lowest to highest card followed by Joker A then Joker B. Perform the Solitaire operation but do Step 4 twice, using the first character of the passphrase as the cut number. Repeat for each character of the key. For security, this passphrase must be 80 characters at a minimum.
Once the deck is keyed, now a keystream can be generated.
2. Generating the keystream
To generate each keystream value, the deck is mutated according to the following procedure. These steps produce one keystream character.
- Find Joker A and move it one card down
- Find Joker B and move it two cards down
- Perform a triple cut:
- Swap the cards above the uppermost joker with the cards below the lowermost joker
- Perform a count cut:
- Convert the bottom card to a number N
- Cut the top N cards and insert them just above the bottom card
- Determine the output card:
- Convert the top card to a number
- Count down that many cards from the top
- The card immediately following is the output card
- If the output card is a joker, discard it and repeat the process
- Convert the output card to a number between 1 and 26
This number is one keystream value. Repeat this process to generate more keystream values.
3. Encrypt and Decrypt
Encryption and decryption are identical operations except for the arithmetic used.
- Generate the same number of keystream values as plaintext characters
- Convert plaintext or ciphertext letters to numbers
- Combine values using modulo 26 arithmetic:
- Encryption: add the keystream value to your "plaintext number".
- Decryption: subtract the keystream value to your "ciphertext number".
- Convert the resulting numbers back to letters
An Assessment of Solitaire Algorithm
At the outset, this algorithm is admittably extremely slow to calculate for each value and extremely brittle, as any misplaced or miscalculated card could easily destroy the whole cipher and render the whole algorithm useless until the sender and recipient can resync their decks.
But what about its mathematical strengths and weaknesses?
Paul Crowley was one early respondent to the posed algorithm, posting on his personal site about his findings.
Upon implementing it in "C", Crowley found that the Cryptographically secure pseudorandom number generator (CSPRNG) is not reversible, even though Scheier claims it is. CSRPNG is critical for the randomness of the key, to harden it against reverse engineering.
If there is apparent bias in the way "random" numbers are generated, this creates patterns, and patterns are exposure to being deciphered.
It is is not reversible because when a joker is moved to the top from the bottom, this cannot be undone. More than that, the CPRNG is biased.
The output of each step of CPRNG would be between 0 and 25 and thus there is a 1/26 in genuinely random systems for the output to be the same across two steps. However upon simulating 10^7 samples, it appears this is in fact closer to 1/22.5.
His hypothesis is that this bias toward repeated output occurs when the probability of the output letter is also very high (34%). He believes this is in large part because many cards tend to stay in the same position across rounds.
All in all, he calculates the number of concidences to be 159 standard deviations from the mean which is rather high and indicates there is in fact a structural weakness in the algorithm that exposes the keystream to breaking, and from there ciphertext or other attacks.
A more formal mathematical analysis of the algorithm was published by Boris Pogorelov and Marina Pudovkina in 2003.
Rather than considering it statistically, they review it algebraically through mathematical proofs.
In particular, they demonstrate that the CSRPNG is in fact formally non-reversible through Corollaries 11 and 12. They are rather dense so I will refer you to the link, as this webpage is not suited for mathematical notation.
This causes entropy (randomness) loss and the long term accummulation of bias in the keystream.
The study does not extend the same formal analysis to measuring output bias but it does point to a critical flaw in the keystream operation which causes the randomness to degrade faster than a secure algorithm should.
A third paper was published by Daniel Shiu in 2019 which provides a more complete and cryptographically-focused analysis of the Solitaire algorithm as a whole.
He uses Shannon's entropy function to confirm the bias of consecutive keystream values is in fact 1/22.5 and in turn means there is an entropy loss of ~0.0005 bits per character.
Shiu provides two attack scenarios to illustrate the algorithm's exposure:
- Short messages: If a short message is repeated (e.g. "RUN AWAY NOW"), it is possible to statistically recover the plaintext
- Long messages: Longer messages (10,000+ characters) can also have statistically recognizable patterns
Of course it is possible to steer users away from short messages and the painful analogue operation will naturally curtail the use of long messages as well.
But the crux of the similarity emerges when the top card repeats, card operations restore large contiguous values, or the extraction depends on two cards.
Taken altogether, the Pontifex / Solitaire algorithm is not immediately breakable or have catastrophic loopholes that expose one's plaintext, but it does possess a notable structural flaw which will cause a steady leakage of information due to the bias in the keystream generation. This certainly fails modern, digital cryptographic standards but it is still an entertaining puzzle to consider in a fully analogue world.
And of course would it not be fun if groundbreaking scientific or mathematical insights were to be genuinely published for the first time in the heart of a 900 page novel. Perhaps next time it will.
