AI Used to Decrypt Medieval Ciphers
Researchers are using machine learning algorithms to decrypt historical pencil-and-paper ciphers.
Will be fascinating to see how much progress the DESCRYPT project researchers make in decoding the estimated 1% of all archival material that remains fully or partially encrypted.
For at least one document they used the online platform Transkribus, which appears to host 300+ community AI models, to reveal its message; it had only been partly encrypted. They were also able to develop their own AI tool to work through some of the obscurities of their materials. Looks like a really neat project.
Browsing through DESCRYPT’s resources, also came across a nice mention of our host here:
Mexaly • June 3, 2026 12:37 PM
Please aim AI at the Voynich manuscript.
lurker • June 3, 2026 1:22 PM
@Mexaly
P’raps they have, and p’raps the AI is still shrugging,
which is not an answer they would want us to know …
NobodySpecial • June 3, 2026 3:58 PM
O. Henry wrote a short story – Calloway’s Code – in the early 1900s. A wonderful example of something that AI would probably come up with, but I wonder if it could decode it?
Briefly, a reporter evades censorship rules by sending the first part of a common phrase and leaving off the last part. Knowing the phrases would give you the real message. Knowing the statistically-likely ending to the phrase would give it too.
But would an AI calculate that “cast of” should end in “thousands”, or that “3 to 4 cast of people” really meant “3 to 4 thousand people”?
Statistics are wonderful tools, but context is key.
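To make the point concrete, here is a minimal sketch of the naive lookup approach, assuming the decoder already holds a table of stock phrases. The table entries and the `decode` helper are hypothetical illustrations, not taken from the story or from any real tool.

```python
# Hypothetical table of stock phrases: opening words -> omitted ending.
# These entries are illustrative assumptions only.
STOCK_PHRASES = {
    "cast of": "thousands",
    "foregone": "conclusion",
    "brute": "force",
}


def decode(message, phrases=STOCK_PHRASES):
    """Replace each known phrase opening with the ending it implies."""
    words = message.lower().split()
    longest = max(len(key.split()) for key in phrases)
    out = []
    i = 0
    while i < len(words):
        # Try the longest openings first so multi-word keys are not split.
        for n in range(longest, 0, -1):
            key = " ".join(words[i:i + n])
            if i + n <= len(words) and key in phrases:
                out.append(phrases[key])
                i += n
                break
        else:
            # No phrase starts here, so the word passes through unchanged.
            out.append(words[i])
            i += 1
    return " ".join(out)


print(decode("3 to 4 cast of people"))
```

The lookup yields "3 to 4 thousands people". The table supplies the likely ending, but turning "thousands" into "thousand" still takes an understanding of the sentence around it.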
Clive Robinson • June 3, 2026 7:14 PM
@ NobodySpecial, ALL,
With regards,
“… evades censorship rules by sending the first part of a common phrase and leaving off the last part. Knowing the phrases would give you the real message. Knowing the statistically-likely ending to the phrase would give it too.”
If you think about it, replacing “phrase” with “word” more or less gives you a description of “Cockney Rhyming Slang”.
Which is a “cant”[1] or “verbal code” used by what are viewed as “undesirables” by “authorities”.
But although used as part of communications it is clearly “a code” used to hide individual meanings within a common language, rather than “a language” designed to hide all the communication or message. The difference can be seen if you look up the WWI/II “Code Talkers”[2].
[1] The term of art “cant” is not well known and is often misheard as “chant”. More detail on it can be found at,
https://en.wikipedia.org/wiki/Cant_(language)
[2] The idea behind the use of Code Talkers is that it has a “speed advantage” and does not involve Codes or Ciphers that are slow and unsuitable for front line tactical or battlefield use. However the idea “blossomed out” as its strengths became more apparent.
Due to some very daft recent name change nonsense the URL has changed,
Rontea • June 7, 2026 9:38 AM
Fascinating read. The intersection of AI and historical cryptography is a perfect example of technology enabling new insights without replacing the human expertise that makes sense of the discoveries. The Borg cipher’s 400-year-old secrets surfacing through meticulous machine learning work highlight both the promise and the challenge: automation accelerates the legwork, but validation and context still rely on skilled analysts. The notion of eventually combining transcription and decryption in one step could be a game-changer for unlocking the roughly 1% of archival material that’s still coded. Exciting to see methods emerging that scale without losing rigor.
Clive Robinson • June 3, 2026 9:30 AM
@ ALL,
Put overly simply, all codes are about,
“Hiding or flattening the statistics of plaintext in a reversible way.”
Until last century this was mostly done by chosen individuals by hand.
This had certain implications that would make the hiding/flattening somewhat simple.
One such would be “short key phrases”.
Current AI LLM and ML systems are basically Statistical devices with a little random fuzzing thrown in.
It does not take much imagination to see how AI could invert the statistics from the ciphertext by simply finding them, thus in effect multiplying them back out to the plaintext.
However that said it does not of necessity make the task any easier.
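As a toy illustration of “finding the statistics”, the sketch below breaks a Caesar shift, about the simplest hand cipher there is, by scoring every candidate key against approximate English letter frequencies with a chi-squared statistic. A Caesar shift only relabels the plaintext statistics rather than flattening them, so this is the easiest possible case. The frequency table is approximate, the helper names are hypothetical, and none of this is the method the DESCRYPT researchers use.

```python
from collections import Counter
import string

# Approximate relative frequencies (percent) of letters in English text.
ENGLISH_FREQ = {
    "a": 8.17, "b": 1.49, "c": 2.78, "d": 4.25, "e": 12.70, "f": 2.23,
    "g": 2.02, "h": 6.09, "i": 6.97, "j": 0.15, "k": 0.77, "l": 4.03,
    "m": 2.41, "n": 6.75, "o": 7.51, "p": 1.93, "q": 0.10, "r": 5.99,
    "s": 6.33, "t": 9.06, "u": 2.76, "v": 0.98, "w": 2.36, "x": 0.15,
    "y": 1.97, "z": 0.07,
}


def shift_text(text, k):
    """Shift lowercase letters by k places; leave everything else alone."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + k) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)


def chi_squared(text):
    """Score how far the letter counts in text are from English; lower is closer."""
    letters = [ch for ch in text if ch in string.ascii_lowercase]
    counts = Counter(letters)
    total = len(letters)
    score = 0.0
    for ch, pct in ENGLISH_FREQ.items():
        expected = total * pct / 100
        if expected > 0:
            score += (counts[ch] - expected) ** 2 / expected
    return score


def crack_caesar(ciphertext):
    """Return (key, plaintext) for the shift whose result looks most like English."""
    key = min(range(26), key=lambda k: chi_squared(shift_text(ciphertext, -k)))
    return key, shift_text(ciphertext, -key)


plain = ("researchers are using machine learning algorithms "
         "to decrypt historical pencil and paper ciphers")
cipher = shift_text(plain, 3)
key, recovered = crack_caesar(cipher)
print(key, recovered)
```

With only 26 possible keys the statistics do all the work. A cipher that genuinely flattens the letter frequencies leaves this kind of scoring far less to hold on to, which fits the caveat that such tools do not necessarily make the task easier.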