The idea is to use Markov chain Monte Carlo (MCMC) sampling methods to decrypt messages. In particular, we use the Metropolis-Hastings algorithm to decrypt messages encrypted with a simple substitution cipher.
Consider a cipher (an injective mapping) that maps every symbol to a different symbol from the same set (the set of 53 symbols is given in symbols.txt).
First, we model English letters and punctuation as a first-order Markov chain and learn the transition probabilities between every pair of symbols. Then we use Metropolis-Hastings to find an optimal permutation of symbols.
A Markov chain is a way to model sequential data in which the current observation depends on previous observations. A first-order Markov chain assumes that the current observation is independent of all previous observations apart from the most recent one. That is, the joint distribution for a sequence of $N$ observations is given by

$$p(x_1, \dots, x_N) = p(x_1) \prod_{n=2}^{N} p(x_n \mid x_{n-1})$$
The transition probabilities represent the probability of one symbol being followed by another. They are learnt from a large English text (war-and-peace.txt) and illustrated as a Hinton diagram.
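A minimal sketch of how the transition matrix might be estimated (the file names match those above; the exact format of symbols.txt and the add-one smoothing are assumptions):

```python
import numpy as np

# Assumes symbols.txt contains the 53 symbols and war-and-peace.txt the training corpus.
symbols = [s for s in open("symbols.txt").read() if s != "\n"]
index = {s: i for i, s in enumerate(symbols)}
n = len(symbols)

# Count how often each symbol is followed by each other symbol.
counts = np.ones((n, n))  # add-one smoothing (an assumption, to avoid log(0) later)
text = open("war-and-peace.txt").read()
for a, b in zip(text, text[1:]):
    if a in index and b in index:
        counts[index[a], index[b]] += 1

# Normalise each row to obtain transition probabilities p(next symbol | current symbol).
transitions = counts / counts.sum(axis=1, keepdims=True)
```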
The purpose of MCMC sampling techniques is to approximate distributions by drawing samples in cases when exact inference is intractable. The Metropolis-Hastings algorithm is defined as follows:
At each step, starting from the current state $x$:

- Propose a new state $x'$ using a proposal distribution $q(x' \mid x)$ (a simpler distribution that is easy to sample from).
- Accept the proposed state with probability

  $$A(x', x) = \min\left(1, \frac{p(x')\, q(x \mid x')}{p(x)\, q(x' \mid x)}\right)$$

- Otherwise, keep the old state.
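A minimal sketch of one such step, specialised to a symmetric proposal (the case used below, where the $q$ terms cancel):

```python
import math
import random

def metropolis_hastings_step(x, propose, log_prob):
    """One Metropolis-Hastings step with a symmetric proposal:
    accept x' with probability min(1, p(x') / p(x))."""
    x_new = propose(x)
    # Work in log space: accept if log(u) < log p(x') - log p(x), u ~ Uniform(0, 1).
    if math.log(random.random()) < log_prob(x_new) - log_prob(x):
        return x_new  # accept the proposed state
    return x          # otherwise keep the old state
```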
In our case, the state variable is the symbol permutation, and we assume a uniform prior over the permutations.
The potential new state is obtained by randomly choosing two of the 53 symbols and swapping their corresponding encrypted symbols. Hence, the proposal probability is given by

$$q(x' \mid x) = \frac{1}{\binom{53}{2}}$$

for every permutation $x'$ that differs from $x$ by a single swap (and zero otherwise); in particular, the proposal is symmetric.
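A sketch of this proposal, assuming the current state is stored as a list `permutation` mapping ciphertext symbol index to plaintext symbol index:

```python
import random

def propose_swap(permutation):
    """Propose a new state by swapping the images of two randomly
    chosen symbols; every such swap is equally likely, so the
    proposal is symmetric."""
    new = list(permutation)
    i, j = random.sample(range(len(new)), 2)
    new[i], new[j] = new[j], new[i]
    return new
```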
Since this proposal is symmetric, the $q$ terms cancel and the acceptance probability reduces to

$$A(x', x) = \min\left(1, \frac{p(x')}{p(x)}\right)$$

where $p(x)$ is the probability of the text decrypted with permutation $x$ under the Markov chain model.
In the implementation, we work with log-probabilities, replacing division with subtraction, and compare the log acceptance ratio with the log of a sample drawn from a uniform distribution to determine whether the new state is accepted or rejected.
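Combining the pieces, the core loop might look like the sketch below; `transitions`, `index`, `symbols`, `n`, and `propose_swap` come from the earlier sketches, while the choice of input file and the iteration count are assumptions:

```python
import math
import random
import numpy as np

log_transitions = np.log(transitions)          # log-probabilities from the learnt model
ciphertext = open("message_small.txt").read()  # one of the encrypted messages
n_iters = 20_000                               # an assumed iteration budget

def log_likelihood(permutation):
    """Log-probability of the decryption implied by `permutation`
    under the first-order Markov chain model."""
    decoded = [permutation[index[c]] for c in ciphertext if c in index]
    return sum(log_transitions[a, b] for a, b in zip(decoded, decoded[1:]))

state = list(range(n))  # start from the identity permutation
current = log_likelihood(state)
for _ in range(n_iters):
    proposal = propose_swap(state)
    proposed = log_likelihood(proposal)
    # Accept if log(u) < log p(x') - log p(x), with u ~ Uniform(0, 1).
    if math.log(random.random()) < proposed - current:
        state, current = proposal, proposed

# Decode the message with the best permutation found.
plaintext = "".join(symbols[state[index[c]]] if c in index else c for c in ciphertext)
```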
All of the code, together with results for two different encrypted texts, is given in the Jupyter notebook MCMC.ipynb. The encrypted messages can be found in message_large.txt and message_small.txt.
- Christopher M. Bishop (2006). Pattern Recognition and Machine Learning. Springer-Verlag.
- P. Diaconis (2009). The Markov chain Monte Carlo revolution. Bulletin of the American Mathematical Society 46, 179–205.
- Probabilistic and Unsupervised Learning / Approximate Inference and Learning in Probabilistic Models, courses at the Gatsby Computational Neuroscience Unit, UCL.