The idea is to use Markov chain Monte Carlo (MCMC) sampling methods to decrypt messages. In particular, we use the Metropolis-Hastings algorithm to decrypt messages encrypted with a simple substitution cipher.
Consider a cipher (an injective mapping) that maps every symbol to a different symbol from the same set (the set of 53 symbols is given in symbols.txt).
First, we model English letters and punctuation as a first-order Markov chain and learn the transition probabilities between every pair of symbols. Then we use Metropolis-Hastings to find an optimal permutation of symbols.
A Markov chain is a way to model sequential data in which the current observation depends on previous observations. A first-order Markov chain assumes that the current observation is independent of all previous observations apart from the most recent one. That is, the joint distribution for a sequence of $N$ observations is given by

$$p(x_1, \dots, x_N) = p(x_1) \prod_{n=2}^{N} p(x_n \mid x_{n-1})$$
The transition probabilities represent the probability of one symbol being followed by another. They are learnt from a large English text (war-and-peace.txt) and illustrated as a Hinton diagram.
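A minimal sketch of how the transition matrix might be estimated (the file names match those above; the exact format of symbols.txt and the add-one smoothing are assumptions):

```python
import numpy as np

# Assumes symbols.txt contains the 53 symbols and war-and-peace.txt the training corpus.
symbols = [s for s in open("symbols.txt").read() if s != "\n"]
index = {s: i for i, s in enumerate(symbols)}
n = len(symbols)

# Count how often each symbol is followed by each other symbol.
counts = np.ones((n, n))  # add-one smoothing (an assumption, to avoid log(0) later)
text = open("war-and-peace.txt").read()
for a, b in zip(text, text[1:]):
    if a in index and b in index:
        counts[index[a], index[b]] += 1

# Normalise each row to obtain transition probabilities p(next symbol | current symbol).
transitions = counts / counts.sum(axis=1, keepdims=True)
```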
The purpose of MCMC sampling techniques is to approximate distributions by drawing samples in cases when exact inference is intractable. The Metropolis-Hastings algorithm is defined as follows:
At each step, starting from the current state $x$:

- Propose a new state $x'$ using a proposal distribution $q(x' \mid x)$ (a simpler distribution that is easy to sample from).
- Accept the proposed state with probability

  $$A(x', x) = \min\left(1, \frac{p(x')\, q(x \mid x')}{p(x)\, q(x' \mid x)}\right)$$

- Otherwise, keep the old state.
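A minimal sketch of one such step, specialised to a symmetric proposal (the case used below, where the $q$ terms cancel):

```python
import math
import random

def metropolis_hastings_step(x, propose, log_prob):
    """One Metropolis-Hastings step with a symmetric proposal:
    accept x' with probability min(1, p(x') / p(x))."""
    x_new = propose(x)
    # Work in log space: accept if log(u) < log p(x') - log p(x), u ~ Uniform(0, 1).
    if math.log(random.random()) < log_prob(x_new) - log_prob(x):
        return x_new  # accept the proposed state
    return x          # otherwise keep the old state
```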
In our case, the state variable is the symbol permutation, and we assume a uniform prior over the permutations.
The potential new state is obtained by randomly choosing two of the 53 symbols and swapping their corresponding encrypted symbols. Hence, the proposal probability is given by

$$q(x' \mid x) = \frac{1}{\binom{53}{2}}$$

for every permutation $x'$ that differs from $x$ by a single swap (and zero otherwise); in particular, the proposal is symmetric.
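A sketch of this proposal, assuming the current state is stored as a list `permutation` mapping ciphertext symbol index to plaintext symbol index:

```python
import random

def propose_swap(permutation):
    """Propose a new state by swapping the images of two randomly
    chosen symbols; every such swap is equally likely, so the
    proposal is symmetric."""
    new = list(permutation)
    i, j = random.sample(range(len(new)), 2)
    new[i], new[j] = new[j], new[i]
    return new
```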
Since this proposal is symmetric, the $q$ terms cancel and the acceptance probability reduces to

$$A(x', x) = \min\left(1, \frac{p(x')}{p(x)}\right)$$

where $p(x)$ is the probability of the text decrypted with permutation $x$ under the Markov chain model.
In the implementation, we work with log-probabilities, replacing division with subtraction, and compare the log acceptance ratio with the log of a sample drawn from a uniform distribution to determine whether the new state is accepted or rejected.
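Combining the pieces, the core loop might look like the sketch below; `transitions`, `index`, `symbols`, `n`, and `propose_swap` come from the earlier sketches, while the choice of input file and the iteration count are assumptions:

```python
import math
import random
import numpy as np

log_transitions = np.log(transitions)          # log-probabilities from the learnt model
ciphertext = open("message_small.txt").read()  # one of the encrypted messages
n_iters = 20_000                               # an assumed iteration budget

def log_likelihood(permutation):
    """Log-probability of the decryption implied by `permutation`
    under the first-order Markov chain model."""
    decoded = [permutation[index[c]] for c in ciphertext if c in index]
    return sum(log_transitions[a, b] for a, b in zip(decoded, decoded[1:]))

state = list(range(n))  # start from the identity permutation
current = log_likelihood(state)
for _ in range(n_iters):
    proposal = propose_swap(state)
    proposed = log_likelihood(proposal)
    # Accept if log(u) < log p(x') - log p(x), with u ~ Uniform(0, 1).
    if math.log(random.random()) < proposed - current:
        state, current = proposal, proposed

# Decode the message with the best permutation found.
plaintext = "".join(symbols[state[index[c]]] if c in index else c for c in ciphertext)
```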
All of the code, together with results for two different encrypted texts, is given in the Jupyter notebook MCMC.ipynb. The encrypted messages can be found in message_large.txt and message_small.txt.
- Christopher M. Bishop (2006). Pattern Recognition and Machine Learning. Springer-Verlag.
- P. Diaconis (2009). The Markov chain Monte Carlo revolution. Bulletin of the American Mathematical Society 46, 179–205.
- Probabilistic and Unsupervised Learning / Approximate Inference and Learning in Probabilistic Models, courses at the Gatsby Computational Neuroscience Unit, UCL.