DNA Storage Simulator analyzes and simulates the error profile of Nanopore DNA. It was completed as part of an undergraduate research project at NUS, supervised by Professor Djordje Jevdjic. See our accepted poster at ISPASS '22 for a short summary, and the extended report for details.
The structure is as follows:
- `CodeReconstruction`: Forked from CodeReconstruction, with modifications to aid testing.
  - `real_data_clustered.txt`: Stores real data in the form: [original strand][\n] *****************************[\n] [copy][\n] [copy][\n] ... [copy][\n] [\n] [\n] [original strand][\n] *****************************[\n] [copy][\n] [copy][\n] ... [copy][\n] ...
  - `synth_data_clustered.txt`: Stores synthetic data in the same form as `real_data_clustered`. It can be generated via the `noisy.py` module, or DNASimulator.
  - `compare.sh`: Bash script that runs reconstruction algorithms on `real_data_clustered` and `synth_data_clustered`.
- `Scripts`: Contains utility scripts.
  - `get_ground_from_clustered.py`: Parses files of the same form as `real_data_clustered.txt` above, and generates a file `strands.txt` containing the original strands only. Run using `python get_ground_from_clustered.py`.
  - `noisy.py`: Naive simulator that takes in a `strands.txt` file, sequencing coverage, and error probabilities as input and generates noisy copies of multiple clusters in the same form as `real_data_clustered.txt`.
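
To make the clustered file format and the simulator's inputs concrete, here is a minimal sketch in Python. It is not the code in `get_ground_from_clustered.py` or `noisy.py`: the function names (`parse_clustered`, `noisy_copy`, `simulate`, `format_clusters`), the independent per-base error model, and the exact length of the separator line are all assumptions made for illustration.

```python
import random

# Assumed separator; the parser below accepts any line made only of '*'.
SEPARATOR = "*" * 29
BASES = "ACGT"


def parse_clustered(text):
    """Parse clustered text into a list of (original, copies) tuples."""
    lines = [line.strip() for line in text.splitlines()]
    clusters = []
    i = 0
    while i < len(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        # A cluster starts with a strand followed by a line of asterisks.
        if lines[i] and nxt and set(nxt) == {"*"}:
            original = lines[i]
            i += 2
            copies = []
            # Copies run until the first blank line.
            while i < len(lines) and lines[i]:
                copies.append(lines[i])
                i += 1
            clusters.append((original, copies))
        else:
            i += 1
    return clusters


def noisy_copy(strand, p_sub, p_ins, p_del, rng):
    """Apply independent per-base errors (an assumed, simplified model)."""
    out = []
    for base in strand:
        r = rng.random()
        if r < p_del:
            continue  # deletion: drop the base
        if r < p_del + p_sub:
            # substitution: replace with a different base
            out.append(rng.choice([b for b in BASES if b != base]))
        elif r < p_del + p_sub + p_ins:
            # insertion: keep the base, then add a random one
            out.append(base)
            out.append(rng.choice(BASES))
        else:
            out.append(base)
    return "".join(out)


def simulate(strands, coverage, p_sub, p_ins, p_del, seed=0):
    """Generate `coverage` noisy copies for every original strand."""
    rng = random.Random(seed)
    return [
        (s, [noisy_copy(s, p_sub, p_ins, p_del, rng) for _ in range(coverage)])
        for s in strands
    ]


def format_clusters(clusters):
    """Serialize clusters: strand, separator, copies, two blank lines."""
    return "".join(
        "\n".join([original, SEPARATOR, *copies]) + "\n\n\n"
        for original, copies in clusters
    )
```

Extracting the ground-truth strands from a parsed file is then a one-liner such as `"\n".join(original for original, _ in parse_clustered(text))`, assuming `strands.txt` holds one strand per line. Note that this sketch does not handle copies that end up empty after deletions, since a blank line would be read as the end of a cluster.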