diffusion with DNA-- continuous not discrete!!! :D

spour/GENErator

GENErator

PyTorch implementation of a continuous diffusion model for DNA sequence reconstruction and score prediction. For example, given a ChIP score, it generates a DNA sequence that is likely to produce that score.
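To make "continuous" concrete, here is a minimal sketch of noising a one-hot encoded sequence with Gaussian noise, the usual forward step in continuous diffusion. It is written in plain Python rather than Torch so it runs without dependencies. The names one_hot, add_noise, and decode are hypothetical, as is the alpha_bar value, and the repository's actual encoding and noise schedule may differ.

```python
import math
import random

ALPHABET = "ACGT"  # assumed alphabet; other characters (e.g. N) encode as all zeros


def one_hot(seq):
    """Encode a DNA string as a list of 4-dimensional one-hot rows."""
    return [[1.0 if base == b else 0.0 for b in ALPHABET] for base in seq.upper()]


def add_noise(x0, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps, eps ~ N(0, I)."""
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [[signal * v + noise * rng.gauss(0.0, 1.0) for v in row] for row in x0]


def decode(x):
    """Map each continuous row back to the highest-valued base (argmax)."""
    return "".join(ALPHABET[max(range(len(ALPHABET)), key=lambda i: row[i])] for row in x)


rng = random.Random(0)
x0 = one_hot("ATCGTAA")
x_t = add_noise(x0, alpha_bar=0.9, rng=rng)  # lightly noised, still real-valued
print(decode(x0))
```

Because the noised sequence stays a real-valued array, a model can denoise it directly and only map back to discrete bases at the end.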

N.B. GRAHAM ON COMPUTE CANADA IS DOWN UNTIL JAN 7, SO THIS CODE IS NOT UP TO DATE AND IS MISSING A LOT OF STUFF THAT I CANNOT ACCESS.

Features

  • Handles variable-length DNA sequences using padding and masking (up to a maximum length)
  • Combines sequence reconstruction and score prediction objectives, which helps the model learn better
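The two features above can be sketched as follows. This is an illustration in plain Python, not the repository's code: encode_and_pad, masked_mse, combined_loss, and score_weight are hypothetical names, truncating sequences longer than maxlen is an assumption, and using mean squared error for both terms is an assumption as well.

```python
ALPHABET = "ACGT"  # assumed alphabet


def encode_and_pad(seq, maxlen):
    """One-hot encode seq, truncate or zero-pad it to maxlen, and return (rows, mask)."""
    seq = seq.upper()[:maxlen]  # assumption: over-long sequences are truncated
    rows = [[1.0 if base == b else 0.0 for b in ALPHABET] for base in seq]
    mask = [1.0] * len(rows)  # 1.0 marks a real position
    pad = maxlen - len(rows)
    rows += [[0.0] * len(ALPHABET) for _ in range(pad)]
    mask += [0.0] * pad  # 0.0 marks padding
    return rows, mask


def masked_mse(pred, target, mask):
    """Mean squared error computed over real (unmasked) positions only."""
    total = 0.0
    for p_row, t_row, m in zip(pred, target, mask):
        total += m * sum((p - t) ** 2 for p, t in zip(p_row, t_row))
    n = sum(mask) * len(ALPHABET)
    return total / n if n else 0.0


def combined_loss(pred_seq, target_seq, mask, pred_score, target_score, score_weight=1.0):
    """Reconstruction loss plus a weighted squared error on the predicted score."""
    recon = masked_mse(pred_seq, target_seq, mask)
    score = (pred_score - target_score) ** 2
    return recon + score_weight * score


rows, mask = encode_and_pad("ATCG", maxlen=6)
print(mask)
```

Multiplying by the mask keeps padded positions from contributing to the reconstruction term, so short sequences are not penalized for whatever the model outputs past their end.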

Installation

  1. Clone the repository and install the package:

    git clone https://github.com/yourusername/dna-diffusion-model.git
    cd dna-diffusion-model
    pip install -e .
  2. Format your data with one record per line, a score followed by a DNA sequence:

    1.23 ATCGTAA
    0.89 GCTAGCTGCTA

  3. Modify main.py so the data generator points at your dataset:

    data_generator = BEDFileDataGenerator(
        filepath='/path/to/your/dataset.bed',  # path to your score/sequence file
        num_sequences=15000,
        maxlen=512
    )
  4. Run the script:

    python main.py
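As a quick check on the data format from step 2, the sketch below parses score/sequence records from lines of text. It assumes the two fields are separated by whitespace, which the README does not state. parse_records is a hypothetical helper and not part of the repository, and BEDFileDataGenerator may read the file differently.

```python
def parse_records(lines):
    """Parse 'score sequence' lines into (float, str) tuples."""
    records = []
    for line in lines:
        parts = line.split()  # assumption: fields are whitespace-separated
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        score, seq = parts
        records.append((float(score), seq.upper()))
    return records


sample = ["1.23 ATCGTAA", "0.89 GCTAGCTGCTA"]
records = parse_records(sample)
print(records)
```

To check a real file, pass an open file handle instead of the sample list, for example parse_records(open('/path/to/your/dataset.bed')).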
