Home
aLib is a set of software tools for basic analysis of data from Illumina sequencers. The components can be used in conjunction or independently. We provide instructions both for users who wish to use aLib as a whole and for those who need only individual components.
First, make sure you are running a Linux computer with the following:
- C++ compiler
- Python interpreter
- R
- fastqc (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
- freeIbis (optional) (http://github.com/grenaud/freeIbis)
- biohazard (http://github.com/udo-stenzel/biohazard/)
- network-aware-bwa (http://github.com/udo-stenzel/network-aware-bwa/)
- libgab (https://github.com/grenaud/libgab)
- bamtools (https://github.com/pezmaster31/bamtools)
- Compile bamtools (https://github.com/pezmaster31/bamtools)
- In the main aLib directory, type make.
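Before building, it can help to confirm that the command-line prerequisites are reachable. The following is a minimal sketch, not part of aLib; the executable names (`g++`, `python`, `R`, `fastqc`, `make`) are assumptions and may differ on your system (for example `python3` or another C++ compiler).

```python
import shutil

# Executable names are assumptions; adjust them to your installation.
REQUIRED_TOOLS = ["g++", "python", "R", "fastqc", "make"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on the PATH.")
```

Libraries such as libgab and bamtools are not executables, so this check does not cover them.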
The first step is to configure the config.json file. This has to be done only once; afterwards, you can run aLib on any given sequencing run. Whether you want to use the individual components or use them in conjunction, the basic configuration is stored in config.json. For individual components, the default config.json can probably be used as is.
If you have successfully run make and configured the config.json file, the various components should be ready to use.
The workflow can be described as follows:
- The read directory is where your sequencer(s) write their sequencing data (basecalls and intensities in /Data/Intensities/)
- The write directory is where aLib produces the usable data
YYMMDD_SEQUENCERID_RUN-NUMBER_COMMENTS
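If the pattern above is taken to describe run folder names with underscore-separated fields (an assumption, since the text does not say so explicitly), a name can be split into its parts as in this sketch. The `RunName` type, the `parse_run_name` function, and the example name in the usage note are hypothetical and not part of aLib.

```python
from datetime import date, datetime
from typing import NamedTuple


class RunName(NamedTuple):
    run_date: date
    sequencer_id: str
    run_number: str
    comments: str


def parse_run_name(name: str) -> RunName:
    """Split a name of the form YYMMDD_SEQUENCERID_RUN-NUMBER_COMMENTS."""
    # Split at most three times so that underscores inside COMMENTS survive.
    parts = name.split("_", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected run name: {name!r}")
    yymmdd, sequencer_id, run_number, comments = parts
    run_date = datetime.strptime(yymmdd, "%y%m%d").date()
    return RunName(run_date, sequencer_id, run_number, comments)
```

For instance, `parse_run_name("130102_SEQ1_0042_test_run")` yields the date 2013-01-02, sequencer `SEQ1`, run number `0042`, and comments `test_run`.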
The main configuration file is config.json.
| Field | Meaning |
| --- | --- |
| alibdir | The base directory where aLib is installed |
| fastqcdir | Directory containing fastqc |
| illuminareaddir | The directory where the sequencer writes the sequencing data (basecalls and intensities) |
| illuminawritedir | The directory where aLib will write the processed data |
| sequencers | The ID and type of each sequencer in your sequencing center |
| runstodisplay | The number of runs to display |
| emailAddrToSend | Email address of the administrator |
| genomedirectory | Directory that contains the BWA genomic databases (see details about setup) |
| tempdirectory | Directory used by aLib to write temporary files |
| freeibispath | Path to freeIbis |
| controlindex | 7 bp index of the reads used as a phiX control spike-in |
| phixref | Path to the phiX reference |
| chimeras | For various protocols, the name of the protocol, the sequence of the adapters, and putative chimeric sequences |
| Indices | At the high level, the indexing scheme and the mapping from ID to sequence for the indices used |
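A quick sanity check of config.json can catch a missing field before a run is launched. The sketch below assumes that the fields in the table are top-level keys of the JSON object, which this page does not state; the helper names `load_config` and `missing_fields` are hypothetical.

```python
import json

# Field names taken from the table above; assumed to be top-level JSON keys.
EXPECTED_FIELDS = [
    "alibdir", "fastqcdir", "illuminareaddir", "illuminawritedir",
    "sequencers", "runstodisplay", "emailAddrToSend", "genomedirectory",
    "tempdirectory", "freeibispath", "controlindex", "phixref",
    "chimeras", "Indices",
]


def load_config(path="config.json"):
    """Read the configuration file and return it as a dictionary."""
    with open(path) as handle:
        return json.load(handle)


def missing_fields(config):
    """Return the expected field names that are absent from `config`."""
    return [field for field in EXPECTED_FIELDS if field not in config]
```

Calling `missing_fields(load_config())` from the aLib directory returns an empty list when every field is present.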
Create a web-accessible directory and copy the contents of webForm/ into it. Let the URL of this directory be http://internal.webserver.com/aLib/
The server where aLib is running should have access to the BWA genome indices. Each BWA index should be in a directory of its own, named after the build:
hg19/
and the index files should use the prefix bwa-0.4.9, as such:
hg19/bwa-0.4.9.amb hg19/bwa-0.4.9.ann ...
The genome directory should also contain a BWA index for the control genome (phiX; not crucial, but nice to have). Its directory should be named:
phiX/
It should contain the directory control/:
phiX/control/
which should hold the following files for the FASTA genome and BWA index:
phiX/control/whole_genome.fa
phiX/control/bwa-0.4.9.{amb,ann,bwt,pac,rbwt,rpac,rsa,sa}
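The layout of the control directory described above can be verified with a short script. This is a sketch under the assumption that phiX/ sits directly inside the directory given as genomedirectory; the function names are hypothetical.

```python
from pathlib import Path

BWA_PREFIX = "bwa-0.4.9"
# Extensions listed in the brace expansion above.
INDEX_EXTENSIONS = ["amb", "ann", "bwt", "pac", "rbwt", "rpac", "rsa", "sa"]


def expected_control_files(genome_dir):
    """List the files expected under <genome_dir>/phiX/control/."""
    control = Path(genome_dir) / "phiX" / "control"
    files = [control / "whole_genome.fa"]
    files += [control / f"{BWA_PREFIX}.{ext}" for ext in INDEX_EXTENSIONS]
    return files


def missing_control_files(genome_dir):
    """Return the expected control files that do not exist on disk."""
    return [p for p in expected_control_files(genome_dir) if not p.is_file()]
```

An empty result from `missing_control_files` means the FASTA file and all eight index files are in place.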
Direct users to the web form on the server described above, at http://internal.webserver.com/aLib/form.php. Ask each user to select their run and click launch.
Users fill in the form, and an email is sent each time a run is processed. Once this is done, makefiles are created in the write directory for the given run. In general, it suffices to cd to that directory, then cd to build and type:
make -k -j 8
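Here `-k` tells make to keep going after a failed target and `-j 8` allows up to eight jobs in parallel. If the build needs to be started from a script rather than by hand, the same step can be wrapped as in the sketch below; the function names and the `run_dir` argument are hypothetical, and `run_dir` stands for the run's directory inside the write directory.

```python
import subprocess
from pathlib import Path


def build_command(jobs=8):
    """Return the make invocation shown above as an argument list."""
    return ["make", "-k", "-j", str(jobs)]


def run_build(run_dir, jobs=8):
    """Run make inside <run_dir>/build and return its exit code."""
    build_dir = Path(run_dir) / "build"
    completed = subprocess.run(build_command(jobs), cwd=build_dir)
    return completed.returncode
```

A non-zero return code from `run_build` indicates that at least one target failed, which is worth checking because `-k` lets the remaining targets proceed.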