fNN is a command-line tool for predicting the secondary structure of a protein from its primary sequence, a multiple sequence alignment (in FASTA format), or a position-specific scoring matrix (PSSM).
For the tool to work properly, the following packages are required:
• Python 3.6 : https://anaconda.org/anaconda/python
• Keras : https://anaconda.org/conda-forge/keras
• Scikit-learn : https://anaconda.org/anaconda/scikit-learn
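As a quick sanity check before running the tool, the short script below reports whether the packages listed above can be found by the current Python interpreter. It is only a sketch and not part of fNN: the helper name missing_modules is made up for this example, and it assumes the usual import names keras and sklearn.

```python
import importlib.util
import sys

# Import names for the packages listed above.
# Assumption: scikit-learn is imported under its usual name "sklearn".
REQUIRED_MODULES = ["keras", "sklearn"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Print the interpreter version so it can be compared with the one listed above.
    print("Python version:", sys.version.split()[0])
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If a package is reported as missing, install it from the corresponding link above and run the check again.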
To download fNN, clone the repository and change into its directory:
git clone https://github.com/SBNoor/fNN.git
cd fNN
Once the above process has completed, you can run
python tool.py -h
to list all of the command-line options. If this command fails, something went wrong during installation.
In this section you will predict the secondary structure of a protein from its primary sequence. We assume you have already installed Python 3.6, Keras, and Scikit-learn, and downloaded fNN as described above.
The following command will allow you to predict the secondary structure of a protein:
python tool.py <file_name>.fasta
Prediction takes approximately 12 minutes when the input is a single FASTA sequence and approximately 27 minutes when the input is a multiple sequence alignment. Most of this time is spent training the neural network.
You can also choose which neural network processes your data by using one of the following flags:
-j JNN
This flag runs the neural network described by Qian and Sejnowski, a simple feed-forward network with one hidden layer that takes a single sequence as input. This network runs by default when you provide a single FASTA sequence, even without the flag. The command is:
python tool.py -j JNN <file_name>.fasta
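To illustrate what "a single sequence as input" can look like for a feed-forward network of this kind, the sketch below encodes each residue together with its neighbours as a one-hot vector over a sliding window. This is not taken from the fNN source: the window size of 13, the 21-symbol alphabet (20 amino acids plus a spacer), and the function names are assumptions chosen for the example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SPACER = "-"  # padding symbol for window positions beyond the sequence ends
ALPHABET = AMINO_ACIDS + SPACER


def one_hot(residue):
    """Return a 21-element one-hot list; unknown symbols map to the spacer."""
    index = ALPHABET.find(residue)
    if index == -1:
        index = ALPHABET.index(SPACER)
    vector = [0] * len(ALPHABET)
    vector[index] = 1
    return vector


def encode_windows(sequence, window=13):
    """Return one flat input vector per residue, centred on that residue.

    Assumption: an odd window so that each residue has the same number
    of neighbours on both sides.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    padded = SPACER * half + sequence.upper() + SPACER * half
    encoded = []
    for center in range(len(sequence)):
        row = []
        for residue in padded[center:center + window]:
            row.extend(one_hot(residue))
        encoded.append(row)
    return encoded
```

Each row has window × 21 entries, and the network would predict the structure class of the residue at the centre of the window.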
-js MSA
This flag runs a standard feed-forward neural network that takes a single sequence as input; that sequence is derived from a multiple sequence alignment by majority voting. This network is used by default when a multiple sequence alignment is provided without a flag. The command is:
python tool.py -js MSA <file_name>.fasta
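The majority-voting step mentioned above can be sketched as follows: for each alignment column, the most frequent symbol is kept. This is a minimal illustration rather than fNN's own code; the function name is hypothetical, and treating gaps like any other symbol and breaking ties in favour of the earliest sequence are assumptions.

```python
from collections import Counter


def majority_vote(aligned_sequences):
    """Collapse equally long aligned sequences into one consensus sequence."""
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("aligned sequences must all have the same length")

    consensus = []
    for column in zip(*aligned_sequences):
        # most_common(1) returns the first-seen symbol when counts are tied,
        # so the earliest sequence in the alignment wins a tie.
        symbol, _count = Counter(column).most_common(1)[0]
        consensus.append(symbol)
    return "".join(consensus)
```

For example, majority_vote(["ACDE", "ACDF", "GCDF"]) keeps A in the first column (two votes against one) and F in the last.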
-m mNN
This flag predicts the secondary structure using a network similar to the one described by Rost and Sander. It is a cascaded neural network in which the first network is a sequence-to-structure network and the second is a structure-to-structure network. The command is:
python tool.py -m mNN <file_name>.fasta
-s sNN
This flag runs a convolutional neural network based on the approach of Liu and Cheng. This network runs by default when the input is a PSSM. The command is:
python tool.py -s sNN <file_name>.pssm
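The default choices described above (JNN for a single sequence, MSA for a multiple sequence alignment, sNN for a PSSM) can be summarised in a small helper. How fNN itself recognises the input type is not documented here, so the detection rule below (the .pssm extension and the number of ">" header lines) and the function names are assumptions made purely for illustration.

```python
def count_fasta_records(text):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in text.splitlines() if line.startswith(">"))


def default_network(file_name, text):
    """Guess which network would be used by default for the given input.

    Assumption: a PSSM is recognised by its file extension, and a FASTA
    file with more than one record is a multiple sequence alignment.
    """
    if file_name.lower().endswith(".pssm"):
        return "sNN"
    if count_fasta_records(text) > 1:
        return "MSA"
    return "JNN"
```

For instance, default_network("protein.fasta", ">seq1\nMKV\n") returns "JNN", while the same call with two records in the text returns "MSA".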
You can also have the prediction written to a text file using the -o flag:
python tool.py -o <file_name>.fasta