PatternBinaryInteractionFramework
The BioPAX pattern framework implements several pre-defined binary interaction patterns between proteins and small molecules, and facilitates their search in BioPAX models. This document shows how to search a model for binary interactions, and how to process the output.
We defined 14 kinds of binary interactions. The types are listed in the enum SIFEnum. The first 8 interaction types are defined between proteins, the next 4 between a protein and a small molecule, and the last 2 between small molecules.
We chose the names of the interaction types so that the text output of an interaction reads like a sentence. For instance:
MDM2 controls-state-change-of TP53
There are many ways to get a Model using Paxtools. The code below reads a Model from an OWL file.
SimpleIOHandler io = new SimpleIOHandler();
Model model = io.convertFromOWL(new FileInputStream("/path/to/file.owl"));
The code below searches the given model for all pre-defined types of binary interactions and writes the results to the given file.
SIFSearcher searcher = new SIFSearcher(SIFEnum.values());
searcher.searchSIF(model, new FileOutputStream("/path/to/output.sif"));
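The text output can be post-processed with plain Java. The sketch below tallies how many lines were written per interaction type. It assumes each line is tab-separated with the source, the interaction type, and the target in the first three columns; the class and method names are hypothetical and not part of Paxtools.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Paxtools: tallies interaction types in SIF text.
final class SifTypeCounter
{
    // Counts lines per interaction type, assuming tab-separated
    // "source<TAB>type<TAB>target" lines. Any extra columns are ignored.
    static Map<String, Integer> countByType(List<String> lines)
    {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines)
        {
            if (line.trim().isEmpty()) continue; // skip blank lines
            String[] cols = line.split("\t", -1);
            if (cols.length < 3)
            {
                throw new IllegalArgumentException("Expected at least 3 columns: " + line);
            }
            Integer old = counts.get(cols[1]);
            counts.put(cols[1], old == null ? 1 : old + 1);
        }
        return counts;
    }
}
```

The input lines could come from, for example, `Files.readAllLines(Paths.get("/path/to/output.sif"))`.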
Instead of writing a text file, SIFSearcher can return the SIFInteraction objects so that users can run computations on them.
SIFSearcher searcher = new SIFSearcher(SIFEnum.values());
Set<SIFInteraction> binaryInts = searcher.searchSIF(model);
for (SIFInteraction inter : binaryInts)
{
...
}
SIFSearcher takes the relation types of interest as constructor arguments. The code below limits the relation types to controls-state-change-of and controls-expression-of.
SIFSearcher searcher = new SIFSearcher(SIFEnum.CONTROLS_STATE_CHANGE_OF, SIFEnum.CONTROLS_EXPRESSION_OF);
A SIFToText object can be used to customize the contents of the output text file. The following code appends the related publication IDs to the end of each binary interaction's line.
SIFSearcher searcher = new SIFSearcher(SIFEnum.values());
searcher.searchSIF(model, new FileOutputStream("/path/to/output.sif"), new SIFToText()
{
    @Override
    public String convert(SIFInteraction inter)
    {
        String s = inter.toString();
        for (String pmID : inter.getPubmedIDs())
        {
            s += "\t" + pmID;
        }
        return s;
    }
});
As an alternative to implementing a new SIFToText class, users can use a CustomFormat object. This class generates tab-delimited multi-column output in which the 4th and later columns can be customized. The first 3 columns cannot be customized with this class; they are always source ID, interaction type, and target ID.
The code below generates SIF output with 8 columns. The parameter strings passed to CustomFormat indicate the types of the customized columns. Each can be either a pre-defined type (the valid values are the first 4 parameters below) or a path string (such as the 5th parameter).
SIFSearcher searcher = new SIFSearcher(SIFEnum.values());
searcher.searchSIF(model, new FileOutputStream("/path/to/output.sif"),
    new CustomFormat("mediator", "pubmed", "pathway", "resource",
        "Interaction/participant/cellularLocation/term"));
When a column in the output needs to combine values from multiple paths, these path strings should be separated with a semicolon (;).
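When reading such a file back, it helps to keep empty columns, since not every interaction has a value for every customized column. The sketch below splits one line of the 8-column example into named fields. The class name and the column labels are hypothetical and not part of Paxtools. How multiple values inside a single column are delimited is not covered here, so the helper only splits columns.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reader for one line of the 8-column example output; not part of Paxtools.
final class CustomSifRow
{
    // 3 fixed columns followed by the 5 customized columns of the example.
    // The labels are illustrative names chosen for this sketch.
    static final String[] HEADER = {
        "source", "type", "target",
        "mediator", "pubmed", "pathway", "resource", "location"
    };

    static Map<String, String> parse(String line)
    {
        // A limit of -1 keeps trailing empty columns; a plain split("\t") would drop them.
        String[] cols = line.split("\t", -1);
        if (cols.length != HEADER.length)
        {
            throw new IllegalArgumentException(
                "Expected " + HEADER.length + " columns but found " + cols.length);
        }
        Map<String, String> row = new LinkedHashMap<String, String>();
        for (int i = 0; i < HEADER.length; i++)
        {
            row.put(HEADER[i], cols[i]);
        }
        return row;
    }
}
```

Using `split("\t", -1)` rather than `split("\t")` matters here: without the negative limit, a line whose last columns are empty would come back with fewer than 8 fields.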
Finally, users who want to generate SIF in the old EXTENDED_BINARY_SIF format should use the code below.
SIFSearcher searcher = new SIFSearcher(SIFEnum.values());
Set<SIFInteraction> binaryInts = searcher.searchSIF(model);
OldFormatWriter.write(binaryInts, new FileOutputStream("/path/to/output.sif"));
When users do not want to include ubiquitous small molecules in the binary interaction patterns, they can prepare a Blacklist and pass it to SIFSearcher. A Blacklist records the IDs of ubiquitous small molecules along with their ubiquity score and context. The context can be input, output, or both. A Blacklist can write itself to a file and can be loaded from a text file.
A sample blacklist (that was generated for one of Pathway Commons 2 BioPAX models) is here.
The blacklist file contains 3 columns.
- Column 1: URI of the blacklisted SmallMoleculeReference
- Column 2: Ubiquity score used for blacklisting
- Column 3: The context of blacklisting
The third column (context) can have the value I, O, or B, which stand for Input, Output, and Both, respectively. As an example, ATP is ubiquitously consumed, while ADP is ubiquitously produced in the cell, so their blacklisting contexts are I and O, respectively. NADH, on the other hand, is both consumed and produced ubiquitously, so its context is B.
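To inspect a blacklist file outside of Paxtools, the three columns can be read with a small parser. The sketch below assumes the columns are tab-separated and that the score is numeric; neither detail is stated above. The class, enum, and method names are hypothetical and not part of the Blacklist class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for the 3-column blacklist text format; not part of Paxtools.
final class BlacklistFileSketch
{
    enum Context { INPUT, OUTPUT, BOTH }

    static final class Entry
    {
        final String uri;      // column 1: URI of the SmallMoleculeReference
        final double score;    // column 2: ubiquity score (assumed numeric)
        final Context context; // column 3: I, O, or B

        Entry(String uri, double score, Context context)
        {
            this.uri = uri;
            this.score = score;
            this.context = context;
        }
    }

    // Maps the one-letter code in column 3 to a context value.
    static Context parseContext(String code)
    {
        switch (code)
        {
            case "I": return Context.INPUT;
            case "O": return Context.OUTPUT;
            case "B": return Context.BOTH;
            default: throw new IllegalArgumentException("Unknown context: " + code);
        }
    }

    // Parses tab-separated lines (an assumption about the delimiter), skipping blank ones.
    static List<Entry> parse(List<String> lines)
    {
        List<Entry> entries = new ArrayList<Entry>();
        for (String line : lines)
        {
            if (line.trim().isEmpty()) continue;
            String[] cols = line.split("\t", -1);
            if (cols.length != 3)
            {
                throw new IllegalArgumentException("Expected 3 columns: " + line);
            }
            entries.add(new Entry(cols[0], Double.parseDouble(cols[1]),
                parseContext(cols[2].trim())));
        }
        return entries;
    }
}
```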
Below is the sample code that uses a blacklist.
SIFSearcher searcher = new SIFSearcher(SIFEnum.CONTROLS_STATE_CHANGE_OF, SIFEnum.INTERACTS_WITH);
Blacklist blacklist = new Blacklist("/path/to/blacklist.txt");
searcher.setBlacklist(blacklist);
searcher.searchSIF(model, new FileOutputStream("/path/to/output.sif"));
The BlacklistGenerator class infers ubiquitous small molecules in a given large model. Because it evaluates the connection statistics of small molecules in the model, the model should be very large, such as the complete data of a resource or the integrated model of all resources.
BlacklistGenerator gen = new BlacklistGenerator();
Blacklist blacklist = gen.generateBlacklist(model);
blacklist.write(new FileOutputStream("path/to/blacklist.txt"));
Each binary interaction is produced from one or more specific pattern matches in the data. A binary interaction is identified by its sourceID, targetID, and type. By default, gene symbols are used as source and target IDs for proteins, and display names for small molecules. To customize this, users can pass their own IDFetcher object to the constructor of SIFSearcher. The code below uses the displayName property of entities as the ID.
IDFetcher idf = new IDFetcher()
{
    @Override
    public Set<String> fetchID(BioPAXElement ele)
    {
        if (ele instanceof Named) return Collections.singleton(((Named) ele).getDisplayName());
        else return Collections.emptySet();
    }
};
SIFSearcher searcher = new SIFSearcher(idf, SIFEnum.values());
To customize the graph patterns that correspond to an existing binary interaction type, users need to write their own SIFMiner class. This new SIFMiner object can be passed to the constructor of SIFSearcher.
SIFMiner miner = ... // User's own SIF miner object
SIFSearcher searcher = new SIFSearcher(miner);
searcher.searchSIF(model, new FileOutputStream("/path/to/output.sif"));
To define a new interaction type, users should implement their own SIFType object, and at least one SIFMiner that searches for it.
class MySIFType implements SIFType
{
    @Override
    public String getTag()
    {
        return "my-type";
    }

    @Override
    public boolean isDirected()
    {
        return true; // or false
    }

    @Override
    public String getDescription()
    {
        return "This is my very own binary interaction type";
    }

    /**
     * The "getSIFType" method of MySIFMiner has to return this type.
     */
    @Override
    public List<Class<? extends SIFMiner>> getMiners()
    {
        return new ArrayList<Class<? extends SIFMiner>>(Arrays.asList(MySIFMiner.class));
    }
};
SIFSearcher searcher = new SIFSearcher(new MySIFType());
searcher.searchSIF(model, new FileOutputStream("/path/to/output.sif"));