Commit

Merge branch 'develop'
cfrainay committed Jul 22, 2024
2 parents 1b173c3 + bbe8d3b commit f35fa31
Showing 8 changed files with 932 additions and 81 deletions.
49 changes: 37 additions & 12 deletions README.md
@@ -1,24 +1,49 @@
# SBML2RDF
```
 _____ _____ _____ __       ___    _____ ____  _____ 
|   __| __  |     |  |     |_  |  | __  |    \|   __|
|__   | __ -| | | |  |__   |  _|  |    -|  |  |   __|
|_____|_____|_|_|_|_____|  |___|  |__|__|____/|__|   
```

A simple tool that converts a metabolic network in SBML format into RDF (turtle synthax).
A tool that converts a metabolic network in SBML format into RDF (turtle synthax).
SBML2RDF also include an optional model enhancement for knowledge graph, adding links between same compounds in different compartments, direct links between reactants and products of the same reaction (bypassing [specie <- specieRef <- reaction -> specieRef -> specie] paths), and side compounds (also known as, or closely related to : ubiquitous/auxiliary/currency/ancillary compounds) typing from a provided list.
SBML2RDF use the [JSBML](http://sbml.org/Software/JSBML) library for SBML file parsing and the [JENA](https://jena.apache.org/documentation/rdf/index.html) RDF API for building the triples.
SBML2RDF use biomodels' [SBML vocabulary](https://registry.identifiers.org/registry/biomodels.vocabulary) to describe the SBML content.

## Usage

the SBML2RDF convertor requires a metabolic network in SBML file format and a URI (Uniform Resource Identifiers) that uniquely identify this model
The SBML2RDF convertor requires a metabolic network in SBML file format and a URI (Uniform Resource Identifiers) that uniquely identify this model
Examples: https://metexplore.toulouse.inra.fr/metexplore2/?idBioSource=1363, https://www.ebi.ac.uk/biomodels/MODEL1311110001

```
java -jar SBML2RDF.jar -i path/tp/sbml.xml -u 'http://my.model.uri#id' -o path/to/output.ttl
 -h (--help)     : prints the help (default: false)
 -i (--sbml) VAL : input SBML file
 -o (--ttl) VAL  : path to RDF turtle output file
 -s (--silent)   : disable console print (default: false)
 -u (--uri) VAL  : URI that uniquely identify the model
```
```
java -jar SBML2RDF.jar -i path/to/sbml.xml -u 'http://my.model.uri#id' -o path/to/output.ttl
```
The final model can be enhanced with extra triples using the following options:

```
java -jar SBML2RDF.jar -i path/to/sbml.xml -u 'http://my.model.uri#id' -o path/to/output.ttl --linkCompartments --addMetaboLinks --importSideCompounds path/to/side_compounds_file.txt
```
The side compounds file must contains one entry per line, using the same identifier system as the input sbml. Such list can be defined manually or obtained using the Met4J toolbox.
The linkCompartments option requires that the SBML's entries of the same compound in different compartments share the same names.
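
The same-name requirement can be pictured with a small, library-free sketch: species are grouped by name, and any two entries of a group that sit in different compartments get a link. The class `CompartmentLinkSketch`, the method `linkByName`, and the identifiers are hypothetical illustrations. The tool's own logic lives in `PropertyFiller.harmonizeCompartments`, which works on the RDF model and is not shown in this diff.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the tool's PropertyFiller code.
final class CompartmentLinkSketch {

    // Each species is given as {id, name, compartment}.
    static List<String> linkByName(List<String[]> species) {
        // Group species by their SBML name, keeping input order.
        Map<String, List<String[]>> byName = new LinkedHashMap<>();
        for (String[] s : species) {
            byName.computeIfAbsent(s[1], k -> new ArrayList<>()).add(s);
        }
        List<String> links = new ArrayList<>();
        for (List<String[]> group : byName.values()) {
            for (int i = 0; i < group.size(); i++) {
                for (int j = i + 1; j < group.size(); j++) {
                    String[] a = group.get(i);
                    String[] b = group.get(j);
                    // Only link entries located in different compartments.
                    if (!a[2].equals(b[2])) {
                        links.add(a[0] + " <-> " + b[0]);
                    }
                }
            }
        }
        return links;
    }

    public static void main(String[] args) {
        List<String[]> species = List.of(
            new String[] {"M_glc_c", "D-glucose", "c"},
            new String[] {"M_glc_e", "D-glucose", "e"},
            new String[] {"M_atp_c", "ATP", "c"});
        System.out.println(linkByName(species)); // prints [M_glc_c <-> M_glc_e]
    }
}
```

If the two glucose entries carried different names (for example "D-glucose" and "glucose"), no link would be produced, which is why the names must match across compartments.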

```
 -h (--help)                     : prints the help (default: true)
 -i (--sbml) VAL                 : input SBML file
 -lc (--linkCompartments)        : [enhance] add links between same compounds
                                   in different compartments (must share same
                                   sbml.name) (default: false)
 -ml (--addMetaboLinks)          : [enhance] add direct "derives into" links
                                   between reactants and products of the same
                                   reaction (default: false)
 -o (--ttl) VAL                  : path to RDF turtle output file
 -s (--silent)                   : disable console print (default: false)
 -sc (--importSideCompounds) VAL : [enhance] add side compounds typing, which
                                   are ignored when using --addMetaboLink
                                   (recommended). Requires a file with one side
                                   compound sbml identifier per line
 -u (--uri) VAL                  : URI that uniquely identify the model
```
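
How `--addMetaboLinks` and `--importSideCompounds` interact can be sketched without any RDF machinery: for one reaction, each reactant is paired with each product, and pairs involving a side compound are skipped. The class `MetaboLinkSketch`, the method `directLinks`, and the identifiers below are hypothetical; the actual tool adds these links as triples through `PropertyFiller.addMetaboLinks`, whose implementation is not part of this diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: the real tool writes these links into the RDF model.
final class MetaboLinkSketch {

    // Returns "reactant -> product" pairs for one reaction, skipping side compounds.
    static List<String> directLinks(List<String> reactants,
                                    List<String> products,
                                    Set<String> sideCompounds) {
        List<String> links = new ArrayList<>();
        for (String reactant : reactants) {
            if (sideCompounds.contains(reactant)) {
                continue; // side reactant: no link starts here
            }
            for (String product : products) {
                if (sideCompounds.contains(product)) {
                    continue; // side product: no link ends here
                }
                links.add(reactant + " -> " + product);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        // Hexokinase-like reaction: glucose + ATP -> glucose 6-phosphate + ADP
        List<String> reactants = List.of("M_glc_c", "M_atp_c");
        List<String> products = List.of("M_g6p_c", "M_adp_c");

        // Without a side compound list, every reactant/product pair is linked.
        System.out.println(directLinks(reactants, products, Set.of()).size()); // prints 4

        // With ATP and ADP declared as side compounds, only the main conversion remains.
        System.out.println(directLinks(reactants, products, Set.of("M_atp_c", "M_adp_c")));
        // prints [M_glc_c -> M_g6p_c]
    }
}
```

The example suggests why the help text recommends supplying a side compounds list: without it, ubiquitous compounds such as ATP would be linked to many otherwise unrelated products.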

## Acknowledgment

66 changes: 25 additions & 41 deletions pom.xml
@@ -5,53 +5,24 @@
<modelVersion>4.0.0</modelVersion>

<groupId>fr.metexplore</groupId>
<artifactId>SBMLtoRDF</artifactId>
<version>1.0</version>
<artifactId>SBML2RDF</artifactId>
<name>SBML2RDF</name>
<version>1.0-SNAPSHOT</version>

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>11</maven.compiler.source>
<maven.compiler.target>11</maven.compiler.target>
<maven.compiler.source>17</maven.compiler.source>
<maven.compiler.target>17</maven.compiler.target>
</properties>

<description>A tool that converts a metabolic network in SBML format into RDF</description>

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>11</source>
<target>11</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
<configuration>
<archive>
<manifest>
<mainClass>App</mainClass>
</manifest>
</archive>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<finalName>SBML2RDF</finalName>
<appendAssemblyId>false</appendAssemblyId>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.1</version>
<version>3.2.4</version>

<executions>
<execution>
@@ -60,10 +31,24 @@
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>App</mainClass>
<manifestEntries>
<Multi-Release>true</Multi-Release>
<Implementation-Version>${project.version}</Implementation-Version>
</manifestEntries>
</transformer>
</transformers>
</configuration>
@@ -84,13 +69,12 @@
<groupId>org.apache.jena</groupId>
<artifactId>apache-jena-libs</artifactId>
<type>pom</type>
<version>4.3.2</version>
<version>4.10.0</version>
</dependency>

<dependency>
<groupId>org.apache.jena</groupId>
<artifactId>jena-core</artifactId>
<version>4.3.2</version>
<artifactId>jena-querybuilder</artifactId>
<version>4.10.0</version>
</dependency>

<dependency>
92 changes: 83 additions & 9 deletions src/main/java/App.java
@@ -1,4 +1,3 @@
import org.apache.jena.base.Sys;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.riot.RDFFormat;
import org.kohsuke.args4j.CmdLineException;
@@ -13,6 +12,13 @@
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;
import java.util.Collection;
import java.util.Set;
import java.util.stream.Collectors;

/**
* The CLI for the conversion from sbml to turtle RDF
@@ -32,10 +38,22 @@ public class App {
@Option(name = "-s", aliases = {"--silent"},usage = "disable console print", required = false)
private Boolean silent = false;

@Option(name = "-lc", aliases = {"--linkCompartments"},usage = "[enhance] add links between same compounds in different compartments (must share same sbml.name)", required = false)
private Boolean linkCompartments = false;

@Option(name = "-ml", aliases = {"--addMetaboLinks"},usage = "[enhance] add direct \"derives into\" links between reactants and products of the same reaction", required = false)
private Boolean addMetaboLink = false;

@Option(name = "-sc", aliases = {"--importSideCompounds"},usage = "[enhance] add side compounds typing, which are ignored when using --addMetaboLink (recommended). Requires a file with one side compound sbml identifier per line", required = false)
private String importSideCompounds = null;

@Option(name = "-h", aliases = {"--help"},usage = "prints the help", required = false)
private Boolean h = false;


private static Set<String> parseSideCompoundsFile(String inputpath) throws IOException {
Set<String> sideCompounds = Files.lines(Paths.get(inputpath)).collect(Collectors.toSet());
return sideCompounds;
}

public static void main(String[] args) throws IOException {

@@ -46,6 +64,7 @@ public static void main(String[] args) throws IOException {
try {
//parse SBML using JSBML library
//------------------------------
Instant start = Instant.now();
if(!app.silent) System.out.println("parsing model...");
SBMLDocument doc = new SBMLReader().readSBMLFromFile(app.inputPath);
Model sbmlModel = doc.getModel(); //JSBML model stores all data from SBML file
@@ -63,18 +82,56 @@ public static void main(String[] args) throws IOException {
convert.run();
org.apache.jena.rdf.model.Model rdf = convert.getRdfModel();

if(!app.silent) System.out.println(rdf.listStatements().toList().size()+" triples");

// [optional] add extra links:
//----------------------------
int n = rdf.listStatements().toList().size();
if(!app.silent && (app.linkCompartments || app.importSideCompounds!=null || app.addMetaboLink)) System.out.println("[enhance] adding extra triples:");

// [optional] add links between compartments' compounds
//----------------------------------------------------------
if(app.linkCompartments){
if(!app.silent) System.out.println("[enhance] Harmonizing compartmentalized compound versions...");
PropertyFiller.harmonizeCompartments(rdf,false);
if(!app.silent) System.out.println((rdf.listStatements().toList().size()-n)+" triples added");
n = rdf.listStatements().toList().size();
}
// [optional] tag side compounds from file
//---------------------------------------------
if(app.importSideCompounds!=null){
if(!app.silent) System.out.println("[enhance] Importing side compounds...");
Collection<String> sideCompounds = parseSideCompoundsFile(app.importSideCompounds);
if(!app.silent) System.out.println("[enhance] "+sideCompounds.size()+" side compounds imported.");
if(!app.silent) System.out.println("[enhance] Tagging reactions' side reactants and side products...");
PropertyFiller.importSideCompounds(rdf,sideCompounds);
if(!app.silent) System.out.println((rdf.listStatements().toList().size()-n)+" triples added");
n = rdf.listStatements().toList().size();
}
// [optional] add compound-to-compound metabolic relationship
//----------------------------------------------------------------
if(app.addMetaboLink){
if(!app.silent && app.importSideCompounds!=null) System.out.println("[enhance] Adding compound-to-compound metabolic links, ignoring side compounds...");
if(!app.silent && app.importSideCompounds==null) System.out.println("[enhance] Adding compound-to-compound metabolic links...");
PropertyFiller.addMetaboLinks(rdf,false);
if(!app.silent) System.out.println((rdf.listStatements().toList().size()-n)+" triples added");
}

rdf.setNsPrefix("cid","http://identifiers.org/pubchem.compound/");
rdf.setNsPrefix("chebi","http://identifiers.org/chebi/CHEBI:");
rdf.setNsPrefix("mnxCHEM", "http://identifiers.org/metanetx.chemical/");
if(!app.silent) System.out.println("RDF model created.");
if(!app.silent) System.out.println(rdf.listStatements().toList().size()+" triples");

//write RDF model in turtle
//-----------------------------------
//-------------------------
OutputStream out = new FileOutputStream(new File(app.outputPath));
RDFDataMgr.write(out, rdf, RDFFormat.TURTLE);
if(!app.silent)System.out.println("\nRDF model exported");
if(!app.silent)System.out.println(app.outputPath);
if(!app.silent)System.out.println("\nRDF model exported : "+app.outputPath);
Instant end = Instant.now();
Duration elapsedTime = Duration.between(start, end);

System.out.println("Execution time: " + elapsedTime.toSeconds()+"s");

} catch (XMLStreamException e) {
e.printStackTrace();
@@ -92,19 +149,29 @@ public static String getLabel() {return "\n" +
"|_____|_____|_|_|_|_____| |___| |__|__|____/|__| \n";}

public static String getDescription() {return "" +
"A simple tool that converts a metabolic network in SBML format into RDF (turtle synthax).\n" +
"A tool that converts a metabolic network in SBML format into RDF (turtle synthax).\n" +
"SBML2RDF also include an optional model enhancement for knowledge graph, adding links between same compounds" +
" in different compartments, direct links between reactants and products of the same reaction " +
"(bypassing [specie <- specieRef <- reaction -> specieRef -> specie] paths), " +
"and side compounds (also known as, or closely related to : ubiquitous/auxiliary/currency/ancillary compounds) typing from a provided list.\n"+
"SBML2RDF use the [JSBML](http://sbml.org/Software/JSBML) library for SBML file parsing and the [JENA](https://jena.apache.org/documentation/rdf/index.html) RDF API for building the triples. \n" +
"SBML2RDF use biomodels' [SBML vocabulary](https://registry.identifiers.org/registry/biomodels.vocabulary) to describe the SBML content. \n";
}

public static String getUsage() {return "" +
"the SBML2RDF convertor requires a metabolic network in SBML file format and a URI (Uniform Resource Identifiers) that uniquely identify this model\n" +
"The SBML2RDF convertor requires a metabolic network in SBML file format and a URI (Uniform Resource Identifiers) that uniquely identify this model\n" +
"Examples: " +
"https://metexplore.toulouse.inra.fr/metexplore2/?idBioSource=1363, " +
"https://www.ebi.ac.uk/biomodels/MODEL1311110001\n" +
"\n\t```" +
"\n\tjava -jar SBML2RDF.jar -i path/tp/sbml.xml -u 'http://my.model.uri#id' -o path/to/output.ttl" +
"\n\t```\n";
"\n\tjava -jar SBML2RDF.jar -i path/to/sbml.xml -u 'http://my.model.uri#id' -o path/to/output.ttl" +
"\n\t```\n"+
"The final model can be enhanced with extra triples using the following options:\n" +
"\n\t```" +
"\n\tjava -jar SBML2RDF.jar -i path/to/sbml.xml -u 'http://my.model.uri#id' -o path/to/output.ttl --linkCompartments --addMetaboLinks --importSideCompounds path/to/side_compounds_file.txt" +
"\n\t```\n" +
"The side compounds file must contains one entry per line, using the same identifier system as the input sbml. Such list can be defined manually or obtained using the Met4J toolbox.\n" +
"The linkCompartments option requires that the SBML's entries of the same compound in different compartments share the same names.\n\n";
}


@@ -133,5 +200,12 @@ protected void parseArguments(String[] args) {
System.exit(0);
}
}

if (this.h == true) {
this.printHeader();
parser.printUsage(System.out);
System.exit(0);
}
}

}
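
`parseSideCompoundsFile` in the diff above calls `Files.lines` without closing the returned stream, although the JDK documentation asks for it to be used inside a try-with-resources statement so the file handle is released promptly. A possible variant is sketched below. The class name `SideCompoundsReader` is hypothetical, and the trimming and blank-line skipping are assumptions that go beyond what the commit does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical variant of the commit's parseSideCompoundsFile helper.
final class SideCompoundsReader {

    // Reads one side compound identifier per line into a set.
    static Set<String> read(String inputPath) throws IOException {
        // try-with-resources closes the underlying file once the stream is consumed.
        try (Stream<String> lines = Files.lines(Paths.get(inputPath))) {
            return lines
                .map(String::trim)              // assumption: ignore surrounding whitespace
                .filter(line -> !line.isEmpty()) // assumption: ignore blank lines
                .collect(Collectors.toSet());
        }
    }
}
```

With this variant, a trailing empty line in the side compounds file would no longer be counted as an identifier in the "side compounds imported" message.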