diff --git a/dev/getting-started-devs/index.html b/dev/getting-started-devs/index.html
index ea7d7ff..d43f6e0 100644
--- a/dev/getting-started-devs/index.html
+++ b/dev/getting-started-devs/index.html
@@ -907,7 +907,7 @@

Jelly-JVM – getting started

This guide explains a few of the basic functionalities of Jelly-JVM and how to use them in your code. Jelly-JVM is written in Scala, but it can be used from Java as well. However, in this guide, we will focus on Scala 3.

Quick start – plain old files

Depending on your RDF library of choice (Apache Jena or RDF4J), you should import one of two dependencies: jelly-jena or jelly-rdf4j [1]. In our examples we will use Jena, so let's add this to your build.sbt file (this would be the same for other build tools like Maven or Gradle):

-build.sbt
-lazy val jellyVersion = "2.2.2"
+build.sbt
+lazy val jellyVersion = "2.3.0"
 
 libraryDependencies ++= Seq(
   "eu.ostrzyciel.jelly" %% "jelly-jena" % jellyVersion,
@@ -945,7 +945,7 @@ 

Quick start – plain old files

Read more about using Jelly-JVM with RDF4J

RDF streams

Now, the real power of Jelly lies in its streaming capabilities. Not only can it stream individual RDF triples/quads (this is called flat streaming), but it can also very effectively handle streams of RDF graphs or datasets. To work with streams, you need to use the jelly-stream module, which is based on the Apache Pekko Streams library. So, let's update our dependencies:

-build.sbt
-lazy val jellyVersion = "2.2.2"
+build.sbt
+lazy val jellyVersion = "2.3.0"
 
 libraryDependencies ++= Seq(
   "eu.ostrzyciel.jelly" %% "jelly-jena" % jellyVersion,
diff --git a/dev/search/search_index.json b/dev/search/search_index.json
index 1fcffe8..1e26764 100644
--- a/dev/search/search_index.json
+++ b/dev/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Jelly-JVM","text":"

Jelly-JVM is an implementation of the Jelly serialization format and gRPC streaming protocol for the Java Virtual Machine (JVM), written in Scala 3 [1]. The supported RDF libraries are Apache Jena and Eclipse RDF4J.

Jelly-JVM provides a full stack of utilities for fast and scalable RDF streaming with the Jelly protocol. Oh, and it's blazing-fast, too!

Getting started with plugins \u2013 no code required

See the getting started guide with plugins for a quick way to use Jelly with your Apache Jena or RDF4J application without writing any code.

Getting started for application developers

If you want to use the full feature set of Jelly-JVM in your code, see the getting started guide for application developers.

This documentation is for the latest development version of Jelly-JVM \u2013 it is not considered stable. If you are looking for the documentation of a stable release, use the version selector on the left of the top navigation bar. See: latest stable version.

"},{"location":"#library-modules","title":"Library modules","text":"

The implementation is split into a few modules that can be used separately:

  • jelly-core \u2013 implementation of the Jelly serialization format (using the scalapb library), along with generic utilities for converting the deserialized RDF data to/from the representations of RDF libraries (like Apache Jena or RDF4J).

  • jelly-jena \u2013 conversions and interop code for the Apache Jena library.

  • jelly-rdf4j \u2013 conversions and interop code for the RDF4J library.

  • jelly-stream \u2013 utilities for building Reactive Streams of RDF data (based on Pekko Streams). Useful for integrating with gRPC or other streaming protocols (e.g., Kafka, MQTT).

  • jelly-grpc \u2013 implementation of a gRPC client and server for the Jelly gRPC streaming protocol.

"},{"location":"#plugin-jars","title":"Plugin JARs","text":"

We also publish plugin JARs which allow you to use Jelly-JVM with Apache Jena and RDF4J just by dropping the JARs into the classpath. Find out more about using the plugins.

"},{"location":"#compatibility","title":"Compatibility","text":"

The Jelly-JVM implementation is compatible with Java 11 and newer. Java 11, 17, and 21 are tested in CI and are guaranteed to work. Jelly is built with Scala 3 LTS releases.

The following table shows the compatibility of the Jelly-JVM implementation with other libraries:

Jelly-JVM | Scala | Java | RDF4J | Apache Jena | Apache Pekko
2.0.x \u2013 2.2.x | 3.3.x (LTS) | 17+ | 5.x.x | 5.x.x | 1.1.x
1.0.x | 3.3.x (LTS), 2.13.x [1] | 11+ | 4.x.x | 4.x.x | 1.0.x

See the compatibility policy for more details and the release notes on GitHub.

"},{"location":"#documentation","title":"Documentation","text":"

Below is a list of all documentation pages about Jelly-JVM. You can also browse the Javadoc using the badges in the module list above. The documentation uses examples written in Scala, but the libraries can be used from Java as well.

  • Getting started with Jena/RDF4J plugins \u2013 how to use Jelly-JVM as a plugin for Apache Jena or RDF4J, without writing any code.
  • Getting started for application developers \u2013 how to use Jelly-JVM in code.
  • User guide
    • Apache Jena integration
    • RDF4J integration
    • Reactive streaming
    • gRPC
    • Useful utilities
    • Compatibility policy
  • Developer guide
    • Releases
    • Implementing Jelly for other libraries
  • Contributing to Jelly-JVM
  • License and citation
  • Release notes on GitHub
  • Main Jelly website \u2013 including the Jelly protocol specification and explanation of the various stream types.
  1. Scala 2.13-compatible builds of Jelly-JVM are available for Jelly-JVM 1.0.x. Scala 2 support was removed in subsequent versions. See more details.

"},{"location":"contributing/","title":"Contributing to Jelly-JVM","text":"

Jelly-JVM is an open project \u2013 you are welcome to submit issues, pull requests, or just ask questions!

"},{"location":"contributing/#submitting-issues","title":"Submitting issues","text":"

If you have a question, found a bug, or have an idea for a new feature, please open an issue in the GitHub issue tracker.

"},{"location":"contributing/#security-issues","title":"Security issues","text":"

If you find a security issue or vulnerability, please do not open a public issue. Instead, use the dedicated vulnerability reporting page.

"},{"location":"contributing/#pull-requests","title":"Pull requests","text":"

Pull requests are welcome! Simply fork the GitHub repository and create a new branch for your changes. When you are ready, open a pull request to the main branch.

If you are working on a larger feature or a significant change, it is recommended to open an issue first to discuss the idea.

"},{"location":"contributing/#documentation","title":"Documentation","text":"

Jelly-JVM uses the exact same documentation system as the main Jelly documentation. Further information on editing the documentation can be found in the Contributing to the Jelly documentation guide.

"},{"location":"contributing/#releases","title":"Releases","text":"

See the dedicated page on making releases.

"},{"location":"contributing/#see-also","title":"See also","text":"
  • Licensing and citation
"},{"location":"getting-started-devs/","title":"Jelly-JVM \u2013 getting started for developers","text":"

If you don't want to code anything and only use Jelly with your Apache Jena/RDF4J application, see the dedicated guide about using Jelly-JVM as a plugin.

This guide explains a few of the basic functionalities of Jelly-JVM and how to use them in your code. Jelly-JVM is written in Scala, but it can be used from Java as well. However, in this guide, we will focus on Scala 3.

"},{"location":"getting-started-devs/#quick-start-plain-old-files","title":"Quick start \u2013 plain old files","text":"

Depending on your RDF library of choice (Apache Jena or RDF4J), you should import one of two dependencies: jelly-jena or jelly-rdf4j [1]. In our examples we will use Jena, so let's add this to your build.sbt file (this would be the same for other build tools like Maven or Gradle):

build.sbt
lazy val jellyVersion = \"2.2.2\"\n\nlibraryDependencies ++= Seq(\n  \"eu.ostrzyciel.jelly\" %% \"jelly-jena\" % jellyVersion,\n)\n

Now you can serialize/deserialize Jelly data with Apache Jena. Jelly is fully integrated with Jena, so it should all just magically work. Here is a simple example of reading a .jelly file (in this case, a metadata file from RiverBench) with RIOT:

Deserialization example (Scala 3)
import eu.ostrzyciel.jelly.convert.jena.riot.*\nimport org.apache.jena.riot.RDFDataMgr\n\n// Load an RDF graph from a Jelly file\nval model = RDFDataMgr.loadModel(\n  \"https://w3id.org/riverbench/v/2.0.1.jelly\", \n  JellyLanguage.JELLY\n)\n// Print the size of the model\nprintln(s\"Loaded an RDF graph with ${model.size} triples\")\n

Serialization is just as easy:

Serialization example (Scala 3)
import eu.ostrzyciel.jelly.convert.jena.riot.*\nimport org.apache.jena.riot.RDFDataMgr\n\nimport java.io.FileOutputStream\nimport scala.util.Using\n\n// Omitted here: creating an RDF model.\n// You can use the one from the previous example.\n\nUsing.resource(new FileOutputStream(\"metadata.jelly\")) { out =>\n  // Write the model to a Jelly file\n  RDFDataMgr.write(out, model, JellyLanguage.JELLY)\n  println(\"Saved the model to metadata.jelly\")\n}\n

Read more about using Jelly-JVM with Apache Jena

Read more about using Jelly-JVM with RDF4J

"},{"location":"getting-started-devs/#rdf-streams","title":"RDF streams","text":"

Now, the real power of Jelly lies in its streaming capabilities. Not only can it stream individual RDF triples/quads (this is called flat streaming), but it can also very effectively handle streams of RDF graphs or datasets. To work with streams, you need to use the jelly-stream module, which is based on the Apache Pekko Streams library. So, let's update our dependencies:

build.sbt
lazy val jellyVersion = \"2.2.2\"\n\nlibraryDependencies ++= Seq(\n  \"eu.ostrzyciel.jelly\" %% \"jelly-jena\" % jellyVersion,\n  \"eu.ostrzyciel.jelly\" %% \"jelly-stream\" % jellyVersion,\n)\n

Now, let's say we have a stream of RDF graphs \u2013 for example each graph corresponds to one set of measurements from an IoT sensor. We want to have a stream that turns these graphs into their serialized representations (byte arrays), which we can then send over the network. Here is how to do it:

Reactive streaming example (Scala 3)
// We need to import \"jena.given\" for Jena-to-Jelly conversions\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport scala.concurrent.ExecutionContext\n\n// We will need a Pekko actor system to run the streams\ngiven actorSystem: ActorSystem = ActorSystem()\n// And an execution context for the futures\ngiven ExecutionContext = actorSystem.getDispatcher\n\n// Load an RDF graph for testing\nval model = RDFDataMgr.loadModel(\n  \"https://w3id.org/riverbench/v/2.0.1.jelly\", \n  JellyLanguage.JELLY\n)\n\nSource.repeat(model) // Create a stream of the same model over and over\n  .take(10) // Take only the first 10 elements in the stream\n  .map(_.asTriples) // Convert each model to an iterable of triples\n  .via(EncoderFlow.graphStream( // Encode each iterable to a Jelly stream frame\n    maybeLimiter = None, // 1 RDF graph = 1 message\n    JellyOptions.smallStrict, // Jelly compression settings preset\n  ))\n  .via(JellyIo.toBytes) // Convert the stream frames to byte arrays\n  .runForeach { bytes =>\n    // Just print the length of each byte array in the stream.\n    // You can also hook this up to MQTT, Kafka, etc.\n    println(s\"Streamed ${bytes.length} bytes\")\n  }\n  .onComplete(_ => actorSystem.terminate())\n

Jelly will compress this stream on the fly, so if the data is repetitive, it will be very efficient. If you run this code, you will notice that the byte sizes for the later graphs are smaller, even though we are sending the same graph over and over again. But even if each graph is completely different, Jelly should still be much faster than other serialization formats.
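To build some intuition for why later graphs in a repetitive stream take fewer bytes, here is a toy sketch of a lookup-table encoder. It is an illustration only and does not reflect Jelly's actual wire format: the class name ToyLookupEncoder, the 4-byte ID size, and the example IRIs are all assumptions made for this sketch.

```scala
import scala.collection.mutable

// Toy lookup-table encoder. NOT Jelly's real encoding, only an illustration.
final class ToyLookupEncoder:
  // Remembers every string seen so far and the ID assigned to it
  private val table = mutable.LinkedHashMap.empty[String, Int]

  /** Returns the (pretend) encoded size in bytes of one "graph", given as a list of IRIs. */
  def encodedSize(iris: Seq[String]): Int =
    iris.map { iri =>
      table.get(iri) match
        case Some(_) =>
          4 // already in the table: only a 4-byte ID is sent
        case None =>
          table.put(iri, table.size)
          4 + iri.getBytes("UTF-8").length // new entry: ID plus the full string
    }.sum

@main def lookupDemo(): Unit =
  // Hypothetical IRIs standing in for the content of one sensor graph
  val graph = Seq("http://example.org/sensor/1", "http://example.org/temperature")
  val encoder = ToyLookupEncoder()
  // Encode the same graph three times and compare the sizes
  val sizes = (1 to 3).map(_ => encoder.encodedSize(graph))
  println(sizes.mkString(", "))
```

The first graph pays for every string in full, while repeated graphs only pay for the IDs. The real format is considerably more sophisticated, so treat this only as a mental model.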

These streams are very powerful, because they are reactive and asynchronous \u2013 in short, this means you can hook this up to any data source and any data sink \u2013 and you can scale it up as much as you want. If you are unfamiliar with the concept of reactive streams, we recommend you start with this Apache Pekko Streams guide.

Jelly-JVM supports streaming serialization and deserialization of all types of streams in the RDF Stream Taxonomy. You can read more about the theory of this and all available stream types in the Jelly protocol documentation.

Learn more about reactive streaming with Jelly-JVM

Learn more about the types of streams in Jelly

"},{"location":"getting-started-devs/#grpc-streaming","title":"gRPC streaming","text":"

Jelly is a bit more than just a serialization format \u2013 it also defines a gRPC-based streaming protocol. You can use it for streaming RDF data between microservices, to build a pub/sub system, or to publish RDF data to the web.

Learn more about using Jelly gRPC protocol servers and clients

"},{"location":"getting-started-devs/#further-reading","title":"Further reading","text":"
  • Using Jelly-JVM with Apache Jena
  • Using Jelly-JVM with RDF4J
  • Reactive streaming with Jelly-JVM \u2013 using the jelly-stream module and Apache Pekko Streams
  • Using Jelly gRPC protocol servers and clients
  • Other useful utilities in Jelly-JVM
  • Low-level usage of Jelly-JVM
"},{"location":"getting-started-devs/#example-applications-using-jelly-jvm","title":"Example applications using Jelly-JVM","text":"
  • The examples directory in the Jelly-JVM repo contains code snippets that demonstrate how to use the library in various scenarios.
  • Jelly JVM benchmarks \u2013 research software for testing the performance of Jelly-JVM and other RDF serializations in Apache Jena. It uses most Jelly-JVM features.
  • RiverBench ci-worker \u2013 a real-world application that is used for processing large RDF datasets in a CI/CD pipeline. It uses Jelly-JVM for serialization and deserialization with Apache Jena. It also makes extensive use of Apache Pekko Streams.
"},{"location":"getting-started-devs/#questions","title":"Questions?","text":"

If you have any questions about using Jelly-JVM, feel free to open an issue on GitHub.

  1. There is nothing stopping you from using both at the same time. You can also pretty easily add support for any other Java-based RDF library by implementing a few interfaces. More details here.

"},{"location":"getting-started-plugins/","title":"Jelly-JVM \u2013 getting started with Jena/RDF4J plugins","text":"

This guide explains how to use Jelly-JVM with Apache Jena or RDF4J as a plugin, without writing a single line of code. Jelly-JVM provides plugin JARs that you can simply drop in the appropriate directory to get Jelly format support in your application.

"},{"location":"getting-started-plugins/#installation","title":"Installation","text":""},{"location":"getting-started-plugins/#apache-jena-apache-jena-fuseki","title":"Apache Jena, Apache Jena Fuseki","text":"

You can simply add Jelly format support to Apache Jena or Apache Jena Fuseki with Jelly's plugin JAR.

  • First, download the plugin JAR. You can download the latest development version from here, or you can go to the releases page on GitHub to download a different version of the jelly-jena-plugin.jar file.
    • Note that the Jelly version must be compatible with your Apache Jena version. Consult the compatibility table.
  • Place the file in your classpath:
    • For Apache Jena Fuseki, simply place the file in $FUSEKI_BASE/extra/ directory. $FUSEKI_BASE is the directory usually called run where you have files such as config.ttl and shiro.ini. You will most likely need to create the extra directory yourself.
    • For Apache Jena, place the file in the lib/ directory of your Jena installation.
    • For other applications, consult the manual of the application.
  • You can now use the Jelly format for parsing, serialization, and streaming serialization in your Jena application.

Content negotiation in Fuseki

Content negotiation using the application/x-jelly-rdf media type in the Accept header works in Fuseki since Apache Jena version 5.2.0. Previous versions of Fuseki did not support media type registration.
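As a minimal sketch of what such a content-negotiated request could look like from the client side, the snippet below builds a GET request with the Accept header set to the Jelly media type, using only the JDK's java.net.http API. The helper name jellyRequest and the endpoint URL are hypothetical; substitute the URL of your own Fuseki dataset.

```scala
import java.net.URI
import java.net.http.HttpRequest

// Builds (but does not send) a GET request asking the server for Jelly-encoded RDF.
def jellyRequest(endpoint: String): HttpRequest =
  HttpRequest.newBuilder(URI.create(endpoint))
    .header("Accept", "application/x-jelly-rdf")
    .GET()
    .build()

@main def acceptHeaderDemo(): Unit =
  // Hypothetical local Fuseki dataset URL, assumed for illustration only
  val request = jellyRequest("http://localhost:3030/ds")
  // Print the request headers to confirm the media type is set
  println(request.headers().map())
```

To actually send the request, pass it to a java.net.http.HttpClient and read the response body as bytes, since Jelly is a binary format.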

How to use Jelly with Jena's CLI tools?

Jelly-JVM fully supports Apache Jena's command-line interface (CLI) utilities. See the dedicated guide for more information.

"},{"location":"getting-started-plugins/#eclipse-rdf4j","title":"Eclipse RDF4J","text":"

You can simply add Jelly format support to an application based on RDF4J with Jelly's plugin JAR.

  • First, download the plugin JAR. You can download the latest development version from here, or you can go to the releases page on GitHub to download a specific version of the jelly-rdf4j-plugin.jar file.
    • Note that the Jelly version must be compatible with your RDF4J version. Consult the compatibility table.
  • Place the file in your classpath:
    • For the RDF4J SDK distribution, place the file in the lib/ directory of your RDF4J installation.
    • For other applications, consult the manual of your application for the exact location.
  • You can now use the Jelly format for parsing and serialization in your RDF4J application.
"},{"location":"getting-started-plugins/#supported-features","title":"Supported features","text":"

The Jelly-JVM plugin JARs provide the following features:

  • Full support for parsing and serialization of RDF data (triples and quads) in the Jelly format.
    • The parser will automatically detect if the input data is delimited or not. Both delimited and non-delimited Jelly data can be parsed.
    • In Apache Jena, stream serialization is also supported.
  • Recognizing the .jelly file extension.
  • Recognizing the application/x-jelly-rdf media type.

The Jelly format is registered under the name jelly in the RDF libraries, so you can use it in the same way as other formats like Turtle, RDF/XML, or JSON-LD.

"},{"location":"getting-started-plugins/#see-also","title":"See also","text":"
  • Getting started for developers \u2013 if you want to get your hands dirty with code and get more features out of Jelly.
"},{"location":"licensing/","title":"Licensing and citation","text":"

Jelly-JVM is licensed under the Apache License 2.0.

"},{"location":"licensing/#attribution-citation","title":"Attribution / citation","text":"

If you use Jelly-JVM in your research, please cite the most recent paper about Jelly:

Sowi\u0144ski, P., Wasielewska-Michniewska, K., Ganzha, M., & Paprzycki, M. (2022, October). Efficient RDF streaming for the edge-cloud continuum. In 2022 IEEE 8th World Forum on Internet of Things (WF-IoT) (pp. 1-8). IEEE.

Or use this BibTeX entry:

@inproceedings{sowinski2022efficient,\n  title={Efficient RDF streaming for the edge-cloud continuum},\n  author={Sowi{\\'n}ski, Piotr and Wasielewska-Michniewska, Katarzyna and Ganzha, Maria and Paprzycki, Marcin and others},\n  booktitle={2022 IEEE 8th World Forum on Internet of Things (WF-IoT)},\n  pages={1--8},\n  year={2022},\n  organization={IEEE},\n  doi={10.1109/WF-IoT54382.2022.10152225}\n}\n

This paper describes an earlier version of Jelly from 2022. A new paper is in preparation.

"},{"location":"licensing/#jelly-maintainer","title":"Jelly maintainer","text":"

Jelly-JVM was created and is maintained by Piotr Sowi\u0144ski (Ostrzyciel) \u2013 GitHub.

"},{"location":"licensing/#see-also","title":"See also","text":"
  • Contributing to Jelly-JVM
"},{"location":"dev/implementing/","title":"Developer guide \u2013 implementing conversions for other libraries","text":"

Currently converters for the two most popular RDF JVM libraries are implemented \u2013 RDF4J and Jena. But it is possible to implement your own converters and adapt the Jelly serialization code to any RDF library with little effort.

To do this, you will need to implement three traits (interfaces in Java) from the jelly-core module: ProtoEncoder, ProtoDecoderConverter, and ConverterFactory.

  • ProtoEncoder (serialization)

    • get* methods deconstruct triple statements, quad statements, and quoted triples (RDF-star). You can make them inline.
    • nodeToProto and graphToProto should translate all possible variations of RDF terms in the SPO and G positions, respectively, into Jelly's representation.
    • Example implementation for Jena: JenaProtoEncoder
    • You can skip implementing this trait if you don't need serialization.
    • You can also skip implementing some methods (make them throw an exception or return null) if, for example, you don't want to work with quads or RDF-star.
  • ProtoDecoderConverter (deserialization)

    • The make* methods should construct new RDF terms and statements. You can make them inline.
    • Example implementation for Jena: JenaDecoderConverter
    • You can skip implementing this trait if you don't need deserialization.
    • You can also skip implementing some methods (make them throw an exception or return null) if, for example, you don't want to work with quads or RDF-star.
  • ConverterFactory \u2013 wrapper that allows other modules to use your converter.

    • The methods should just return new instances of your ProtoEncoder and ProtoDecoderConverter implementations.
    • Example for Jena: JenaConverterFactory
"},{"location":"dev/releases/","title":"Developer guide \u2013 releases","text":""},{"location":"dev/releases/#full-versioned-releases","title":"Full (versioned) releases","text":"

Full (versioned) releases are created manually and follow the Semantic Versioning scheme for binary compatibility.

To create a new tagged release (example for version 1.2.3):

$ git checkout main\n$ git pull\n$ git tag v1.2.3\n$ git push origin v1.2.3\n

The rest (packaging and release creation) will be handled automatically by the CI. The release will be pushed to Maven Central.

"},{"location":"dev/releases/#snapshot-releases","title":"Snapshot releases","text":"

Snapshot releases are triggered automatically by commits in the main branch. Snapshots are pushed to the Sonatype snapshot repository.

"},{"location":"user/compatibility/","title":"Compatibility policy","text":"

Jelly-JVM follows Semantic Versioning 2.0.0, with MAJOR.MINOR.PATCH releases. Please see the compatibility table on the main page for the current compatibility information. The documentation is versioned to match each Jelly-JVM MAJOR.MINOR version.

"},{"location":"user/compatibility/#jvm-and-scala","title":"JVM and Scala","text":"

The current version of Jelly-JVM is compatible with Java 17 and newer. Java 17, 21, and 23 are tested in CI and are guaranteed to work. We recommend using a recent release of GraalVM to get the best performance. If you need Java 11 support, you should use Jelly-JVM 1.0.x.

Jelly is built with Scala 3 LTS releases and supports only Scala 3. If you need Scala 2 support, you should use Jelly-JVM 1.0.x.

"},{"location":"user/compatibility/#rdf-libraries","title":"RDF libraries","text":"

Major-version upgrades of RDF4J and Apache Jena (e.g., updating from 4.0.x to 5.0.x) are done in Jelly-JVM MINOR releases. Jelly-JVM generally does not use any complex features of these libraries, so it should work with multiple versions without any problems.

If you do encounter any compatibility issues, please report them on the issue tracker.

"},{"location":"user/compatibility/#internal-vs-external-apis","title":"Internal vs external APIs","text":"

Generally, all public classes and methods in Jelly-JVM are considered part of the public API. However, there are some exceptions.

Auto-generated classes in the eu.ostrzyciel.jelly.core.proto.v1 package of the jelly-core module are not considered part of the public API, although we will avoid any incompatibilities where possible. These classes may change between MINOR releases.

"},{"location":"user/compatibility/#backward-and-forward-protocol-compatibility","title":"Backward and forward protocol compatibility","text":"

Jelly-JVM follows the Jelly protocol's backward compatibility policy. This means that Jelly-JVM can read data serialized with older versions of Jelly. Backward compatibility is tested in CI \u2013 the code is in BackCompatSpec.scala.

Forward compatibility is provided only in a very limited manner in Jelly-JVM. If the protocol version used in a stream is not supported, the parser is guaranteed to parse only the stream options header and reject the rest of the stream. You may choose to disable this check and try to parse the rest of the data anyway, but this is most certainly NOT recommended and may lead to unexpected results. In general, Jelly-JVM will ignore any unknown fields in the stream, but any other changes in the protocol may lead to really \"funny\" errors. Forward compatibility is tested in CI \u2013 the code is in ForwardCompatSpec.scala.
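The version check described above can be pictured with the following self-contained sketch. It does not use Jelly-JVM's real classes: ToyStreamOptions, maxSupportedVersion, and checkVersion are hypothetical names introduced only to illustrate the logic of reading the options header first and rejecting streams that declare a newer protocol version.

```scala
// Hypothetical stand-in for the stream options header. Not a Jelly-JVM class.
final case class ToyStreamOptions(version: Int)

// Assumed for illustration: the newest protocol version this toy parser understands
val maxSupportedVersion = 1

/** Accepts the stream only if its declared version is supported,
  * unless the check is explicitly disabled (not recommended). */
def checkVersion(
  opts: ToyStreamOptions,
  enforce: Boolean = true
): Either[String, ToyStreamOptions] =
  if enforce && opts.version > maxSupportedVersion then
    Left(s"Unsupported protocol version ${opts.version} (max supported: $maxSupportedVersion)")
  else
    Right(opts)

@main def versionCheckDemo(): Unit =
  println(checkVersion(ToyStreamOptions(1)))                  // supported version
  println(checkVersion(ToyStreamOptions(2)))                  // newer version, rejected
  println(checkVersion(ToyStreamOptions(2), enforce = false)) // check disabled
```

Returning an Either keeps the rejection explicit at the call site. With the check disabled, the caller takes on the risk of misinterpreting data written with a newer protocol.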

"},{"location":"user/compatibility/#see-also","title":"See also","text":"
  • Release notes on GitHub
  • Making Jelly-JVM releases
  • Contributing to Jelly-JVM
"},{"location":"user/grpc/","title":"User guide \u2013 gRPC","text":"

This guide explains the functionalities of the jelly-grpc module, which implements a gRPC client and server for the Jelly gRPC streaming protocol.

Prerequisites

If you are unfamiliar with gRPC, we recommend you first read some introductory material on the gRPC website or in the Apache Pekko gRPC documentation.

The jelly-grpc module builds on the functionalities of jelly-stream, so we recommend you first read the reactive streaming guide.

You may also want to first skim the Jelly gRPC streaming protocol specification to understand the protocol's structure.

As with the jelly-stream module, you can use jelly-grpc with any RDF library that has a Jelly integration, such as Apache Jena (using jelly-jena) or RDF4J (using jelly-rdf4j). The gRPC API is generic and identical across all libraries.

"},{"location":"user/grpc/#making-a-grpc-server-and-client","title":"Making a gRPC server and client","text":"

jelly-grpc builds on the Apache Pekko gRPC library. Jelly-JVM provides boilerplate code for setting up a gRPC server and client that can send and receive Jelly streams, as shown in the example below:

Example: PekkoGrpc.scala

Source code on GitHub

PekkoGrpc.scala
package eu.ostrzyciel.jelly.examples\n\nimport com.typesafe.config.ConfigFactory\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.*\nimport eu.ostrzyciel.jelly.grpc.RdfStreamServer\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.NotUsed\nimport org.apache.pekko.actor.typed.ActorSystem\nimport org.apache.pekko.actor.typed.javadsl.Behaviors\nimport org.apache.pekko.grpc.{GrpcClientSettings, GrpcServiceException}\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.concurrent.{Await, ExecutionContext, Future}\nimport scala.concurrent.duration.*\nimport scala.util.{Failure, Success}\n\n/**\n * Example of using Jelly's gRPC client and server to send Jelly streams over the network.\n * This uses the Apache Pekko gRPC library. Its documentation can be found at:\n * https://pekko.apache.org/docs/pekko-grpc/current/index.html\n * \n * See also examples named `PekkoStreams*` for instructions on encoding and decoding RDF streams with Jelly.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoGrpc extends shared.Example:\n  // Create a config for Pekko gRPC.\n  // We can use the same config for the client and the server, as we are communicating on localhost.\n  // This would usually be loaded from a configuration file (e.g., application.conf).\n  // More details: https://github.com/lightbend/config\n  val config = ConfigFactory.parseString(\n      \"\"\"\n        |pekko.http.server.preview.enable-http2 = on\n        |pekko.grpc.client.jelly.host = 127.0.0.1\n        |pekko.grpc.client.jelly.port = 8088\n        |pekko.grpc.client.jelly.enable-gzip = true\n        |pekko.grpc.client.jelly.use-tls = false\n        
|pekko.grpc.client.jelly.backend = netty\n        |\"\"\".stripMargin\n    )\n    .withFallback(ConfigFactory.defaultApplication())\n\n  // We will need two Pekko actor systems to run the streams \u2013 one for the server and one for the client\n  val serverActorSystem: ActorSystem[_] = ActorSystem(Behaviors.empty, \"ServerSystem\")\n  val clientActorSystem: ActorSystem[_] = ActorSystem(Behaviors.empty, \"ClientSystem\", config)\n\n  // Our mock dataset that we will send around in the streams\n  val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n\n  /**\n   * Main method that starts the server and the client.\n   */\n  def main(args: Array[String]): Unit =\n    given system: ActorSystem[_] = serverActorSystem\n    given ExecutionContext = system.executionContext\n\n    // Start the server\n    val exampleService = ExampleJellyService()\n    RdfStreamServer(\n      RdfStreamServer.Options.fromConfig(config.getConfig(\"pekko.grpc.client.jelly\")),\n      exampleService\n    ).run() onComplete {\n      case Success(binding) =>\n        // If the server started successfully, start the client\n        println(s\"[SERVER] Bound to ${binding.localAddress}\")\n        runClient()\n      case Failure(exception) =>\n        // Otherwise, print the error and terminate the actor system\n        println(s\"[SERVER] Failed to bind: $exception\")\n        system.terminate()\n    }\n\n\n  /**\n   * The client part of the example.\n   */\n  private def runClient(): Unit =\n    given system: ActorSystem[_] = clientActorSystem\n    given ExecutionContext = system.executionContext\n\n    // Create a gRPC client\n    val client = RdfStreamServiceClient(GrpcClientSettings.fromConfig(\"jelly\"))\n\n    // First, let's try to publish some data to the server\n    val frameSource = EncoderSource.fromDatasetAsQuads(\n      dataset,\n      ByteSizeLimiter(500),\n      JellyOptions.smallStrict.withStreamName(\"weather\")\n    )\n  
  println(\"[CLIENT] Publishing data to the server...\")\n    val publishFuture = client.publishRdf(frameSource) map { response =>\n      println(s\"[CLIENT] Received acknowledgment: $response\")\n    } recover {\n      case e =>\n        println(s\"[CLIENT] Failed to publish data: $e\")\n    }\n    // Wait for the publish to complete\n    Await.ready(publishFuture, 10.seconds)\n\n    // Now, let's try to subscribe to some data from the server in the QUADS format\n    println(\"\\n\\n[CLIENT] Subscribing to QUADS data from the server...\")\n    val quadsFuture = client\n      .subscribeRdf(RdfStreamSubscribe(\n        \"weather\",\n        Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.QUADS))\n      ))\n      .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n      .runFold(0L)((acc, _) => acc + 1)\n      // Process the result of the stream (Future[Long])\n      .map { counter =>\n        println(s\"[CLIENT] Received $counter quads.\")\n      } recover {\n        case e =>\n          println(s\"[CLIENT] Failed to receive quads: $e\")\n      }\n    Await.ready(quadsFuture, 10.seconds)\n\n    // Let's try the same, with a GRAPHS stream\n    println(\"\\n\\n[CLIENT] Subscribing to GRAPHS data from the server...\")\n    val graphsFuture = client\n      .subscribeRdf(RdfStreamSubscribe(\n        \"weather\",\n        Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.GRAPHS))\n      ))\n      // Decode the response and transform it into a stream of quads\n      .via(DecoderFlow.decodeGraphs.asDatasetStreamOfQuads)\n      .mapConcat(identity)\n      .runFold(0L)((acc, _) => acc + 1)\n      // Process the result of the stream (Future[Long])\n      .map { counter =>\n        println(s\"[CLIENT] Received $counter quads.\")\n      } recover {\n        case e =>\n          println(s\"[CLIENT] Failed to receive data: $e\")\n      }\n    Await.ready(graphsFuture, 10.seconds)\n\n    // Finally, let's try to subscribe to a stream that the 
server does not support\n    // We will request TRIPLES, but the server only supports QUADS and GRAPHS.\n    println(\"\\n\\n[CLIENT] Subscribing to TRIPLES data from the server...\")\n    val triplesFuture = client\n      .subscribeRdf(RdfStreamSubscribe(\n        \"weather\",\n        Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.TRIPLES))\n      ))\n      .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n      .runFold(0L)((acc, _) => acc + 1)\n      .map { counter =>\n        println(s\"[CLIENT] Received $counter triples.\")\n      } recover {\n        case e =>\n          println(s\"[CLIENT] Failed to receive triples: $e\")\n      }\n    Await.result(triplesFuture, 10.seconds)\n\n    println(\"\\n\\n[CLIENT] Terminating...\")\n    system.terminate()\n    println(\"[SERVER] Terminating...\")\n    serverActorSystem.terminate()\n\n\n  /**\n   * Example implementation of RdfStreamService to act as the server.\n   * \n   * You will also need to implement this trait in your own service. It defines the logic with which the server\n   * will handle incoming streams and subscriptions.\n   */\n  class ExampleJellyService(using system: ActorSystem[_]) extends RdfStreamService:\n    given ExecutionContext = system.executionContext\n\n    /**\n     * Handler for clients publishing RDF streams to the server.\n     * \n     * We receive a stream of RdfStreamFrames and must respond with an acknowledgment (or an error).\n     */\n    override def publishRdf(in: Source[RdfStreamFrame, NotUsed]): Future[RdfStreamReceived] =\n      // Decode the incoming stream and count the number of RDF statements in it\n      in.via(DecoderFlow.decodeAny.asFlatStream)\n        .runFold(0L)((acc, _) => acc + 1)\n        .map(counter => {\n          println(s\"[SERVER] Received ${counter} RDF statements. 
Sending acknowledgment.\")\n          // Send an acknowledgment back to the client\n          RdfStreamReceived()\n        })\n\n    /**\n     * Handler for clients subscribing to RDF streams from the server.\n     * \n     * We receive a subscription request and must respond with a stream of RdfStreamFrames or an error.\n     */\n    override def subscribeRdf(in: RdfStreamSubscribe): Source[RdfStreamFrame, NotUsed] =\n      println(s\"[SERVER] Received subscription request for topic ${in.topic}.\")\n      // First, check the requested physical stream type\n      val streamType = in.requestedOptions match\n        case Some(options) =>\n          println(s\"[SERVER] Requested physical stream type: ${options.physicalType}.\")\n          options.physicalType\n        case None =>\n          println(s\"[SERVER] No requested stream options.\")\n          PhysicalStreamType.UNSPECIFIED\n\n      // Get the stream options requested by the client or the default options if none were provided\n      val options = in.requestedOptions.getOrElse(JellyOptions.smallStrict)\n        .withStreamName(in.topic)\n      // Check if the requested options are supported\n      // !!! THIS IS IMPORTANT !!!\n      // If you don't check if the requested options are supported, you may be vulnerable to\n      // denial-of-service attacks. For example, a client could request a very large lookup table\n      // that would consume a lot of memory on the server.\n      try\n        JellyOptions.checkCompatibility(options, JellyOptions.defaultSupportedOptions)\n      catch\n        case e: IllegalArgumentException =>\n          // If the requested options are not supported, return an error\n          return Source.failed(new GrpcServiceException(\n            io.grpc.Status.INVALID_ARGUMENT.withDescription(e.getMessage)\n          ))\n\n      streamType match\n        // This server implementation only supports QUADS and GRAPHS streams... 
and in both cases\n        // it will always stream the same dataset.\n        // You can of course implement more complex logic here, e.g., to stream different data based on the topic.\n        case PhysicalStreamType.QUADS => EncoderSource.fromDatasetAsQuads(\n          dataset,\n          ByteSizeLimiter(16_000),\n          options\n        )\n        case PhysicalStreamType.GRAPHS => EncoderSource.fromDatasetAsGraphs(\n          dataset,\n          Some(ByteSizeLimiter(16_000)),\n          options\n        )\n        // PhysicalStreamType.TRIPLES is not supported here \u2013 the server will throw a gRPC error\n        // if the client requests it.\n        // This is an example of how to properly handle unsupported stream options requested by the client.\n        // The library is able to automatically convert the error into a gRPC status and send it back to the client.\n        case _ => Source.failed(new GrpcServiceException(\n          io.grpc.Status.INVALID_ARGUMENT.withDescription(\"Unsupported physical stream type\")\n        ))\n

The classes provided in jelly-grpc should cover most cases, but they serve only as boilerplate. You must define the logic for handling the incoming and outgoing streams yourself, as shown in the example above.

Of course, you can also implement the server or the client from scratch, if you want to.

"},{"location":"user/grpc/#see-also","title":"See also","text":"
  • Reactive streaming with Jelly-JVM
  • Useful utilities
    • Using Typesafe config to configure Jelly
"},{"location":"user/jena-cli/","title":"Apache Jena CLI tools","text":"

Jelly-JVM fully supports Apache Jena's command-line interface (CLI) utilities.

"},{"location":"user/jena-cli/#parsing","title":"Parsing","text":"

Jena will automatically detect Jelly files based on their extension (.jelly, .jelly.gz) and parse them. You can also manually set the --syntax option to jelly.

"},{"location":"user/jena-cli/#writing","title":"Writing","text":"

You can use Jelly as an output format for Jena's CLI utilities by specifying the --output or --stream options with the jelly format. We recommend using the --stream option for better performance.

Example: converting a Turtle file to Jelly

./riot --stream=jelly data.ttl > data.jelly\n

By default Jena will use the \"small, all features\" Jelly preset (name table: 128 entries, prefix table: 16, datatype table: 16, RDF-star enabled, generalized RDF enabled). There are a few reasons why you might want to change these serialization options:

  • Performance \u2013 for larger files, the small preset does not offer the best performance or compression ratio. It's better to use larger lookup tables.
  • Compatibility \u2013 if your data does not include RDF-star or generalized RDF, you can mark these features as disabled. Parsers will then know exactly what to expect in your data.

The following presets are available:

  • Small: 128 name table entries, 16 prefix table entries, 16 datatype table entries
    • SMALL_STRICT \u2013 RDF-star and generalized RDF disabled
    • SMALL_GENERALIZED \u2013 RDF-star disabled, generalized RDF enabled
    • SMALL_RDF_STAR \u2013 RDF-star enabled, generalized RDF disabled
    • SMALL_ALL_FEATURES \u2013 RDF-star and generalized RDF enabled (default)
  • Big: 4000 name table entries, 150 prefix table entries, 32 datatype table entries (recommended for larger files)
    • BIG_STRICT
    • BIG_GENERALIZED
    • BIG_RDF_STAR
    • BIG_ALL_FEATURES

To use one of these presets, use the --set CLI option with the https://ostrzyciel.eu/jelly/riot/symbols#preset symbol:

Example: converting a Turtle file to Jelly with a big preset (strict)

./riot --stream=jelly \\\n    --set=\"https://ostrzyciel.eu/jelly/riot/symbols#preset=BIG_STRICT\" \\\n    data.ttl > data.jelly\n

Example: dumping a TDB2 database to Jelly with a big preset (all features)

./tdb2.tdbdump --tdb=path/to/assembler.ttl \\\n    --set=\"https://ostrzyciel.eu/jelly/riot/symbols#preset=BIG_ALL_FEATURES\" \\\n    --stream=jelly > mydb.jelly\n
"},{"location":"user/jena-cli/#see-also","title":"See also","text":"
  • Installing Jelly with Jena
  • Jena CLI documentation
"},{"location":"user/jena/","title":"Apache Jena integration","text":"

This guide explains the functionalities of the jelly-jena module, which provides Jelly support for Apache Jena.

If you just want to add Jelly format support to Apache Jena / Apache Jena Fuseki, you can use the Jelly-JVM plugin JAR. See the dedicated guide for more information.

"},{"location":"user/jena/#base-facilities","title":"Base facilities","text":"

jelly-jena implements the eu.ostrzyciel.jelly.core.ConverterFactory trait in eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory . This factory allows you to build encoders and decoders that convert between Jelly's RdfStreamFrames and Apache Jena's Triple and Quad objects. The eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame class is an object representation of Jelly's binary format.

The module also implements the eu.ostrzyciel.jelly.core.IterableAdapter trait in eu.ostrzyciel.jelly.convert.jena.JenaIterableAdapter . This adapter provides extension methods for Apache Jena's Model, Dataset, Graph, and DatasetGraph classes to convert them into an iterable of triples (.asTriples), quads (.asQuads), or named graphs (.asGraphs). This is useful when working with Jelly on a lower level or when using the jelly-stream module.

"},{"location":"user/jena/#serialization-and-deserialization-with-riot","title":"Serialization and deserialization with RIOT","text":"

jelly-jena implements an RDF writer and reader for Apache Jena's RIOT library. This means you can use Jelly just like, for example, Turtle or RDF/XML. See the example below:

Example: JenaRiot.scala

Source code on GitHub

JenaRiot.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.*\nimport org.apache.jena.rdf.model.ModelFactory\nimport org.apache.jena.riot.{RDFDataMgr, RDFFormat, RDFParser, RDFWriterRegistry, RIOT}\n\nimport java.io.{File, FileOutputStream}\nimport scala.util.Using\n\n/**\n * Example of using Jelly's integration with Apache Jena's RIOT library for\n * writing and reading RDF graphs and datasets to/from disk.\n *\n * See also: https://jena.apache.org/documentation/io/\n */\nobject JenaRiot extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // Load the RDF graph from an N-Triples file\n    val model = RDFDataMgr.loadModel(File(getClass.getResource(\"/weather.nt\").toURI).toURI.toString)\n\n    // Print the size of the model\n    println(s\"Loaded an RDF graph from N-Triples with size: ${model.size}\")\n\n    Using.resource(new FileOutputStream(\"weather.jelly\")) { out =>\n      // Write the model to a Jelly file\n      // Note: by default this will use the [[JellyFormat.JELLY_SMALL_STRICT]] format variant\n      RDFDataMgr.write(out, model, JellyLanguage.JELLY)\n      println(\"Saved the model to a Jelly file\")\n    }\n\n    // Load the RDF graph from a Jelly file\n    val model2 = RDFDataMgr.loadModel(\"weather.jelly\", JellyLanguage.JELLY)\n\n    // Print the size of the model\n    println(s\"Loaded an RDF graph from Jelly with size: ${model2.size}\")\n\n\n\n    // ---------------------------------\n    println(\"\\n\")\n\n    // Try the same with an RDF dataset and some different settings\n    val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n    println(s\"Loaded an RDF dataset from a Trig file with ${dataset.asDatasetGraph.size} named graphs and \" +\n      s\"${dataset.asDatasetGraph.stream.count} quads\")\n\n    Using.resource(new FileOutputStream(\"weather-quads.jelly\")) { out =>\n      // Write the dataset to a 
Jelly file, using the \"BIG\" settings\n      // (better compression for big files, more memory usage)\n      RDFDataMgr.write(out, dataset, JellyFormat.JELLY_BIG_STRICT)\n      println(\"Saved the dataset to a Jelly file\")\n    }\n\n    // Load the RDF dataset from a Jelly file\n    val dataset2 = RDFDataMgr.loadDataset(\"weather-quads.jelly\", JellyLanguage.JELLY)\n    println(s\"Loaded an RDF dataset from Jelly with ${dataset2.asDatasetGraph.size} named graphs and \" +\n      s\"${dataset2.asDatasetGraph.stream.count} quads\")\n\n    // ---------------------------------\n    println(\"\\n\")\n\n    // Custom Jelly format \u2013 change any settings you like\n    val customFormat = new RDFFormat(\n      JellyLanguage.JELLY,\n      JellyFormatVariant(\n        opt = JellyOptions.smallStrict\n          .withMaxPrefixTableSize(0) // disable the prefix table\n          .withStreamName(\"My weather stream\"), // add metadata to the stream\n        frameSize = 16 // make RdfStreamFrames with 16 rows each\n      )\n    )\n\n    // Jena requires us to register the custom format \u2013 once for graphs and once for datasets,\n    // as Jelly supports both.\n    RDFWriterRegistry.register(customFormat, JellyGraphWriterFactory)\n    RDFWriterRegistry.register(customFormat, JellyDatasetWriterFactory)\n\n    Using.resource(new FileOutputStream(\"weather-quads-custom.jelly\")) { out =>\n      // Write the dataset to a Jelly file using the custom format\n      RDFDataMgr.write(out, dataset, customFormat)\n      println(\"Saved the dataset to a Jelly file with custom settings\")\n    }\n\n    // Load the RDF dataset from a Jelly file with the custom format\n    val dataset3 = RDFDataMgr.loadDataset(\"weather-quads-custom.jelly\", JellyLanguage.JELLY)\n    println(s\"Loaded an RDF dataset from Jelly with custom settings with ${dataset3.asDatasetGraph.size} named graphs\" +\n      s\" and ${dataset3.asDatasetGraph.stream.count} quads\")\n\n    // ---------------------------------\n 
   println(\"\\n\")\n\n    // By default, the parser has limits on for example the maximum size of the lookup tables.\n    // The default supported options are [[JellyOptions.defaultSupportedOptions]].\n    // You can change these limits by creating your own options object.\n    val customOptions = JellyOptions.defaultSupportedOptions\n      .withMaxNameTableSize(50) // set the maximum size of the name table to 50\n    // Create a Context object with the custom options\n    val parserContext = RIOT.getContext.copy()\n      .set(JellyLanguage.SYMBOL_SUPPORTED_OPTIONS, customOptions)\n\n    println(\"Trying to load the model with custom supported options...\")\n    val model3 = ModelFactory.createDefaultModel()\n    try\n      // The loading operation should fail because our allowed max name table size is too low\n      RDFParser.create()\n        .source(\"weather.jelly\")\n        .lang(JellyLanguage.JELLY)\n        // Set the context object with the custom options\n        .context(parserContext)\n        .parse(model3)\n    catch\n      case e: RdfProtoDeserializationError =>\n        // The stream uses a name table size of 128, which is larger than the maximum supported size of 50.\n        // To read this stream, set maxNameTableSize to at least 128 in the supportedOptions for this decoder.\n        println(s\"Failed to load the model with custom options: ${e.getMessage}\")\n

Usage notes:

  • eu.ostrzyciel.jelly.core.JellyOptions provides a few common presets for Jelly serialization options that you can use to construct a JellyFormatVariant, as shown in the example above. You can also further customize the serialization options (e.g., dictionary size).
  • The RIOT writer (serializer) integration implements only the delimited variant of Jelly. It is used for writing Jelly to files on disk or sockets. Because of this, you cannot use RIOT to write non-delimited Jelly data (e.g., a single message to a Kafka stream). For this, you should use the jelly-stream module or the more low-level API: Low-level usage.
  • However, the RIOT parser (deserializer) integration will automatically detect if the parsed Jelly data is delimited or not. If it's non-delimited, the parser will assume that there is only one RdfStreamFrame in the file.
  • Jelly's parsers and writers are registered in the eu.ostrzyciel.jelly.convert.jena.riot.JellyLanguage object (source code). This registration should happen automatically when you include the jelly-jena module in your project, using Jena's component initialization mechanism.
"},{"location":"user/jena/#streaming-serialization-with-riot","title":"Streaming serialization with RIOT","text":"

jelly-jena also implements a streaming writer (StreamRDF API in Jena). Using it is similar to the regular RIOT writer, with a slightly different setup:

Example: JenaRiotStreaming.scala

Source code on GitHub

JenaRiotStreaming.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.PhysicalStreamType\nimport org.apache.jena.graph.{NodeFactory, Triple}\nimport org.apache.jena.riot.system.{StreamRDFLib, StreamRDFWriter}\nimport org.apache.jena.riot.{RDFDataMgr, RDFParser, RIOT}\n\nimport java.io.{File, FileOutputStream}\nimport scala.util.Using\n\n/**\n * Example of using Apache Jena's streaming IO API with Jelly.\n *\n * See also: https://jena.apache.org/documentation/io/streaming-io.html\n */\nobject JenaRiotStreaming extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // Initialize a Jena StreamRDF to consume the statements\n    val readerStream = StreamRDFLib.count()\n\n    println(\"Reading a stream of triples from a Jelly file...\")\n\n    // Parse a Jelly file as a stream of triples\n    val inputFileTriples = new File(getClass.getResource(\"/jelly/weather.jelly\").toURI)\n    RDFParser\n      .source(inputFileTriples.toURI.toString)\n      .lang(JellyLanguage.JELLY)\n      .parse(readerStream)\n\n    println(f\"Read ${readerStream.countTriples()} triples\")\n    println()\n    println(\"Reading a stream of quads from a Jelly file...\")\n\n    // Parse a different Jelly file as a stream of quads and send it to the same sink\n    val inputFileQuads = new File(getClass.getResource(\"/jelly/weather-quads.jelly\").toURI)\n    RDFParser\n      .source(inputFileQuads.toURI.toString)\n      .lang(JellyLanguage.JELLY)\n      .parse(readerStream)\n\n    // Print the number of triples and quads\n    //\n    // The number of triples here is the sum of the triples from the first file and the triples\n    // in the default graph of the second file. 
This is just how Jena handles it.\n    println(f\"Read ${readerStream.countTriples()} triples (in total)\" +\n      f\" and ${readerStream.countQuads()} quads\")\n\n    // -------------------------------------\n    println(\"\\n\")\n\n    println(\"Writing a stream of 10 triples to a file...\")\n\n    // Try writing some triples to a file\n    // We need to create an instance of RdfStreamOptions to pass to the writer:\n    val options = JellyOptions.smallStrict\n      // The stream writer does not know if we will be writing triples or quads \u2013 we\n      // have to specify the physical stream type explicitly.\n      .withPhysicalType(PhysicalStreamType.TRIPLES)\n      .withStreamName(\"A stream of 10 triples\")\n\n    // To pass the options, we use Jena's Context mechanism\n    val context = RIOT.getContext.copy()\n      .set(JellyLanguage.SYMBOL_STREAM_OPTIONS, options)\n      .set(JellyLanguage.SYMBOL_FRAME_SIZE, 128) // optional, default is 256\n\n    Using.resource(new FileOutputStream(\"stream-riot.jelly\")) { out =>\n      // Create the writer \u2013 remember to pass the context!\n      val writerStream = StreamRDFWriter.getWriterStream(out, JellyLanguage.JELLY, context)\n      writerStream.start()\n\n      for i <- 1 to 10 do\n        writerStream.triple(Triple.create(\n          NodeFactory.createBlankNode(),\n          NodeFactory.createURI(\"https://example.org/p\"),\n          NodeFactory.createLiteralString(s\"object $i\")\n        ))\n\n      writerStream.finish()\n    }\n\n    println(\"Done writing triples\")\n\n    // Load the RDF graph that we just saved using normal RIOT API\n    val model = RDFDataMgr.loadModel(\"stream-riot.jelly\", JellyLanguage.JELLY)\n\n    println(\"Loaded the stream from disk, contents:\\n\")\n    model.write(System.out, \"NT\")\n
"},{"location":"user/jena/#see-also","title":"See also","text":"
  • Useful utilities
  • Reactive streaming with Jelly-JVM
  • Using Jelly with Jena's CLI tools
"},{"location":"user/low-level/","title":"Low-level usage","text":"

Warning

This page describes a low-level API that is a bit of a hassle to use directly. It's recommended to use the higher-level abstractions provided by the jelly-stream module, or the integrations with Apache Jena's RIOT or RDF4J's Rio libraries. If you really want to use this, it is highly recommended that you first get a basic understanding of how Jelly works under the hood and take a look at the code in the jelly-stream module to see how it's done there.

Note

The following guide uses the Apache Jena library as an example. The exact same thing can be done with RDF4J or any other RDF library that has a Jelly integration.

"},{"location":"user/low-level/#deserialization","title":"Deserialization","text":"

To parse a serialized stream frame into triples/quads:

  1. Call eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame.parseFrom if it's a non-delimited frame (like you would see, e.g., in a Kafka or gRPC stream), or parseDelimitedFrom if it's a delimited stream (like you would see in a file or a socket).
    • There is also a utility method to detect if the stream is delimited or not: eu.ostrzyciel.jelly.core.IoUtils.autodetectDelimiting . In most cases you will not need to use it. It is used internally by the Jena and RDF4J integrations for user convenience.
  2. Obtain a decoder that turns RdfStreamFrames into triples/quads: eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory has different methods for different physical stream types:
    • anyStatementDecoder for any physical stream type, outputs Triple or Quad
    • triplesDecoder for TRIPLES streams, outputs Triple
    • quadsDecoder for QUADS streams, outputs Quad
    • graphsDecoder for GRAPHS streams, outputs (Node, Iterable[Triple])
    • graphsAsQuadsDecoder for GRAPHS streams, outputs Quad
  3. For each row in the frame, call the decoder's ingestRow method to get the output iteratively.
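
To make the delimited versus non-delimited distinction in step 1 more concrete, here is a minimal sketch of length-delimited framing, in which every message is preceded by its byte length encoded as a Protobuf-style varint. It uses only java.io and plain byte arrays in place of real frames. The names writeDelimited, readDelimited, and framingDemo are hypothetical and exist only for this illustration; this is not the code used by Jelly-JVM or Protobuf, and in practice you would call parseDelimitedFrom instead.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, EOFException, InputStream, OutputStream}

// Hypothetical helper: write a varint length prefix, then the payload bytes.
def writeDelimited(out: OutputStream, payload: Array[Byte]): Unit =
  var n = payload.length
  while n > 0x7F do
    // Emit the low 7 bits and set the continuation bit
    out.write((n & 0x7F) | 0x80)
    n >>>= 7
  out.write(n)
  out.write(payload)

// Hypothetical helper: read one message, or None at a clean end of stream.
def readDelimited(in: InputStream): Option[Array[Byte]] =
  var b = in.read()
  if b == -1 then None
  else
    var length = b & 0x7F
    var shift = 7
    // Keep reading while the continuation bit is set
    while b >= 0x80 do
      b = in.read()
      if b == -1 then throw EOFException()
      length |= (b & 0x7F) << shift
      shift += 7
    Some(in.readNBytes(length))

@main def framingDemo(): Unit =
  val buf = ByteArrayOutputStream()
  writeDelimited(buf, Array[Byte](1, 2, 3))
  writeDelimited(buf, Array.fill[Byte](300)(7))
  val in = ByteArrayInputStream(buf.toByteArray)
  // Messages come back one by one, in the order they were written
  assert(readDelimited(in).get.toSeq == Seq[Byte](1, 2, 3))
  assert(readDelimited(in).get.length == 300)
  assert(readDelimited(in).isEmpty)
```

A non-delimited frame has no such prefix, which is why it only makes sense where the transport already marks message boundaries (e.g., a single Kafka message or a gRPC message).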
"},{"location":"user/low-level/#serialization","title":"Serialization","text":"

To serialize triples/quads into a stream frame:

  1. If you want to serialize an RDF graph/dataset, first transform it into triples/quads in an iterable form. Use the asTriples/asQuads/asGraphs extension methods provided by the eu.ostrzyciel.jelly.convert.jena.JenaIterableAdapter object.
  2. Obtain an encoder that turns triples/quads into RdfStreamRows (the rows of a stream frame): use the eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory.encoder method to get an instance of eu.ostrzyciel.jelly.convert.jena.JenaProtoEncoder .
  3. Call the encoder's methods to add quads, triples, or named graphs to the stream frame.
    • Note that YOU are responsible for sticking to a specific physical stream type. For example, you should not mix triples with quads. It is highly recommended that you first read about the available stream types in Jelly.
    • You are also responsible for setting the appropriate stream options with proper stream types. See the guide on Jelly options presets for more information.
  4. The encoder will return batches of rows. You are responsible for grouping those rows logically into RdfStreamFrames. What you do here depends highly on the logical stream type you are working with.
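
For a flat stream, one simple way to handle step 4 is to cut the rows into frames of a fixed maximum size. The sketch below shows only that grouping logic. Row and Frame are hypothetical stand-ins for RdfStreamRow and RdfStreamFrame, and groupIntoFrames is an illustrative helper, not part of Jelly-JVM. Grouped streams (e.g., one frame per graph) need different logic.

```scala
// Hypothetical stand-ins for RdfStreamRow and RdfStreamFrame, for illustration only
final case class Row(id: Int)
final case class Frame(rows: Seq[Row])

// Group a stream of rows into frames of at most maxRows rows each.
// The last frame may be smaller than maxRows.
def groupIntoFrames(rows: Iterator[Row], maxRows: Int): Iterator[Frame] =
  rows.grouped(maxRows).map(batch => Frame(batch))

@main def groupingDemo(): Unit =
  val rows = Iterator.range(0, 5).map(i => Row(i))
  val frameSizes = groupIntoFrames(rows, 2).map(_.rows.size).toList
  // Five rows with a limit of two per frame yield frames of 2, 2, and 1 rows
  assert(frameSizes == List(2, 2, 1))
```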
"},{"location":"user/low-level/#see-also","title":"See also","text":"
  • Useful utilities
  • Reactive streaming with Jelly-JVM
  • Implementing Jelly-JVM for a new RDF library
"},{"location":"user/rdf4j/","title":"RDF4J integration","text":"

This guide explains the functionalities of the jelly-rdf4j module, which provides Jelly support for Eclipse RDF4J.

If you just want to add Jelly format support to your RDF4J application, you can use the Jelly-JVM plugin JAR. See the dedicated guide for more information.

"},{"location":"user/rdf4j/#base-facilities","title":"Base facilities","text":"

jelly-rdf4j implements the eu.ostrzyciel.jelly.core.ConverterFactory trait in eu.ostrzyciel.jelly.convert.rdf4j.Rdf4jConverterFactory . This factory allows you to build encoders and decoders that convert between Jelly's RdfStreamFrames and RDF4J's Statement objects. The eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame class is an object representation of Jelly's binary format.

The module also implements the eu.ostrzyciel.jelly.core.IterableAdapter trait in eu.ostrzyciel.jelly.convert.rdf4j.Rdf4jIterableAdapter . This adapter provides extension methods for RDF4J's Model class to convert it into an iterable of triples (.asTriples), quads (.asQuads), or named graphs (.asGraphs). This is useful when working with Jelly on a lower level or when using the jelly-stream module.

"},{"location":"user/rdf4j/#serialization-and-deserialization-with-rdf4j-rio","title":"Serialization and deserialization with RDF4J Rio","text":"

jelly-rdf4j implements an RDF writer and parser for Eclipse RDF4J's Rio library. This means you can use Jelly just like any other RDF serialization format (e.g., RDF/XML, Turtle). See the example below:

Example: Rdf4jRio.scala

Source code on GitHub

Rdf4jRio.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.rdf4j.rio.*\nimport eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.{PhysicalStreamType, RdfStreamOptions}\nimport org.eclipse.rdf4j.model.Statement\nimport org.eclipse.rdf4j.rio.helpers.StatementCollector\nimport org.eclipse.rdf4j.rio.{RDFFormat, Rio}\n\nimport java.io.{File, FileOutputStream}\nimport scala.jdk.CollectionConverters.*\nimport scala.util.Using\n\n/**\n * Example of using RDF4J's Rio library to read and write RDF data.\n *\n * See also: https://rdf4j.org/documentation/programming/rio/\n */\nobject Rdf4jRio extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // Load the RDF graph from an N-Triples file\n    // (N-Triples is a subset of Turtle, so the Turtle parser can read it)\n    val inputFile = File(getClass.getResource(\"/weather.nt\").toURI)\n    val triples = readRdf4j(inputFile, RDFFormat.TURTLE, None)\n\n    // Print the size of the graph\n    println(s\"Loaded ${triples.size} triples from an N-Triples file\")\n\n    // Write the RDF graph to a Jelly file\n    // First, create the stream's options:\n    val options = JellyOptions.smallStrict\n      // Setting the physical stream type is mandatory! 
It will always be either TRIPLES or QUADS.\n      .withPhysicalType(PhysicalStreamType.TRIPLES)\n      // Set other optional options\n      .withStreamName(\"My weather data\")\n    // Create the config object to pass to the writer\n    val config = JellyWriterSettings.configFromOptions(options, frameSize = 128)\n\n    // Do the actual writing\n    Using.resource(new FileOutputStream(\"weather.jelly\")) { out =>\n      val writer = Rio.createWriter(JELLY, out)\n      writer.setWriterConfig(config)\n      writer.startRDF()\n      triples.foreach(writer.handleStatement)\n      writer.endRDF()\n    }\n\n    println(\"Saved the model to a Jelly file\")\n\n    // Load the RDF graph from the Jelly file\n    val jellyFile = File(\"weather.jelly\")\n    val jellyTriples = readRdf4j(jellyFile, JELLY, None)\n\n    // Print the size of the graph\n    println(s\"Loaded ${jellyTriples.size} triples from a Jelly file\")\n\n    // ---------------------------------\n    println(\"\\n\")\n    // By default, the parser has limits on for example the maximum size of the lookup tables.\n    // The default supported options are [[JellyOptions.defaultSupportedOptions]].\n    // You can change these limits by creating your own options object.\n    val customOptions = JellyOptions.defaultSupportedOptions\n      .withMaxPrefixTableSize(10) // set the maximum size of the prefix table to 10\n    println(\"Trying to read the Jelly file with custom options...\")\n    try\n      // This operation should fail because the Jelly file uses a prefix table larger than 10\n      val customTriples = readRdf4j(jellyFile, JELLY, Some(customOptions))\n    catch\n      case e: RdfProtoDeserializationError =>\n        // The stream uses a prefix table size of 16, which is larger than the maximum supported size of 10.\n        // To read this stream, set maxPrefixTableSize to at least 16 in the supportedOptions for this decoder.\n        println(s\"Failed to read the Jelly file with custom options: 
${e.getMessage}\")\n\n\n  /**\n   * Helper function to read RDF data using RDF4J's Rio library.\n   * @param file file to read from\n   * @param format RDF format\n   * @param supportedOptions supported options for reading Jelly streams (optional)\n   * @return sequence of RDF statements\n   */\n  private def readRdf4j(file: File, format: RDFFormat, supportedOptions: Option[RdfStreamOptions]): Seq[Statement] =\n    val parser = Rio.createParser(format)\n    val collector = new StatementCollector()\n    parser.setRDFHandler(collector)\n    supportedOptions.foreach(opt =>\n      // If the user provided supported options, set them on the parser\n      parser.setParserConfig(JellyParserSettings.configFromOptions(opt))\n    )\n    Using.resource(file.toURI.toURL.openStream()) { is =>\n      parser.parse(is)\n    }\n    collector.getStatements.asScala.toSeq\n

Usage notes:

  • eu.ostrzyciel.jelly.core.JellyOptions provides a few common presets for Jelly serialization options. These options are passed through eu.ostrzyciel.jelly.convert.rdf4j.rio.JellyWriterSettings.configFromOptions and used to configure the writer, as shown in the example above. You can also further customize the serialization options (e.g., dictionary size).
  • The RDF4J Rio writer (serializer) integration implements only the delimited variant of Jelly. It is used for writing Jelly to files on disk or sockets. Because of this, you cannot use Rio to write non-delimited Jelly data (e.g., a single message to a Kafka stream). For this, you should use the jelly-stream module or the more low-level API: Low-level usage.
  • However, the Rio parser (deserializer) integration will automatically detect if the parsed Jelly data is delimited or not. If it's non-delimited, the parser will assume that there is only one RdfStreamFrame in the file.
  • Jelly's parsers and writers are in the eu.ostrzyciel.jelly.convert.rdf4j.rio package (source code). They are automatically registered on startup using the RDFParserFactory and RDFWriterFactory SPIs provided by RDF4J.
"},{"location":"user/rdf4j/#see-also","title":"See also","text":"
  • Useful utilities
  • Reactive streaming with Jelly-JVM
"},{"location":"user/reactive/","title":"User guide \u2013 reactive streaming","text":"

This guide explains the reactive streaming functionalities of the jelly-stream module.

Prerequisites

If you are unfamiliar with the concept of reactive streams or Apache Pekko Streams, we highly recommend you start by reading about the basic concepts of Pekko Streams.

We also recommend you first read about the RDF stream types in Jelly. Otherwise, this guide may not make much sense.

You can use jelly-stream with any RDF library that has a Jelly integration, such as Apache Jena (using jelly-jena) or RDF4J (using jelly-rdf4j). The streaming API is generic and identical across all libraries.

"},{"location":"user/reactive/#basic-concepts","title":"Basic concepts","text":"

The key notions of this API are encoders and decoders.

  • An encoder turns objects from your RDF library of choice (e.g., Triple in Apache Jena) into an object representation of Jelly's binary format (RdfStreamFrame).
  • A decoder does the opposite: it turns RdfStreamFrames into objects from your RDF library of choice.

So, for example, an encoder flow for flat triple streams would have a type of Flow[Triple, RdfStreamFrame, NotUsed] in Apache Jena. The opposite (a flat triple stream decoder) would have a type of Flow[RdfStreamFrame, Triple, NotUsed].

RdfStreamFrames can be converted to and from raw bytes using a range of methods, depending on your use case. See the sections below for examples.

"},{"location":"user/reactive/#encoding-a-single-rdf-graph-or-dataset-as-a-flat-stream-encodersource","title":"Encoding a single RDF graph or dataset as a flat stream (EncoderSource)","text":"

The easiest way to start is with flat RDF streams (i.e., flat streams of triples or quads). You can convert an RDF dataset or graph into such a stream using the methods in eu.ostrzyciel.jelly.stream.EncoderSource.

Example: PekkoStreamsEncoderSource.scala (click to expand)

Source code on GitHub

PekkoStreamsEncoderSource.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.EncoderSource]] utility to convert RDF graphs and datasets\n * into Jelly streams with a single method call.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsEncoderSource extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // We will need a Pekko actor system to run the streams\n    given actorSystem: ActorSystem = ActorSystem()\n    // And an execution context for the futures\n    given ExecutionContext = actorSystem.getDispatcher\n\n    // Load an example RDF graph from an N-Triples file\n    val model = RDFDataMgr.loadModel(File(getClass.getResource(\"/weather.nt\").toURI).toURI.toString)\n\n    println(s\"Loaded model with ${model.size()} triples\")\n    println(s\"Streaming the model to memory...\")\n\n    // Create a Pekko Streams Source from the Jena model\n    // This automatically sets the physical and logical stream types.\n    val encodedModelFuture = EncoderSource\n      .fromGraph(\n        model,\n        // Aim for frames with ~2000 bytes \u2013 may be more!\n        ByteSizeLimiter(2000),\n        JellyOptions.smallStrict,\n      )\n      // wireTap: print the size of the frames\n      // Notice in the output that the frames are slightly bigger than 2000 bytes.\n      .wireTap(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes on wire\"))\n      // 
Convert each stream frame to bytes\n      .via(JellyIo.toBytes)\n      // Collect the stream into a sequence\n      .runWith(Sink.seq)\n\n    // Wait for the stream to complete and collect the result\n    val encodedModel = Await.result(encodedModelFuture, 10.seconds)\n\n    println(s\"Streamed model to memory with ${encodedModel.size} frames and\" +\n      s\" ${encodedModel.map(_.length).sum} bytes on wire\")\n\n    println(\"\\n\")\n\n    // -------------------------------------------------------------------\n    // Second example: try encoding an RDF dataset as a GRAPHS stream\n    val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n    println(s\"Loaded dataset with ${dataset.asDatasetGraph.size} named graphs\")\n    println(s\"Streaming the dataset to memory...\")\n\n    val encodedDatasetFuture = EncoderSource\n      // Here we stream this as a GRAPHS stream (physical type)\n      // You can also use .fromDatasetAsQuads to stream as QUADS\n      .fromDatasetAsGraphs(\n        dataset,\n        // This time we limit the number of rows in each frame to 30\n        // Note that for this particular encoder, we can skip the limiter entirely \u2013 but this can lead to huge frames!\n        // So, be careful with that, or you may get an out-of-memory error.\n        Some(StreamRowCountLimiter(30)),\n        JellyOptions.smallStrict,\n      )\n      // wireTap: print the size of the frames\n      // Note that some frames are smaller than the limit \u2013 this is because this encoder will always split frames\n      // on graph boundaries.\n      .wireTap(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes on wire\"))\n      // Convert each stream frame to bytes\n      .via(JellyIo.toBytes)\n      // Collect the stream into a sequence\n      .runWith(Sink.seq)\n\n    // Wait for the stream to complete and collect the result\n    val encodedDataset = 
Await.result(encodedDatasetFuture, 10.seconds)\n\n    println(s\"Streamed dataset to memory with ${encodedDataset.size} frames and\" +\n      s\" ${encodedDataset.map(_.length).sum} bytes on wire\")\n\n    actorSystem.terminate()\n
"},{"location":"user/reactive/#encoding-any-rdf-data-as-a-flat-or-grouped-stream-encoderflow","title":"Encoding any RDF data as a flat or grouped stream (EncoderFlow)","text":"

The eu.ostrzyciel.jelly.stream.EncoderFlow provides even more options for turning RDF data into Jelly streams, including both grouped and flat streams. Every type of RDF stream in Jelly can be created using this API.

Example: PekkoStreamsEncoderFlow.scala (click to expand)

Source code on GitHub

PekkoStreamsEncoderFlow.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.EncoderFlow]] utility to encode RDF data as Jelly streams.\n * \n * Here, the RDF data is turned into a series of byte buffers, with each buffer corresponding to exactly one frame.\n * This is suitable if your streaming protocol (e.g., Kafka, MQTT, AMQP) already frames the messages.\n * If you are writing to a raw socket or file, then you must use the DELIMITED variant of Jelly instead.\n * See [[eu.ostrzyciel.jelly.examples.PekkoStreamsWithIo]] for examples of that.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsEncoderFlow extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // We will need a Pekko actor system to run the streams\n    given actorSystem: ActorSystem = ActorSystem()\n    // And an execution context for the futures\n    given ExecutionContext = actorSystem.getDispatcher\n\n    // Load the example dataset\n    val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n    // First, let's see what views of the dataset can we obtain using Jelly's Iterable adapters:\n    // 1. Iterable of all quads in the dataset\n    val quads: immutable.Iterable[Quad] = dataset.asQuads\n    // 2. 
Iterable of all graphs (named and default) in the dataset\n    val graphs: immutable.Iterable[(Node, Iterable[Triple])] = dataset.asGraphs\n    // 3. Iterable of all triples in the default graph\n    val triples: immutable.Iterable[Triple] = dataset.getDefaultModel.asTriples\n\n    // Note: here we are not turning the frames into bytes, but just printing their size in bytes.\n    // You can find an example of how to turn a frame into a byte array in the `PekkoStreamsEncoderSource` example.\n    // This is done with: .via(JellyIo.toBytes)\n\n    // Let's try encoding this as flat RDF streams (streams of triples or quads)\n    // https://w3id.org/stax/ontology#flatQuadStream\n    println(f\"Encoding ${quads.size} quads as a flat RDF quad stream\")\n    val flatQuadsFuture = Source(quads)\n      .via(EncoderFlow.flatQuadStream(\n        // This encoder requires a size limiter \u2013 otherwise a stream frame could have infinite length!\n        StreamRowCountLimiter(20),\n        JellyOptions.smallStrict,\n      ))\n      .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n    Await.ready(flatQuadsFuture, 10.seconds)\n\n    // https://w3id.org/stax/ontology#flatTripleStream\n    println(f\"\\n\\nEncoding ${triples.size} triples as a flat RDF triple stream\")\n    val flatTriplesFuture = Source(triples)\n      .via(EncoderFlow.flatTripleStream(\n        // This encoder requires a size limiter \u2013 otherwise a stream frame could have infinite length!\n        ByteSizeLimiter(500),\n        JellyOptions.smallStrict,\n      ))\n      .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n    Await.ready(flatTriplesFuture, 10.seconds)\n\n    // We can also stream already grouped triples or quads \u2013 for example, if your system generates batches of\n    // N triples, you can just send those batches straight to be encoded, with one batch = one 
stream frame.\n    // https://w3id.org/stax/ontology#flatQuadStream\n    println(f\"\\n\\nEncoding ${quads.size} quads as a flat RDF quad stream, grouped in batches of 10\")\n    // First, group the quads into batches of 10\n    val groupedQuadsFuture = Source.fromIterator(() => quads.grouped(10))\n      .via(EncoderFlow.flatQuadStreamGrouped(\n        // Do not use a size limiter here \u2013 we want exactly one batch in each frame\n        None,\n        JellyOptions.smallStrict,\n      ))\n      .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n    Await.ready(groupedQuadsFuture, 10.seconds)\n\n    // Now, let's try grouped streams. Let's say we want to stream all graphs in a dataset, but put exactly one\n    // graph in each frame (message). This is very common in (for example) IoT systems.\n    // https://w3id.org/stax/ontology#namedGraphStream\n    println(f\"\\n\\nEncoding ${graphs.size} graphs as a named graph stream\")\n    val namedGraphsFuture = Source(graphs)\n      .via(EncoderFlow.namedGraphStream(\n        // Do not use a size limiter here \u2013 we want exactly one graph in each frame\n        None,\n        JellyOptions.smallStrict,\n      ))\n      // Note that we will see exactly as many frames as there are graphs in the dataset\n      .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n    Await.ready(namedGraphsFuture, 10.seconds)\n\n    // As a last example, we will stream a series of RDF graphs. In our case this will be just the default graph\n    // repeated a few times. 
This type of stream is also pretty common in practical applications.\n    // https://w3id.org/stax/ontology#graphStream\n    println(f\"\\n\\nEncoding 5 RDF graphs as a graph stream\")\n    val graphsFuture = Source.repeat(triples)\n      .take(5)\n      .via(EncoderFlow.graphStream(\n        // Do not use a size limiter here \u2013 we want exactly one graph in each frame\n        None,\n        JellyOptions.smallStrict,\n      ))\n      // Note that we will see exactly 5 frames \u2013 the number of graphs we streamed\n      .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n    Await.ready(graphsFuture, 10.seconds)\n\n    actorSystem.terminate()\n
"},{"location":"user/reactive/#decoding-rdf-streams-decoderflow","title":"Decoding RDF streams (DecoderFlow)","text":"

The eu.ostrzyciel.jelly.stream.DecoderFlow provides methods for decoding flat and grouped streams. There is no decoding counterpart to EncoderSource, though. Such a utility would have to construct an RDF graph or dataset from statements, and that process can vary a lot depending on your application, so you will have to do this part yourself.

Example: PekkoStreamsDecoderFlow.scala (click to expand)

Source code on GitHub

PekkoStreamsDecoderFlow.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.query.Dataset\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.DecoderFlow]] utility to turn incoming Jelly streams\n * into usable RDF data.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsDecoderFlow extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // We will need a Pekko actor system to run the streams\n    given actorSystem: ActorSystem = ActorSystem()\n    // And an execution context for the futures\n    given ExecutionContext = actorSystem.getDispatcher\n\n    // Load the example dataset\n    val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n    // To decode something, we first need to encode it...\n    // See [[PekkoStreamsEncoderFlow]] and [[PekkoStreamsEncoderSource]] for an explanation of what is happening here.\n    // We have three sequences of byte arrays, with each byte array corresponding to one encoded stream frame:\n    // - encodedQuads: a flat RDF quad stream, physical type: QUADS\n    // - encodedTriples: a flat RDF triple stream, physical type: TRIPLES\n    // - encodedGraphs: a flat RDF quad stream, physical type: GRAPHS\n    val (encodedQuads, encodedTriples, encodedGraphs) = getEncodedData(dataset)\n\n    // Now we 
can decode the encoded data back into something useful.\n    // Let's start by simply decoding the quads as a flat RDF quad stream:\n    println(\"Decoding quads as a flat RDF quad stream...\")\n    val decodedQuadsFuture = Source(encodedQuads)\n      // We need to parse the bytes into a Jelly stream frame\n      .via(JellyIo.fromBytes)\n      // And then decode the frame into Jena quads.\n      // We use \"decodeQuads\" because the physical stream type is QUADS.\n      // And then we want to treat it as a flat RDF quad stream, so we call \"asFlatQuadStreamStrict\".\n      // We use the \"Strict\" method to tell the decoder to check if the incoming logical stream type is the same\n      // as we are expecting: flat RDF quad stream.\n      .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n      .runWith(Sink.seq)\n\n    val decodedQuads: Seq[Quad] = Await.result(decodedQuadsFuture, 10.seconds)\n    println(s\"Decoded ${decodedQuads.size} quads.\")\n\n    // We can also treat each stream frame as a separate dataset. 
This way we would get an\n    // RDF dataset stream.\n    println(f\"\\n\\nDecoding quads as an RDF dataset stream from ${encodedQuads.size} frames...\")\n    val decodedDatasetFuture = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      // Note that we cannot use the strict variant (asDatasetStreamOfQuadsStrict) here, because the stream says its\n      // logical type is flat RDF quad stream.\n      .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuads)\n      .runWith(Sink.seq)\n\n    val decodedDatasets: Seq[IterableOnce[Quad]] = Await.result(decodedDatasetFuture, 10.seconds)\n    println(s\"Decoded ${decodedDatasets.size} datasets with\" +\n      s\" ${decodedDatasets.map(_.iterator.size).sum} quads in total.\")\n\n    // If we tried that with the strict variant, we would get an exception:\n    println(f\"\\n\\nDecoding quads as an RDF dataset stream with strict logical type handling...\")\n    val future = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuadsStrict)\n      .runWith(Sink.seq)\n    Await.result(future.recover {\n      // eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n      // Expected logical stream type LOGICAL_STREAM_TYPE_DATASETS, got LOGICAL_STREAM_TYPE_FLAT_QUADS.\n      // LOGICAL_STREAM_TYPE_FLAT_QUADS is not a subtype of LOGICAL_STREAM_TYPE_DATASETS.\n      case e: Exception => println(e.getCause)\n    }, 10.seconds)\n\n    // We can also pass entirely custom supported options to the decoder, instead of the defaults\n    // (see [[JellyOptions.defaultSupportedOptions]]). 
This is useful if we want to decode a stream with\n    // for example very large lookup tables or we want to put stricter limits on the streams that we accept.\n    println(f\"\\n\\nDecoding quads as an RDF dataset stream with custom supported options...\")\n    val customSupportedOptions = JellyOptions.defaultSupportedOptions\n      .withMaxNameTableSize(50) // This is too small for the stream we are decoding\n    val customSupportedOptionsFuture = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuads(customSupportedOptions))\n      .runWith(Sink.seq)\n    Await.result(customSupportedOptionsFuture.recover {\n      // eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n      // The stream uses a name table size of 128, which is larger than the maximum supported size of 50.\n      // To read this stream, set maxNameTableSize to at least 128 in the supportedOptions for this decoder.\n      case e: Exception => println(e.getCause)\n    }, 10.seconds)\n\n    // Flat RDF triple stream\n    println(f\"\\n\\nDecoding triples as a flat RDF triple stream...\")\n    val decodedTriplesFuture = Source(encodedTriples)\n      .via(JellyIo.fromBytes)\n      .via(DecoderFlow.decodeTriples.asFlatTripleStreamStrict)\n      .runWith(Sink.seq)\n\n    val decodedTriples: Seq[Triple] = Await.result(decodedTriplesFuture, 10.seconds)\n    println(s\"Decoded ${decodedTriples.size} triples.\")\n\n    // We can interpret the GRAPHS stream in a few ways, see\n    // [[eu.ostrzyciel.jelly.stream.DecoderFlow.GraphsIngestFlowOps]] for more details.\n    // Here we will treat it as an RDF named graph stream.\n    println(f\"\\n\\nDecoding graphs as an RDF named graph stream...\")\n    val decodedGraphsFuture = Source(encodedGraphs)\n      .via(JellyIo.fromBytes)\n      // Non-strict because the original logical stream type is flat RDF quad stream.\n      .via(DecoderFlow.decodeGraphs.asNamedGraphStream)\n      
.runWith(Sink.seq)\n\n    val decodedGraphs: Seq[(Node, Iterable[Triple])] = Await.result(decodedGraphsFuture, 10.seconds)\n    println(s\"Decoded ${decodedGraphs.size} graphs.\")\n\n    // If we tried using a decoder for a physical stream type that does not match the type of the stream,\n    // we would get an exception. Here let's try to decode a QUADS stream with a TRIPLES decoder.\n    println(f\"\\n\\nDecoding quads as a flat RDF triple stream...\")\n    val future2 = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      // Note the \"decodeTriples\" here\n      .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n      .runWith(Sink.seq)\n    Await.result(future2.recover {\n      // eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n      // Incoming stream type is not TRIPLES.\n      case e: Exception => println(e.getCause)\n    }, 10.seconds)\n\n    // We can get around this by using the \"decodeAny\" method, which will pick the appropriate decoder\n    // based on the stream options in the stream.\n    // In this case we can only ask the decoder to output a flat or grouped RDF stream.\n    println(f\"\\n\\nDecoding quads as a flat RDF stream using decodeAny...\")\n    val decodedAnyFuture = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      // There is no strict variant at all for decodeAny, as we don't care about the stream type anyway.\n      .via(DecoderFlow.decodeAny.asFlatStream)\n      .runWith(Sink.seq)\n\n    val decodedAny: Seq[Triple | Quad] = Await.result(decodedAnyFuture, 10.seconds)\n    println(s\"Decoded ${decodedAny.size} statements.\")\n\n    // One last trick up our sleeves is the snoopStreamOptions method, which allows us to inspect the stream options\n    // and carry on with the decoding as normal.\n    // In this case, we will reuse the first example (flat RDF quad stream) and snoop the stream options.\n    println(f\"\\n\\nSnooping the stream options of the first frame while decoding a flat RDF quad 
stream...\")\n    val snoopFuture = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      // We add a .viaMat here to capture the materialized value of this stage.\n      .viaMat(DecoderFlow.snoopStreamOptions)(Keep.right)\n      .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n      .toMat(Sink.seq)(Keep.both)\n      .run()\n\n    val streamOptions = Await.result(snoopFuture._1, 10.seconds)\n    val decodedQuads2 = Await.result(snoopFuture._2, 10.seconds)\n\n    val streamOptionsIndented = (\"\\n\" + streamOptions.get.toProtoString.strip).replace(\"\\n\", \"\\n  \")\n    println(s\"Stream options: $streamOptionsIndented\")\n    println(s\"Decoded ${decodedQuads2.size} quads.\")\n\n    actorSystem.terminate()\n\n\n  /**\n   * Helper method to produce encoded data from a dataset.\n   */\n  private def getEncodedData(dataset: Dataset)(using ActorSystem, ExecutionContext):\n  (Seq[Array[Byte]], Seq[Array[Byte]], Seq[Array[Byte]]) =\n    val quadStream = EncoderSource.fromDatasetAsQuads(\n      dataset,\n      ByteSizeLimiter(500),\n      JellyOptions.smallStrict\n    )\n    val tripleStream = EncoderSource.fromGraph(\n      dataset.getDefaultModel,\n      ByteSizeLimiter(250),\n      JellyOptions.smallStrict\n    )\n    val graphStream = EncoderSource.fromDatasetAsGraphs(\n      dataset,\n      None,\n      JellyOptions.smallStrict\n    )\n    val results = Seq(quadStream, tripleStream, graphStream).map { stream =>\n      val streamFuture = stream\n        .via(JellyIo.toBytes)\n        .runWith(Sink.seq)\n      Await.result(streamFuture, 10.seconds)\n    }\n    (results.head, results(1), results(2))\n
"},{"location":"user/reactive/#byte-streams-delimited-variant","title":"Byte streams (delimited variant)","text":"

In all of the examples above, we used the non-delimited variant of Jelly, which is appropriate for, e.g., sending Jelly data over gRPC or Kafka. If you want to write Jelly data to a file or a socket, you will need to use the delimited variant. jelly-stream provides a few methods for this in eu.ostrzyciel.jelly.stream.JellyIo.
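To make the difference concrete: a delimited byte stream prefixes each frame with its length, so a reader can tell where one frame ends and the next begins. The following is a minimal standard-library sketch of that idea, assuming a Protocol Buffers style varint length prefix. DelimitedFramingSketch and its methods are hypothetical names used for illustration only and are not part of Jelly-JVM; in real code, use the JellyIo methods instead.

```scala
import java.io.{EOFException, InputStream, OutputStream}

// Hypothetical helper for illustration only, not part of Jelly-JVM.
object DelimitedFramingSketch:
  /** Writes the payload length as a base-128 varint, followed by the payload. */
  def writeDelimited(payload: Array[Byte], out: OutputStream): Unit =
    var n = payload.length
    // Emit 7 bits at a time; the high bit marks that more length bytes follow
    while (n & ~0x7f) != 0 do
      out.write((n & 0x7f) | 0x80)
      n >>>= 7
    out.write(n)
    out.write(payload)

  /** Reads one length-prefixed payload, or None at a clean end of input. */
  def readDelimited(in: InputStream): Option[Array[Byte]] =
    val first = in.read()
    if first == -1 then None
    else
      var length = first & 0x7f
      var shift = 7
      var b = first
      while (b & 0x80) != 0 do
        b = in.read()
        if b == -1 then throw new EOFException()
        length |= (b & 0x7f) << shift
        shift += 7
      val payload = in.readNBytes(length)
      // A short read means the input ended in the middle of a frame
      if payload.length != length then throw new EOFException()
      Some(payload)

  /** Reads payloads until the input is exhausted. */
  def readAll(in: InputStream): List[Array[Byte]] =
    Iterator.continually(readDelimited(in)).takeWhile(_.isDefined).map(_.get).toList
```

Writing a 3-byte and a 200-byte payload with writeDelimited produces 206 bytes in total (one and two length bytes, respectively), and readAll recovers both payloads in order. Without the length prefix, the two payloads would be indistinguishable from a single 203-byte one.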

Example: PekkoStreamsWithIo.scala (click to expand)

Source code on GitHub

PekkoStreamsWithIo.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.query.Dataset\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\nimport org.apache.pekko.util.ByteString\n\nimport java.io.{File, FileInputStream, FileOutputStream}\nimport java.util.zip.GZIPInputStream\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\nimport scala.util.Using\n\n/**\n * Example of using Pekko Streams to read/write Jelly to a file or any other byte stream (e.g., socket).\n *\n * The examples here use the DELIMITED variant of Jelly, which is suitable only for situations where there is\n * no framing in the underlying stream. You should always use the delimited variant with raw files and sockets,\n * as otherwise it would be impossible to tell where one stream frame ends and another one begins.\n *\n * If you are working with something like MQTT, Kafka, JMS, AMQP... 
then check the examples in\n * [[eu.ostrzyciel.jelly.examples.PekkoStreamsEncoderFlow]].\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsWithIo extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // We will need a Pekko actor system to run the streams\n    given actorSystem: ActorSystem = ActorSystem()\n    // And an execution context for the futures\n    given ExecutionContext = actorSystem.getDispatcher\n\n    // We will read a gzipped Jelly file from disk and decode it on the fly, as we are decompressing it.\n    println(\"Decoding a gzipped Jelly file with Pekko Streams...\")\n    // The input file is a GZipped Jelly file\n    val inputFile = File(getClass.getResource(\"/jelly/weather.jelly.gz\").toURI)\n\n    // Use Java's GZIPInputStream to decompress the input file on the fly\n    val decodedTriples: Seq[Triple] = Using.resource(new GZIPInputStream(FileInputStream(inputFile))) { inputStream =>\n      val decodedTriplesFuture = JellyIo.fromIoStream(inputStream)\n        // Decode the Jelly frames to triples.\n        // Under the hood it uses the RdfStreamFrame.parseDelimitedFrom method.\n        .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n        .runWith(Sink.seq)\n\n      Await.result(decodedTriplesFuture, 10.seconds)\n    }\n\n    println(s\"Decoded ${decodedTriples.size} triples\")\n\n    // -----------------------------------------------------------\n    // Now we will write the decoded triples to a new Jelly file\n    println(\"\\n\\nWriting the decoded triples to a new Jelly file with Pekko Streams...\")\n    Using.resource(new FileOutputStream(\"weather.jelly\")) { outputStream =>\n      val writeFuture = Source(decodedTriples)\n        // Encode the triples to Jelly\n        .via(EncoderFlow.flatTripleStream(\n          
ByteSizeLimiter(500),\n          JellyOptions.smallStrict\n        ))\n        // Write the Jelly frames to a Java byte stream.\n        // Under the hood it uses the RdfStreamFrame.writeDelimitedTo method.\n        .runWith(JellyIo.toIoStream(outputStream))\n\n      Await.ready(writeFuture, 10.seconds)\n      println(\"Done writing the Jelly file.\")\n    }\n\n    // -----------------------------------------------------------\n    // Pekko Streams offers its own utilities for reading and writing bytes that do not involve using Java's\n    // blocking implementation of streams.\n    // We will again write the decoded triples to a Jelly file, but this time use Pekko's facilities.\n    println(\"\\n\\nWriting the decoded triples to a new Jelly file with Pekko Streams' utilities...\")\n    val writeFuture = Source(decodedTriples)\n      .via(EncoderFlow.flatTripleStream(\n        ByteSizeLimiter(500),\n        JellyOptions.smallStrict\n      ))\n      // Convert the frames into Pekko's byte strings.\n      // Note: we are using the DELIMITED variant because we will write this to disk!\n      .via(JellyIo.toBytesDelimited)\n      .map(bytes => ByteString(bytes))\n      .runWith(FileIO.toPath(File(\"weather2.jelly\").toPath))\n\n    Await.ready(writeFuture, 10.seconds)\n    println(\"Done writing the Jelly file.\")\n\n    actorSystem.terminate()\n
"},{"location":"user/reactive/#see-also","title":"See also","text":"
  • Using Jelly gRPC servers and clients
  • Useful utilities
    • Using Typesafe config to configure Jelly
  • Low-level usage
"},{"location":"user/utilities/","title":"Useful utilities","text":"

This guide presents some useful utilities in the jelly-core and jelly-stream modules.

"},{"location":"user/utilities/#jelly-options-presets","title":"Jelly options presets","text":"

Every Jelly stream begins with a header that specifies the serialization options used to encode the stream \u2013 see the details in the specification. So, whenever you serialize some RDF with Jelly (e.g., using Apache Jena RIOT, RDF4J Rio, or the jelly-stream module), you need to specify these options.

The eu.ostrzyciel.jelly.core.JellyOptions object provides a few common presets for Jelly serialization options. They return an instance of eu.ostrzyciel.jelly.core.proto.v1.RdfStreamOptions that you can further customize. For example:

import eu.ostrzyciel.jelly.core.JellyOptions\n\nval options = JellyOptions.smallStrict\n\nval optionsWithRdfStarSupport = JellyOptions.smallRdfStar\n\nval bigWithCustomDictionarySize = JellyOptions.bigStrict\n  .withMaxNameTableSize(2000)  \n

Warning

These presets do not specify the physical or logical stream type. In most cases, the Jelly library will take care of this for you and set these types automatically later. However, if you use the low-level API, you need to set the stream types manually. For example:

import eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.*\n\nJellyOptions.smallStrict\n  .withPhysicalType(PhysicalStreamType.QUADS)\n  .withLogicalType(LogicalStreamType.DATASETS)\n
"},{"location":"user/utilities/#checking-supported-options","title":"Checking supported options","text":"

There is also the eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions method which specifies the maximum set of options supported by default in Jelly-JVM, when parsing a stream. By default, Jelly-JVM will refuse to parse any stream that uses options that are beyond what is specified in this method. This is important for security reasons, as it prevents the library from, for example, allocating a 10 GB dictionary (potential Denial of Service attack).

The supported options check is carried out automatically by the decoder when parsing a stream. You cannot disable the check, but you can customize the supported options by constructing a new RdfStreamOptions object from eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions, customizing it, and passing it to the decoder.

If you want to do this kind of check in some other context (e.g., in a gRPC service to check if you can support the options requested by the client), you can use the eu.ostrzyciel.jelly.core.JellyOptions.checkCompatibility method. It will throw an exception if the options are not supported.
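The idea behind such a check can be sketched with the standard library alone. This is a simplified illustration, assuming that only the lookup table sizes are compared; TableSizes, Violation and SupportedOptionsSketch are hypothetical names, not Jelly-JVM classes. Unlike checkCompatibility, which throws an exception, this sketch returns the list of exceeded limits.

```scala
// Hypothetical, simplified stand-in for RdfStreamOptions: only the lookup table sizes.
final case class TableSizes(nameTable: Int, prefixTable: Int, datatypeTable: Int)

// The ways in which a stream can exceed the supported limits (illustrative only).
enum Violation:
  case NameTableTooLarge, PrefixTableTooLarge, DatatypeTableTooLarge

object SupportedOptionsSketch:
  /** Lists every limit that the requested options exceed. */
  def violations(requested: TableSizes, supported: TableSizes): List[Violation] =
    List(
      (requested.nameTable > supported.nameTable, Violation.NameTableTooLarge),
      (requested.prefixTable > supported.prefixTable, Violation.PrefixTableTooLarge),
      (requested.datatypeTable > supported.datatypeTable, Violation.DatatypeTableTooLarge),
    ).collect { case (true, violation) => violation }

  /** True if a decoder with these limits could accept the stream. */
  def isCompatible(requested: TableSizes, supported: TableSizes): Boolean =
    violations(requested, supported).isEmpty
```

For instance, a stream that requests a name table of 128 entries, checked against a supported maximum of 50, yields a single NameTableTooLarge violation. Rejecting the stream before any table is allocated is what protects the decoder from oversized dictionaries.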

"},{"location":"user/utilities/#useful-constants","title":"Useful constants","text":"

The eu.ostrzyciel.jelly.core.Constants object defines some useful constants, such as the file extension for Jelly, its content type, and the version of the Jelly protocol.

"},{"location":"user/utilities/#rdf-stream-taxonomy-rdf-stax-stream-type-utilities","title":"RDF Stream Taxonomy (RDF-STaX) stream type utilities","text":"

Jelly uses RDF-STaX to define the logical stream types (more details here). Jelly-JVM defines each of these types as a case object in eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType.

These objects have a few useful methods for working with the RDF-STaX ontology:

import eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\n\n// Get the RDF-STaX IRI of a stream type\n// returns \"https://w3id.org/stax/ontology#flatTripleStream\"\nLogicalStreamType.FLAT_TRIPLES.getRdfStaxType\n

You can also obtain a full RDF-STaX annotation for your stream if you also import an RDF library interop module (e.g., jelly-jena or jelly-rdf4j):

// Here we import `jena.given` to get the necessary implicit conversions.\n// You can do the same with `rdf4j.given` if you are using RDF4J.\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\nimport org.apache.jena.graph.{Node, NodeFactory, Triple}\n\n// The subject node is the resource that the annotation will describe\nval subjectNode: Node = NodeFactory.createURI(\"http://example.org/subject\")\nval triples: Seq[Triple] = LogicalStreamType.FLAT_QUADS.getRdfStaxAnnotation(subjectNode)\n// Returns a Seq of three triples that would look like this in Turtle:\n// <http://example.org/subject> stax:hasStreamTypeUsage [\n//   a stax:RdfStreamTypeUsage ;\n//   stax:hasStreamType stax:flatQuadStream\n// ] .\n

You can then take this annotation and expose it as semantic metadata of your stream.

You can also do the opposite and construct an instance of LogicalStreamType from an RDF-STaX IRI:

import eu.ostrzyciel.jelly.core.LogicalStreamTypeFactory\n\nval iri = \"https://w3id.org/stax/ontology#flatQuadStream\"\n// returns LogicalStreamType.QUADS\nval streamType = LogicalStreamTypeFactory.fromOntologyIri(iri)\n

Finally, there are also stream type checking and manipulation utilities:

import eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\n\n// Check if this type is equal or a subtype of another type.\n// This is useful for performing compatibility checks.\n// Returns false\nLogicalStreamType.TRIPLES.isEqualOrSubtypeOf(LogicalStreamType.DATASETS)\n// Returns true\nLogicalStreamType.NAMED_GRAPHS.isEqualOrSubtypeOf(LogicalStreamType.DATASETS)\n\n// Get the \"base\" type of a stream type. Base types are concrete stream types \n// that have no parent types. \n// There are only 4 base types: GRAPHS, DATASETS, TRIPLES, QUADS.\n// Returns LogicalStreamType.TRIPLES\nLogicalStreamType.TRIPLES.toBaseType\n// Returns LogicalStreamType.DATASETS\nLogicalStreamType.NAMED_GRAPHS.toBaseType\n// Returns LogicalStreamType.DATASETS\nLogicalStreamType.TIMESTAMPED_NAMED_GRAPHS.toBaseType\n
"},{"location":"user/utilities/#jelly-configuration-from-typesafe-config","title":"Jelly configuration from Typesafe config","text":"

The jelly-stream module also implements a utility for configuring Jelly serialization options using the Typesafe config library, which is commonly used in Apache Pekko applications.

The utility is provided by the eu.ostrzyciel.jelly.stream.JellyOptionsFromTypesafe object. For example:

import com.typesafe.config.ConfigFactory\nimport eu.ostrzyciel.jelly.stream.JellyOptionsFromTypesafe\n\nval config = ConfigFactory.parseString(\"\"\"\n  |jelly.physical-type = QUADS\n  |jelly.name-table-size = 1024\n  |jelly.prefix-table-size = 64\n  |\"\"\".stripMargin)\n\nval options = JellyOptionsFromTypesafe.fromConfig(config.getConfig(\"jelly\"))\noptions.physicalType // returns PhysicalStreamType.QUADS\noptions.maxNameTableSize // returns 1024\noptions.maxPrefixTableSize // returns 64\noptions.maxDatatypeTableSize // returns 16 (the default)\n

See the source code of this class for more details.

"},{"location":"user/utilities/#see-also","title":"See also","text":"
  • Reactive streaming with Jelly-JVM
  • Low-level usage of Jelly-JVM
"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Jelly-JVM","text":"

Jelly-JVM is an implementation of the Jelly serialization format and gRPC streaming protocol for the Java Virtual Machine (JVM), written in Scala 3[1]. The supported RDF libraries are Apache Jena and Eclipse RDF4J.

Jelly-JVM provides a full stack of utilities for fast and scalable RDF streaming with the Jelly protocol. Oh, and it's blazing-fast, too!

Getting started with plugins \u2013 no code required

See the getting started guide with plugins for a quick way to use Jelly with your Apache Jena or RDF4J application without writing any code.

Getting started for application developers

If you want to use the full feature set of Jelly-JVM in your code, see the getting started guide for application developers.

This documentation is for the latest development version of Jelly-JVM \u2013 it is not considered stable. If you are looking for the documentation of a stable release, use the version selector on the left of the top navigation bar. See: latest stable version.

"},{"location":"#library-modules","title":"Library modules","text":"

The implementation is split into a few modules that can be used separately:

  • jelly-core \u2013 implementation of the Jelly serialization format (using the scalapb library), along with generic utilities for converting the deserialized RDF data to/from the representations of RDF libraries (like Apache Jena or RDF4J).

  • jelly-jena \u2013 conversions and interop code for the Apache Jena library.

  • jelly-rdf4j \u2013 conversions and interop code for the RDF4J library.

  • jelly-stream \u2013 utilities for building Reactive Streams of RDF data (based on Pekko Streams). Useful for integrating with gRPC or other streaming protocols (e.g., Kafka, MQTT).

  • jelly-grpc \u2013 implementation of a gRPC client and server for the Jelly gRPC streaming protocol.

"},{"location":"#plugin-jars","title":"Plugin JARs","text":"

We also publish plugin JARs which allow you to use Jelly-JVM with Apache Jena and RDF4J just by dropping the JARs into the classpath. Find out more about using the plugins.

"},{"location":"#compatibility","title":"Compatibility","text":"

The Jelly-JVM implementation is compatible with Java 11 and newer. Java 11, 17, and 21 are tested in CI and are guaranteed to work. Jelly is built with Scala 3 LTS releases.

The following table shows the compatibility of the Jelly-JVM implementation with other libraries:

Jelly-JVM | Scala | Java | RDF4J | Apache Jena | Apache Pekko
2.0.x \u2013 2.2.x | 3.3.x (LTS) | 17+ | 5.x.x | 5.x.x | 1.1.x
1.0.x | 3.3.x (LTS), 2.13.x[1] | 11+ | 4.x.x | 4.x.x | 1.0.x

See the compatibility policy for more details and the release notes on GitHub.

"},{"location":"#documentation","title":"Documentation","text":"

Below is a list of all documentation pages about Jelly-JVM. You can also browse the Javadoc using the badges in the module list above. The documentation uses examples written in Scala, but the libraries can be used from Java as well.

  • Getting started with Jena/RDF4J plugins \u2013 how to use Jelly-JVM as a plugin for Apache Jena or RDF4J, without writing any code.
  • Getting started for application developers \u2013 how to use Jelly-JVM in code.
  • User guide
    • Apache Jena integration
    • RDF4J integration
    • Reactive streaming
    • gRPC
    • Useful utilities
    • Compatibility policy
  • Developer guide
    • Releases
    • Implementing Jelly for other libraries
  • Contributing to Jelly-JVM
  • License and citation
  • Release notes on GitHub
  • Main Jelly website \u2013 including the Jelly protocol specification and explanation of the various stream types.
  1. Scala 2.13-compatible builds of Jelly-JVM are available for Jelly-JVM 1.0.x. Scala 2 support was removed in subsequent versions. See more details.

"},{"location":"contributing/","title":"Contributing to Jelly-JVM","text":"

Jelly-JVM is an open project \u2013 you are welcome to submit issues, pull requests, or just ask questions!

"},{"location":"contributing/#submitting-issues","title":"Submitting issues","text":"

If you have a question, found a bug, or have an idea for a new feature, please open an issue in the GitHub issue tracker.

"},{"location":"contributing/#security-issues","title":"Security issues","text":"

If you find a security issue or vulnerability, please do not open a public issue. Instead, use the dedicated vulnerability reporting page.

"},{"location":"contributing/#pull-requests","title":"Pull requests","text":"

Pull requests are welcome! Simply fork the GitHub repository and create a new branch for your changes. When you are ready, open a pull request to the main branch.

If you are working on a larger feature or a significant change, it is recommended to open an issue first to discuss the idea.

"},{"location":"contributing/#documentation","title":"Documentation","text":"

Jelly-JVM uses the exact same documentation system as the main Jelly documentation. Further information on editing the documentation can be found in the Contributing to the Jelly documentation guide.

"},{"location":"contributing/#releases","title":"Releases","text":"

See the dedicated page on making releases.

"},{"location":"contributing/#see-also","title":"See also","text":"
  • Licensing and citation
"},{"location":"getting-started-devs/","title":"Jelly-JVM \u2013 getting started for developers","text":"

If you don't want to code anything and only use Jelly with your Apache Jena/RDF4J application, see the dedicated guide about using Jelly-JVM as a plugin.

This guide explains a few of the basic functionalities of Jelly-JVM and how to use them in your code. Jelly-JVM is written in Scala, but it can be used from Java as well. However, in this guide, we will focus on Scala 3.

"},{"location":"getting-started-devs/#quick-start-plain-old-files","title":"Quick start \u2013 plain old files","text":"

Depending on your RDF library of choice (Apache Jena or RDF4J), you should import one of two dependencies: jelly-jena or jelly-rdf4j[1]. In our examples we will use Jena, so let's add this to your build.sbt file (this would be the same for other build tools like Maven or Gradle):

build.sbt
lazy val jellyVersion = \"2.3.0\"\n\nlibraryDependencies ++= Seq(\n  \"eu.ostrzyciel.jelly\" %% \"jelly-jena\" % jellyVersion,\n)\n
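
Maven and Gradle do not apply sbt's %% cross-version suffix, so the Scala version has to be part of the artifact name there. As a sketch, assuming the Scala 3 artifacts carry the usual _3 suffix that sbt appends (an assumption based on sbt conventions, not something stated in this guide), the same dependency with the suffix spelled out would be:

```scala
// build.sbt: same dependency as above, but with a single % and an explicit suffix.
// The "_3" suffix is the standard sbt cross-version suffix for Scala 3 (assumed here).
lazy val jellyVersion = "2.3.0"

libraryDependencies ++= Seq(
  "eu.ostrzyciel.jelly" % "jelly-jena_3" % jellyVersion,
)
```

Under that assumption, the coordinates to enter in a Maven or Gradle build would be eu.ostrzyciel.jelly:jelly-jena_3:2.3.0.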

Now you can serialize/deserialize Jelly data with Apache Jena. Jelly is fully integrated with Jena, so it should all just magically work. Here is a simple example of reading a .jelly file (in this case, a metadata file from RiverBench) with RIOT:

Deserialization example (Scala 3)
import eu.ostrzyciel.jelly.convert.jena.riot.*\nimport org.apache.jena.riot.RDFDataMgr\n\n// Load an RDF graph from a Jelly file\nval model = RDFDataMgr.loadModel(\n  \"https://w3id.org/riverbench/v/2.0.1.jelly\", \n  JellyLanguage.JELLY\n)\n// Print the size of the model\nprintln(s\"Loaded an RDF graph with ${model.size} triples\")\n

Serialization is just as easy:

Serialization example (Scala 3)
import eu.ostrzyciel.jelly.convert.jena.riot.*\nimport org.apache.jena.riot.RDFDataMgr\n\nimport java.io.FileOutputStream\nimport scala.util.Using\n\n// Omitted here: creating an RDF model.\n// You can use the one from the previous example.\n\nUsing.resource(new FileOutputStream(\"metadata.jelly\")) { out =>\n  // Write the model to a Jelly file\n  RDFDataMgr.write(out, model, JellyLanguage.JELLY)\n  println(\"Saved the model to metadata.jelly\")\n}\n

Read more about using Jelly-JVM with Apache Jena

Read more about using Jelly-JVM with RDF4J

"},{"location":"getting-started-devs/#rdf-streams","title":"RDF streams","text":"

Now, the real power of Jelly lies in its streaming capabilities. Not only can it stream individual RDF triples/quads (this is called flat streaming), but it can also very effectively handle streams of RDF graphs or datasets. To work with streams, you need to use the jelly-stream module, which is based on the Apache Pekko Streams library. So, let's update our dependencies:

build.sbt
lazy val jellyVersion = \"2.3.0\"\n\nlibraryDependencies ++= Seq(\n  \"eu.ostrzyciel.jelly\" %% \"jelly-jena\" % jellyVersion,\n  \"eu.ostrzyciel.jelly\" %% \"jelly-stream\" % jellyVersion,\n)\n

Now, let's say we have a stream of RDF graphs \u2013 for example each graph corresponds to one set of measurements from an IoT sensor. We want to have a stream that turns these graphs into their serialized representations (byte arrays), which we can then send over the network. Here is how to do it:

Reactive streaming example (Scala 3)
// We need to import \"jena.given\" for Jena-to-Jelly conversions\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport scala.concurrent.ExecutionContext\n\n// We will need a Pekko actor system to run the streams\ngiven actorSystem: ActorSystem = ActorSystem()\n// And an execution context for the futures\ngiven ExecutionContext = actorSystem.getDispatcher\n\n// Load an RDF graph for testing\nval model = RDFDataMgr.loadModel(\n  \"https://w3id.org/riverbench/v/2.0.1.jelly\", \n  JellyLanguage.JELLY\n)\n\nSource.repeat(model) // Create a stream of the same model over and over\n  .take(10) // Take only the first 10 elements in the stream\n  .map(_.asTriples) // Convert each model to an iterable of triples\n  .via(EncoderFlow.graphStream( // Encode each iterable to a Jelly stream frame\n    maybeLimiter = None, // 1 RDF graph = 1 message\n    JellyOptions.smallStrict, // Jelly compression settings preset\n  ))\n  .via(JellyIo.toBytes) // Convert the stream frames to byte arrays\n  .runForeach { bytes =>\n    // Just print the length of each byte array in the stream.\n    // You can also hook this up to MQTT, Kafka, etc.\n    println(s\"Streamed ${bytes.length} bytes\")\n  }\n  .onComplete(_ => actorSystem.terminate())\n

Jelly will compress this stream on-the-fly, so if the data is repetitive, it will be very efficient. If you run this code, you will notice that the byte sizes for the later graphs are smaller, even though we are sending the same graph over and over again. But, even if each graph is completely different, Jelly should still be much faster than other serialization formats.
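
To see why repeated data shrinks, here is a deliberately simplified sketch of a lookup table: the first time a string is seen it is sent in full together with a new ID, and later occurrences are sent as the ID only. This is not Jelly's actual wire format. The class LookupEncoder and its textual "#id=value" encoding are hypothetical and use only the Scala standard library.

```scala
import scala.collection.mutable

// Hypothetical, simplified encoder used only for illustration; NOT Jelly's wire format.
final class LookupEncoder:
  // Maps each string seen so far to the ID it was assigned.
  private val ids = mutable.HashMap.empty[String, Int]

  /** Encodes one IRI: "#id=iri" on first sight, just "#id" afterwards. */
  def encode(iri: String): String =
    ids.get(iri) match
      case Some(id) => s"#$id"
      case None =>
        val id = ids.size + 1
        ids(iri) = id
        s"#$id=$iri"

  /** Total encoded size (in characters) of a "graph", modelled here as a list of IRIs. */
  def encodedSize(graph: Seq[String]): Int =
    graph.map(iri => encode(iri).length).sum

@main def lookupDemo(): Unit =
  val graph = Seq("http://example.org/sensor", "http://example.org/temperature")
  val encoder = LookupEncoder()
  val first = encoder.encodedSize(graph)  // full strings plus IDs
  val second = encoder.encodedSize(graph) // IDs only
  println(s"first graph: $first chars, second graph: $second chars")
```

Because the encoder keeps its table between graphs, the second pass over the same data only emits the short references.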

These streams are very powerful, because they are reactive and asynchronous \u2013 in short, this means you can hook this up to any data source and any data sink \u2013 and you can scale it up as much as you want. If you are unfamiliar with the concept of reactive streams, we recommend you start with this Apache Pekko Streams guide.

Jelly-JVM supports streaming serialization and deserialization of all types of streams in the RDF Stream Taxonomy. You can read more about the theory of this and all available stream types in the Jelly protocol documentation.

Learn more about reactive streaming with Jelly-JVM

Learn more about the types of streams in Jelly

"},{"location":"getting-started-devs/#grpc-streaming","title":"gRPC streaming","text":"

Jelly is a bit more than just a serialization format \u2013 it also defines a gRPC-based streaming protocol. You can use it for streaming RDF data between microservices, to build a pub/sub system, or to publish RDF data to the web.

Learn more about using Jelly gRPC protocol servers and clients

"},{"location":"getting-started-devs/#further-reading","title":"Further reading","text":"
  • Using Jelly-JVM with Apache Jena
  • Using Jelly-JVM with RDF4J
  • Reactive streaming with Jelly-JVM \u2013 using the jelly-stream module and Apache Pekko Streams
  • Using Jelly gRPC protocol servers and clients
  • Other useful utilities in Jelly-JVM
  • Low-level usage of Jelly-JVM
"},{"location":"getting-started-devs/#example-applications-using-jelly-jvm","title":"Example applications using Jelly-JVM","text":"
  • The examples directory in the Jelly-JVM repo contains code snippets that demonstrate how to use the library in various scenarios.
  • Jelly JVM benchmarks \u2013 research software for testing the performance of Jelly-JVM and other RDF serializations in Apache Jena. It uses most Jelly-JVM features.
  • RiverBench ci-worker \u2013 a real-world application that is used for processing large RDF datasets in a CI/CD pipeline. It uses Jelly-JVM for serialization and deserialization with Apache Jena. It also makes extensive use of Apache Pekko Streams.
"},{"location":"getting-started-devs/#questions","title":"Questions?","text":"

If you have any questions about using Jelly-JVM, feel free to open an issue on GitHub.

  1. There is nothing stopping you from using both at the same time. You can also pretty easily add support for any other Java-based RDF library by implementing a few interfaces. More details here.

"},{"location":"getting-started-plugins/","title":"Jelly-JVM \u2013 getting started with Jena/RDF4J plugins","text":"

This guide explains how to use Jelly-JVM with Apache Jena or RDF4J as a plugin, without writing a single line of code. Jelly-JVM provides plugin JARs that you can simply drop in the appropriate directory to get Jelly format support in your application.

"},{"location":"getting-started-plugins/#installation","title":"Installation","text":""},{"location":"getting-started-plugins/#apache-jena-apache-jena-fuseki","title":"Apache Jena, Apache Jena Fuseki","text":"

You can simply add Jelly format support to Apache Jena or Apache Jena Fuseki with Jelly's plugin JAR.

  • First, download the plugin JAR. You can download the latest development version from here, or you can go to the releases page on GitHub to download a different version of the jelly-jena-plugin.jar file.
    • Note that the Jelly version must be compatible with your Apache Jena version. Consult the compatibility table.
  • Place the file in your classpath:
    • For Apache Jena Fuseki, simply place the file in the $FUSEKI_BASE/extra/ directory. $FUSEKI_BASE is the directory usually called run where you have files such as config.ttl and shiro.ini. You will most likely need to create the extra directory yourself.
    • For Apache Jena, place the file in the lib/ directory of your Jena installation.
    • For other applications, consult the manual of the application.
  • You can now use the Jelly format for parsing, serialization, and streaming serialization in your Jena application.

Content negotiation in Fuseki

Content negotiation using the application/x-jelly-rdf media type in the Accept header works in Fuseki since Apache Jena version 5.2.0. Previous versions of Fuseki did not support media type registration.

How to use Jelly with Jena's CLI tools?

Jelly-JVM fully supports Apache Jena's command-line interface (CLI) utilities. See the dedicated guide for more information.

"},{"location":"getting-started-plugins/#eclipse-rdf4j","title":"Eclipse RDF4J","text":"

You can simply add Jelly format support to an application based on RDF4J with Jelly's plugin JAR.

  • First, download the plugin JAR. You can download the latest development version from here, or you can go to the releases page on GitHub to download a specific version of the jelly-rdf4j-plugin.jar file.
    • Note that the Jelly version must be compatible with your RDF4J version. Consult the compatibility table.
  • Place the file in your classpath:
    • For the RDF4J SDK distribution, place the file in the lib/ directory of your RDF4J installation.
    • For other applications, consult the manual of your application for the exact location.
  • You can now use the Jelly format for parsing and serialization in your RDF4J application.
"},{"location":"getting-started-plugins/#supported-features","title":"Supported features","text":"

The Jelly-JVM plugin JARs provide the following features:

  • Full support for parsing and serialization of RDF data (triples and quads) in the Jelly format.
    • The parser will automatically detect if the input data is delimited or not. Both delimited and non-delimited Jelly data can be parsed.
    • In Apache Jena, streaming serialization is also supported.
  • Recognizing the .jelly file extension.
  • Recognizing the application/x-jelly-rdf media type.

The Jelly format is registered under the name jelly in the RDF libraries, so you can use it in the same way as other formats like Turtle, RDF/XML, or JSON-LD.

"},{"location":"getting-started-plugins/#see-also","title":"See also","text":"
  • Getting started for developers \u2013 if you want to get your hands dirty with code and get more features out of Jelly.
"},{"location":"licensing/","title":"Licensing and citation","text":"

Jelly-JVM is licensed under the Apache License 2.0.

"},{"location":"licensing/#attribution-citation","title":"Attribution / citation","text":"

If you use Jelly-JVM in your research, please cite the most recent paper about Jelly:

Sowi\u0144ski, P., Wasielewska-Michniewska, K., Ganzha, M., & Paprzycki, M. (2022, October). Efficient RDF streaming for the edge-cloud continuum. In 2022 IEEE 8th World Forum on Internet of Things (WF-IoT) (pp. 1-8). IEEE.

Or use this BibTeX entry:

@inproceedings{sowinski2022efficient,\n  title={Efficient RDF streaming for the edge-cloud continuum},\n  author={Sowi{\\'n}ski, Piotr and Wasielewska-Michniewska, Katarzyna and Ganzha, Maria and Paprzycki, Marcin and others},\n  booktitle={2022 IEEE 8th World Forum on Internet of Things (WF-IoT)},\n  pages={1--8},\n  year={2022},\n  organization={IEEE},\n  doi={10.1109/WF-IoT54382.2022.10152225}\n}\n

This paper describes an earlier version of Jelly from 2022. A new paper is in preparation.

"},{"location":"licensing/#jelly-maintainer","title":"Jelly maintainer","text":"

Jelly-JVM was created and is maintained by Piotr Sowi\u0144ski (Ostrzyciel) \u2013 GitHub.

"},{"location":"licensing/#see-also","title":"See also","text":"
  • Contributing to Jelly-JVM
"},{"location":"dev/implementing/","title":"Developer guide \u2013 implementing conversions for other libraries","text":"

Currently, converters are implemented for the two most popular RDF libraries on the JVM \u2013 RDF4J and Jena. However, you can implement your own converters and adapt the Jelly serialization code to any RDF library with little effort.

To do this, you will need to implement three traits (interfaces in Java) from the jelly-core module: ProtoEncoder, ProtoDecoderConverter, and ConverterFactory.

  • ProtoEncoder (serialization)

    • get* methods deconstruct triple statements, quad statements, and quoted triples (RDF-star). You can make them inline.
    • nodeToProto and graphToProto should translate all possible variations of RDF terms in the SPO and G positions, respectively, into Jelly's representation.
    • Example implementation for Jena: JenaProtoEncoder
    • You can skip implementing this trait if you don't need serialization.
    • You can also skip implementing some methods (make them throw an exception or return null) if, for example, you don't want to work with quads or RDF-star.
  • ProtoDecoderConverter (deserialization)

    • The make* methods should construct new RDF terms and statements. You can make them inline.
    • Example implementation for Jena: JenaDecoderConverter
    • You can skip implementing this trait if you don't need deserialization.
    • You can also skip implementing some methods (make them throw an exception or return null) if, for example, you don't want to work with quads or RDF-star.
  • ConverterFactory \u2013 wrapper that allows other modules to use your converter.

    • The methods should just return new instances of your ProtoEncoder and ProtoDecoderConverter implementations.
    • Example for Jena: JenaConverterFactory
"},{"location":"dev/releases/","title":"Developer guide \u2013 releases","text":""},{"location":"dev/releases/#full-versioned-releases","title":"Full (versioned) releases","text":"

Full (versioned) releases are created manually and follow the Semantic Versioning scheme for binary compatibility.

To create a new tagged release (example for version 1.2.3):

$ git checkout main\n$ git pull\n$ git tag v1.2.3\n$ git push origin v1.2.3\n

The rest (packaging and release creation) will be handled automatically by the CI. The release will be pushed to Maven Central.

"},{"location":"dev/releases/#snapshot-releases","title":"Snapshot releases","text":"

Snapshot releases are triggered automatically by commits in the main branch. Snapshots are pushed to the Sonatype snapshot repository.

"},{"location":"user/compatibility/","title":"Compatibility policy","text":"

Jelly-JVM follows Semantic Versioning 2.0.0, with MAJOR.MINOR.PATCH releases. Please see the compatibility table on the main page for the current compatibility information. The documentation is versioned to match each Jelly-JVM MAJOR.MINOR version.

"},{"location":"user/compatibility/#jvm-and-scala","title":"JVM and Scala","text":"

The current version of Jelly-JVM is compatible with Java 17 and newer. Java 17, 21, and 23 are tested in CI and are guaranteed to work. We recommend using a recent release of GraalVM to get the best performance. If you need Java 11 support, you should use Jelly-JVM 1.0.x.
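
If you want your application to fail early on an unsupported JVM, a small startup guard can help. The sketch below is not part of Jelly-JVM. The function isSupportedJava is a hypothetical helper, the minimum of 17 is taken from the paragraph above, and Runtime.version().feature() is the standard Java API for reading the major Java version.

```scala
// Hypothetical startup guard; the default minimum of 17 comes from the text above.
def isSupportedJava(feature: Int, minimum: Int = 17): Boolean =
  feature >= minimum

@main def checkJavaVersion(): Unit =
  // Runtime.version().feature() returns the major Java version, e.g. 17 or 21.
  val feature = Runtime.version().feature()
  require(
    isSupportedJava(feature),
    s"Java 17 or newer is required, but this JVM reports Java $feature"
  )
  println(s"Running on Java $feature")
```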

Jelly is built with Scala 3 LTS releases and supports only Scala 3. If you need Scala 2 support, you should use Jelly-JVM 1.0.x.

"},{"location":"user/compatibility/#rdf-libraries","title":"RDF libraries","text":"

Major-version upgrades of RDF4J and Apache Jena (e.g., updating from 4.0.x to 5.0.x) are done in Jelly-JVM MINOR releases. Jelly-JVM generally does not use any complex features of these libraries, so it should work with multiple versions without any problems.

If you do encounter any compatibility issues, please report them on the issue tracker.

"},{"location":"user/compatibility/#internal-vs-external-apis","title":"Internal vs external APIs","text":"

Generally, all public classes and methods in Jelly-JVM are considered part of the public API. However, there are some exceptions.

Auto-generated classes in the eu.ostrzyciel.jelly.core.proto.v1 package of the jelly-core module are not considered part of the public API, although we will avoid any incompatibilities where possible. These classes may change between MINOR releases.

"},{"location":"user/compatibility/#backward-and-forward-protocol-compatibility","title":"Backward and forward protocol compatibility","text":"

Jelly-JVM follows the Jelly protocol's backward compatibility policy. This means that Jelly-JVM can read data serialized with older versions of Jelly. Backward compatibility is tested in CI \u2013 the code is in BackCompatSpec.scala.

Forward compatibility is provided only in a very limited manner in Jelly-JVM. If the protocol version used by the stream is not supported, the parser is guaranteed to parse only the stream options header and reject the rest of the stream. You may choose to disable this check and try to parse the rest of the data anyway, but this is most certainly NOT recommended and may lead to unexpected results. In general, Jelly-JVM will ignore any unknown fields in the stream, but any other changes in the protocol may lead to really \"funny\" errors. Forward compatibility is tested in CI \u2013 the code is in ForwardCompatSpec.scala.

"},{"location":"user/compatibility/#see-also","title":"See also","text":"
  • Release notes on GitHub
  • Making Jelly-JVM releases
  • Contributing to Jelly-JVM
"},{"location":"user/grpc/","title":"User guide \u2013 gRPC","text":"

This guide explains the functionalities of the jelly-grpc module, which implements a gRPC client and server for the Jelly gRPC streaming protocol.

Prerequisites

If you are unfamiliar with gRPC, we recommend you first read some introductory material on the gRPC website or in the Apache Pekko gRPC documentation.

The jelly-grpc module builds on the functionalities of jelly-stream, so we recommend you first read the reactive streaming guide.

You may also want to first skim the Jelly gRPC streaming protocol specification to understand the protocol's structure.

As with the jelly-stream module, you can use jelly-grpc with any RDF library that has a Jelly integration, such as Apache Jena (using jelly-jena) or RDF4J (using jelly-rdf4j). The gRPC API is generic and identical across all libraries.

"},{"location":"user/grpc/#making-a-grpc-server-and-client","title":"Making a gRPC server and client","text":"

jelly-grpc builds on the Apache Pekko gRPC library. Jelly-JVM provides boilerplate code for setting up a gRPC server and client that can send and receive Jelly streams, as shown in the example below:

Example: PekkoGrpc.scala

Source code on GitHub

PekkoGrpc.scala
package eu.ostrzyciel.jelly.examples\n\nimport com.typesafe.config.ConfigFactory\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.*\nimport eu.ostrzyciel.jelly.grpc.RdfStreamServer\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.NotUsed\nimport org.apache.pekko.actor.typed.ActorSystem\nimport org.apache.pekko.actor.typed.javadsl.Behaviors\nimport org.apache.pekko.grpc.{GrpcClientSettings, GrpcServiceException}\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.concurrent.{Await, ExecutionContext, Future}\nimport scala.concurrent.duration.*\nimport scala.util.{Failure, Success}\n\n/**\n * Example of using Jelly's gRPC client and server to send Jelly streams over the network.\n * This uses the Apache Pekko gRPC library. Its documentation can be found at:\n * https://pekko.apache.org/docs/pekko-grpc/current/index.html\n * \n * See also examples named `PekkoStreams*` for instructions on encoding and decoding RDF streams with Jelly.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoGrpc extends shared.Example:\n  // Create a config for Pekko gRPC.\n  // We can use the same config for the client and the server, as we are communicating on localhost.\n  // This would usually be loaded from a configuration file (e.g., application.conf).\n  // More details: https://github.com/lightbend/config\n  val config = ConfigFactory.parseString(\n      \"\"\"\n        |pekko.http.server.preview.enable-http2 = on\n        |pekko.grpc.client.jelly.host = 127.0.0.1\n        |pekko.grpc.client.jelly.port = 8088\n        |pekko.grpc.client.jelly.enable-gzip = true\n        |pekko.grpc.client.jelly.use-tls = false\n        
|pekko.grpc.client.jelly.backend = netty\n        |\"\"\".stripMargin\n    )\n    .withFallback(ConfigFactory.defaultApplication())\n\n  // We will need two Pekko actor systems to run the streams \u2013 one for the server and one for the client\n  val serverActorSystem: ActorSystem[_] = ActorSystem(Behaviors.empty, \"ServerSystem\")\n  val clientActorSystem: ActorSystem[_] = ActorSystem(Behaviors.empty, \"ClientSystem\", config)\n\n  // Our mock dataset that we will send around in the streams\n  val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n\n  /**\n   * Main method that starts the server and the client.\n   */\n  def main(args: Array[String]): Unit =\n    given system: ActorSystem[_] = serverActorSystem\n    given ExecutionContext = system.executionContext\n\n    // Start the server\n    val exampleService = ExampleJellyService()\n    RdfStreamServer(\n      RdfStreamServer.Options.fromConfig(config.getConfig(\"pekko.grpc.client.jelly\")),\n      exampleService\n    ).run() onComplete {\n      case Success(binding) =>\n        // If the server started successfully, start the client\n        println(s\"[SERVER] Bound to ${binding.localAddress}\")\n        runClient()\n      case Failure(exception) =>\n        // Otherwise, print the error and terminate the actor system\n        println(s\"[SERVER] Failed to bind: $exception\")\n        system.terminate()\n    }\n\n\n  /**\n   * The client part of the example.\n   */\n  private def runClient(): Unit =\n    given system: ActorSystem[_] = clientActorSystem\n    given ExecutionContext = system.executionContext\n\n    // Create a gRPC client\n    val client = RdfStreamServiceClient(GrpcClientSettings.fromConfig(\"jelly\"))\n\n    // First, let's try to publish some data to the server\n    val frameSource = EncoderSource.fromDatasetAsQuads(\n      dataset,\n      ByteSizeLimiter(500),\n      JellyOptions.smallStrict.withStreamName(\"weather\")\n    )\n  
  println(\"[CLIENT] Publishing data to the server...\")\n    val publishFuture = client.publishRdf(frameSource) map { response =>\n      println(s\"[CLIENT] Received acknowledgment: $response\")\n    } recover {\n      case e =>\n        println(s\"[CLIENT] Failed to publish data: $e\")\n    }\n    // Wait for the publish to complete\n    Await.ready(publishFuture, 10.seconds)\n\n    // Now, let's try to subscribe to some data from the server in the QUADS format\n    println(\"\\n\\n[CLIENT] Subscribing to QUADS data from the server...\")\n    val quadsFuture = client\n      .subscribeRdf(RdfStreamSubscribe(\n        \"weather\",\n        Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.QUADS))\n      ))\n      .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n      .runFold(0L)((acc, _) => acc + 1)\n      // Process the result of the stream (Future[Long])\n      .map { counter =>\n        println(s\"[CLIENT] Received $counter quads.\")\n      } recover {\n        case e =>\n          println(s\"[CLIENT] Failed to receive quads: $e\")\n      }\n    Await.ready(quadsFuture, 10.seconds)\n\n    // Let's try the same, with a GRAPHS stream\n    println(\"\\n\\n[CLIENT] Subscribing to GRAPHS data from the server...\")\n    val graphsFuture = client\n      .subscribeRdf(RdfStreamSubscribe(\n        \"weather\",\n        Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.GRAPHS))\n      ))\n      // Decode the response and transform it into a stream of quads\n      .via(DecoderFlow.decodeGraphs.asDatasetStreamOfQuads)\n      .mapConcat(identity)\n      .runFold(0L)((acc, _) => acc + 1)\n      // Process the result of the stream (Future[Long])\n      .map { counter =>\n        println(s\"[CLIENT] Received $counter quads.\")\n      } recover {\n        case e =>\n          println(s\"[CLIENT] Failed to receive data: $e\")\n      }\n    Await.ready(graphsFuture, 10.seconds)\n\n    // Finally, let's try to subscribe to a stream that the 
server does not support\n    // We will request TRIPLES, but the server only supports QUADS and GRAPHS.\n    println(\"\\n\\n[CLIENT] Subscribing to TRIPLES data from the server...\")\n    val triplesFuture = client\n      .subscribeRdf(RdfStreamSubscribe(\n        \"weather\",\n        Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.TRIPLES))\n      ))\n      .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n      .runFold(0L)((acc, _) => acc + 1)\n      .map { counter =>\n        println(s\"[CLIENT] Received $counter triples.\")\n      } recover {\n        case e =>\n          println(s\"[CLIENT] Failed to receive triples: $e\")\n      }\n    Await.result(triplesFuture, 10.seconds)\n\n    println(\"\\n\\n[CLIENT] Terminating...\")\n    system.terminate()\n    println(\"[SERVER] Terminating...\")\n    serverActorSystem.terminate()\n\n\n  /**\n   * Example implementation of RdfStreamService to act as the server.\n   * \n   * You will also need to implement this trait in your own service. It defines the logic with which the server\n   * will handle incoming streams and subscriptions.\n   */\n  class ExampleJellyService(using system: ActorSystem[_]) extends RdfStreamService:\n    given ExecutionContext = system.executionContext\n\n    /**\n     * Handler for clients publishing RDF streams to the server.\n     * \n     * We receive a stream of RdfStreamFrames and must respond with an acknowledgment (or an error).\n     */\n    override def publishRdf(in: Source[RdfStreamFrame, NotUsed]): Future[RdfStreamReceived] =\n      // Decode the incoming stream and count the number of RDF statements in it\n      in.via(DecoderFlow.decodeAny.asFlatStream)\n        .runFold(0L)((acc, _) => acc + 1)\n        .map(counter => {\n          println(s\"[SERVER] Received ${counter} RDF statements. 
Sending acknowledgment.\")\n          // Send an acknowledgment back to the client\n          RdfStreamReceived()\n        })\n\n    /**\n     * Handler for clients subscribing to RDF streams from the server.\n     * \n     * We receive a subscription request and must respond with a stream of RdfStreamFrames or an error.\n     */\n    override def subscribeRdf(in: RdfStreamSubscribe): Source[RdfStreamFrame, NotUsed] =\n      println(s\"[SERVER] Received subscription request for topic ${in.topic}.\")\n      // First, check the requested physical stream type\n      val streamType = in.requestedOptions match\n        case Some(options) =>\n          println(s\"[SERVER] Requested physical stream type: ${options.physicalType}.\")\n          options.physicalType\n        case None =>\n          println(s\"[SERVER] No requested stream options.\")\n          PhysicalStreamType.UNSPECIFIED\n\n      // Get the stream options requested by the client or the default options if none were provided\n      val options = in.requestedOptions.getOrElse(JellyOptions.smallStrict)\n        .withStreamName(in.topic)\n      // Check if the requested options are supported\n      // !!! THIS IS IMPORTANT !!!\n      // If you don't check if the requested options are supported, you may be vulnerable to\n      // denial-of-service attacks. For example, a client could request a very large lookup table\n      // that would consume a lot of memory on the server.\n      try\n        JellyOptions.checkCompatibility(options, JellyOptions.defaultSupportedOptions)\n      catch\n        case e: IllegalArgumentException =>\n          // If the requested options are not supported, return an error\n          return Source.failed(new GrpcServiceException(\n            io.grpc.Status.INVALID_ARGUMENT.withDescription(e.getMessage)\n          ))\n\n      streamType match\n        // This server implementation only supports QUADS and GRAPHS streams... 
and in both cases\n        // it will always stream the same dataset.\n        // You can of course implement more complex logic here, e.g., to stream different data based on the topic.\n        case PhysicalStreamType.QUADS => EncoderSource.fromDatasetAsQuads(\n          dataset,\n          ByteSizeLimiter(16_000),\n          options\n        )\n        case PhysicalStreamType.GRAPHS => EncoderSource.fromDatasetAsGraphs(\n          dataset,\n          Some(ByteSizeLimiter(16_000)),\n          options\n        )\n        // PhysicalStreamType.TRIPLES is not supported here \u2013 the server will throw a gRPC error\n        // if the client requests it.\n        // This is an example of how to properly handle unsupported stream options requested by the client.\n        // The library is able to automatically convert the error into a gRPC status and send it back to the client.\n        case _ => Source.failed(new GrpcServiceException(\n          io.grpc.Status.INVALID_ARGUMENT.withDescription(\"Unsupported physical stream type\")\n        ))\n

The classes provided in jelly-grpc should cover most cases, but they serve only as boilerplate. You must define the logic for handling the incoming and outgoing streams yourself, as shown in the example above.

Of course, you can also implement the server or the client from scratch, if you want to.

"},{"location":"user/grpc/#see-also","title":"See also","text":"
  • Reactive streaming with Jelly-JVM
  • Useful utilities
    • Using Typesafe config to configure Jelly
"},{"location":"user/jena-cli/","title":"Apache Jena CLI tools","text":"

Jelly-JVM fully supports Apache Jena's command-line interface (CLI) utilities.

"},{"location":"user/jena-cli/#parsing","title":"Parsing","text":"

Jena will automatically detect Jelly files based on their extension (.jelly, .jelly.gz) and parse them. You can also manually set the --syntax option to jelly.

"},{"location":"user/jena-cli/#writing","title":"Writing","text":"

You can use Jelly as an output format for Jena's CLI utilities by specifying the --output or --stream options with the jelly format. We recommend using the --stream option for better performance.

Example: converting a Turtle file to Jelly

./riot --stream=jelly data.ttl > data.jelly\n

By default Jena will use the \"small, all features\" Jelly preset (name table: 128 entries, prefix table: 16, datatype table: 16, RDF-star enabled, generalized RDF enabled). There are a few reasons why you might want to change these serialization options:

  • Performance \u2013 for larger files, the small preset does not offer the best performance or compression ratio. It's better to use larger lookup tables.
  • Compatibility \u2013 if your data does not include RDF-star or generalized RDF, you can mark these features as disabled. Parsers will then know exactly what to expect in your data.

The following presets are available:

  • Small: 128 name table entries, 16 prefix table entries, 16 datatype table entries
    • SMALL_STRICT \u2013 RDF-star and generalized RDF disabled
    • SMALL_GENERALIZED \u2013 RDF-star disabled, generalized RDF enabled
    • SMALL_RDF_STAR \u2013 RDF-star enabled, generalized RDF disabled
    • SMALL_ALL_FEATURES \u2013 RDF-star and generalized RDF enabled (default)
  • Big: 4000 name table entries, 150 prefix table entries, 32 datatype table entries (recommended for larger files)
    • BIG_STRICT
    • BIG_GENERALIZED
    • BIG_RDF_STAR
    • BIG_ALL_FEATURES

To use one of these presets, use the --set CLI option with the https://ostrzyciel.eu/jelly/riot/symbols#preset symbol:

Example: converting a Turtle file to Jelly with a big preset (strict)

./riot --stream=jelly \\\n    --set=\"https://ostrzyciel.eu/jelly/riot/symbols#preset=BIG_STRICT\" \\\n    data.ttl > data.jelly\n

Example: dumping a TDB2 database to Jelly with a big preset (all features)

./tdb2.tdbdump --tdb=path/to/assembler.ttl \\\n    --set=\"https://ostrzyciel.eu/jelly/riot/symbols#preset=BIG_ALL_FEATURES\" \\\n    --stream=jelly > mydb.jelly\n
"},{"location":"user/jena-cli/#see-also","title":"See also","text":"
  • Installing Jelly with Jena
  • Jena CLI documentation
"},{"location":"user/jena/","title":"Apache Jena integration","text":"

This guide explains the functionalities of the jelly-jena module, which provides Jelly support for Apache Jena.

If you just want to add Jelly format support to Apache Jena / Apache Jena Fuseki, you can use the Jelly-JVM plugin JAR. See the dedicated guide for more information.

"},{"location":"user/jena/#base-facilities","title":"Base facilities","text":"

jelly-jena implements the eu.ostrzyciel.jelly.core.ConverterFactory trait in eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory . This factory allows you to build encoders and decoders that convert between Jelly's RdfStreamFrames and Apache Jena's Triple and Quad objects. The eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame class is an object representation of Jelly's binary format.

The module also implements the eu.ostrzyciel.jelly.core.IterableAdapter trait in eu.ostrzyciel.jelly.convert.jena.JenaIterableAdapter . This adapter provides extension methods for Apache Jena's Model, Dataset, Graph, and DatasetGraph classes to convert them into an iterable of triples (.asTriples), quads (.asQuads), or named graphs (.asGraphs). This is useful when working with Jelly on a lower level or when using the jelly-stream module.

"},{"location":"user/jena/#serialization-and-deserialization-with-riot","title":"Serialization and deserialization with RIOT","text":"

jelly-jena implements an RDF writer and reader for Apache Jena's RIOT library. This means you can use Jelly just like, for example, Turtle or RDF/XML. See the example below:

Example: JenaRiot.scala

Source code on GitHub

JenaRiot.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.*\nimport org.apache.jena.rdf.model.ModelFactory\nimport org.apache.jena.riot.{RDFDataMgr, RDFFormat, RDFParser, RDFWriterRegistry, RIOT}\n\nimport java.io.{File, FileOutputStream}\nimport scala.util.Using\n\n/**\n * Example of using Jelly's integration with Apache Jena's RIOT library for\n * writing and reading RDF graphs and datasets to/from disk.\n *\n * See also: https://jena.apache.org/documentation/io/\n */\nobject JenaRiot extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // Load the RDF graph from an N-Triples file\n    val model = RDFDataMgr.loadModel(File(getClass.getResource(\"/weather.nt\").toURI).toURI.toString)\n\n    // Print the size of the model\n    println(s\"Loaded an RDF graph from N-Triples with size: ${model.size}\")\n\n    Using.resource(new FileOutputStream(\"weather.jelly\")) { out =>\n      // Write the model to a Jelly file\n      // Note: by default this will use the [[JellyFormat.JELLY_SMALL_STRICT]] format variant\n      RDFDataMgr.write(out, model, JellyLanguage.JELLY)\n      println(\"Saved the model to a Jelly file\")\n    }\n\n    // Load the RDF graph from a Jelly file\n    val model2 = RDFDataMgr.loadModel(\"weather.jelly\", JellyLanguage.JELLY)\n\n    // Print the size of the model\n    println(s\"Loaded an RDF graph from Jelly with size: ${model2.size}\")\n\n\n\n    // ---------------------------------\n    println(\"\\n\")\n\n    // Try the same with an RDF dataset and some different settings\n    val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n    println(s\"Loaded an RDF dataset from a Trig file with ${dataset.asDatasetGraph.size} named graphs and \" +\n      s\"${dataset.asDatasetGraph.stream.count} quads\")\n\n    Using.resource(new FileOutputStream(\"weather-quads.jelly\")) { out =>\n      // Write the dataset to a 
Jelly file, using the \"BIG\" settings\n      // (better compression for big files, more memory usage)\n      RDFDataMgr.write(out, dataset, JellyFormat.JELLY_BIG_STRICT)\n      println(\"Saved the dataset to a Jelly file\")\n    }\n\n    // Load the RDF dataset from a Jelly file\n    val dataset2 = RDFDataMgr.loadDataset(\"weather-quads.jelly\", JellyLanguage.JELLY)\n    println(s\"Loaded an RDF dataset from Jelly with ${dataset2.asDatasetGraph.size} named graphs and \" +\n      s\"${dataset2.asDatasetGraph.stream.count} quads\")\n\n    // ---------------------------------\n    println(\"\\n\")\n\n    // Custom Jelly format \u2013 change any settings you like\n    val customFormat = new RDFFormat(\n      JellyLanguage.JELLY,\n      JellyFormatVariant(\n        opt = JellyOptions.smallStrict\n          .withMaxPrefixTableSize(0) // disable the prefix table\n          .withStreamName(\"My weather stream\"), // add metadata to the stream\n        frameSize = 16 // make RdfStreamFrames with 16 rows each\n      )\n    )\n\n    // Jena requires us to register the custom format \u2013 once for graphs and once for datasets,\n    // as Jelly supports both.\n    RDFWriterRegistry.register(customFormat, JellyGraphWriterFactory)\n    RDFWriterRegistry.register(customFormat, JellyDatasetWriterFactory)\n\n    Using.resource(new FileOutputStream(\"weather-quads-custom.jelly\")) { out =>\n      // Write the dataset to a Jelly file using the custom format\n      RDFDataMgr.write(out, dataset, customFormat)\n      println(\"Saved the dataset to a Jelly file with custom settings\")\n    }\n\n    // Load the RDF dataset from a Jelly file with the custom format\n    val dataset3 = RDFDataMgr.loadDataset(\"weather-quads-custom.jelly\", JellyLanguage.JELLY)\n    println(s\"Loaded an RDF dataset from Jelly with custom settings with ${dataset3.asDatasetGraph.size} named graphs\" +\n      s\" and ${dataset3.asDatasetGraph.stream.count} quads\")\n\n    // ---------------------------------\n 
   println(\"\\n\")\n\n    // By default, the parser has limits on, for example, the maximum size of the lookup tables.\n    // The default supported options are [[JellyOptions.defaultSupportedOptions]].\n    // You can change these limits by creating your own options object.\n    val customOptions = JellyOptions.defaultSupportedOptions\n      .withMaxNameTableSize(50) // set the maximum size of the name table to 50\n    // Create a Context object with the custom options\n    val parserContext = RIOT.getContext.copy()\n      .set(JellyLanguage.SYMBOL_SUPPORTED_OPTIONS, customOptions)\n\n    println(\"Trying to load the model with custom supported options...\")\n    val model3 = ModelFactory.createDefaultModel()\n    try\n      // The loading operation should fail because our allowed max name table size is too low\n      RDFParser.create()\n        .source(\"weather.jelly\")\n        .lang(JellyLanguage.JELLY)\n        // Set the context object with the custom options\n        .context(parserContext)\n        .parse(model3)\n    catch\n      case e: RdfProtoDeserializationError =>\n        // The stream uses a name table size of 128, which is larger than the maximum supported size of 50.\n        // To read this stream, set maxNameTableSize to at least 128 in the supportedOptions for this decoder.\n        println(s\"Failed to load the model with custom options: ${e.getMessage}\")\n

Usage notes:

  • eu.ostrzyciel.jelly.core.JellyOptions provides a few common presets for Jelly serialization options, which you can use to construct a JellyFormatVariant, as shown in the example above. You can also further customize the serialization options (e.g., dictionary size).
  • The RIOT writer (serializer) integration implements only the delimited variant of Jelly. It is used for writing Jelly to files on disk or sockets. Because of this, you cannot use RIOT to write non-delimited Jelly data (e.g., a single message to a Kafka stream). For this, you should use the jelly-stream module or the more low-level API: Low-level usage.
  • However, the RIOT parser (deserializer) integration will automatically detect if the parsed Jelly data is delimited or not. If it's non-delimited, the parser will assume that there is only one RdfStreamFrame in the file.
  • Jelly's parsers and writers are registered in the eu.ostrzyciel.jelly.convert.jena.riot.JellyLanguage object (source code). This registration should happen automatically when you include the jelly-jena module in your project, using Jena's component initialization mechanism.
"},{"location":"user/jena/#streaming-serialization-with-riot","title":"Streaming serialization with RIOT","text":"

jelly-jena also implements a streaming writer (StreamRDF API in Jena). Using it is similar to the regular RIOT writer, with a slightly different setup:

Example: JenaRiotStreaming.scala

Source code on GitHub

JenaRiotStreaming.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.PhysicalStreamType\nimport org.apache.jena.graph.{NodeFactory, Triple}\nimport org.apache.jena.riot.system.{StreamRDFLib, StreamRDFWriter}\nimport org.apache.jena.riot.{RDFDataMgr, RDFParser, RIOT}\n\nimport java.io.{File, FileOutputStream}\nimport scala.util.Using\n\n/**\n * Example of using Apache Jena's streaming IO API with Jelly.\n *\n * See also: https://jena.apache.org/documentation/io/streaming-io.html\n */\nobject JenaRiotStreaming extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // Initialize a Jena StreamRDF to consume the statements\n    val readerStream = StreamRDFLib.count()\n\n    println(\"Reading a stream of triples from a Jelly file...\")\n\n    // Parse a Jelly file as a stream of triples\n    val inputFileTriples = new File(getClass.getResource(\"/jelly/weather.jelly\").toURI)\n    RDFParser\n      .source(inputFileTriples.toURI.toString)\n      .lang(JellyLanguage.JELLY)\n      .parse(readerStream)\n\n    println(f\"Read ${readerStream.countTriples()} triples\")\n    println()\n    println(\"Reading a stream of quads from a Jelly file...\")\n\n    // Parse a different Jelly file as a stream of quads and send it to the same sink\n    val inputFileQuads = new File(getClass.getResource(\"/jelly/weather-quads.jelly\").toURI)\n    RDFParser\n      .source(inputFileQuads.toURI.toString)\n      .lang(JellyLanguage.JELLY)\n      .parse(readerStream)\n\n    // Print the number of triples and quads\n    //\n    // The number of triples here is the sum of the triples from the first file and the triples\n    // in the default graph of the second file. 
This is just how Jena handles it.\n    println(f\"Read ${readerStream.countTriples()} triples (in total)\" +\n      f\" and ${readerStream.countQuads()} quads\")\n\n    // -------------------------------------\n    println(\"\\n\")\n\n    println(\"Writing a stream of 10 triples to a file...\")\n\n    // Try writing some triples to a file\n    // We need to create an instance of RdfStreamOptions to pass to the writer:\n    val options = JellyOptions.smallStrict\n      // The stream writer does not know if we will be writing triples or quads \u2013 we\n      // have to specify the physical stream type explicitly.\n      .withPhysicalType(PhysicalStreamType.TRIPLES)\n      .withStreamName(\"A stream of 10 triples\")\n\n    // To pass the options, we use Jena's Context mechanism\n    val context = RIOT.getContext.copy()\n      .set(JellyLanguage.SYMBOL_STREAM_OPTIONS, options)\n      .set(JellyLanguage.SYMBOL_FRAME_SIZE, 128) // optional, default is 256\n\n    Using.resource(new FileOutputStream(\"stream-riot.jelly\")) { out =>\n      // Create the writer \u2013 remember to pass the context!\n      val writerStream = StreamRDFWriter.getWriterStream(out, JellyLanguage.JELLY, context)\n      writerStream.start()\n\n      for i <- 1 to 10 do\n        writerStream.triple(Triple.create(\n          NodeFactory.createBlankNode(),\n          NodeFactory.createURI(\"https://example.org/p\"),\n          NodeFactory.createLiteralString(s\"object $i\")\n        ))\n\n      writerStream.finish()\n    }\n\n    println(\"Done writing triples\")\n\n    // Load the RDF graph that we just saved using normal RIOT API\n    val model = RDFDataMgr.loadModel(\"stream-riot.jelly\", JellyLanguage.JELLY)\n\n    println(\"Loaded the stream from disk, contents:\\n\")\n    model.write(System.out, \"NT\")\n
"},{"location":"user/jena/#see-also","title":"See also","text":"
  • Useful utilities
  • Reactive streaming with Jelly-JVM
  • Using Jelly with Jena's CLI tools
"},{"location":"user/low-level/","title":"Low-level usage","text":"

Warning

This page describes a low-level API that is a bit of a hassle to use directly. It's recommended to use the higher-level abstractions provided by the jelly-stream module, or the integrations with Apache Jena's RIOT or RDF4J's Rio libraries. If you really want to use this, it is highly recommended that you first get a basic understanding of how Jelly works under the hood and take a look at the code in the jelly-stream module to see how it's done there.

Note

The following guide uses the Apache Jena library as an example. The exact same thing can be done with RDF4J or any other RDF library that has a Jelly integration.

"},{"location":"user/low-level/#deserialization","title":"Deserialization","text":"

To parse a serialized stream frame into triples/quads:

  1. Call eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame.parseFrom if it's a non-delimited frame (like you would see, e.g., in a Kafka or gRPC stream), or parseDelimitedFrom if it's a delimited stream (like you would see in a file or a socket).
    • There is also a utility method to detect if the stream is delimited or not: eu.ostrzyciel.jelly.core.IoUtils.autodetectDelimiting . In most cases you will not need to use it. It is used internally by the Jena and RDF4J integrations for user convenience.
  2. Obtain a decoder that turns RdfStreamFrames into triples/quads: eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory has different methods for different physical stream types:
    • anyStatementDecoder for any physical stream type, outputs Triple or Quad
    • triplesDecoder for TRIPLES streams, outputs Triple
    • quadsDecoder for QUADS streams, outputs Quad
    • graphsDecoder for GRAPHS streams, outputs (Node, Iterable[Triple])
    • graphsAsQuadsDecoder for GRAPHS streams, outputs Quad
  3. For each row in the frame, call the decoder's ingestRow method to get the output iteratively.
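
The three steps above can be sketched as follows for a delimited stream, such as a file. This is a minimal sketch, not a definitive implementation: the helper name decodeDelimited is hypothetical, and it assumes that anyStatementDecoder can be called without arguments (falling back to the default supported options) and that ingestRow returns an Option holding the decoded statement, if the row produced one.

```scala
import eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory
import eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame

import java.io.InputStream

// Hypothetical helper: reads a delimited Jelly stream to the end
// and collects every decoded statement (Jena Triple or Quad objects).
def decodeDelimited(in: InputStream) =
  // Step 2: obtain a decoder that accepts any physical stream type.
  // Assumption: the no-argument call uses the default supported options.
  val decoder = JenaConverterFactory.anyStatementDecoder()
  Iterator
    // Step 1: parse frames one by one; None signals the end of the stream
    .continually(RdfStreamFrame.parseDelimitedFrom(in))
    .takeWhile(_.nonEmpty)
    .flatten
    // Step 3: feed every row of every frame to the decoder.
    // Assumption: ingestRow returns an Option, empty for rows that only
    // update the decoder's state (options, lookup table entries).
    .flatMap(frame => frame.rows)
    .flatMap(row => decoder.ingestRow(row))
    .toSeq
```

For a single non-delimited frame (e.g., one Kafka message), the parsing step would use RdfStreamFrame.parseFrom instead; the rest stays the same.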
"},{"location":"user/low-level/#serialization","title":"Serialization","text":"

To serialize triples/quads into a stream frame:

  1. If you want to serialize an RDF graph/dataset, transform them first into triples/quads in an iterable form. Use the asTriples/asQuads/asGraphs extension methods provided by the eu.ostrzyciel.jelly.convert.jena.JenaIterableAdapter object.
  2. Obtain an encoder that turns triples/quads into RdfStreamRows (the rows of a stream frame): use the eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory.encoder method to get an instance of eu.ostrzyciel.jelly.convert.jena.JenaProtoEncoder .
  3. Call the encoder's methods to add quads, triples, or named graphs to the stream frame.
    • Note that YOU are responsible for sticking to a specific physical stream type. For example, you should not mix triples with quads. It is highly recommended that you first read up on the available stream types in Jelly.
    • You are also responsible for setting the appropriate stream options with proper stream types. See the guide on Jelly options presets for more information.
  4. The encoder will return batches of rows. You are responsible for grouping those rows logically into RdfStreamFrames. What you do here depends highly on the logical stream type you are working with.
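
As a rough illustration of the steps above, the sketch below writes a collection of Jena triples as a delimited flat TRIPLES stream. Several things here are assumptions rather than confirmed API: the helper name encodeTriples and its frameSize parameter are hypothetical, JenaConverterFactory.encoder is assumed to accept an RdfStreamOptions object directly, and the encoder method is assumed to be called addTripleStatement and to return the rows produced for that triple. Check the API of your Jelly-JVM version before relying on it.

```scala
import eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory
import eu.ostrzyciel.jelly.core.JellyOptions
import eu.ostrzyciel.jelly.core.proto.v1.{PhysicalStreamType, RdfStreamFrame}
import org.apache.jena.graph.Triple

import java.io.OutputStream

// Hypothetical helper: encodes triples and writes them as delimited frames,
// with at most frameSize triples per frame.
def encodeTriples(triples: Iterable[Triple], out: OutputStream, frameSize: Int = 256): Unit =
  // We only ever add triples, so the physical stream type is TRIPLES
  val options = JellyOptions.smallStrict
    .withPhysicalType(PhysicalStreamType.TRIPLES)
  // Assumption: encoder(options) returns the Jena encoder for these options
  val encoder = JenaConverterFactory.encoder(options)
  triples.grouped(frameSize).foreach { batch =>
    // Assumption: addTripleStatement returns the rows for one triple,
    // including any lookup table entries that had to be emitted first
    val rows = batch.flatMap(t => encoder.addTripleStatement(t)).toSeq
    // Group the rows into a frame and write it in the delimited variant
    RdfStreamFrame(rows = rows).writeDelimitedTo(out)
  }
```

Grouping by a fixed number of triples is only one possible framing policy. For grouped logical stream types, a frame would typically correspond to one graph or dataset instead.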
"},{"location":"user/low-level/#see-also","title":"See also","text":"
  • Useful utilities
  • Reactive streaming with Jelly-JVM
  • Implementing Jelly-JVM for a new RDF library
"},{"location":"user/rdf4j/","title":"RDF4J integration","text":"

This guide explains the functionalities of the jelly-rdf4j module, which provides Jelly support for Eclipse RDF4J.

If you just want to add Jelly format support to your RDF4J application, you can use the Jelly-JVM plugin JAR. See the dedicated guide for more information.

"},{"location":"user/rdf4j/#base-facilities","title":"Base facilities","text":"

jelly-rdf4j implements the eu.ostrzyciel.jelly.core.ConverterFactory trait in eu.ostrzyciel.jelly.convert.rdf4j.Rdf4jConverterFactory . This factory allows you to build encoders and decoders that convert between Jelly's RdfStreamFrames and RDF4J's Statement objects. The eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame class is an object representation of Jelly's binary format.

The module also implements the eu.ostrzyciel.jelly.core.IterableAdapter trait in eu.ostrzyciel.jelly.convert.rdf4j.Rdf4jIterableAdapter . This adapter provides extension methods for RDF4J's Model class to convert it into an iterable of triples (.asTriples), quads (.asQuads), or named graphs (.asGraphs). This is useful when working with Jelly on a lower level or when using the jelly-stream module.

"},{"location":"user/rdf4j/#serialization-and-deserialization-with-rdf4j-rio","title":"Serialization and deserialization with RDF4J Rio","text":"

jelly-rdf4j implements an RDF writer and parser for Eclipse RDF4J's Rio library. This means you can use Jelly just like any other RDF serialization format (e.g., RDF/XML, Turtle). See the example below:

Example: Rdf4jRio.scala

Source code on GitHub

Rdf4jRio.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.rdf4j.rio.*\nimport eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.{PhysicalStreamType, RdfStreamOptions}\nimport org.eclipse.rdf4j.model.Statement\nimport org.eclipse.rdf4j.rio.helpers.StatementCollector\nimport org.eclipse.rdf4j.rio.{RDFFormat, Rio}\n\nimport java.io.{File, FileOutputStream}\nimport scala.jdk.CollectionConverters.*\nimport scala.util.Using\n\n/**\n * Example of using RDF4J's Rio library to read and write RDF data.\n *\n * See also: https://rdf4j.org/documentation/programming/rio/\n */\nobject Rdf4jRio extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // Load the RDF graph from an N-Triples file\n    val inputFile = File(getClass.getResource(\"/weather.nt\").toURI)\n    val triples = readRdf4j(inputFile, RDFFormat.TURTLE, None)\n\n    // Print the size of the graph\n    println(s\"Loaded ${triples.size} triples from an N-Triples file\")\n\n    // Write the RDF graph to a Jelly file\n    // First, create the stream's options:\n    val options = JellyOptions.smallStrict\n      // Setting the physical stream type is mandatory! 
It will always be either TRIPLES or QUADS.\n      .withPhysicalType(PhysicalStreamType.TRIPLES)\n      // Set other optional options\n      .withStreamName(\"My weather data\")\n    // Create the config object to pass to the writer\n    val config = JellyWriterSettings.configFromOptions(options, frameSize = 128)\n\n    // Do the actual writing\n    Using.resource(new FileOutputStream(\"weather.jelly\")) { out =>\n      val writer = Rio.createWriter(JELLY, out)\n      writer.setWriterConfig(config)\n      writer.startRDF()\n      triples.foreach(writer.handleStatement)\n      writer.endRDF()\n    }\n\n    println(\"Saved the model to a Jelly file\")\n\n    // Load the RDF graph from the Jelly file\n    val jellyFile = File(\"weather.jelly\")\n    val jellyTriples = readRdf4j(jellyFile, JELLY, None)\n\n    // Print the size of the graph\n    println(s\"Loaded ${jellyTriples.size} triples from a Jelly file\")\n\n    // ---------------------------------\n    println(\"\\n\")\n    // By default, the parser has limits on for example the maximum size of the lookup tables.\n    // The default supported options are [[JellyOptions.defaultSupportedOptions]].\n    // You can change these limits by creating your own options object.\n    val customOptions = JellyOptions.defaultSupportedOptions\n      .withMaxPrefixTableSize(10) // set the maximum size of the prefix table to 10\n    println(\"Trying to read the Jelly file with custom options...\")\n    try\n      // This operation should fail because the Jelly file uses a prefix table larger than 10\n      val customTriples = readRdf4j(jellyFile, JELLY, Some(customOptions))\n    catch\n      case e: RdfProtoDeserializationError =>\n        // The stream uses a prefix table size of 16, which is larger than the maximum supported size of 10.\n        // To read this stream, set maxPrefixTableSize to at least 16 in the supportedOptions for this decoder.\n        println(s\"Failed to read the Jelly file with custom options: 
${e.getMessage}\")\n\n\n  /**\n   * Helper function to read RDF data using RDF4J's Rio library.\n   * @param file file to read from\n   * @param format RDF format\n   * @param supportedOptions supported options for reading Jelly streams (optional)\n   * @return sequence of RDF statements\n   */\n  private def readRdf4j(file: File, format: RDFFormat, supportedOptions: Option[RdfStreamOptions]): Seq[Statement] =\n    val parser = Rio.createParser(format)\n    val collector = new StatementCollector()\n    parser.setRDFHandler(collector)\n    supportedOptions.foreach(opt =>\n      // If the user provided supported options, set them on the parser\n      parser.setParserConfig(JellyParserSettings.configFromOptions(opt))\n    )\n    Using.resource(file.toURI.toURL.openStream()) { is =>\n      parser.parse(is)\n    }\n    collector.getStatements.asScala.toSeq\n

Usage notes:

  • eu.ostrzyciel.jelly.core.JellyOptions provides a few common presets for Jelly serialization options. These options are passed through eu.ostrzyciel.jelly.convert.rdf4j.rio.JellyWriterSettings.configFromOptions and used to configure the writer, as shown in the example above. You can also further customize the serialization options (e.g., dictionary size).
  • The RDF4J Rio writer (serializer) integration implements only the delimited variant of Jelly. It is used for writing Jelly to files on disk or sockets. Because of this, you cannot use Rio to write non-delimited Jelly data (e.g., a single message to a Kafka stream). For this, you should use the jelly-stream module or the more low-level API: Low-level usage.
  • However, the Rio parser (deserializer) integration will automatically detect if the parsed Jelly data is delimited or not. If it's non-delimited, the parser will assume that there is only one RdfStreamFrame in the file.
  • Jelly's parsers and writers are in the eu.ostrzyciel.jelly.convert.rdf4j.rio package (source code). They are automatically registered on startup using the RDFParserFactory and RDFWriterFactory SPIs provided by RDF4J.
"},{"location":"user/rdf4j/#see-also","title":"See also","text":"
  • Useful utilities
  • Reactive streaming with Jelly-JVM
"},{"location":"user/reactive/","title":"User guide \u2013 reactive streaming","text":"

This guide explains the reactive streaming functionalities of the jelly-stream module.

Prerequisites

If you are unfamiliar with the concept of reactive streams or Apache Pekko Streams, we highly recommend that you start by reading about the basic concepts of Pekko Streams.

We also recommend you first read about the RDF stream types in Jelly. Otherwise, this guide may not make much sense.

You can use jelly-stream with any RDF library that has a Jelly integration, such as Apache Jena (using jelly-jena) or RDF4J (using jelly-rdf4j). The streaming API is generic and identical across all libraries.

"},{"location":"user/reactive/#basic-concepts","title":"Basic concepts","text":"

The key notions of this API are encoders and decoders.

  • An encoder turns objects from your RDF library of choice (e.g., Triple in Apache Jena) into an object representation of Jelly's binary format (RdfStreamFrame).
  • A decoder does the opposite: it turns RdfStreamFrames into objects from your RDF library of choice.

So, for example, an encoder flow for flat triple streams would have a type of Flow[Triple, RdfStreamFrame, NotUsed] in Apache Jena. The opposite (a flat triple stream decoder) would have a type of Flow[RdfStreamFrame, Triple, NotUsed].
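
To make the decoder type concrete, here is a minimal sketch using Apache Jena. DecoderFlow.decodeTriples.asFlatTripleStream is the same call that appears in the other Jelly-JVM examples; the explicit type annotation is added here only for illustration, and it is an assumption that the compiler resolves the triple type from the imported Jena givens in exactly this form.

```scala
import eu.ostrzyciel.jelly.convert.jena.given
import eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame
import eu.ostrzyciel.jelly.stream.*
import org.apache.jena.graph.Triple
import org.apache.pekko.NotUsed
import org.apache.pekko.stream.scaladsl.Flow

// A flat triple stream decoder: RdfStreamFrames in, Jena Triples out.
// The Jena-specific output type comes from the `given` import above.
val tripleDecoder: Flow[RdfStreamFrame, Triple, NotUsed] =
  DecoderFlow.decodeTriples.asFlatTripleStream
```

Such a flow can be attached to any Source of RdfStreamFrames with .via(tripleDecoder). Switching to RDF4J would change only the given import and the element type.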

RdfStreamFrames can be converted to and from raw bytes using a range of methods, depending on your use case. See the sections below for examples.

"},{"location":"user/reactive/#encoding-a-single-rdf-graph-or-dataset-as-a-flat-stream-encodersource","title":"Encoding a single RDF graph or dataset as a flat stream (EncoderSource)","text":"

The easiest way to start is with flat RDF streams (i.e., flat streams of triples or quads). You can convert an RDF dataset or graph into such a stream using the methods in eu.ostrzyciel.jelly.stream.EncoderSource.

Example: PekkoStreamsEncoderSource.scala

Source code on GitHub

PekkoStreamsEncoderSource.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.EncoderSource]] utility to convert RDF graphs and datasets\n * into Jelly streams with a single method call.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsEncoderSource extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // We will need a Pekko actor system to run the streams\n    given actorSystem: ActorSystem = ActorSystem()\n    // And an execution context for the futures\n    given ExecutionContext = actorSystem.getDispatcher\n\n    // Load an example RDF graph from an N-Triples file\n    val model = RDFDataMgr.loadModel(File(getClass.getResource(\"/weather.nt\").toURI).toURI.toString)\n\n    println(s\"Loaded model with ${model.size()} triples\")\n    println(s\"Streaming the model to memory...\")\n\n    // Create a Pekko Streams Source from the Jena model\n    // This automatically sets the physical and logical stream types.\n    val encodedModelFuture = EncoderSource\n      .fromGraph(\n        model,\n        // Aim for frames with ~2000 bytes \u2013 may be more!\n        ByteSizeLimiter(2000),\n        JellyOptions.smallStrict,\n      )\n      // wireTap: print the size of the frames\n      // Notice in the output that the frames are slightly bigger than 2000 bytes.\n      .wireTap(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes on wire\"))\n      // 
Convert each stream frame to bytes\n      .via(JellyIo.toBytes)\n      // Collect the stream into a sequence\n      .runWith(Sink.seq)\n\n    // Wait for the stream to complete and collect the result\n    val encodedModel = Await.result(encodedModelFuture, 10.seconds)\n\n    println(s\"Streamed model to memory with ${encodedModel.size} frames and\" +\n      s\" ${encodedModel.map(_.length).sum} bytes on wire\")\n\n    println(\"\\n\")\n\n    // -------------------------------------------------------------------\n    // Second example: try encoding an RDF dataset as a GRAPHS stream\n    val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n    println(s\"Loaded dataset with ${dataset.asDatasetGraph.size} named graphs\")\n    println(s\"Streaming the dataset to memory...\")\n\n    val encodedDatasetFuture = EncoderSource\n      // Here we stream this as a GRAPHS stream (physical type)\n      // You can also use .fromDatasetAsQuads to stream as QUADS\n      .fromDatasetAsGraphs(\n        dataset,\n        // This time we limit the number of rows in each frame to 30\n        // Note that for this particular encoder, we can skip the limiter entirely \u2013 but this can lead to huge frames!\n        // So, be careful with that, or you may get an out-of-memory error.\n        Some(StreamRowCountLimiter(30)),\n        JellyOptions.smallStrict,\n      )\n      // wireTap: print the size of the frames\n      // Note that some frames are smaller than the limit \u2013 this is because this encoder will always split frames\n      // on graph boundaries.\n      .wireTap(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes on wire\"))\n      // Convert each stream frame to bytes\n      .via(JellyIo.toBytes)\n      // Collect the stream into a sequence\n      .runWith(Sink.seq)\n\n    // Wait for the stream to complete and collect the result\n    val encodedDataset = 
Await.result(encodedDatasetFuture, 10.seconds)\n\n    println(s\"Streamed dataset to memory with ${encodedDataset.size} frames and\" +\n      s\" ${encodedDataset.map(_.length).sum} bytes on wire\")\n\n    actorSystem.terminate()\n
"},{"location":"user/reactive/#encoding-any-rdf-data-as-a-flat-or-grouped-stream-encoderflow","title":"Encoding any RDF data as a flat or grouped stream (EncoderFlow)","text":"

The eu.ostrzyciel.jelly.stream.EncoderFlow provides even more options for turning RDF data into Jelly streams, including both grouped and flat streams. Every type of RDF stream in Jelly can be created using this API.

Example: PekkoStreamsEncoderFlow.scala

Source code on GitHub

PekkoStreamsEncoderFlow.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.EncoderFlow]] utility to encode RDF data as Jelly streams.\n * \n * Here, the RDF data is turned into a series of byte buffers, with each buffer corresponding to exactly one frame.\n * This is suitable if your streaming protocol (e.g., Kafka, MQTT, AMQP) already frames the messages.\n * If you are writing to a raw socket or file, then you must use the DELIMITED variant of Jelly instead.\n * See [[eu.ostrzyciel.jelly.examples.PekkoStreamsWithIo]] for examples of that.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsEncoderFlow extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // We will need a Pekko actor system to run the streams\n    given actorSystem: ActorSystem = ActorSystem()\n    // And an execution context for the futures\n    given ExecutionContext = actorSystem.getDispatcher\n\n    // Load the example dataset\n    val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n    // First, let's see what views of the dataset can we obtain using Jelly's Iterable adapters:\n    // 1. Iterable of all quads in the dataset\n    val quads: immutable.Iterable[Quad] = dataset.asQuads\n    // 2. 
Iterable of all graphs (named and default) in the dataset\n    val graphs: immutable.Iterable[(Node, Iterable[Triple])] = dataset.asGraphs\n    // 3. Iterable of all triples in the default graph\n    val triples: immutable.Iterable[Triple] = dataset.getDefaultModel.asTriples\n\n    // Note: here we are not turning the frames into bytes, but just printing their size in bytes.\n    // You can find an example of how to turn a frame into a byte array in the `PekkoStreamsEncoderSource` example.\n    // This is done with: .via(JellyIo.toBytes)\n\n    // Let's try encoding this as flat RDF streams (streams of triples or quads)\n    // https://w3id.org/stax/ontology#flatQuadStream\n    println(f\"Encoding ${quads.size} quads as a flat RDF quad stream\")\n    val flatQuadsFuture = Source(quads)\n      .via(EncoderFlow.flatQuadStream(\n        // This encoder requires a size limiter \u2013 otherwise a stream frame could have infinite length!\n        StreamRowCountLimiter(20),\n        JellyOptions.smallStrict,\n      ))\n      .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n    Await.ready(flatQuadsFuture, 10.seconds)\n\n    // https://w3id.org/stax/ontology#flatTripleStream\n    println(f\"\\n\\nEncoding ${triples.size} triples as a flat RDF triple stream\")\n    val flatTriplesFuture = Source(triples)\n      .via(EncoderFlow.flatTripleStream(\n        // This encoder requires a size limiter \u2013 otherwise a stream frame could have infinite length!\n        ByteSizeLimiter(500),\n        JellyOptions.smallStrict,\n      ))\n      .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n    Await.ready(flatTriplesFuture, 10.seconds)\n\n    // We can also stream already grouped triples or quads \u2013 for example, if your system generates batches of\n    // N triples, you can just send those batches straight to be encoded, with one batch = one 
stream frame.\n    // https://w3id.org/stax/ontology#flatQuadStream\n    println(f\"\\n\\nEncoding ${quads.size} quads as a flat RDF quad stream, grouped in batches of 10\")\n    // First, group the quads into batches of 10\n    val groupedQuadsFuture = Source.fromIterator(() => quads.grouped(10))\n      .via(EncoderFlow.flatQuadStreamGrouped(\n        // Do not use a size limiter here \u2013 we want exactly one batch in each frame\n        None,\n        JellyOptions.smallStrict,\n      ))\n      .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n    Await.ready(groupedQuadsFuture, 10.seconds)\n\n    // Now, let's try grouped streams. Let's say we want to stream all graphs in a dataset, but put exactly one\n    // graph in each frame (message). This is very common in (for example) IoT systems.\n    // https://w3id.org/stax/ontology#namedGraphStream\n    println(f\"\\n\\nEncoding ${graphs.size} graphs as a named graph stream\")\n    val namedGraphsFuture = Source(graphs)\n      .via(EncoderFlow.namedGraphStream(\n        // Do not use a size limiter here \u2013 we want exactly one graph in each frame\n        None,\n        JellyOptions.smallStrict,\n      ))\n      // Note that we will see exactly as many frames as there are graphs in the dataset\n      .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n    Await.ready(namedGraphsFuture, 10.seconds)\n\n    // As a last example, we will stream a series of RDF graphs. In our case this will be just the default graph\n    // repeated a few times. 
This type of stream is also pretty common in practical applications.\n    // https://w3id.org/stax/ontology#graphStream\n    println(f\"\\n\\nEncoding 5 RDF graphs as a graph stream\")\n    val graphsFuture = Source.repeat(triples)\n      .take(5)\n      .via(EncoderFlow.graphStream(\n        // Do not use a size limiter here \u2013 we want exactly one graph in each frame\n        None,\n        JellyOptions.smallStrict,\n      ))\n      // Note that we will see exactly 5 frames \u2013 the number of graphs we streamed\n      .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n    Await.ready(graphsFuture, 10.seconds)\n\n    actorSystem.terminate()\n
"},{"location":"user/reactive/#decoding-rdf-streams-decoderflow","title":"Decoding RDF streams (DecoderFlow)","text":"

The eu.ostrzyciel.jelly.stream.DecoderFlow provides methods for decoding flat and grouped streams. There is no decoding counterpart to EncoderSource, though. Such a utility would have to construct an RDF graph or dataset from statements, and that process varies a lot between applications, so you will have to do this part yourself.

Example: PekkoStreamsDecoderFlow.scala

Source code on GitHub

PekkoStreamsDecoderFlow.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.query.Dataset\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.DecoderFlow]] utility to turn incoming Jelly streams\n * into usable RDF data.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsDecoderFlow extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // We will need a Pekko actor system to run the streams\n    given actorSystem: ActorSystem = ActorSystem()\n    // And an execution context for the futures\n    given ExecutionContext = actorSystem.getDispatcher\n\n    // Load the example dataset\n    val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n    // To decode something, we first need to encode it...\n    // See [[PekkoStreamsEncoderFlow]] and [[PekkoStreamsEncoderSource]] for an explanation of what is happening here.\n    // We have three sequences of byte arrays, with each byte array corresponding to one encoded stream frame:\n    // - encodedQuads: a flat RDF quad stream, physical type: QUADS\n    // - encodedTriples: a flat RDF triple stream, physical type: TRIPLES\n    // - encodedGraphs: a flat RDF quad stream, physical type: GRAPHS\n    val (encodedQuads, encodedTriples, encodedGraphs) = getEncodedData(dataset)\n\n    // Now we 
can decode the encoded data back into something useful.\n    // Let's start by simply decoding the quads as a flat RDF quad stream:\n    println(\"Decoding quads as a flat RDF quad stream...\")\n    val decodedQuadsFuture = Source(encodedQuads)\n      // We need to parse the bytes into a Jelly stream frame\n      .via(JellyIo.fromBytes)\n      // And then decode the frame into Jena quads.\n      // We use \"decodeQuads\" because the physical stream type is QUADS.\n      // And then we want to treat it as a flat RDF quad stream, so we call \"asFlatQuadStreamStrict\".\n      // We use the \"Strict\" method to tell the decoder to check if the incoming logical stream type is the same\n      // as we are expecting: flat RDF quad stream.\n      .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n      .runWith(Sink.seq)\n\n    val decodedQuads: Seq[Quad] = Await.result(decodedQuadsFuture, 10.seconds)\n    println(s\"Decoded ${decodedQuads.size} quads.\")\n\n    // We can also treat each stream frame as a separate dataset. 
This way we would get an\n    // RDF dataset stream.\n    println(f\"\\n\\nDecoding quads as an RDF dataset stream from ${encodedQuads.size} frames...\")\n    val decodedDatasetFuture = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      // Note that we cannot use the strict variant (asDatasetStreamOfQuadsStrict) here, because the stream says its\n      // logical type is flat RDF quad stream.\n      .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuads)\n      .runWith(Sink.seq)\n\n    val decodedDatasets: Seq[IterableOnce[Quad]] = Await.result(decodedDatasetFuture, 10.seconds)\n    println(s\"Decoded ${decodedDatasets.size} datasets with\" +\n      s\" ${decodedDatasets.map(_.iterator.size).sum} quads in total.\")\n\n    // If we tried that with the strict variant, we would get an exception:\n    println(f\"\\n\\nDecoding quads as an RDF dataset stream with strict logical type handling...\")\n    val future = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuadsStrict)\n      .runWith(Sink.seq)\n    Await.result(future.recover {\n      // eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n      // Expected logical stream type LOGICAL_STREAM_TYPE_DATASETS, got LOGICAL_STREAM_TYPE_FLAT_QUADS.\n      // LOGICAL_STREAM_TYPE_FLAT_QUADS is not a subtype of LOGICAL_STREAM_TYPE_DATASETS.\n      case e: Exception => println(e.getCause)\n    }, 10.seconds)\n\n    // We can also pass entirely custom supported options to the decoder, instead of the defaults\n    // (see [[JellyOptions.defaultSupportedOptions]]). 
This is useful if we want to decode a stream with\n    // for example very large lookup tables or we want to put stricter limits on the streams that we accept.\n    println(f\"\\n\\nDecoding quads as an RDF dataset stream with custom supported options...\")\n    val customSupportedOptions = JellyOptions.defaultSupportedOptions\n      .withMaxNameTableSize(50) // This is too small for the stream we are decoding\n    val customSupportedOptionsFuture = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuads(customSupportedOptions))\n      .runWith(Sink.seq)\n    Await.result(customSupportedOptionsFuture.recover {\n      // eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n      // The stream uses a name table size of 128, which is larger than the maximum supported size of 50.\n      // To read this stream, set maxNameTableSize to at least 128 in the supportedOptions for this decoder.\n      case e: Exception => println(e.getCause)\n    }, 10.seconds)\n\n    // Flat RDF triple stream\n    println(f\"\\n\\nDecoding triples as a flat RDF triple stream...\")\n    val decodedTriplesFuture = Source(encodedTriples)\n      .via(JellyIo.fromBytes)\n      .via(DecoderFlow.decodeTriples.asFlatTripleStreamStrict)\n      .runWith(Sink.seq)\n\n    val decodedTriples: Seq[Triple] = Await.result(decodedTriplesFuture, 10.seconds)\n    println(s\"Decoded ${decodedTriples.size} triples.\")\n\n    // We can interpret the GRAPHS stream in a few ways, see\n    // [[eu.ostrzyciel.jelly.stream.DecoderFlow.GraphsIngestFlowOps]] for more details.\n    // Here we will treat it as an RDF named graph stream.\n    println(f\"\\n\\nDecoding graphs as an RDF named graph stream...\")\n    val decodedGraphsFuture = Source(encodedGraphs)\n      .via(JellyIo.fromBytes)\n      // Non-strict because the original logical stream type is flat RDF quad stream.\n      .via(DecoderFlow.decodeGraphs.asNamedGraphStream)\n      
.runWith(Sink.seq)\n\n    val decodedGraphs: Seq[(Node, Iterable[Triple])] = Await.result(decodedGraphsFuture, 10.seconds)\n    println(s\"Decoded ${decodedGraphs.size} graphs.\")\n\n    // If we tried using a decoder for a physical stream type that does not match the type of the stream,\n    // we would get an exception. Here let's try to decode a QUADS stream with a TRIPLES decoder.\n    println(f\"\\n\\nDecoding quads as a flat RDF triple stream...\")\n    val future2 = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      // Note the \"decodeTriples\" here\n      .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n      .runWith(Sink.seq)\n    Await.result(future2.recover {\n      // eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n      // Incoming stream type is not TRIPLES.\n      case e: Exception => println(e.getCause)\n    }, 10.seconds)\n\n    // We can get around this by using the \"decodeAny\" method, which will pick the appropriate decoder\n    // based on the stream options in the stream.\n    // In this case we can only ask the decoder to output a flat or grouped RDF stream.\n    println(f\"\\n\\nDecoding quads as a flat RDF stream using decodeAny...\")\n    val decodedAnyFuture = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      // There is no strict variant at all for decodeAny, as we don't care about the stream type anyway.\n      .via(DecoderFlow.decodeAny.asFlatStream)\n      .runWith(Sink.seq)\n\n    val decodedAny: Seq[Triple | Quad] = Await.result(decodedAnyFuture, 10.seconds)\n    println(s\"Decoded ${decodedAny.size} statements.\")\n\n    // One last trick up our sleeves is the snoopStreamOptions method, which allows us to inspect the stream options\n    // and carry on with the decoding as normal.\n    // In this case, we will reuse the first example (flat RDF quad stream) and snoop the stream options.\n    println(f\"\\n\\nSnooping the stream options of the first frame while decoding a flat RDF quad 
stream...\")\n    val snoopFuture = Source(encodedQuads)\n      .via(JellyIo.fromBytes)\n      // We add a .viaMat here to capture the materialized value of this stage.\n      .viaMat(DecoderFlow.snoopStreamOptions)(Keep.right)\n      .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n      .toMat(Sink.seq)(Keep.both)\n      .run()\n\n    val streamOptions = Await.result(snoopFuture._1, 10.seconds)\n    val decodedQuads2 = Await.result(snoopFuture._2, 10.seconds)\n\n    val streamOptionsIndented = (\"\\n\" + streamOptions.get.toProtoString.strip).replace(\"\\n\", \"\\n  \")\n    println(s\"Stream options: $streamOptionsIndented\")\n    println(s\"Decoded ${decodedQuads2.size} quads.\")\n\n    actorSystem.terminate()\n\n\n  /**\n   * Helper method to produce encoded data from a dataset.\n   */\n  private def getEncodedData(dataset: Dataset)(using ActorSystem, ExecutionContext):\n  (Seq[Array[Byte]], Seq[Array[Byte]], Seq[Array[Byte]]) =\n    val quadStream = EncoderSource.fromDatasetAsQuads(\n      dataset,\n      ByteSizeLimiter(500),\n      JellyOptions.smallStrict\n    )\n    val tripleStream = EncoderSource.fromGraph(\n      dataset.getDefaultModel,\n      ByteSizeLimiter(250),\n      JellyOptions.smallStrict\n    )\n    val graphStream = EncoderSource.fromDatasetAsGraphs(\n      dataset,\n      None,\n      JellyOptions.smallStrict\n    )\n    val results = Seq(quadStream, tripleStream, graphStream).map { stream =>\n      val streamFuture = stream\n        .via(JellyIo.toBytes)\n        .runWith(Sink.seq)\n      Await.result(streamFuture, 10.seconds)\n    }\n    (results.head, results(1), results(2))\n
"},{"location":"user/reactive/#byte-streams-delimited-variant","title":"Byte streams (delimited variant)","text":"

In all of the examples above, we used the non-delimited variant of Jelly, which is appropriate when the transport already frames messages, e.g., when sending Jelly data over gRPC or Kafka. If you want to write Jelly data to a file or a socket, you will need to use the delimited variant, because a raw byte stream carries no information about where one frame ends and the next begins. jelly-stream provides a few methods for this in eu.ostrzyciel.jelly.stream.JellyIo.
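
To make the difference concrete, here is a small standard-library-only sketch of length-prefixed framing. It is not Jelly-JVM code. It mimics the general idea behind Protobuf's `writeDelimitedTo` and `parseDelimitedFrom` (a varint length prefix followed by the message bytes), which the example below mentions as the mechanism used under the hood. The names `writeDelimited`, `readDelimited`, and `readAll` are made up for this illustration, and `readNBytes` assumes Java 11 or newer.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream, OutputStream}

// Write one frame as a varint-encoded length followed by the payload bytes.
def writeDelimited(payload: Array[Byte], out: OutputStream): Unit =
  var n = payload.length
  while n >= 0x80 do
    out.write((n & 0x7f) | 0x80) // low 7 bits, continuation bit set
    n = n >>> 7
  out.write(n)
  out.write(payload)

// Read one frame, or return None if the stream has ended.
// Truncated input is not handled in this sketch.
def readDelimited(in: InputStream): Option[Array[Byte]] =
  var b = in.read()
  if b == -1 then None
  else
    var length = 0
    var shift = 0
    while b >= 0x80 do
      length |= (b & 0x7f) << shift
      shift += 7
      b = in.read()
    length |= b << shift
    Some(in.readNBytes(length))

// Read all frames until the end of the stream.
def readAll(in: InputStream): List[Array[Byte]] =
  Iterator.continually(readDelimited(in)).takeWhile(_.isDefined).flatten.toList

@main def delimitedSketch(): Unit =
  val out = ByteArrayOutputStream()
  // Two "frames" written back to back into a single byte stream
  writeDelimited("ab".getBytes("UTF-8"), out)
  writeDelimited("cde".getBytes("UTF-8"), out)
  val frames = readAll(ByteArrayInputStream(out.toByteArray))
  frames.foreach(f => println(s"Frame with ${f.length} bytes"))
```

Without the length prefix, the reader would see five bytes and have no way to tell that they belong to two separate frames. In practice you would not write this yourself; the JellyIo methods take care of the framing.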

Example: PekkoStreamsWithIo.scala

Source code on GitHub

PekkoStreamsWithIo.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.query.Dataset\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\nimport org.apache.pekko.util.ByteString\n\nimport java.io.{File, FileInputStream, FileOutputStream}\nimport java.util.zip.GZIPInputStream\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\nimport scala.util.Using\n\n/**\n * Example of using Pekko Streams to read/write Jelly to a file or any other byte stream (e.g., socket).\n *\n * The examples here use the DELIMITED variant of Jelly, which is suitable only for situations where there is\n * no framing in the underlying stream. You should always use the delimited variant with raw files and sockets,\n * as otherwise it would be impossible to tell where one stream frame ends and another one begins.\n *\n * If you are working with something like MQTT, Kafka, JMS, AMQP... 
then check the examples in\n * [[eu.ostrzyciel.jelly.examples.PekkoStreamsEncoderFlow]].\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsWithIo extends shared.Example:\n  def main(args: Array[String]): Unit =\n    // We will need a Pekko actor system to run the streams\n    given actorSystem: ActorSystem = ActorSystem()\n    // And an execution context for the futures\n    given ExecutionContext = actorSystem.getDispatcher\n\n    // We will read a gzipped Jelly file from disk and decode it on the fly, as we are decompressing it.\n    println(\"Decoding a gzipped Jelly file with Pekko Streams...\")\n    // The input file is a GZipped Jelly file\n    val inputFile = File(getClass.getResource(\"/jelly/weather.jelly.gz\").toURI)\n\n    // Use Java's GZIPInputStream to decompress the input file on the fly\n    val decodedTriples: Seq[Triple] = Using.resource(new GZIPInputStream(FileInputStream(inputFile))) { inputStream =>\n      val decodedTriplesFuture = JellyIo.fromIoStream(inputStream)\n        // Decode the Jelly frames to triples.\n        // Under the hood it uses the RdfStreamFrame.parseDelimitedFrom method.\n        .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n        .runWith(Sink.seq)\n\n      Await.result(decodedTriplesFuture, 10.seconds)\n    }\n\n    println(s\"Decoded ${decodedTriples.size} triples\")\n\n    // -----------------------------------------------------------\n    // Now we will write the decoded triples to a new Jelly file\n    println(\"\\n\\nWriting the decoded triples to a new Jelly file with Pekko Streams...\")\n    Using.resource(new FileOutputStream(\"weather.jelly\")) { outputStream =>\n      val writeFuture = Source(decodedTriples)\n        // Encode the triples to Jelly\n        .via(EncoderFlow.flatTripleStream(\n          
ByteSizeLimiter(500),\n          JellyOptions.smallStrict\n        ))\n        // Write the Jelly frames to a Java byte stream.\n        // Under the hood it uses the RdfStreamFrame.writeDelimitedTo method.\n        .runWith(JellyIo.toIoStream(outputStream))\n\n      Await.ready(writeFuture, 10.seconds)\n      println(\"Done writing the Jelly file.\")\n    }\n\n    // -----------------------------------------------------------\n    // Pekko Streams offers its own utilities for reading and writing bytes that do not involve using Java's\n    // blocking implementation of streams.\n    // We will again write the decoded triples to a Jelly file, but this time use Pekko's facilities.\n    println(\"\\n\\nWriting the decoded triples to a new Jelly file with Pekko Streams' utilities...\")\n    val writeFuture = Source(decodedTriples)\n      .via(EncoderFlow.flatTripleStream(\n        ByteSizeLimiter(500),\n        JellyOptions.smallStrict\n      ))\n      // Convert the frames into Pekko's byte strings.\n      // Note: we are using the DELIMITED variant because we will write this to disk!\n      .via(JellyIo.toBytesDelimited)\n      .map(bytes => ByteString(bytes))\n      .runWith(FileIO.toPath(File(\"weather2.jelly\").toPath))\n\n    Await.ready(writeFuture, 10.seconds)\n    println(\"Done writing the Jelly file.\")\n\n    actorSystem.terminate()\n
"},{"location":"user/reactive/#see-also","title":"See also","text":"
  • Using Jelly gRPC servers and clients
  • Useful utilities
    • Using Typesafe config to configure Jelly
  • Low-level usage
"},{"location":"user/utilities/","title":"Useful utilities","text":"

This guide presents some useful utilities in the jelly-core and jelly-stream modules.

"},{"location":"user/utilities/#jelly-options-presets","title":"Jelly options presets","text":"

Every Jelly stream begins with a header that specifies the serialization options used to encode the stream \u2013 see the details in the specification. So, whenever you serialize some RDF with Jelly (e.g., using Apache Jena RIOT, RDF4J Rio, or the jelly-stream module), you need to specify these options.

The eu.ostrzyciel.jelly.core.JellyOptions object provides a few common presets for Jelly serialization options. They return an instance of eu.ostrzyciel.jelly.core.proto.v1.RdfStreamOptions that you can further customize. For example:

import eu.ostrzyciel.jelly.core.JellyOptions\n\nval options = JellyOptions.smallStrict\n\nval optionsWithRdfStarSupport = JellyOptions.smallRdfStar\n\nval bigWithCustomDictionarySize = JellyOptions.bigStrict\n  .withMaxNameTableSize(2000)  \n

Warning

These presets do not specify the physical or logical stream type. In most cases, the Jelly library will take care of this for you and set these types automatically later. However, if you use the low-level API, you need to set the stream types manually. For example:

import eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.*\n\nJellyOptions.smallStrict\n  .withPhysicalType(PhysicalStreamType.QUADS)\n  .withLogicalType(LogicalStreamType.DATASETS)\n
"},{"location":"user/utilities/#checking-supported-options","title":"Checking supported options","text":"

The eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions method specifies the maximum set of options that Jelly-JVM supports by default when parsing a stream. By default, Jelly-JVM will refuse to parse any stream that uses options beyond what this method specifies. This is important for security reasons, as it prevents the library from, for example, allocating a 10 GB dictionary (a potential denial-of-service attack).

The supported options check is carried out automatically by the decoder when parsing a stream. You cannot disable the check, but you can customize the supported options by constructing a new RdfStreamOptions object from eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions, customizing it, and passing it to the decoder.
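
A minimal sketch of this customization, assuming the Jena integration and the jelly-stream decoder API shown in the reactive streaming guide (`withMaxNameTableSize` and `DecoderFlow.decodeQuads.asDatasetStreamOfQuads(...)` both appear in the PekkoStreamsDecoderFlow example there). The limit of 1024 and the value names are arbitrary choices for illustration:

```scala
import eu.ostrzyciel.jelly.convert.jena.given
import eu.ostrzyciel.jelly.core.JellyOptions
import eu.ostrzyciel.jelly.stream.*

// Start from the defaults and adjust the name table limit.
// 1024 is an arbitrary illustrative value, not a recommendation.
val customSupportedOptions = JellyOptions.defaultSupportedOptions
  .withMaxNameTableSize(1024)

// Pass the customized options to a decoder flow. Streams that declare
// a name table larger than 1024 entries will be rejected by this decoder.
val decoderFlow = DecoderFlow.decodeQuads.asDatasetStreamOfQuads(customSupportedOptions)
```

The resulting flow can be used with `.via(decoderFlow)` exactly like the default decoders.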

If you want to do this kind of check in some other context (e.g., in a gRPC service to check if you can support the options requested by the client), you can use the eu.ostrzyciel.jelly.core.JellyOptions.checkCompatibility method. It will throw an exception if the options are not supported.

"},{"location":"user/utilities/#useful-constants","title":"Useful constants","text":"

The eu.ostrzyciel.jelly.core.Constants object defines some useful constants, such as the file extension for Jelly, its content type, and the version of the Jelly protocol.

"},{"location":"user/utilities/#rdf-stream-taxonomy-rdf-stax-stream-type-utilities","title":"RDF Stream Taxonomy (RDF-STaX) stream type utilities","text":"

Jelly uses RDF-STaX to define the logical stream types (more details here). Jelly-JVM defines each of these types as a case object in eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType.

These objects have a few useful methods for working with the RDF-STaX ontology:

import eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\n\n// Get the RDF-STaX IRI of a stream type\n// returns \"https://w3id.org/stax/ontology#flatTripleStream\"\nLogicalStreamType.TRIPLES.getRdfStaxType\n

You can also obtain a full RDF-STaX annotation for your stream if you also import an RDF library interop module (e.g., jelly-jena or jelly-rdf4j):

// Here we import `jena.given` to get the necessary implicit conversions.\n// You can do the same with `rdf4j.given` if you are using RDF4J.\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\nimport org.apache.jena.graph.{Node, NodeFactory, Triple}\n\nval subjectNode: Node = NodeFactory.createURI(\"http://example.org/subject\")\nval triples: Seq[Triple] = LogicalStreamType.QUADS.getRdfStaxAnnotation(subjectNode)\n// Returns a Seq of three triples that would look like this in Turtle:\n// <http://example.org/subject> stax:hasStreamTypeUsage [\n//   a stax:RdfStreamTypeUsage ;\n//   stax:hasStreamType stax:flatQuadStream\n// ] .\n

You can then take this annotation and expose it as semantic metadata of your stream.

You can also do the opposite and construct an instance of LogicalStreamType from an RDF-STaX IRI:

import eu.ostrzyciel.jelly.core.LogicalStreamTypeFactory\n\nval iri = \"https://w3id.org/stax/ontology#flatQuadStream\"\n// returns LogicalStreamType.QUADS\nval streamType = LogicalStreamTypeFactory.fromOntologyIri(iri)\n

Finally, there are also stream type checking and manipulation utilities:

import eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\n\n// Check if this type is equal or a subtype of another type.\n// This is useful for performing compatibility checks.\n// Returns false\nLogicalStreamType.TRIPLES.isEqualOrSubtypeOf(LogicalStreamType.DATASETS)\n// Returns true\nLogicalStreamType.NAMED_GRAPHS.isEqualOrSubtypeOf(LogicalStreamType.DATASETS)\n\n// Get the \"base\" type of a stream type. Base types are concrete stream types \n// that have no parent types. \n// There are only 4 base types: GRAPHS, DATASETS, TRIPLES, QUADS.\n// Returns LogicalStreamType.TRIPLES\nLogicalStreamType.TRIPLES.toBaseType\n// Returns LogicalStreamType.DATASETS\nLogicalStreamType.NAMED_GRAPHS.toBaseType\n// Returns LogicalStreamType.DATASETS\nLogicalStreamType.TIMESTAMPED_NAMED_GRAPHS.toBaseType\n
"},{"location":"user/utilities/#jelly-configuration-from-typesafe-config","title":"Jelly configuration from Typesafe config","text":"

The jelly-stream module also implements a utility for configuring Jelly serialization options using the Typesafe config library, which is commonly used in Apache Pekko applications.

The utility is provided by the eu.ostrzyciel.jelly.stream.JellyOptionsFromTypesafe object. For example:

import com.typesafe.config.ConfigFactory\nimport eu.ostrzyciel.jelly.stream.JellyOptionsFromTypesafe\n\nval config = ConfigFactory.parseString(\"\"\"\n  |jelly.physical-type = QUADS\n  |jelly.name-table-size = 1024\n  |jelly.prefix-table-size = 64\n  |\"\"\".stripMargin)\n\nval options = JellyOptionsFromTypesafe.fromConfig(config.getConfig(\"jelly\"))\noptions.physicalType // returns PhysicalStreamType.QUADS\noptions.maxNameTableSize // returns 1024\noptions.maxPrefixTableSize // returns 64\noptions.maxDatatypeTableSize // returns 16 (the default)\n

See the source code of this class for more details.

"},{"location":"user/utilities/#see-also","title":"See also","text":"
  • Reactive streaming with Jelly-JVM
  • Low-level usage of Jelly-JVM
"}]} \ No newline at end of file diff --git a/dev/sitemap.xml b/dev/sitemap.xml index 67b5fc5..c65e133 100644 --- a/dev/sitemap.xml +++ b/dev/sitemap.xml @@ -2,62 +2,62 @@ https://jelly-rdf.github.io/jelly-jvm/dev/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/contributing/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/getting-started-devs/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/getting-started-plugins/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/licensing/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/dev/implementing/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/dev/releases/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/user/compatibility/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/user/grpc/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/user/jena-cli/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/user/jena/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/user/low-level/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/user/rdf4j/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/user/reactive/ - 2024-11-15 + 2024-11-17 https://jelly-rdf.github.io/jelly-jvm/dev/user/utilities/ - 2024-11-15 + 2024-11-17 \ No newline at end of file diff --git a/dev/sitemap.xml.gz b/dev/sitemap.xml.gz index 5736040..f99fd8f 100644 Binary files a/dev/sitemap.xml.gz and b/dev/sitemap.xml.gz differ diff --git a/dev/user/jena/index.html b/dev/user/jena/index.html index 3b76347..9357065 100644 --- a/dev/user/jena/index.html +++ b/dev/user/jena/index.html @@ -867,8 +867,8 @@

Apache Jena integration

This guide explains the functionalities of the jelly-jena module, which provides Jelly support for Apache Jena.

If you just want to add Jelly format support to Apache Jena / Apache Jena Fuseki, you can use the Jelly-JVM plugin JAR. See the dedicated guide for more information.

Base facilities

-

jelly-jena implements the eu.ostrzyciel.jelly.core.ConverterFactory trait in eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory. This factory allows you to build encoders and decoders that convert between Jelly's RdfStreamFrames and Apache Jena's Triple and Quad objects. The eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame class is an object representation of Jelly's binary format.

-

The module also implements the eu.ostrzyciel.jelly.core.IterableAdapter trait in eu.ostrzyciel.jelly.convert.jena.JenaIterableAdapter. This adapter provides extension methods for Apache Jena's Model, Dataset, Graph, and DatasetGraph classes to convert them into an iterable of triples (.asTriples), quads (.asQuads), or named graphs (.asGraphs). This is useful when working with Jelly on a lower level or when using the jelly-stream module.

+

jelly-jena implements the eu.ostrzyciel.jelly.core.ConverterFactory trait in eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory. This factory allows you to build encoders and decoders that convert between Jelly's RdfStreamFrames and Apache Jena's Triple and Quad objects. The eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame class is an object representation of Jelly's binary format.

+

The module also implements the eu.ostrzyciel.jelly.core.IterableAdapter trait in eu.ostrzyciel.jelly.convert.jena.JenaIterableAdapter. This adapter provides extension methods for Apache Jena's Model, Dataset, Graph, and DatasetGraph classes to convert them into an iterable of triples (.asTriples), quads (.asQuads), or named graphs (.asGraphs). This is useful when working with Jelly on a lower level or when using the jelly-stream module.

Serialization and deserialization with RIOT

jelly-jena implements an RDF writer and reader for Apache Jena's RIOT library. This means you can use Jelly just like, for example, Turtle or RDF/XML. See the example below:

@@ -1109,10 +1109,10 @@

Serialization and deseriali

Usage notes:

    -
  • eu.ostrzyciel.jelly.core.JellyOptions provides a few common presets for Jelly serialization options, which you can use to construct a JellyFormatVariant, as shown in the example above. You can also further customize the serialization options (e.g., dictionary size).
  • +
  • eu.ostrzyciel.jelly.core.JellyOptions provides a few common presets for Jelly serialization options, which you can use to construct a JellyFormatVariant, as shown in the example above. You can also further customize the serialization options (e.g., dictionary size).
  • The RIOT writer (serializer) integration implements only the delimited variant of Jelly. It is used for writing Jelly to files on disk or sockets. Because of this, you cannot use RIOT to write non-delimited Jelly data (e.g., a single message to a Kafka stream). For this, you should use the jelly-stream module or the more low-level API: Low-level usage.
  • However, the RIOT parser (deserializer) integration will automatically detect if the parsed Jelly data is delimited or not. If it's non-delimited, the parser will assume that there is only one RdfStreamFrame in the file.
  • -
  • Jelly's parsers and writers are registered in the eu.ostrzyciel.jelly.convert.jena.riot.JellyLanguage object (source code). This registration should happen automatically when you include the jelly-jena module in your project, using Jena's component initialization mechanism.
  • +
  • Jelly's parsers and writers are registered in the eu.ostrzyciel.jelly.convert.jena.riot.JellyLanguage object (source code). This registration should happen automatically when you include the jelly-jena module in your project, using Jena's component initialization mechanism.

Streaming serialization with RIOT

jelly-jena also implements a streaming writer (StreamRDF API in Jena). Using it is similar to the regular RIOT writer, with a slightly different setup:

diff --git a/dev/user/low-level/index.html b/dev/user/low-level/index.html index 8c602a9..5d5c333 100644 --- a/dev/user/low-level/index.html +++ b/dev/user/low-level/index.html @@ -862,11 +862,11 @@

Low-level usage

Deserialization

To parse a serialized stream frame into triples/quads:

    -
  1. Call eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame.parseFrom if it's a non-delimited frame (like you would see, e.g., in a Kafka or gRPC stream), or parseDelimitedFrom if it's a delimited stream (like you would see in a file or a socket).

Encoding any RDF data as a flat or grouped stream (EncoderFlow)

-

The eu.ostrzyciel.jelly.stream.EncoderFlow provides even more options for turning RDF data into Jelly streams, including both grouped and flat streams. Every type of RDF stream in Jelly can be created using this API.

+

The eu.ostrzyciel.jelly.stream.EncoderFlow provides even more options for turning RDF data into Jelly streams, including both grouped and flat streams. Every type of RDF stream in Jelly can be created using this API.

Example: PekkoStreamsEncoderFlow.scala (click to expand)

Source code on GitHub

@@ -1361,7 +1361,7 @@

Encoding

Decoding RDF streams (DecoderFlow)

-

The eu.ostrzyciel.jelly.stream.DecoderFlow provides methods for decoding flat and grouped streams. There is no opposite equivalent to EncoderSource for decoding, though. This would require constructing an RDF graph or dataset from statements, which is a process that can vary a lot depending on your application. You will have to do this part yourself.

+

The eu.ostrzyciel.jelly.stream.DecoderFlow provides methods for decoding flat and grouped streams. There is no opposite equivalent to EncoderSource for decoding, though. This would require constructing an RDF graph or dataset from statements, which is a process that can vary a lot depending on your application. You will have to do this part yourself.

Example: PekkoStreamsDecoderFlow.scala (click to expand)

Source code on GitHub

@@ -1773,7 +1773,7 @@

Decoding RDF streams (DecoderFlo

Byte streams (delimited variant)

-

In all of the examples above, we used the non-delimited variant of Jelly, which is appropriate for, e.g., sending Jelly data over gRPC or Kafka. If you want to write Jelly data to a file or a socket, you will need to use the delimited variant. jelly-stream provides a few methods for this in eu.ostrzyciel.jelly.stream.JellyIo.

+

In all of the examples above, we used the non-delimited variant of Jelly, which is appropriate for, e.g., sending Jelly data over gRPC or Kafka. If you want to write Jelly data to a file or a socket, you will need to use the delimited variant. jelly-stream provides a few methods for this in eu.ostrzyciel.jelly.stream.JellyIo.
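To make the difference between the two variants concrete, here is a minimal, library-free sketch of length-delimited framing. It assumes the delimited variant follows the standard Protobuf convention of prefixing each frame with its size as a varint. The object and method names (`DelimitedFramingSketch`, `writeDelimited`, `readAll`) are hypothetical and are not part of Jelly-JVM; in real code you would rely on `JellyIo` or on `RdfStreamFrame.parseDelimitedFrom` instead.

```scala
import java.io.{EOFException, InputStream, OutputStream}

// Illustration only: hypothetical helpers, not the Jelly-JVM API.
object DelimitedFramingSketch {
  // Write a non-negative Int as a varint: 7 bits per byte,
  // with the high bit set when more bytes follow.
  def writeVarint(out: OutputStream, value: Int): Unit = {
    var v = value
    while ((v & ~0x7F) != 0) {
      out.write((v & 0x7F) | 0x80)
      v >>>= 7
    }
    out.write(v)
  }

  // Read a varint. Returns None if the stream ends cleanly before the first byte.
  def readVarint(in: InputStream): Option[Int] = {
    val first = in.read()
    if (first == -1) None
    else {
      var b = first
      var result = 0
      var shift = 0
      while ((b & 0x80) != 0) {
        result |= (b & 0x7F) << shift
        shift += 7
        b = in.read()
        if (b == -1) throw new EOFException("truncated length prefix")
      }
      Some(result | (b << shift))
    }
  }

  // Delimited variant: size prefix, then the serialized frame bytes.
  def writeDelimited(out: OutputStream, frame: Array[Byte]): Unit = {
    writeVarint(out, frame.length)
    out.write(frame)
  }

  // Read one frame, or None at the end of the stream.
  def readDelimited(in: InputStream): Option[Array[Byte]] =
    readVarint(in).map { size =>
      val buf = in.readNBytes(size)
      if (buf.length != size) throw new EOFException("truncated frame")
      buf
    }

  // Read frames until the stream is exhausted.
  def readAll(in: InputStream): List[Array[Byte]] =
    Iterator.continually(readDelimited(in)).takeWhile(_.isDefined).flatten.toList
}
```

A non-delimited frame is just the payload with no size prefix, which is why it only works when the transport (e.g., a Kafka message or a gRPC call) already marks where each frame ends.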

Example: PekkoStreamsWithIo.scala (click to expand)

Source code on GitHub

diff --git a/dev/user/utilities/index.html b/dev/user/utilities/index.html index 2cdf598..f749492 100644 --- a/dev/user/utilities/index.html +++ b/dev/user/utilities/index.html @@ -908,7 +908,7 @@

Useful utilities

This guide presents some useful utilities in the jelly-core and jelly-stream modules.

Jelly options presets

Every Jelly stream begins with a header that specifies the serialization options used to encode the stream – see the details in the specification. So, whenever you serialize some RDF with Jelly (e.g., using Apache Jena RIOT, RDF4J Rio, or the jelly-stream module), you need to specify these options.

-

The eu.ostrzyciel.jelly.core.JellyOptions object provides a few common presets for Jelly serialization options. They return an instance of eu.ostrzyciel.jelly.core.proto.v1.RdfStreamOptions that you can further customize. For example:

+

The eu.ostrzyciel.jelly.core.JellyOptions object provides a few common presets for Jelly serialization options. They return an instance of eu.ostrzyciel.jelly.core.proto.v1.RdfStreamOptions that you can further customize. For example:

import eu.ostrzyciel.jelly.core.JellyOptions
 
 val options = JellyOptions.smallStrict
@@ -930,13 +930,13 @@ 

Jelly options presets

Checking supported options

-

There is also the eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions method which specifies the maximum set of options supported by default in Jelly-JVM, when parsing a stream. By default, Jelly-JVM will refuse to parse any stream that uses options that are beyond what is specified in this method. This is important for security reasons, as it prevents the library from, for example, allocating a 10 GB dictionary (potential Denial of Service attack).

-

The supported options check is carried out automatically by the decoder when parsing a stream. You cannot disable the check, but you can customize the supported options by constructing a new RdfStreamOptions object from eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions, customizing it, and passing it to the decoder.

-

If you want to do this kind of check in some other context (e.g., in a gRPC service to check if you can support the options requested by the client), you can use the eu.ostrzyciel.jelly.core.JellyOptions.checkCompatibility method. It will throw an exception if the options are not supported.

+

There is also the eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions method which specifies the maximum set of options supported by default in Jelly-JVM, when parsing a stream. By default, Jelly-JVM will refuse to parse any stream that uses options that are beyond what is specified in this method. This is important for security reasons, as it prevents the library from, for example, allocating a 10 GB dictionary (potential Denial of Service attack).

+

The supported options check is carried out automatically by the decoder when parsing a stream. You cannot disable the check, but you can customize the supported options by constructing a new RdfStreamOptions object from eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions, customizing it, and passing it to the decoder.

+

If you want to do this kind of check in some other context (e.g., in a gRPC service to check if you can support the options requested by the client), you can use the eu.ostrzyciel.jelly.core.JellyOptions.checkCompatibility method. It will throw an exception if the options are not supported.
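The idea behind the supported options check can be sketched without the library. The example below is a simplified illustration that only compares lookup table sizes; `TableSizes`, `violations`, and `checkOrThrow` are hypothetical names, the limits are made-up values, and the real check in Jelly-JVM may cover more options than this.

```scala
// Illustration only: hypothetical types, not the Jelly-JVM API.
object SupportedOptionsSketch {
  // Stand-in for the table-size fields of the stream options.
  final case class TableSizes(
    maxNameTableSize: Int,
    maxPrefixTableSize: Int,
    maxDatatypeTableSize: Int
  )

  // Describe every limit the requested options exceed.
  // An empty list means the stream is acceptable.
  def violations(requested: TableSizes, supported: TableSizes): List[String] =
    List(
      ("name table", requested.maxNameTableSize, supported.maxNameTableSize),
      ("prefix table", requested.maxPrefixTableSize, supported.maxPrefixTableSize),
      ("datatype table", requested.maxDatatypeTableSize, supported.maxDatatypeTableSize)
    ).collect {
      case (label, req, max) if req > max =>
        s"$label size $req exceeds the supported maximum $max"
    }

  // Reject the stream before allocating anything based on its header.
  def checkOrThrow(requested: TableSizes, supported: TableSizes): Unit = {
    val problems = violations(requested, supported)
    if (problems.nonEmpty)
      throw new IllegalArgumentException(problems.mkString("; "))
  }
}
```

Running the check before allocating any dictionaries is what guards against a stream header that requests an unreasonably large table.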

Useful constants

-

The eu.ostrzyciel.jelly.core.Constants object defines some useful constants, such as the file extension for Jelly, its content type, and the version of the Jelly protocol.

+

The eu.ostrzyciel.jelly.core.Constants object defines some useful constants, such as the file extension for Jelly, its content type, and the version of the Jelly protocol.

RDF Stream Taxonomy (RDF-STaX) stream type utilities

-

Jelly uses RDF-STaX to define the logical stream types (more details here). Jelly-JVM defines each of these types as a case object in eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType.

+

Jelly uses RDF-STaX to define the logical stream types (more details here). Jelly-JVM defines each of these types as a case object in eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType.

These objects have a few useful methods for working with the RDF-STaX ontology:

import eu.ostrzyciel.jelly.core.*
 import eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType
@@ -992,7 +992,7 @@ 

RDF Stream Taxonomy

Jelly configuration from Typesafe config

The jelly-stream module also implements a utility for configuring Jelly serialization options using the Typesafe config library, which is commonly used in Apache Pekko applications.

-

The utility is provided by the eu.ostrzyciel.jelly.stream.JellyOptionsFromTypesafe object. For example:

+

The utility is provided by the eu.ostrzyciel.jelly.stream.JellyOptionsFromTypesafe object. For example:

import com.typesafe.config.ConfigFactory
 import eu.ostrzyciel.jelly.stream.JellyOptionsFromTypesafe
 
diff --git a/versions.json b/versions.json
index e973d24..f032cd2 100644
--- a/versions.json
+++ b/versions.json
@@ -18,8 +18,8 @@
     "version": "2.2.x",
     "title": "2.2.x",
     "aliases": [
-      "2.2.1",
       "2.2.2",
+      "2.2.1",
       "2.2.0"
     ]
   },
@@ -34,10 +34,10 @@
     "version": "2.0.x",
     "title": "2.0.x",
     "aliases": [
-      "2.0.0",
       "2.0.3",
       "2.0.1",
-      "2.0.2"
+      "2.0.2",
+      "2.0.0"
     ]
   },
   {
@@ -76,20 +76,20 @@
     "title": "0.11.x",
     "aliases": [
       "0.11.1",
-      "0.11.2",
       "0.11.5",
       "0.11.6",
       "0.11.3",
-      "0.11.0"
+      "0.11.0",
+      "0.11.2"
     ]
   },
   {
     "version": "0.10.x",
     "title": "0.10.x",
     "aliases": [
-      "0.10.0",
+      "0.10.2",
       "0.10.1",
-      "0.10.2"
+      "0.10.0"
     ]
   }
 ]