diff --git a/dev/index.html b/dev/index.html index 0972ade4..c9841450 100644 --- a/dev/index.html +++ b/dev/index.html @@ -903,7 +903,7 @@
Jelly-JVM is an implementation of the Jelly serialization format and gRPC streaming protocol for the Java Virtual Machine (JVM), written in Scala 3. The supported RDF libraries are Apache Jena and Eclipse RDF4J.
Jelly-JVM provides a full stack of utilities for fast and scalable RDF streaming with the Jelly protocol. Oh, and it's blazing-fast, too!
Getting started with plugins \u2013 no code required
See the getting started guide with plugins for a quick way to use Jelly with your Apache Jena or RDF4J application without writing any code.
Getting started for application developers
If you want to use the full feature set of Jelly-JVM in your code, see the getting started guide for application developers.
This documentation is for the latest development version of Jelly-JVM \u2013 it is not considered stable. If you are looking for the documentation of a stable release, use the version selector on the left of the top navigation bar. See: latest stable version.
"},{"location":"#library-modules","title":"Library modules","text":"The implementation is split into a few modules that can be used separately:
jelly-core
\u2013 implementation of the Jelly serialization format (using the scalapb library), along with generic utilities for converting the deserialized RDF data to/from the representations of RDF libraries (like Apache Jena or RDF4J).
jelly-jena
\u2013 conversions and interop code for the Apache Jena library.
jelly-rdf4j
\u2013 conversions and interop code for the RDF4J library.
jelly-stream
\u2013 utilities for building Reactive Streams of RDF data (based on Pekko Streams). Useful for integrating with gRPC or other streaming protocols (e.g., Kafka, MQTT).
jelly-grpc
\u2013 implementation of a gRPC client and server for the Jelly gRPC streaming protocol.
We also publish plugin JARs which allow you to use Jelly-JVM with Apache Jena and RDF4J just by dropping the JARs into the classpath. Find out more about using the plugins.
"},{"location":"#compatibility","title":"Compatibility","text":"The Jelly-JVM implementation is compatible with Java 11 and newer. Java 11, 17, and 21 are tested in CI and are guaranteed to work. Jelly is built with Scala 3 LTS releases.
The following table shows the compatibility of the Jelly-JVM implementation with other libraries:
Jelly-JVM | Scala | Java | RDF4J | Apache Jena | Apache Pekko
2.0.x \u2013 2.1.x | 3.3.x (LTS) | 17+ | 5.x.x | 5.x.x | 1.1.x
1.0.x | 3.3.x (LTS), 2.13.x | 11+ | 4.x.x | 4.x.x | 1.0.x
See the compatibility policy for more details and the release notes on GitHub.
"},{"location":"#documentation","title":"Documentation","text":"Below is a list of all documentation pages about Jelly-JVM. You can also browse the Javadoc using the badges in the module list above. The documentation uses examples written in Scala, but the libraries can be used from Java as well.
Scala 2.13-compatible builds of Jelly-JVM are available for Jelly-JVM 1.0.x. Scala 2 support was removed in subsequent versions. See more details.
Jelly-JVM is an open project \u2013 you are welcome to submit issues, pull requests, or just ask questions!
"},{"location":"contributing/#submitting-issues","title":"Submitting issues","text":"If you have a question, found a bug, or have an idea for a new feature, please open an issue in the GitHub issue tracker.
"},{"location":"contributing/#security-issues","title":"Security issues","text":"If you find a security issue or vulnerability, please do not open a public issue. Instead, use the dedicated vulnerability reporting page.
"},{"location":"contributing/#pull-requests","title":"Pull requests","text":"Pull requests are welcome! Simply fork the GitHub repository and create a new branch for your changes. When you are ready, open a pull request to the main
branch.
If you are working on a larger feature or a significant change, it is recommended to open an issue first to discuss the idea.
"},{"location":"contributing/#documentation","title":"Documentation","text":"Jelly-JVM uses the exact same documentation system as the main Jelly documentation. Further information on editing the documentation can be found in the Contributing to the Jelly documentation guide.
"},{"location":"contributing/#releases","title":"Releases","text":"See the dedicated page on making releases.
"},{"location":"contributing/#see-also","title":"See also","text":"If you don't want to code anything and only use Jelly with your Apache Jena/RDF4J application, see the dedicated guide about using Jelly-JVM as a plugin.
This guide explains a few of the basic functionalities of Jelly-JVM and how to use them in your code. Jelly-JVM is written in Scala, but it can be used from Java as well. However, in this guide, we will focus on Scala 3.
"},{"location":"getting-started-devs/#quick-start-plain-old-files","title":"Quick start \u2013 plain old files","text":"Depending on your RDF library of choice (Apache Jena or RDF4J), you should import one of two dependencies: jelly-jena
or jelly-rdf4j
. In our examples we will use Jena, so let's add this to your build.sbt
file (the same dependency can be declared in other build tools like Maven or Gradle):
lazy val jellyVersion = \"2.1.0\"\n\nlibraryDependencies ++= Seq(\n \"eu.ostrzyciel.jelly\" %% \"jelly-jena\" % jellyVersion,\n)\n
Now you can serialize/deserialize Jelly data with Apache Jena. Jelly is fully integrated with Jena, so it should all just magically work. Here is a simple example of reading a .jelly
file (in this case, a metadata file from RiverBench) with RIOT:
import eu.ostrzyciel.jelly.convert.jena.riot.*\nimport org.apache.jena.riot.RDFDataMgr\n\n// Load an RDF graph from a Jelly file\nval model = RDFDataMgr.loadModel(\n \"https://w3id.org/riverbench/v/2.0.1.jelly\", \n JellyLanguage.JELLY\n)\n// Print the size of the model\nprintln(s\"Loaded an RDF graph with ${model.size} triples\")\n
Serialization is just as easy:
Serialization example (Scala 3)
import eu.ostrzyciel.jelly.convert.jena.riot.*\nimport org.apache.jena.riot.RDFDataMgr\n\nimport java.io.FileOutputStream\nimport scala.util.Using\n\n// Omitted here: creating an RDF model.\n// You can use the one from the previous example.\n\nUsing.resource(new FileOutputStream(\"metadata.jelly\")) { out =>\n // Write the model to a Jelly file\n RDFDataMgr.write(out, model, JellyLanguage.JELLY)\n println(\"Saved the model to metadata.jelly\")\n}\n
Read more about using Jelly-JVM with Apache Jena
Read more about using Jelly-JVM with RDF4J
"},{"location":"getting-started-devs/#rdf-streams","title":"RDF streams","text":"Now, the real power of Jelly lies in its streaming capabilities. Not only can it stream individual RDF triples/quads (this is called flat streaming), but it can also very effectively handle streams of RDF graphs or datasets. To work with streams, you need to use the jelly-stream
module, which is based on the Apache Pekko Streams library. So, let's update our dependencies:
lazy val jellyVersion = \"2.1.0\"\n\nlibraryDependencies ++= Seq(\n \"eu.ostrzyciel.jelly\" %% \"jelly-jena\" % jellyVersion,\n \"eu.ostrzyciel.jelly\" %% \"jelly-stream\" % jellyVersion,\n)\n
Now, let's say we have a stream of RDF graphs \u2013 for example each graph corresponds to one set of measurements from an IoT sensor. We want to have a stream that turns these graphs into their serialized representations (byte arrays), which we can then send over the network. Here is how to do it:
Reactive streaming example (Scala 3)
// We need to import \"jena.given\" for Jena-to-Jelly conversions\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport scala.concurrent.ExecutionContext\n\n// We will need a Pekko actor system to run the streams\ngiven actorSystem: ActorSystem = ActorSystem()\n// And an execution context for the futures\ngiven ExecutionContext = actorSystem.getDispatcher\n\n// Load an RDF graph for testing\nval model = RDFDataMgr.loadModel(\n \"https://w3id.org/riverbench/v/2.0.1.jelly\", \n JellyLanguage.JELLY\n)\n\nSource.repeat(model) // Create a stream of the same model over and over\n .take(10) // Take only the first 10 elements in the stream\n .map(_.asTriples) // Convert each model to an iterable of triples\n .via(EncoderFlow.graphStream( // Encode each iterable to a Jelly stream frame\n maybeLimiter = None, // 1 RDF graph = 1 message\n JellyOptions.smallStrict, // Jelly compression settings preset\n ))\n .via(JellyIo.toBytes) // Convert the stream frames to byte arrays\n .runForeach { bytes =>\n // Just print the length of each byte array in the stream.\n // You can also hook this up to MQTT, Kafka, etc.\n println(s\"Streamed ${bytes.length} bytes\")\n }\n .onComplete(_ => actorSystem.terminate())\n
Jelly will compress this stream on the fly, so if the data is repetitive, it will be very efficient. If you run this code, you will notice that the byte sizes for the later graphs are smaller, even though we are sending the same graph over and over again. But even if each graph is completely different, Jelly should still be much faster than other serialization formats.
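The shrinking frame sizes can be illustrated with a toy sketch. This is not Jelly's actual encoding or API: the ToyLookupEncoder class, the 4-byte ID size, and the example IRIs are all assumptions made for illustration. It only shows the general idea of a lookup table, where a string is sent in full once and afterwards referenced by a short ID:

```scala
import scala.collection.mutable

// Toy illustration only: NOT Jelly's wire format or API.
// Each IRI is "sent" in full once, then referenced by a small integer ID.
final class ToyLookupEncoder:
  private val table = mutable.LinkedHashMap.empty[String, Int]

  /** Number of bytes this toy encoder would emit for one IRI. */
  def encodedSize(iri: String): Int =
    table.get(iri) match
      case Some(_) => 4 // already in the table: emit only a 4-byte ID
      case None =>
        table.put(iri, table.size + 1)
        4 + iri.getBytes("UTF-8").length // new entry: ID plus the full string

  /** Size of one "graph", modelled here as a plain sequence of IRIs. */
  def frameSize(iris: Seq[String]): Int = iris.map(encodedSize).sum

// Hypothetical example data
val graph = Seq(
  "http://example.org/sensor/1",
  "http://example.org/temperature",
  "http://example.org/unit/celsius",
)

val encoder = ToyLookupEncoder()
// Encode the same graph three times with the same encoder state
val sizes = (1 to 3).map(_ => encoder.frameSize(graph))

@main def toyLookupDemo(): Unit =
  sizes.zipWithIndex.foreach { (size, i) =>
    println(s"Frame ${i + 1}: $size bytes")
  }
```

In this sketch the first frame carries the full strings, while the second and third frames carry only IDs, which mirrors the behavior you should observe in the streaming example above.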
These streams are very powerful, because they are reactive and asynchronous \u2013 in short, this means you can hook this up to any data source and any data sink \u2013 and you can scale it up as much as you want. If you are unfamiliar with the concept of reactive streams, we recommend you start with this Apache Pekko Streams guide.
Jelly-JVM supports streaming serialization and deserialization of all types of streams in the RDF Stream Taxonomy. You can read more about the theory of this and all available stream types in the Jelly protocol documentation.
Learn more about reactive streaming with Jelly-JVM
Learn more about the types of streams in Jelly
"},{"location":"getting-started-devs/#grpc-streaming","title":"gRPC streaming","text":"Jelly is a bit more than just a serialization format \u2013 it also defines a gRPC-based straming protocol. You can use it for streaming RDF data between microservices, to build a pub/sub system, or to publish RDF data to the web.
Learn more about using Jelly gRPC protocol servers and clients
"},{"location":"getting-started-devs/#further-reading","title":"Further reading","text":"jelly-stream
module and Apache Pekko Streams
examples
directory in the Jelly-JVM repo contains code snippets that demonstrate how to use the library in various scenarios.
If you have any questions about using Jelly-JVM, feel free to open an issue on GitHub.
There is nothing stopping you from using both at the same time. You can also pretty easily add support for any other Java-based RDF library by implementing a few interfaces. More details here.
This guide explains how to use Jelly-JVM with Apache Jena or RDF4J as a plugin, without writing a single line of code. Jelly-JVM provides plugin JARs that you can simply drop in the appropriate directory to get Jelly format support in your application.
"},{"location":"getting-started-plugins/#installation","title":"Installation","text":""},{"location":"getting-started-plugins/#apache-jena-apache-jena-fuseki","title":"Apache Jena, Apache Jena Fuseki","text":"You can simply add Jelly format support to Apache Jena or Apacha Jena Fuseki with Jelly's plugin JAR.
jelly-jena-plugin.jar
file.
$FUSEKI_BASE/extra/
directory. $FUSEKI_BASE
is the directory usually called run
where you have files such as config.ttl
and shiro.ini
. You will most likely need to create the extra
directory yourself.
lib/
directory of your Jena installation.
Content negotiation in Fuseki
Content negotiation using the application/x-jelly-rdf
media type in the Accept
header works in Fuseki since Apache Jena version 5.2.0. Previous versions of Fuseki did not support media type registration.
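As a rough sketch of what such a request looks like from the client side, the snippet below builds (but does not send) an HTTP request with the Jelly media type in the Accept header, using only the JDK's built-in HTTP client classes. The endpoint URL is a placeholder assumption; substitute the address of your own Fuseki dataset:

```scala
import java.net.URI
import java.net.http.HttpRequest

// Hypothetical Fuseki endpoint: adjust host, port, and dataset name.
val endpoint = URI.create("http://localhost:3030/dataset/get")

// Ask the server for a Jelly response through content negotiation
val jellyRequest: HttpRequest = HttpRequest.newBuilder(endpoint)
  .header("Accept", "application/x-jelly-rdf")
  .GET()
  .build()

@main def showJellyRequest(): Unit =
  println(s"${jellyRequest.method} ${jellyRequest.uri}")
  println(s"Accept: ${jellyRequest.headers.firstValue("Accept").orElse("<none>")}")
```

To actually send the request, you could pass it to a java.net.http.HttpClient and read the response body as bytes, which can then be parsed as Jelly data.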
You can simply add Jelly format support to an application based on RDF4J with Jelly's plugin JAR.
jelly-rdf4j-plugin.jar
file.
The Jelly-JVM plugin JARs provide the following features:
.jelly
file extension.
application/x-jelly-rdf
media type.
The Jelly format is registered under the name jelly
in the RDF libraries, so you can use it in the same way as other formats like Turtle, RDF/XML, or JSON-LD.
Jelly-JVM is licensed under the Apache License 2.0.
"},{"location":"licensing/#attribution-citation","title":"Attribution / citation","text":"If you use Jelly-JVM in your research, please the most recent paper about Jelly:
Sowi\u0144ski, P., Wasielewska-Michniewska, K., Ganzha, M., & Paprzycki, M. (2022, October). Efficient RDF streaming for the edge-cloud continuum. In 2022 IEEE 8th World Forum on Internet of Things (WF-IoT) (pp. 1-8). IEEE.
Or use this BibTeX entry:
@inproceedings{sowinski2022efficient,\n title={Efficient RDF streaming for the edge-cloud continuum},\n author={Sowi{\\'n}ski, Piotr and Wasielewska-Michniewska, Katarzyna and Ganzha, Maria and Paprzycki, Marcin and others},\n booktitle={2022 IEEE 8th World Forum on Internet of Things (WF-IoT)},\n pages={1--8},\n year={2022},\n organization={IEEE},\n doi={10.1109/WF-IoT54382.2022.10152225}\n}\n
This paper describes an earlier version of Jelly from 2022. A new paper is in preparation.
"},{"location":"licensing/#jelly-maintainer","title":"Jelly maintainer","text":"Jelly-JVM was created and is maintained by Piotr Sowi\u0144ski (Ostrzyciel) \u2013 GitHub.
"},{"location":"licensing/#see-also","title":"See also","text":"Currently converters for the two most popular RDF JVM libraries are implemented \u2013 RDF4J and Jena. But it is possible to implement your own converters and adapt the Jelly serialization code to any RDF library with little effort.
To do this, you will need to implement three traits (interfaces in Java) from the jelly-core
module: ProtoEncoder
, ProtoDecoderConverter
, and ConverterFactory
.
ProtoEncoder (serialization)
get*
methods deconstruct triple statements, quad statements, and quoted triples (RDF-star). You can make them inline
.
nodeToProto
and graphToProto
should translate into Jelly's representation all possible variations of RDF terms in the SPO and G positions, respectively.
ProtoDecoderConverter (deserialization)
make*
methods should construct new RDF terms and statements. You can make them inline
.
ConverterFactory \u2013 wrapper that allows other modules to use your converter.
ProtoEncoder
and ProtoDecoderConverter
implementations.
Full (versioned) releases are created manually and follow the Semantic Versioning scheme for binary compatibility.
To create a new tagged release (example for version 1.2.3):
$ git checkout main\n$ git pull\n$ git tag v1.2.3\n$ git push origin v1.2.3\n
The rest (packaging and release creation) will be handled automatically by the CI. The release will be pushed to Maven Central.
"},{"location":"dev/releases/#snapshot-releases","title":"Snapshot releases","text":"Snapshot releases are triggered automatically by commits in the main
branch. Snapshots are pushed to the Sonatype snapshot repository.
Jelly-JVM follows Semantic Versioning 2.0.0, with MAJOR.MINOR.PATCH releases. Please see the compatibility table on the main page for the current compatibility information. The documentation is versioned to match each Jelly-JVM MAJOR.MINOR version.
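To make the versioning rule concrete, here is a small sketch that maps a full release version to the MAJOR.MINOR label used by the documentation. The SemVer helper is hypothetical and not part of Jelly-JVM; it also ignores pre-release and build metadata suffixes for brevity:

```scala
// Hypothetical helper, not part of Jelly-JVM.
// Parses a plain MAJOR.MINOR.PATCH version string.
final case class SemVer(major: Int, minor: Int, patch: Int):
  /** The MAJOR.MINOR label that the documentation is versioned by. */
  def docsVersion: String = s"$major.$minor"

object SemVer:
  private val Pattern = """(\d+)\.(\d+)\.(\d+)""".r

  /** Returns None if the string is not of the form MAJOR.MINOR.PATCH. */
  def parse(s: String): Option[SemVer] = s match
    case Pattern(ma, mi, pa) => Some(SemVer(ma.toInt, mi.toInt, pa.toInt))
    case _ => None
```

For example, SemVer.parse("2.1.0").map(_.docsVersion) selects the documentation for the 2.1 line, while an incomplete string such as "2.1" is rejected.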
"},{"location":"user/compatibility/#jvm-and-scala","title":"JVM and Scala","text":"The current version of Jelly-JVM is compatible with Java 17 and newer. Java 17, 21, and 23 are tested in CI and are guaranteed to work. We recommend using a recent release of GraalVM to get the best performance. If you need Java 11 support, you should use Jelly-JVM 1.0.x.
Jelly is built with Scala 3 LTS releases and supports only Scala 3. If you need Scala 2 support, you should use Jelly-JVM 1.0.x.
"},{"location":"user/compatibility/#rdf-libraries","title":"RDF libraries","text":"Major-version upgrades of RDF4J and Apache Jena (e.g., updating from 4.0.x to 5.0.x) are done in Jelly-JVM MINOR releases. Jelly-JVM generally does not use any complex features of these libraries, so it should work with multiple versions without any problems.
If you do encounter any compatibility issues, please report them on the issue tracker.
"},{"location":"user/compatibility/#internal-vs-external-apis","title":"Internal vs external APIs","text":"Generally, all public classes and methods in Jelly-JVM are considered part of the public API. However, there are some exceptions.
Auto-generated classes in the jelly-core
module, eu.ostrzyciel.jelly.core.proto.v1
package are not considered part of the public API, although we will avoid any incompatibilities where possible. These classes may change between MINOR releases.
Jelly-JVM follows the Jelly protocol's backward compatibility policy. This means that Jelly-JVM can read data serialized with older versions of Jelly. Backward compatibility is tested in CI \u2013 the code is in BackCompatSpec.scala.
Forward compatibility is provided only in a very limited manner in Jelly-JVM. If the stream uses a protocol version that is not supported, the parser is guaranteed to parse only the stream options header and reject the rest of the stream. You may choose to disable this check and try to parse the rest of the data anyway, but this is NOT recommended and may lead to unexpected results. In general, Jelly-JVM will ignore any unknown fields in the stream, but any other changes in the protocol may lead to really \"funny\" errors. Forward compatibility is tested in CI \u2013 the code is in ForwardCompatSpec.scala.
"},{"location":"user/compatibility/#see-also","title":"See also","text":"This guide explains the functionalities of the jelly-grpc
module, which implements a gRPC client and server for the Jelly gRPC streaming protocol.
Prerequisites
If you are unfamiliar with gRPC, we recommend you first read some introductory material on the gRPC website or in the Apache Pekko gRPC documentation.
The jelly-grpc
module builds on the functionalities of jelly-stream
, so we recommend you first read the reactive streaming guide.
You may also want to first skim the Jelly gRPC streaming protocol specification to understand the protocol's structure.
As with the jelly-stream
module, you can use jelly-grpc
with any RDF library that has a Jelly integration, such as Apache Jena (using jelly-jena
) or RDF4J (using jelly-rdf4j
). The gRPC API is generic and identical across all libraries.
jelly-grpc
builds on the Apache Pekko gRPC library. Jelly-JVM provides boilerplate code for setting up a gRPC server and client that can send and receive Jelly streams, as shown in the example below:
Source code on GitHub
PekkoGrpc.scalapackage eu.ostrzyciel.jelly.examples\n\nimport com.typesafe.config.ConfigFactory\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.*\nimport eu.ostrzyciel.jelly.grpc.RdfStreamServer\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.NotUsed\nimport org.apache.pekko.actor.typed.ActorSystem\nimport org.apache.pekko.actor.typed.javadsl.Behaviors\nimport org.apache.pekko.grpc.{GrpcClientSettings, GrpcServiceException}\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.concurrent.{Await, ExecutionContext, Future}\nimport scala.concurrent.duration.*\nimport scala.util.{Failure, Success}\n\n/**\n * Example of using Jelly's gRPC client and server to send Jelly streams over the network.\n * This uses the Apache Pekko gRPC library. Its documentation can be found at:\n * https://pekko.apache.org/docs/pekko-grpc/current/index.html\n * \n * See also examples named `PekkoStreams*` for instructions on encoding and decoding RDF streams with Jelly.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoGrpc extends shared.Example:\n // Create a config for Pekko gRPC.\n // We can use the same config for the client and the server, as we are communicating on localhost.\n // This would usually be loaded from a configuration file (e.g., application.conf).\n // More details: https://github.com/lightbend/config\n val config = ConfigFactory.parseString(\n \"\"\"\n |pekko.http.server.preview.enable-http2 = on\n |pekko.grpc.client.jelly.host = 127.0.0.1\n |pekko.grpc.client.jelly.port = 8088\n |pekko.grpc.client.jelly.enable-gzip = true\n |pekko.grpc.client.jelly.use-tls = false\n |pekko.grpc.client.jelly.backend = netty\n 
|\"\"\".stripMargin\n )\n .withFallback(ConfigFactory.defaultApplication())\n\n // We will need two Pekko actor systems to run the streams \u2013 one for the server and one for the client\n val serverActorSystem: ActorSystem[_] = ActorSystem(Behaviors.empty, \"ServerSystem\")\n val clientActorSystem: ActorSystem[_] = ActorSystem(Behaviors.empty, \"ClientSystem\", config)\n\n // Our mock dataset that we will send around in the streams\n val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n\n /**\n * Main method that starts the server and the client.\n */\n def main(args: Array[String]): Unit =\n given system: ActorSystem[_] = serverActorSystem\n given ExecutionContext = system.executionContext\n\n // Start the server\n val exampleService = ExampleJellyService()\n RdfStreamServer(\n RdfStreamServer.Options.fromConfig(config.getConfig(\"pekko.grpc.client.jelly\")),\n exampleService\n ).run() onComplete {\n case Success(binding) =>\n // If the server started successfully, start the client\n println(s\"[SERVER] Bound to ${binding.localAddress}\")\n runClient()\n case Failure(exception) =>\n // Otherwise, print the error and terminate the actor system\n println(s\"[SERVER] Failed to bind: $exception\")\n system.terminate()\n }\n\n\n /**\n * The client part of the example.\n */\n private def runClient(): Unit =\n given system: ActorSystem[_] = clientActorSystem\n given ExecutionContext = system.executionContext\n\n // Create a gRPC client\n val client = RdfStreamServiceClient(GrpcClientSettings.fromConfig(\"jelly\"))\n\n // First, let's try to publish some data to the server\n val frameSource = EncoderSource.fromDatasetAsQuads(\n dataset,\n ByteSizeLimiter(500),\n JellyOptions.smallStrict.withStreamName(\"weather\")\n )\n println(\"[CLIENT] Publishing data to the server...\")\n val publishFuture = client.publishRdf(frameSource) map { response =>\n println(s\"[CLIENT] Received acknowledgment: $response\")\n } 
recover {\n case e =>\n println(s\"[CLIENT] Failed to publish data: $e\")\n }\n // Wait for the publish to complete\n Await.ready(publishFuture, 10.seconds)\n\n // Now, let's try to subscribe to some data from the server in the QUADS format\n println(\"\\n\\n[CLIENT] Subscribing to QUADS data from the server...\")\n val quadsFuture = client\n .subscribeRdf(RdfStreamSubscribe(\n \"weather\",\n Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.QUADS))\n ))\n .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n .runFold(0L)((acc, _) => acc + 1)\n // Process the result of the stream (Future[Long])\n .map { counter =>\n println(s\"[CLIENT] Received $counter quads.\")\n } recover {\n case e =>\n println(s\"[CLIENT] Failed to receive quads: $e\")\n }\n Await.ready(quadsFuture, 10.seconds)\n\n // Let's try the same, with a GRAPHS stream\n println(\"\\n\\n[CLIENT] Subscribing to GRAPHS data from the server...\")\n val graphsFuture = client\n .subscribeRdf(RdfStreamSubscribe(\n \"weather\",\n Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.GRAPHS))\n ))\n // Decode the response and transform it into a stream of quads\n .via(DecoderFlow.decodeGraphs.asDatasetStreamOfQuads)\n .mapConcat(identity)\n .runFold(0L)((acc, _) => acc + 1)\n // Process the result of the stream (Future[Long])\n .map { counter =>\n println(s\"[CLIENT] Received $counter quads.\")\n } recover {\n case e =>\n println(s\"[CLIENT] Failed to receive data: $e\")\n }\n Await.ready(graphsFuture, 10.seconds)\n\n // Finally, let's try to subscribe to a stream that the server does not support\n // We will request TRIPLES, but the server only supports QUADS and GRAPHS.\n println(\"\\n\\n[CLIENT] Subscribing to TRIPLES data from the server...\")\n val triplesFuture = client\n .subscribeRdf(RdfStreamSubscribe(\n \"weather\",\n Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.TRIPLES))\n ))\n .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n .runFold(0L)((acc, _) 
=> acc + 1)\n .map { counter =>\n println(s\"[CLIENT] Received $counter triples.\")\n } recover {\n case e =>\n println(s\"[CLIENT] Failed to receive triples: $e\")\n }\n Await.result(triplesFuture, 10.seconds)\n\n println(\"\\n\\n[CLIENT] Terminating...\")\n system.terminate()\n println(\"[SERVER] Terminating...\")\n serverActorSystem.terminate()\n\n\n /**\n * Example implementation of RdfStreamService to act as the server.\n * \n * You will also need to implement this trait in your own service. It defines the logic with which the server\n * will handle incoming streams and subscriptions.\n */\n class ExampleJellyService(using system: ActorSystem[_]) extends RdfStreamService:\n given ExecutionContext = system.executionContext\n\n /**\n * Handler for clients publishing RDF streams to the server.\n * \n * We receive a stream of RdfStreamFrames and must respond with an acknowledgment (or an error).\n */\n override def publishRdf(in: Source[RdfStreamFrame, NotUsed]): Future[RdfStreamReceived] =\n // Decode the incoming stream and count the number of RDF statements in it\n in.via(DecoderFlow.decodeAny.asFlatStream)\n .runFold(0L)((acc, _) => acc + 1)\n .map(counter => {\n println(s\"[SERVER] Received ${counter} RDF statements. 
Sending acknowledgment.\")\n // Send an acknowledgment back to the client\n RdfStreamReceived()\n })\n\n /**\n * Handler for clients subscribing to RDF streams from the server.\n * \n * We receive a subscription request and must respond with a stream of RdfStreamFrames or an error.\n */\n override def subscribeRdf(in: RdfStreamSubscribe): Source[RdfStreamFrame, NotUsed] =\n println(s\"[SERVER] Received subscription request for topic ${in.topic}.\")\n // First, check the requested physical stream type\n val streamType = in.requestedOptions match\n case Some(options) =>\n println(s\"[SERVER] Requested physical stream type: ${options.physicalType}.\")\n options.physicalType\n case None =>\n println(s\"[SERVER] No requested stream options.\")\n PhysicalStreamType.UNSPECIFIED\n\n // Get the stream options requested by the client or the default options if none were provided\n val options = in.requestedOptions.getOrElse(JellyOptions.smallStrict)\n .withStreamName(in.topic)\n // Check if the requested options are supported\n // !!! THIS IS IMPORTANT !!!\n // If you don't check if the requested options are supported, you may be vulnerable to\n // denial-of-service attacks. For example, a client could request a very large lookup table\n // that would consume a lot of memory on the server.\n try\n JellyOptions.checkCompatibility(options, JellyOptions.defaultSupportedOptions)\n catch\n case e: IllegalArgumentException =>\n // If the requested options are not supported, return an error\n return Source.failed(new GrpcServiceException(\n io.grpc.Status.INVALID_ARGUMENT.withDescription(e.getMessage)\n ))\n\n streamType match\n // This server implementation only supports QUADS and GRAPHS streams... 
and in both cases\n // it will always stream the same dataset.\n // You can of course implement more complex logic here, e.g., to stream different data based on the topic.\n case PhysicalStreamType.QUADS => EncoderSource.fromDatasetAsQuads(\n dataset,\n ByteSizeLimiter(16_000),\n options\n )\n case PhysicalStreamType.GRAPHS => EncoderSource.fromDatasetAsGraphs(\n dataset,\n Some(ByteSizeLimiter(16_000)),\n options\n )\n // PhysicalStreamType.TRIPLES is not supported here \u2013 the server will throw a gRPC error\n // if the client requests it.\n // This is an example of how to properly handle unsupported stream options requested by the client.\n // The library is able to automatically convert the error into a gRPC status and send it back to the client.\n case _ => Source.failed(new GrpcServiceException(\n io.grpc.Status.INVALID_ARGUMENT.withDescription(\"Unsupported physical stream type\")\n ))\n
The classes provided in jelly-grpc
should cover most cases, but they serve only as boilerplate. You must define the logic for handling the incoming and outgoing streams yourself, as shown in the example above.
Of course, you can also implement the server or the client from scratch, if you want to.
"},{"location":"user/grpc/#see-also","title":"See also","text":"This guide explains the functionalities of the jelly-jena
module, which provides Jelly support for Apache Jena.
If you just want to add Jelly format support to Apache Jena / Apache Jena Fuseki, you can use the Jelly-JVM plugin JAR. See the dedicated guide for more information.
"},{"location":"user/jena/#base-facilities","title":"Base facilities","text":"jelly-jena
implements the eu.ostrzyciel.jelly.core.ConverterFactory
trait in eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory
. This factory allows you to build encoders and decoders that convert between Jelly's RdfStreamFrame
s and Apache Jena's Triple
and Quad
objects. The eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame
class is an object representation of Jelly's binary format.
The module also implements the eu.ostrzyciel.jelly.core.IterableAdapter
trait in eu.ostrzyciel.jelly.convert.jena.JenaIterableAdapter
. This adapter provides extension methods for Apache Jena's Model
, Dataset
, Graph
, and DatasetGraph
classes to convert them into an iterable of triples (.asTriples
), quads (.asQuads
), or named graphs (.asGraphs
). This is useful when working with Jelly on a lower level or when using the jelly-stream
module.
jelly-jena
implements an RDF writer and reader for Apache Jena's RIOT library. This means you can use Jelly just like, for example, Turtle or RDF/XML. See the example below:
Source code on GitHub
JenaRiot.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.*\nimport org.apache.jena.rdf.model.ModelFactory\nimport org.apache.jena.riot.{RDFDataMgr, RDFFormat, RDFParser, RDFWriterRegistry, RIOT}\n\nimport java.io.{File, FileOutputStream}\nimport scala.util.Using\n\n/**\n * Example of using Jelly's integration with Apache Jena's RIOT library for\n * writing and reading RDF graphs and datasets to/from disk.\n *\n * See also: https://jena.apache.org/documentation/io/\n */\nobject JenaRiot extends shared.Example:\n def main(args: Array[String]): Unit =\n // Load the RDF graph from an N-Triples file\n val model = RDFDataMgr.loadModel(File(getClass.getResource(\"/weather.nt\").toURI).toURI.toString)\n\n // Print the size of the model\n println(s\"Loaded an RDF graph from N-Triples with size: ${model.size}\")\n\n Using.resource(new FileOutputStream(\"weather.jelly\")) { out =>\n // Write the model to a Jelly file\n // Note: by default this will use the [[JellyFormat.JELLY_SMALL_STRICT]] format variant\n RDFDataMgr.write(out, model, JellyLanguage.JELLY)\n println(\"Saved the model to a Jelly file\")\n }\n\n // Load the RDF graph from a Jelly file\n val model2 = RDFDataMgr.loadModel(\"weather.jelly\", JellyLanguage.JELLY)\n\n // Print the size of the model\n println(s\"Loaded an RDF graph from Jelly with size: ${model2.size}\")\n\n\n\n // ---------------------------------\n println(\"\\n\")\n\n // Try the same with an RDF dataset and some different settings\n val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n println(s\"Loaded an RDF dataset from a Trig file with ${dataset.asDatasetGraph.size} named graphs and \" +\n s\"${dataset.asDatasetGraph.stream.count} quads\")\n\n Using.resource(new FileOutputStream(\"weather-quads.jelly\")) { out =>\n // Write the dataset to a Jelly file, using the \"BIG\" settings\n // (better compression for 
big files, more memory usage)\n RDFDataMgr.write(out, dataset, JellyFormat.JELLY_BIG_STRICT)\n println(\"Saved the dataset to a Jelly file\")\n }\n\n // Load the RDF dataset from a Jelly file\n val dataset2 = RDFDataMgr.loadDataset(\"weather-quads.jelly\", JellyLanguage.JELLY)\n println(s\"Loaded an RDF dataset from Jelly with ${dataset2.asDatasetGraph.size} named graphs and \" +\n s\"${dataset2.asDatasetGraph.stream.count} quads\")\n\n // ---------------------------------\n println(\"\\n\")\n\n // Custom Jelly format \u2013 change any settings you like\n val customFormat = new RDFFormat(\n JellyLanguage.JELLY,\n JellyFormatVariant(\n opt = JellyOptions.smallStrict\n .withMaxPrefixTableSize(0) // disable the prefix table\n .withStreamName(\"My weather stream\"), // add metadata to the stream\n frameSize = 16 // make RdfStreamFrames with 16 rows each\n )\n )\n\n // Jena requires us to register the custom format \u2013 once for graphs and once for datasets,\n // as Jelly supports both.\n RDFWriterRegistry.register(customFormat, JellyGraphWriterFactory)\n RDFWriterRegistry.register(customFormat, JellyDatasetWriterFactory)\n\n Using.resource(new FileOutputStream(\"weather-quads-custom.jelly\")) { out =>\n // Write the dataset to a Jelly file using the custom format\n RDFDataMgr.write(out, dataset, customFormat)\n println(\"Saved the dataset to a Jelly file with custom settings\")\n }\n\n // Load the RDF dataset from a Jelly file with the custom format\n val dataset3 = RDFDataMgr.loadDataset(\"weather-quads-custom.jelly\", JellyLanguage.JELLY)\n println(s\"Loaded an RDF dataset from Jelly with custom settings with ${dataset3.asDatasetGraph.size} named graphs\" +\n s\" and ${dataset3.asDatasetGraph.stream.count} quads\")\n\n // ---------------------------------\n println(\"\\n\")\n\n // By default, the parser has limits on for example the maximum size of the lookup tables.\n // The default supported options are [[JellyOptions.defaultSupportedOptions]].\n // You can 
change these limits by creating your own options object.\n val customOptions = JellyOptions.defaultSupportedOptions\n .withMaxNameTableSize(50) // set the maximum size of the name table to 50\n // Create a Context object with the custom options\n val parserContext = RIOT.getContext.copy()\n .set(JellyLanguage.SYMBOL_SUPPORTED_OPTIONS, customOptions)\n\n println(\"Trying to load the model with custom supported options...\")\n val model3 = ModelFactory.createDefaultModel()\n try\n // The loading operation should fail because our allowed max name table size is too low\n RDFParser.create()\n .source(\"weather.jelly\")\n .lang(JellyLanguage.JELLY)\n // Set the context object with the custom options\n .context(parserContext)\n .parse(model3)\n catch\n case e: RdfProtoDeserializationError =>\n // The stream uses a name table size of 128, which is larger than the maximum supported size of 50.\n // To read this stream, set maxNameTableSize to at least 128 in the supportedOptions for this decoder.\n println(s\"Failed to load the model with custom options: ${e.getMessage}\")\n
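The failure at the end of the example above comes from comparing the table sizes declared by the stream against the limits the reader is willing to accept. The following is a minimal, self-contained sketch of that kind of check. It is not Jelly-JVM's actual implementation: `TableLimits` and `SupportedOptionsSketch` are hypothetical names, and only two of the options are modelled. The sizes 128 and 16 are the ones reported in the error messages quoted in the examples on this page.

```scala
// Hypothetical, simplified stand-in for the stream options (not a Jelly-JVM class).
final case class TableLimits(maxNameTableSize: Int, maxPrefixTableSize: Int)

object SupportedOptionsSketch:
  /** Returns Left(message) if the stream asks for larger tables than the reader supports. */
  def check(stream: TableLimits, supported: TableLimits): Either[String, Unit] =
    if stream.maxNameTableSize > supported.maxNameTableSize then
      Left(s"The stream uses a name table size of ${stream.maxNameTableSize}, " +
        s"which is larger than the maximum supported size of ${supported.maxNameTableSize}.")
    else if stream.maxPrefixTableSize > supported.maxPrefixTableSize then
      Left(s"The stream uses a prefix table size of ${stream.maxPrefixTableSize}, " +
        s"which is larger than the maximum supported size of ${supported.maxPrefixTableSize}.")
    else Right(())

@main def supportedOptionsDemo(): Unit =
  val stream = TableLimits(maxNameTableSize = 128, maxPrefixTableSize = 16)
  // Too strict: the reader only accepts name tables of up to 50 entries
  println(SupportedOptionsSketch.check(stream, TableLimits(50, 16)))
  // Permissive enough: the check passes
  println(SupportedOptionsSketch.check(stream, TableLimits(128, 16)))
```

Rejecting a stream before allocating its lookup tables is a sensible default for readers that accept untrusted input, which is why raising the limits is left as an explicit choice.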
Usage notes:
eu.ostrzyciel.jelly.core.JellyOptions
provides a few common presets for Jelly serialization options, which you can use to construct a JellyFormatVariant
, as shown in the example above. You can also further customize the serialization options (e.g., dictionary size).jelly-stream
module or the more low-level API: Low-level usage.RdfStreamFrame
in the file.eu.ostrzyciel.jelly.convert.jena.riot.JellyLanguage
object (source code). This registration should happen automatically when you include the jelly-jena
module in your project, using Jena's component initialization mechanism.jelly-jena
also implements a streaming writer (StreamRDF
API in Jena). Using it is similar to the regular RIOT writer, with a slightly different setup:
Source code on GitHub
JenaRiotStreaming.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.PhysicalStreamType\nimport org.apache.jena.graph.{NodeFactory, Triple}\nimport org.apache.jena.riot.system.{StreamRDFLib, StreamRDFWriter}\nimport org.apache.jena.riot.{RDFDataMgr, RDFParser, RIOT}\n\nimport java.io.{File, FileOutputStream}\nimport scala.util.Using\n\n/**\n * Example of using Apache Jena's streaming IO API with Jelly.\n *\n * See also: https://jena.apache.org/documentation/io/streaming-io.html\n */\nobject JenaRiotStreaming extends shared.Example:\n def main(args: Array[String]): Unit =\n // Initialize a Jena StreamRDF to consume the statements\n val readerStream = StreamRDFLib.count()\n\n println(\"Reading a stream of triples from a Jelly file...\")\n\n // Parse a Jelly file as a stream of triples\n val inputFileTriples = new File(getClass.getResource(\"/jelly/weather.jelly\").toURI)\n RDFParser\n .source(inputFileTriples.toURI.toString)\n .lang(JellyLanguage.JELLY)\n .parse(readerStream)\n\n println(f\"Read ${readerStream.countTriples()} triples\")\n println()\n println(\"Reading a stream of quads from a Jelly file...\")\n\n // Parse a different Jelly file as a stream of quads and send it to the same sink\n val inputFileQuads = new File(getClass.getResource(\"/jelly/weather-quads.jelly\").toURI)\n RDFParser\n .source(inputFileQuads.toURI.toString)\n .lang(JellyLanguage.JELLY)\n .parse(readerStream)\n\n // Print the number of triples and quads\n //\n // The number of triples here is the sum of the triples from the first file and the triples\n // in the default graph of the second file. 
This is just how Jena handles it.\n println(f\"Read ${readerStream.countTriples()} triples (in total)\" +\n f\" and ${readerStream.countQuads()} quads\")\n\n // -------------------------------------\n println(\"\\n\")\n\n println(\"Writing a stream of 10 triples to a file...\")\n\n // Try writing some triples to a file\n // We need to create an instance of RdfStreamOptions to pass to the writer:\n val options = JellyOptions.smallStrict\n // The stream writer does not know if we will be writing triples or quads \u2013 we\n // have to specify the physical stream type explicitly.\n .withPhysicalType(PhysicalStreamType.TRIPLES)\n .withStreamName(\"A stream of 10 triples\")\n\n // To pass the options, we use Jena's Context mechanism\n val context = RIOT.getContext.copy()\n .set(JellyLanguage.SYMBOL_STREAM_OPTIONS, options)\n .set(JellyLanguage.SYMBOL_FRAME_SIZE, 128) // optional, default is 256\n\n Using.resource(new FileOutputStream(\"stream-riot.jelly\")) { out =>\n // Create the writer \u2013 remember to pass the context!\n val writerStream = StreamRDFWriter.getWriterStream(out, JellyLanguage.JELLY, context)\n writerStream.start()\n\n for i <- 1 to 10 do\n writerStream.triple(Triple.create(\n NodeFactory.createBlankNode(),\n NodeFactory.createURI(\"https://example.org/p\"),\n NodeFactory.createLiteralString(s\"object $i\")\n ))\n\n writerStream.finish()\n }\n\n println(\"Done writing triples\")\n\n // Load the RDF graph that we just saved using normal RIOT API\n val model = RDFDataMgr.loadModel(\"stream-riot.jelly\", JellyLanguage.JELLY)\n\n println(\"Loaded the stream from disk, contents:\\n\")\n model.write(System.out, \"NT\")\n
"},{"location":"user/jena/#see-also","title":"See also","text":"Warning
This page describes a low-level API that is a bit of a hassle to use directly. It's recommended to use the higher-level abstractions provided by the jelly-stream
module, or the integrations with Apache Jena's RIOT or RDF4J's Rio libraries. If you really want to use this, it is highly recommended that you first get a basic understanding of how Jelly works under the hood and take a look at the code in the jelly-stream
module to see how it's done there.
Note
The following guide uses the Apache Jena library as an example. The exact same thing can be done with RDF4J or any other RDF library that has a Jelly integration.
"},{"location":"user/low-level/#deserialization","title":"Deserialization","text":"To parse a serialized stream frame into triples/quads:
eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame.parseFrom
if it's a non-delimited frame (like you would see, e.g., in a Kafka or gRPC stream), or parseDelimitedFrom
if it's a delimited stream (like you would see in a file or a socket).eu.ostrzyciel.jelly.core.IoUtils.autodetectDelimiting
. In most cases you will not need to use it. It is used internally by the Jena and RDF4J integrations for user convenience.RdfStreamFrame
s into triples/quads: eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory
has different methods for different physical stream types:anyStatementDecoder
for any physical stream type, outputs Triple
or Quad
triplesDecoder
for TRIPLES streams, outputs Triple
quadsDecoder
for QUADS streams, outputs Quad
graphsDecoder
for GRAPHS streams, outputs (Node, Iterable[Triple])
graphsAsQuadsDecoder
for GRAPHS streams, outputs Quad
ingestRow
method to get the output iteratively.
To serialize triples/quads into a stream frame:
asTriples
/asQuads
/asGraphs
extension methods provided by the eu.ostrzyciel.jelly.convert.jena.JenaIterableAdapter
object.RdfStreamRow
s (the rows of a stream frame): use the eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory.encoder
method to get an instance of eu.ostrzyciel.jelly.convert.jena.JenaProtoEncoder
.RdfStreamFrame
s. What you do here depends highly on the logical stream type you are working with.
This guide explains the functionalities of the jelly-rdf4j
module, which provides Jelly support for Eclipse RDF4J.
If you just want to add Jelly format support to your RDF4J application, you can use the Jelly-JVM plugin JAR. See the dedicated guide for more information.
"},{"location":"user/rdf4j/#base-facilities","title":"Base facilities","text":"jelly-rdf4j
implements the eu.ostrzyciel.jelly.core.ConverterFactory
trait in eu.ostrzyciel.jelly.convert.rdf4j.Rdf4jConverterFactory
. This factory allows you to build encoders and decoders that convert between Jelly's RdfStreamFrame
s and RDF4J's Statement
objects. The eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame
class is an object representation of Jelly's binary format.
The module also implements the eu.ostrzyciel.jelly.core.IterableAdapter
trait in eu.ostrzyciel.jelly.convert.rdf4j.Rdf4jIterableAdapter
. This adapter provides extension methods for RDF4J's Model
class to convert it into an iterable of triples (.asTriples
), quads (.asQuads
), or named graphs (.asGraphs
). This is useful when working with Jelly on a lower level or when using the jelly-stream
module.
jelly-rdf4j
implements an RDF writer and parser for Eclipse RDF4J's Rio library. This means you can use Jelly just like any other RDF serialization format (e.g., RDF/XML, Turtle). See the example below:
Source code on GitHub
Rdf4jRio.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.rdf4j.rio.*\nimport eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.{PhysicalStreamType, RdfStreamOptions}\nimport org.eclipse.rdf4j.model.Statement\nimport org.eclipse.rdf4j.rio.helpers.StatementCollector\nimport org.eclipse.rdf4j.rio.{RDFFormat, Rio}\n\nimport java.io.{File, FileOutputStream}\nimport scala.jdk.CollectionConverters.*\nimport scala.util.Using\n\n/**\n * Example of using RDF4J's Rio library to read and write RDF data.\n *\n * See also: https://rdf4j.org/documentation/programming/rio/\n */\nobject Rdf4jRio extends shared.Example:\n def main(args: Array[String]): Unit =\n // Load the RDF graph from an N-Triples file\n val inputFile = File(getClass.getResource(\"/weather.nt\").toURI)\n val triples = readRdf4j(inputFile, RDFFormat.TURTLE, None)\n\n // Print the size of the graph\n println(s\"Loaded ${triples.size} triples from an N-Triples file\")\n\n // Write the RDF graph to a Jelly file\n // First, create the stream's options:\n val options = JellyOptions.smallStrict\n // Setting the physical stream type is mandatory! 
It will always be either TRIPLES or QUADS.\n .withPhysicalType(PhysicalStreamType.TRIPLES)\n // Set other optional options\n .withStreamName(\"My weather data\")\n // Create the config object to pass to the writer\n val config = JellyWriterSettings.configFromOptions(options, frameSize = 128)\n\n // Do the actual writing\n Using.resource(new FileOutputStream(\"weather.jelly\")) { out =>\n val writer = Rio.createWriter(JELLY, out)\n writer.setWriterConfig(config)\n writer.startRDF()\n triples.foreach(writer.handleStatement)\n writer.endRDF()\n }\n\n println(\"Saved the model to a Jelly file\")\n\n // Load the RDF graph from the Jelly file\n val jellyFile = File(\"weather.jelly\")\n val jellyTriples = readRdf4j(jellyFile, JELLY, None)\n\n // Print the size of the graph\n println(s\"Loaded ${jellyTriples.size} triples from a Jelly file\")\n\n // ---------------------------------\n println(\"\\n\")\n // By default, the parser has limits on for example the maximum size of the lookup tables.\n // The default supported options are [[JellyOptions.defaultSupportedOptions]].\n // You can change these limits by creating your own options object.\n val customOptions = JellyOptions.defaultSupportedOptions\n .withMaxPrefixTableSize(10) // set the maximum size of the prefix table to 10\n println(\"Trying to read the Jelly file with custom options...\")\n try\n // This operation should fail because the Jelly file uses a prefix table larger than 10\n val customTriples = readRdf4j(jellyFile, JELLY, Some(customOptions))\n catch\n case e: RdfProtoDeserializationError =>\n // The stream uses a prefix table size of 16, which is larger than the maximum supported size of 10.\n // To read this stream, set maxPrefixTableSize to at least 16 in the supportedOptions for this decoder.\n println(s\"Failed to read the Jelly file with custom options: ${e.getMessage}\")\n\n\n /**\n * Helper function to read RDF data using RDF4J's Rio library.\n * @param file file to read from\n * @param format RDF 
format\n * @param supportedOptions supported options for reading Jelly streams (optional)\n * @return sequence of RDF statements\n */\n private def readRdf4j(file: File, format: RDFFormat, supportedOptions: Option[RdfStreamOptions]): Seq[Statement] =\n val parser = Rio.createParser(format)\n val collector = new StatementCollector()\n parser.setRDFHandler(collector)\n supportedOptions.foreach(opt =>\n // If the user provided supported options, set them on the parser\n parser.setParserConfig(JellyParserSettings.configFromOptions(opt))\n )\n Using.resource(file.toURI.toURL.openStream()) { is =>\n parser.parse(is)\n }\n collector.getStatements.asScala.toSeq\n
Usage notes:
eu.ostrzyciel.jelly.core.JellyOptions
provides a few common presets for Jelly serialization options. These options are passed through eu.ostrzyciel.jelly.convert.rdf4j.rio.JellyWriterSettings.configFromOptions
and used to configure the writer, as shown in the example above. You can also further customize the serialization options (e.g., dictionary size).jelly-stream
module or the more low-level API: Low-level usage.RdfStreamFrame
in the file.eu.ostrzyciel.jelly.convert.rdf4j.rio
package (source code). They are automatically registered on startup using the RDFParserFactory
and RDFWriterFactory
SPIs provided by RDF4J.
This guide explains the reactive streaming functionalities of the jelly-stream
module.
Prerequisites
If you are unfamiliar with the concept of reactive streams or Apache Pekko Streams, we highly recommend that you start by reading about the basic concepts of Pekko Streams.
We also recommend you first read about the RDF stream types in Jelly. Otherwise, this guide may not make much sense.
You can use jelly-stream
with any RDF library that has a Jelly integration, such as Apache Jena (using jelly-jena
) or RDF4J (using jelly-rdf4j
). The streaming API is generic and identical across all libraries.
The key notions of this API are encoders and decoders.
Triple
in Apache Jena) into an object representation of Jelly's binary format (RdfStreamFrame
).RdfStreamFrame
s into objects from your RDF library of choice.
So, for example, an encoder flow for flat triple streams would have a type of Flow[Triple, RdfStreamFrame, NotUsed]
in Apache Jena. The opposite (a flat triple stream decoder) would have a type of Flow[RdfStreamFrame, Triple, NotUsed]
.
RdfStreamFrame
s can be converted to and from raw bytes using a range of methods, depending on your use case. See the sections below for examples.
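Which byte conversion fits depends mostly on whether the transport already separates messages. A frame sent as a single Kafka or gRPC message needs no extra framing, while frames written back to back into a file or socket need a delimiter. The sketch below illustrates the length-prefix idea used by Protobuf-style delimited streams, with only the standard library: each payload is preceded by its length encoded as a base-128 varint. The names `DelimitedFramingSketch`, `writeDelimited`, `readDelimited` and `roundTrip` are hypothetical and not part of Jelly-JVM, and the byte arrays merely stand in for serialized frames.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, EOFException, InputStream, OutputStream}

object DelimitedFramingSketch:
  /** Writes the payload length as a base-128 varint, followed by the payload itself. */
  def writeDelimited(out: OutputStream, payload: Array[Byte]): Unit =
    var len = payload.length
    while len > 0x7F do
      out.write((len & 0x7F) | 0x80) // lower 7 bits, with the continuation bit set
      len >>>= 7
    out.write(len)
    out.write(payload)

  /** Reads one length-prefixed payload, or returns None at a clean end of stream. */
  def readDelimited(in: InputStream): Option[Array[Byte]] =
    val first = in.read()
    if first == -1 then None
    else
      var len = first & 0x7F
      var shift = 7
      var more = (first & 0x80) != 0
      while more do
        val b = in.read()
        if b == -1 then throw new EOFException("Truncated length prefix")
        len |= (b & 0x7F) << shift
        shift += 7
        more = (b & 0x80) != 0
      val payload = in.readNBytes(len)
      if payload.length != len then throw new EOFException("Truncated payload")
      Some(payload)

  /** Writes all payloads into one buffer and reads them back one by one. */
  def roundTrip(payloads: Seq[Array[Byte]]): Seq[Array[Byte]] =
    val out = new ByteArrayOutputStream()
    payloads.foreach(writeDelimited(out, _))
    val in = new ByteArrayInputStream(out.toByteArray)
    Iterator.continually(readDelimited(in)).takeWhile(_.isDefined).flatten.toSeq
```

Without the length prefix, a reader of the concatenated bytes would have no way to tell where one frame ends and the next begins, which is why the non-delimited form is only suitable for transports that frame messages themselves.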
EncoderSource
)","text":"The easiest way to start is with flat RDF streams (i.e., flat streams of triples or quads). You can convert an RDF dataset or graph into such using the methods in eu.ostrzyciel.jelly.stream.EncoderSource
.
Source code on GitHub
PekkoStreamsEncoderSource.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.EncoderSource]] utility to convert RDF graphs and datasets\n * into Jelly streams with a single method call.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsEncoderSource extends shared.Example:\n def main(args: Array[String]): Unit =\n // We will need a Pekko actor system to run the streams\n given actorSystem: ActorSystem = ActorSystem()\n // And an execution context for the futures\n given ExecutionContext = actorSystem.getDispatcher\n\n // Load an example RDF graph from an N-Triples file\n val model = RDFDataMgr.loadModel(File(getClass.getResource(\"/weather.nt\").toURI).toURI.toString)\n\n println(s\"Loaded model with ${model.size()} triples\")\n println(s\"Streaming the model to memory...\")\n\n // Create a Pekko Streams Source from the Jena model\n // This automatically sets the physical and logical stream types.\n val encodedModelFuture = EncoderSource\n .fromGraph(\n model,\n // Aim for frames with ~2000 bytes \u2013 may be more!\n ByteSizeLimiter(2000),\n JellyOptions.smallStrict,\n )\n // wireTap: print the size of the frames\n // Notice in the output that the frames are slightly bigger than 2000 bytes.\n .wireTap(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes on wire\"))\n // Convert each stream frame to bytes\n .via(JellyIo.toBytes)\n 
// Collect the stream into a sequence\n .runWith(Sink.seq)\n\n // Wait for the stream to complete and collect the result\n val encodedModel = Await.result(encodedModelFuture, 10.seconds)\n\n println(s\"Streamed model to memory with ${encodedModel.size} frames and\" +\n s\" ${encodedModel.map(_.length).sum} bytes on wire\")\n\n println(\"\\n\")\n\n // -------------------------------------------------------------------\n // Second example: try encoding an RDF dataset as a GRAPHS stream\n val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n println(s\"Loaded dataset with ${dataset.asDatasetGraph.size} named graphs\")\n println(s\"Streaming the dataset to memory...\")\n\n val encodedDatasetFuture = EncoderSource\n // Here we stream this as a GRAPHS stream (physical type)\n // You can also use .fromDatasetAsQuads to stream as QUADS\n .fromDatasetAsGraphs(\n dataset,\n // This time we limit the number of rows in each frame to 30\n // Note that for this particular encoder, we can skip the limiter entirely \u2013 but this can lead to huge frames!\n // So, be careful with that, or you may get an out-of-memory error.\n Some(StreamRowCountLimiter(30)),\n JellyOptions.smallStrict,\n )\n // wireTap: print the size of the frames\n // Note that some frames are smaller than the limit \u2013 this is because this encoder will always split frames\n // on graph boundaries.\n .wireTap(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes on wire\"))\n // Convert each stream frame to bytes\n .via(JellyIo.toBytes)\n // Collect the stream into a sequence\n .runWith(Sink.seq)\n\n // Wait for the stream to complete and collect the result\n val encodedDataset = Await.result(encodedDatasetFuture, 10.seconds)\n\n println(s\"Streamed dataset to memory with ${encodedDataset.size} frames and\" +\n s\" ${encodedDataset.map(_.length).sum} bytes on wire\")\n\n actorSystem.terminate()\n
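The comments in the example above note that frames produced with a byte size limit end up slightly larger than the limit. One plausible way to get that behavior is to close a frame only after the row that pushes it over the budget has been added. The sketch below shows this strategy in plain Scala. It is an assumption made for illustration, not the actual logic of `ByteSizeLimiter`, and `ByteBudgetSketch` and `groupByByteBudget` are hypothetical names.

```scala
object ByteBudgetSketch:
  /**
   * Groups items into batches, closing a batch as soon as its accumulated size
   * reaches `limit`. A closed batch can therefore be somewhat larger than the limit.
   */
  def groupByByteBudget[A](items: Seq[A], limit: Int)(sizeOf: A => Int): Seq[Seq[A]] =
    val batches = Seq.newBuilder[Seq[A]]
    var current = Vector.empty[A]
    var currentSize = 0
    for item <- items do
      current :+= item
      currentSize += sizeOf(item)
      if currentSize >= limit then
        batches += current
        current = Vector.empty
        currentSize = 0
    // Flush the last, possibly undersized batch
    if current.nonEmpty then batches += current
    batches.result()

@main def byteBudgetDemo(): Unit =
  // Ten rows of 300 bytes each, with a target of 1000 bytes per batch
  val rows = Seq.fill(10)(300)
  val batches = ByteBudgetSketch.groupByByteBudget(rows, 1000)(identity)
  batches.foreach(b => println(s"Batch with ${b.size} rows, ${b.sum} bytes"))
```

Checking the budget after adding a row keeps the logic simple and guarantees that every batch contains at least one row, at the cost of overshooting the target by up to one row's size.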
"},{"location":"user/reactive/#encoding-any-rdf-data-as-a-flat-or-grouped-stream-encoderflow","title":"Encoding any RDF data as a flat or grouped stream (EncoderFlow
)","text":"The eu.ostrzyciel.jelly.stream.EncoderFlow
provides even more options for turning RDF data into Jelly streams, including both grouped and flat streams. Every type of RDF stream in Jelly can be created using this API.
Source code on GitHub
PekkoStreamsEncoderFlow.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.EncoderFlow]] utility to encode RDF data as Jelly streams.\n * \n * Here, the RDF data is turned into a series of byte buffers, with each buffer corresponding to exactly one frame.\n * This is suitable if your streaming protocol (e.g., Kafka, MQTT, AMQP) already frames the messages.\n * If you are writing to a raw socket or file, then you must use the DELIMITED variant of Jelly instead.\n * See [[eu.ostrzyciel.jelly.examples.PekkoStreamsWithIo]] for examples of that.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsEncoderFlow extends shared.Example:\n def main(args: Array[String]): Unit =\n // We will need a Pekko actor system to run the streams\n given actorSystem: ActorSystem = ActorSystem()\n // And an execution context for the futures\n given ExecutionContext = actorSystem.getDispatcher\n\n // Load the example dataset\n val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n // First, let's see what views of the dataset can we obtain using Jelly's Iterable adapters:\n // 1. Iterable of all quads in the dataset\n val quads: immutable.Iterable[Quad] = dataset.asQuads\n // 2. 
Iterable of all graphs (named and default) in the dataset\n val graphs: immutable.Iterable[(Node, Iterable[Triple])] = dataset.asGraphs\n // 3. Iterable of all triples in the default graph\n val triples: immutable.Iterable[Triple] = dataset.getDefaultModel.asTriples\n\n // Note: here we are not turning the frames into bytes, but just printing their size in bytes.\n // You can find an example of how to turn a frame into a byte array in the `PekkoStreamsEncoderSource` example.\n // This is done with: .via(JellyIo.toBytes)\n\n // Let's try encoding this as flat RDF streams (streams of triples or quads)\n // https://w3id.org/stax/ontology#flatQuadStream\n println(f\"Encoding ${quads.size} quads as a flat RDF quad stream\")\n val flatQuadsFuture = Source(quads)\n .via(EncoderFlow.flatQuadStream(\n // This encoder requires a size limiter \u2013 otherwise a stream frame could have infinite length!\n StreamRowCountLimiter(20),\n JellyOptions.smallStrict,\n ))\n .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n Await.ready(flatQuadsFuture, 10.seconds)\n\n // https://w3id.org/stax/ontology#flatTripleStream\n println(f\"\\n\\nEncoding ${triples.size} triples as a flat RDF triple stream\")\n val flatTriplesFuture = Source(triples)\n .via(EncoderFlow.flatTripleStream(\n // This encoder requires a size limiter \u2013 otherwise a stream frame could have infinite length!\n ByteSizeLimiter(500),\n JellyOptions.smallStrict,\n ))\n .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n Await.ready(flatTriplesFuture, 10.seconds)\n\n // We can also stream already grouped triples or quads \u2013 for example, if your system generates batches of\n // N triples, you can just send those batches straight to be encoded, with one batch = one stream frame.\n // https://w3id.org/stax/ontology#flatQuadStream\n println(f\"\\n\\nEncoding ${quads.size} quads as a flat RDF 
quad stream, grouped in batches of 10\")\n // First, group the quads into batches of 10\n val groupedQuadsFuture = Source.fromIterator(() => quads.grouped(10))\n .via(EncoderFlow.flatQuadStreamGrouped(\n // Do not use a size limiter here \u2013 we want exactly one batch in each frame\n None,\n JellyOptions.smallStrict,\n ))\n .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n Await.ready(groupedQuadsFuture, 10.seconds)\n\n // Now, let's try grouped streams. Let's say we want to stream all graphs in a dataset, but put exactly one\n // graph in each frame (message). This is very common in (for example) IoT systems.\n // https://w3id.org/stax/ontology#namedGraphStream\n println(f\"\\n\\nEncoding ${graphs.size} graphs as a named graph stream\")\n val namedGraphsFuture = Source(graphs)\n .via(EncoderFlow.namedGraphStream(\n // Do not use a size limiter here \u2013 we want exactly one graph in each frame\n None,\n JellyOptions.smallStrict,\n ))\n // Note that we will see exactly as many frames as there are graphs in the dataset\n .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n Await.ready(namedGraphsFuture, 10.seconds)\n\n // As a last example, we will stream a series of RDF graphs. In our case this will be just the default graph\n // repeated a few times. 
This type of stream is also pretty common in practical applications.\n // https://w3id.org/stax/ontology#graphStream\n println(f\"\\n\\nEncoding 5 RDF graphs as a graph stream\")\n val graphsFuture = Source.repeat(triples)\n .take(5)\n .via(EncoderFlow.graphStream(\n // Do not use a size limiter here \u2013 we want exactly one graph in each frame\n None,\n JellyOptions.smallStrict,\n ))\n // Note that we will see exactly 5 frames \u2013 the number of graphs we streamed\n .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n Await.ready(graphsFuture, 10.seconds)\n\n actorSystem.terminate()\n
"},{"location":"user/reactive/#decoding-rdf-streams-decoderflow","title":"Decoding RDF streams (DecoderFlow
)","text":"The eu.ostrzyciel.jelly.stream.DecoderFlow
provides methods for decoding flat and grouped streams. There is no opposite equivalent to EncoderSource
for decoding, though. This would require constructing an RDF graph or dataset from statements, which is a process that can vary a lot depending on your application. You will have to do this part yourself.
Source code on GitHub
PekkoStreamsDecoderFlow.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.query.Dataset\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.DecoderFlow]] utility to turn incoming Jelly streams\n * into usable RDF data.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsDecoderFlow extends shared.Example:\n def main(args: Array[String]): Unit =\n // We will need a Pekko actor system to run the streams\n given actorSystem: ActorSystem = ActorSystem()\n // And an execution context for the futures\n given ExecutionContext = actorSystem.getDispatcher\n\n // Load the example dataset\n val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n // To decode something, we first need to encode it...\n // See [[PekkoStreamsEncoderFlow]] and [[PekkoStreamsEncoderSource]] for an explanation of what is happening here.\n // We have three sequences of byte arrays, with each byte array corresponding to one encoded stream frame:\n // - encodedQuads: a flat RDF quad stream, physical type: QUADS\n // - encodedTriples: a flat RDF triple stream, physical type: TRIPLES\n // - encodedGraphs: a flat RDF quad stream, physical type: GRAPHS\n val (encodedQuads, encodedTriples, encodedGraphs) = getEncodedData(dataset)\n\n // Now we can decode the 
encoded data back into something useful.\n // Let's start by simply decoding the quads as a flat RDF quad stream:\n println(\"Decoding quads as a flat RDF quad stream...\")\n val decodedQuadsFuture = Source(encodedQuads)\n // We need to parse the bytes into a Jelly stream frame\n .via(JellyIo.fromBytes)\n // And then decode the frame into Jena quads.\n // We use \"decodeQuads\" because the physical stream type is QUADS.\n // And then we want to treat it as a flat RDF quad stream, so we call \"asFlatQuadStreamStrict\".\n // We use the \"Strict\" method to tell the decoder to check if the incoming logical stream type is the same\n // as we are expecting: flat RDF quad stream.\n .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n .runWith(Sink.seq)\n\n val decodedQuads: Seq[Quad] = Await.result(decodedQuadsFuture, 10.seconds)\n println(s\"Decoded ${decodedQuads.size} quads.\")\n\n // We can also treat each stream frame as a separate dataset. This way we would get an\n // RDF dataset stream.\n println(f\"\\n\\nDecoding quads as an RDF dataset stream from ${encodedQuads.size} frames...\")\n val decodedDatasetFuture = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n // Note that we cannot use the strict variant (asDatasetStreamOfQuadsStrict) here, because the stream says its\n // logical type is flat RDF quad stream.\n .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuads)\n .runWith(Sink.seq)\n\n val decodedDatasets: Seq[IterableOnce[Quad]] = Await.result(decodedDatasetFuture, 10.seconds)\n println(s\"Decoded ${decodedDatasets.size} datasets with\" +\n s\" ${decodedDatasets.map(_.iterator.size).sum} quads in total.\")\n\n // If we tried that with the strict variant, we would get an exception:\n println(f\"\\n\\nDecoding quads as an RDF dataset stream with strict logical type handling...\")\n val future = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuadsStrict)\n .runWith(Sink.seq)\n Await.result(future.recover {\n // 
eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n // Expected logical stream type LOGICAL_STREAM_TYPE_DATASETS, got LOGICAL_STREAM_TYPE_FLAT_QUADS.\n // LOGICAL_STREAM_TYPE_FLAT_QUADS is not a subtype of LOGICAL_STREAM_TYPE_DATASETS.\n case e: Exception => println(e.getCause)\n }, 10.seconds)\n\n // We can also pass entirely custom supported options to the decoder, instead of the defaults\n // (see [[JellyOptions.defaultSupportedOptions]]). This is useful if we want to decode a stream with\n // for example very large lookup tables or we want to put stricter limits on the streams that we accept.\n println(f\"\\n\\nDecoding quads as an RDF dataset stream with custom supported options...\")\n val customSupportedOptions = JellyOptions.defaultSupportedOptions\n .withMaxNameTableSize(50) // This is too small for the stream we are decoding\n val customSupportedOptionsFuture = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuads(customSupportedOptions))\n .runWith(Sink.seq)\n Await.result(customSupportedOptionsFuture.recover {\n // eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n // The stream uses a name table size of 128, which is larger than the maximum supported size of 50.\n // To read this stream, set maxNameTableSize to at least 128 in the supportedOptions for this decoder.\n case e: Exception => println(e.getCause)\n }, 10.seconds)\n\n // Flat RDF triple stream\n println(f\"\\n\\nDecoding triples as a flat RDF triple stream...\")\n val decodedTriplesFuture = Source(encodedTriples)\n .via(JellyIo.fromBytes)\n .via(DecoderFlow.decodeTriples.asFlatTripleStreamStrict)\n .runWith(Sink.seq)\n\n val decodedTriples: Seq[Triple] = Await.result(decodedTriplesFuture, 10.seconds)\n println(s\"Decoded ${decodedTriples.size} triples.\")\n\n // We can interpret the GRAPHS stream in a few ways, see\n // [[eu.ostrzyciel.jelly.stream.DecoderFlow.GraphsIngestFlowOps]] for more details.\n // 
Here we will treat it as an RDF named graph stream.\n println(f\"\\n\\nDecoding graphs as an RDF named graph stream...\")\n val decodedGraphsFuture = Source(encodedGraphs)\n .via(JellyIo.fromBytes)\n // Non-strict because the original logical stream type is flat RDF quad stream.\n .via(DecoderFlow.decodeGraphs.asNamedGraphStream)\n .runWith(Sink.seq)\n\n val decodedGraphs: Seq[(Node, Iterable[Triple])] = Await.result(decodedGraphsFuture, 10.seconds)\n println(s\"Decoded ${decodedGraphs.size} graphs.\")\n\n // If we tried using a decoder for a physical stream type that does not match the type of the stream,\n // we would get an exception. Here let's try to decode a QUADS stream with a TRIPLES decoder.\n println(f\"\\n\\nDecoding quads as a flat RDF triple stream...\")\n val future2 = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n // Note the \"decodeTriples\" here\n .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n .runWith(Sink.seq)\n Await.result(future2.recover {\n // eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n // Incoming stream type is not TRIPLES.\n case e: Exception => println(e.getCause)\n }, 10.seconds)\n\n // We can get around this by using the \"decodeAny\" method, which will pick the appropriate decoder\n // based on the stream options in the stream.\n // In this case we can only ask the decoder to output a flat or grouped RDF stream.\n println(f\"\\n\\nDecoding quads as a flat RDF stream using decodeAny...\")\n val decodedAnyFuture = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n // There is no strict variant at all for decodeAny, as we don't care about the stream type anyway.\n .via(DecoderFlow.decodeAny.asFlatStream)\n .runWith(Sink.seq)\n\n val decodedAny: Seq[Triple | Quad] = Await.result(decodedAnyFuture, 10.seconds)\n println(s\"Decoded ${decodedAny.size} statements.\")\n\n // One last trick up our sleeves is the snoopStreamOptions method, which allows us to inspect the stream options\n // and carry on with the 
decoding as normal.\n // In this case, we will reuse the first example (flat RDF quad stream) and snoop the stream options.\n println(f\"\\n\\nSnooping the stream options of the first frame while decoding a flat RDF quad stream...\")\n val snoopFuture = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n // We add a .viaMat here to capture the materialized value of this stage.\n .viaMat(DecoderFlow.snoopStreamOptions)(Keep.right)\n .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n .toMat(Sink.seq)(Keep.both)\n .run()\n\n val streamOptions = Await.result(snoopFuture._1, 10.seconds)\n val decodedQuads2 = Await.result(snoopFuture._2, 10.seconds)\n\n val streamOptionsIndented = (\"\\n\" + streamOptions.get.toProtoString.strip).replace(\"\\n\", \"\\n \")\n println(s\"Stream options: $streamOptionsIndented\")\n println(s\"Decoded ${decodedQuads2.size} quads.\")\n\n actorSystem.terminate()\n\n\n /**\n * Helper method to produce encoded data from a dataset.\n */\n private def getEncodedData(dataset: Dataset)(using ActorSystem, ExecutionContext):\n (Seq[Array[Byte]], Seq[Array[Byte]], Seq[Array[Byte]]) =\n val quadStream = EncoderSource.fromDatasetAsQuads(\n dataset,\n ByteSizeLimiter(500),\n JellyOptions.smallStrict\n )\n val tripleStream = EncoderSource.fromGraph(\n dataset.getDefaultModel,\n ByteSizeLimiter(250),\n JellyOptions.smallStrict\n )\n val graphStream = EncoderSource.fromDatasetAsGraphs(\n dataset,\n None,\n JellyOptions.smallStrict\n )\n val results = Seq(quadStream, tripleStream, graphStream).map { stream =>\n val streamFuture = stream\n .via(JellyIo.toBytes)\n .runWith(Sink.seq)\n Await.result(streamFuture, 10.seconds)\n }\n (results.head, results(1), results(2))\n
"},{"location":"user/reactive/#byte-streams-delimited-variant","title":"Byte streams (delimited variant)","text":"In all of the examples above, we used the non-delimited variant of Jelly, which is appropriate for, e.g., sending Jelly data over gRPC or Kafka. If you want to write Jelly data to a file or a socket, you will need to use the delimited variant. jelly-stream
provides a few methods for this in eu.ostrzyciel.jelly.stream.JellyIo
.
Source code on GitHub
PekkoStreamsWithIo.scala
package eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.query.Dataset\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\nimport org.apache.pekko.util.ByteString\n\nimport java.io.{File, FileInputStream, FileOutputStream}\nimport java.util.zip.GZIPInputStream\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\nimport scala.util.Using\n\n/**\n * Example of using Pekko Streams to read/write Jelly to a file or any other byte stream (e.g., socket).\n *\n * The examples here use the DELIMITED variant of Jelly, which is suitable only for situations where there is\n * no framing in the underlying stream. You should always use the delimited variant with raw files and sockets,\n * as otherwise it would be impossible to tell where one stream frame ends and another one begins.\n *\n * If you are working with something like MQTT, Kafka, JMS, AMQP... 
then check the examples in\n * [[eu.ostrzyciel.jelly.examples.PekkoStreamsEncoderFlow]].\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsWithIo extends shared.Example:\n def main(args: Array[String]): Unit =\n // We will need a Pekko actor system to run the streams\n given actorSystem: ActorSystem = ActorSystem()\n // And an execution context for the futures\n given ExecutionContext = actorSystem.getDispatcher\n\n // We will read a gzipped Jelly file from disk and decode it on the fly, as we are decompressing it.\n println(\"Decoding a gzipped Jelly file with Pekko Streams...\")\n // The input file is a GZipped Jelly file\n val inputFile = File(getClass.getResource(\"/jelly/weather.jelly.gz\").toURI)\n\n // Use Java's GZIPInputStream to decompress the input file on the fly\n val decodedTriples: Seq[Triple] = Using.resource(new GZIPInputStream(FileInputStream(inputFile))) { inputStream =>\n val decodedTriplesFuture = JellyIo.fromIoStream(inputStream)\n // Decode the Jelly frames to triples.\n // Under the hood it uses the RdfStreamFrame.parseDelimitedFrom method.\n .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n .runWith(Sink.seq)\n\n Await.result(decodedTriplesFuture, 10.seconds)\n }\n\n println(s\"Decoded ${decodedTriples.size} triples\")\n\n // -----------------------------------------------------------\n // Now we will write the decoded triples to a new Jelly file\n println(\"\\n\\nWriting the decoded triples to a new Jelly file with Pekko Streams...\")\n Using.resource(new FileOutputStream(\"weather.jelly\")) { outputStream =>\n val writeFuture = Source(decodedTriples)\n // Encode the triples to Jelly\n .via(EncoderFlow.flatTripleStream(\n ByteSizeLimiter(500),\n JellyOptions.smallStrict\n ))\n // Write the Jelly frames to a Java byte stream.\n // Under the hood it 
uses the RdfStreamFrame.writeDelimitedTo method.\n .runWith(JellyIo.toIoStream(outputStream))\n\n Await.ready(writeFuture, 10.seconds)\n println(\"Done writing the Jelly file.\")\n }\n\n // -----------------------------------------------------------\n // Pekko Streams offers its own utilities for reading and writing bytes that do not involve using Java's\n // blocking implementation of streams.\n // We will again write the decoded triples to a Jelly file, but this time use Pekko's facilities.\n println(\"\\n\\nWriting the decoded triples to a new Jelly file with Pekko Streams' utilities...\")\n val writeFuture = Source(decodedTriples)\n .via(EncoderFlow.flatTripleStream(\n ByteSizeLimiter(500),\n JellyOptions.smallStrict\n ))\n // Convert the frames into Pekko's byte strings.\n // Note: we are using the DELIMITED variant because we will write this to disk!\n .via(JellyIo.toBytesDelimited)\n .map(bytes => ByteString(bytes))\n .runWith(FileIO.toPath(File(\"weather2.jelly\").toPath))\n\n Await.ready(writeFuture, 10.seconds)\n println(\"Done writing the Jelly file.\")\n\n actorSystem.terminate()\n
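To make the difference between the two variants more tangible, here is a minimal, self-contained sketch of what "delimited" means on the wire. Protobuf's delimited methods prefix every message with its length encoded as a base-128 varint, which is what lets a reader find frame boundaries in a raw file or socket. The function names below (`writeDelimited`, `readAllDelimited`, `roundTrip`) are made up for this illustration and are not part of Jelly-JVM; in real code you should keep using `JellyIo` as shown above.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, EOFException, InputStream, OutputStream}

// Writes a non-negative Int as a base-128 varint: 7 bits per byte,
// with the high bit set on every byte except the last one.
def writeVarint(value: Int, out: OutputStream): Unit =
  require(value >= 0, "length must be non-negative")
  var v = value
  while v >= 0x80 do
    out.write((v & 0x7F) | 0x80)
    v = v >>> 7
  out.write(v)

// Reads one varint. Returns None if the stream ended cleanly before the first byte.
def readVarint(in: InputStream): Option[Int] =
  var b = in.read()
  if b == -1 then None
  else
    var result = b & 0x7F
    var shift = 7
    while b >= 0x80 do
      b = in.read()
      if b == -1 then throw new EOFException("Truncated length prefix")
      result = result | ((b & 0x7F) << shift)
      shift += 7
    Some(result)

// Delimited write: length prefix first, then the frame itself.
def writeDelimited(frame: Array[Byte], out: OutputStream): Unit =
  writeVarint(frame.length, out)
  out.write(frame)

// Reads frames until the stream ends, using the prefixes to find the boundaries.
def readAllDelimited(in: InputStream): Vector[Array[Byte]] =
  val frames = Vector.newBuilder[Array[Byte]]
  var length = readVarint(in)
  while length.isDefined do
    val frame = in.readNBytes(length.get)
    if frame.length != length.get then throw new EOFException("Truncated frame")
    frames += frame
    length = readVarint(in)
  frames.result()

// Round trip through an in-memory byte stream (stands in for a file or socket).
def roundTrip(frames: Seq[Array[Byte]]): Vector[Array[Byte]] =
  val buffer = new ByteArrayOutputStream()
  frames.foreach(frame => writeDelimited(frame, buffer))
  readAllDelimited(new ByteArrayInputStream(buffer.toByteArray))
```

Without the length prefixes, the reader in `readAllDelimited` would have no way of knowing where one frame ends, which is why the non-delimited variant only makes sense on top of a transport that already does its own framing (gRPC, Kafka, MQTT).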
"},{"location":"user/reactive/#see-also","title":"See also","text":"This guide presents some useful utilities in the jelly-core
and jelly-stream
modules.
Every Jelly stream begins with a header that specifies the serialization options used to encode the stream \u2013 see the details in the specification. So, whenever you serialize some RDF with Jelly (e.g., using Apache Jena RIOT, RDF4J Rio, or the jelly-stream
module), you need to specify these options.
The eu.ostrzyciel.jelly.core.JellyOptions
object provides a few common presets for Jelly serialization options. They return an instance of eu.ostrzyciel.jelly.core.proto.v1.RdfStreamOptions
that you can further customize. For example:
import eu.ostrzyciel.jelly.core.JellyOptions\n\nval options = JellyOptions.smallStrict\n\nval optionsWithRdfStarSupport = JellyOptions.smallRdfStar\n\nval bigWithCustomDictionarySize = JellyOptions.bigStrict\n .withMaxNameTableSize(2000) \n
Warning
These presets do not specify the physical or logical stream type. In most cases, the Jelly library will take care of this for you and set these types automatically later. However, if you use the low-level API, you need to set the stream types manually. For example:
import eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.*\n\nJellyOptions.smallStrict\n .withPhysicalType(PhysicalStreamType.QUADS)\n .withLogicalType(LogicalStreamType.DATASETS)\n
"},{"location":"user/utilities/#checking-supported-options","title":"Checking supported options","text":"There is also the eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions
method which specifies the maximum set of options supported by default in Jelly-JVM, when parsing a stream. By default, Jelly-JVM will refuse to parse any stream that uses options that are beyond what is specified in this method. This is important for security reasons, as it prevents the library from, for example, allocating a 10 GB dictionary (potential Denial of Service attack).
The supported options check is carried out automatically by the decoder when parsing a stream. You cannot disable the check, but you can customize the supported options by constructing a new RdfStreamOptions
object from eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions
, customizing it, and passing it to the decoder.
If you want to do this kind of check in some other context (e.g., in a gRPC service to check if you can support the options requested by the client), you can use the eu.ostrzyciel.jelly.core.JellyOptions.checkCompatibility
method. It will throw an exception if the options are not supported.
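As a rough illustration of what such a compatibility check boils down to, here is a small self-contained sketch. It does not use the Jelly-JVM API: `TableSizes` and `checkTableSizes` are hypothetical names invented for this example, and the real check covers more options than just the three lookup table sizes modelled here.

```scala
// Hypothetical, heavily simplified stand-in for RdfStreamOptions:
// only the three lookup table sizes are modelled here.
final case class TableSizes(name: Int, prefix: Int, datatype: Int)

// Throws if any requested size exceeds the supported maximum, otherwise returns normally.
def checkTableSizes(requested: TableSizes, supported: TableSizes): Unit =
  val checks = Seq(
    ("name", requested.name, supported.name),
    ("prefix", requested.prefix, supported.prefix),
    ("datatype", requested.datatype, supported.datatype),
  )
  checks.foreach { case (table, size, max) =>
    if size > max then
      throw new IllegalArgumentException(
        s"The stream uses a $table table size of $size, " +
          s"which is larger than the maximum supported size of $max."
      )
  }
```

The point of the design is that the limits are checked before any table is allocated, so a malicious stream header cannot make the receiver reserve an arbitrarily large amount of memory.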
The eu.ostrzyciel.jelly.core.Constants
object defines some useful constants, such as the file extension for Jelly, its content type, and the version of the Jelly protocol.
Jelly uses RDF-STaX to define the logical stream types (more details here). Jelly-JVM defines each of these types as a case object in eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType
.
These objects have a few useful methods for working with the RDF-STaX ontology:
import eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\n\n// Get the RDF-STaX IRI of a stream type\n// returns \"https://w3id.org/stax/ontology#flatTripleStream\"\nLogicalStreamType.TRIPLES.getRdfStaxType\n
You can also obtain a full RDF-STaX annotation for your stream if you also import an RDF library interop module (e.g., jelly-jena
or jelly-rdf4j
):
// Here we import `jena.given` to get the necessary implicit conversions.\n// You can do the same with `rdf4j.given` if you are using RDF4J.\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\nimport org.apache.jena.graph.{Node, NodeFactory, Triple}\n\n// The node that the annotation will be attached to\nval subjectNode: Node = NodeFactory.createURI(\"http://example.org/subject\")\nval triples: Seq[Triple] = LogicalStreamType.QUADS.getRdfStaxAnnotation(subjectNode)\n// Returns a Seq of three triples that would look like this in Turtle:\n// <http://example.org/subject> stax:hasStreamTypeUsage [\n// a stax:RdfStreamTypeUsage ;\n// stax:hasStreamType stax:flatQuadStream\n// ] .\n
You can then take this annotation and expose it as semantic metadata of your stream.
You can also do the opposite and construct an instance of LogicalStreamType
from an RDF-STaX IRI:
import eu.ostrzyciel.jelly.core.LogicalStreamTypeFactory\n\nval iri = \"https://w3id.org/stax/ontology#flatQuadStream\"\n// returns LogicalStreamType.QUADS\nval streamType = LogicalStreamTypeFactory.fromOntologyIri(iri)\n
Finally, there are also stream type checking and manipulation utilities:
import eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\n\n// Check if this type is equal or a subtype of another type.\n// This is useful for performing compatibility checks.\n// Returns false\nLogicalStreamType.TRIPLES.isEqualOrSubtypeOf(LogicalStreamType.DATASETS)\n// Returns true\nLogicalStreamType.NAMED_GRAPHS.isEqualOrSubtypeOf(LogicalStreamType.DATASETS)\n\n// Get the \"base\" type of a stream type. Base types are concrete stream types \n// that have no parent types. \n// There are only 4 base types: GRAPHS, DATASETS, TRIPLES, QUADS.\n// Returns LogicalStreamType.TRIPLES\nLogicalStreamType.TRIPLES.toBaseType\n// Returns LogicalStreamType.DATASETS\nLogicalStreamType.NAMED_GRAPHS.toBaseType\n// Returns LogicalStreamType.DATASETS\nLogicalStreamType.TIMESTAMPED_NAMED_GRAPHS.toBaseType\n
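If you are wondering how the subtype and base type checks fit together, the following self-contained sketch models them with a simple parent relation. This is only a toy model, not the Jelly-JVM implementation: `StreamKind`, `parentOf` and the standalone functions are hypothetical names, only a handful of stream types are included, and we assume that the timestamped named graph stream is a direct subtype of the named graph stream.

```scala
import scala.annotation.tailrec

// A few logical stream types, modelled as a plain enum for illustration.
enum StreamKind:
  case Triples, Quads, Graphs, Datasets, NamedGraphs, TimestampedNamedGraphs

import StreamKind.*

// Base types have no parent; every other type points at its direct supertype.
def parentOf(kind: StreamKind): Option[StreamKind] = kind match
  case NamedGraphs            => Some(Datasets)
  case TimestampedNamedGraphs => Some(NamedGraphs) // assumption, see the text above
  case _                      => None

// Walk up the hierarchy until we reach a type without a parent.
@tailrec
def toBaseType(kind: StreamKind): StreamKind = parentOf(kind) match
  case Some(parent) => toBaseType(parent)
  case None         => kind

// A type is compatible with `other` if it is `other` itself or any of its descendants.
def isEqualOrSubtypeOf(kind: StreamKind, other: StreamKind): Boolean =
  kind == other || parentOf(kind).exists(isEqualOrSubtypeOf(_, other))
```

With this model, a consumer that expects `Datasets` accepts `NamedGraphs` and `TimestampedNamedGraphs`, but rejects `Triples`, which mirrors the results shown in the snippet above.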
"},{"location":"user/utilities/#jelly-configuration-from-typesafe-config","title":"Jelly configuration from Typesafe config","text":"The jelly-stream
module also implements a utility for configuring Jelly serialization options using the Typesafe config library, which is commonly used in Apache Pekko applications.
The utility is provided by the eu.ostrzyciel.jelly.stream.JellyOptionsFromTypesafe
object. For example:
import com.typesafe.config.ConfigFactory\nimport eu.ostrzyciel.jelly.stream.JellyOptionsFromTypesafe\n\nval config = ConfigFactory.parseString(\"\"\"\n |jelly.physical-type = QUADS\n |jelly.name-table-size = 1024\n |jelly.prefix-table-size = 64\n |\"\"\".stripMargin)\n\nval options = JellyOptionsFromTypesafe.fromConfig(config.getConfig(\"jelly\"))\noptions.physicalType // returns PhysicalStreamType.QUADS\noptions.maxNameTableSize // returns 1024\noptions.maxPrefixTableSize // returns 64\noptions.maxDatatypeTableSize // returns 16 (the default)\n
See the source code of this class for more details.
"},{"location":"user/utilities/#see-also","title":"See also","text":"Jelly-JVM is an implementation of the Jelly serialization format and gRPC streaming protocol for the Java Virtual Machine (JVM), written in Scala 31. The supported RDF libraries are Apache Jena and Eclipse RDF4J.
Jelly-JVM provides a full stack of utilities for fast and scalable RDF streaming with the Jelly protocol. Oh, and it's blazing-fast, too!
Getting started with plugins \u2013 no code required
See the getting started guide with plugins for a quick way to use Jelly with your Apache Jena or RDF4J application without writing any code.
Getting started for application developers
If you want to use the full feature set of Jelly-JVM in your code, see the getting started guide for application developers.
This documentation is for the latest development version of Jelly-JVM \u2013 it is not considered stable. If you are looking for the documentation of a stable release, use the version selector on the left of the top navigation bar. See: latest stable version.
"},{"location":"#library-modules","title":"Library modules","text":"The implementation is split into a few modules that can be used separately:
jelly-core
\u2013 implementation of the Jelly serialization format (using the scalapb library), along with generic utilities for converting the deserialized RDF data to/from the representations of RDF libraries (like Apache Jena or RDF4J).
jelly-jena
\u2013 conversions and interop code for the Apache Jena library.
jelly-rdf4j
\u2013 conversions and interop code for the RDF4J library.
jelly-stream
\u2013 utilities for building Reactive Streams of RDF data (based on Pekko Streams). Useful for integrating with gRPC or other streaming protocols (e.g., Kafka, MQTT).
jelly-grpc
\u2013 implementation of a gRPC client and server for the Jelly gRPC streaming protocol.
We also publish plugin JARs which allow you to use Jelly-JVM with Apache Jena and RDF4J just by dropping the JARs into the classpath. Find out more about using the plugins.
"},{"location":"#compatibility","title":"Compatibility","text":"The Jelly-JVM implementation is compatible with Java 11 and newer. Java 11, 17, and 21 are tested in CI and are guaranteed to work. Jelly is built with Scala 3 LTS releases.
The following table shows the compatibility of the Jelly-JVM implementation with other libraries:
Jelly-JVM | Scala | Java | RDF4J | Apache Jena | Apache Pekko
2.0.x \u2013 2.2.x | 3.3.x (LTS) | 17+ | 5.x.x | 5.x.x | 1.1.x
1.0.x | 3.3.x (LTS), 2.13.x | 11+ | 4.x.x | 4.x.x | 1.0.x
See the compatibility policy for more details and the release notes on GitHub.
"},{"location":"#documentation","title":"Documentation","text":"Below is a list of all documentation pages about Jelly-JVM. You can also browse the Javadoc using the badges in the module list above. The documentation uses examples written in Scala, but the libraries can be used from Java as well.
Scala 2.13-compatible builds of Jelly-JVM are available for Jelly-JVM 1.0.x. Scala 2 support was removed in subsequent versions. See more details.
Jelly-JVM is an open project \u2013 you are welcome to submit issues, pull requests, or just ask questions!
"},{"location":"contributing/#submitting-issues","title":"Submitting issues","text":"If you have a question, found a bug, or have an idea for a new feature, please open an issue in the GitHub issue tracker.
"},{"location":"contributing/#security-issues","title":"Security issues","text":"If you find a security issue or vulnerability, please do not open a public issue. Instead, use the dedicated vulnerability reporting page.
"},{"location":"contributing/#pull-requests","title":"Pull requests","text":"Pull requests are welcome! Simply fork the GitHub repository and create a new branch for your changes. When you are ready, open a pull request to the main
branch.
If you are working on a larger feature or a significant change, it is recommended to open an issue first to discuss the idea.
"},{"location":"contributing/#documentation","title":"Documentation","text":"Jelly-JVM uses the exact same documentation system as the main Jelly documentation. Further information on editing the documentation can be found in the Contributing to the Jelly documentation guide.
"},{"location":"contributing/#releases","title":"Releases","text":"See the dedicated page on making releases.
"},{"location":"contributing/#see-also","title":"See also","text":"If you don't want to code anything and only use Jelly with your Apache Jena/RDF4J application, see the dedicated guide about using Jelly-JVM as a plugin.
This guide explains a few of the basic functionalities of Jelly-JVM and how to use them in your code. Jelly-JVM is written in Scala, but it can be used from Java as well. However, in this guide, we will focus on Scala 3.
"},{"location":"getting-started-devs/#quick-start-plain-old-files","title":"Quick start \u2013 plain old files","text":"Depending on your RDF library of choice (Apache Jena or RDF4J), you should import one of two dependencies: jelly-jena
or jelly-rdf4j
. In our examples we will use Jena, so let's add this to your build.sbt
file (this would be the same for other build tools like Maven or Gradle):
lazy val jellyVersion = \"2.1.0\"\n\nlibraryDependencies ++= Seq(\n \"eu.ostrzyciel.jelly\" %% \"jelly-jena\" % jellyVersion,\n)\n
Now you can serialize/deserialize Jelly data with Apache Jena. Jelly is fully integrated with Jena, so it should all just magically work. Here is a simple example of reading a .jelly
file (in this case, a metadata file from RiverBench) with RIOT:
import eu.ostrzyciel.jelly.convert.jena.riot.*\nimport org.apache.jena.riot.RDFDataMgr\n\n// Load an RDF graph from a Jelly file\nval model = RDFDataMgr.loadModel(\n \"https://w3id.org/riverbench/v/2.0.1.jelly\", \n JellyLanguage.JELLY\n)\n// Print the size of the model\nprintln(s\"Loaded an RDF graph with ${model.size} triples\")\n
Serialization is just as easy:
Serialization example (Scala 3)
import eu.ostrzyciel.jelly.convert.jena.riot.*\nimport org.apache.jena.riot.RDFDataMgr\n\nimport java.io.FileOutputStream\nimport scala.util.Using\n\n// Omitted here: creating an RDF model.\n// You can use the one from the previous example.\n\nUsing.resource(new FileOutputStream(\"metadata.jelly\")) { out =>\n // Write the model to a Jelly file\n RDFDataMgr.write(out, model, JellyLanguage.JELLY)\n println(\"Saved the model to metadata.jelly\")\n}\n
Read more about using Jelly-JVM with Apache Jena
Read more about using Jelly-JVM with RDF4J
"},{"location":"getting-started-devs/#rdf-streams","title":"RDF streams","text":"Now, the real power of Jelly lies in its streaming capabilities. Not only can it stream individual RDF triples/quads (this is called flat streaming), but it can also very effectively handle streams of RDF graphs or datasets. To work with streams, you need to use the jelly-stream
module, which is based on the Apache Pekko Streams library. So, let's update our dependencies:
lazy val jellyVersion = \"2.1.0\"\n\nlibraryDependencies ++= Seq(\n \"eu.ostrzyciel.jelly\" %% \"jelly-jena\" % jellyVersion,\n \"eu.ostrzyciel.jelly\" %% \"jelly-stream\" % jellyVersion,\n)\n
Now, let's say we have a stream of RDF graphs \u2013 for example each graph corresponds to one set of measurements from an IoT sensor. We want to have a stream that turns these graphs into their serialized representations (byte arrays), which we can then send over the network. Here is how to do it:
Reactive streaming example (Scala 3)
// We need to import \"jena.given\" for Jena-to-Jelly conversions\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport scala.concurrent.ExecutionContext\n\n// We will need a Pekko actor system to run the streams\ngiven actorSystem: ActorSystem = ActorSystem()\n// And an execution context for the futures\ngiven ExecutionContext = actorSystem.getDispatcher\n\n// Load an RDF graph for testing\nval model = RDFDataMgr.loadModel(\n \"https://w3id.org/riverbench/v/2.0.1.jelly\", \n JellyLanguage.JELLY\n)\n\nSource.repeat(model) // Create a stream of the same model over and over\n .take(10) // Take only the first 10 elements in the stream\n .map(_.asTriples) // Convert each model to an iterable of triples\n .via(EncoderFlow.graphStream( // Encode each iterable to a Jelly stream frame\n maybeLimiter = None, // 1 RDF graph = 1 message\n JellyOptions.smallStrict, // Jelly compression settings preset\n ))\n .via(JellyIo.toBytes) // Convert the stream frames to byte arrays\n .runForeach { bytes =>\n // Just print the length of each byte array in the stream.\n // You can also hook this up to MQTT, Kafka, etc.\n println(s\"Streamed ${bytes.length} bytes\")\n }\n .onComplete(_ => actorSystem.terminate())\n
Jelly will compress this stream on the fly, so if the data is repetitive, it will be very efficient. If you run this code, you will notice that the byte sizes for the later graphs are smaller, even though we are sending the same graph over and over again. But even if each graph is completely different, Jelly should still be much faster than other serialization formats.
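To build some intuition for why later graphs come out smaller, here is a deliberately simplified toy model of a lookup table. It is not how Jelly encodes data (the real format uses bounded tables for prefixes, names and datatypes, and a compact binary layout), and `ToyNameTable` with its 4-byte IDs is purely hypothetical. The idea it shows is the same, though: a string is sent in full only the first time, and later occurrences are replaced with a short ID.

```scala
import scala.collection.mutable

// Toy lookup table: remembers every IRI it has already "sent".
final class ToyNameTable:
  private val ids = mutable.HashMap.empty[String, Int]

  // Returns the number of bytes needed to send the given IRIs, updating the table.
  def encodedSize(iris: Seq[String]): Int =
    iris.map { iri =>
      if ids.contains(iri) then 4 // already known: send only a 4-byte ID
      else
        ids(iri) = ids.size + 1 // first occurrence: assign the next free ID
        4 + iri.getBytes("UTF-8").length // send the ID together with the full string
    }.sum
```

If you call `encodedSize` twice with the same sequence of IRIs on one `ToyNameTable` instance, the second call returns only 4 bytes per IRI, because everything is already in the table.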
These streams are very powerful, because they are reactive and asynchronous \u2013 in short, this means you can hook this up to any data source and any data sink \u2013 and you can scale it up as much as you want. If you are unfamiliar with the concept of reactive streams, we recommend you start with this Apache Pekko Streams guide.
Jelly-JVM supports streaming serialization and deserialization of all types of streams in the RDF Stream Taxonomy. You can read more about the theory of this and all available stream types in the Jelly protocol documentation.
Learn more about reactive streaming with Jelly-JVM
Learn more about the types of streams in Jelly
"},{"location":"getting-started-devs/#grpc-streaming","title":"gRPC streaming","text":"Jelly is a bit more than just a serialization format \u2013 it also defines a gRPC-based straming protocol. You can use it for streaming RDF data between microservices, to build a pub/sub system, or to publish RDF data to the web.
Learn more about using Jelly gRPC protocol servers and clients
"},{"location":"getting-started-devs/#further-reading","title":"Further reading","text":"jelly-stream
module and Apache Pekko Streams
examples
directory in the Jelly-JVM repo contains code snippets that demonstrate how to use the library in various scenarios.
If you have any questions about using Jelly-JVM, feel free to open an issue on GitHub.
There is nothing stopping you from using both at the same time. You can also pretty easily add support for any other Java-based RDF library by implementing a few interfaces. More details here.
This guide explains how to use Jelly-JVM with Apache Jena or RDF4J as a plugin, without writing a single line of code. Jelly-JVM provides plugin JARs that you can simply drop in the appropriate directory to get Jelly format support in your application.
"},{"location":"getting-started-plugins/#installation","title":"Installation","text":""},{"location":"getting-started-plugins/#apache-jena-apache-jena-fuseki","title":"Apache Jena, Apache Jena Fuseki","text":"You can simply add Jelly format support to Apache Jena or Apacha Jena Fuseki with Jelly's plugin JAR.
jelly-jena-plugin.jar
file.
$FUSEKI_BASE/extra/
directory. $FUSEKI_BASE
is the directory usually called run
where you have files such as config.ttl
and shiro.ini
. You will most likely need to create the extra
directory yourself.
lib/
directory of your Jena installation.
Content negotiation in Fuseki
Content negotiation using the application/x-jelly-rdf
media type in the Accept
header works in Fuseki since Apache Jena version 5.2.0. Previous versions of Fuseki did not support media type registration.
You can simply add Jelly format support to an application based on RDF4J with Jelly's plugin JAR.
jelly-rdf4j-plugin.jar
file.
The Jelly-JVM plugin JARs provide the following features:
.jelly
file extension.
application/x-jelly-rdf
media type.
The Jelly format is registered under the name jelly
in the RDF libraries, so you can use it in the same way as other formats like Turtle, RDF/XML, or JSON-LD.
Jelly-JVM is licensed under the Apache License 2.0.
"},{"location":"licensing/#attribution-citation","title":"Attribution / citation","text":"If you use Jelly-JVM in your research, please the most recent paper about Jelly:
Sowi\u0144ski, P., Wasielewska-Michniewska, K., Ganzha, M., & Paprzycki, M. (2022, October). Efficient RDF streaming for the edge-cloud continuum. In 2022 IEEE 8th World Forum on Internet of Things (WF-IoT) (pp. 1-8). IEEE.
Or use this BibTeX entry:
@inproceedings{sowinski2022efficient,\n title={Efficient RDF streaming for the edge-cloud continuum},\n author={Sowi{\\'n}ski, Piotr and Wasielewska-Michniewska, Katarzyna and Ganzha, Maria and Paprzycki, Marcin and others},\n booktitle={2022 IEEE 8th World Forum on Internet of Things (WF-IoT)},\n pages={1--8},\n year={2022},\n organization={IEEE},\n doi={10.1109/WF-IoT54382.2022.10152225}\n}\n
This paper describes an earlier version of Jelly from 2022. A new paper is in preparation.
"},{"location":"licensing/#jelly-maintainer","title":"Jelly maintainer","text":"Jelly-JVM was created and is maintained by Piotr Sowi\u0144ski (Ostrzyciel) \u2013 GitHub.
"},{"location":"licensing/#see-also","title":"See also","text":"Currently converters for the two most popular RDF JVM libraries are implemented \u2013 RDF4J and Jena. But it is possible to implement your own converters and adapt the Jelly serialization code to any RDF library with little effort.
To do this, you will need to implement three traits (interfaces in Java) from the jelly-core
module: ProtoEncoder
, ProtoDecoderConverter
, and ConverterFactory
.
ProtoEncoder (serialization)
get*
methods deconstruct triple statements, quad statements, and quoted triples (RDF-star). You can make them inline
.
nodeToProto
and graphToProto
should translate all possible variations of RDF terms in the SPO and G positions, respectively, into Jelly's representation.
ProtoDecoderConverter (deserialization)
make*
methods should construct new RDF terms and statements. You can make them inline
.
ConverterFactory \u2013 wrapper that allows other modules to use your converter.
ProtoEncoder
and ProtoDecoderConverter
implementations.
Full (versioned) releases are created manually and follow the Semantic Versioning scheme for binary compatibility.
To create a new tagged release (example for version 1.2.3):
$ git checkout main\n$ git pull\n$ git tag v1.2.3\n$ git push origin v1.2.3\n
The rest (packaging and release creation) will be handled automatically by the CI. The release will be pushed to Maven Central.
"},{"location":"dev/releases/#snapshot-releases","title":"Snapshot releases","text":"Snapshot releases are triggered automatically by commits in the main
branch. Snapshots are pushed to the Sonatype snapshot repository.
Jelly-JVM follows Semantic Versioning 2.0.0, with MAJOR.MINOR.PATCH releases. Please see the compatibility table on the main page for the current compatibility information. The documentation is versioned to match each Jelly-JVM MAJOR.MINOR version.
"},{"location":"user/compatibility/#jvm-and-scala","title":"JVM and Scala","text":"The current version of Jelly-JVM is compatible with Java 17 and newer. Java 17, 21, and 23 are tested in CI and are guaranteed to work. We recommend using a recent release of GraalVM to get the best performance. If you need Java 11 support, you should use Jelly-JVM 1.0.x.
Jelly is built with Scala 3 LTS releases and supports only Scala 3. If you need Scala 2 support, you should use Jelly-JVM 1.0.x.
"},{"location":"user/compatibility/#rdf-libraries","title":"RDF libraries","text":"Major-version upgrades of RDF4J and Apache Jena (e.g., updating from 4.0.x to 5.0.x) are done in Jelly-JVM MINOR releases. Jelly-JVM generally does not use any complex features of these libraries, so it should work with multiple versions without any problems.
If you do encounter any compatibility issues, please report them on the issue tracker.
"},{"location":"user/compatibility/#internal-vs-external-apis","title":"Internal vs external APIs","text":"Generally, all public classes and methods in Jelly-JVM are considered part of the public API. However, there are some exceptions.
Auto-generated classes in the jelly-core
module, eu.ostrzyciel.jelly.core.proto.v1
package are not considered part of the public API, although we will avoid any incompatibilities where possible. These classes may change between MINOR releases.
Jelly-JVM follows the Jelly protocol's backward compatibility policy. This means that Jelly-JVM can read data serialized with older versions of Jelly. Backward compatibility is tested in CI \u2013 the code is in BackCompatSpec.scala.
Forward compatibility is provided only in a very limited manner in Jelly-JVM. If the protocol version used by a stream is not supported, the parser is guaranteed to parse only the stream options header and reject the rest of the stream. You may choose to disable this check and try to parse the rest of the data anyway, but this is most certainly NOT recommended and may lead to unexpected results. In general, Jelly-JVM will ignore any unknown fields in the stream, but any other changes in the protocol may lead to really \"funny\" errors. Forward compatibility is tested in CI \u2013 the code is in ForwardCompatSpec.scala.
"},{"location":"user/compatibility/#see-also","title":"See also","text":"This guide explains the functionalities of the jelly-grpc
module, which implements a gRPC client and server for the Jelly gRPC streaming protocol.
Prerequisites
If you are unfamiliar with gRPC, we recommend you first read some introductory material on the gRPC website or in the Apache Pekko gRPC documentation.
The jelly-grpc
module builds on the functionalities of jelly-stream
, so we recommend you first read the reactive streaming guide.
You may also want to first skim the Jelly gRPC streaming protocol specification to understand the protocol's structure.
As with the jelly-stream
module, you can use jelly-grpc
with any RDF library that has a Jelly integration, such as Apache Jena (using jelly-jena
) or RDF4J (using jelly-rdf4j
). The gRPC API is generic and identical across all libraries.
jelly-grpc
builds on the Apache Pekko gRPC library. Jelly-JVM provides boilerplate code for setting up a gRPC server and client that can send and receive Jelly streams, as shown in the example below:
Source code on GitHub
PekkoGrpc.scalapackage eu.ostrzyciel.jelly.examples\n\nimport com.typesafe.config.ConfigFactory\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.*\nimport eu.ostrzyciel.jelly.grpc.RdfStreamServer\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.NotUsed\nimport org.apache.pekko.actor.typed.ActorSystem\nimport org.apache.pekko.actor.typed.javadsl.Behaviors\nimport org.apache.pekko.grpc.{GrpcClientSettings, GrpcServiceException}\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.concurrent.{Await, ExecutionContext, Future}\nimport scala.concurrent.duration.*\nimport scala.util.{Failure, Success}\n\n/**\n * Example of using Jelly's gRPC client and server to send Jelly streams over the network.\n * This uses the Apache Pekko gRPC library. Its documentation can be found at:\n * https://pekko.apache.org/docs/pekko-grpc/current/index.html\n * \n * See also examples named `PekkoStreams*` for instructions on encoding and decoding RDF streams with Jelly.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoGrpc extends shared.Example:\n // Create a config for Pekko gRPC.\n // We can use the same config for the client and the server, as we are communicating on localhost.\n // This would usually be loaded from a configuration file (e.g., application.conf).\n // More details: https://github.com/lightbend/config\n val config = ConfigFactory.parseString(\n \"\"\"\n |pekko.http.server.preview.enable-http2 = on\n |pekko.grpc.client.jelly.host = 127.0.0.1\n |pekko.grpc.client.jelly.port = 8088\n |pekko.grpc.client.jelly.enable-gzip = true\n |pekko.grpc.client.jelly.use-tls = false\n |pekko.grpc.client.jelly.backend = netty\n 
|\"\"\".stripMargin\n )\n .withFallback(ConfigFactory.defaultApplication())\n\n // We will need two Pekko actor systems to run the streams \u2013 one for the server and one for the client\n val serverActorSystem: ActorSystem[_] = ActorSystem(Behaviors.empty, \"ServerSystem\")\n val clientActorSystem: ActorSystem[_] = ActorSystem(Behaviors.empty, \"ClientSystem\", config)\n\n // Our mock dataset that we will send around in the streams\n val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n\n /**\n * Main method that starts the server and the client.\n */\n def main(args: Array[String]): Unit =\n given system: ActorSystem[_] = serverActorSystem\n given ExecutionContext = system.executionContext\n\n // Start the server\n val exampleService = ExampleJellyService()\n RdfStreamServer(\n RdfStreamServer.Options.fromConfig(config.getConfig(\"pekko.grpc.client.jelly\")),\n exampleService\n ).run() onComplete {\n case Success(binding) =>\n // If the server started successfully, start the client\n println(s\"[SERVER] Bound to ${binding.localAddress}\")\n runClient()\n case Failure(exception) =>\n // Otherwise, print the error and terminate the actor system\n println(s\"[SERVER] Failed to bind: $exception\")\n system.terminate()\n }\n\n\n /**\n * The client part of the example.\n */\n private def runClient(): Unit =\n given system: ActorSystem[_] = clientActorSystem\n given ExecutionContext = system.executionContext\n\n // Create a gRPC client\n val client = RdfStreamServiceClient(GrpcClientSettings.fromConfig(\"jelly\"))\n\n // First, let's try to publish some data to the server\n val frameSource = EncoderSource.fromDatasetAsQuads(\n dataset,\n ByteSizeLimiter(500),\n JellyOptions.smallStrict.withStreamName(\"weather\")\n )\n println(\"[CLIENT] Publishing data to the server...\")\n val publishFuture = client.publishRdf(frameSource) map { response =>\n println(s\"[CLIENT] Received acknowledgment: $response\")\n } 
recover {\n case e =>\n println(s\"[CLIENT] Failed to publish data: $e\")\n }\n // Wait for the publish to complete\n Await.ready(publishFuture, 10.seconds)\n\n // Now, let's try to subscribe to some data from the server in the QUADS format\n println(\"\\n\\n[CLIENT] Subscribing to QUADS data from the server...\")\n val quadsFuture = client\n .subscribeRdf(RdfStreamSubscribe(\n \"weather\",\n Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.QUADS))\n ))\n .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n .runFold(0L)((acc, _) => acc + 1)\n // Process the result of the stream (Future[Long])\n .map { counter =>\n println(s\"[CLIENT] Received $counter quads.\")\n } recover {\n case e =>\n println(s\"[CLIENT] Failed to receive quads: $e\")\n }\n Await.ready(quadsFuture, 10.seconds)\n\n // Let's try the same, with a GRAPHS stream\n println(\"\\n\\n[CLIENT] Subscribing to GRAPHS data from the server...\")\n val graphsFuture = client\n .subscribeRdf(RdfStreamSubscribe(\n \"weather\",\n Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.GRAPHS))\n ))\n // Decode the response and transform it into a stream of quads\n .via(DecoderFlow.decodeGraphs.asDatasetStreamOfQuads)\n .mapConcat(identity)\n .runFold(0L)((acc, _) => acc + 1)\n // Process the result of the stream (Future[Long])\n .map { counter =>\n println(s\"[CLIENT] Received $counter quads.\")\n } recover {\n case e =>\n println(s\"[CLIENT] Failed to receive data: $e\")\n }\n Await.ready(graphsFuture, 10.seconds)\n\n // Finally, let's try to subscribe to a stream that the server does not support\n // We will request TRIPLES, but the server only supports QUADS and GRAPHS.\n println(\"\\n\\n[CLIENT] Subscribing to TRIPLES data from the server...\")\n val triplesFuture = client\n .subscribeRdf(RdfStreamSubscribe(\n \"weather\",\n Some(JellyOptions.smallStrict.withPhysicalType(PhysicalStreamType.TRIPLES))\n ))\n .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n .runFold(0L)((acc, _) 
=> acc + 1)\n .map { counter =>\n println(s\"[CLIENT] Received $counter triples.\")\n } recover {\n case e =>\n println(s\"[CLIENT] Failed to receive triples: $e\")\n }\n Await.result(triplesFuture, 10.seconds)\n\n println(\"\\n\\n[CLIENT] Terminating...\")\n system.terminate()\n println(\"[SERVER] Terminating...\")\n serverActorSystem.terminate()\n\n\n /**\n * Example implementation of RdfStreamService to act as the server.\n * \n * You will also need to implement this trait in your own service. It defines the logic with which the server\n * will handle incoming streams and subscriptions.\n */\n class ExampleJellyService(using system: ActorSystem[_]) extends RdfStreamService:\n given ExecutionContext = system.executionContext\n\n /**\n * Handler for clients publishing RDF streams to the server.\n * \n * We receive a stream of RdfStreamFrames and must respond with an acknowledgment (or an error).\n */\n override def publishRdf(in: Source[RdfStreamFrame, NotUsed]): Future[RdfStreamReceived] =\n // Decode the incoming stream and count the number of RDF statements in it\n in.via(DecoderFlow.decodeAny.asFlatStream)\n .runFold(0L)((acc, _) => acc + 1)\n .map(counter => {\n println(s\"[SERVER] Received ${counter} RDF statements. 
Sending acknowledgment.\")\n // Send an acknowledgment back to the client\n RdfStreamReceived()\n })\n\n /**\n * Handler for clients subscribing to RDF streams from the server.\n * \n * We receive a subscription request and must respond with a stream of RdfStreamFrames or an error.\n */\n override def subscribeRdf(in: RdfStreamSubscribe): Source[RdfStreamFrame, NotUsed] =\n println(s\"[SERVER] Received subscription request for topic ${in.topic}.\")\n // First, check the requested physical stream type\n val streamType = in.requestedOptions match\n case Some(options) =>\n println(s\"[SERVER] Requested physical stream type: ${options.physicalType}.\")\n options.physicalType\n case None =>\n println(s\"[SERVER] No requested stream options.\")\n PhysicalStreamType.UNSPECIFIED\n\n // Get the stream options requested by the client or the default options if none were provided\n val options = in.requestedOptions.getOrElse(JellyOptions.smallStrict)\n .withStreamName(in.topic)\n // Check if the requested options are supported\n // !!! THIS IS IMPORTANT !!!\n // If you don't check if the requested options are supported, you may be vulnerable to\n // denial-of-service attacks. For example, a client could request a very large lookup table\n // that would consume a lot of memory on the server.\n try\n JellyOptions.checkCompatibility(options, JellyOptions.defaultSupportedOptions)\n catch\n case e: IllegalArgumentException =>\n // If the requested options are not supported, return an error\n return Source.failed(new GrpcServiceException(\n io.grpc.Status.INVALID_ARGUMENT.withDescription(e.getMessage)\n ))\n\n streamType match\n // This server implementation only supports QUADS and GRAPHS streams... 
and in both cases\n // it will always stream the same dataset.\n // You can of course implement more complex logic here, e.g., to stream different data based on the topic.\n case PhysicalStreamType.QUADS => EncoderSource.fromDatasetAsQuads(\n dataset,\n ByteSizeLimiter(16_000),\n options\n )\n case PhysicalStreamType.GRAPHS => EncoderSource.fromDatasetAsGraphs(\n dataset,\n Some(ByteSizeLimiter(16_000)),\n options\n )\n // PhysicalStreamType.TRIPLES is not supported here \u2013 the server will throw a gRPC error\n // if the client requests it.\n // This is an example of how to properly handle unsupported stream options requested by the client.\n // The library is able to automatically convert the error into a gRPC status and send it back to the client.\n case _ => Source.failed(new GrpcServiceException(\n io.grpc.Status.INVALID_ARGUMENT.withDescription(\"Unsupported physical stream type\")\n ))\n
The classes provided in jelly-grpc
should cover most cases, but they serve only as boilerplate. You must define the logic for handling the incoming and outgoing streams yourself, as shown in the example above.
Of course, you can also implement the server or the client from scratch, if you want to.
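As a minimal sketch of the options check from the example above, the validation can be factored into a small helper that returns an Either instead of using an early return. The helper name validateOptions is hypothetical and not part of Jelly-JVM. JellyOptions.checkCompatibility, JellyOptions.defaultSupportedOptions, and GrpcServiceException are used in the same way as in the example, and the sketch assumes (as the example does) that unsupported options are signaled with an IllegalArgumentException.

```scala
import eu.ostrzyciel.jelly.core.JellyOptions
import eu.ostrzyciel.jelly.core.proto.v1.RdfStreamOptions
import org.apache.pekko.grpc.GrpcServiceException

// Hypothetical helper (not part of Jelly-JVM): validates the options requested by a client.
def validateOptions(
  requested: Option[RdfStreamOptions]
): Either[GrpcServiceException, RdfStreamOptions] =
  // Fall back to a preset if the client did not request any options
  val options = requested.getOrElse(JellyOptions.smallStrict)
  try
    // Assumption: throws IllegalArgumentException on unsupported options, as in the example above
    JellyOptions.checkCompatibility(options, JellyOptions.defaultSupportedOptions)
    Right(options)
  catch
    case e: IllegalArgumentException =>
      // Map the failure to a gRPC status that can be sent back to the client
      Left(new GrpcServiceException(
        io.grpc.Status.INVALID_ARGUMENT.withDescription(e.getMessage)
      ))
```

In a subscribeRdf handler, a Left could then be passed to Source.failed, and a Right handed on to EncoderSource.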
"},{"location":"user/grpc/#see-also","title":"See also","text":"This guide explains the functionalities of the jelly-jena
module, which provides Jelly support for Apache Jena.
If you just want to add Jelly format support to Apache Jena / Apache Jena Fuseki, you can use the Jelly-JVM plugin JAR. See the dedicated guide for more information.
"},{"location":"user/jena/#base-facilities","title":"Base facilities","text":"jelly-jena
implements the eu.ostrzyciel.jelly.core.ConverterFactory
trait in eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory
. This factory allows you to build encoders and decoders that convert between Jelly's RdfStreamFrame
s and Apache Jena's Triple
and Quad
objects. The eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame
class is an object representation of Jelly's binary format.
The module also implements the eu.ostrzyciel.jelly.core.IterableAdapter
trait in eu.ostrzyciel.jelly.convert.jena.JenaIterableAdapter
. This adapter provides extension methods for Apache Jena's Model
, Dataset
, Graph
, and DatasetGraph
classes to convert them into an iterable of triples (.asTriples
), quads (.asQuads
), or named graphs (.asGraphs
). This is useful when working with Jelly on a lower level or when using the jelly-stream
module.
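A minimal sketch of these extension methods follows. It assumes that importing eu.ostrzyciel.jelly.convert.jena.given brings them into scope (this is the import used by the Jena-based streaming examples in this documentation), and that asTriples is available on models while asQuads is available on datasets. The helper name countStatements is hypothetical.

```scala
import eu.ostrzyciel.jelly.convert.jena.given
import org.apache.jena.query.Dataset
import org.apache.jena.rdf.model.Model

// Hypothetical helper: counts the statements exposed by the adapter's extension methods.
def countStatements(model: Model, dataset: Dataset): (Int, Int) =
  // Assumption: asTriples is defined for graphs/models
  val tripleCount = model.asTriples.size
  // Assumption: asQuads is defined for datasets
  val quadCount = dataset.asQuads.size
  (tripleCount, quadCount)
```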
jelly-jena
implements an RDF writer and reader for Apache Jena's RIOT library. This means you can use Jelly just like, for example, Turtle or RDF/XML. See the example below:
Source code on GitHub
JenaRiot.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.*\nimport org.apache.jena.rdf.model.ModelFactory\nimport org.apache.jena.riot.{RDFDataMgr, RDFFormat, RDFParser, RDFWriterRegistry, RIOT}\n\nimport java.io.{File, FileOutputStream}\nimport scala.util.Using\n\n/**\n * Example of using Jelly's integration with Apache Jena's RIOT library for\n * writing and reading RDF graphs and datasets to/from disk.\n *\n * See also: https://jena.apache.org/documentation/io/\n */\nobject JenaRiot extends shared.Example:\n def main(args: Array[String]): Unit =\n // Load the RDF graph from an N-Triples file\n val model = RDFDataMgr.loadModel(File(getClass.getResource(\"/weather.nt\").toURI).toURI.toString)\n\n // Print the size of the model\n println(s\"Loaded an RDF graph from N-Triples with size: ${model.size}\")\n\n Using.resource(new FileOutputStream(\"weather.jelly\")) { out =>\n // Write the model to a Jelly file\n // Note: by default this will use the [[JellyFormat.JELLY_SMALL_STRICT]] format variant\n RDFDataMgr.write(out, model, JellyLanguage.JELLY)\n println(\"Saved the model to a Jelly file\")\n }\n\n // Load the RDF graph from a Jelly file\n val model2 = RDFDataMgr.loadModel(\"weather.jelly\", JellyLanguage.JELLY)\n\n // Print the size of the model\n println(s\"Loaded an RDF graph from Jelly with size: ${model2.size}\")\n\n\n\n // ---------------------------------\n println(\"\\n\")\n\n // Try the same with an RDF dataset and some different settings\n val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n println(s\"Loaded an RDF dataset from a Trig file with ${dataset.asDatasetGraph.size} named graphs and \" +\n s\"${dataset.asDatasetGraph.stream.count} quads\")\n\n Using.resource(new FileOutputStream(\"weather-quads.jelly\")) { out =>\n // Write the dataset to a Jelly file, using the \"BIG\" settings\n // (better compression for 
big files, more memory usage)\n RDFDataMgr.write(out, dataset, JellyFormat.JELLY_BIG_STRICT)\n println(\"Saved the dataset to a Jelly file\")\n }\n\n // Load the RDF dataset from a Jelly file\n val dataset2 = RDFDataMgr.loadDataset(\"weather-quads.jelly\", JellyLanguage.JELLY)\n println(s\"Loaded an RDF dataset from Jelly with ${dataset2.asDatasetGraph.size} named graphs and \" +\n s\"${dataset2.asDatasetGraph.stream.count} quads\")\n\n // ---------------------------------\n println(\"\\n\")\n\n // Custom Jelly format \u2013 change any settings you like\n val customFormat = new RDFFormat(\n JellyLanguage.JELLY,\n JellyFormatVariant(\n opt = JellyOptions.smallStrict\n .withMaxPrefixTableSize(0) // disable the prefix table\n .withStreamName(\"My weather stream\"), // add metadata to the stream\n frameSize = 16 // make RdfStreamFrames with 16 rows each\n )\n )\n\n // Jena requires us to register the custom format \u2013 once for graphs and once for datasets,\n // as Jelly supports both.\n RDFWriterRegistry.register(customFormat, JellyGraphWriterFactory)\n RDFWriterRegistry.register(customFormat, JellyDatasetWriterFactory)\n\n Using.resource(new FileOutputStream(\"weather-quads-custom.jelly\")) { out =>\n // Write the dataset to a Jelly file using the custom format\n RDFDataMgr.write(out, dataset, customFormat)\n println(\"Saved the dataset to a Jelly file with custom settings\")\n }\n\n // Load the RDF dataset from a Jelly file with the custom format\n val dataset3 = RDFDataMgr.loadDataset(\"weather-quads-custom.jelly\", JellyLanguage.JELLY)\n println(s\"Loaded an RDF dataset from Jelly with custom settings with ${dataset3.asDatasetGraph.size} named graphs\" +\n s\" and ${dataset3.asDatasetGraph.stream.count} quads\")\n\n // ---------------------------------\n println(\"\\n\")\n\n // By default, the parser has limits on for example the maximum size of the lookup tables.\n // The default supported options are [[JellyOptions.defaultSupportedOptions]].\n // You can 
change these limits by creating your own options object.\n val customOptions = JellyOptions.defaultSupportedOptions\n .withMaxNameTableSize(50) // set the maximum size of the name table to 50\n // Create a Context object with the custom options\n val parserContext = RIOT.getContext.copy()\n .set(JellyLanguage.SYMBOL_SUPPORTED_OPTIONS, customOptions)\n\n println(\"Trying to load the model with custom supported options...\")\n val model3 = ModelFactory.createDefaultModel()\n try\n // The loading operation should fail because our allowed max name table size is too low\n RDFParser.create()\n .source(\"weather.jelly\")\n .lang(JellyLanguage.JELLY)\n // Set the context object with the custom options\n .context(parserContext)\n .parse(model3)\n catch\n case e: RdfProtoDeserializationError =>\n // The stream uses a name table size of 128, which is larger than the maximum supported size of 50.\n // To read this stream, set maxNameTableSize to at least 128 in the supportedOptions for this decoder.\n println(s\"Failed to load the model with custom options: ${e.getMessage}\")\n
Usage notes:
eu.ostrzyciel.jelly.core.JellyOptions
provides a few common presets for Jelly serialization options, which can be used to construct a JellyFormatVariant
, as shown in the example above. You can also further customize the serialization options (e.g., dictionary size).jelly-stream
module or the more low-level API: Low-level usage.RdfStreamFrame
in the file.eu.ostrzyciel.jelly.convert.jena.riot.JellyLanguage
object (source code). This registration should happen automatically when you include the jelly-jena
module in your project, using Jena's component initialization mechanism.jelly-jena
also implements a streaming writer (StreamRDF
API in Jena). Using it is similar to the regular RIOT writer, with a slightly different setup:
Source code on GitHub
JenaRiotStreaming.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.riot.*\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.PhysicalStreamType\nimport org.apache.jena.graph.{NodeFactory, Triple}\nimport org.apache.jena.riot.system.{StreamRDFLib, StreamRDFWriter}\nimport org.apache.jena.riot.{RDFDataMgr, RDFParser, RIOT}\n\nimport java.io.{File, FileOutputStream}\nimport scala.util.Using\n\n/**\n * Example of using Apache Jena's streaming IO API with Jelly.\n *\n * See also: https://jena.apache.org/documentation/io/streaming-io.html\n */\nobject JenaRiotStreaming extends shared.Example:\n def main(args: Array[String]): Unit =\n // Initialize a Jena StreamRDF to consume the statements\n val readerStream = StreamRDFLib.count()\n\n println(\"Reading a stream of triples from a Jelly file...\")\n\n // Parse a Jelly file as a stream of triples\n val inputFileTriples = new File(getClass.getResource(\"/jelly/weather.jelly\").toURI)\n RDFParser\n .source(inputFileTriples.toURI.toString)\n .lang(JellyLanguage.JELLY)\n .parse(readerStream)\n\n println(f\"Read ${readerStream.countTriples()} triples\")\n println()\n println(\"Reading a stream of quads from a Jelly file...\")\n\n // Parse a different Jelly file as a stream of quads and send it to the same sink\n val inputFileQuads = new File(getClass.getResource(\"/jelly/weather-quads.jelly\").toURI)\n RDFParser\n .source(inputFileQuads.toURI.toString)\n .lang(JellyLanguage.JELLY)\n .parse(readerStream)\n\n // Print the number of triples and quads\n //\n // The number of triples here is the sum of the triples from the first file and the triples\n // in the default graph of the second file. 
This is just how Jena handles it.\n println(f\"Read ${readerStream.countTriples()} triples (in total)\" +\n f\" and ${readerStream.countQuads()} quads\")\n\n // -------------------------------------\n println(\"\\n\")\n\n println(\"Writing a stream of 10 triples to a file...\")\n\n // Try writing some triples to a file\n // We need to create an instance of RdfStreamOptions to pass to the writer:\n val options = JellyOptions.smallStrict\n // The stream writer does not know if we will be writing triples or quads \u2013 we\n // have to specify the physical stream type explicitly.\n .withPhysicalType(PhysicalStreamType.TRIPLES)\n .withStreamName(\"A stream of 10 triples\")\n\n // To pass the options, we use Jena's Context mechanism\n val context = RIOT.getContext.copy()\n .set(JellyLanguage.SYMBOL_STREAM_OPTIONS, options)\n .set(JellyLanguage.SYMBOL_FRAME_SIZE, 128) // optional, default is 256\n\n Using.resource(new FileOutputStream(\"stream-riot.jelly\")) { out =>\n // Create the writer \u2013 remember to pass the context!\n val writerStream = StreamRDFWriter.getWriterStream(out, JellyLanguage.JELLY, context)\n writerStream.start()\n\n for i <- 1 to 10 do\n writerStream.triple(Triple.create(\n NodeFactory.createBlankNode(),\n NodeFactory.createURI(\"https://example.org/p\"),\n NodeFactory.createLiteralString(s\"object $i\")\n ))\n\n writerStream.finish()\n }\n\n println(\"Done writing triples\")\n\n // Load the RDF graph that we just saved using normal RIOT API\n val model = RDFDataMgr.loadModel(\"stream-riot.jelly\", JellyLanguage.JELLY)\n\n println(\"Loaded the stream from disk, contents:\\n\")\n model.write(System.out, \"NT\")\n
"},{"location":"user/jena/#see-also","title":"See also","text":"Warning
This page describes a low-level API that is a bit of a hassle to use directly. It's recommended to use the higher-level abstractions provided by the jelly-stream
module, or the integrations with Apache Jena's RIOT or RDF4J's Rio libraries. If you really want to use this, it is highly recommended that you first get a basic understanding of how Jelly works under the hood and take a look at the code in the jelly-stream
module to see how it's done there.
Note
The following guide uses the Apache Jena library as an example. The exact same thing can be done with RDF4J or any other RDF library that has a Jelly integration.
"},{"location":"user/low-level/#deserialization","title":"Deserialization","text":"To parse a serialized stream frame into triples/quads:
eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame.parseFrom
if it's a non-delimited frame (like you would see, e.g., in a Kafka or gRPC stream), or parseDelimitedFrom
if it's a delimited stream (like you would see in a file or a socket).eu.ostrzyciel.jelly.core.IoUtils.autodetectDelimiting
. In most cases you will not need to use it. It is used internally by the Jena and RDF4J integrations for user convenience.RdfStreamFrame
s into triples/quads: eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory
has different methods for different physical stream types:anyStatementDecoder
for any physical stream type, outputs Triple
or Quad
triplesDecoder
for TRIPLES streams, outputs Triple
quadsDecoder
for QUADS streams, outputs Quad
graphsDecoder
for GRAPHS streams, outputs (Node, Iterable[Triple])
graphsAsQuadsDecoder
for GRAPHS streams, outputs Quad
ingestRow
method to get the output iteratively.To serialize triples/quads into a stream frame:
asTriples
/asQuads
/asGraphs
extension methods provided by the eu.ostrzyciel.jelly.convert.jena.JenaIterableAdapter
object.RdfStreamRow
s (the rows of a stream frame): use the eu.ostrzyciel.jelly.convert.jena.JenaConverterFactory.encoder
method to get an instance of eu.ostrzyciel.jelly.convert.jena.JenaProtoEncoder
.RdfStreamFrame
s. What you do here depends highly on the logical stream type you are working with.This guide explains the functionalities of the jelly-rdf4j
module, which provides Jelly support for Eclipse RDF4J.
If you just want to add Jelly format support to your RDF4J application, you can use the Jelly-JVM plugin JAR. See the dedicated guide for more information.
"},{"location":"user/rdf4j/#base-facilities","title":"Base facilities","text":"jelly-rdf4j
implements the eu.ostrzyciel.jelly.core.ConverterFactory
trait in eu.ostrzyciel.jelly.convert.rdf4j.Rdf4jConverterFactory
. This factory allows you to build encoders and decoders that convert between Jelly's RdfStreamFrame
s and RDF4J's Statement
objects. The eu.ostrzyciel.jelly.core.proto.v1.RdfStreamFrame
class is an object representation of Jelly's binary format.
The module also implements the eu.ostrzyciel.jelly.core.IterableAdapter
trait in eu.ostrzyciel.jelly.convert.rdf4j.Rdf4jIterableAdapter
. This adapter provides extension methods for RDF4J's Model
class to convert it into an iterable of triples (.asTriples
), quads (.asQuads
), or named graphs (.asGraphs
). This is useful when working with Jelly on a lower level or when using the jelly-stream
module.
jelly-rdf4j
implements an RDF writer and parser for Eclipse RDF4J's Rio library. This means you can use Jelly just like any other RDF serialization format (e.g., RDF/XML, Turtle). See the example below:
Source code on GitHub
Rdf4jRio.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.rdf4j.rio.*\nimport eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.{PhysicalStreamType, RdfStreamOptions}\nimport org.eclipse.rdf4j.model.Statement\nimport org.eclipse.rdf4j.rio.helpers.StatementCollector\nimport org.eclipse.rdf4j.rio.{RDFFormat, Rio}\n\nimport java.io.{File, FileOutputStream}\nimport scala.jdk.CollectionConverters.*\nimport scala.util.Using\n\n/**\n * Example of using RDF4J's Rio library to read and write RDF data.\n *\n * See also: https://rdf4j.org/documentation/programming/rio/\n */\nobject Rdf4jRio extends shared.Example:\n def main(args: Array[String]): Unit =\n // Load the RDF graph from an N-Triples file\n val inputFile = File(getClass.getResource(\"/weather.nt\").toURI)\n val triples = readRdf4j(inputFile, RDFFormat.TURTLE, None)\n\n // Print the size of the graph\n println(s\"Loaded ${triples.size} triples from an N-Triples file\")\n\n // Write the RDF graph to a Jelly file\n // First, create the stream's options:\n val options = JellyOptions.smallStrict\n // Setting the physical stream type is mandatory! 
It will always be either TRIPLES or QUADS.\n .withPhysicalType(PhysicalStreamType.TRIPLES)\n // Set other optional options\n .withStreamName(\"My weather data\")\n // Create the config object to pass to the writer\n val config = JellyWriterSettings.configFromOptions(options, frameSize = 128)\n\n // Do the actual writing\n Using.resource(new FileOutputStream(\"weather.jelly\")) { out =>\n val writer = Rio.createWriter(JELLY, out)\n writer.setWriterConfig(config)\n writer.startRDF()\n triples.foreach(writer.handleStatement)\n writer.endRDF()\n }\n\n println(\"Saved the model to a Jelly file\")\n\n // Load the RDF graph from the Jelly file\n val jellyFile = File(\"weather.jelly\")\n val jellyTriples = readRdf4j(jellyFile, JELLY, None)\n\n // Print the size of the graph\n println(s\"Loaded ${jellyTriples.size} triples from a Jelly file\")\n\n // ---------------------------------\n println(\"\\n\")\n // By default, the parser has limits on for example the maximum size of the lookup tables.\n // The default supported options are [[JellyOptions.defaultSupportedOptions]].\n // You can change these limits by creating your own options object.\n val customOptions = JellyOptions.defaultSupportedOptions\n .withMaxPrefixTableSize(10) // set the maximum size of the prefix table to 10\n println(\"Trying to read the Jelly file with custom options...\")\n try\n // This operation should fail because the Jelly file uses a prefix table larger than 10\n val customTriples = readRdf4j(jellyFile, JELLY, Some(customOptions))\n catch\n case e: RdfProtoDeserializationError =>\n // The stream uses a prefix table size of 16, which is larger than the maximum supported size of 10.\n // To read this stream, set maxPrefixTableSize to at least 16 in the supportedOptions for this decoder.\n println(s\"Failed to read the Jelly file with custom options: ${e.getMessage}\")\n\n\n /**\n * Helper function to read RDF data using RDF4J's Rio library.\n * @param file file to read from\n * @param format RDF 
format\n * @param supportedOptions supported options for reading Jelly streams (optional)\n * @return sequence of RDF statements\n */\n private def readRdf4j(file: File, format: RDFFormat, supportedOptions: Option[RdfStreamOptions]): Seq[Statement] =\n val parser = Rio.createParser(format)\n val collector = new StatementCollector()\n parser.setRDFHandler(collector)\n supportedOptions.foreach(opt =>\n // If the user provided supported options, set them on the parser\n parser.setParserConfig(JellyParserSettings.configFromOptions(opt))\n )\n Using.resource(file.toURI.toURL.openStream()) { is =>\n parser.parse(is)\n }\n collector.getStatements.asScala.toSeq\n
Usage notes:
eu.ostrzyciel.jelly.core.JellyOptions
provides a few common presets for Jelly serialization options. These options are passed through eu.ostrzyciel.jelly.convert.rdf4j.rio.JellyWriterSettings.configFromOptions
and used to configure the writer, as shown in the example above. You can also further customize the serialization options (e.g., dictionary size).jelly-stream
module or the more low-level API: Low-level usage.RdfStreamFrame
in the file.eu.ostrzyciel.jelly.convert.rdf4j.rio
package (source code). They are automatically registered on startup using the RDFParserFactory
and RDFWriterFactory
SPIs provided by RDF4J.This guide explains the reactive streaming functionalities of the jelly-stream
module.
Prerequisites
If you are unfamiliar with the concept of reactive streams or Apache Pekko Streams, we highly recommend you start by reading about the basic concepts of Pekko Streams.
We also recommend you first read about the RDF stream types in Jelly. Otherwise, this guide may not make much sense.
You can use jelly-stream
with any RDF library that has a Jelly integration, such as Apache Jena (using jelly-jena
) or RDF4J (using jelly-rdf4j
). The streaming API is generic and identical across all libraries.
The key notions of this API are encoders and decoders.
Triple
in Apache Jena) into an object representation of Jelly's binary format (RdfStreamFrame
).RdfStreamFrame
s into objects from your RDF library of choice.So, for example, an encoder flow for flat triple streams would have a type of Flow[Triple, RdfStreamFrame, NotUsed]
in Apache Jena. The opposite (a flat triple stream decoder) would have a type of Flow[RdfStreamFrame, Triple, NotUsed]
.
RdfStreamFrame
s can be converted to and from raw bytes using a range of methods, depending on your use case. See the sections below for examples.
EncoderSource
)","text":"The easiest way to start is with flat RDF streams (i.e., flat streams of triples or quads). You can convert an RDF dataset or graph into such using the methods in eu.ostrzyciel.jelly.stream.EncoderSource
.
Source code on GitHub
PekkoStreamsEncoderSource.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.EncoderSource]] utility to convert RDF graphs and datasets\n * into Jelly streams with a single method call.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsEncoderSource extends shared.Example:\n def main(args: Array[String]): Unit =\n // We will need a Pekko actor system to run the streams\n given actorSystem: ActorSystem = ActorSystem()\n // And an execution context for the futures\n given ExecutionContext = actorSystem.getDispatcher\n\n // Load an example RDF graph from an N-Triples file\n val model = RDFDataMgr.loadModel(File(getClass.getResource(\"/weather.nt\").toURI).toURI.toString)\n\n println(s\"Loaded model with ${model.size()} triples\")\n println(s\"Streaming the model to memory...\")\n\n // Create a Pekko Streams Source from the Jena model\n // This automatically sets the physical and logical stream types.\n val encodedModelFuture = EncoderSource\n .fromGraph(\n model,\n // Aim for frames with ~2000 bytes \u2013 may be more!\n ByteSizeLimiter(2000),\n JellyOptions.smallStrict,\n )\n // wireTap: print the size of the frames\n // Notice in the output that the frames are slightly bigger than 2000 bytes.\n .wireTap(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes on wire\"))\n // Convert each stream frame to bytes\n .via(JellyIo.toBytes)\n 
// Collect the stream into a sequence\n .runWith(Sink.seq)\n\n // Wait for the stream to complete and collect the result\n val encodedModel = Await.result(encodedModelFuture, 10.seconds)\n\n println(s\"Streamed model to memory with ${encodedModel.size} frames and\" +\n s\" ${encodedModel.map(_.length).sum} bytes on wire\")\n\n println(\"\\n\")\n\n // -------------------------------------------------------------------\n // Second example: try encoding an RDF dataset as a GRAPHS stream\n val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n println(s\"Loaded dataset with ${dataset.asDatasetGraph.size} named graphs\")\n println(s\"Streaming the dataset to memory...\")\n\n val encodedDatasetFuture = EncoderSource\n // Here we stream this as a GRAPHS stream (physical type)\n // You can also use .fromDatasetAsQuads to stream as QUADS\n .fromDatasetAsGraphs(\n dataset,\n // This time we limit the number of rows in each frame to 30\n // Note that for this particular encoder, we can skip the limiter entirely \u2013 but this can lead to huge frames!\n // So, be careful with that, or you may get an out-of-memory error.\n Some(StreamRowCountLimiter(30)),\n JellyOptions.smallStrict,\n )\n // wireTap: print the size of the frames\n // Note that some frames are smaller than the limit \u2013 this is because this encoder will always split frames\n // on graph boundaries.\n .wireTap(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes on wire\"))\n // Convert each stream frame to bytes\n .via(JellyIo.toBytes)\n // Collect the stream into a sequence\n .runWith(Sink.seq)\n\n // Wait for the stream to complete and collect the result\n val encodedDataset = Await.result(encodedDatasetFuture, 10.seconds)\n\n println(s\"Streamed dataset to memory with ${encodedDataset.size} frames and\" +\n s\" ${encodedDataset.map(_.length).sum} bytes on wire\")\n\n actorSystem.terminate()\n
"},{"location":"user/reactive/#encoding-any-rdf-data-as-a-flat-or-grouped-stream-encoderflow","title":"Encoding any RDF data as a flat or grouped stream (EncoderFlow
)","text":"The eu.ostrzyciel.jelly.stream.EncoderFlow
provides even more options for turning RDF data into Jelly streams, including both grouped and flat streams. Every type of RDF stream in Jelly can be created using this API.
Source code on GitHub
PekkoStreamsEncoderFlow.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.EncoderFlow]] utility to encode RDF data as Jelly streams.\n * \n * Here, the RDF data is turned into a series of byte buffers, with each buffer corresponding to exactly one frame.\n * This is suitable if your streaming protocol (e.g., Kafka, MQTT, AMQP) already frames the messages.\n * If you are writing to a raw socket or file, then you must use the DELIMITED variant of Jelly instead.\n * See [[eu.ostrzyciel.jelly.examples.PekkoStreamsWithIo]] for examples of that.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsEncoderFlow extends shared.Example:\n def main(args: Array[String]): Unit =\n // We will need a Pekko actor system to run the streams\n given actorSystem: ActorSystem = ActorSystem()\n // And an execution context for the futures\n given ExecutionContext = actorSystem.getDispatcher\n\n // Load the example dataset\n val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n // First, let's see what views of the dataset we can obtain using Jelly's Iterable adapters:\n // 1. Iterable of all quads in the dataset\n val quads: immutable.Iterable[Quad] = dataset.asQuads\n // 2. 
Iterable of all graphs (named and default) in the dataset\n val graphs: immutable.Iterable[(Node, Iterable[Triple])] = dataset.asGraphs\n // 3. Iterable of all triples in the default graph\n val triples: immutable.Iterable[Triple] = dataset.getDefaultModel.asTriples\n\n // Note: here we are not turning the frames into bytes, but just printing their size in bytes.\n // You can find an example of how to turn a frame into a byte array in the `PekkoStreamsEncoderSource` example.\n // This is done with: .via(JellyIo.toBytes)\n\n // Let's try encoding this as flat RDF streams (streams of triples or quads)\n // https://w3id.org/stax/ontology#flatQuadStream\n println(f\"Encoding ${quads.size} quads as a flat RDF quad stream\")\n val flatQuadsFuture = Source(quads)\n .via(EncoderFlow.flatQuadStream(\n // This encoder requires a size limiter \u2013 otherwise a stream frame could have infinite length!\n StreamRowCountLimiter(20),\n JellyOptions.smallStrict,\n ))\n .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n Await.ready(flatQuadsFuture, 10.seconds)\n\n // https://w3id.org/stax/ontology#flatTripleStream\n println(f\"\\n\\nEncoding ${triples.size} triples as a flat RDF triple stream\")\n val flatTriplesFuture = Source(triples)\n .via(EncoderFlow.flatTripleStream(\n // This encoder requires a size limiter \u2013 otherwise a stream frame could have infinite length!\n ByteSizeLimiter(500),\n JellyOptions.smallStrict,\n ))\n .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n Await.ready(flatTriplesFuture, 10.seconds)\n\n // We can also stream already grouped triples or quads \u2013 for example, if your system generates batches of\n // N triples, you can just send those batches straight to be encoded, with one batch = one stream frame.\n // https://w3id.org/stax/ontology#flatQuadStream\n println(f\"\\n\\nEncoding ${quads.size} quads as a flat RDF 
quad stream, grouped in batches of 10\")\n // First, group the quads into batches of 10\n val groupedQuadsFuture = Source.fromIterator(() => quads.grouped(10))\n .via(EncoderFlow.flatQuadStreamGrouped(\n // Do not use a size limiter here \u2013 we want exactly one batch in each frame\n None,\n JellyOptions.smallStrict,\n ))\n .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n Await.ready(groupedQuadsFuture, 10.seconds)\n\n // Now, let's try grouped streams. Let's say we want to stream all graphs in a dataset, but put exactly one\n // graph in each frame (message). This is very common in (for example) IoT systems.\n // https://w3id.org/stax/ontology#namedGraphStream\n println(f\"\\n\\nEncoding ${graphs.size} graphs as a named graph stream\")\n val namedGraphsFuture = Source(graphs)\n .via(EncoderFlow.namedGraphStream(\n // Do not use a size limiter here \u2013 we want exactly one graph in each frame\n None,\n JellyOptions.smallStrict,\n ))\n // Note that we will see exactly as many frames as there are graphs in the dataset\n .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n Await.ready(namedGraphsFuture, 10.seconds)\n\n // As a last example, we will stream a series of RDF graphs. In our case this will be just the default graph\n // repeated a few times. 
This type of stream is also pretty common in practical applications.\n // https://w3id.org/stax/ontology#graphStream\n println(f\"\\n\\nEncoding 5 RDF graphs as a graph stream\")\n val graphsFuture = Source.repeat(triples)\n .take(5)\n .via(EncoderFlow.graphStream(\n // Do not use a size limiter here \u2013 we want exactly one graph in each frame\n None,\n JellyOptions.smallStrict,\n ))\n // Note that we will see exactly 5 frames \u2013 the number of graphs we streamed\n .runWith(Sink.foreach(frame => println(s\"Frame with ${frame.rows.size} rows, ${frame.serializedSize} bytes\")))\n\n Await.ready(graphsFuture, 10.seconds)\n\n actorSystem.terminate()\n
"},{"location":"user/reactive/#decoding-rdf-streams-decoderflow","title":"Decoding RDF streams (DecoderFlow
)","text":"The eu.ostrzyciel.jelly.stream.DecoderFlow
provides methods for decoding flat and grouped streams. However, there is no decoding counterpart to EncoderSource
. Such a utility would have to construct an RDF graph or dataset from the decoded statements, and that process varies a lot between applications, so you will have to implement this part yourself.
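As a rough illustration of that last step, the sketch below collects already decoded triples into an in-memory Apache Jena model. It uses only plain Jena calls (`ModelFactory.createDefaultModel`, `Graph.add`); the object name `CollectTriplesSketch`, the helper `toModel`, and the `example.org` IRIs are made up for this example and are not part of Jelly-JVM. Your application may well need a different target (a persistent store, a dataset with named graphs, and so on).

```scala
import org.apache.jena.graph.{NodeFactory, Triple}
import org.apache.jena.rdf.model.{Model, ModelFactory}

// Hypothetical helper, not part of Jelly-JVM.
object CollectTriplesSketch:
  /** Adds every triple to a fresh in-memory Jena model and returns it. */
  def toModel(triples: IterableOnce[Triple]): Model =
    val model = ModelFactory.createDefaultModel()
    // Model is a thin wrapper over a Graph, which accepts raw triples
    triples.iterator.foreach(t => model.getGraph.add(t))
    model

  def main(args: Array[String]): Unit =
    // Stand-in for the output of a decoder; the IRIs are invented for the demo
    val s = NodeFactory.createURI("http://example.org/sensor1")
    val p = NodeFactory.createURI("http://example.org/observes")
    val decodedTriples = Seq(
      Triple.create(s, p, NodeFactory.createURI("http://example.org/temperature")),
      Triple.create(s, p, NodeFactory.createURI("http://example.org/humidity")),
    )
    val model = toModel(decodedTriples)
    println(s"Collected ${model.size()} triples into a Jena model")
```

A `Seq[Triple]` produced by running a flat triple stream decoder into `Sink.seq` could be passed to `toModel` directly. For very large streams, adding statements incrementally inside the stream would avoid holding the whole sequence in memory first.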
Source code on GitHub
PekkoStreamsDecoderFlow.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.query.Dataset\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\n\nimport java.io.File\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\n\n/**\n * Example of using the [[eu.ostrzyciel.jelly.stream.DecoderFlow]] utility to turn incoming Jelly streams\n * into usable RDF data.\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsDecoderFlow extends shared.Example:\n def main(args: Array[String]): Unit =\n // We will need a Pekko actor system to run the streams\n given actorSystem: ActorSystem = ActorSystem()\n // And an execution context for the futures\n given ExecutionContext = actorSystem.getDispatcher\n\n // Load the example dataset\n val dataset = RDFDataMgr.loadDataset(File(getClass.getResource(\"/weather-graphs.trig\").toURI).toURI.toString)\n\n // To decode something, we first need to encode it...\n // See [[PekkoStreamsEncoderFlow]] and [[PekkoStreamsEncoderSource]] for an explanation of what is happening here.\n // We have three sequences of byte arrays, with each byte array corresponding to one encoded stream frame:\n // - encodedQuads: a flat RDF quad stream, physical type: QUADS\n // - encodedTriples: a flat RDF triple stream, physical type: TRIPLES\n // - encodedGraphs: a flat RDF quad stream, physical type: GRAPHS\n val (encodedQuads, encodedTriples, encodedGraphs) = getEncodedData(dataset)\n\n // Now we can decode the 
encoded data back into something useful.\n // Let's start by simply decoding the quads as a flat RDF quad stream:\n println(\"Decoding quads as a flat RDF quad stream...\")\n val decodedQuadsFuture = Source(encodedQuads)\n // We need to parse the bytes into a Jelly stream frame\n .via(JellyIo.fromBytes)\n // And then decode the frame into Jena quads.\n // We use \"decodeQuads\" because the physical stream type is QUADS.\n // And then we want to treat it as a flat RDF quad stream, so we call \"asFlatQuadStreamStrict\".\n // We use the \"Strict\" method to tell the decoder to check if the incoming logical stream type is the same\n // as we are expecting: flat RDF quad stream.\n .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n .runWith(Sink.seq)\n\n val decodedQuads: Seq[Quad] = Await.result(decodedQuadsFuture, 10.seconds)\n println(s\"Decoded ${decodedQuads.size} quads.\")\n\n // We can also treat each stream frame as a separate dataset. This way we would get an\n // RDF dataset stream.\n println(f\"\\n\\nDecoding quads as an RDF dataset stream from ${encodedQuads.size} frames...\")\n val decodedDatasetFuture = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n // Note that we cannot use the strict variant (asDatasetStreamOfQuadsStrict) here, because the stream says its\n // logical type is flat RDF quad stream.\n .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuads)\n .runWith(Sink.seq)\n\n val decodedDatasets: Seq[IterableOnce[Quad]] = Await.result(decodedDatasetFuture, 10.seconds)\n println(s\"Decoded ${decodedDatasets.size} datasets with\" +\n s\" ${decodedDatasets.map(_.iterator.size).sum} quads in total.\")\n\n // If we tried that with the strict variant, we would get an exception:\n println(f\"\\n\\nDecoding quads as an RDF dataset stream with strict logical type handling...\")\n val future = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuadsStrict)\n .runWith(Sink.seq)\n Await.result(future.recover {\n // 
eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n // Expected logical stream type LOGICAL_STREAM_TYPE_DATASETS, got LOGICAL_STREAM_TYPE_FLAT_QUADS.\n // LOGICAL_STREAM_TYPE_FLAT_QUADS is not a subtype of LOGICAL_STREAM_TYPE_DATASETS.\n case e: Exception => println(e.getCause)\n }, 10.seconds)\n\n // We can also pass entirely custom supported options to the decoder, instead of the defaults\n // (see [[JellyOptions.defaultSupportedOptions]]). This is useful if we want to decode a stream with\n // for example very large lookup tables or we want to put stricter limits on the streams that we accept.\n println(f\"\\n\\nDecoding quads as an RDF dataset stream with custom supported options...\")\n val customSupportedOptions = JellyOptions.defaultSupportedOptions\n .withMaxNameTableSize(50) // This is too small for the stream we are decoding\n val customSupportedOptionsFuture = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n .via(DecoderFlow.decodeQuads.asDatasetStreamOfQuads(customSupportedOptions))\n .runWith(Sink.seq)\n Await.result(customSupportedOptionsFuture.recover {\n // eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n // The stream uses a name table size of 128, which is larger than the maximum supported size of 50.\n // To read this stream, set maxNameTableSize to at least 128 in the supportedOptions for this decoder.\n case e: Exception => println(e.getCause)\n }, 10.seconds)\n\n // Flat RDF triple stream\n println(f\"\\n\\nDecoding triples as a flat RDF triple stream...\")\n val decodedTriplesFuture = Source(encodedTriples)\n .via(JellyIo.fromBytes)\n .via(DecoderFlow.decodeTriples.asFlatTripleStreamStrict)\n .runWith(Sink.seq)\n\n val decodedTriples: Seq[Triple] = Await.result(decodedTriplesFuture, 10.seconds)\n println(s\"Decoded ${decodedTriples.size} triples.\")\n\n // We can interpret the GRAPHS stream in a few ways, see\n // [[eu.ostrzyciel.jelly.stream.DecoderFlow.GraphsIngestFlowOps]] for more details.\n // 
Here we will treat it as an RDF named graph stream.\n println(f\"\\n\\nDecoding graphs as an RDF named graph stream...\")\n val decodedGraphsFuture = Source(encodedGraphs)\n .via(JellyIo.fromBytes)\n // Non-strict because the original logical stream type is flat RDF quad stream.\n .via(DecoderFlow.decodeGraphs.asNamedGraphStream)\n .runWith(Sink.seq)\n\n val decodedGraphs: Seq[(Node, Iterable[Triple])] = Await.result(decodedGraphsFuture, 10.seconds)\n println(s\"Decoded ${decodedGraphs.size} graphs.\")\n\n // If we tried using a decoder for a physical stream type that does not match the type of the stream,\n // we would get an exception. Here let's try to decode a QUADS stream with a TRIPLES decoder.\n println(f\"\\n\\nDecoding quads as a flat RDF triple stream...\")\n val future2 = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n // Note the \"decodeTriples\" here\n .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n .runWith(Sink.seq)\n Await.result(future2.recover {\n // eu.ostrzyciel.jelly.core.JellyExceptions$RdfProtoDeserializationError:\n // Incoming stream type is not TRIPLES.\n case e: Exception => println(e.getCause)\n }, 10.seconds)\n\n // We can get around this by using the \"decodeAny\" method, which will pick the appropriate decoder\n // based on the stream options in the stream.\n // In this case we can only ask the decoder to output a flat or grouped RDF stream.\n println(f\"\\n\\nDecoding quads as a flat RDF stream using decodeAny...\")\n val decodedAnyFuture = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n // There is no strict variant at all for decodeAny, as we don't care about the stream type anyway.\n .via(DecoderFlow.decodeAny.asFlatStream)\n .runWith(Sink.seq)\n\n val decodedAny: Seq[Triple | Quad] = Await.result(decodedAnyFuture, 10.seconds)\n println(s\"Decoded ${decodedAny.size} statements.\")\n\n // One last trick up our sleeves is the snoopStreamOptions method, which allows us to inspect the stream options\n // and carry on with the 
decoding as normal.\n // In this case, we will reuse the first example (flat RDF quad stream) and snoop the stream options.\n println(f\"\\n\\nSnooping the stream options of the first frame while decoding a flat RDF quad stream...\")\n val snoopFuture = Source(encodedQuads)\n .via(JellyIo.fromBytes)\n // We add a .viaMat here to capture the materialized value of this stage.\n .viaMat(DecoderFlow.snoopStreamOptions)(Keep.right)\n .via(DecoderFlow.decodeQuads.asFlatQuadStreamStrict)\n .toMat(Sink.seq)(Keep.both)\n .run()\n\n val streamOptions = Await.result(snoopFuture._1, 10.seconds)\n val decodedQuads2 = Await.result(snoopFuture._2, 10.seconds)\n\n val streamOptionsIndented = (\"\\n\" + streamOptions.get.toProtoString.strip).replace(\"\\n\", \"\\n \")\n println(s\"Stream options: $streamOptionsIndented\")\n println(s\"Decoded ${decodedQuads2.size} quads.\")\n\n actorSystem.terminate()\n\n\n /**\n * Helper method to produce encoded data from a dataset.\n */\n private def getEncodedData(dataset: Dataset)(using ActorSystem, ExecutionContext):\n (Seq[Array[Byte]], Seq[Array[Byte]], Seq[Array[Byte]]) =\n val quadStream = EncoderSource.fromDatasetAsQuads(\n dataset,\n ByteSizeLimiter(500),\n JellyOptions.smallStrict\n )\n val tripleStream = EncoderSource.fromGraph(\n dataset.getDefaultModel,\n ByteSizeLimiter(250),\n JellyOptions.smallStrict\n )\n val graphStream = EncoderSource.fromDatasetAsGraphs(\n dataset,\n None,\n JellyOptions.smallStrict\n )\n val results = Seq(quadStream, tripleStream, graphStream).map { stream =>\n val streamFuture = stream\n .via(JellyIo.toBytes)\n .runWith(Sink.seq)\n Await.result(streamFuture, 10.seconds)\n }\n (results.head, results(1), results(2))\n
"},{"location":"user/reactive/#byte-streams-delimited-variant","title":"Byte streams (delimited variant)","text":"In all of the examples above, we used the non-delimited variant of Jelly, which is appropriate for, e.g., sending Jelly data over gRPC or Kafka. If you want to write Jelly data to a file or a socket, you will need to use the delimited variant. jelly-stream
provides a few methods for this in eu.ostrzyciel.jelly.stream.JellyIo
.
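To make the difference between the two variants more concrete, here is a small standalone sketch of length-delimited framing, the general technique behind Protocol Buffers' `writeDelimitedTo` and `parseDelimitedFrom`: each message is preceded by its length encoded as a base-128 varint. This is not Jelly-JVM code. The object `DelimitedFramingSketch` and its methods are invented for illustration, and the "frames" are arbitrary byte arrays rather than real Jelly frames.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, EOFException, InputStream, OutputStream}
import java.nio.charset.StandardCharsets.UTF_8

// Illustrative only: none of these names exist in Jelly-JVM.
object DelimitedFramingSketch:
  /** Writes a non-negative Int as a base-128 varint (7 payload bits per byte). */
  def writeVarint(out: OutputStream, value: Int): Unit =
    var v = value
    while (v & ~0x7F) != 0 do
      out.write((v & 0x7F) | 0x80) // continuation bit set: more bytes follow
      v >>>= 7
    out.write(v)

  /** Reads a varint; returns None if the stream ended cleanly before it. */
  def readVarint(in: InputStream): Option[Int] =
    val first = in.read()
    if first == -1 then None
    else
      var result = first & 0x7F
      var shift = 7
      var b = first
      while (b & 0x80) != 0 do
        b = in.read()
        if b == -1 then throw new EOFException("Truncated length prefix")
        result |= (b & 0x7F) << shift
        shift += 7
      Some(result)

  /** Writes one frame as: varint length, then the payload bytes. */
  def writeDelimited(out: OutputStream, frame: Array[Byte]): Unit =
    writeVarint(out, frame.length)
    out.write(frame)

  /** Reads frames until the end of the stream. */
  def readAllDelimited(in: InputStream): Seq[Array[Byte]] =
    Iterator
      .continually(readVarint(in))
      .takeWhile(_.isDefined)
      .collect { case Some(length) => in.readNBytes(length) }
      .toSeq

  def main(args: Array[String]): Unit =
    val frames = Seq("frame-1".getBytes(UTF_8), Array.fill[Byte](300)(42))
    val out = ByteArrayOutputStream()
    frames.foreach(writeDelimited(out, _))
    // Without the length prefixes, the reader could not tell where one frame ends
    val decoded = readAllDelimited(ByteArrayInputStream(out.toByteArray))
    println(decoded.map(_.length).mkString(", ")) // prints "7, 300"
```

Message-oriented transports such as Kafka or gRPC already carry this boundary information themselves, which is why the non-delimited variant is sufficient there.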
Source code on GitHub
PekkoStreamsWithIo.scalapackage eu.ostrzyciel.jelly.examples\n\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.stream.*\nimport org.apache.jena.graph.{Node, Triple}\nimport org.apache.jena.query.Dataset\nimport org.apache.jena.riot.RDFDataMgr\nimport org.apache.jena.sparql.core.Quad\nimport org.apache.pekko.actor.ActorSystem\nimport org.apache.pekko.stream.scaladsl.*\nimport org.apache.pekko.util.ByteString\n\nimport java.io.{File, FileInputStream, FileOutputStream}\nimport java.util.zip.GZIPInputStream\nimport scala.collection.immutable\nimport scala.concurrent.{Await, ExecutionContext}\nimport scala.concurrent.duration.*\nimport scala.util.Using\n\n/**\n * Example of using Pekko Streams to read/write Jelly to a file or any other byte stream (e.g., socket).\n *\n * The examples here use the DELIMITED variant of Jelly, which is suitable only for situations where there is\n * no framing in the underlying stream. You should always use the delimited variant with raw files and sockets,\n * as otherwise it would be impossible to tell where one stream frame ends and another one begins.\n *\n * If you are working with something like MQTT, Kafka, JMS, AMQP... 
then check the examples in\n * [[eu.ostrzyciel.jelly.examples.PekkoStreamsEncoderFlow]].\n *\n * In this example we are using Apache Jena as the RDF library (note the import:\n * `import eu.ostrzyciel.jelly.convert.jena.given`).\n * The same can be achieved with RDF4J just by importing a different module.\n */\nobject PekkoStreamsWithIo extends shared.Example:\n def main(args: Array[String]): Unit =\n // We will need a Pekko actor system to run the streams\n given actorSystem: ActorSystem = ActorSystem()\n // And an execution context for the futures\n given ExecutionContext = actorSystem.getDispatcher\n\n // We will read a gzipped Jelly file from disk and decode it on the fly, as we are decompressing it.\n println(\"Decoding a gzipped Jelly file with Pekko Streams...\")\n // The input file is a GZipped Jelly file\n val inputFile = File(getClass.getResource(\"/jelly/weather.jelly.gz\").toURI)\n\n // Use Java's GZIPInputStream to decompress the input file on the fly\n val decodedTriples: Seq[Triple] = Using.resource(new GZIPInputStream(FileInputStream(inputFile))) { inputStream =>\n val decodedTriplesFuture = JellyIo.fromIoStream(inputStream)\n // Decode the Jelly frames to triples.\n // Under the hood it uses the RdfStreamFrame.parseDelimitedFrom method.\n .via(DecoderFlow.decodeTriples.asFlatTripleStream)\n .runWith(Sink.seq)\n\n Await.result(decodedTriplesFuture, 10.seconds)\n }\n\n println(s\"Decoded ${decodedTriples.size} triples\")\n\n // -----------------------------------------------------------\n // Now we will write the decoded triples to a new Jelly file\n println(\"\\n\\nWriting the decoded triples to a new Jelly file with Pekko Streams...\")\n Using.resource(new FileOutputStream(\"weather.jelly\")) { outputStream =>\n val writeFuture = Source(decodedTriples)\n // Encode the triples to Jelly\n .via(EncoderFlow.flatTripleStream(\n ByteSizeLimiter(500),\n JellyOptions.smallStrict\n ))\n // Write the Jelly frames to a Java byte stream.\n // Under the hood it 
uses the RdfStreamFrame.writeDelimitedTo method.\n .runWith(JellyIo.toIoStream(outputStream))\n\n Await.ready(writeFuture, 10.seconds)\n println(\"Done writing the Jelly file.\")\n }\n\n // -----------------------------------------------------------\n // Pekko Streams offers its own utilities for reading and writing bytes that do not involve using Java's\n // blocking implementation of streams.\n // We will again write the decoded triples to a Jelly file, but this time use Pekko's facilities.\n println(\"\\n\\nWriting the decoded triples to a new Jelly file with Pekko Streams' utilities...\")\n val writeFuture = Source(decodedTriples)\n .via(EncoderFlow.flatTripleStream(\n ByteSizeLimiter(500),\n JellyOptions.smallStrict\n ))\n // Convert the frames into Pekko's byte strings.\n // Note: we are using the DELIMITED variant because we will write this to disk!\n .via(JellyIo.toBytesDelimited)\n .map(bytes => ByteString(bytes))\n .runWith(FileIO.toPath(File(\"weather2.jelly\").toPath))\n\n Await.ready(writeFuture, 10.seconds)\n println(\"Done writing the Jelly file.\")\n\n actorSystem.terminate()\n
"},{"location":"user/reactive/#see-also","title":"See also","text":"This guide presents some useful utilities in the jelly-core
and jelly-stream
modules.
Every Jelly stream begins with a header that specifies the serialization options used to encode the stream \u2013 see the details in the specification. So, whenever you serialize some RDF with Jelly (e.g., using Apache Jena RIOT, RDF4J Rio, or the jelly-stream
module), you need to specify these options.
The eu.ostrzyciel.jelly.core.JellyOptions
object provides a few common presets for Jelly serialization options. They return an instance of eu.ostrzyciel.jelly.core.proto.v1.RdfStreamOptions
that you can further customize. For example:
import eu.ostrzyciel.jelly.core.JellyOptions\n\nval options = JellyOptions.smallStrict\n\nval optionsWithRdfStarSupport = JellyOptions.smallRdfStar\n\nval bigWithCustomDictionarySize = JellyOptions.bigStrict\n .withMaxNameTableSize(2000) \n
Warning
These presets do not specify the physical or logical stream type. In most cases, the Jelly library will take care of this for you and set these types automatically later. However, if you use the low-level API, you need to set the stream types manually. For example:
import eu.ostrzyciel.jelly.core.JellyOptions\nimport eu.ostrzyciel.jelly.core.proto.v1.*\n\nJellyOptions.smallStrict\n .withPhysicalType(PhysicalStreamType.QUADS)\n .withLogicalType(LogicalStreamType.DATASETS)\n
"},{"location":"user/utilities/#checking-supported-options","title":"Checking supported options","text":"There is also the eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions
method which specifies the maximum set of options supported by default in Jelly-JVM, when parsing a stream. By default, Jelly-JVM will refuse to parse any stream that uses options that are beyond what is specified in this method. This is important for security reasons, as it prevents the library from, for example, allocating a 10 GB dictionary (potential Denial of Service attack).
The supported options check is carried out automatically by the decoder when parsing a stream. You cannot disable the check, but you can customize the supported options by constructing a new RdfStreamOptions
object from eu.ostrzyciel.jelly.core.JellyOptions.defaultSupportedOptions
, customizing it, and passing it to the decoder.
If you want to do this kind of check in some other context (e.g., in a gRPC service to check if you can support the options requested by the client), you can use the eu.ostrzyciel.jelly.core.JellyOptions.checkCompatibility
method. It will throw an exception if the options are not supported.
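A minimal sketch of such a check is shown below. It assumes that `checkCompatibility` takes the requested options first and the supported options second, and that the failure surfaces as an ordinary exception; please verify both against the API documentation of your Jelly-JVM version. The object `CheckOptionsSketch` and the helper `isSupported` are hypothetical names used only for this example.

```scala
import eu.ostrzyciel.jelly.core.JellyOptions
import eu.ostrzyciel.jelly.core.proto.v1.RdfStreamOptions
import scala.util.{Failure, Success, Try}

// Hypothetical wrapper, not part of Jelly-JVM.
object CheckOptionsSketch:
  /** Returns true if `requested` stays within the limits given by `supported`. */
  def isSupported(requested: RdfStreamOptions, supported: RdfStreamOptions): Boolean =
    // Assumed argument order: (requested, supported)
    Try(JellyOptions.checkCompatibility(requested, supported)) match
      case Success(_) => true
      case Failure(e) =>
        println(s"Options rejected: ${e.getMessage}")
        false

  def main(args: Array[String]): Unit =
    // Options that a client might ask for
    val requested = JellyOptions.smallStrict
    // Our own limits: start from the defaults and make them stricter
    val supported = JellyOptions.defaultSupportedOptions
      .withMaxNameTableSize(50)
    println(s"Supported: ${isSupported(requested, supported)}")
```

Wrapping the call in `Try` lets a service answer with a clean error instead of propagating the exception, which is usually preferable in request handlers.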
The eu.ostrzyciel.jelly.core.Constants
object defines some useful constants, such as the file extension for Jelly, its content type, and the version of the Jelly protocol.
Jelly uses RDF-STaX to define the logical stream types (more details here). Jelly-JVM defines each of these types as a case object in eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType
.
These objects have a few useful methods for working with the RDF-STaX ontology:
import eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\n\n// Get the RDF-STaX IRI of a stream type\n// returns \"https://w3id.org/stax/ontology#flatTripleStream\"\nLogicalStreamType.TRIPLES.getRdfStaxType\n
You can also obtain a full RDF-STaX annotation for your stream if you also import an RDF library interop module (e.g., jelly-jena
or jelly-rdf4j
):
// Here we import `jena.given` to get the necessary implicit conversions.\n// You can do the same with `rdf4j.given` if you are using RDF4J.\nimport eu.ostrzyciel.jelly.convert.jena.given\nimport eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\nimport org.apache.jena.graph.{Node, NodeFactory, Triple}\n\n// The node that the annotation will be attached to\nval subjectNode: Node = NodeFactory.createURI(\"http://example.org/subject\")\nval triples: Seq[Triple] = LogicalStreamType.QUADS.getRdfStaxAnnotation(subjectNode)\n// Returns a Seq of three triples that would look like this in Turtle:\n// <http://example.org/subject> stax:hasStreamTypeUsage [\n// a stax:RdfStreamTypeUsage ;\n// stax:hasStreamType stax:flatQuadStream\n// ] .\n
You can then take this annotation and expose it as semantic metadata of your stream.
You can also do the opposite and construct an instance of LogicalStreamType
from an RDF-STaX IRI:
import eu.ostrzyciel.jelly.core.LogicalStreamTypeFactory\n\nval iri = \"https://w3id.org/stax/ontology#flatQuadStream\"\n// returns LogicalStreamType.QUADS\nval streamType = LogicalStreamTypeFactory.fromOntologyIri(iri)\n
Finally, there are also stream type checking and manipulation utilities:
import eu.ostrzyciel.jelly.core.*\nimport eu.ostrzyciel.jelly.core.proto.v1.LogicalStreamType\n\n// Check if this type is equal to or a subtype of another type.\n// This is useful for performing compatibility checks.\n// Returns false\nLogicalStreamType.TRIPLES.isEqualOrSubtypeOf(LogicalStreamType.DATASETS)\n// Returns true\nLogicalStreamType.NAMED_GRAPHS.isEqualOrSubtypeOf(LogicalStreamType.DATASETS)\n\n// Get the \"base\" type of a stream type. Base types are concrete stream types \n// that have no parent types. \n// There are only 4 base types: GRAPHS, DATASETS, TRIPLES, QUADS.\n// Returns LogicalStreamType.TRIPLES\nLogicalStreamType.TRIPLES.toBaseType\n// Returns LogicalStreamType.DATASETS\nLogicalStreamType.NAMED_GRAPHS.toBaseType\n// Returns LogicalStreamType.DATASETS\nLogicalStreamType.TIMESTAMPED_NAMED_GRAPHS.toBaseType\n
"},{"location":"user/utilities/#jelly-configuration-from-typesafe-config","title":"Jelly configuration from Typesafe config","text":"The jelly-stream
module also implements a utility for configuring Jelly serialization options using the Typesafe config library, which is commonly used in Apache Pekko applications.
The utility is provided by the eu.ostrzyciel.jelly.stream.JellyOptionsFromTypesafe
object. For example:
import com.typesafe.config.ConfigFactory\nimport eu.ostrzyciel.jelly.stream.JellyOptionsFromTypesafe\n\nval config = ConfigFactory.parseString(\"\"\"\n |jelly.physical-type = QUADS\n |jelly.name-table-size = 1024\n |jelly.prefix-table-size = 64\n |\"\"\".stripMargin)\n\nval options = JellyOptionsFromTypesafe.fromConfig(config.getConfig(\"jelly\"))\noptions.physicalType // returns PhysicalStreamType.QUADS\noptions.maxNameTableSize // returns 1024\noptions.maxPrefixTableSize // returns 64\noptions.maxDatatypeTableSize // returns 16 (the default)\n
See the source code of this class for more details.
"},{"location":"user/utilities/#see-also","title":"See also","text":"