Inlining vertex properties into a CompositeIndex structure #4692

Merged
merged 4 commits into from
Oct 27, 2024
17 changes: 17 additions & 0 deletions docs/changelog.md
@@ -98,6 +98,23 @@ For more information on features and bug fixes in 1.1.0, see the GitHub mileston
* [JanusGraph zip](https://github.com/JanusGraph/janusgraph/releases/download/v1.1.0/janusgraph-1.1.0.zip)
* [JanusGraph zip with embedded Cassandra and ElasticSearch](https://github.com/JanusGraph/janusgraph/releases/download/v1.1.0/janusgraph-full-1.1.0.zip)

##### Upgrade Instructions

##### Inlining vertex properties into a Composite Index

A composite index can now store selected vertex properties inline, so queries that hit the index can read those properties without an additional lookup against the data store.
See [documentation](./schema/index-management/index-performance.md#inlining-vertex-properties-into-a-composite-index) on how to inline vertex properties into a composite index.

**Important Notes on Compatibility**

1. **Backward Incompatibility**
Once a JanusGraph instance adopts this new schema feature, it cannot be rolled back to a prior version of JanusGraph.
The changes in the schema structure are not compatible with earlier versions of the system.

2. **Migration Considerations**
Plan the migration to this version carefully: once this feature is used, there is no automated or manual rollback process
to revert to an older version of JanusGraph.

### Version 1.0.1 (Release Date: ???)

/// tab | Maven
38 changes: 38 additions & 0 deletions docs/schema/index-management/index-performance.md
@@ -323,6 +323,44 @@ index with label restriction is defined as unique, the uniqueness
constraint only applies to properties on vertices or edges for the
specified label.

### Inlining vertex properties into a Composite Index

A composite index can store selected vertex properties inline, alongside the indexed entry. This offers the following benefits and trade-offs.

1. **Performance Improvements**
Faster querying: with vertex properties inlined in the index, a query can read them from the index entry itself.
Queries then don't need additional calls to the data store to fetch full vertex information, which reduces lookup time.

2. **Data Locality**
In distributed storage backends, inlined properties keep more of the data a query needs within a single partition or shard.
This reduces cross-node network calls and improves query performance by keeping data local to the request being processed.

3. **Cost of Indexing vs. Storage Trade-off**
Inlining properties increases the size of the index, and with it the index storage requirements.
That is often a worthwhile trade-off for performance, especially when query speed is critical.
This is a typical pattern in systems optimized for read-heavy workloads.
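
The effect described in the list above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: `VertexStore`, `IndexEntry`, `namesFor`, and `storeReadsForQuery` are hypothetical names invented for this illustration, and the model does not reflect JanusGraph's actual storage layout or API. It only counts how many extra data store reads a query needs when the requested property is, or is not, copied into the index entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these classes are hypothetical and are not JanusGraph APIs.
public class InlineIndexSketch {

    // Stand-in for the primary data store; counts how often it is read.
    static final class VertexStore {
        private final Map<Long, Map<String, Object>> rows = new HashMap<>();
        int reads = 0;

        void put(long vertexId, Map<String, Object> properties) {
            rows.put(vertexId, properties);
        }

        Object readProperty(long vertexId, String key) {
            reads++; // models one extra call to the data store
            return rows.get(vertexId).get(key);
        }
    }

    // One index entry: the vertex id plus property values copied into the index.
    static final class IndexEntry {
        final long vertexId;
        final Map<String, Object> inlined;

        IndexEntry(long vertexId, Map<String, Object> inlined) {
            this.vertexId = vertexId;
            this.inlined = inlined;
        }
    }

    // Resolves "name" for each index hit, reading the store only when it is not inlined.
    static List<Object> namesFor(List<IndexEntry> hits, VertexStore store) {
        List<Object> names = new ArrayList<>();
        for (IndexEntry hit : hits) {
            if (hit.inlined.containsKey("name")) {
                names.add(hit.inlined.get("name"));
            } else {
                names.add(store.readProperty(hit.vertexId, "name"));
            }
        }
        return names;
    }

    // Indexes three vertices by city and returns how many store reads one query needs.
    static int storeReadsForQuery(boolean inlineName) {
        VertexStore store = new VertexStore();
        List<IndexEntry> cityIndex = new ArrayList<>();
        for (long id = 0; id < 3; id++) {
            String name = "name" + id;
            store.put(id, Map.of("name", name, "city", "Chicago"));
            Map<String, Object> inlined = inlineName ? Map.of("name", name) : Map.of();
            cityIndex.add(new IndexEntry(id, inlined));
        }
        namesFor(cityIndex, store);
        return store.reads;
    }

    public static void main(String[] args) {
        System.out.println("store reads without inlining: " + storeReadsForQuery(false)); // 3
        System.out.println("store reads with inlining: " + storeReadsForQuery(true));     // 0
    }
}
```

In this model the saving grows with the number of index hits, while every inlined value is stored a second time in the index, which mirrors the storage trade-off in item 3.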

#### Usage
To take advantage of inlined properties, start the JanusGraph transaction with `.propertyPrefetching(false)`.

Example:

```groovy
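// Assumed setup: the property keys do not exist yet, so they are created here
mgmt = graph.openManagement()
idKey = mgmt.makePropertyKey("id").dataType(Integer.class).make()
nameKey = mgmt.makePropertyKey("name").dataType(String.class).make()

// Build index: "id" is the indexed key, "name" is stored inline in the index
mgmt.buildIndex("composite", Vertex.class)
    .addKey(idKey)
    .addInlinePropertyKey(nameKey)
    .buildCompositeIndex()
mgmt.commit()

// Query
tx = graph.buildTransaction()
    .propertyPrefetching(false) // this is important
    .start()

tx.traversal().V().has("id", 100).next().value("name")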
```

### Composite versus Mixed Indexes

1. Use a composite index for exact match index retrievals. Composite
@@ -70,6 +70,7 @@
import org.janusgraph.diskstorage.indexing.IndexInformation;
import org.janusgraph.diskstorage.indexing.IndexProvider;
import org.janusgraph.diskstorage.indexing.IndexTransaction;
import org.janusgraph.diskstorage.keycolumnvalue.scan.ScanJobFuture;
import org.janusgraph.diskstorage.log.kcvs.KCVSLog;
import org.janusgraph.diskstorage.util.time.TimestampProvider;
import org.janusgraph.example.GraphOfTheGodsFactory;
@@ -83,11 +84,14 @@
import org.janusgraph.graphdb.internal.ElementCategory;
import org.janusgraph.graphdb.internal.ElementLifeCycle;
import org.janusgraph.graphdb.internal.Order;
import org.janusgraph.graphdb.internal.RelationCategory;
import org.janusgraph.graphdb.log.StandardTransactionLogProcessor;
import org.janusgraph.graphdb.query.index.ApproximateIndexSelectionStrategy;
import org.janusgraph.graphdb.query.index.BruteForceIndexSelectionStrategy;
import org.janusgraph.graphdb.query.index.ThresholdBasedIndexSelectionStrategy;
import org.janusgraph.graphdb.query.profile.QueryProfiler;
import org.janusgraph.graphdb.query.vertex.BaseVertexCentricQuery;
import org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphMixedIndexAggStep;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphStep;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.JanusGraphMixedIndexCountStrategy;
@@ -106,6 +110,8 @@
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
@@ -1445,6 +1451,199 @@ public void testCompositeVsMixedIndexing() {
        assertTrue(tx.traversal().V().has("intId2", 234).hasNext());
    }

    @Test
    public void testIndexInlineProperties() throws NoSuchMethodException {

        clopen(option(FORCE_INDEX_USAGE), true);

        final PropertyKey idKey = makeKey("id", Integer.class);
        final PropertyKey nameKey = makeKey("name", String.class);
        final PropertyKey cityKey = makeKey("city", String.class);

        mgmt.buildIndex("composite", Vertex.class)
            .addKey(idKey)
            .addInlinePropertyKey(nameKey)
            .buildCompositeIndex();

        finishSchema();

        String name = "Mizar";
        String city = "Chicago";
        tx.addVertex("id", 100, "name", name, "city", city);
        tx.commit();

        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        Method m = VertexCentricQueryBuilder.class.getSuperclass().getDeclaredMethod("constructQuery", RelationCategory.class);
        m.setAccessible(true);

        CacheVertex v = (CacheVertex) (tx.traversal().V().has("id", 100).next());

        verifyPropertyLoaded(v, "name", true, m);
        verifyPropertyLoaded(v, "city", false, m);

        assertEquals(name, v.value("name"));
        assertEquals(city, v.value("city"));
    }

    @Test
    public void testIndexInlinePropertiesReindex() throws NoSuchMethodException, InterruptedException {
        clopen(option(FORCE_INDEX_USAGE), true);

        PropertyKey idKey = makeKey("id", Integer.class);
        PropertyKey nameKey = makeKey("name", String.class);
        PropertyKey cityKey = makeKey("city", String.class);

        mgmt.buildIndex("composite", Vertex.class)
            .addKey(cityKey)
            .buildCompositeIndex();

        finishSchema();

        String city = "Chicago";
        for (int i = 0; i < 3; i++) {
            tx.addVertex("id", i, "name", "name" + i, "city", city);
        }

        tx.commit();

        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        Method m = VertexCentricQueryBuilder.class.getSuperclass().getDeclaredMethod("constructQuery", RelationCategory.class);
        m.setAccessible(true);

        List<Vertex> vertices = tx.traversal().V().has("city", city).toList();
        vertices.stream()
            .map(v -> (CacheVertex) v)
            .forEach(v -> verifyPropertyLoaded(v, "name", false, m));

        tx.commit();

        //Include inlined property
        JanusGraphIndex index = mgmt.getGraphIndex("composite");
        nameKey = mgmt.getPropertyKey("name");
        mgmt.addInlinePropertyKey(index, nameKey);
        finishSchema();

        //Reindex
        index = mgmt.getGraphIndex("composite");
        ScanJobFuture scanJobFuture = mgmt.updateIndex(index, SchemaAction.REINDEX);
        finishSchema();

        while (!scanJobFuture.isDone()) {
            Thread.sleep(1000);
        }

        //Try query now
        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        List<Vertex> vertices2 = tx.traversal().V().has("city", city).toList();
        vertices2.stream()
            .map(v -> (CacheVertex) v)
            .forEach(v -> verifyPropertyLoaded(v, "name", true, m));

        tx.commit();
    }
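
A note on the wait loop in the reindex test above: it polls `isDone()` once per second with no upper bound, so a stuck job would hang the test. The following is a minimal sketch of a bounded wait, assuming the job handle can be treated as a plain `java.util.concurrent.Future` (this sketch does not verify that for `ScanJobFuture`). `BoundedWaitSketch` and `awaitJob` are hypothetical names, and a `CompletableFuture` stands in for the reindex job.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of JanusGraph: waits for a job with an upper bound.
public class BoundedWaitSketch {

    // Blocks until the job finishes or the timeout elapses, then returns the job result.
    static <T> T awaitJob(Future<T> job, long timeout, TimeUnit unit) {
        try {
            return job.get(timeout, unit);
        } catch (TimeoutException e) {
            job.cancel(true); // stop waiting on a job that is taking too long
            throw new IllegalStateException("job did not finish within " + timeout + " " + unit, e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("job failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting for job", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a reindex job; a real job handle would come from the management API.
        Future<String> reindexJob = CompletableFuture.supplyAsync(() -> "reindex finished");
        System.out.println(awaitJob(reindexJob, 5, TimeUnit.SECONDS));
    }
}
```

Compared with sleep-based polling, a timed `get` returns as soon as the job completes and fails with a clear error when the bound is exceeded.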

    @Test
    public void testIndexInlinePropertiesUpdate() {

        clopen(option(FORCE_INDEX_USAGE), true);

        final PropertyKey idKey = makeKey("id", Integer.class);
        final PropertyKey nameKey = makeKey("name", String.class);
        final PropertyKey cityKey = makeKey("city", String.class);

        mgmt.buildIndex("composite", Vertex.class)
            .addKey(idKey)
            .addInlinePropertyKey(nameKey)
            .buildCompositeIndex();

        finishSchema();

        String name1 = "Mizar";
        String name2 = "Alcor";

        String city = "Chicago";
        tx.addVertex("id", 100, "name", name1, "city", city);
        tx.addVertex("id", 200, "name", name2, "city", city);
        tx.commit();

        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        Vertex v = (tx.traversal().V().has("id", 100).next());
        assertEquals(name1, v.value("name"));

        //Update inlined property
        v.property("name", "newName");
        tx.commit();

        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        v = (tx.traversal().V().has("id", 100).next());
        assertEquals("newName", v.value("name"));
    }

    @Test
    public void testIndexInlinePropertiesLimit() throws NoSuchMethodException {

        clopen(option(FORCE_INDEX_USAGE), true);

        final PropertyKey nameKey = makeKey("name", String.class);
        final PropertyKey cityKey = makeKey("city", String.class);

        mgmt.buildIndex("composite", Vertex.class)
            .addKey(cityKey)
            .addInlinePropertyKey(nameKey)
            .buildCompositeIndex();

        finishSchema();

        String city = "Chicago";
        for (int i = 0; i < 10; i++) {
            String name = "name_" + i;
            tx.addVertex("name", name, "city", city);
        }
        tx.commit();

        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        Method m = VertexCentricQueryBuilder.class.getSuperclass().getDeclaredMethod("constructQuery", RelationCategory.class);
        m.setAccessible(true);

        List<Vertex> vertices = tx.traversal().V().has("city", city).limit(3).toList();
        assertEquals(3, vertices.size());
        vertices.stream().map(v -> (CacheVertex) v).forEach(v -> {
            verifyPropertyLoaded(v, "name", true, m);
            verifyPropertyLoaded(v, "city", false, m);
        });
    }

    private void verifyPropertyLoaded(CacheVertex v, String propertyName, boolean isPresent, Method m) {
        VertexCentricQueryBuilder queryBuilder = v.query().direction(Direction.OUT);
        // Build the query for the given property through the reflected constructQuery method
        final BaseVertexCentricQuery propertyQuery;
        try {
            propertyQuery = (BaseVertexCentricQuery) m.invoke(queryBuilder.keys(propertyName), RelationCategory.PROPERTY);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new RuntimeException(e);
        }
        // Verify whether the property is already present in the vertex cache
        boolean result = v.hasLoadedRelations(propertyQuery.getSubQuery(0).getBackendQuery());
        assertEquals(isPresent, result);
    }

    @Test
    public void testCompositeAndMixedIndexing() {
        final PropertyKey name = makeKey("name", String.class);