Performance evaluation (November 2017)

Erwan Demairy edited this page Nov 14, 2017 · 12 revisions

Query on rdfs:label

Query

@db <.../dbpedial_ttl>
select (count(*) as ?c) where {
  ?x rdfs:label ?l
}

Behavior with Cypher

A running Neo4j server is required. By default, the server's HTTP interface listens on localhost:7474, while cypher-shell and the drivers connect over Bolt on localhost:7687.

$ ./cypher-shell -u neo4j -p "..."
Connected to Neo4j 3.2.6 at bolt://localhost:7687 as user neo4j.
...
neo4j> match (n:rdf_edge {p_value: "http://www.w3.org/2000/01/rdf-schema#label"}) return count(n); 
+----------+
| count(n) |
+----------+
| 3168487  |
+----------+

1 row available after 92924 ms, consumed after another 2 ms

Forcing the use of the index on rdf_edge:p_value gives much better results (see below). In fact, the timings are inconsistent: when the previous query is rerun (so without an explicit index hint), its times become identical to those of the query below. No explanation for now (probably a cache).

neo4j> match (n:rdf_edge {p_value: "http://www.w3.org/2000/01/rdf-schema#label"}) using index n:rdf_edge(p_value) return count(n); 
+----------+
| count(n) |
+----------+
| 3168487  |
+----------+

1 row available after 1867 ms, consumed after another 0 ms
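
The inconsistency noted above (slow first run, fast reruns) is what a warm cache would produce. One way to check is to time the same query several times in a row and compare the first run with the later ones. The following is a minimal sketch only: the `RepeatTimer` class and its stand-in workload are hypothetical and not part of the original measurements, and the workload would have to be replaced by a call that actually runs the Cypher query.

```java
public class RepeatTimer {

    /** Runs the task `runs` times and returns the elapsed time of each run, in milliseconds. */
    public static long[] timeRuns(Runnable task, int runs) {
        long[] elapsedMs = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsedMs[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption): replace with code that runs the Cypher query.
        Runnable query = () -> {
            long sum = 0;
            for (int i = 0; i < 5_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        };

        long[] times = timeRuns(query, 5);
        for (int i = 0; i < times.length; i++) {
            System.out.println("run " + (i + 1) + ": " + times[i] + " ms");
        }
    }
}
```

If the first run is much slower than the following ones, a cache effect is the likely cause; restarting the server between series would be needed to reproduce the cold timing.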

Using the Java driver (Cypher)

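// Imports assume the Neo4j Java driver 1.x (package org.neo4j.driver.v1),
// which provides the Driver, StatementResult and TransactionWork types used below.
import org.neo4j.driver.v1.AuthTokens;
import org.neo4j.driver.v1.Driver;
import org.neo4j.driver.v1.GraphDatabase;
import org.neo4j.driver.v1.Session;
import org.neo4j.driver.v1.StatementResult;
import org.neo4j.driver.v1.Transaction;
import org.neo4j.driver.v1.TransactionWork;

public class Main implements AutoCloseable {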

	private final Driver driver;

	public Main(String uri, String user, String password) {
		driver = GraphDatabase.driver(uri, AuthTokens.basic(user, password));
	}

	@Override
	public void close() throws Exception {
		driver.close();
	}

	public void search() {
		try (Session session = driver.session()) {
			String result = session.readTransaction(new TransactionWork<String>() {
				@Override
				public String execute(Transaction tx) {
					StatementResult result = tx.run("match (n:rdf_edge {p_value: \"http://www.w3.org/2000/01/rdf-schema#label\"}) using index n:rdf_edge(p_value) return count(n)");
//					long count = 0;
//					while (result.hasNext()) {
//						count++;
//						result.next();
//					}
//					return Long.toString(count);//single().get(0).asString();)
					return result.next().toString();
//					return result.single().get(0).toString();
				}
			});
			System.out.println(result);
		}
	}

	public static void main(String... args) throws Exception {
		try (Main greeter = new Main("bolt://localhost:7687", "neo4j", "4cG-KkD-F3A-y9F")) {
			long s = System.nanoTime();
			greeter.search();
			long e = System.nanoTime();
			System.out.println((e-s)/1_000_000_000.);
		}
	}
} 

Query: match (n:rdf_edge {p_value: "http://www.w3.org/2000/01/rdf-schema#label"}) using index n:rdf_edge(p_value) return count(n)

Execution time of the count: 3.11 s.

Query: match (n:rdf_edge {p_value: "http://www.w3.org/2000/01/rdf-schema#label"}) using index n:rdf_edge(p_value) return n

Execution time when counting with a client-side loop (hasNext/next): 39.6 s.
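
To put the two timings side by side, the arithmetic can be spelled out. This is a rough sketch, assuming that the loop iterates over the same 3168487 rows that the count reports and that the whole 39.6 s is spent fetching rows; the `StreamingCost` class is purely illustrative.

```java
import java.util.Locale;

public class StreamingCost {

    /** How many times slower the slow timing is compared with the fast one. */
    public static double ratio(double slowSeconds, double fastSeconds) {
        return slowSeconds / fastSeconds;
    }

    /** Average time spent per row, in microseconds. */
    public static double microsPerRow(double seconds, long rows) {
        return seconds * 1_000_000 / rows;
    }

    public static void main(String[] args) {
        double countSeconds = 3.11;   // server-side count(n)
        double loopSeconds = 39.6;    // client-side hasNext/next loop
        long rows = 3168487L;         // value returned by count(n)

        // prints "slowdown: 12.7x"
        System.out.println(String.format(Locale.ROOT, "slowdown: %.1fx", ratio(loopSeconds, countSeconds)));
        // prints "per row: 12.5 us"
        System.out.println(String.format(Locale.ROOT, "per row: %.1f us", microsPerRow(loopSeconds, rows)));
    }
}
```

Under these assumptions, streaming every node to the client costs roughly 12.5 µs per row, about 12.7 times the cost of letting the server count.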

Join

Query in Cypher

neo4j> profile  match (n:rdf_edge {s_value: "http://fr.dbpedia.org/resource/Antibes"}),(y:rdf_edge) where n.p_value=y.p_value and n.o_value=y.o_value return n;
+---------------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time   | DbHits | Rows   |
+---------------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.2" | "COST"  | "INTERPRETED" | 293982 | 0      | 368110 |
+---------------------------------------------------------------------------------------------+

+------------------+----------------+---------+---------+-----------+-------------+------------------------+
| Operator         | Estimated Rows | Rows    | DB Hits | Cache H/M | Identifiers | Other                  |
+------------------+----------------+---------+---------+-----------+-------------+------------------------+
| +ProduceResults  |              0 |  368110 |       0 |       0/0 | n, y        |                        |
| |                +----------------+---------+---------+-----------+-------------+------------------------+
| +Filter          |              0 |  368110 | 2026946 |       0/0 | n, y        | n.p_value == y.p_value |
| |                +----------------+---------+---------+-----------+-------------+------------------------+
| +Apply           |             18 | 1013473 |       0 |       0/0 | n, y        |                        |
| |\               +----------------+---------+---------+-----------+-------------+------------------------+
| | +NodeIndexSeek |        2695944 | 1013473 | 1013801 |       0/0 | n, y        | :rdf_edge(o_value)     |
| |                +----------------+---------+---------+-----------+-------------+------------------------+
| +NodeIndexSeek   |              7 |     164 |     165 |       0/0 | n           | :rdf_edge(s_value)     |
+------------------+----------------+---------+---------+-----------+-------------+------------------------+

neo4j> profile  match (n:rdf_edge {s_value: "http://fr.dbpedia.org/resource/Antibes"}),(y:rdf_edge) where n.p_value=y.p_value and n.o_value=y.o_value return count(n);
+-------------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time   | DbHits | Rows |
+-------------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.2" | "COST"  | "INTERPRETED" | 445105 | 0      | 1    |
+-------------------------------------------------------------------------------------------+

+-------------------+----------------+---------+---------+-----------+-------------+------------------------+
| Operator          | Estimated Rows | Rows    | DB Hits | Cache H/M | Identifiers | Other                  |
+-------------------+----------------+---------+---------+-----------+-------------+------------------------+
| +ProduceResults   |              0 |       1 |       0 |       0/0 | count(n)    |                        |
| |                 +----------------+---------+---------+-----------+-------------+------------------------+
| +EagerAggregation |              0 |       1 |       0 |       0/0 | count(n)    |                        |
| |                 +----------------+---------+---------+-----------+-------------+------------------------+
| +Filter           |              0 |  368110 | 2026946 |       0/0 | n, y        | n.p_value == y.p_value |
| |                 +----------------+---------+---------+-----------+-------------+------------------------+
| +Apply            |             18 | 1013473 |       0 |       0/0 | n, y        |                        |
| |\                +----------------+---------+---------+-----------+-------------+------------------------+
| | +NodeIndexSeek  |        2695944 | 1013473 | 1013801 |       0/0 | n, y        | :rdf_edge(o_value)     |
| |                 +----------------+---------+---------+-----------+-------------+------------------------+
| +NodeIndexSeek    |              7 |     164 |     165 |       0/0 | n           | :rdf_edge(s_value)     |
+-------------------+----------------+---------+---------+-----------+-------------+------------------------+

+----------+
| count(n) |
+----------+
| 368110   |
+----------+

1 row available after 445102 ms, consumed after another 3 ms
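
The plans above read as a nested lookup: an index seek on s_value finds the 164 nodes for Antibes, an index seek on o_value is run for each of them (1013473 candidate pairs in total), and the filter on p_value keeps 368110 pairs. The same strategy can be sketched in memory. This is an illustration only, with a hypothetical `JoinSketch` class and toy data, not how Neo4j implements these operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinSketch {

    /** One RDF triple stored as a node, mirroring the s_value/p_value/o_value properties of rdf_edge. */
    public static final class Edge {
        final String s, p, o;

        public Edge(String s, String p, String o) {
            this.s = s;
            this.p = p;
            this.o = o;
        }
    }

    /** Counts pairs (n, y) where n has the given subject and y shares n's predicate and object. */
    public static long countJoin(List<Edge> edges, String subject) {
        // Stand-ins for the indexes on s_value and o_value.
        Map<String, List<Edge>> bySubject = new HashMap<>();
        Map<String, List<Edge>> byObject = new HashMap<>();
        for (Edge e : edges) {
            bySubject.computeIfAbsent(e.s, k -> new ArrayList<>()).add(e);
            byObject.computeIfAbsent(e.o, k -> new ArrayList<>()).add(e);
        }

        long count = 0;
        // Outer seek on s_value.
        for (Edge n : bySubject.getOrDefault(subject, Collections.<Edge>emptyList())) {
            // Inner seek on o_value, repeated for every n.
            for (Edge y : byObject.getOrDefault(n.o, Collections.<Edge>emptyList())) {
                // Filter on p_value; n itself is a valid y, as in the Cypher query.
                if (n.p.equals(y.p)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Toy data (hypothetical), far smaller than the DBpedia dump.
        List<Edge> edges = Arrays.asList(
                new Edge("Antibes", "type", "City"),
                new Edge("Nice", "type", "City"),
                new Edge("Antibes", "country", "France"),
                new Edge("Paris", "capitalOf", "France"),
                new Edge("Nice", "country", "France"));
        System.out.println(countJoin(edges, "Antibes")); // prints 4
    }
}
```

In this form, the cost is driven by the number of candidates returned by the inner seek on o_value, which matches the profile: the filter on p_value is applied only after all 1013473 candidate pairs have been produced.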