feat: pull OpenLinkSW fork changes #86

Merged 63 commits on Jan 4, 2025
63 commits
6f6b5b1
feat: pull OpenLinkSW fork changes
berezovskyi Dec 26, 2024
589bd39
fix: allow coherence to be added gradually to existing instances
berezovskyi Dec 26, 2024
02e9cd7
fix: Avro quirks
berezovskyi Dec 26, 2024
55466da
fix: Avro quirks 2
berezovskyi Dec 27, 2024
448e5d9
fix: Avro quirks 2
berezovskyi Dec 27, 2024
cfd15cd
fix: undo unintended change
berezovskyi Dec 29, 2024
5d8adef
fix: more fixes from the fork + make the AEvol code run on IV recalc
berezovskyi Dec 29, 2024
f007602
fix: typo
berezovskyi Dec 29, 2024
6575028
chore: switch to GOOGLE code style
berezovskyi Dec 29, 2024
b74c8d1
chore: extra logging
berezovskyi Dec 29, 2024
54d717c
fix: use either amonths or index.availability for the index page avai…
berezovskyi Dec 29, 2024
b6ff40b
use index.availability, ignore amonths
berezovskyi Dec 29, 2024
044240a
fix: pre-process index.availability
berezovskyi Dec 29, 2024
c08efd5
fix: type
berezovskyi Dec 29, 2024
85dcdd1
fix: Index page updates
berezovskyi Dec 29, 2024
a9feac7
chore: extra logging
berezovskyi Dec 31, 2024
e364bb5
feat: user agent and timeouts
berezovskyi Dec 31, 2024
3dbf7f5
feat: user agent and timeouts
berezovskyi Dec 31, 2024
1c90f4f
feat: user agent and timeouts
berezovskyi Dec 31, 2024
96f05e4
feat: improve scheduling
berezovskyi Dec 31, 2024
5a74dd3
feat: improve scheduling
berezovskyi Dec 31, 2024
1a3c06b
fix: scheduling
berezovskyi Dec 31, 2024
f33ba21
feat: new charts on the homepage [WIP]
berezovskyi Dec 31, 2024
f0e524a
fix: show new charts on the homepage
berezovskyi Dec 31, 2024
e337d87
refactor: schedules, user agents, and log strings
berezovskyi Dec 31, 2024
144a5ea
fix: API discovery template
berezovskyi Dec 31, 2024
7aa98e6
docs: update footer
berezovskyi Dec 31, 2024
82161a4
refactor: better fault handling
berezovskyi Dec 31, 2024
22b649b
refactor: better fault handling
berezovskyi Dec 31, 2024
bb4e4be
refactor: better fault handling
berezovskyi Dec 31, 2024
645bdfd
refactor: better fault handling
berezovskyi Dec 31, 2024
ad69d7c
refactor: better fault handling
berezovskyi Dec 31, 2024
cb4e405
refactor: better fault handling
berezovskyi Dec 31, 2024
a25affc
refactor: better fault handling
berezovskyi Dec 31, 2024
c750f18
refactor: better fault handling
berezovskyi Dec 31, 2024
36ff8c4
refactor: better fault handling
berezovskyi Dec 31, 2024
5a62cb2
refactor: better fault handling
berezovskyi Dec 31, 2024
4e98c89
refactor: better fault handling
berezovskyi Dec 31, 2024
d03f273
refactor: better fault handling
berezovskyi Dec 31, 2024
eec7020
refactor: better fault handling
berezovskyi Dec 31, 2024
539b07a
refactor: better fault handling
berezovskyi Dec 31, 2024
b79b763
refactor: better fault handling
berezovskyi Dec 31, 2024
371c968
refactor: better fault handling
berezovskyi Dec 31, 2024
4af8ffe
refactor: better fault handling
berezovskyi Dec 31, 2024
2ce1597
refactor: better fault handling
berezovskyi Dec 31, 2024
b5aefe0
refactor: better fault handling
berezovskyi Dec 31, 2024
81e4f3b
feat: calculate median threshold
berezovskyi Dec 31, 2024
37f55ea
refactor: better fault handling
berezovskyi Dec 31, 2024
4a526ff
refactor: better fault handling
berezovskyi Dec 31, 2024
6f69761
refactor: better fault handling
berezovskyi Dec 31, 2024
b6c4404
fix: safe access
berezovskyi Dec 31, 2024
e19556d
fix: safe access
berezovskyi Dec 31, 2024
63a884e
feat: calculate median threshold
berezovskyi Jan 1, 2025
1bc9d2d
fix: median threshold
berezovskyi Jan 1, 2025
ef03683
fix: index/avail calc
berezovskyi Jan 1, 2025
b2d8d21
wip: use index.avail by default
berezovskyi Jan 1, 2025
fcc133e
refactor: better fault handling
berezovskyi Jan 1, 2025
842ce81
refactor: better fault handling
berezovskyi Jan 1, 2025
eb27774
refactor: better fault handling
berezovskyi Jan 1, 2025
7e65653
feat: a script to add endpoints from VOID/SD metadata
berezovskyi Jan 1, 2025
26d3142
fix: timestamps in the views
berezovskyi Jan 3, 2025
7965e7b
Update backend/src/main/java/sparqles/analytics/PAnalyser.java
berezovskyi Jan 4, 2025
a586f9c
Update backend/src/main/java/sparqles/analytics/IndexViewAnalytics.java
berezovskyi Jan 4, 2025
2 changes: 1 addition & 1 deletion .editorconfig
@@ -7,4 +7,4 @@ indent_size = 2
max_line_length = 100

[*.java]
-indent_size = 4
+indent_size = 2
1 change: 1 addition & 0 deletions .gitignore
@@ -5,6 +5,7 @@
/data/

*.tmp
+*~

# Node
npm-debug.log
27 changes: 23 additions & 4 deletions backend/pom.xml
@@ -6,7 +6,7 @@
<artifactId>sparqles</artifactId>
<version>0.0.1-SNAPSHOT</version>
<properties>
-<jdkLevel>11</jdkLevel>
+<jdkLevel>17</jdkLevel>
💡 Codebase verification

Critical: MongoDB and Jongo dependencies require updates for Java 17 compatibility

The verification reveals critical compatibility issues:

  • MongoDB Java Driver 2.11.3 is a legacy 2.x release that long predates Java 17 and is not tested against it. The recommended line for Java 17 is 4.x
  • Jongo 1.1 depends on this old MongoDB driver and is itself an old release

Before proceeding with the Java 17 upgrade, you must:

  • Upgrade mongo-java-driver to version 4.x (preferably latest stable)
  • Consider replacing Jongo with the native MongoDB Java driver as Jongo is no longer maintained
  • Test the application thoroughly after these upgrades as they involve significant API changes
🔗 Analysis chain

Verify compatibility with Java 17 upgrade

The upgrade from Java 11 to 17 is significant. While the change is good for leveraging newer Java features, we should verify compatibility with key dependencies:

  • MongoDB Java Driver (2.11.3) is quite old
  • Jongo (1.1) hasn't been updated in years

Also applies to: 58-58

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if any dependencies explicitly declare Java version compatibility
# Look for maven.compiler.target in dependency POMs

echo "Checking Java compatibility declarations in dependencies..."
mvn dependency:tree -Dverbose | grep -B1 "maven.compiler"

Length of output: 280


Script:

#!/bin/bash
# Let's check the dependencies directly from the POM file
echo "Checking dependencies in pom.xml..."
cat backend/pom.xml | grep -A5 "<dependency>"

# Also check for MongoDB driver version specifically
echo -e "\nChecking MongoDB driver version..."
cat backend/pom.xml | grep -A3 "mongo"

# Check for Jongo version
echo -e "\nChecking Jongo version..."
cat backend/pom.xml | grep -A3 "jongo"

Length of output: 3967
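
The dependency check described in this comment can also be approximated without running Maven. The following is a minimal Python sketch, assuming a POM that declares `org.mongodb:mongo-java-driver` 2.11.3 and `org.jongo:jongo` 1.1 as the comment reports. The sample POM string, the `MIN_MAJOR` table, and both function names are illustrative assumptions and not part of this PR.

```python
import xml.etree.ElementTree as ET

# Hypothetical POM fragment; the two versions are the ones quoted in the review comment.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.mongodb</groupId>
      <artifactId>mongo-java-driver</artifactId>
      <version>2.11.3</version>
    </dependency>
    <dependency>
      <groupId>org.jongo</groupId>
      <artifactId>jongo</artifactId>
      <version>1.1</version>
    </dependency>
  </dependencies>
</project>"""

# Maven POM files use this XML namespace.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Assumed policy, taken from the comment's recommendation (driver 4.x or newer).
MIN_MAJOR = {"mongo-java-driver": 4}


def list_dependencies(pom_xml):
    """Return (artifactId, version) pairs for every <dependency> in the POM."""
    root = ET.fromstring(pom_xml)
    deps = []
    for dep in root.iterfind(".//m:dependency", NS):
        artifact = dep.findtext("m:artifactId", default="", namespaces=NS)
        version = dep.findtext("m:version", default="", namespaces=NS)
        deps.append((artifact, version))
    return deps


def outdated(deps):
    """Return artifacts whose major version is below the assumed minimum."""
    flagged = []
    for artifact, version in deps:
        minimum = MIN_MAJOR.get(artifact)
        major = version.split(".")[0]
        # Versions given as properties (for example "${x}") are skipped.
        if minimum is not None and major.isdigit() and int(major) < minimum:
            flagged.append(artifact)
    return flagged
```

Calling `outdated(list_dependencies(SAMPLE_POM))` flags only the MongoDB driver, because no minimum is assumed for Jongo.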

<maven.compiler.release>${jdkLevel}</maven.compiler.release>
<maven.compiler.target>${jdkLevel}</maven.compiler.target>
<maven.compiler.source>${jdkLevel}</maven.compiler.source>
@@ -55,7 +55,7 @@
<version>3.6.0</version>
</requireMavenVersion>
<requireJavaVersion>
-<version>11</version>
+<version>17</version>
</requireJavaVersion>
<bannedDependencies>
<excludes>
@@ -177,9 +177,12 @@
<!-- optional: limit format enforcement to just the files changed by this feature branch -->
<!-- <ratchetFrom>origin/main</ratchetFrom> -->
<java>
+<toggleOffOn>
+<off>Autogenerated by Avro</off>
+</toggleOffOn>
<googleJavaFormat>
<version>1.24.0</version>
-<style>AOSP</style>
+<style>GOOGLE</style>
<reflowLongStrings>true</reflowLongStrings>
<formatJavadoc>true</formatJavadoc>
</googleJavaFormat>
@@ -188,6 +191,23 @@
</java>
</configuration>
</plugin>
+<plugin>
+<groupId>org.apache.avro</groupId>
+<artifactId>avro-maven-plugin</artifactId>
+<version>${avro.version}</version>
+<executions>
+<execution>
+<phase>generate-sources</phase>
+<goals>
+<goal>schema</goal>
+</goals>
+<configuration>
+<sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
+<outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
+</configuration>
+</execution>
+</executions>
+</plugin>
Comment on lines +194 to +210
🛠️ Refactor suggestion

Consider relocating generated sources to target directory

While the Avro plugin configuration is functional, generating sources directly into src/main/java can cause problems:

  1. Generated files may be committed by accident
  2. Regenerated files show up as noise in source control diffs
  3. mvn clean does not remove them, so stale generated classes can survive a clean build

Consider this alternative configuration:

-              <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
+              <outputDirectory>${project.build.directory}/generated-sources/avro</outputDirectory>

Also, add:

<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>build-helper-maven-plugin</artifactId>
    <executions>
        <execution>
            <phase>generate-sources</phase>
            <goals>
                <goal>add-source</goal>
            </goals>
            <configuration>
                <sources>
                    <source>${project.build.directory}/generated-sources/avro</source>
                </sources>
            </configuration>
        </execution>
    </executions>
</plugin>

</plugins>
</build>
<dependencyManagement>
@@ -262,7 +282,6 @@
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
-
</exclusions>
</dependency>
<dependency>
46 changes: 29 additions & 17 deletions backend/run.sh
@@ -15,29 +15,37 @@ export JAVA_OPTS="-XX:MaxRAMPercentage=80"

# while :
# do
-echo "Running SPARQLes full cycle"

+#echo "Running SPARQLes full cycle"
# interop
-echo "Running SPARQLes full cycle [ftask]"
-bin/sparqles $CMDARGS -run ftask
+#echo "Running SPARQLes full cycle [ftask]"
+#bin/sparqles $CMDARGS -run ftask
# # availability
-# echo "Running SPARQLes full cycle [atask]"
-# bin/sparqles $CMDARGS -run atask
+#echo "Running SPARQLes full cycle [atask]"
+#bin/sparqles $CMDARGS -run atask
+# coherence
+#echo "Running SPARQLes full cycle [ctask]"
+#bin/sparqles $CMDARGS -run ctask
# # performance
-echo "Running SPARQLes full cycle [ptask]"
-bin/sparqles $CMDARGS -run ptask
+#echo "Running SPARQLes full cycle [ptask]"
+#bin/sparqles $CMDARGS -run ptask
# # discoverability
-echo "Running SPARQLes full cycle [dtask]"
-bin/sparqles $CMDARGS -run dtask
-# index view
-echo "Running SPARQLes full cycle [iv]"
-bin/sparqles $CMDARGS -iv
+#echo "Running SPARQLes full cycle [dtask]"
+#bin/sparqles $CMDARGS -run dtask
# stats
-echo "Running SPARQLes full cycle [st]"
-bin/sparqles $CMDARGS -st
+#echo "Running SPARQLes full cycle [st]"
+#bin/sparqles $CMDARGS -st

# # recompute
-# echo "Running SPARQLes full cycle [r]"
-# bin/sparqles $CMDARGS -r
+#echo "Running SPARQLes full cycle [r]"
+#bin/sparqles $CMDARGS -r
+#echo "Running SPARQLes - recompute last [rl]"
+#bin/sparqles $CMDARGS -rl
+# index view
+# FIXME: crashes on SPARQLES.recomputeIndexView
+#echo "Running SPARQLes full cycle [iv]"
+#bin/sparqles $CMDARGS -iv

# index from old.datahub.io
# echo "Running SPARQLes full cycle [itask]"
# bin/sparqles $CMDARGS -run itask
@@ -46,8 +54,12 @@
# sleep $DELAY
# done

-echo "${JAVA_OPTS}"
+#echo "Running SPARQLes [reschedule all tasks]"
+#bin/sparqles $CMDARGS -run reschedule

+#echo "${JAVA_OPTS}"

+echo "Running SPARQLes [start service]"
## Fully automatic
JAVA_OPTS="${JAVA_OPTS} " bin/sparqles $CMDARGS --start
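
The steps that run.sh enables or comments out all follow one of two command shapes: `-run <task>` for scheduled tasks and a bare flag for the analytics steps. The Python sketch below only builds those argument lists so the pattern is easier to see. It assumes `bin/sparqles` is invoked from the backend directory as in the script, and `cmdargs` stands in for `$CMDARGS`, whose value is defined outside the lines shown here. The function and constant names are hypothetical.

```python
import shlex

# Task names and flags exactly as they appear in run.sh.
RUN_TASKS = ("ftask", "atask", "ctask", "ptask", "dtask", "itask", "reschedule")
FLAG_STEPS = {"iv": "-iv", "st": "-st", "r": "-r", "rl": "-rl"}


def build_command(step, cmdargs=""):
    """Build the argv list for one SPARQLes step.

    cmdargs is a placeholder for $CMDARGS; its real value is not part of this diff.
    """
    base = ["bin/sparqles", *shlex.split(cmdargs)]
    if step in RUN_TASKS:
        return base + ["-run", step]
    if step in FLAG_STEPS:
        return base + [FLAG_STEPS[step]]
    raise ValueError(f"unknown step: {step}")


def full_cycle(steps, cmdargs=""):
    """Return one argv list per step, in the order given."""
    return [build_command(step, cmdargs) for step in steps]
```

Each returned list could be passed to `subprocess.run`, but the sketch deliberately stops at building the commands.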

1 change: 1 addition & 0 deletions backend/src/main/avro/AResult.avsc
@@ -1,5 +1,6 @@
{"namespace": "sparqles.avro.availability",
"type": "record",
+"import" : "EndpointResult.avsc",
"name": "AResult",
"fields": [
{"name": "endpointResult", "type": "sparqles.avro.EndpointResult"},
28 changes: 28 additions & 0 deletions backend/src/main/avro/CResult.avsc
@@ -0,0 +1,28 @@
{
"namespace": "sparqles.avro.calculation",
"type": "record",
"import" : "EndpointResult.avsc",
"name": "CResult",
"fields": [
{"name": "endpointResult", "type": "sparqles.avro.EndpointResult"},
{"name": "triples", "type": "long"},
{"name": "entities", "type": "long"},
{"name": "classes", "type": "long"},
{"name": "properties", "type": "long"},
{"name": "distinctSubjects", "type": "long"},
{"name": "distinctObjects", "type": "long"},
{"name": "exampleResources", "type":
{"type": "array", "items":
{
"name": "uri", "type": "string"
}
}
},
{"name": "VoID", "type": "string"},
{"name": "VoIDPart", "type": "boolean"},
{"name": "SD", "type": "string"},
{"name": "SDPart", "type": "boolean"},
{"name": "coherence", "type": "double"},
{"name": "RS", "type": "double"}
]
}
15 changes: 15 additions & 0 deletions backend/src/main/avro/CalculationView.avsc
@@ -0,0 +1,15 @@
{
"namespace": "sparqles.avro.analytics",
"type": "record",
"name": "CalculationView",
"fields": [
{"name": "endpoint", "type": "sparqles.avro.Endpoint"},
{"name": "VoID", "type": "boolean"},
{"name": "VoIDPart", "type": "boolean"},
{"name": "SD", "type": "boolean"},
{"name": "SDPart", "type": "boolean"},
{"name": "coherence", "type": "double"},
{"name": "RS", "type": "double"},
{"name": "lastUpdate", "type": "long"}
]
}
1 change: 1 addition & 0 deletions backend/src/main/avro/DResult.avsc
@@ -1,6 +1,7 @@
{
"namespace": "sparqles.avro.discovery",
"type": "record",
+"import" : "EndpointResult.avsc",
"name": "DResult",
"fields": [
{"name": "endpointResult", "type": "sparqles.avro.EndpointResult"},
90 changes: 66 additions & 24 deletions backend/src/main/avro/EPView.avsc
@@ -2,10 +2,10 @@
"namespace": "sparqles.avro.analytics",
"type": "record",
"name": "EPView",
-"fields": [
+"fields": [
{"name": "endpoint", "type": "sparqles.avro.Endpoint"},
-{"name": "availability", "type": {
-"namespace": "sparqles.avro.analytics",
+{"name": "availability", "type": {
+"namespace": "sparqles.avro.analytics",
"name": "EPViewAvailability",
"type": "record",
"fields" : [
@@ -16,15 +16,15 @@
{ "name": "uptimeLast31d", "type": "double"},
{ "name": "uptimeOverall", "type": "double"},
{ "name": "data", "type": {
-"namespace": "sparqles.avro.analytics",
+"namespace": "sparqles.avro.analytics",
"name": "EPViewAvailabilityData",
"type": "record",
"fields" : [
{ "name": "key", "type": "string"},
-{ "name": "values", "type":
-{"type": "array", "items":
+{ "name": "values", "type":
+{"type": "array", "items":
{
-"namespace": "sparqles.avro.analytics",
+"namespace": "sparqles.avro.analytics",
"name": "EPViewAvailabilityDataPoint",
"type": "record",
"fields" : [
@@ -46,19 +46,19 @@
"name":"EPViewPerformance",
"fields":[
{"name": "threshold", "type": "long"},
-{"name": "ask" , "type":
-{"type": "array", "items":
+{"name": "ask" , "type":
+{"type": "array", "items":
{
-"namespace": "sparqles.avro.analytics",
+"namespace": "sparqles.avro.analytics",
"name": "EPViewPerformanceData",
"type": "record",
"fields" : [
{ "name": "key", "type": "string"},
{ "name": "color", "type": "string"},
{ "name": "data" , "type":
-{ "type": "array", "items":
-{
-"namespace": "sparqles.avro.analytics",
+{ "type": "array", "items":
+{
+"namespace": "sparqles.avro.analytics",
"name": "EPViewPerformanceDataValues",
"type": "record",
"fields" : [
@@ -70,7 +70,7 @@
}
}
]
-}
+}
}
},
{"name": "join" , "type": {"type": "array", "items": "array", "items" : "sparqles.avro.analytics.EPViewPerformanceData"}}
@@ -83,18 +83,18 @@
"type":"record",
"name":"EPViewInteroperability",
"fields":[
-{"name": "SPARQL1Features" , "type":
-{"type": "array", "items":
+{"name": "SPARQL1Features" , "type":
+{"type": "array", "items":
{
-"namespace": "sparqles.avro.analytics",
+"namespace": "sparqles.avro.analytics",
"name": "EPViewInteroperabilityData",
"type": "record",
"fields" : [
{ "name": "label", "type": "string"},
{ "name": "value", "type": "boolean"},
{ "name": "exception", "type": ["string", "null"]}
]
-}
+}
}
},
{"name": "SPARQL11Features" , "type": {"type": "array", "items": "array", "items" : "sparqles.avro.analytics.EPViewInteroperabilityData"}}
@@ -107,23 +107,65 @@
"type":"record",
"name":"EPViewDiscoverability",
"fields":[
-{"name": "serverName" , "type" : "string"},
-{"name": "VoIDDescription" , "type":
-{"type": "array", "items":
+{"name": "serverName" , "type" : "string"},
+{"name": "VoIDDescription" , "type":
+{"type": "array", "items":
{
-"namespace": "sparqles.avro.analytics",
+"namespace": "sparqles.avro.analytics",
"name": "EPViewDiscoverabilityData",
"type": "record",
"fields" : [
{ "name": "label", "type": "string"},
{ "name": "value", "type": "boolean"}
]
-}
+}
}
},
{"name": "SDDescription" , "type": {"type": "array", "items": "array", "items" : "sparqles.avro.analytics.EPViewDiscoverabilityData"}}
]
}
+},
+{"name": "calculation", "type": {
+"namespace":"sparqles.avro.analytics",
+"type":"record",
+"name":"EPViewCalculation",
+"fields":[
+{"name": "triples", "type": "long"},
+{"name": "entities", "type": "long"},
+{"name": "classes", "type": "long"},
+{"name": "properties", "type": "long"},
+{"name": "distinctSubjects", "type": "long"},
+{"name": "distinctObjects", "type": "long"},
+{"name": "exampleResources", "type":
+{"type": "array", "items":
+{
+"name": "uri", "type": "string"
+}
+}
+},
+{"name": "VoID", "type": "string"},
+{"name": "VoIDPart", "type": "boolean"},
+{"name": "SD", "type": "string"},
+{"name": "SDPart", "type": "boolean"},
+{"name": "coherence", "type": "double"},
+{"name": "RS", "type": "double"}
+]
+},
+"default": {
+"triples": -1,
+"entities": -1,
+"classes": -1,
+"properties": -1,
+"distinctSubjects": -1,
+"distinctObjects": -1,
+"exampleResources": [],
+"VoID": "n/a",
+"VoIDPart": false,
+"SD": "n/a",
+"SDPart": false,
+"coherence": -1.0,
+"RS": -1.0
+}
}
-]
+]
}
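
The `default` on the new `calculation` field is what lets existing EPView records, written before this field existed, still be read (compare the commit "fix: allow coherence to be added gradually to existing instances"). In Avro, a reader schema field that is missing from the written data takes its declared default. The Python sketch below mimics that behaviour on plain dictionaries rather than using the Avro library. The function names are hypothetical, and treating `-1` as "not computed yet" is an assumption drawn from the default values, not something the PR states.

```python
import copy

# Default for the "calculation" field, copied from the EPView.avsc diff.
CALCULATION_DEFAULT = {
    "triples": -1,
    "entities": -1,
    "classes": -1,
    "properties": -1,
    "distinctSubjects": -1,
    "distinctObjects": -1,
    "exampleResources": [],
    "VoID": "n/a",
    "VoIDPart": False,
    "SD": "n/a",
    "SDPart": False,
    "coherence": -1.0,
    "RS": -1.0,
}


def with_calculation(epview):
    """Return a copy of an EPView document that always has "calculation".

    Documents stored before the field was added receive the schema default.
    """
    doc = copy.deepcopy(epview)
    doc.setdefault("calculation", copy.deepcopy(CALCULATION_DEFAULT))
    return doc


def has_calculation_data(epview):
    """Assumption: a negative triple count marks values that were never computed."""
    return with_calculation(epview)["calculation"]["triples"] >= 0
```

A view layer could call `has_calculation_data` before rendering coherence values, so that endpoints without a completed calculation run are shown as "n/a" instead of `-1.0`.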
2 changes: 2 additions & 0 deletions backend/src/main/avro/FResult.avsc
@@ -1,6 +1,8 @@
{
"namespace": "sparqles.avro.features",
"type": "record",
+"import" : "EndpointResult.avsc",
+"import" : "Run.avsc",
"name": "FResult",
"fields": [
{"name": "endpointResult", "type": "sparqles.avro.EndpointResult"},