feat: pull OpenLinkSW fork changes #86

Merged on Jan 4, 2025 (63 commits)

Commits
6f6b5b1
feat: pull OpenLinkSW fork changes
berezovskyi Dec 26, 2024
589bd39
fix: allow coherence to be added gradually to existing instances
berezovskyi Dec 26, 2024
02e9cd7
fix: Avro quirks
berezovskyi Dec 26, 2024
55466da
fix: Avro quirks 2
berezovskyi Dec 27, 2024
448e5d9
fix: Avro quirks 2
berezovskyi Dec 27, 2024
cfd15cd
fix: undo unintended change
berezovskyi Dec 29, 2024
5d8adef
fix: more fixes from the fork + make the AEvol code run on IV recalc
berezovskyi Dec 29, 2024
f007602
fix: typo
berezovskyi Dec 29, 2024
6575028
chore: switch to GOOGLE code style
berezovskyi Dec 29, 2024
b74c8d1
chore: extra logging
berezovskyi Dec 29, 2024
54d717c
fix: use either amonths or index.availability for the index page avai…
berezovskyi Dec 29, 2024
b6ff40b
use index.availability, ignore amonths
berezovskyi Dec 29, 2024
044240a
fix: pre-process index.availability
berezovskyi Dec 29, 2024
c08efd5
fix: type
berezovskyi Dec 29, 2024
85dcdd1
fix: Index page updates
berezovskyi Dec 29, 2024
a9feac7
chore: extra logging
berezovskyi Dec 31, 2024
e364bb5
feat: user agent and timeouts
berezovskyi Dec 31, 2024
3dbf7f5
feat: user agent and timeouts
berezovskyi Dec 31, 2024
1c90f4f
feat: user agent and timeouts
berezovskyi Dec 31, 2024
96f05e4
feat: improve scheduling
berezovskyi Dec 31, 2024
5a74dd3
feat: improve scheduling
berezovskyi Dec 31, 2024
1a3c06b
fix: scheduling
berezovskyi Dec 31, 2024
f33ba21
feat: new charts on the homepage [WIP]
berezovskyi Dec 31, 2024
f0e524a
fix: show new charts on the homepage
berezovskyi Dec 31, 2024
e337d87
refactor: schedules, user agents, and log strings
berezovskyi Dec 31, 2024
144a5ea
fix: API discovery template
berezovskyi Dec 31, 2024
7aa98e6
docs: update footer
berezovskyi Dec 31, 2024
82161a4
refactor: better fault handling
berezovskyi Dec 31, 2024
22b649b
refactor: better fault handling
berezovskyi Dec 31, 2024
bb4e4be
refactor: better fault handling
berezovskyi Dec 31, 2024
645bdfd
refactor: better fault handling
berezovskyi Dec 31, 2024
ad69d7c
refactor: better fault handling
berezovskyi Dec 31, 2024
cb4e405
refactor: better fault handling
berezovskyi Dec 31, 2024
a25affc
refactor: better fault handling
berezovskyi Dec 31, 2024
c750f18
refactor: better fault handling
berezovskyi Dec 31, 2024
36ff8c4
refactor: better fault handling
berezovskyi Dec 31, 2024
5a62cb2
refactor: better fault handling
berezovskyi Dec 31, 2024
4e98c89
refactor: better fault handling
berezovskyi Dec 31, 2024
d03f273
refactor: better fault handling
berezovskyi Dec 31, 2024
eec7020
refactor: better fault handling
berezovskyi Dec 31, 2024
539b07a
refactor: better fault handling
berezovskyi Dec 31, 2024
b79b763
refactor: better fault handling
berezovskyi Dec 31, 2024
371c968
refactor: better fault handling
berezovskyi Dec 31, 2024
4af8ffe
refactor: better fault handling
berezovskyi Dec 31, 2024
2ce1597
refactor: better fault handling
berezovskyi Dec 31, 2024
b5aefe0
refactor: better fault handling
berezovskyi Dec 31, 2024
81e4f3b
feat: calculate median threshold
berezovskyi Dec 31, 2024
37f55ea
refactor: better fault handling
berezovskyi Dec 31, 2024
4a526ff
refactor: better fault handling
berezovskyi Dec 31, 2024
6f69761
refactor: better fault handling
berezovskyi Dec 31, 2024
b6c4404
fix: safe access
berezovskyi Dec 31, 2024
e19556d
fix: safe access
berezovskyi Dec 31, 2024
63a884e
feat: calculate median threshold
berezovskyi Jan 1, 2025
1bc9d2d
fix: median threshold
berezovskyi Jan 1, 2025
ef03683
fix: index/avail calc
berezovskyi Jan 1, 2025
b2d8d21
wip: use index.avail by default
berezovskyi Jan 1, 2025
fcc133e
refactor: better fault handling
berezovskyi Jan 1, 2025
842ce81
refactor: better fault handling
berezovskyi Jan 1, 2025
eb27774
refactor: better fault handling
berezovskyi Jan 1, 2025
7e65653
feat: a script to add endpoints from VOID/SD metadata
berezovskyi Jan 1, 2025
26d3142
fix: timestamps in the views
berezovskyi Jan 3, 2025
7965e7b
Update backend/src/main/java/sparqles/analytics/PAnalyser.java
berezovskyi Jan 4, 2025
a586f9c
Update backend/src/main/java/sparqles/analytics/IndexViewAnalytics.java
berezovskyi Jan 4, 2025

1 change: 1 addition & 0 deletions backend/src/main/avro/AResult.avsc
@@ -1,5 +1,6 @@
{"namespace": "sparqles.avro.availability",
"type": "record",
"import" : "EndpointResult.avsc",
"name": "AResult",
"fields": [
{"name": "endpointResult", "type": "sparqles.avro.EndpointResult"},
28 changes: 28 additions & 0 deletions backend/src/main/avro/CResult.avsc
@@ -0,0 +1,28 @@
{
"namespace": "sparqles.avro.calculation",
"type": "record",
"import" : "EndpointResult.avsc",
"name": "CResult",
"fields": [
{"name": "endpointResult", "type": "sparqles.avro.EndpointResult"},
{"name": "triples", "type": "long"},
{"name": "entities", "type": "long"},
{"name": "classes", "type": "long"},
{"name": "properties", "type": "long"},
{"name": "distinctSubjects", "type": "long"},
{"name": "distinctObjects", "type": "long"},
{"name": "exampleResources", "type":
{"type": "array", "items":
{
"name": "uri", "type": "string"
}
}
},
{"name": "VoID", "type": "string"},
{"name": "VoIDPart", "type": "boolean"},
{"name": "SD", "type": "string"},
{"name": "SDPart", "type": "boolean"},
{"name": "coherence", "type": "double"},
{"name": "RS", "type": "double"}
]
}
15 changes: 15 additions & 0 deletions backend/src/main/avro/CalculationView.avsc
@@ -0,0 +1,15 @@
{
"namespace": "sparqles.avro.analytics",
"type": "record",
"name": "CalculationView",
"fields": [
{"name": "endpoint", "type": "sparqles.avro.Endpoint"},
{"name": "VoID", "type": "boolean"},
{"name": "VoIDPart", "type": "boolean"},
{"name": "SD", "type": "boolean"},
{"name": "SDPart", "type": "boolean"},
{"name": "coherence", "type": "double"},
{"name": "RS", "type": "double"},
{"name": "lastUpdate", "type": "long"}
]
}
1 change: 1 addition & 0 deletions backend/src/main/avro/DResult.avsc
@@ -1,6 +1,7 @@
{
"namespace": "sparqles.avro.discovery",
"type": "record",
"import" : "EndpointResult.avsc",
"name": "DResult",
"fields": [
{"name": "endpointResult", "type": "sparqles.avro.EndpointResult"},
28 changes: 28 additions & 0 deletions backend/src/main/avro/EPView.avsc
@@ -124,6 +124,34 @@
{"name": "SDDescription" , "type": {"type": "array", "items": "array", "items" : "sparqles.avro.analytics.EPViewDiscoverabilityData"}}
]
}
},
{"name": "calculation", "type":
{
"namespace":"sparqles.avro.analytics",
"type":"record",
"name":"EPViewCalculation",
"fields":[
{"name": "triples", "type": "long"},
{"name": "entities", "type": "long"},
{"name": "classes", "type": "long"},
{"name": "properties", "type": "long"},
{"name": "distinctSubjects", "type": "long"},
{"name": "distinctObjects", "type": "long"},
{"name": "exampleResources", "type":
{"type": "array", "items":
{
"name": "uri", "type": "string"
}
}
},
{"name": "VoID", "type": "string"},
{"name": "VoIDPart", "type": "boolean"},
{"name": "SD", "type": "string"},
{"name": "SDPart", "type": "boolean"},
{"name": "coherence", "type": "double"},
{"name": "RS", "type": "double"}
]
}
}
]
}
2 changes: 2 additions & 0 deletions backend/src/main/avro/FResult.avsc
@@ -1,6 +1,8 @@
{
"namespace": "sparqles.avro.features",
"type": "record",
"import" : "EndpointResult.avsc",
"import" : "Run.avsc",
"name": "FResult",
"fields": [
{"name": "endpointResult", "type": "sparqles.avro.EndpointResult"},
48 changes: 45 additions & 3 deletions backend/src/main/avro/Index.avsc
@@ -91,15 +91,14 @@
]
}
}
}
}
]
}
}
}
]
}
},

{"name": "discoverability", "type":
{
"namespace":"sparqles.avro.analytics",
@@ -137,7 +136,50 @@
]
}

},
{"name": "calculation", "type":
{
"namespace":"sparqles.avro.analytics",
"type":"record",
"name":"IndexViewCalculation",
"fields":[
{"name": "coherences" , "type":
{"type": "array", "items":
{
"namespace": "sparqles.avro.analytics",
"name": "IndexViewCalculationData",
"type": "record",
"fields" : [
{ "name": "key", "type": "string"},
{ "name": "values" , "type":
{ "type": "array", "items":
{
"namespace": "sparqles.avro.analytics",
"name": "IndexViewCalculationDataValues",
"type": "record",
"fields" : [
{ "name": "label", "type": "string"},
{ "name": "value", "type": "double"}
]
}
}
}
]
}
}
},
{"name": "rss" , "type":
{"type": "array", "items": "sparqles.avro.analytics.IndexViewCalculationData"}
},
{ "name": "VoID", "type": "double"},
{ "name": "VoIDPart", "type": "double"},
{ "name": "SD", "type": "double"},
{ "name": "SDPart", "type": "double"},
{ "name": "Coherence", "type": "double"},
{ "name": "RS", "type": "double"}
]
}

}
]
}

14 changes: 3 additions & 11 deletions backend/src/main/avro/PResult.avsc
@@ -1,6 +1,8 @@
{
"namespace": "sparqles.avro.performance",
"type": "record",
"import" : "EndpointResult.avsc",
"import" : "Run.avsc",
"name": "PResult",
"fields": [
{"name": "endpointResult", "type": "sparqles.avro.EndpointResult"},
@@ -10,17 +12,7 @@
"type": "record",
"fields" : [
{ "name": "query", "type": "string"},
{ "name": "cold", "type":
{"type":"record","name":"Run","namespace":"sparqles.avro.performance",
"fields":[
{"name": "frestout", "type": "long"},
{"name": "solutions", "type": "int"},
{"name": "inittime", "type": "long"},
{"name": "exectime", "type": "long"},
{"name": "closetime", "type": "long"},
{"name": "Exception", "type": ["string", "null"]},
{"name": "exectout", "type": "long"}
]}},
{ "name": "cold", "type": "sparqles.avro.performance.Run"},
{ "name": "warm", "type": "sparqles.avro.performance.Run"}
]}
}
14 changes: 14 additions & 0 deletions backend/src/main/avro/Run.avsc
@@ -0,0 +1,14 @@
{
"namespace":"sparqles.avro.performance",
"type":"record",
"name":"Run",
"fields":[
{"name": "frestout", "type": "long"},
{"name": "solutions", "type": "int"},
{"name": "inittime", "type": "long"},
{"name": "exectime", "type": "long"},
{"name": "closetime", "type": "long"},
{"name": "Exception", "type": ["string", "null"]},
{"name": "exectout", "type": "long"}
]
}
1 change: 1 addition & 0 deletions backend/src/main/avro/Schedule.avsc
@@ -8,6 +8,7 @@
{"name": "FTask", "type": ["string", "null"]},
{"name": "PTask", "type": ["string", "null"]},
{"name": "DTask", "type": ["string", "null"]},
{"name": "CTask", "type": ["string", "null"]},
{"name": "ITask", "type": ["string", "null"]},
{"name": "ETask", "type": ["string", "null"]}
]
138 changes: 71 additions & 67 deletions backend/src/main/config/log4j.properties
@@ -6,17 +6,73 @@ log4j.rootLogger=INFO, stdout, stderr
#log4j.logger.sparqles.core.features=DEBUG, flog
#log4j.logger.sparqles.core.performance=DEBUG, plog
#log4j.logger.sparqles.utils.ExceptionHandler=INFO, exlog
#DISABLE certain packages
log4j.logger.org.apache.http=WARN
log4j.logger.org.apache.commons.httpclient.params.DefaultHttpParams=INFO
log4j.logger.com.hp.hpl.jena.sparql=WARN
log4j.logger.org.apache.jena=WARN

# Direct log messages to a log file
log4j.appender.file=org.apache.log4j.DailyRollingFileAppender
log4j.appender.file.DatePattern = '.'yyyy-MM-dd
log4j.appender.file.Append = true
log4j.appender.file.Threshold=INFO
log4j.appender.file.File=logs/main.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n

# Log all information for ATask log messages to a log file
log4j.appender.alog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.alog.DatePattern = '.'yyyy-MM-dd
log4j.appender.alog.Append = true
log4j.appender.alog.Threshold=INFO
log4j.appender.alog.File=logs/availability.log
log4j.appender.alog.layout=org.apache.log4j.PatternLayout
log4j.appender.alog.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n


# Log all information for ATask log messages to exception log file
log4j.appender.exlog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.exlog.DatePattern = '.'yyyy-MM-dd
log4j.appender.exlog.Append = true
log4j.appender.exlog.Threshold=INFO
log4j.appender.exlog.File=logs/exception.log
log4j.appender.exlog.layout=org.apache.log4j.PatternLayout
log4j.appender.exlog.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n



# Log all information for ATask log messages to a log file
log4j.appender.flog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.flog.DatePattern = '.'yyyy-MM-dd
log4j.appender.flog.Append = true
log4j.appender.flog.Threshold=INFO
log4j.appender.flog.File=logs/interoperability.log
log4j.appender.flog.layout=org.apache.log4j.PatternLayout
log4j.appender.flog.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n

# Log all information for ATask log messages to a log file
log4j.appender.dlog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.dlog.DatePattern = '.'yyyy-MM-dd
log4j.appender.dlog.Append = true
log4j.appender.dlog.Threshold=INFO
log4j.appender.dlog.File=logs/discoverability.log
log4j.appender.dlog.layout=org.apache.log4j.PatternLayout
log4j.appender.dlog.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n

# Log all information for ATask log messages to a log file
log4j.appender.plog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.plog.DatePattern = '.'yyyy-MM-dd
log4j.appender.plog.Append = true
log4j.appender.plog.Threshold=INFO
log4j.appender.plog.File=logs/performance.log
log4j.appender.plog.layout=org.apache.log4j.PatternLayout
log4j.appender.plog.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n
Comment on lines +19 to +65

🛠️ Refactor suggestion

Reduce configuration duplication and standardize naming

  1. There's significant duplication in appender configurations. All appenders share identical settings except for the file path.
  2. The naming is inconsistent: the logger uses features but the log file is named interoperability.log.

Consider:

  1. Using a property placeholder system to reduce duplication:
# Common settings
log4j.appender.common.DatePattern='.'yyyy-MM-dd
log4j.appender.common.Append=true
log4j.appender.common.Threshold=INFO
log4j.appender.common.layout=org.apache.log4j.PatternLayout
log4j.appender.common.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n

# Specific appenders inherit common settings
log4j.appender.alog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.alog.File=logs/availability.log
# ... inherit common settings ...
  2. Standardize the naming convention between loggers and file names:
-log4j.logger.sparqles.core.features=DEBUG, flog 
+log4j.logger.sparqles.core.interoperability=DEBUG, ilog 
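
One caveat on the first suggestion: log4j 1.x properties files do not appear to offer a way for one appender to inherit options from another, so a `log4j.appender.common.*` block would most likely not be picked up by `alog` and the other appenders. What `PropertyConfigurator` does document is `${key}` variable substitution in option values, resolved from system properties first and then from the configuration file itself. Below is a minimal sketch of the same idea using substitution. The `sparqles.log.*` key names are hypothetical (they are not part of this PR), and only the `alog` and `flog` appenders are shown:

```properties
# Shared values, defined once. These key names are hypothetical.
sparqles.log.datePattern='.'yyyy-MM-dd
sparqles.log.pattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n

# Each appender still declares its own options, but the long values
# are referenced through ${...} instead of being repeated.
log4j.appender.alog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.alog.DatePattern=${sparqles.log.datePattern}
log4j.appender.alog.Append=true
log4j.appender.alog.Threshold=INFO
log4j.appender.alog.File=logs/availability.log
log4j.appender.alog.layout=org.apache.log4j.PatternLayout
log4j.appender.alog.layout.ConversionPattern=${sparqles.log.pattern}

log4j.appender.flog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.flog.DatePattern=${sparqles.log.datePattern}
log4j.appender.flog.Append=true
log4j.appender.flog.Threshold=INFO
log4j.appender.flog.File=logs/interoperability.log
log4j.appender.flog.layout=org.apache.log4j.PatternLayout
log4j.appender.flog.layout.ConversionPattern=${sparqles.log.pattern}
```

This removes only part of the duplication: each appender keeps its own `Append`, `Threshold`, and `layout` lines, but the date and conversion patterns can then be changed in one place.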




# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
#log4j.appender.stdout.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %5p [%c{1}:%M:%L] - %m%n
#log4j.appender.stdout.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n

log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.Target=System.err
log4j.appender.stderr.Threshold=ERROR
@@ -32,65 +88,13 @@ log4j.appender.stdout.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} [%5p] %c:%L
log4j.appender.stderr.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} [%5p] %c:%L - %m%n


# Don't want to log to file when in Docker
## Direct log messages to a log file
#log4j.appender.file=org.apache.log4j.DailyRollingFileAppender
#log4j.appender.file.DatePattern = '.'yyyy-MM-dd
#log4j.appender.file.Append = true
#log4j.appender.file.Threshold=INFO
#log4j.appender.file.File=logs/main.log
#log4j.appender.file.layout=org.apache.log4j.PatternLayout
#log4j.appender.file.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n
#
## Log all information for ATask log messages to a log file
#log4j.appender.alog=org.apache.log4j.DailyRollingFileAppender
#log4j.appender.alog.DatePattern = '.'yyyy-MM-dd
#log4j.appender.alog.Append = true
#log4j.appender.alog.Threshold=INFO
#log4j.appender.alog.File=logs/availability.log
#log4j.appender.alog.layout=org.apache.log4j.PatternLayout
#log4j.appender.alog.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n
#
#
## Log all information for ATask log messages to exception log file
#log4j.appender.exlog=org.apache.log4j.DailyRollingFileAppender
#log4j.appender.exlog.DatePattern = '.'yyyy-MM-dd
#log4j.appender.exlog.Append = true
#log4j.appender.exlog.Threshold=INFO
#log4j.appender.exlog.File=logs/exception.log
#log4j.appender.exlog.layout=org.apache.log4j.PatternLayout
#log4j.appender.exlog.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n
#
#
#
## Log all information for ATask log messages to a log file
#log4j.appender.flog=org.apache.log4j.DailyRollingFileAppender
#log4j.appender.flog.DatePattern = '.'yyyy-MM-dd
#log4j.appender.flog.Append = true
#log4j.appender.flog.Threshold=INFO
#log4j.appender.flog.File=logs/interoperability.log
#log4j.appender.flog.layout=org.apache.log4j.PatternLayout
#log4j.appender.flog.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n
#
## Log all information for ATask log messages to a log file
#log4j.appender.dlog=org.apache.log4j.DailyRollingFileAppender
#log4j.appender.dlog.DatePattern = '.'yyyy-MM-dd
#log4j.appender.dlog.Append = true
#log4j.appender.dlog.Threshold=INFO
#log4j.appender.dlog.File=logs/discoverability.log
#log4j.appender.dlog.layout=org.apache.log4j.PatternLayout
#log4j.appender.dlog.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n
#
## Log all information for ATask log messages to a log file
#log4j.appender.plog=org.apache.log4j.DailyRollingFileAppender
#log4j.appender.plog.DatePattern = '.'yyyy-MM-dd
#log4j.appender.plog.Append = true
#log4j.appender.plog.Threshold=INFO
#log4j.appender.plog.File=logs/performance.log
#log4j.appender.plog.layout=org.apache.log4j.PatternLayout
#log4j.appender.plog.layout.ConversionPattern=%d{dd-MM-yy HH:mm:ss} %15.15c{1}:%-3.3L %5p - %m%n
#
#log4j.appender.HTML=org.apache.log4j.FileAppender
#log4j.appender.HTML.File=logs/main.html
#log4j.appender.HTML.layout=org.apache.log4j.HTMLLayout
#log4j.appender.HTML.Threshold=DEBUG
log4j.appender.HTML=org.apache.log4j.FileAppender
log4j.appender.HTML.File=logs/main.html
log4j.appender.HTML.layout=org.apache.log4j.HTMLLayout
log4j.appender.HTML.Threshold=DEBUG

#DISABLE certain packages
log4j.logger.org.apache.http=WARN
log4j.logger.org.apache.commons.httpclient.params.DefaultHttpParams=INFO
log4j.logger.com.hp.hpl.jena.sparql=WARN
log4j.logger.org.apache.jena=WARN
@@ -93,6 +93,7 @@ public NoRobotClient(String userAgent) {
// throw new NoRobotException("Problem while parsing "+txtUrl, nre);
// }
// }

public void parse(String txt, URL baseUrl) throws NoRobotException {
this.baseUrl = baseUrl;
parseText(txt);