RML produces error when processing special character... #37

Open
kabulkurniawan opened this issue Apr 3, 2022 · 1 comment

Comments

@kabulkurniawan

input json

{ "id":"2baad784-c695-459a-9f2f-471aa5258938", "winlog":{ "computer_name":"logcollector", "event_id":"1", "api":"wineventlog", "opcode":"Info", "task":"Process Create (rule: ProcessCreate)", "process":{ "pid":2848, "thread":{ "id":4104 } }, "channel":"Microsoft-Windows-Sysmon/Operational", "provider_guid":"{5770385F-C22A-43E0-BF4C-06F5698FFBD9}", "user":{ "domain":"NT AUTHORITY", "identifier":"S-1-5-18", "type":"User", "name":"SYSTEM" }, "event_data":{ "Image":"C:\\Windows\\System32\\backgroundTaskHost.exe", "Company":"Microsoft Corporation", "Product":"Microsoft? Windows? Operating System", "LogonId":"0x5ae2c", "ProcessGuid":"{849CE8FF-62C8-6249-7747-000000004300}", "Hashes":"SHA1=EA114FDE884B16817A7D348660EA6FA1C217D315", "IntegrityLevel":"AppContainer", "LogonGuid":"{849CE8FF-E959-6248-2CAE-050000000000}", "TerminalSessionId":"2", "FileVersion":"10.0.17134.1 (WinBuild.160101.0800)", "Description":"Background Task Host", "User":"LOGCOLLECTOR\\sepses", "UtcTime":"2022-04-03 09:03:04.054", "RuleName":"-", "ParentProcessGuid":"{849CE8FF-E935-6248-0F00-000000004300}", "ProcessId":"9172", "CurrentDirectory":"C:\\Windows\\SystemApps\\Microsoft.Windows.ContentDeliveryManager_cw5n1h2txyewy\\", "ParentCommandLine":"C:\\WINDOWS\\system32\\svchost.exe -k DcomLaunch -p", "OriginalFileName":"backgroundTaskHost.exe", "ParentImage":"C:\\Windows\\System32\\svchost.exe", "ParentProcessId":"884", "CommandLine":"\"C:\\WINDOWS\\system32\\backgroundTaskHost.exe\" -ServerName:App.AppXmtcan0h2tfbfy7k9kn8hbxb6dmzz1zh0.mca" }, "version":5, "record_id":1824291, "provider_name":"Microsoft-Windows-Sysmon" }, "event":{ "kind":"event", "created":"2022-04-03T09:03:05.566Z", "action":"Process Create (rule: ProcessCreate)", "provider":"Microsoft-Windows-Sysmon", "code":"1", "original":"Process Create:\nRuleName: -\nUtcTime: 2022-04-03 09:03:04.054\nProcessGuid: {849CE8FF-62C8-6249-7747-000000004300}\nProcessId: 9172\nImage: C:\\Windows\\System32\\backgroundTaskHost.exe\nFileVersion: 10.0.17134.1 (WinBuild.160101.0800)\nDescription: Background Task Host\nProduct: Microsoft? Windows? Operating System\nCompany: Microsoft Corporation\nOriginalFileName: backgroundTaskHost.exe\nCommandLine: \"C:\\WINDOWS\\system32\\backgroundTaskHost.exe\" -ServerName:App.AppXmtcan0h2tfbfy7k9kn8hbxb6dmzz1zh0.mca\nCurrentDirectory: C:\\Windows\\SystemApps\\Microsoft.Windows.ContentDeliveryManager_cw5n1h2txyewy\\\nUser: LOGCOLLECTOR\\sepses\nLogonGuid: {849CE8FF-E959-6248-2CAE-050000000000}\nLogonId: 0x5AE2C\nTerminalSessionId: 2\nIntegrityLevel: AppContainer\nHashes: SHA1=EA114FDE884B16817A7D348660EA6FA1C217D315\nParentProcessGuid: {849CE8FF-E935-6248-0F00-000000004300}\nParentProcessId: 884\nParentImage: C:\\Windows\\System32\\svchost.exe\nParentCommandLine: C:\\WINDOWS\\system32\\svchost.exe -k DcomLaunch -p" } }

RML Mapping
`@prefix rr: http://www.w3.org/ns/r2rml# .
@Prefix xsd: http://www.w3.org/2001/XMLSchema# .
@Prefix rml: http://semweb.mmlab.be/ns/rml# .
@Prefix rmls: http://semweb.mmlab.be/ns/rmls# .
@Prefix ql: http://semweb.mmlab.be/ns/ql# .
@Prefix ue: http://w3id.org/sepses/vocab/unix-event# .
@Prefix audit: https://w3id.org/sepses/vocab/log/audit#.
@Prefix cl: https://w3id.org/sepses/vocab/log/core# .
@Prefix winlog: https://w3id.org/sepses/vocab/log/winlog# .
@Prefix rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# .
@Prefix rdfs: http://www.w3.org/2000/01/rdf-schema# .
@base http://example.com/base/ .

a rr:TriplesMap;

rml:logicalSource [
rml:source [
a rmls:TCPSocketStream ;
rmls:hostName "localhost" ;
rmls:port "6662"
];
rml:referenceFormulation ql:JSONPath;
];

rr:subjectMap [ rr:template "http://w3id.org/sepses/resource/auditlog#LogEntry-{id}"];
rr:predicateObjectMap [ rr:predicate rdf:type; rr:object https://w3id.org/sepses/vocab/log/audit#WinlogEvent];
rr:predicateObjectMap [ rr:predicate cl:timestamp; rr:objectMap [ rr:template "{@timestamp}"; rr:datatype xsd:string;]];
rr:predicateObjectMap [ rr:predicate winlog:CommandLine; rr:objectMap [ rr:template "{winlog.event_data.CommandLine}"; rr:datatype rdfs:Literal;]];
.`

Error:
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
    at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:259)
    ... 20 more
Caused by: java.lang.IllegalArgumentException: character to be escaped is missing
    at java.base/java.util.regex.Matcher.appendExpandedReplacement(Matcher.java:1020)
    at java.base/java.util.regex.Matcher.appendReplacement(Matcher.java:998)
    at java.base/java.util.regex.Matcher.replaceAll(Matcher.java:1182)
    at java.base/java.lang.String.replaceAll(String.java:2142)
    at io.rml.framework.engine.Engine$$anonfun$processTemplate$2$$anonfun$apply$2$$anonfun$apply$3.apply(Engine.scala:81)
    at io.rml.framework.engine.Engine$$anonfun$processTemplate$2$$anonfun$apply$2$$anonfun$apply$3.apply(Engine.scala:81)
    at scala.collection.immutable.List.map(List.scala:284)
    at io.rml.framework.engine.Engine$$anonfun$processTemplate$2$$anonfun$apply$2.apply(Engine.scala:81)
    at io.rml.framework.engine.Engine$$anonfun$processTemplate$2$$anonfun$apply$2.apply(Engine.scala:76)
    at scala.Option.foreach(Option.scala:257)
    at io.rml.framework.engine.Engine$$anonfun$processTemplate$2.apply(Engine.scala:76)
    at io.rml.framework.engine.Engine$$anonfun$processTemplate$2.apply(Engine.scala:74)
    at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)

@ghsnd
Contributor

ghsnd commented May 19, 2022

Hi, there are some issues with the mapping rules:

  • Turtle is case sensitive (by default), so @Prefix should be @prefix.
  • RDF resources need to be surrounded by < and >, so e.g. https://w3id.org/sepses/vocab/log/audit#WinlogEvent becomes <https://w3id.org/sepses/vocab/log/audit#WinlogEvent>.
  • A Triples Map is also a resource, so it needs to have a URL.
  • It's best to define an iterator for the source. In this case I think $ works.
  • There is no @timestamp field in this example input, so RMLStreamer won't produce a triple for that.
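
The first two points in the list can be checked mechanically before running a mapping. The following is only a minimal Java sketch under stated assumptions: it is not part of RMLStreamer, the class `MappingLint` and its `lint` method are hypothetical names, and the check is line-based and only looks at `@prefix` and `@base` directives, so it is no substitute for a real Turtle parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of RMLStreamer.
final class MappingLint {
    // Matches a prefix/base directive in any letter case;
    // group 1 is the keyword, group 2 is the rest of the line.
    private static final Pattern DIRECTIVE =
            Pattern.compile("^@(prefix|base)\\b(.*)$", Pattern.CASE_INSENSITIVE);

    static List<String> lint(String mapping) {
        List<String> findings = new ArrayList<>();
        String[] lines = mapping.split("\\R");
        for (int i = 0; i < lines.length; i++) {
            Matcher m = DIRECTIVE.matcher(lines[i].trim());
            if (!m.matches()) {
                continue; // not a directive line, nothing to check here
            }
            String keyword = m.group(1);
            String lower = keyword.toLowerCase(Locale.ROOT);
            if (!keyword.equals(lower)) {
                findings.add("line " + (i + 1) + ": write @" + lower + ", not @" + keyword);
            }
            if (!m.group(2).contains("<")) {
                findings.add("line " + (i + 1) + ": the IRI must be enclosed in < and >");
            }
        }
        return findings;
    }
}
```

Calling `MappingLint.lint(...)` on a line such as `@Prefix xsd: http://www.w3.org/2001/XMLSchema# .` reports both the keyword case and the missing angle brackets. The remaining points (a URL for the Triples Map, the iterator, the missing field) are not covered by this sketch.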

If you fix these, you get a working mapping like this one (please check it and modify it to your needs):

@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix rmls: <http://semweb.mmlab.be/ns/rmls#> .
@prefix ql: <http://semweb.mmlab.be/ns/ql#> .
@prefix ue: <http://w3id.org/sepses/vocab/unix-event#> .
@prefix audit: <https://w3id.org/sepses/vocab/log/audit#>.
@prefix cl: <https://w3id.org/sepses/vocab/log/core#> .
@prefix winlog: <https://w3id.org/sepses/vocab/log/winlog#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@base <http://example.com/base/> .

<#someTM> a rr:TriplesMap;

rml:logicalSource [
rml:source "/home/gerald/projects/rml/bugs/rmlstreamer_github_37/input.json";
rml:referenceFormulation ql:JSONPath;
rml:iterator "$"
];

rr:subjectMap [ rr:template "http://w3id.org/sepses/resource/auditlog#LogEntry-{id}"];
rr:predicateObjectMap [ rr:predicate rdf:type; rr:object <https://w3id.org/sepses/vocab/log/audit#WinlogEvent>];
rr:predicateObjectMap [ rr:predicate cl:timestamp; rr:objectMap [ rr:template "{@timestamp}"; rr:datatype xsd:string;]];
rr:predicateObjectMap [ rr:predicate winlog:CommandLine; rr:objectMap [ rr:template "{winlog.event_data.CommandLine}"; rr:datatype rdfs:Literal;]];
.

It produces these triples:

<http://w3id.org/sepses/resource/auditlog#LogEntry-2baad784-c695-459a-9f2f-471aa5258938> <https://w3id.org/sepses/vocab/log/winlog#CommandLine> "\"C:WINDOWSsystem32backgroundTaskHost.exe\" -ServerName:App.AppXmtcan0h2tfbfy7k9kn8hbxb6dmzz1zh0.mca"^^<http://www.w3.org/2000/01/rdf-schema#Literal> .
<http://w3id.org/sepses/resource/auditlog#LogEntry-2baad784-c695-459a-9f2f-471aa5258938> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/sepses/vocab/log/audit#WinlogEvent> .

Could you try again and let me know if that works?
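
A note on the CommandLine triple above: the backslashes of the input value are missing from the literal. Together with the `String.replaceAll` frames in the reported stack trace, this suggests that the extracted value is used as a regex replacement string, in which Java treats `\` and `$` specially. That reading is an assumption and is not confirmed in this thread. The sketch below reproduces the behaviour with plain JDK calls only; the class and method names are hypothetical and this is not RMLStreamer's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not RMLStreamer code.
final class ReplacementEscapeDemo {
    // Naive expansion: the raw value is passed as the regex replacement string,
    // so '\' escapes the next character and '$' starts a group reference.
    static String expandNaive(String template, String reference, String value) {
        return template.replaceAll(Pattern.quote("{" + reference + "}"), value);
    }

    // Safer expansion: quoteReplacement makes '\' and '$' in the value literal.
    static String expandQuoted(String template, String reference, String value) {
        return template.replaceAll(
                Pattern.quote("{" + reference + "}"),
                Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String reference = "winlog.event_data.CommandLine";
        String template = "{" + reference + "}";
        // The value as it looks after JSON decoding (single backslashes).
        String value = "\"C:\\WINDOWS\\system32\\backgroundTaskHost.exe\"";

        System.out.println(expandNaive(template, reference, value));  // backslashes are dropped
        System.out.println(expandQuoted(template, reference, value)); // backslashes are kept

        try {
            // A value that ends in a backslash leaves nothing to escape.
            expandNaive(template, reference, "C:\\Windows\\SystemApps\\");
        } catch (IllegalArgumentException e) {
            System.out.println("IllegalArgumentException: " + e.getMessage());
        }
    }
}
```

With `Matcher.quoteReplacement`, the value is inserted literally, and a value that ends in a backslash (such as CurrentDirectory in the input above) no longer raises the exception. Which field triggered the reported error is not stated in the issue.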
