Commit
* Update docs
* Downgrade docusaurus dependencies
* Update docs
* Fix docs style
* Update README with VTL 2.1 reference
* Add SDMX reference
* Update VTL SDMX use case
Showing 51 changed files with 5,129 additions and 572 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,239 @@ | ||
--- | ||
slug: /trevas-provenance | ||
title: Trevas - Provenance | ||
authors: [nicolas] | ||
tags: [Trevas, provenance, SDTH] | ||
--- | ||
|
||
import useBaseUrl from '@docusaurus/useBaseUrl'; | ||
import Link from '@theme/Link'; | ||
|
||
### News | ||
|
||
Trevas 1.6.0 introduces the VTL Prov module. | ||
|
||
This module produces lineage metadata from Trevas, based on two RDF ontologies: `PROV-O` and `SDTH`. | ||
|
||
#### SDTH model overview | ||
|
||
```mermaid | ||
classDiagram | ||
class Program["sdth:Program"] { | ||
rdfs:label | ||
} | ||
class ProgramStep["sdth:ProgramStep"] { | ||
rdfs:label | ||
sdth:hasSourceCode | ||
sdth:hasSDTL | ||
} | ||
class VariableInstance["sdth:VariableInstance"] { | ||
rdfs:label | ||
sdth:hasName | ||
} | ||
class DataframeInstance["sdth:DataframeInstance"] { | ||
rdfs:label | ||
sdth:hasName | ||
} | ||
class FileInstance["sdth:FileInstance"] { | ||
rdfs:label | ||
sdth:hasName | ||
} | ||
ProgramStep <-- Program : sdth_hasProgramStep | ||
ProgramStep <-- ProgramStep : sdth_hasProgramStep | ||
ProgramStep --> VariableInstance : sdth_usesVariable | ||
ProgramStep --> VariableInstance : sdth_assignsVariable | ||
ProgramStep --> DataframeInstance : sdth_consumesDataframe | ||
ProgramStep --> DataframeInstance : sdth_producesDataframe | ||
ProgramStep --> FileInstance : sdth_loadsFile | ||
ProgramStep --> FileInstance : sdth_savesFile | ||
DataframeInstance --> VariableInstance : sdth_hasVariableInstance | ||
FileInstance --> VariableInstance : sdth_hasVariableInstance | ||
DataframeInstance --> DataframeInstance : sdth_derivedFrom | ||
DataframeInstance --> DataframeInstance : sdth_elaborationOf | ||
FileInstance --> FileInstance : sdth_derivedFrom | ||
FileInstance --> FileInstance : sdth_elaborationOf | ||
VariableInstance --> VariableInstance : sdth_derivedFrom | ||
VariableInstance --> VariableInstance : sdth_elaborationOf | ||
``` | ||
|
||
#### Adopted model | ||
|
||
The `vtl-prov` module, version 1.6.0, uses the following partial model: | ||
|
||
```mermaid | ||
classDiagram | ||
class Agent { | ||
} | ||
class Program { | ||
rdfs:label | ||
} | ||
class ProgramStep { | ||
rdfs:label | ||
} | ||
class VariableInstance { | ||
rdfs:label | ||
sdth:hasName | ||
} | ||
class DataframeInstance { | ||
rdfs:label | ||
sdth:hasName | ||
} | ||
Agent <|-- Program | ||
ProgramStep <-- Program : sdth_hasProgramStep | ||
ProgramStep --> VariableInstance : sdth_usesVariable | ||
ProgramStep --> VariableInstance : sdth_assignsVariable | ||
ProgramStep --> DataframeInstance : sdth_consumesDataframe | ||
ProgramStep --> DataframeInstance : sdth_producesDataframe | ||
DataframeInstance --> VariableInstance : sdth_hasVariableInstance | ||
DataframeInstance --> DataframeInstance : sdth_wasDerivedFrom | ||
VariableInstance --> VariableInstance : sdth_wasDerivedFrom | ||
``` | ||
|
||
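To make the partial model above easier to picture, here is a minimal Python sketch that mirrors its classes and relations as plain dataclasses. It is an illustration only, not the `vtl-prov` API: the class fields (`consumes`, `produces`, `was_derived_from`, and so on) are hypothetical names chosen to echo the SDTH properties in the diagram.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VariableInstance:
    label: str  # rdfs:label
    name: str  # sdth:hasName
    was_derived_from: List["VariableInstance"] = field(default_factory=list)


@dataclass
class DataframeInstance:
    label: str  # rdfs:label
    name: str  # sdth:hasName
    variables: List[VariableInstance] = field(default_factory=list)  # sdth:hasVariableInstance
    was_derived_from: List["DataframeInstance"] = field(default_factory=list)


@dataclass
class ProgramStep:
    label: str  # rdfs:label
    consumes: List[DataframeInstance] = field(default_factory=list)  # sdth:consumesDataframe
    produces: List[DataframeInstance] = field(default_factory=list)  # sdth:producesDataframe
    uses: List[VariableInstance] = field(default_factory=list)  # sdth:usesVariable
    assigns: List[VariableInstance] = field(default_factory=list)  # sdth:assignsVariable


@dataclass
class Program:  # also typed as an Agent in the adopted model
    label: str  # rdfs:label
    steps: List[ProgramStep] = field(default_factory=list)  # sdth:hasProgramStep


# Hypothetical instances: one step that consumes ds1 and produces ds_sum.
var1 = VariableInstance(label="var1", name="var1")
ds1 = DataframeInstance(label="ds1", name="ds1", variables=[var1])
ds_sum = DataframeInstance(
    label="ds_sum", name="ds_sum", variables=[var1], was_derived_from=[ds1]
)
step1 = ProgramStep(label="Program step 1", consumes=[ds1], produces=[ds_sum])
program = Program(label="My program 1", steps=[step1])
```

Keeping each relation as a list on the owning object matches the direction of the arrows in the diagram, which makes a later serialization to RDF triples a simple traversal.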
Improvements will follow in the coming weeks. | ||
|
||
#### Tools available | ||
|
||
Trevas provenance tools are documented <Link label={"here"} href={useBaseUrl('/developer-guide/spark-mode/data-sources/sdmx')} />. | ||
|
||
#### Example | ||
|
||
##### Business use case | ||
|
||
Two source datasets are transformed to produce transient datasets and one final persistent dataset. | ||
|
||
```mermaid | ||
flowchart TD | ||
OP1{add +} | ||
OP2{multiply *} | ||
OP3{filter} | ||
OP4{create variable} | ||
SC3([3]) | ||
ds1 --> OP1 | ||
ds2 --> OP1 | ||
OP1 --> ds_sum | ||
SC3 --> OP2 | ||
ds_sum --> OP2 | ||
OP2 --> ds_mul | ||
ds_mul --> OP3 | ||
OP3 --> OP4 | ||
OP4 --> ds_res | ||
``` | ||
|
||
##### Inputs | ||
|
||
`ds1` & `ds2` metadata: | ||
|
||
| id | var1 | var2 | | ||
| :--------: | :-----: | :-----: | | ||
| STRING | INTEGER | NUMBER | | ||
| IDENTIFIER | MEASURE | MEASURE | | ||
|
||
##### VTL script | ||
|
||
```vtl | ||
ds_sum := ds1 + ds2; | ||
ds_mul := ds_sum * 3; | ||
ds_res <- ds_mul[filter mod(var1, 2) = 0][calc var_sum := var1 + var2]; | ||
``` | ||
|
||
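To see what the three statements compute, the sketch below emulates them in plain Python on two tiny in-memory datasets. It does not use Trevas: the rows, the identifier values `"A"` and `"B"`, and the helper names are made up for illustration, and the identifier column is called `id` as in the metadata table.

```python
def add(ds_a, ds_b):
    """Emulate `ds_a + ds_b`: match rows on the identifier, then add each measure."""
    b_by_id = {row["id"]: row for row in ds_b}
    return [
        {
            "id": row["id"],
            "var1": row["var1"] + b_by_id[row["id"]]["var1"],
            "var2": row["var2"] + b_by_id[row["id"]]["var2"],
        }
        for row in ds_a
        if row["id"] in b_by_id
    ]


def multiply(ds, factor):
    """Emulate `ds * factor`: multiply every measure by a scalar."""
    return [
        {"id": row["id"], "var1": row["var1"] * factor, "var2": row["var2"] * factor}
        for row in ds
    ]


def filter_even_and_sum(ds):
    """Emulate `[filter mod(var1, 2) = 0][calc var_sum := var1 + var2]`."""
    kept = [row for row in ds if row["var1"] % 2 == 0]
    return [{**row, "var_sum": row["var1"] + row["var2"]} for row in kept]


# Hypothetical sample data matching the metadata: id (STRING), var1 (INTEGER), var2 (NUMBER).
ds1 = [{"id": "A", "var1": 1, "var2": 1.5}, {"id": "B", "var1": 2, "var2": 2.5}]
ds2 = [{"id": "A", "var1": 1, "var2": 0.5}, {"id": "B", "var1": 3, "var2": 1.5}]

ds_sum = add(ds1, ds2)                 # A: var1=2, var2=2.0 ; B: var1=5, var2=4.0
ds_mul = multiply(ds_sum, 3)           # A: var1=6, var2=6.0 ; B: var1=15, var2=12.0
ds_res = filter_even_and_sum(ds_mul)   # only A is kept, with var_sum=12.0
print(ds_res)
```

Only `ds_res` corresponds to a persistent assignment (`<-`) in the script; `ds_sum` and `ds_mul` are the transient intermediate datasets.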
##### RDF model target | ||
|
||
```ttl | ||
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> | ||
PREFIX prov: <http://www.w3.org/ns/prov#> | ||
PREFIX sdth: <http://rdf-vocabulary.ddialliance.org/sdth#> | ||
# --- Program and steps | ||
<http://example.com/program1> a sdth:Program ; | ||
a prov:Agent ; # Agent? Or an activity | ||
rdfs:label "My program 1"@en, "Mon programme 1"@fr ; | ||
sdth:hasProgramStep <http://example.com/program1/program-step1>, | ||
<http://example.com/program1/program-step2>, | ||
<http://example.com/program1/program-step3> . | ||
<http://example.com/program1/program-step1> a sdth:ProgramStep ; | ||
rdfs:label "Program step 1"@en, "Étape 1"@fr ; | ||
sdth:hasSourceCode "ds_sum := ds1 + ds2;" ; | ||
sdth:consumesDataframe <http://example.com/dataset/ds1>, | ||
<http://example.com/dataset/ds2> ; | ||
sdth:producesDataframe <http://example.com/dataset/ds_sum> . | ||
<http://example.com/program1/program-step2> a sdth:ProgramStep ; | ||
rdfs:label "Program step 2"@en, "Étape 2"@fr ; | ||
sdth:hasSourceCode "ds_mul := ds_sum * 3;" ; | ||
sdth:consumesDataframe <http://example.com/dataset/ds_sum> ; | ||
sdth:producesDataframe <http://example.com/dataset/ds_mul> . | ||
<http://example.com/program1/program-step3> a sdth:ProgramStep ; | ||
rdfs:label "Program step 3"@en, "Étape 3"@fr ; | ||
sdth:hasSourceCode "ds_res <- ds_mul[filter mod(var1, 2) = 0][calc var_sum := var1 + var2];" ; | ||
sdth:consumesDataframe <http://example.com/dataset/ds_mul> ; | ||
sdth:producesDataframe <http://example.com/dataset/ds_res> ; | ||
sdth:usesVariable <http://example.com/variable/var1>, | ||
<http://example.com/variable/var2> ; | ||
sdth:assignsVariable <http://example.com/variable/var_sum> . | ||
# --- Variables | ||
# i think here it's not instances but names we refer to... | ||
<http://example.com/variable/id1> a sdth:VariableInstance ; | ||
rdfs:label "id1" . | ||
<http://example.com/variable/var1> a sdth:VariableInstance ; | ||
rdfs:label "var1" . | ||
<http://example.com/variable/var2> a sdth:VariableInstance ; | ||
rdfs:label "var2" . | ||
<http://example.com/variable/var_sum> a sdth:VariableInstance ; | ||
rdfs:label "var_sum" . | ||
# --- Data frames | ||
<http://example.com/dataset/ds1> a sdth:DataframeInstance ; | ||
rdfs:label "ds1" ; | ||
sdth:hasName "ds1" ; | ||
sdth:hasVariableInstance <http://example.com/variable/id1>, | ||
<http://example.com/variable/var1>, | ||
<http://example.com/variable/var2> . | ||
<http://example.com/dataset/ds2> a sdth:DataframeInstance ; | ||
rdfs:label "ds2" ; | ||
sdth:hasName "ds2" ; | ||
sdth:hasVariableInstance <http://example.com/variable/id1>, | ||
<http://example.com/variable/var1>, | ||
<http://example.com/variable/var2> . | ||
<http://example.com/dataset/ds_sum> a sdth:DataframeInstance ; | ||
rdfs:label "ds_sum" ; | ||
sdth:hasName "ds_sum" ; | ||
sdth:wasDerivedFrom <http://example.com/dataset/ds1>, | ||
<http://example.com/dataset/ds2> ; | ||
sdth:hasVariableInstance <http://example.com/variable/id1>, | ||
<http://example.com/variable/var1>, | ||
<http://example.com/variable/var2> . | ||
<http://example.com/dataset/ds_mul> a sdth:DataframeInstance ; | ||
rdfs:label "ds_mul" ; | ||
sdth:hasName "ds_mul" ; | ||
sdth:wasDerivedFrom <http://example.com/dataset/ds_sum> ; | ||
sdth:hasVariableInstance <http://example.com/variable/id1>, | ||
<http://example.com/variable/var1>, | ||
<http://example.com/variable/var2> . | ||
<http://example.com/dataset/ds_res> a sdth:DataframeInstance ; | ||
rdfs:label "ds_res" ; | ||
sdth:hasName "ds_res" ; | ||
sdth:wasDerivedFrom <http://example.com/dataset/ds_mul> ; | ||
sdth:hasVariableInstance <http://example.com/variable/id1>, | ||
<http://example.com/variable/var1>, | ||
<http://example.com/variable/var2>, | ||
<http://example.com/variable/var_sum> . | ||
``` |
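
The dataframe-level part of the target model above (`consumesDataframe`, `producesDataframe`, `wasDerivedFrom`) can be derived mechanically from the script. The sketch below does this with a naive regular-expression scan in plain Python. It is only an illustration of the idea, not how the `vtl-prov` module works (a real implementation would walk the parsed VTL syntax tree), and the function name and short step identifiers are hypothetical.

```python
import re

SCRIPT = """
ds_sum := ds1 + ds2;
ds_mul := ds_sum * 3;
ds_res <- ds_mul[filter mod(var1, 2) = 0][calc var_sum := var1 + var2];
"""


def dataframe_lineage(script, inputs):
    """Return (subject, predicate, object) tuples for dataframe-level lineage."""
    known = set(inputs)  # dataset names seen so far
    triples = []
    statements = [s.strip() for s in script.split(";") if s.strip()]
    for number, statement in enumerate(statements, start=1):
        # Split on the first assignment operator (":=" transient, "<-" persistent).
        target, expression = re.split(r":=|<-", statement, maxsplit=1)
        target = target.strip()
        # Keep only identifiers that name an already known dataset, in order.
        names = re.findall(r"[A-Za-z_]\w*", expression)
        sources = list(dict.fromkeys(n for n in names if n in known))
        step = f"program-step{number}"
        for source in sources:
            triples.append((step, "sdth:consumesDataframe", source))
            triples.append((target, "sdth:wasDerivedFrom", source))
        triples.append((step, "sdth:producesDataframe", target))
        known.add(target)
    return triples


triples = dataframe_lineage(SCRIPT, inputs=["ds1", "ds2"])
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

Variable-level lineage (`usesVariable`, `assignsVariable`) is deliberately left out: telling variables apart from operators and dataset names needs real parsing rather than a token scan.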
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
--- | ||
slug: /trevas-vtl-21 | ||
title: Trevas - VTL 2.1 | ||
authors: [nicolas] | ||
tags: [Trevas, 'VTL 2.1'] | ||
--- | ||
|
||
import useBaseUrl from '@docusaurus/useBaseUrl'; | ||
import Link from '@theme/Link'; | ||
|
||
Trevas 1.7.0 upgrades to version 2.1 of VTL. | ||
|
||
This version introduces two new operators: | ||
|
||
- `random` | ||
- `case` | ||
|
||
`random` produces a decimal number between 0 and 1. | ||
|
||
`case` allows for clearer multi-conditional branching, for example: | ||
|
||
`ds2 := ds1[ calc c := case when r < 0.2 then "Low" when r > 0.8 then "High" else "Medium" ]` | ||
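
For readers more familiar with general-purpose languages, the `case` expression above reads like a chain of conditions checked in order, with `else` as the fallback. A rough Python equivalent, assuming `r` is a value between 0 and 1 such as one drawn at random (the function name `classify` is hypothetical):

```python
import random


def classify(r):
    """Mirror `case when r < 0.2 then "Low" when r > 0.8 then "High" else "Medium"`."""
    if r < 0.2:
        return "Low"
    if r > 0.8:
        return "High"
    return "Medium"


# random.random() returns a float in [0.0, 1.0), standing in for the value of `r`.
r = random.random()
print(r, classify(r))
```

Values exactly equal to 0.2 or 0.8 fall through to `"Medium"`, since both comparisons are strict.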
|
||
Both operators are already available in Trevas! | ||
|
||
The new grammar also provides time operators and includes corrections, without any breaking changes compared to the 2.0 version. | ||
|
||
See the <Link label={"coverage"} href={useBaseUrl('/user-guide/coverage')} /> section for more details. |