Epic: Data Validator v0.1 #1

petewalker · 2018-07-11T11:32:09Z

Key Stakeholders

Leigh Dodds (ODI), Melanie Abraham (ODI), Pete Walker (imin), Nish Desai (imin)

Scope

Online validator tool to allow a developer to validate a JSON document against the Modelling Opportunity Specification
Extensible set of validation rules, covering a range of options
Focus on implementing robust set of validation rules

Out of Scope

RDPE Feed validator
Validation as a service
Automated fixing of bad data
Validation of custom properties

User Stories

Roles

Developer - An end user who is using the library to validate their JSON
Maintainer - An end user who will be further developing and maintaining the library in the future

Artifacts

Specification - The current modelling opportunity data specification

Stories

Library

As a Developer, I would like to submit a JSON fragment as a string to the library, and have the library validate it against the current Specification
As a Developer, I would like the library to return different severities of error, such as failure, warning and notice
As a Developer, I would like the library to return a failure if my JSON fragment is not valid JSON
As a Developer, I would like the library to return a failure if any required fields are missing from my JSON fragment
As a Developer, I would like the library to return a warning if any recommended fields are missing from my JSON fragment
As a Developer, I would like the library to return a failure if any fields in my JSON fragment are of an incorrect type
As a Developer, I would like the library to return a failure if any fields in my JSON fragment are of an incorrect format (e.g. URI, DateTime)
As a Developer, I would like the library to return a warning if any fields are included in the JSON fragment that are not in the Specification
As a Developer, I would like the library to return an error if there are logical inconsistencies in my JSON fragment:
- Event duration should not exceed the interval between startDate and endDate
- SubEvent duration should not exceed parent Event duration
- Event startDate should not be after endDate
- SubEvent startDate and endDate should not be outside parent Event startDate and endDate
- Latitude and longitude should be to at least 4 decimal places (Lat/lng insufficiently accurate to be useful openactive-archive/developer-microsite#39) (SOURCE?)
- startDate and endDate should be Date not DateTime for all-day events (Do not include time component in startDate and endDate for all-day events openactive-archive/developer-microsite#40) (SOURCE? - sort of Holiday camps and bootcamps modelling-opportunity-data#94)
- Include isAccessibleForFree for universally free events (Include isAccessibleForFree for universally free events openactive-archive/developer-microsite#28, Describing free events, advanced booking requirements, etc modelling-opportunity-data#98)
- For ageRange, at least one of minValue or maxValue must be supplied (For ageRange, at least one of minValue or maxValue must be supplied openactive-archive/developer-microsite#27) (SOURCE?)
- Do not include endDate or duration for zero duration (Do not include endDate or duration for zero duration openactive-archive/developer-microsite#26, Handling zero duration modelling-opportunity-data#96)
- duration should not contain day component, unless MultidayEvent (duration should not contain day component, unless MultidayEvent openactive-archive/developer-microsite#25, Holiday camps and bootcamps modelling-opportunity-data#94)
- level should be included wherever possible (Include level wherever possible openactive-archive/developer-microsite#20) (SOURCE? - The standard marks this as OPTIONAL rather than RECOMMENDED)
- postalAddress should contain streetAddress, addressLocality, addressRegion, postalCode and addressCountry (Missing details in the PostalAddress may result in the event not being listed in Google Reserve openactive-archive/developer-microsite#19) (SOURCE? - This seems to simply be to satisfy Google)
- For multi-day events, such as holiday camps, use MultidayEvent (For multi-day events, such as holiday camps, use MultidayEvent openactive-archive/developer-microsite#17, Holiday camps and bootcamps modelling-opportunity-data#94)
- offer property should be named offers (common typo) ("offer" property should be named "offers" openactive-archive/developer-microsite#16)
- price should always be a string with 2 decimal places (Price should always be a string with 2 decimal places openactive-archive/developer-microsite#15, Price should always be a string with 2 decimal places modelling-opportunity-data#92)
- If providing a programme, always try to include a logo, a url and a video (If providing a programme, always try to include a logo, a url and a video openactive-archive/developer-microsite#14, Video property for programmes and events modelling-opportunity-data#88, Use Brand type for programme modelling-opportunity-data#89) (SOURCE? - This is not recommended in the spec yet)
- programme should be used instead of superEvent when the information relates to a programme ("programme" should be used instead of "superEvent" when the information relates to a programme type openactive-archive/developer-microsite#13)
- programme should be used in combination with activity for context (Use "programme", in combination with "activity" openactive-archive/developer-microsite#12) (SOURCE? - This is not recommended in the spec yet)
- givenName should not be two words, if no familyName is provided consistently, and vice versa. name should be used instead to represent full name (givenName should not be two words, if no familyName is provided consistently, and vice versa. "name" should be used instead to represent full name. openactive-archive/developer-microsite#11) (SOURCE? - This is not recommended in the spec yet)
- name, streetAddress, addressLocality and addressRegion must not have trailing commas ("name", "streetAddress", "addressLocality" and "addressRegion" must not have trailing commas openactive-archive/developer-microsite#7) (SOURCE? - This is not recommended in the spec yet (although no schema.org example has trailing commas))
As a Developer, I would like to see all the errors generated as a collection
As a Developer, I would like any error to be accompanied by text informing me how to fix it
As a Developer, I would like any error to be accompanied by a line and column number, so I can relate the message to the correct part of my JSON fragment
As a Developer, I would like the library/UI to warn about beta or other extension properties that the validator isn't checking
As a Developer, I would like the library to assign errors to categories, so I can distinguish between e.g. conformance errors and data quality issues (I think this will give us more options for filtering, prioritising in the UX)
As a Maintainer, I would like as much validation as possible to draw rules from configuration so that adding new rules is as simple as possible
As a Maintainer, I would like more complex validation to be performed by an extensible validation interface so that adding new rules is as simple as possible
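
To make the Library stories more concrete, here is a minimal sketch of what a configuration-driven check might look like. It is not the project's implementation: the `validate` function, the `EVENT_RULES` object and the field lists inside it are hypothetical and are not taken from the Specification.

```javascript
// Hypothetical rule configuration. The field lists are illustrative
// assumptions and do not come from the Specification.
const EVENT_RULES = {
  required: ['type', 'name', 'startDate'],
  recommended: ['description', 'url'],
  known: ['type', 'name', 'startDate', 'endDate', 'description', 'url', 'offers'],
};

// Validate a JSON string and return every problem found as a collection.
function validate(jsonString, rules = EVENT_RULES) {
  let data;
  try {
    data = JSON.parse(jsonString);
  } catch (e) {
    // Invalid JSON is a failure and no further checks are possible.
    return [{ severity: 'failure', field: null, message: 'Input is not valid JSON: ' + e.message }];
  }
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    return [{ severity: 'failure', field: null, message: 'Expected a JSON object at the top level.' }];
  }

  const errors = [];
  for (const field of rules.required) {
    if (!(field in data)) {
      errors.push({ severity: 'failure', field, message: `Add the required field "${field}".` });
    }
  }
  for (const field of rules.recommended) {
    if (!(field in data)) {
      errors.push({ severity: 'warning', field, message: `Consider adding the recommended field "${field}".` });
    }
  }
  for (const field of Object.keys(data)) {
    if (!rules.known.includes(field)) {
      errors.push({ severity: 'warning', field, message: `"${field}" is not in the configured field list.` });
    }
  }
  return errors;
}

// Usage: one missing required field, two missing recommended fields
// and one unknown field ("offer" instead of "offers").
const issues = validate('{"type":"Event","name":"Yoga","offer":[]}');
console.log(issues.map((i) => `${i.severity}: ${i.message}`).join('\n'));
```

Because the required, recommended and known fields live in a plain object, adding a simple rule in this sketch means editing configuration rather than code, which is the intent of the first Maintainer story.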

UI

As a Developer, I would like to be able to paste a JSON fragment into a textbox, and have the UI validate it against the current Specification
As a Developer, I would like the UI to prettify my submitted JSON fragment
As a Developer, I would like the UI to show my JSON fragment syntax highlighted
As a Developer, I would like to see a list of errors, warnings and notices generated by the validator alongside my JSON fragment submission
As a Developer, I would like an error, warning or notice to be linked to a location in the original JSON fragment, where appropriate
As a Developer, I should be able to make changes to my JSON fragment and revalidate without leaving the page
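
As an illustration of the prettify story, a sketch using only the built-in `JSON` object is shown here. The `prettify` name and the default indent of 2 are assumptions for the example.

```javascript
// Re-serialise a JSON string with consistent indentation.
// JSON.parse throws a SyntaxError if the input is not valid JSON.
function prettify(jsonString, indent = 2) {
  return JSON.stringify(JSON.parse(jsonString), null, indent);
}

console.log(prettify('{"name":"Yoga","offers":[{"price":"5.00"}]}'));
```

One design point to keep in mind: re-serialising changes line and column positions, so any location attached to an error would need to refer to the prettified text rather than the original submission.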

Non Functional Requirements

Project SHOULD be written in Node.js
Project MUST have unit tests
Project MUST have MIT licence
Project SHOULD use Bootstrap
Project MUST follow the OpenActive style guide
Project MUST be developed on GitHub using issues and projects
Project SHOULD be split into 2 distinct parts: library and UI
Project MUST be deployable to Heroku
Project SHOULD use Travis for CI


petewalker · 2018-07-11T11:40:31Z

This is a first stab at writing an Epic to define the scope of the project.

I’d like to draw your particular attention to the long (but not exhaustive) list of the more complex rules that will need to be validated against. Mel and I discussed these this morning, and felt that each rule that we’re aiming to validate against should be defined as early as possible, to prevent scope creep and for transparency. Most of these are derived from Nick’s issues, which I’ve linked to. In some cases, some discussion needs to be had about where the issues actually come from, as I’ve been unable to find a further source to justify the rule. For example, “Latitude and longitude should be to at least 4 decimal places” seems good practice, but it’s not recommended anywhere outside Nick’s issue. We don’t want to surprise developers with extra rules that only exist within the validator (even if they’re simple guidance).

Once we’ve agreed the stories, we can split out into issues, and prioritise and size using the project I’ve created.

Any comments / additions / deletions would be greatly appreciated

ldodds · 2018-07-12T06:51:35Z

Couple of comments, you have:

As a Developer, I would like the library to return a failure if any fields in my JSON fragment are of an incorrect type

Should we distinguish between type checking (which would be JSON type checking: array, string, number, etc.) and format checking (which might be validating that a string is a URI, a date, etc.)?
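
A rough sketch of how the two kinds of check could be kept separate follows. All names here (`jsonTypeOf`, `FORMATS`, `checkField`) are hypothetical, and the date-time pattern is a simplified assumption that only checks the shape of the string.

```javascript
// Type check: report the JSON type of a parsed value.
function jsonTypeOf(value) {
  if (value === null) return 'null';
  if (Array.isArray(value)) return 'array';
  return typeof value; // 'string', 'number', 'boolean' or 'object'
}

// Format checks: only meaningful once the value is known to be a string.
const FORMATS = {
  uri: (s) => {
    try {
      new URL(s); // throws if the string is not an absolute URL
      return true;
    } catch (e) {
      return false;
    }
  },
  // Simplified ISO 8601 shape check (assumption, not a full parser).
  'date-time': (s) => /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:\d{2})$/.test(s),
};

// Returns null when the value passes, otherwise a failure that says
// whether the type or the format was at fault.
function checkField(value, expectedType, format) {
  const actual = jsonTypeOf(value);
  if (actual !== expectedType) {
    return { severity: 'failure', kind: 'type', message: `Expected ${expectedType} but found ${actual}.` };
  }
  if (format && !FORMATS[format](value)) {
    return { severity: 'failure', kind: 'format', message: `Value is not a valid ${format}.` };
  }
  return null;
}

console.log(checkField(42, 'string', 'uri'));           // type failure
console.log(checkField('not a uri', 'string', 'uri'));  // format failure
```

Running the type check first means a format failure is only ever reported for a value that already has the right JSON type, so the two messages never overlap.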

Suggested addition:

As a Developer, I would like the library/UI to warn about beta or other extension properties that the validator isn't checking
As a Developer, I would like the library to assign errors to categories, so I can distinguish between e.g. conformance errors and data quality issues (I think this will give us more options for filtering, prioritising in the UX)

For the more complex rules I agree that we need to refine these to ensure they're agreed before proceeding too far. Nick has proposed these but they've not had much review yet. Not all will necessarily make it into the specification as that's not the place to record common errors.

Some of the rules are straightforward as they're basic data hygiene issues, e.g. removing trailing commas. Some need to be re-framed, e.g. lat/longs should have a reasonable precision, within the scope of what data is available. And some (like MultidayEvent) we may not do, or may defer to later.
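
As a sketch of what the simpler data hygiene checks might look like, the example here tests for trailing commas and for coordinate precision. The function names, the `warning` severity and the default threshold of 4 decimal places are assumptions for illustration; the threshold in particular is still under discussion in this thread.

```javascript
// Count decimal places as written by String(). Limitations: JSON.parse
// drops trailing zeros (51.5000 becomes 51.5), and very small numbers
// are rendered in exponent form, which this helper treats as 0 places.
function decimalPlaces(n) {
  const text = String(n);
  const dot = text.indexOf('.');
  return dot === -1 ? 0 : text.length - dot - 1;
}

// Warn when latitude or longitude has fewer than minPlaces decimal places.
function checkGeoPrecision(geo, minPlaces = 4) {
  const issues = [];
  for (const field of ['latitude', 'longitude']) {
    if (typeof geo[field] === 'number' && decimalPlaces(geo[field]) < minPlaces) {
      issues.push({
        severity: 'warning',
        field,
        message: `${field} has fewer than ${minPlaces} decimal places.`,
      });
    }
  }
  return issues;
}

// True when a string value ends with a comma, ignoring trailing whitespace.
function hasTrailingComma(value) {
  return typeof value === 'string' && value.trim().endsWith(',');
}

console.log(checkGeoPrecision({ latitude: 51.5, longitude: -0.1278 }));
console.log(hasTrailingComma('Raynes Park,'));
```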

I'm planning to work through some of these as part of improving the validation rules for the v2.0 specification. To help prioritise your time, my suggestion is to

focus the initial rapid prototype on demonstrating the core rules and showing how more complex checking might work (so we can discuss the approach in more detail)
ensure we have the core specification validations done first
review and prioritise (e.g. in terms of impact) the additional checks; this is something that I can coordinate as part of the standards work
work through implementing the list in the time available

We may not get to all of them and we may identify other checks that we want to do instead. For example I think there's some around making events "bookable", checking nesting of subEvents, etc.

petewalker · 2018-07-12T08:29:09Z

From Mel:

I think the list looks really good. Just looking through the logical inconsistency bullet, and I think some should be warnings rather than errors:

Fields not completed to recommendation (return warning)
Latitude and longitude should be to at least 4 decimal places (openactive-archive/developer-microsite#39)
Include isAccessibleForFree for universally free events (openactive-archive/developer-microsite#28, openactive/modelling-opportunity-data#98)
For ageRange, at least one of minValue or maxValue must be supplied (openactive-archive/developer-microsite#27) - I think this is a warning to say: no value given so the default will be 18, rather than error?
level should be included wherever possible (openactive-archive/developer-microsite#20)
postalAddress should contain streetAddress, addressLocality, addressRegion, postalCode and addressCountry (openactive-archive/developer-microsite#19)
If providing a programme, always try to include a logo, a url and a video (openactive-archive/developer-microsite#14, openactive/modelling-opportunity-data#88, openactive/modelling-opportunity-data#89)
name, streetAddress, addressLocality and addressRegion must not have trailing commas (openactive-archive/developer-microsite#7)

I’m not sure, as I don’t understand what the validator would look at to decide what to show
programme should be used instead of superEvent when the information relates to a programme (openactive-archive/developer-microsite#13)
Do not include endDate or duration for zero duration (openactive-archive/developer-microsite#26, openactive/modelling-opportunity-data#96)
programme should be used in combination with activity for context (openactive-archive/developer-microsite#12) (SOURCE? - This is not recommended in the spec yet)

petewalker · 2018-07-12T08:39:19Z

Thanks both for comments.

@ldodds - RE: categories of error/warning. We could have:

Severity

Failure
Warning
Notice

Category

Conformance (deviation from the specification)
Data Quality (e.g. trailing commas, precision)
Recommendation (properties that are not mandatory but that we recommend, or switching to a preferred type?)

When we finalise the rules we'd like to implement, we should be clear about the severity and category of error that they should generate.
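
One possible way to represent the two dimensions in code is sketched here. The constant names, the string values and the `makeError` and `filterErrors` helpers are hypothetical; only the severity and category labels come from the lists above.

```javascript
const Severity = Object.freeze({
  FAILURE: 'failure',
  WARNING: 'warning',
  NOTICE: 'notice',
});

const Category = Object.freeze({
  CONFORMANCE: 'conformance',
  DATA_QUALITY: 'data-quality',
  RECOMMENDATION: 'recommendation',
});

// Build an error object, rejecting unknown severities or categories so
// that every rule is forced to declare both explicitly.
function makeError(severity, category, message) {
  if (!Object.values(Severity).includes(severity)) {
    throw new Error('Unknown severity: ' + severity);
  }
  if (!Object.values(Category).includes(category)) {
    throw new Error('Unknown category: ' + category);
  }
  return { severity, category, message };
}

// Keep only the errors whose properties match every given criterion.
function filterErrors(errors, criteria) {
  return errors.filter((err) =>
    Object.keys(criteria).every((key) => err[key] === criteria[key]));
}

const errors = [
  makeError(Severity.FAILURE, Category.CONFORMANCE, 'startDate is required.'),
  makeError(Severity.WARNING, Category.DATA_QUALITY, 'name has a trailing comma.'),
  makeError(Severity.WARNING, Category.RECOMMENDATION, 'Consider adding level.'),
];
console.log(filterErrors(errors, { category: Category.DATA_QUALITY }));
```

Keeping severity and category as independent properties lets the UI filter on either one, or on both together.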

RE: Strategy - I agree with that approach. If the rules engine is extensible as per spec, we don't need a complete list of rules up-front
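
To illustrate what "extensible" could mean in practice, here is a minimal sketch in which each rule is an object with a `check` function and rules are registered one at a time. The `createValidator` factory, the rule shape and the `start-not-after-end` rule are all hypothetical.

```javascript
// A validator holds a list of rules. Each rule exposes check(data),
// which returns an array of errors (empty when the data passes).
function createValidator() {
  const rules = [];
  return {
    addRule(rule) {
      rules.push(rule);
      return this; // allow chaining
    },
    validate(data) {
      return rules.flatMap((rule) => rule.check(data));
    },
  };
}

// Example of a more complex rule added after the validator exists:
// startDate must not be later than endDate.
const startNotAfterEnd = {
  name: 'start-not-after-end',
  check(data) {
    const start = Date.parse(data.startDate);
    const end = Date.parse(data.endDate);
    // Skip the check when either date is missing or unparseable.
    if (Number.isNaN(start) || Number.isNaN(end)) return [];
    if (start > end) {
      return [{
        severity: 'failure',
        rule: this.name,
        message: 'startDate is after endDate; swap or correct the dates.',
      }];
    }
    return [];
  },
};

const validator = createValidator().addRule(startNotAfterEnd);
console.log(validator.validate({
  startDate: '2018-07-12T10:00:00Z',
  endDate: '2018-07-12T09:00:00Z',
}));
```

With this shape, a new rule can be written and registered without touching the validator itself, so the list of rules can grow after the first release.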

petewalker added the labels "discussion" (An issue requiring discussion before acceptance) and "epic" (An Epic user story, acting as a parent for a large number of user stories) on Jul 11, 2018

petewalker closed this as completed Sep 28, 2018
