Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Epic: Data Validator v0.1 #1

Closed
petewalker opened this issue Jul 11, 2018 · 4 comments
Closed

Epic: Data Validator v0.1 #1

petewalker opened this issue Jul 11, 2018 · 4 comments
Labels
discussion An issue requiring discussion before acceptance epic An Epic user story, acting as a parent for a large number of user stories

Comments

@petewalker
Copy link
Contributor

petewalker commented Jul 11, 2018

Key Stakeholders

Leigh Dodds (ODI), Melanie Abraham (ODI), Pete Walker (imin), Nish Desai (imin)

Scope

  • Online validator tool to allow a developer to validate a JSON document against the Modelling Opportunity Specification
  • Extensible set of validation rules, covering a range of options
  • Focus on implementing robust set of validation rules

Out of Scope

  • RDPE Feed validator
  • Validation as a service
  • Automated fixing of bad data
  • Validation of custom properties

User Stories

Roles

  • Developer - An end user who is using the library to validate their JSON
  • Maintainer - An end user who will be further developing and maintaining the library in the future

Artifacts

Stories

Library

UI

  • As a Developer, I would like to be able to paste a JSON fragment into a textbox, and have it validate it against the current Specification
  • As a Developer, I would like the UI to prettify my submitted JSON fragment
  • As a Developer, I would like the UI to show my JSON fragment syntax highlighted
  • As a Developer, I would like to see a list of errors, warnings and notices generated by the validator alongside my JSON fragment submission
  • As a Developer, I would like an error, warning or notice to be linked to a location in the original JSON fragment, where appropriate
  • As a Developer, I should be able to make changes to my JSON fragment and revalidate without leaving the page

Non Functional Requirements

  • Project SHOULD be written in nodejs
  • Project MUST have unit tests
  • Project MUST have MIT licence
  • Project SHOULD use Bootstrap
  • Project MUST follow the OpenActive style guide
  • Project MUST be written on Github using issues and projects
  • Project SHOULD be split into 2 distinct parts: library and UI
  • Project MUST be deployable to Heroku
  • Project SHOULD use Travis for CI
@petewalker petewalker added discussion An issue requiring discussion before acceptance epic An Epic user story, acting as a parent for a large number of user stories labels Jul 11, 2018
@petewalker
Copy link
Contributor Author

This is a first stab at writing an Epic to define the scope of the project.

I’d like to draw your particular attention to the long (but not exhaustive) list of the more complex rules that will need to be validated against. Mel and I discussed these this morning, and felt that each rule that we’re aiming to validate against should be defined as early as possible, to prevent scope creep and for transparency. Most of these are derived from Nick’s issues, which I’ve linked to. In some cases, some discussion needs to be had about where the issues actually come from, as I’ve been unable to find a further source to justify the rule. For example, “Latitude and longitude should be to at least 4 decimal places” seems good practice, but it’s not recommended anywhere outside Nick’s issue. We don’t want to surprise developers with extra rules that only exist within the validator (even if they’re simple guidance).

Once we’ve agreed the stories, we can split out into issues, and prioritise and size using the project I’ve created.

Any comments / additions / deletions would be greatly appreciated

@ldodds
Copy link
Contributor

ldodds commented Jul 12, 2018

Couple of comments, you have:

  • As a Developer, I would like the library to return a failure if any fields in my JSON fragment are of an incorrect type

Should we distinguish between type checking (which would be JSON type checking array, string, number, etc) and format checking (which might be validating a string is a URI, a date, etc)

Suggested addition:

  • As a Developer, I would like the library/UI to warn about beta or other extension properties that the validator isn't checking
  • As a Developer, I would like the library to assign errors to categories, so I can distinguish between e,g. conformance errors and data quality issues (I think this will give us more options for filtering, prioritising in the UX)

For the more complex rules I agree that we need to refine these to ensure they're agreed before proceeding too far. Nick has proposed these but they've not had much review yet. Not all will necessarily make it into the specification as that's not the place to record common errors.

Some of the rules are straight-forward as they're basic data hygiene issues, e.g. removing trailing commas. Some need to be re-framed, e.g. lat/longs should have a reasonable precision, within the scope of what data is available. And some (like MultidayEvent) we may not do, or defer to later

I'm planning to work through some of these as part of improving the validation rules for the v2.0 specification. To help prioritise your time, my suggestion is to

  • focus initial rapid prototype on demonstrating the core rules and showing how more complex checking might work (so we can discuss approach in more detail).
  • ensure we have core specification validations done first
  • we review and prioritise (e.g. in terms of impact) the additional checks. this is something that I can coordinate as part of the standards work
  • you then work through implementing the list in time available

We may not get to all of them and we may identify other checks that we want to do instead. For example I think there's some around making events "bookable", checking nesting of subEvents, etc.

@petewalker
Copy link
Contributor Author

From Mel:

I think the list looks really good. Just looking through the logical inconsistency bullet, and I think some should be warnings rather than errors:

Fields not completed to recommendation (return warning)
Latitude and longitude should be to at least 4 decimal places (openactive-archive/developer-microsite#39)
Include isAccessibleForFree for universally free events (openactive-archive/developer-microsite#28, openactive/modelling-opportunity-data#98)
For ageRange, at least one of minValue or maxValue must be supplied (openactive-archive/developer-microsite#27) - I think this is a warning to say: no value given so the default will be 18, rather than error?
level should be included wherever possible (openactive-archive/developer-microsite#20)
postalAddress should contain streetAddress, addressLocality, addressRegion, postalCode and addressCountry (openactive-archive/developer-microsite#19)
If providing a programme, always try to include a logo, a url and a video (openactive-archive/developer-microsite#14, openactive/modelling-opportunity-data#88, openactive/modelling-opportunity-data#89)
name, streetAddress, addressLocality and addressRegion must not have trailing commas (openactive-archive/developer-microsite#7)

I’m not sure, as I don’t understand what the validator would look at to decide what to show
programme should be used instead of superEvent when the information relates to a programme (openactive-archive/developer-microsite#13)
Do not include endDate or duration for zero duration (openactive-archive/developer-microsite#26, openactive/modelling-opportunity-data#96)
programme should be used in combination with activity for context (openactive-archive/developer-microsite#12) (SOURCE? - This is not recommended in the spec yet)

@petewalker
Copy link
Contributor Author

Thanks both for comments.

@ldodds - RE: categories of error/warning. We could have:

Severity

  • Failure
  • Warning
  • Notice

Category

  • Conformance (deviation from the specification)
  • Data Quality (e.g. trailing commas, precision)
  • Recommendation (properties that are not mandatory, but that we recommend. or switching to a preferred type?)

When we finalise the rules we'd like to implement, we should be clear about the severity and category of error that they should generate.

RE: Strategy - I agree with that approach. If the rules engine is extensible as per spec, we don't need a complete list of rules up-front

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion An issue requiring discussion before acceptance epic An Epic user story, acting as a parent for a large number of user stories
Projects
None yet
Development

No branches or pull requests

2 participants