Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spec file and validation edits and additions #31

Merged
merged 2 commits into from
Sep 11, 2020

Conversation

ialarmedalien
Copy link
Collaborator

@ialarmedalien ialarmedalien commented Sep 8, 2020

Miscellaneous edits to the spec files and code related to spec files:

  • Add tests to test_json_validation.py for array validation

  • Add spec/test/collections/test_djornl.py to test DJORNL collection schemas

  • Refactor DJORNL node schema to use references throughout

  • Add uniqueItems check to DJORNL stored queries

  • Edit all DJORNL stored queries to work correctly when populating defaults

  • Edit scripts/run_tests.sh to auto-discover all relation_engine_server/, spec/, and importers/ tests from the base directory

  • I updated the README.md docs to reflect this change.

  • This is not a breaking API change

@@ -315,6 +358,74 @@ def date_format_validation(self, schema_arg, schema_file_arg):
data_file=file_path,
validate_at=valid_json_loc)

def test_array_validation(self, schema_arg=None, schema_file_arg=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we testing this? It makes sense to me to test something like setting defaults, since that involves our own custom code. But array validation is basic functionality covered by a third party lib.

Copy link
Collaborator Author

@ialarmedalien ialarmedalien Sep 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a weird problem with arrays not validating properly, and I couldn't find anything in the python-jsonschema docs or the generic jsonschema docs to explain what was going on. I wrote some tests to work out why things weren't working, and then edited the test input until the tests passed. Between python-jsonschema and pyyaml, I don't have much trust in the third party libs not to do something surprising or to not do something expected, so I write tests with examples of stuff I want to do to ensure it's going to work.

TL;DR: TDD

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I would argue that all this additional code has maintenance and runtime costs, and that if we decide that we want to write test coverage for library dependencies, then we have to cover dozens of third party libraries in our dependency tree, which nobody would ever take on. I am sure that the json-schema library has coverage on array types and that we gain little to no benefit in the cost of keeping and maintaining that test code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't suggest that we write tests for every single third-party library that we use, but I do think it is useful to have some jsonschema tests that provide specific use case examples where the existing JSONSchema documentation is lacking or where there's some unexpected behaviour to be elucidated. I frequently look through tests to find usage examples if the docs aren't up to snuff. I've added a number of JSON validation tests specifically because the existing documentation of some feature or other was poor, I couldn't find any examples, and the best way to investigate what was going on was to create test cases and check that they worked.

In the specific case here, I've been trying to find out why array defaults weren't being filled in, whether it was a bug in python-jsonschema, some weird pyyaml thing, a problem with the schema itself, etc. It turns out that it's a shortcoming of the way that defaults are implemented -- not directly a jsonschema problem, but one due to our copying the simple default setting method in the jsonschema docs. I've written a test that directly illustrates the problem, as well as similar cases that do work.

go_terms:
type: string
format: regex
pattern: "^(GO:\\d{7}, ?)*(GO:\\d{7})?$"
pattern: ^(GO:\d{7}, ?)*(GO:\d{7})?$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to still quote things (eg due to colons), you can use single quotes which dont require escaping backslashes. Here's a reference: https://docs.octoprint.org/en/master/configuration/yaml.html#scalars

@ialarmedalien
Copy link
Collaborator Author

Will add in tests for #29 to this branch.

Improve readme for test/spec_release
Add test description
Improve formatting and info given by validate.py
Add djornl collection schema test
Add sample_spec_release dir with decompressed versions of the files in relation_engine_server/test/spec_release/spec.tar.gz
Edit spec_loader.py to prevent access outside the spec dir
Add tests to test_spec_loader.py to ensure correct functioning spec loader
remove $schema line from stored query
@ialarmedalien ialarmedalien force-pushed the use_defs_for_djornl_nodes branch 4 times, most recently from 319d16c to 6218b76 Compare September 10, 2020 18:28
@@ -230,26 +230,26 @@
}
},
{
"params": { "keys": ["Mary Poppins"], "distance": 0, "edge_types": [] },
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defaults problem fixed / worked around

Comment on lines +8 to +18
title: Edge Types
description: Edge types to filter on
type: array
items:
$ref: ../../datasets/djornl/edge_type.yaml
default: []
uniqueItems: true
examples:
- ['AraNetv2-HT_high-throughput-ppi', 'AraNetv2-LC_lit-curated-ppi']
- ['AraGWAS-Phenotype_Associations']
- []
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workaround for defaults problem

Comment on lines +398 to +401
# N.b. the default does not get populated in this case!
# This is a change from the expected functionality
'fruits': fruits_array_ref,
'name': 'using fruits_array.yaml -- the array is a ref',
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see #29

Refactor json tests
Remove some unneeded test files
Add tests for the specific example of the default population failure

Update DJORNL stored queries to a ref structure that will work when populating defaults

Remove comments from file
@ialarmedalien ialarmedalien force-pushed the use_defs_for_djornl_nodes branch from 6218b76 to 1c5bbea Compare September 10, 2020 23:55
@jayrbolton jayrbolton merged commit d4953a3 into develop Sep 11, 2020
@jayrbolton jayrbolton deleted the use_defs_for_djornl_nodes branch September 11, 2020 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants