-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add parser options [WIP] #33
Conversation
Codecov Report
@@ Coverage Diff @@
## master #33 +/- ##
==========================================
- Coverage 100% 90.62% -9.38%
==========================================
Files 4 5 +1
Lines 102 128 +26
Branches 6 13 +7
==========================================
+ Hits 102 116 +14
- Misses 0 12 +12
Continue to review full report at Codecov.
|
@travisbrown I like your approach. Given the various impedance mismatches that are possible, I think the best option is to have defaults that support all of YAML, but allow particular callers to disable things they don't want to support. This also supports cases where there's not a good one-size-fits-all option available. What would an alternative to this be, if any? I don't see any other options that are as good as something like this. |
@non The alternative would be picking a standard set of language-independent types that are supported (merge, maybe bool, probably not timestamp) and making those the behavior instead of complicating things by adding options to Since |
This approach moves the node handling to an object algebra. This allows for relatively easily adding configurable stuff that's able to change how various YAML types are handled as the need arises. One can even plug in their own algebra if they want to get nitty-gritty. It also allows adding optional error accumulation, which is something that's always felt missing. A nice thing here is by default we don't have to even check any configuration - only when a parser instance is configured do we change to `ConfiguredAlg`. Also should be binary compatible (and `yaml.parser.configure(...)` is a nice way to get a configured parser!). TODO: Add tests around error accumulation and such (coverage currently lacking)
Closing in favor of @jeremyrsmith's #33. |
This is another attempt to fix #31 (see also #32, @jeremyrsmith's PR here, and this discussion).
The key issue (which I've refined a bit in my head since opening #31) is that YAML supports a number of "Language-Independent Types" that are supported by SnakeYAML but not (currently) circe-yaml.
One missing feature is merge keys (see example in #31). Another is YAML's more lenient representation of booleans, integers, etc.:
Another (more problematic) one is timestamps. In a current Jackson-powered pipeline for working with YAML in circe, we're relying on the fact that Jackson's YAML processing transforms YAML's (kind of ridiculous) timestamp representation to epoch seconds. This is really nice in that it means that we don't have to reproduce YAML's silly rules for what counts as a timestamp for ourselves on the circe decoding side. In the current version of circe-yaml, YAML timestamps are just passed through to the JSON unchanged as strings, which means we do.
The advantage of circe-yaml's current approach is that no information (timezone, etc.) gets dropped. These YAML things are called timestamps, though, so I'm not sure I care that much about losing timezone information, and even if we only supported decoding YAML's timestamp as longs, users would still be able to put their dates in YAML strings if they really wanted the exact representation to be available in their JSON.
(We could also transform YAML timestamps into JSON strings with a single standard date format instead of epoch seconds, but the long approach has the advantage of matching what Jackson does and not making us choose the format).
What I've done in this PR is to add a new
Parser
case class with options for a few language-independent types so that users don't have to choose (or contort their YAML). By default all of the language-independent types mentioned above are enabled, but they can all be turned off if people want the old behavior.I've also taken a quick stab at fixing stack safety for deep documents with
EitherT
. I haven't tested it, and that really should have happened in a different PR, but oh well.This is just a sketch, and needs several things before it could be merged (if we decide to go this route):
Note that while this PR is source compatible with 0.6.1, it's not binary compatible (unlike the simpler version in #32).