Handling invalid data values #93

sumps · 2015-10-11T19:16:38Z

How should we handle invalid data values in the SK spec? Two examples are:

Depth value when the depth has been lost, i.e. turbulence has caused the depth unit to lose the bottom reading. A depth sounder normally flashes the last reading.
The GPS unit temporarily loses position and as a result all readings (position, COG, SOG, etc.) are not valid.

tkurki · 2015-10-11T19:57:44Z

So the information we need to convey is that a sensor says "I'm operational & connected & ok, but I am unable to function for some reason" so it is not just the absence of data / updates.

Several options:

Send a delta without the value field, or have a missing value field in the full tree. Both make normally required fields optional, which makes validation problematic.
Replace the value property with a special marker property like dataMissing: true, making either mandatory. This clearly marks missing values instead of just omitting them.

sumps · 2015-10-12T07:30:36Z

I think having a special marker property like dataValid: false, would be a good solution and suitable for most sensor data. Depth and GPS are the obvious ones and we should start with these ones but we might need to add this for other sensors if we find that they have a valid flag or some other way of indicating data invalid.

rob42 · 2015-11-21T21:13:33Z

Hmm, this brings up other issues - how does a sensor communicate its condition via Signal K? Looking at the simple cases:

1) ok
2) ok, but invalid data
3) failed, but alive
4) dead

Well 4) is easy - it's just gone, so nothing, no comms etc.

Since 1) is the most common, we can take the position that it's all ok unless notified otherwise.
For each data key we currently have value, timestamp, source. We could add an optional state key that is sent to provide for case 2).
The source tree should have some way to register a failed device for case 3) above. Maybe a state there too?

faceless2 · 2016-02-16T14:17:44Z

For what it's worth I'm not sure I see any practical difference between a sensor that is dead/missing, and a sensor that is alive but not reporting any usable values.

If this were a requirement then you could do this with a "last seen" timestamp on the source - this is the last communication with the source, which is distinct from the timestamp associated with any data fields it updates. But given I'm not sure I understand the problem this is trying to solve, this might not help.

joabakk · 2016-02-22T22:22:45Z

Just came to think about this. I'm not sure how a depth sensor would react when the depth is greater than it could measure. But there is value in a measurement saying "it's really deep here", rather than "the sounder is not measuring since 2 mins ago"

sumps · 2016-02-22T22:44:21Z

Traditionally depth sounders flash the last good value and they can lose depth for a number of reasons, not just going "out of their depth" in deep water. GPS position is another value that can become invalid.

@faceless2 it is not about last seen, as the values could still be streaming in from NMEA sources. It is about saying that this is the last value but it is no longer valid. You could just stop sending SK data, but then the display values on a consumer would time out and disappear, and you would not know what the last value was or whether you have a system failure, a data connection dropout, etc.

sumps · 2016-04-04T12:10:36Z

I am going to have to make a decision on this one, as it is holding up iKommunicate testing.

If there is an NMEA sentence or PGN that is showing the data as not available/invalid then we will send the relevant Signal K value as a Null. The same is true if the data was coming in and it then times out, we will also change the Signal K value to a Null.
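
As a rough illustration of the rule described above, here is a minimal Python sketch. The ValueStore class, its method names and the 10-second timeout are all hypothetical and are not taken from iKommunicate or the Signal K spec; Python's None stands in for JSON null.

```python
import time

TIMEOUT_S = 10.0  # assumed timeout; no figure is given in this thread


class ValueStore:
    """Hypothetical store holding the latest value per Signal K path."""

    def __init__(self, timeout_s=TIMEOUT_S, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._values = {}  # path -> (value, time received)

    def update(self, path, value, valid=True):
        # Data flagged as invalid by the source is stored as None (JSON null).
        self._values[path] = (value if valid else None, self._clock())

    def get(self, path):
        value, received = self._values[path]
        # Data that has stopped arriving is also reported as None.
        if self._clock() - received > self._timeout_s:
            return None
        return value
```

The clock is injectable only so the timeout can be exercised without waiting; a real gateway would presumably also push a null delta when the timeout fires rather than only answering reads.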

keesverruijt · 2016-04-04T12:16:19Z

I think that is the best solution. All alternatives are worse!

tkurki · 2016-04-07T05:44:01Z

There are three practical ways for sending 'data invalid' message:

Send delta with just path, no value.

Send value as undefined.

Send value as null.

I think null, indicating the absence of value, is the proper choice.

https://www.wwco.com/~wls/blog/2011/05/30/json-and-undefined-properties/

tkurki · 2016-04-09T06:11:20Z

There are several explicit cases where a sensor can indicate that it is otherwise operational but cannot produce valid data. Example cases are a depth sounder that cannot discern bottom from the echo data and a GNSS that has no satellite fix.

In these cases a gateway/server will typically receive a message indicating invalid data. It must send out a delta message where the value of the data item is null and serve the value as null in the REST api.
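
A minimal Python sketch of such a delta, assuming the usual delta layout of a context plus an updates list whose entries carry a values array. The helper name null_delta and the "vessels.self" context are placeholders for illustration, and source and timestamp are left out for brevity.

```python
import json


def null_delta(context, path):
    """Build a delta that marks `path` as currently having no valid value."""
    return {
        "context": context,
        "updates": [{"values": [{"path": path, "value": None}]}],
    }


delta = null_delta("vessels.self", "navigation.position")
# json.dumps serialises Python's None as JSON null.
print(json.dumps(delta))
```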

tkurki · 2016-04-09T06:15:58Z

SignalK/n2k-signalk#21
SignalK/nmea0183-signalk#56

Should be included in the spec document.

sumps · 2016-12-21T10:54:35Z

It seems to me that everyone is in agreement on the above solution of using "Null" in Signal K to indicate that data is not available or currently invalid.

I am not sure what changes are needed to the schema files to implement this but if someone wants to do that part I am happy to make changes to the documentation.

timmathews · 2016-12-22T00:24:49Z

No change should be necessary to the schema. It just needs to be explicit in the documentation somewhere.

In these cases a gateway/server will typically receive a message indicating invalid data. It must send out a delta message where the value of the data item is null and serve the value as null in the REST api.

However, I think that this is a very JavaScript-centric approach which makes implementing consumers more difficult in other languages. JSON parsers typically treat null values and missing values as the same thing (i.e. null), but we're explicitly stating here that they're not the same thing (because in JavaScript, they're not).

I've verified this to be the case in C#, Go and Ruby so far. It would be illustrative to know what common parsers for C, C++, Objective-C, Java, Python and Swift do. I suspect the results will be similar.

A way to explicitly indicate that the value is invalid would be better.

rob42 · 2016-12-22T05:57:17Z

Since we use JSON as the data format we need to send data that conforms to the JSON spec, and assume that different parsers don't have bugs. If they do, it's the parser's problem.

JSON only allows true, false, null and empty objects or valid data. We could send INVALID for an invalid object but many parsers would then barf out "Invalid numeric format" for numbers, so that's not a good option IMHO. Also we would have to test values everywhere, very messy.

In the end we can either make a value null, or make an object null, or we break the parsers.

The only other option is that in the delta we send a block of invalid keys, e.g.

{
  "context": "vessels.urn....",
  "updates": [
    .......
  ],
  "invalid": [
    "navigation.position",
    "navigation.courseOverGroundTrue"
  ]
}

tkurki · 2016-12-22T06:41:46Z

We do have the option of sending just path, no value, as mentioned already above.

Other options exist, like

{
  "path": "some.path",
  "valid": false
}

which is a more verbose version of just omitting value.

PS. Python is ok with JSON nulls (None) and Java has Gson.isJsonNull() and JSONObject.isNull(key).
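
To make the Python case concrete, here is a small sketch using only the standard json module. The describe helper and the _MISSING sentinel are made up for this example. It shows that an explicit null and a missing key can be told apart, but only if the consumer checks for the key rather than relying on a plain get.

```python
import json

with_null = json.loads('{"path": "navigation.position", "value": null}')
without_value = json.loads('{"path": "navigation.position"}')

_MISSING = object()  # sentinel that is distinct from None


def describe(update):
    """Classify an update as missing a value, explicitly null, or populated."""
    value = update.get("value", _MISSING)
    if value is _MISSING:
        return "no value key"
    if value is None:
        return "explicit null"
    return "has value"
```

Note that update.get("value") with no default returns None for both inputs, which is the kind of conflation raised earlier in the thread.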

sumps · 2016-12-22T07:52:44Z

I think the best option is to send the exact same JSON as we normally do but with value = null, and then it is one change and can be applied to all objects. If we have the valid field it will add complexity, and unless it is mandatory people will ignore it, whereas value = null will just happen and the client software will need no extra modification.

faceless2 · 2016-12-22T10:32:03Z

For what it's worth I agree with Paul - setting a value to null (whether it's {"navigation":{"position": null}}, {"key":"navigation.position", "value":null} or some other syntax) is unambiguous; it's standard JSON so should cause no technical problems for any client; it cannot be ignored by a client, unlike an additional key; and it meets the principle of least surprise.

jboynes · 2016-12-22T17:30:26Z

There's an interaction with the REST API as well. For a key that is known but whose value is currently unknown, the appropriate response would be 200 with a null value; for a key that is not known the server should return a 404 response.

When retrieving a non-leaf path (e.g. environment or navigation) the server must include keys whose value is unknown i.e. explicitly contain the property with a null value. Not all JSON frameworks do that automatically so we need to call this out as a requirement.
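
A minimal Python sketch of the two rules above, assuming the server keeps the full model as a nested dict. The rest_lookup name and the (status, body) return shape are illustrative only and do not describe any existing Signal K server.

```python
def rest_lookup(tree, path):
    """Return (status, body) for a dotted path in a nested dict."""
    node = tree
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return 404, None  # key is not known to the server
        node = node[part]
    # Known key: the body may itself be None, which serialises as JSON null.
    return 200, node
```

Because None values stay in the dict, a non-leaf request such as "navigation" returns a body that still contains the keys whose value is unknown, as long as the JSON serialiser is not configured to drop nulls.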

There's still a problem in distinguishing between a full and a sparse response. If the some.arbitrary.key property is missing, does that mean the key is not known to the server, or that the key is known but this is a sparse response where the server just did not include that key?

This stops being a problem if we switch from a hierarchical to the flat model.

rob42 · 2016-12-22T23:28:49Z

See #311: we need some way to identify whether it's a sparse or full response. I still need to catch up on recent discussions, but is it possible to differentiate between sparse and full by the request or subscription?
It seems to me that REST will always be a full response, at least for the branch requested, and that a subscription will know because it was part of the sub request?
E.g. in what use cases would the server decide between sparse/full without direction from the client?

tkurki · 2017-10-30T18:22:50Z

Documented at https://github.com/SignalK/specification/blob/c446914b331084dc5c8bcddf6aa65003b69b1c45/gitbook-docs/data_model.md#missing-or-invalid-data

tkurki added this to the v1 milestone Jan 9, 2016

sumps mentioned this issue Feb 15, 2016

Proposal to reduce the verbosity of the datamodel #172

Closed

rob42 mentioned this issue Apr 14, 2017

Add note on data accuracy #351

Merged

tkurki mentioned this issue Jul 12, 2017

fix: provide a null delta when timeRemaining changes to unknown SignalK/n2k-signalk#74

Closed

tkurki closed this as completed Oct 30, 2017
