Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify range of number #422

Open
mfussenegger opened this issue Aug 1, 2023 · 5 comments
Open

Clarify range of number #422

mfussenegger opened this issue Aug 1, 2023 · 5 comments
Assignees
Labels
clarification Protocol clarification

Comments

@mfussenegger
Copy link
Contributor

This relates a bit to #90

The specification uses number pretty much everywhere. For example:

interface ProtocolMessage {
  /**
   * Sequence number of the message (also known as message ID). The `seq` for
   * the first message sent by a client or debug adapter is 1, and for each
   * subsequent message is 1 greater than the previous message sent by that
   * actor. `seq` can be used to order requests, responses, and events, and to
   * associate requests with their corresponding responses. For protocol
   * messages of type `request` the sequence number can be used to cancel the
   * request.
   */
  seq: number;

  /**
   * Message type.
   * Values: 'request', 'response', 'event', etc.
   */
  type: 'request' | 'response' | 'event' | string;
}

But it doesn't define number.
In some languages number refers to either a float or integer.

The JSON schema uses integer in the same place. But also doesn't define a minimum or maximum. (As far as I could find, JSON schema doesn't define a limit for integer either, it would have to be set via maximum)

There are a few places where the length is explicitly pointed out:

interface RunInTerminalResponse extends Response {
  body: {
    /**
     * The process ID. The value should be less than or equal to 2147483647
     * (2^31-1).
     */
    processId?: number;

    /**
     * The process ID of the terminal shell. The value should be less than or
     * equal to 2147483647 (2^31-1).
     */
    shellProcessId?: number;
  };
}

number without any upper bound is problematic because languages/libraries can choose to serialize them differently:

Numbers in JSON are agnostic with regard to their representation within programming languages. While this allows for numbers of arbitrary precision to be serialized, it may lead to portability issues. For example, since no differentiation is made between integer and floating-point values, some implementations may treat 42, 42.0, and 4.2E+1 as the same number, while others may not. The JSON standard makes no requirements regarding implementation details such as overflow, underflow, loss of precision, rounding, or signed zeros, but it does recommend expecting no more than IEEE 754 binary64 precision for "good interoperability"

Concrete example:

mfussenegger/nvim-dap#1004

The cjson library in Neovim converts largish numbers to scientific notation, leading to precision loss:

print(vim.json.encode({big_int = 3053700806959403}))
-- {"big_int":3.0537008069594e+15}
print $ (decode "{\"threadId\": 3053700806959403}" :: Maybe Foo)
-- Just (Foo {threadId = 3053700806959403})

print $ (decode "{\"threadId\": 3.0537008069594e+15}" :: Maybe Foo)
-- Just (Foo {threadId = 3053700806959400})

This behavior is within the JSON specification but currently breaks the dart debug adapter.

Could the specification clarify the allowed range of numbers and also clarify to which precision it must serialize them without falling back to scientific notation?

Maybe similar to the LSP spec:

/**
 * Defines an integer number in the range of -2^31 to 2^31 - 1.
 */
export type integer = number;
@DanTup
Copy link
Contributor

DanTup commented Aug 1, 2023

Slightly related - is it acceptable for scientific notation to be used for integers even when there is not precision loss?

For example can a DAP client/server send 2e+5 instead of 200000 in a field that is marked as an integer and expect the other party to handle it?

Right now, the Dart adapter will throw given scientific notation because Dart deserialises scientific notation to doubles and then we fail to cast to an int. I'm trying to understand whether this behaviour is acceptable or not, or if we should accept scientific notation for integers and just assume there was no precision loss.

Unfortunately I haven't been able to find anything concrete online and since servers and clients need to agree, perhaps it's worth stating something in the DAP spec.

@DanTup
Copy link
Contributor

DanTup commented Aug 1, 2023

I also just realised that the json spec says integer for threadId but the online spec says number.. this was noted above but I apparently didn't read it properly. It would be good to have this clarified too.

@connor4312
Copy link
Member

"Parsing JSON is a Minefield"

The JSON schema is the normative definition of DAP. The TypeScript types are shown in the markdown for digestibility, and JAvaScript/TypeScript does not have an integer type, so they're shown as a number.

The note about bounds is manually added in the comments of many DAP types. We could declare, in the overview, that integer types should fit in an int32, and then adjust the generator such that the note about range is automatically added on integer-typed fields in the markdown. We can also advise that clients represent the number in standard notation.

We should go through and see what integer types should not be int32's, for example the offset in memory-related requests may often need to exceed these bounds. At least for VS Code, the bounds of this are integers that can be represented in a float64 due to us living in JS, but not all DAP clients are written in JS.

@connor4312 connor4312 added the clarification Protocol clarification label Aug 1, 2023
@mfussenegger
Copy link
Contributor Author

The JSON schema is the normative definition of DAP

If that's the case it may be worth rephrasing "A machine-readable JSON schema can be found here." at the top of the specification page. It sounds more like an extra, and doesn't highlight that it's in fact the source of truth.

The TypeScript types are shown in the markdown for digestibility, and JAvaScript/TypeScript does not have an integer type, so they're shown as a number.

I think that's why the LSP specification declares a type alias for integer, uinteger and decimal. Couldn't the DAP specification do the same?

@DanTup
Copy link
Contributor

DanTup commented Aug 2, 2023

The note about bounds is manually added in the comments of many DAP types. We could declare, in the overview, that integer types should fit in an int32

This seems to be noted here, although it uses the word may:

https://microsoft.github.io/debug-adapter-protocol/overview#:~:text=type%20integer)-,may,-be%20represented%20as

integers defined in the protocol (JSON schema type integer) ⚠️may⚠️ be represented as 32 bit signed integers, although some properties disallow negative values.

I think this should be concrete. "may" should be removed and if there are locations where they are allowed to be more than 32 bit, they should use a different type (long?). Underspecifying these leads to issues like this one, and it's not clear what the best fix is (and while either Dart or nvim-dap can fix this specific issue, there are many other servers and clients and it's much better to have them follow a single spec than try to match each others behaviour/quirks).

We can also advise that clients represent the number in standard notation.

I think this should be specified rather than advised (whether scientific notation is allowed or not). For good interoperability I think it's import the rules are absolutely clear.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clarification Protocol clarification
Projects
None yet
Development

No branches or pull requests

4 participants