json-valid PPL function #3230

Merged
merged 24 commits into from
Jan 16, 2025

kenrickyap
Contributor

Description

Based on this feature request: #3207

Added the `json_valid` PPL function.

### `JSON_VALID`

**Description**

`json_valid(jsonStr)` evaluates whether a JSON-encoded string contains valid JSON syntax and returns TRUE or FALSE.

**Argument type:** STRING

**Return type:** BOOLEAN

Example:

    os> source=people | eval `valid_json` = json_valid('[1,2,3,4]'), `invalid_json` = json_valid('{"invalid": "json"') | fields `valid_json`, `invalid_json`
    fetched rows / total rows = 1/1
    +--------------+----------------+
    | valid_json   | invalid_json   |
    |--------------+----------------|
    | True         | False          |
    +--------------+----------------+

    os> source=accounts | where json_valid('[1,2,3,4]') and isnull(email) | fields account_number, email
    fetched rows / total rows = 1/1
    +------------------+---------+
    | account_number   | email   |
    |------------------+---------|
    | 13               | null    |
    +------------------+---------+

Related Issues

Resolves #3207

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Kenrick Yap <[email protected]>
@YANG-DB
Member

YANG-DB commented Jan 6, 2025

@kenrickyap can you please add the relevant documentation for this new function?

Signed-off-by: Kenrick Yap <[email protected]>
@kenrickyap
Contributor Author

> @kenrickyap can you please add the relevant documentation for this new function?

Added doctest, integ-test, and unit tests

@kenrickyap kenrickyap marked this pull request as ready for review January 6, 2025 23:28
acarbonetto
acarbonetto previously approved these changes Jan 9, 2025
Signed-off-by: Kenrick Yap <[email protected]>
Signed-off-by: Kenrick Yap <[email protected]>
YANG-DB
YANG-DB previously approved these changes Jan 11, 2025
    result =
        executeQuery(
            String.format(
                "source=%s | where json_valid(json_string) | fields test_name",
Member

@LantaoJin LantaoJin Jan 13, 2025

What if `json_string` is null? Could you add a test case for null?

Collaborator

I'm assuming null should return false?

Member

> I'm assuming null should return false?

@acarbonetto yes I think it makes sense

Contributor

If null does indeed return false, then we should specify this behaviour in the documentation to avoid any confusion.

Collaborator

null returns null.

Collaborator

Update: you are correct that null should return false.
I've updated the tests and logic to match.
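
The behaviour agreed in this thread can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `JsonValidSketch`, `Parser`, and `isJsonValid` are hypothetical names, and the parse step is passed in as an interface so that the sketch needs no JSON library (the reviewed snippet in this PR calls `objectMapper.readTree(...)` inside a `try` block at that point).

```java
/** Hypothetical sketch of the null handling discussed in this thread; not the merged implementation. */
final class JsonValidSketch {

  /** Stand-in for a real JSON parser; expected to throw if the input is not valid JSON. */
  interface Parser {
    void parse(String json) throws Exception;
  }

  /** Returns false for null input or when parsing fails, true otherwise. */
  static boolean isJsonValid(String json, Parser parser) {
    if (json == null) {
      return false; // agreed behaviour: null is reported as invalid instead of propagating null
    }
    try {
      parser.parse(json);
      return true;
    } catch (Exception e) {
      return false; // any parse failure means the string is not valid JSON
    }
  }
}
```

With this shape, a `where json_valid(json_string)` filter would simply drop rows whose `json_string` is null, which is the reason the reviewers ask for the behaviour to be documented.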

acarbonetto
acarbonetto previously approved these changes Jan 13, 2025
Signed-off-by: Kenrick Yap <[email protected]>
@kenrickyap kenrickyap dismissed stale reviews from acarbonetto and YANG-DB via 2b2a8f3 January 14, 2025 17:10
Member

missing license header

Member

Actually, why do we need two separate classes, JsonUtils and JsonFunctions?
Would it make sense to unify them into a single class (maybe inside different namespaces)?

Collaborator

Excellent question. I dislike "utility classes" (what is the responsibility of a utility class?). But it gives us a central class to put all the json business logic. This seems to be how we do things (date time and IP address business logic also lives in util classes).

As for the JsonFunctions class, it provides an integration layer between the language parser's function expressions and the JSON business logic. The casting expressions will be another class that will access the JSON logic.

This class could very well be named Json or JsonMapper. But if we do this, we should rename all the util classes in the utils package.

YANG-DB
YANG-DB previously approved these changes Jan 15, 2025
acarbonetto
acarbonetto previously approved these changes Jan 15, 2025
Contributor

@currantw currantw left a comment

A few minor comments.

    }

    try {
      objectMapper.readTree(jsonExprValue.stringValue());
Contributor

Can we extract the ObjectMapper as a static final class member?

Collaborator

I'd rather not waste the memory unless a user is making JSON calls.
I think it's better to construct it each time it's used.

Contributor

Hmmm. What about the initialization-on-demand holder idiom? Agreed that it's probably better to avoid using the memory if it's not needed, but probably also good to avoid creating the same object hundreds of times if there are hundreds of calls to any JSON function? No need to address now either way, but maybe something to consider as JSON work progresses?
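
For reference, the initialization-on-demand holder idiom mentioned here looks roughly like the sketch below. Every name in it is hypothetical: `ExpensiveParser` is only a stand-in for an object that is costly to construct (such as the `ObjectMapper` under discussion), and the counter exists purely to make the lazy, one-time construction observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of the initialization-on-demand holder idiom; all names here are hypothetical. */
final class LazyParserSketch {

  /** Stand-in for an object that is costly to build, such as a JSON mapper. */
  static final class ExpensiveParser {
    static final AtomicInteger CREATED = new AtomicInteger();

    ExpensiveParser() {
      CREATED.incrementAndGet(); // count constructions so the laziness can be observed
    }
  }

  private LazyParserSketch() {}

  /** The JVM initializes this nested class only on first access to INSTANCE, and does so thread-safely. */
  private static final class Holder {
    static final ExpensiveParser INSTANCE = new ExpensiveParser();
  }

  /** Returns the single shared instance, creating it on the first call. */
  static ExpensiveParser parser() {
    return Holder.INSTANCE;
  }
}
```

Nothing is allocated until `parser()` is first called, which addresses the memory concern, and every later call reuses the same instance, which addresses the repeated-construction concern. Whether sharing one instance is appropriate depends on the object being safe to share across threads.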

Signed-off-by: Andrew Carbonetto <[email protected]>
@acarbonetto acarbonetto merged commit d5806cc into opensearch-project:main Jan 16, 2025
14 of 15 checks passed
@acarbonetto acarbonetto deleted the feature/json-valid branch January 16, 2025 19:11
Successfully merging this pull request may close these issues.

[FEATURE] Add json_valid as a PPL function
7 participants