Add Parse JSON Notices Section #2669

Open
wants to merge 11 commits into
base: main
from
85 changes: 84 additions & 1 deletion app/routes/docs.client.samples.md
Expand Up @@ -12,7 +12,7 @@ and some samples from the FAQs section of the [gcn-kafka-python](https://github.

To contribute your own ideas, make a GitHub pull request to add it to [the Markdown source for this document](https://github.com/nasa-gcn/gcn.nasa.gov/blob/CodeSamples/app/routes/docs.client.samples.md), or [contact us](/contact).

## Parsing
## Parsing XML

Within your consumer loop, use the following functions to convert the
content of `message.value()` into other data types.
Expand Down Expand Up @@ -164,3 +164,86 @@ for message in consumer.consume(end[0].offset - start[0].offset, timeout=1):
continue
print(message.value())
```

## Parsing JSON

GCN Notices for new missions are typically distributed in JSON format. This guide explains how to read these JSON Notices programmatically.

Start by subscribing to a Kafka topic and reading each message's JSON payload:

```python
from gcn_kafka import Consumer
import json

# Connect as a Kafka consumer
consumer = Consumer(
    client_id='fill me in',  # Replace with your client ID
    client_secret='fill me in',  # Replace with your client secret
    config={"message.max.bytes": 204194304},
)

# Subscribe to Kafka topic
consumer.subscribe(['gcn.circulars'])

# Continuously consume and parse JSON data
while True:
    for message in consumer.consume(timeout=1):
        if message.error():
            print(message.error())
            continue

        # Print the topic and message offset
        print(f"topic={message.topic()}, offset={message.offset()}")

        # Kafka message value: raw bytes containing UTF-8 encoded JSON
        value = message.value()
```
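
The loop above stops at the raw message value. A minimal sketch of the parsing step follows, assuming the payload is a JSON document sent as bytes. The helper name `parse_notice` is hypothetical and not part of `gcn_kafka`, and the sample payload is hand-written rather than a real Notice.

```python
import json


def parse_notice(value):
    """Parse a Kafka message value (bytes) as JSON.

    Returns the decoded object, or None if the payload is missing,
    cannot be decoded as text, or is not valid JSON.
    """
    if value is None:
        return None
    try:
        # json.loads accepts bytes directly (UTF-8, UTF-16 or UTF-32)
        return json.loads(value)
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None


# Example with a hand-written payload (hypothetical field name)
notice = parse_notice(b'{"example_field": 1}')
print(notice)
```

Returning `None` instead of raising lets the consumer loop skip a malformed message with a simple `if notice is None: continue`.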

## Decoding Embedded Data

Binary data such as a FITS file is embedded in a JSON Notice as `base64` text so that it can be transferred over an ASCII medium. Python's built-in [`base64`](https://docs.python.org/3/library/base64.html#base64.b64encode) module provides the `b64decode` and `b64encode` functions to make this task simple. Additionally, because JSON is serialized as Unicode rather than ASCII, non-ASCII characters must be handled properly when encoding and decoding data.

Continuing inside the consumer loop, use the following code to decode the `base64` text to bytes and write them to a `.fits` file:

```python
import base64

# The following runs inside the consumer loop, where `value` holds the
# bytes returned by message.value() and `json` is already imported.

# Convert the Kafka message value to a string
value_str = value.decode("utf-8").strip()

# Parse the JSON data
value_json = json.loads(value_str)

# Extract the Base64-encoded skymap
skymap_string = value_json["event"]["skymap"]

# Function to validate Base64 strings
def is_base64(s):
    try:
        base64.b64decode(s, validate=True)
        return True
    except Exception:
        return False

# Validate the skymap string
if not is_base64(skymap_string):
    print("Invalid Base64 string.")
    continue

# Decode the Base64 string
decoded_bytes = base64.b64decode(skymap_string)

# Save the decoded data as a FITS file
with open("skymap.fits", "wb") as fits_file:
    fits_file.write(decoded_bytes)
```
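
Indexing `value_json["event"]["skymap"]` directly raises an exception if a message carries no sky map. A minimal sketch of a more defensive variant follows, decoding in memory instead of writing to disk. The helper name `extract_skymap_bytes` is hypothetical, and the assumption that `event` may be null or may lack a `skymap` key is ours, not something confirmed by the schema shown here. The sample data is a hand-made stand-in, not a real sky map.

```python
import base64


def extract_skymap_bytes(value_json):
    """Return the decoded sky map bytes, or None if absent or invalid."""
    event = value_json.get("event")
    if not isinstance(event, dict):
        return None
    skymap_string = event.get("skymap")
    if not isinstance(skymap_string, str):
        return None
    try:
        # validate=True rejects characters outside the Base64 alphabet
        return base64.b64decode(skymap_string, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None


# Hand-made stand-in for a parsed Notice (not real data)
fake_fits = b"SIMPLE  =                    T"
notice = {"event": {"skymap": base64.b64encode(fake_fits).decode("ascii")}}

skymap_bytes = extract_skymap_bytes(notice)
# A FITS file begins with the keyword SIMPLE in its first header card
print(skymap_bytes is not None and skymap_bytes.startswith(b"SIMPLE"))
```

Decoding once inside a `try` block avoids validating and then decoding the same string twice.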

To include a FITS file in a Notice, add a property to your schema definition in the following format:

```json
{
  "type": "string",
  "contentEncoding": "base64",
  "contentMediaType": "image/fits"
}
```

In your data production pipeline, Base64-encode the contents of your file and set the value of the property to the resulting string. See [non-JSON data](https://json-schema.org/understanding-json-schema/reference/non_json_data.html) for more information.
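
The producer-side encoding could look like the following minimal sketch. The property name `skymap`, its nesting under `event`, and the helper name `encode_file_for_notice` are assumptions chosen to mirror the consumer example, and the file contents are a stand-in rather than a real FITS file.

```python
import base64
import json


def encode_file_for_notice(data):
    """Base64-encode raw file bytes as an ASCII string suitable for JSON."""
    return base64.b64encode(data).decode("ascii")


# Stand-in for the contents of a FITS file, e.g. open(path, "rb").read()
file_bytes = b"SIMPLE  =                    T"

notice = {"event": {"skymap": encode_file_for_notice(file_bytes)}}
payload = json.dumps(notice).encode("utf-8")  # bytes ready to publish

# Round trip: a consumer recovers the original bytes
received = json.loads(payload)
assert base64.b64decode(received["event"]["skymap"]) == file_bytes
```

`b64encode` returns bytes, so the `.decode("ascii")` step is what turns the result into a string that `json.dumps` can serialize.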
Member

Great sample code, but not enough context here to understand what these instructions are for.

Member Author

I have tried to expand the instructions.
Could you please elaborate on what's missing?

Member

Pick an example of an actual notice type that contains embedded HEALPix data. Show how to:

  • subscribe to and receive that notice type
  • decode the Kafka message as JSON
  • decode base64 text to bytes
  • read the bytes as a FITS file

And then the user is ready to go on to the sky maps section! (I think that this is turning into a step-by-step tutorial, which will be awesome.)

Member Author

I tested decoding the Base64 data and writing the FITS file with an LVK JSON file.
