How to handle json dump for inputs with CDATA? #553

cehbrecht · 2020-12-08T15:34:27Z

Description

When a complex input with JSON format is dumped to JSON, its data gets wrapped in a CDATA tag:

pywps/pywps/inout/inputs.py

Line 243 in 9fa56cc

data["data"] = u'<![CDATA[{}]]>'.format(out)

Is this necessary?

If it is, the CDATA wrapper needs to be handled when the JSON dump is loaded again. Currently this is not the case.

Environment

Steps to Reproduce

We have a workflow process with a workflow document in json:
https://github.com/roocs/rook/blob/858130631bf0a37c19a78e8e94961b7159846833/rook/processes/wps_orchestrate.py#L13

The JSON document is sent inline with the WPS request, not as a reference.

When we use the pywps scheduler extension the WPSRequest is dumped as a json file:

pywps/pywps/processing/job.py

Line 71 in 9fa56cc

def dump(self):

In this case the workflow document (json format) is tagged by CDATA:

pywps/pywps/inout/inputs.py

Line 243 in 9fa56cc

data["data"] = u'<![CDATA[{}]]>'.format(out)

When the json dump is loaded the CDATA tag is still part of the workflow document:

pywps/pywps/processing/job.py

Line 81 in 9fa56cc

def load(cls, filename):

... and the json loader for the workflow document will fail.
https://github.com/roocs/rook/blob/858130631bf0a37c19a78e8e94961b7159846833/rook/processes/wps_orchestrate.py#L52
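
The failure described above can be reproduced without pywps. This is a minimal sketch, not the actual `Job.dump`/`Job.load` code: the `workflow` document and the one-key `job_dump` dict are hypothetical stand-ins, and only the CDATA formatting line is taken from `inputs.py`.

```python
import json

# Hypothetical workflow document, standing in for the real rook workflow.
workflow = {"doc": "workflow", "steps": {"subset": {"run": "subset"}}}
out = json.dumps(workflow)

# Same wrapping as pywps/inout/inputs.py, line 243 in 9fa56cc.
wrapped = u'<![CDATA[{}]]>'.format(out)

# Simplified stand-in for the job dump: the wrapped string survives a
# JSON round trip unchanged, because no XML parser is involved.
job_dump = json.dumps({"data": wrapped})
restored = json.loads(job_dump)["data"]


def try_load(text):
    """Return the parsed JSON document, or the decode error if parsing fails."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        return err


result = try_load(restored)
print(type(result).__name__)
```

The restored string still starts with `<![CDATA[`, so `json.loads` rejects it at the first character.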

Additional Information

PR #444


cehbrecht · 2020-12-08T15:41:38Z

@huard how would you handle this?

Should we add the json data as it is without CDATA?

pywps/pywps/inout/inputs.py

Line 227 in 9fa56cc

    
if self.data_format.mime_type in ["application/xml", "application/gml+xml", "text/xml"]:

Or when loading the json dump should the CDATA tag be removed?

pywps/pywps/inout/inputs.py

Line 217 in 9fa56cc

instance.data = json_input['data']
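
The second option could look roughly like the sketch below. `strip_cdata` is a hypothetical helper, not an existing pywps function, and this is not necessarily what the eventual fix does. It only removes a wrapper that encloses the whole string and leaves everything else untouched.

```python
import json
import re

# Matches a string that consists of exactly one enclosing CDATA section.
_CDATA_RE = re.compile(r'^\s*<!\[CDATA\[(.*)\]\]>\s*$', re.DOTALL)


def strip_cdata(value):
    """Return value without an enclosing CDATA wrapper, if one is present.

    Hypothetical helper: non-string values and strings without a wrapper
    are returned unchanged.
    """
    if not isinstance(value, str):
        return value
    match = _CDATA_RE.match(value)
    return match.group(1) if match else value


# Stand-in for the dumped input; only the 'data' key is taken from the snippet above.
json_input = {"data": '<![CDATA[{"doc": "workflow"}]]>'}
data = strip_cdata(json_input["data"])
print(json.loads(data))
```

With such a helper, the assignment at line 217 would become something like `instance.data = strip_cdata(json_input['data'])`, which keeps the dump format unchanged and only affects loading.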

huard · 2020-12-08T16:06:17Z

Hi @cehbrecht,

I remember banging my head about this, trying to find a solution that worked across the board. I think the issue is that the json content may contain characters that will confuse the xml parser, hence the need to put everything inside CDATA tags.

I thought that the XML parser removed the CDATA tags automatically. Sorry I don't have a better answer...
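
Both points can be checked with the standard library. The sketch below uses `xml.etree.ElementTree` rather than the parser pywps uses, and the `Data` element name and the payload are made up for illustration. It shows that an XML parser does drop the CDATA markers, and that the same JSON without CDATA is not well-formed XML.

```python
import json
import xml.etree.ElementTree as ET

# JSON content with characters that are special in XML ('<' and '&').
payload = json.dumps({"expr": "a < b && c > d"})

# Wrapped in CDATA, the XML parser returns the raw text without the markers.
wrapped_doc = '<Data><![CDATA[{}]]></Data>'.format(payload)
text = ET.fromstring(wrapped_doc).text
print(text == payload)

# Without CDATA the same content is rejected by the parser.
bare_doc = '<Data>{}</Data>'.format(payload)
try:
    ET.fromstring(bare_doc)
    parse_failed = False
except ET.ParseError:
    parse_failed = True
print(parse_failed)
```

The JSON dump of the request never passes through an XML parser, which would explain why the markers are still present after loading.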

cehbrecht · 2020-12-11T18:04:58Z

fixed by #555

cehbrecht mentioned this issue Dec 8, 2020

added workaround for cdata issue roocs/rook#72

Merged

cehbrecht added the bug label Dec 8, 2020

cehbrecht mentioned this issue Dec 10, 2020

fix cdata deserialzation #555

Merged


cehbrecht closed this as completed Dec 11, 2020
