Allow fl= parameter to request partially absent fields #9

Open
sebastian-nagel opened this issue Sep 26, 2019 · 1 comment

If a field requested by the fl parameter is missing in one of the records, the query processing exits with an exception and the result list is truncated:

Traceback (most recent call last):
  File "/var/venv/lib/python3.5/site-packages/pywb/cdx/cdxobject.py", line 186, in to_text
    result = ' '.join(str(self[x]) for x in fields) + '\n'
  File "/var/venv/lib/python3.5/site-packages/pywb/cdx/cdxobject.py", line 186, in <genexpr>
    result = ' '.join(str(self[x]) for x in fields) + '\n'
KeyError: 'languages'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/var/venv/lib/python3.5/site-packages/pywb/framework/wbrequestresponse.py", line 221, in encode
    for obj in stream:
  File "/var/venv/lib/python3.5/site-packages/pywb/cdx/cdxops.py", line 53, in cdx_to_text
    yield cdx.to_text(fields)
  File "/var/venv/lib/python3.5/site-packages/pywb/cdx/cdxobject.py", line 190, in to_text
    raise CDXException(msg)
pywb.cdx.cdxobject.CDXException: Invalid field "'languages'" found in fields= argument

A missing field should be handled gracefully. Ideally, fl=url,languages and
fl=url would return the same number of results, with an empty value wherever a requested field is absent from a record.
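
To illustrate the requested behaviour, here is a minimal sketch of a tolerant version of the join that fails in the traceback above. It is not PyWB's code: the function name `to_text_tolerant`, the `placeholder` parameter, and the choice of `-` as the stand-in value are assumptions made for this example, and records are modelled as plain dicts.

```python
def to_text_tolerant(record, fields, placeholder='-'):
    """Join the requested fields with spaces.

    Hypothetical helper: a field absent from the record is replaced by
    `placeholder` instead of raising KeyError.
    """
    return ' '.join(str(record.get(name, placeholder)) for name in fields) + '\n'


# Two sample records; the second one has no 'languages' field.
records = [
    {'url': 'http://example.com/', 'languages': 'eng'},
    {'url': 'http://example.org/'},
]

# Every record yields a line, so the result list is not truncated.
lines = [to_text_tolerant(r, ['url', 'languages']) for r in records]
```

With this approach, a request for url and languages produces one line per record, the same count as a request for url alone.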

The URL index is currently still based on PyWB 0.33.2.
PyWB 2.3.0 simply crashes on missing fields (the parameter is named fields there, see #8) when output=text is used:

  File ".../pywb/warcserver/index/cdxobject.py", line 186, in to_text
    result = ' '.join(str(self[x]) for x in fields) + '\n'
  File ".../pywb/warcserver/index/cdxobject.py", line 186, in <genexpr>
    result = ' '.join(str(self[x]) for x in fields) + '\n'
KeyError: 'languages'

sebastian-nagel commented Mar 5, 2021

Ok, this will work with PyWB 2.5.0 (see webrecorder/pywb@92e459b in cdxobject.py).
