Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tidy up #144

Merged
merged 2 commits into from
Aug 21, 2023
Merged

Tidy up #144

merged 2 commits into from
Aug 21, 2023

Conversation

mruwnik
Copy link
Collaborator

@mruwnik mruwnik commented Aug 21, 2023

A couple more small bugs:

  • handle missing date published in arxiv vanity parser
  • ignore values that are None when merging dicts - this was overwiting proper date_published values with Nones

@@ -9,6 +9,13 @@
logger = logging.getLogger(__name__)


def merge_dicts(*dicts):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works but what benefit does it have over the previous implementation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dict1 = {'a': 1, 'b': None}
dict2 = {'a': 2, 'b': 3, 'c': 4}

dict(dict1, **dict2) == {'a': 2, 'b': 3, 'c': 4)
dict(dict2, **dict1) == {'a': 1, 'b': None, 'c': 4}

merge_dicts(dict1, dict2) == {'a': 2, 'b': 3, 'c': 4}
merge_dicts(dict2, dict1) == {'a': 1, 'b': 3, 'c': 4} # this is crucial one

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, thanks

@mruwnik mruwnik merged commit 9ecf594 into main Aug 21, 2023
@mruwnik mruwnik deleted the tidy-up branch August 21, 2023 18:35
henri123lemoine pushed a commit that referenced this pull request Aug 23, 2023
* filter out empty values when merging dicts

* handle invalid dates in arxiv vanity
@@ -155,7 +156,10 @@ def get_first_child(item):
]

if date_published := contents.select_one("div.ltx_dates"):
date_published = date_published.text.strip("()")
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just noticed while merging #97 into main that if there's a ParserError, the behaviour
"If the date couldn't be parsed, hope that later phases will be more successful"
is probably a typo. maybe you mean to log it or to return it as an error, but date_published probably ought to be None?

Copy link
Collaborator

@Thomas-Lemoine Thomas-Lemoine Aug 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just pass and have that as a comment, too

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    selected_date = contents.select_one("div.ltx_dates")
    try:
        date_published = parse(selected_date.text.strip("()")) if selected_date else None
    except ParserError:
        # "If the date couldn't be parsed, hope that later phases will be more successful"
        date_published = None

?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I modified it there, #97

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants