Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance #62

Merged
merged 1 commit into from
May 12, 2022
Merged

Improve performance #62

merged 1 commit into from
May 12, 2022

Conversation

elacuesta
Copy link
Member

@elacuesta elacuesta commented Mar 19, 2022

Try to keep improving performance after #60.

It's my understanding that any is short-circuit, however there seems to be some penalty. Using a modified version of this benchmarking script:

from timeit import timeit
from itemadapter import is_item
from scrapy.item import Item

def test_dict():
    return is_item({})

def test_item():
    return is_item(Item())

print("test_dict:", timeit(test_dict, number=10**6))
print("test_item:", timeit(test_item, number=10**6))

1203b5e (master branch)

test_dict: 2.421997083
test_item: 3.901377961

8733014 (this branch)

test_dict: 1.6679764910000001
test_item: 3.0513542609999997

The above numbers are from a 2020 Macbook Pro, Intel(R) Core(TM) i5-1038NG7 CPU @ 2.00GHz

@codecov
Copy link

codecov bot commented Mar 19, 2022

Codecov Report

Merging #62 (8733014) into master (1203b5e) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master       #62   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            4         4           
  Lines          285       291    +6     
=========================================
+ Hits           285       291    +6     
Impacted Files Coverage Δ
itemadapter/adapter.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1203b5e...8733014. Read the comment docs.

@kmike
Copy link
Member

kmike commented Mar 21, 2022

@elacuesta what motivates this change? Was it a real-world performance issue which is solved in this PR? I'm not sure I understand the implications of this PR - would it make itemadapter faster in a real world? The benchmark you posted is for Item vs dict creation, not for itemadapter before & after the change, probably I'm missing something :)

@elacuesta
Copy link
Member Author

elacuesta commented Mar 21, 2022

@Gallaecio reported a performance decrease in #59 (based on some research he did with itemloaders), I tried to improve that with #60, but there are still a few things to try in order to keep improving. The benchmark tests both dict and Item creation, I executed it before and after the changes of this PR, showing a ~30% time decrease for dict and a ~20% for Item.

@elacuesta
Copy link
Member Author

Ping @kmike 😄

@kmike kmike merged commit 8f239bc into master May 12, 2022
@elacuesta elacuesta deleted the performance branch May 12, 2022 18:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants