Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete Device2Host caused by comm with device and host #2840

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

zhaozheng09
Copy link

@zhaozheng09 zhaozheng09 commented Nov 22, 2024

Delete some D2H ops

if not will be bring a sync within host and device, so replace it from if not to torch.where

Before submitting
  • Was this discussed/agreed via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

before:
image
after:
image
Host will be block until device done, this D2H maybe block pipeline of host and device, so I use where op to replace if not

Copy link
Member

@Borda Borda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we pls add tests for this fix?
and maybe pls update the title to better express what the PR does

Copy link

codecov bot commented Nov 25, 2024

Codecov Report

Attention: Patch coverage is 66.66667% with 4 lines in your changes missing coverage. Please review.

Project coverage is 41%. Comparing base (d528131) to head (5146a6d).

❗ There is a different number of reports uploaded between BASE (d528131) and HEAD (5146a6d). Click for more details.

HEAD has 76 uploads less than BASE
Flag BASE (d528131) HEAD (5146a6d)
gpu 2 0
unittest 2 0
torch2.0.1+cpu 6 3
python3.10 19 9
Windows 4 2
cpu 32 14
macOS 6 3
python3.12 7 3
torch2.5.0 2 1
torch2.0.1 4 2
torch2.5.0+cpu 2 1
Linux 22 9
python3.11 4 1
torch2.4.1+cu121 7 2
torch2.1.2+cpu 2 1
python3.9 2 1
torch2.2.2+cpu 2 1
torch2.3.1+cpu 2 1
torch2.5.0+cu124 5 2
Additional details and impacted files
@@           Coverage Diff            @@
##           master   #2840     +/-   ##
========================================
- Coverage      69%     41%    -28%     
========================================
  Files         346     332     -14     
  Lines       19129   18961    -168     
========================================
- Hits        13227    7739   -5488     
- Misses       5902   11222   +5320     

@zhaozheng09 zhaozheng09 changed the title async host/device . delete some H2D op Nov 26, 2024
@zhaozheng09 zhaozheng09 changed the title delete some H2D op delete some D2H op Nov 26, 2024
@zhaozheng09
Copy link
Author

can we pls add tests for this fix? and maybe pls update the title to better express what the PR does

image

must be 100% cover ? this code will raise in old torch version, I don't know how to cover, please give me an example ? thx very much.

@zhaozheng09 zhaozheng09 changed the title delete some D2H op Delete Device2Host caused by comm with device and host Nov 26, 2024
@Borda
Copy link
Member

Borda commented Nov 26, 2024

this code will raise in old torch version, I don't know how to cover, please give me an example ? thx very much

ok, so you say that this PR does not have any impact on functional metrics results but improves performances, correct?

@mergify mergify bot added the ready label Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants