Cuda time profiles for DY+4j have very high 'HEL' component for helicity filtering? #999

Open
valassi opened this issue Sep 16, 2024 · 0 comments

Comments

valassi commented Sep 16, 2024

Documenting and further analysing the results of the DY+4jet tests in #948.

Why do the CUDA time profiles for DY+4j have such a high 'HEL' component (helicity filtering)?

pp_dy4j.mad/fortran/output.txt (#events: 81)
[GridPackCmd.launch] OVERALL TOTAL    21707.6095 seconds
[madevent COUNTERS]  PROGRAM TOTAL    21546.1
[madevent COUNTERS]  Fortran Overhead 1579.09
[madevent COUNTERS]  Fortran MEs      19967
--------------------------------------------------------------------------------
pp_dy4j.mad/cppnone/output.txt (#events: 195)
[GridPackCmd.launch] OVERALL TOTAL    26745.1639 seconds
[madevent COUNTERS]  PROGRAM TOTAL    26584.9
[madevent COUNTERS]  Fortran Overhead 1608.51
[madevent COUNTERS]  CudaCpp MEs      24910.4
[madevent COUNTERS]  CudaCpp HEL      66.0341
--------------------------------------------------------------------------------
pp_dy4j.mad/cppsse4/output.txt (#events: 195)
[GridPackCmd.launch] OVERALL TOTAL    14398.4664 seconds
[madevent COUNTERS]  PROGRAM TOTAL    14231.3
[madevent COUNTERS]  Fortran Overhead 1647.03
[madevent COUNTERS]  CudaCpp MEs      12550.6
[madevent COUNTERS]  CudaCpp HEL      33.7035
--------------------------------------------------------------------------------
pp_dy4j.mad/cppavx2/output.txt (#events: 195)
[GridPackCmd.launch] OVERALL TOTAL    7335.2356 seconds
[madevent COUNTERS]  PROGRAM TOTAL    7114.43
[madevent COUNTERS]  Fortran Overhead 1683.7
[madevent COUNTERS]  CudaCpp MEs      5415.48
[madevent COUNTERS]  CudaCpp HEL      15.2596
--------------------------------------------------------------------------------
pp_dy4j.mad/cpp512y/output.txt (#events: 195)
[GridPackCmd.launch] OVERALL TOTAL    6831.8971 seconds
[madevent COUNTERS]  PROGRAM TOTAL    6649.98
[madevent COUNTERS]  Fortran Overhead 1669.94
[madevent COUNTERS]  CudaCpp MEs      4966.24
[madevent COUNTERS]  CudaCpp HEL      13.8066
--------------------------------------------------------------------------------
pp_dy4j.mad/cpp512z/output.txt (#events: 195)
[GridPackCmd.launch] OVERALL TOTAL    7136.2962 seconds
[madevent COUNTERS]  PROGRAM TOTAL    6958.96
[madevent COUNTERS]  Fortran Overhead 1636.28
[madevent COUNTERS]  CudaCpp MEs      5305.14
[madevent COUNTERS]  CudaCpp HEL      17.5447
--------------------------------------------------------------------------------
pp_dy4j.mad/cuda/output.txt (#events: 195)
[GridPackCmd.launch] OVERALL TOTAL    2523.7488 seconds
[madevent COUNTERS]  PROGRAM TOTAL    2234.93
[madevent COUNTERS]  Fortran Overhead 1820.36
[madevent COUNTERS]  CudaCpp MEs      97.9622
[madevent COUNTERS]  CudaCpp HEL      316.613
--------------------------------------------------------------------------------

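Specifically, in the cuda build the 'CudaCpp HEL' time (316.6s) is more than 3 times the 'CudaCpp MEs' time (98.0s), which does not make any sense: in all of the C++ builds above, HEL is below 0.4% of the MEs time.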
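To make the comparison explicit, here is a minimal sketch that recomputes the HEL/MEs ratio per backend from the counters quoted above. The function name `hel_to_me_ratios` and the `LOG` string are hypothetical helpers written for this comment, not part of madgraph4gpu. The parsing assumes the exact `[madevent COUNTERS]` line layout shown in these output.txt summaries.

```python
import re

# Counter lines copied from the output.txt summaries quoted in this issue
# (only the 'CudaCpp MEs' and 'CudaCpp HEL' counters are needed here).
LOG = """\
pp_dy4j.mad/cppnone/output.txt (#events: 195)
[madevent COUNTERS]  CudaCpp MEs      24910.4
[madevent COUNTERS]  CudaCpp HEL      66.0341
pp_dy4j.mad/cppsse4/output.txt (#events: 195)
[madevent COUNTERS]  CudaCpp MEs      12550.6
[madevent COUNTERS]  CudaCpp HEL      33.7035
pp_dy4j.mad/cppavx2/output.txt (#events: 195)
[madevent COUNTERS]  CudaCpp MEs      5415.48
[madevent COUNTERS]  CudaCpp HEL      15.2596
pp_dy4j.mad/cpp512y/output.txt (#events: 195)
[madevent COUNTERS]  CudaCpp MEs      4966.24
[madevent COUNTERS]  CudaCpp HEL      13.8066
pp_dy4j.mad/cpp512z/output.txt (#events: 195)
[madevent COUNTERS]  CudaCpp MEs      5305.14
[madevent COUNTERS]  CudaCpp HEL      17.5447
pp_dy4j.mad/cuda/output.txt (#events: 195)
[madevent COUNTERS]  CudaCpp MEs      97.9622
[madevent COUNTERS]  CudaCpp HEL      316.613
"""


def hel_to_me_ratios(text):
    """Return {backend: HEL seconds / MEs seconds} for backends reporting both."""
    counters = {}
    backend = None
    for line in text.splitlines():
        # Header line, e.g. 'pp_dy4j.mad/cuda/output.txt (#events: 195)'
        header = re.match(r"\S+/(\w+)/output\.txt", line)
        if header:
            backend = header.group(1)
            counters[backend] = {}
            continue
        # Counter line, e.g. '[madevent COUNTERS]  CudaCpp HEL      316.613'
        counter = re.match(
            r"\[madevent COUNTERS\]\s+CudaCpp (MEs|HEL)\s+([\d.]+)", line
        )
        if counter and backend is not None:
            counters[backend][counter.group(1)] = float(counter.group(2))
    return {
        name: values["HEL"] / values["MEs"]
        for name, values in counters.items()
        if "HEL" in values and "MEs" in values
    }


ratios = hel_to_me_ratios(LOG)
for name, ratio in ratios.items():
    print(f"{name:8s} HEL/MEs = {ratio:.4f}")
```

With these numbers, cuda is the only backend where the ratio exceeds 1; every C++ backend stays below 0.004.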

There was an issue in #958, but shouldn't that have been fixed by now?

valassi added a commit to valassi/madgraph4gpu that referenced this issue Sep 16, 2024
Note also:
- CudaCpp HEL seems still very high for cuda? madgraph5#999
- 'Python/Bash' component (difference between gridpack and total madevent) seems high in cudacpp? madgraph5#1000