Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reverting changes to unsigned variables causing Future SW CI crash #311

Merged
merged 10 commits into from
Dec 21, 2023

Conversation

mlarson02
Copy link
Contributor

@mlarson02 mlarson02 commented Dec 18, 2023

Reverts changes made in PR #304 that changes various looped over variables in TrackBuilder.h to signed, causing error: comparison is always false due to limited range of data type [-Werror=type-limits] in future SW CI. Because this warning has been flagged as an error centrally in CMSSW, the only current fix for this is to revert these variables to not being unsigned.

@aehart
Copy link
Contributor

aehart commented Dec 19, 2023

@mlarson02 This is strange, since -Werror=type-limits is one of the compiler flags added in #304:

set ::env(CCFLAG) "-Wall -Werror=address -Werror=array-bounds -Werror=conversion-null -Werror=delete-non-virtual-dtor -Werror=format -Werror=format-contains-nul -Werror=main -Werror=missing-braces -Werror=narrowing -Werror=overflow -Werror=overlength-strings -Werror=pointer-arith -Werror=reorder -Werror=return-local-addr -Werror=return-type -Werror=sign-compare -Werror=strict-aliasing -Werror=switch -Werror=type-limits -Werror=unused-but-set-variable -Werror=unused-value -Werror=unused-variable -Werror=write-strings -Wno-attributes -Wno-deprecated -Wno-ignored-qualifiers -Wno-long-long -Wno-non-template-friend -Wnon-virtual-dtor -Wno-psabi -Wno-unused-but-set-parameter -Wno-unused-label -Wno-unused-local-typedefs -Wno-unused-parameter -Wno-vla -Wparentheses -Wreturn-type -Wunused"

The only thing I can figure is that the older version of GCC that HLS uses isn't smart enough to trace the template parameters that are set to zero to the for-loops where they are compared with unsigned integers.

Anyway, you saw the compiler error in the CI that I was trying to work around when I originally made the loop counters unsigned. I guess you don't see this in the future SW emulation because you include the -Wno-sign-compare flag here (is that still needed now?):
https://gitlab.cern.ch/ejclemen/cmssw_trackfinding_hlsframework/-/blob/df59dbcae854a3b7fff13a4c3a27e0e06ea1fe31/TrackFindingTrackletHLS/plugins/BuildFile.xml#L15

I think a better solution to both compiler errors is to just convert the template parameters in TrackBuilder.h to signed integers. I took the liberty of pushing this to your branch, while preserving your commit history so far. We'll see if the CI likes it, but let me know if this is acceptable to CMSSW.

@mlarson02
Copy link
Contributor Author

@mlarson02 This is strange, since -Werror=type-limits is one of the compiler flags added in #304:

set ::env(CCFLAG) "-Wall -Werror=address -Werror=array-bounds -Werror=conversion-null -Werror=delete-non-virtual-dtor -Werror=format -Werror=format-contains-nul -Werror=main -Werror=missing-braces -Werror=narrowing -Werror=overflow -Werror=overlength-strings -Werror=pointer-arith -Werror=reorder -Werror=return-local-addr -Werror=return-type -Werror=sign-compare -Werror=strict-aliasing -Werror=switch -Werror=type-limits -Werror=unused-but-set-variable -Werror=unused-value -Werror=unused-variable -Werror=write-strings -Wno-attributes -Wno-deprecated -Wno-ignored-qualifiers -Wno-long-long -Wno-non-template-friend -Wnon-virtual-dtor -Wno-psabi -Wno-unused-but-set-parameter -Wno-unused-label -Wno-unused-local-typedefs -Wno-unused-parameter -Wno-vla -Wparentheses -Wreturn-type -Wunused"

The only thing I can figure is that the older version of GCC that HLS uses isn't smart enough to trace the template parameters that are set to zero to the for-loops where they are compared with unsigned integers.
Anyway, you saw the compiler error in the CI that I was trying to work around when I originally made the loop counters unsigned. I guess you don't see this in the future SW emulation because you include the -Wno-sign-compare flag here (is that still needed now?): https://gitlab.cern.ch/ejclemen/cmssw_trackfinding_hlsframework/-/blob/df59dbcae854a3b7fff13a4c3a27e0e06ea1fe31/TrackFindingTrackletHLS/plugins/BuildFile.xml#L15

I think a better solution to both compiler errors is to just convert the template parameters in TrackBuilder.h to signed integers. I took the liberty of pushing this to your branch, while preserving your commit history so far. We'll see if the CI likes it, but let me know if this is acceptable to CMSSW.

@aehart Thanks for pushing those changes. Yeah, I don't think suppressing the HLS warnings as we are currently doing should still be necessary, I'm not sure why that was included in the BuildFile but I think ideally we wouldn't require it.

It seems using signed template parameters has resulted in the same error (ERROR: [SYNCHK 200-61] /home/gitlab-runner/builds/biTwR5NQ/3/cms-l1tk/firmware-hls/TrackletAlgorithm/TrackBuilder.h:314: unsupported memory access on variable 'diskStubWords.V' which is (or contains) an array with unknown size at compile time.) that I was trying to work around yesterday when casting the variables to signed in each required loop. Do you have any idea why this is caused and how we could possibly fix it without inducing the same compilation errors?

@tomalin tomalin self-requested a review December 19, 2023 18:26
@tomalin
Copy link
Contributor

tomalin commented Dec 19, 2023

The fix f5c736c , which I expected to work, seems to fail when running Vivado SIM on the chain https://gitlab.cern.ch/cms-l1tk/firmware-hls/-/commit/2777aa061e6c550b71c3c82c7f9573bababee1b3 , with error https://gitlab.cern.ch/cms-l1tk/firmware-hls/-/jobs/34825473 in TrackBuilder.h at this line https://gitlab.cern.ch/cms-l1tk/firmware-hls/-/blob/2777aa061e6c550b71c3c82c7f9573bababee1b3/TrackletAlgorithm/TrackBuilder.h#L314 . Apparently since array diskStubWords was declared with indefinate array size, it doesn't like the range of index j being unknown at compile time.
I suspect the solution is to replace in the "for" loop "j < int(NDiskStubs)" by "j < static_cast< int >(NDiskStubs)". (Though there's a chance you may need to have a separate line like "constexpr int myNum = static_cast< int >(NDiskStubs)" and then use myNum in the loop).

@aehart
Copy link
Contributor

aehart commented Dec 20, 2023

@mlarson02 @tomalin I think the reason the casting done in f5c736c and 51a8ec1 doesn't work is because HLS is somehow interpreting the bodies of loops before realizing the loops have zero iterations. This is very dumb, but it's my best guess as to why C-synthesis fails.

I just pushed a commit that seems to allow C-synthesis in these cases. Basically, I just added explicit checks to the loop conditions that the relevant template parameters be greater than zero. This avoids casting, and bafflingly, HLS accepts this in my testing. Let's see if the CI likes this…

@mlarson02
Copy link
Contributor Author

@mlarson02 @tomalin I think the reason the casting done in f5c736c and 51a8ec1 doesn't work is because HLS is somehow interpreting the bodies of loops before realizing the loops have zero iterations. This is very dumb, but it's my best guess as to why C-synthesis fails.

I just pushed a commit that seems to allow C-synthesis in these cases. Basically, I just added explicit checks to the loop conditions that the relevant template parameters be greater than zero. This avoids casting, and bafflingly, HLS accepts this in my testing. Let's see if the CI likes this…

@aehart Thanks for making these changes, the HLS CI and (when using this branch) Future SW CI are both passing. @tomalin Does this PR then look okay to be merged?

@tomalin
Copy link
Contributor

tomalin commented Dec 20, 2023

Yes, though please add a comment to the code explaining the weird logic, otherwise someone is bound to "simplify" it back again in future. And please raise the silly compiler option issue with the SW group.

@mlarson02
Copy link
Contributor Author

I just added comments where necessary and created a thread with the SW group here https://cms-talk.web.cern.ch/t/problems-caused-by-werror-type-limits-compiler-flag/32918.

Copy link
Contributor

@tomalin tomalin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved when CI passes

@tomalin tomalin merged commit 6c28b29 into master Dec 21, 2023
1 check passed
@aehart aehart deleted the FutureSWSignedFix branch April 16, 2024 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants