num_contexts patch #1122

Merged
merged 6 commits into from
May 1, 2024

davidozog
Member

@davidozog davidozog commented Apr 30, 2024

This resolves the issue discovered by @ronawho regarding the unit test shmem_team_get_config, reported here:
openshmem-org/tests-sos#37

This is just a temporary patch to better align with the API spec. SOS still effectively ignores the num_contexts hint, so #1019 is not quite addressed.

This also partially addresses #1019

@davidozog changed the title from "Pr/num contexts patch" to "numcontexts patch" on Apr 30, 2024
@davidozog changed the title from "numcontexts patch" to "num_contexts patch" on Apr 30, 2024
@davidozog added this to the v1.5.3 milestone on Apr 30, 2024
@davidozog self-assigned this on Apr 30, 2024
Collaborator

@bcmIntc bcmIntc left a comment

Assuming Wasi's question is addressed, looks good!

Collaborator

@parkerha1 parkerha1 left a comment

LGTM

@davidozog davidozog merged commit e3ba48e into Sandia-OpenSHMEM:main May 1, 2024
35 checks passed
Comment on lines +339 to +342
if (config != NULL) {
    RAISE_WARN_MSG("%s %s\n", "team_split_strided operation encountered an unexpected",
                   "non-NULL config structure passed with a config_mask of 0.");
}
Collaborator

@davidozog - @dalcinl and I disagree. 🙂 We started getting warnings in shmem4py. We checked the specification and we do not see it saying config should be NULL if config_mask == 0. In fact, we are not even sure if a NULL pointer is allowed.

Collaborator

> In fact, we are not even sure if a NULL pointer is allowed.

Here, Marcin is talking about the standard: there is no explicit wording saying that if the config_mask is 0, then config may be NULL. Or did we miss something? What if some implementation decides to add a check assert(config != NULL) irrespective of the value of config_mask? Would such behavior contradict the 1.5 standard?
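One way a caller such as a language binding could sidestep this ambiguity is to always pass a valid, zero-initialized config structure, even when config_mask is 0. The sketch below only illustrates that idea and does not use the OpenSHMEM API: `demo_team_config_t`, `demo_strict_split`, and `demo_portable_split` are hypothetical stand-ins, and `demo_strict_split` models the hypothetical strict implementation from the question above, one that rejects a NULL config regardless of the mask.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the team config structure. */
typedef struct { int num_contexts; } demo_team_config_t;

/* Models a hypothetical strict implementation that rejects a NULL config
 * regardless of config_mask. Returns 0 on success, -1 on failure. */
int demo_strict_split(const demo_team_config_t *config, long config_mask)
{
    (void)config_mask; /* the strict check does not look at the mask */
    return (config == NULL) ? -1 : 0;
}

/* Caller-side wrapper: when the caller has no config and the mask is 0,
 * substitute a zeroed dummy so that even a strict implementation accepts
 * the call. A NULL config with a nonzero mask is passed through unchanged. */
int demo_portable_split(const demo_team_config_t *config, long config_mask)
{
    demo_team_config_t dummy;
    memset(&dummy, 0, sizeof dummy);
    if (config == NULL && config_mask == 0) {
        config = &dummy;
    }
    return demo_strict_split(config, config_mask);
}
```

Note that a call of this shape (non-NULL config with a config_mask of 0) is exactly what triggers the warning added in this patch, which is why the warning is being questioned here.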

Member Author

@davidozog davidozog Jun 24, 2024

@mrogowski @dalcinl I agree there is a "blind spot" in this section of the standard right now (v1.5), and I have a note to try to fix it for the upcoming new version (v1.6). I would prefer to add the following statement to the v1.6 standard (do you think it's sufficient?):

If \VAR{config} is a null pointer, then \VAR{config_mask} must be 0, otherwise
the behavior is undefined.

Another option is to simply prohibit a null config pointer, but my hunch is that's a bit more restrictive than what was intended for v1.5, but I'm open to it!

We also might want something like this for improved clarity:

If \VAR{config_mask} is 0, then `shmem_team_get_config` performs no operation.

So if config_mask is 0 and config is non-null, then that's perfectly fine. Given the state of OpenSHMEM v1.5, we opted to include an SOS warning in this special case, but I think we could also move it to "DEBUG" output or simply remove it altogether - I'd prefer to remove it myself, especially if the statements above are added to OpenSHMEM v1.6.
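The rules proposed above can be sketched as a small argument check. This is an illustration only, not SOS code: `sketch_team_config_t`, `SKETCH_TEAM_NUM_CONTEXTS`, and `sketch_check_config` are hypothetical names, and reporting the undefined case as an error is just one choice an implementation could make.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the team config type and one mask bit. */
typedef struct { int num_contexts; } sketch_team_config_t;
#define SKETCH_TEAM_NUM_CONTEXTS (1L << 0)

typedef enum {
    SKETCH_CONFIG_IGNORED = 0, /* mask is 0: config is never read           */
    SKETCH_CONFIG_USED    = 1, /* mask is nonzero and config is non-NULL    */
    SKETCH_CONFIG_INVALID = 2  /* mask is nonzero but config is NULL        */
} sketch_config_status_t;

/* Classifies a (config, config_mask) pair under the proposed wording. */
sketch_config_status_t
sketch_check_config(const sketch_team_config_t *config, long config_mask)
{
    if (config_mask == 0) {
        /* No fields are requested, so NULL and non-NULL are both fine. */
        return SKETCH_CONFIG_IGNORED;
    }
    if (config == NULL) {
        /* Undefined behavior under the proposed text; flagged here. */
        return SKETCH_CONFIG_INVALID;
    }
    return SKETCH_CONFIG_USED;
}
```

With this shape, a non-NULL config paired with a mask of 0 is accepted silently, with no warning path at all.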

Any preferences from @wrrobin and @stewartl318?

Collaborator

I agree with your previous suggestions with a small addition for maximum clarity:

 If \VAR{config_mask} is 0, then `shmem_team_get_config` performs no action and \VAR{config} may be `NULL`.

> but I think we could also move it to "DEBUG" output or simply remove it altogether

I guess that would be good enough. Extra points if you guys ever allow for these warnings as an opt-in via some environment variable.

> Another option is to simply prohibit a null config pointer, but my hunch is that's a bit more restrictive than what was intended for v1.5, but I'm open to it!

Indeed, there is little point in such a restriction. Moreover, it is kind of a backward-incompatible change.

Collaborator

I think the Internet robustness principle states it well: be conservative in what you do, be liberal in what you accept from others.

I think there is no real value in any sort of debug or error message when mask is 0 but config is non-null.

Member Author

@dalcinl @stewartl318 - Thanks for the input!

> Extra points if you guys ever allow for these warnings as an opt-in via some environment variable.

This sounds like what I meant by "DEBUG" message above. The DEBUG_MSG macro in SOS will only print to stderr if the SHMEM_DEBUG environment variable is set.
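As a rough illustration of that opt-in behavior, here is a minimal sketch of a message helper gated on an environment variable. It is not the SOS `DEBUG_MSG` macro: `sketch_debug_enabled` and `sketch_debug_msg` are hypothetical names, and treating any value of `SHMEM_DEBUG` (including an empty one) as "set" is an assumption.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if the SHMEM_DEBUG environment variable is present, else 0.
 * Assumption: presence alone enables output; the value is not inspected. */
int sketch_debug_enabled(void)
{
    return getenv("SHMEM_DEBUG") != NULL;
}

/* printf-style helper that writes to stderr only when debugging is
 * enabled. Returns 1 if the message was printed, 0 if it was suppressed. */
int sketch_debug_msg(const char *fmt, ...)
{
    va_list ap;

    if (!sketch_debug_enabled()) {
        return 0;
    }
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    return 1;
}
```

A call such as `sketch_debug_msg("non-NULL config passed with a config_mask of 0\n")` would then stay silent by default and appear only for users who opt in.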

> I think there is no real value in any sort of debug or error message when mask is 0 but config is non-null.

I tend to agree... but maybe the debug message is a good compromise since the spec is pretty under-defined for this special case. Let's move the discussion to PR #1138, which proposes changing this to a debug message.

8 participants