
ASC Q1 2023 Meeting

Thomas Naughton edited this page Jan 26, 2023 · 11 revisions

PMIx Standard Administrative Steering Committee (ASC) 1Q 2023 Meeting

Quick Links

  • Governance Document [latest]

Agenda (Finalized Jan. 23, 2023)

This meeting has a floating agenda with specific synchronization points to keep us on track. Rough time estimates are provided per agenda item, and the co-chairs plan to cover the topics in the order seen below. However, since some agenda items will take longer/shorter than anticipated, an exact start/end timing is not guaranteed, and some items may float to the second day. If you cannot attend the full meeting and are presenting, please let the co-chairs know, and we can plan accordingly.

Day 1: (10 am - 1 pm US Central Time)

Start End Topic
10:00 am 10:05 am Gathering (--)
10:05 am 10:10 am Roll Call (We will start roll call promptly at this time)
10:10 am 11:30 am Discussion of agenda items
11:30 am 11:45 am Break
11:45 am 1:00 pm Discussion of agenda items

Day 2: (10 am - 1 pm US Central Time)

Start End Topic
10:00 am 10:05 am Gathering (--)
10:05 am 11:30 am Discussion of agenda items
11:30 am 11:50 am Voting and Break Voting Link
11:50 am 12:30 pm Administrative and Working Group agenda items
12:30 pm 12:45 pm Technical and Use Case Presentation(s)
12:45 pm 1:00 pm Closing discussion and wrap up

Agenda Items

  • Governance PRs up for a Second Vote:
    • None
  • Governance PRs up for a Reading and First Vote:
    • None
  • PMIx Standard PRs up for a Reading (Provisional):
    • None
  • PMIx Standard PRs up for a Reading (Errata):
  • PMIx Standard PRs up for a Second Vote:
    • None
  • PMIx Standard PRs up for a Reading and First Vote:
    • None
  • Voting Link
  • DAY2 Voting Link
  • Plenary discussion items
  • Revision Exception Votes
    • None
  • Presentation of the v5.0 Standard Release Candidate for discussion (Ken/Dave)

Administrative and Working Group Agenda Items

  • Review quarterly meetings dates and plans
1Q 2023 - Virtual
  - Jan. 24 & 26 (10 am - 1 pm US Central)
2Q 2023 - Virtual
  - May 9 & 11 (10 am - 1 pm US Central)
3Q 2023 - Virtual
  - July 18 & 20 (10 am - 1 pm US Central)
4Q 2023 - Virtual
  - Oct. 17 & 19 (10 am - 1 pm US Central)
  • ASC Membership
    • Vote on new ASC Members
    • Call for new ASC Members
  • Release Planning
  • Working Group Updates (~ 10-15 minutes each)
    • Client Separation / Implementation Agnostic Document
    • Tools & Dynamic Workflows
    • Open Call for New Working Groups
  • Technical and Use Case presentations
    • Josh Hursey (IBM) "A separated model for running rootless, unprivileged PMIx-enabled HPC applications in Kubernetes" (Presented at CANOPIE-HPC)
  • Additional discussion items

Meeting Notes:

Attendance

Person Institution Day 1 Day 2
Josh Hursey IBM X X
Michael Karo Altair X
Ken Raffenetti ANL X X
Isaias Compres TUM X X
Ralph Castain Nanook X X
Brice Goglin INRIA X X
Dave Solt IBM X
Kathryn Mohror LLNL X X
Dominik Huber X
Norbert Eicker JSC X X
Thomas Naughton ORNL X X
Aurelien Bouteiller UTK X

Day 1: Jan. 24, 2023

  • Introductions
  • Reading: Add const to string parameters (Ken ~10 min)
    • https://github.com/pmix/pmix-standard/pull/430
    • No comments/concerns mentioned
    • Note: this is already in OpenPMIx, so it is just a matter of adding it to the Standard text
  • Plenary discussion
    • Publish/Lookup Chapter (Dave ~30 min)
      • https://github.com/pmix/pmix-standard/pull/398
      • TODO: add Dave’s slides
      • Brief summary: Motivation was to separate things to publish from attributes influencing the publishing.  Also, resolve some non-deterministic behavior when looking up on ranges.  Led to introducing new APIs for Publish/Lookup (PMIx_Publish_datastore/PMIx_Lookup_datastore)
      • The new publish datastore returns a unique publish_id, used to unpublish the specific item.
      • Q: Who can unpublish?
        • Generally any process with same userID can unpublish
        • So in theory the publish_id could be transferred to another process under the same userID, which could then unpublish it.  Attributes to further refine these semantics/limits could be added in the future, but are not being introduced now.
      • Note: Maybe this is getting overly complicated, possibly starting to look more like a database?
        • Intent was to keep as simple as possible
        • The reason for publish_id was to ensure proper specificity on what should be unpublished.
        • Q: How did we get to multiple publishes for the same key?
        • Once publishing is done on ranges+key, it gets more complicated, so the design just allows publishing multiple times
        • Maybe just remove the “realm” stuff and every key must be unique.  And make the key be the unique part.
        • Trying to get at the exact need for this functionality and the minimum needed to accomplish it
        • Example: Currently mainly used in Open MPI for rendezvous for connect/accept.  And goal is to remove that method in future.
        • From past notes: Multiple processes need to be able to publish the same key.
        • Having ability to publish multiple keys also removes the requirement that the publisher check on “uniqueness” of the key.
        • Discussion continued to review other points raised during design/requirements review…
        • If adding complexity, do so intentionally to address clear use/need
        • Review current APIs and datastructures
        • Note: pmix_pdsdata_t contains the key now so have symmetry in blocking and non-blocking paths, also the publish_id contains the epoch
        • In progress: the epoch (pmix_epoch_t) is an increasing number used for comparison and possibly ordering.  Ordering constraints are weaker across different processes (i.e., two events may carry the same time, so one only knows they happened at the same time) and stronger within the same process.
        • The question of implementation was discussed; generally it seems that an implementation would be needed before voting.  It is unclear who has time/resources for the implementation, which is a point for discussion within the working group.  Note: Ralph will not have time to do this implementation.
        • See also notes on PR https://github.com/pmix/pmix-standard/pull/398
  • PMIx v5.0 presentation
    • Note: for voting, will have two votes: v5 release and errata
    • TODO: add Ken slides
    • First major release under new governance procedures
    • Using a time based release target (in contrast to feature based)
    • V5 Additions
      • Use-case WG additions (business card, debugging, hybrid prog models, cross-version)
      • Implementation Agnostic WG (return codes, rework ch1-2, ch5-8, ABI (ABI partially in pmix-4.2))
      • Storage WG (added in pmix-4.2 also)
      • See Ken’s slide for detailed changelog
      • Note: the changelog misses the macros that were converted to functions
      • Need to double check items in the Exception file that may be missing from the standard, namely the macros-to-function items.
      • Procedurally: need to check before Thursday, and use a revision exception if the fix is only a bullet in the changelog.  If items are missing from the standard (PRs), ratification may need to be delayed.
      • TODO: check on exception files and checker script and see if items missed before Thursday (Day2)
  • Next meetings
    • Longer gap until Q2 meeting in May
    • Q3 meeting in …
  • PMIx v4.2 release
    • Plan is to have a Q2 release
    • Quite a number of PRs have piled up since the last release
  • PMIx v5.0 release
    • Identified some missing items (e.g., macros-to-functions) during this meeting and would like to discuss next steps until that list of items is complete.
    • Ignore the current v5 voting item (links already posted); the vote will be held on Thursday (Day 2).
  • Working Group updates
    • Client Separation / Implementation Agnostic Doc WG (Dave)
      • See items discussed earlier this meeting
      • Other work has been on ABI
    • Tools & Dynamic Workflow WG (Isaias)
      • Resumed meetings last week
      • Will describe a use-case to discuss potential race conditions
      • Brainstorming on more but not yet ready for presentation
    • Call for New Working Groups
      • Drift of the library away from the standard (Ralph)
        • more than 200 functions right now
      • Should there be a WG keeping an eye on this?
        • Might be part of the Implementation Agnostic WG? (Ken)
  • Voting link
  • Technical Presentation
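
The Publish/Lookup discussion above (PR 398) can be illustrated with a small in-memory model. This is only a sketch of the semantics as recorded in these notes, not the proposed PMIx API. The class and function names (`PublishId`, `ToyDatastore`, `published_before`) are hypothetical, and treating the epoch as a per-publisher counter is an assumption made for illustration. It shows three points from the discussion: the same key may be published more than once, each publish returns a unique id that is later used to unpublish exactly that item, and any caller with the same userID may unpublish.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishId:
    """Hypothetical stand-in for the publish_id discussed in the notes."""
    publisher: str  # process that published the item (assumed field)
    epoch: int      # increasing per-publisher counter (assumed meaning)


class ToyDatastore:
    """In-memory model only; not the PMIx API or its implementation."""

    def __init__(self):
        self._items = {}  # PublishId -> (uid, key, value)
        self._next = {}   # publisher -> next epoch to hand out

    def publish(self, publisher, uid, key, value):
        # The same key may be published many times; each call gets its own id.
        epoch = self._next.get(publisher, 0)
        self._next[publisher] = epoch + 1
        pid = PublishId(publisher, epoch)
        self._items[pid] = (uid, key, value)
        return pid

    def lookup(self, key):
        # Return every live item for the key, paired with its id.
        return [(pid, v) for pid, (_, k, v) in self._items.items() if k == key]

    def unpublish(self, uid, pid):
        # Any caller with the publisher's userID may remove the item.
        owner_uid, _, _ = self._items[pid]
        if uid != owner_uid:
            raise PermissionError("userID does not match the publisher's")
        del self._items[pid]


def published_before(a, b):
    # Strong ordering only within one publisher; None means "not comparable".
    if a.publisher != b.publisher:
        return None
    return a.epoch < b.epoch
```

In this model two ranks can both publish the key "port", a lookup returns both entries, and unpublishing with one id leaves the other entry in place. Returning `None` for ids from different publishers reflects the weaker cross-process ordering mentioned in the notes.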

Day 2: Jan. 26, 2023

  • Agenda

    • V5.0 release candidate revision exception discussion
    • V5.0 errata discussion
    • V5.0 release candidate promotion vote https://www.surveymonkey.com/r/9WJDT87
    • WG on Open PMIx-Standard parity discussion
    • Open Discussion
  • Review v5.0 rc revision and vote

    • Updated changelog for revision history (since Day1 mtg)
    • Macro to function changes – not mentioned in 5.0 history b/c mostly backported to v4.2 (shows in that history/changelog), which is part of the v5.0 standard
    • There are more macro-to-func changes since creating v4.2 & v5.0 branches
    • Note: Seems too large a diff to try and bring these changes in as revision exception change.  Suggestion is to release as-is and then have subsequent release, not rushed, to bring in the other changes.  Will bring this up for vote.
    • Q: Regarding what would be voted up today, are the macro to function items reflected in changelog?
      • The macros that were converted are listed in the v4.2 changelog, but not called out explicitly in the v5.0 changelog.  Not sure if worth restating for v5.0.
      • The v4.2 is not released yet.
      • The v5.0 would include the v4.2 changelog w/ release date as TBD.  A little odd b/c of the overlapping changelog, but the changelog (text) would be present.
      • We could later add an errata to v5.0 to update the v4.2 release date when it clears/releases.
    • Q: If we release v5.0, should we create a new maintenance branch in anticipation of subsequent changes (e.g., errata w/ changelog update for v4.2)?
      • Not sure we have enough changes for a full major release, so could continue maintaining on the main branch and only branch when we think we have nearly enough to warrant a v6.0 (next major).
    • Note - little odd having v4.2 not released yet.
    • Note - could delay v5.0 a quarter to resolve that and avoid confusion.
    • One way to resolve could be to back out v4.2 change w/ macro2func, and release v4.2 w/o the macro2func patch and put that patch into v5.0.  Also, could then bring all macro2func into a single patch and bring it in with v5.0 cleanly.
    • Procedurally that just slips things 1 quarter.
    • Note - could possibly vote on v5.0 with a provisional change / changelog change to move the v4.2 changelog into v5.0.
    • Options
      • Back changes out of v4.2 (no longer in v4.2)
      • Exclude from v5.0 changelog
      • Back out changes and bring all macro-to-function changes in as v5.1
    • Proposed: Remove v4.2, Keep v5.0 and update changelog accordingly (vote for v5.0 w/ functions and changelog fixup)
    • Note: Not putting stake in ground yet w/ v5.0 on ABI, just have elements to move toward that and include reference to PMIx ABI repo (versioned w/ PMIx standard)
    • Note - the removal of macros was guided by need for ABI and wanting to be able to dlopen() and have functions (instead of macros).  So having macros present, deprecated, still does not really help with the ABI.
    • Discussion about accessing items with headers in the PMIx ABI repo.
    • Note - there is text in the standard about the linker/build ABI distinctions for things related to macros.  With a linking ABI the macros are not usable, whereas with a build ABI they are.  Moving away from macros toward functions is a move toward a “linking ABI”, but the text in the standard helps to clarify what sort of expectations you can have of the ABI, i.e., a “build ABI”.
    • Suggestion - include in the changelog the From/To, e.g., “Changes from v4.1 to v4.2” and “Changes from v4.1 to v5.0”.  Clarifies the lineage in addition to the date stamp
    • Suggestion to move text (C.10.2) to (C.11.x) and add the “From/To” versions.
    • We will make a new PR for this movement/edit and then vote.  If anyone votes NO, the vote fails and the item moves to next quarter
    • Brief summary for the revision exception vote
      • Remove v4.2 section (C.10)
      • Move items from C.10 to C.11
      • Update release date of v5.0 in revision history
      • Add v5.0 section text to start, “...v5.0 release includes changes from v4.1..”
  • WG on Open PMIx-Standard parity discussion

    • Discussion about maintaining sync between OpenPMIx and PMIx Standard
    • WG might help to resolve current backlog
    • A point about succession plans was also raised; maybe the WG could help with that planning too.  It was noted that this is more of a code/implementation issue, so it may be separate
    • Discussion about namespace for extensions in implementation that are not yet in official standard, e.g., OMPI extensions only supported by Open MPI implementation, but no guarantees from official MPI standard.  “Brand lockin” 🙂
    • Note - need the tools to enumerate and triage the differences
    • There are some checker scripts that can help triage things and file issues to bring things over.  The technical parts/machinery are mostly there, but someone needs to be in charge of the actual triage.
    • Maybe we can roll this into the monthly meeting & not need a separate WG meeting.
    • Folks are volunteering to help out; mainly they will need time to get the procedures in place.  Note: Ralph mentioned that he can help provide input/questions on some write-up as needed, but does not have time for pushing things through the full process/reading/voting/etc.
    • For now plan to do the checks on these during the monthly PMIx meetings
  • Voting

    • Yes with edits for v5.0 release
  • Open Discussion

    • Question about IAWG / time commitment, maybe check back next month
  • Next monthly meeting: Feb 9

  • Next quarterly meeting: May 9
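
The Day 2 parity discussion notes the need for tooling to enumerate and triage differences between OpenPMIx and the PMIx Standard. As a rough illustration of what such an enumeration step might look like, here is a minimal sketch. It does not reproduce the existing checker scripts; the function name `triage_symbols` and the symbol `PMIx_Example_extension` are made up for the example, and real inputs would have to be extracted from the library headers and the standard text.

```python
def triage_symbols(implemented, standardized):
    """Split two symbol lists into the differences that need triage."""
    impl, std = set(implemented), set(standardized)
    return {
        "only_in_implementation": sorted(impl - std),
        "only_in_standard": sorted(std - impl),
    }


# Illustrative inputs; PMIx_Example_extension is a made-up name.
report = triage_symbols(
    implemented=["PMIx_Init", "PMIx_Finalize", "PMIx_Get", "PMIx_Example_extension"],
    standardized=["PMIx_Init", "PMIx_Finalize", "PMIx_Get"],
)
```

Each entry under `only_in_implementation` would be a candidate for an issue against the standard, which someone still has to review by hand, as the notes point out.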
