
WeeklyTelcon_20200602

Geoffrey Paulsen edited this page Jun 2, 2020 · 1 revision

Open MPI Weekly Telecon

  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

  • Jeff Squyres (Cisco)
  • Austen Lauria (IBM)
  • Aurelien Bouteiller (UTK)
  • Barrett, Brian (AWS)
  • Brendan Cunningham (Intel)
  • Edgar Gabriel (UH)
  • George Bosilca (UTK)
  • Harumi Kuno (HPE)
  • Howard Pritchard (LANL)
  • Joseph Schuchart
  • Joshua Ladd (nVidia/Mellanox)
  • Matthew Dosanjh (Sandia)
  • Michael Heinz (Intel)
  • Naughton III, Thomas (ORNL)
  • Ralph Castain (Intel)
  • Todd Kordenbrock (Sandia)
  • Geoffrey Paulsen (IBM)
  • William Zhang

not there today (I keep this for easy cut-n-paste for future notes)

  • David Bernhold (ORNL)
  • Josh Hursey (IBM)
  • William Zhang (AWS)
  • Akshay Venkatesh (NVIDIA)
  • Artem Polyakov (nVidia/Mellanox)
  • Brandon Yates (Intel)
  • Charles Shereda (LLNL)
  • Erik Zeiske
  • Geoffroy Vallee (ARM)
  • Mark Allen (IBM)
  • Matias Cabral (Intel)
  • Nathan Hjelm (Google)
  • Noah Evans (Sandia)
  • Scott Breyer (Sandia?)
  • Shintaro Iwasaki
  • Xin Zhao (nVidia/Mellanox)
  • mohan (AWS)

New

  • nothing new.

Release Branches

Review v4.0.x Milestones v4.0.4

  • Will we want a v4.0.4rc3?
    • Took a low-risk bugfix after rc2.
    • Do we do an rc3 just for this low-risk fix?
    • Maybe do an rc3 with the expectation that, if no issues surface within 24 hours, we'll ship the release.
  • v4.0.4rc2 - Announced this week. Asking people to test it and send feedback.
  • Given the v4.1 plan (see below), we'd like to release v4.0.4 before next Tuesday.

Review v5.0.0 Milestones v5.0.0

  • PMIX

    • New Blocker Found:
    • generate PPN scaling issue - simple algorithmic issue in this function
  • Schedule:

    • PRRTE
      • Intel is moving PRRTE to low priority.
      • Intel needs it by Q4, but end of summer at the latest.
      • It works with a tiny option set.
      • Many feature requests. Many CI options.
    • PRRTE changes should not block OMPI from branching, as it's a simple submodule update.
  • We went through a number of PRRTE defects, listed on v5.0 spreadsheet.

    • Can't ship OMPI v5.0 by the end of July.
    • Amazon needs a release to deliver multi-nic OFI features
      • Also needs collectives components updates.
      • Maybe tuning update PR, maybe more?
      • Maybe George's new collectives component on v4.1
    • This would open it up to other stakeholders who need a feature (waiting on v5.x for feature)
      • v5.0 probably won't make end of summer. If you're waiting on v5.0, would a v4.1 be acceptable?
      • a v4.1 release, couldn't
      • If the branching is delayed significantly, can revisit some of the v5.0 feature list.
  • What if we try to do a more minimal v5.0 that addresses Amazon's needs?

    • Worried that a bad PRRTE will badly damage the Open MPI brand.

Discuss v4.1

  • What would be release target for v4.1?
    • AWS would push for end of June release.
    • Let's just say we're going to do this.
  • NOT touching runtime!!!
  • Anyone have objections to v4.1?
    • Send out email on devel-core.
    • Branch from v4.0.4
  • AWS can be release manager for v4.1
    • Need a 2nd release manager, discuss next week.
    • Get list of things people are trying to push in.
    • Plan on committing
  • Time to branch: by next Tuesday.
  • Brian will email devel-core with summary of discussion.

master

Face to face

  • No discussion since COVID-19

Infrastructure

  • Scale testing: PRs have to opt into it.

Review Master Master Pull Requests

CI status


Dependencies

PMIx Update

ORTE/PRRTE

MTT


Back to 2020 WeeklyTelcon-2020
