WeeklyTelcon_20210831

Geoffrey Paulsen edited this page Aug 31, 2021 · 1 revision

Open MPI Weekly Telecon

Attendees (on Web-ex)

  • Austen Lauria (IBM)
  • Brendan Cunningham (Cornelis Networks)
  • Geoffrey Paulsen (IBM)
  • George Bosilca (UTK)
  • Harumi Kuno (HPE)
  • Hessam Mirsadeghi (NVIDIA)
  • Jeff Squyres (Cisco)
  • Joseph Schuchart (HLRS)
  • Josh Hursey (IBM)
  • Marisa Roman (Cornelis Networks)
  • Matthew Dosanjh (Sandia)
  • Michael Heinz (Cornelis Networks)
  • Sam Gutierrez (LANL)
  • Siripaul (Intel)
  • Thomas Naughton (ORNL)
  • Todd Kordenbrock (Sandia)
  • Tomislav Janjusic (NVIDIA)
  • William Zhang (AWS)

not there today (I keep this for easy cut-n-paste for future notes)

  • Akshay Venkatesh (NVIDIA)
  • Artem Polyakov (NVIDIA)
  • Aurelien Bouteiller (UTK)
  • Brandon Yates (Intel)
  • Brian Barrett (AWS)
  • Charles Shereda (LLNL)
  • Christoph Niethammer (HLRS)
  • David Bernholdt (ORNL)
  • Edgar Gabriel (UH)
  • Erik Zeiske (HPE)
  • Geoffroy Vallee (ARM)
  • Howard Pritchard (LANL)
  • Joshua Ladd (NVIDIA)
  • Mark Allen (IBM)
  • Matias Cabral (Intel)
  • Nathan Hjelm (Google)
  • Noah Evans (Sandia)
  • Raghu Raja
  • Ralph Castain (Intel)
  • Scott Breyer (Sandia?)
  • Shintaro Iwasaki
  • Xin Zhao (NVIDIA)

New Topics For Today

  • New Person Trirage introduced himself.
  • MPI Thread Level Discussion
    • Two ways to initialize MPI: MPI_Init, and MPI_Init_thread
    • Years ago we added an env var check to MPI_Init so that the requested threading level can be set from the environment, without having to modify tests.
    • This env var isn't documented anywhere. Perhaps intended as an internal-only flag.
    • Intentionally did not put this into MPI_Init_thread, because there the caller explicitly asks for a thread level, so no env var check is done.
    • Recently in master, merged a commit to check this env var in MPI_Init_thread.
      • This commit also got merged to the release branches (v4.0.x and v4.1.x).
    • What to do for v4.0.x and v4.1.x
      • PR to revert in v4.0.x
    • In v5.0.x and master - Issue 9332
      • Do we want to make this uniform how
    • If we want to expose this to users, and not just as a magic/backdoor setting, it should probably be an MCA parameter.
    • Discussion
      • Would be nice for us and for the testing community.
      • Nice to make it universal.
      • Do we want to make it an MCA parameter?
      • If it is an MCA parameter, move the check down to the ompi_mpi_init level.
        • Who do we talk to about Sessions [PR9097]?
    • If we do this uniformly, should just check with Session and make sure it's uniform.
      • After mca system is standardized, but before mpi is initialized.
      • Is this a standardized arg to mpiexec? George will check.
    • On v4.0 and v4.1
      • Jeff thinks it's a change in behavior (this var is set, and might get)
      • Seen it multiple times where people.
      • What if we use a different env var?
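
The behavior change under discussion can be sketched without MPI itself. The helper below is hypothetical and is not Open MPI code: `resolve_thread_level`, the `honor_env` flag, and the `LEVEL_*` constants are illustrative names, and the notes do not name the actual environment variable. It only models the idea that an init entry point either consults an override from the environment or uses the level the caller asked for.

```c
#include <stdlib.h>

/* Stand-ins for the MPI thread levels, in increasing order of support
   (single < funneled < serialized < multiple). Values are illustrative. */
enum thread_level {
    LEVEL_SINGLE = 0,
    LEVEL_FUNNELED = 1,
    LEVEL_SERIALIZED = 2,
    LEVEL_MULTIPLE = 3
};

/* Hypothetical helper: pick the thread level an init call should request.
   requested  - level asked for by the caller
   env_value  - text of the override variable, or NULL if it is unset
   honor_env  - nonzero if this entry point consults the override */
static int resolve_thread_level(int requested, const char *env_value,
                                int honor_env)
{
    if (!honor_env || env_value == NULL || *env_value == '\0') {
        return requested;
    }

    char *end = NULL;
    long parsed = strtol(env_value, &end, 10);

    /* Ignore malformed or out-of-range overrides. */
    if (*end != '\0' || parsed < LEVEL_SINGLE || parsed > LEVEL_MULTIPLE) {
        return requested;
    }
    return (int)parsed;
}
```

In this model, MPI_Init corresponds to `honor_env = 1` with a default request, and the original MPI_Init_thread corresponds to `honor_env = 0`. The merged commit effectively flips MPI_Init_thread to `honor_env = 1`, so an application that explicitly requested one level can end up with another whenever the variable happens to be set, which is why it reads as a behavior change on the release branches.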

v4.0.x

  • Schedule: milestone is set for September for 4.0.7
  • Howard is gone

v4.1.x

  • Jeff will look at v4.1.2 RC shortly.
  • Schedule: milestone is currently August for 4.1.2 (accumulated bugfixes).
    • A whole pile of PRs. Austen is testing a few to verify
    • William has a PR he needs to backport today

v5.0.x

  • Went over the GitHub Project of [critical v5.0.x issues|https://github.com/open-mpi/ompi/projects/3]
    • some progress on issues
    • Described the approach of releasing rc1 on Sept 24, disabling any functionality that is a blocker so that the rc can go out.
      • Worried that blockers might not be fixed in time, so will put in code to issue an error at runtime to prevent getting into those paths, and document it heavily.
      • No discussion, so we're going forward with that.
  • MPI_Alltoallw needs to go in. There is a PR from Giles George
  • Janjust - has a long outstanding one.
  • Portals bugfixes incoming.
  • https://github.com/open-mpi/ompi/pull/9326 should get into 5.0 too
  • RMs will put in a disablement PR to prevent going down Issue 7830.
    • If others can verify that this is an issue for them, perhaps it's site-local.
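
The "error at runtime" approach for blocked functionality could look roughly like the sketch below. It is a minimal illustration under assumptions, not the planned disablement PR: `struct feature_gate`, `check_feature`, and the return codes are hypothetical names, and how Open MPI would actually report the error is not stated in these notes.

```c
#include <stdio.h>

#define FEATURE_OK        0
#define FEATURE_DISABLED (-1)

/* Hypothetical descriptor for a code path that may be switched off
   for a release candidate. */
struct feature_gate {
    const char *name;    /* human-readable feature name */
    int disabled;        /* nonzero if the path must not be entered */
    const char *reason;  /* short explanation shown to the user */
};

/* Return FEATURE_OK if the path may be entered; otherwise log a clear
   message and return FEATURE_DISABLED so the caller can bail out early. */
static int check_feature(const struct feature_gate *gate, FILE *log)
{
    if (gate->disabled) {
        fprintf(log, "%s is disabled in this release candidate: %s\n",
                gate->name, gate->reason);
        return FEATURE_DISABLED;
    }
    return FEATURE_OK;
}
```

A caller would check the gate at the top of the affected entry point and return an error before touching the known-bad path, which keeps the failure explicit and easy to document.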

Master

  • No discussion
  • MTT results look pretty good
  • libevent - could only replicate this in his environment.

Documentation

  • No update
  • Don't do the old system, use this new system for v5.0.0

MPI 4.0 API

  • No discussion. [Open MPI 4.0 API Compliance GitHub Project|https://github.com/open-mpi/ompi/projects/2]

MTT

  • Looking okay.
  • Cisco's results are still hidden by default.

Longer Term discussions

  • No discussion.