
WeeklyTelcon_20171030

Geoffrey Paulsen edited this page Jan 9, 2018 · 1 revision

Open MPI Weekly Telcon


  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees

  • Jeff Squyres
  • Geoff Paulsen (IBM)
  • Brian
  • Edgar Gabriel
  • Todd Kordenbrock
  • Ralph
  • Howard Pritchard
  • Josh Hursey
  • Mohan
  • Nathan Hjelm

Agenda

Review v2.0.x Milestones v2.0.4

Review v2.x Milestones v2.1.2

  • v2.1.3 (unscheduled, but probably Jan 19, 2018)
    • PR4172 - a mix between feature / bugfix.
  • Are we going to do anything for v2.x for hwloc 2?
    • At least add a configure error if it detects hwloc v2.x
  • Still some PRs going in.
  • Not paying attention here, but will have a callback

Review v3.0.x Milestones v3.0

  • v3.0.1 (Shoot for end of week for an RC. Want it out before Supercomputing).
  • Delayed to end of next week
  • Ralph did some more investigation, and found that dstore backwards compatibility would work with OMPI v3.0, just not with the OMPI v2.0 stream
    • This revelation meets Open MPI's backwards compatibility commitments.
    • Need to start testing v3.1 in MTT.
  • v3.1.x -
    • DONE - Roll hwloc back to 1.11.7 on v3.1.x branch (Ralph put together, Brian reviews)
  • Schedule - No point in rolling RC until more people turn on v3.1 testing.
    • Outlook - Probably will not get out by Supercomputing. :(
  • Brian will send out requests to start testing v3.1
  • Add v3.1 to MTT tests
    • Database is active now to accept v3.1 tests.
    • MTT disk full issue has been resolved.

Review Master Pull Requests

  • Master version is currently v4.0, but that's an artifact of the datatype stuff that ended up getting pulled into v3.0, so it can be made to be v3.2
  • Cisco and possibly others will begin testing v3.1 despite v2.x not quite being out.
  • Many failures at Cisco due to the lack of the --oversubscribe flag with MPI_Comm_spawn

MTT / Jenkins Testing Dev

  • Mellanox Jenkins can't write RPMs. Email sent to Artem.
  • Josh will inform George that IBM's fix does not fix the problem

This Week's Discussion Points

  • Website - openmpi.org
    • Brian is trying to make things more automated, so that one can check out the repo, etc. Repo is TOO large.
    • Majority of the problem is the tarballs, and those are already being stored in S3.

Oldest PR

Oldest Issue

Next face-to-face meeting

  • Jan / Feb
  • Possible locations: San Jose, Portland, Albuquerque, Dallas
  • Discuss what to do about partners' broken CI pieces.
  • Big section on going through old issues and old PRs.

Status Updates:

Status Update Rotation

  1. Mellanox, Sandia, Intel
  2. LANL, Houston, IBM, Fujitsu
  3. Amazon
  4. Cisco, ORNL, UTK, NVIDIA

Back to 2017 WeeklyTelcon-2017
