Chapter 03: Understanding Open Source Governance

Learner personas

  • code contributor
  • code-adjacent contributor
  • manager/stakeholder

Pre-requisites

  • Chapter 01: Introduction To Open Source
  • Chapter 02: Types of Open Source Software

At its core, governance is how an open source community aligns on who makes decisions and how those decisions are made, including combinations of the two, such as how decisions are made about who makes the decisions. Governance models answer these questions by defining the roles in a project and the powers each role holds.

Two questions shown side-by-side: who makes decisions, and how decisions are made; both point to themselves and each other showing the various permutations.

Learning Objectives 🧠

Building on the governance approaches discussed in the previous chapter, this chapter covers:

  • Concrete governance models used by the OSS community
  • How to navigate the governance structures and hierarchies in OSS projects
  • How napari fits into the PyData, Python, and broader open source software ecosystem

Governance models ⚖️

Let's start by looking at four common ways to define who makes decisions in an OSS project. Keep in mind that projects usually adopt a combination of two or more of the models described here.

Benevolent Dictator for Life (BDFL)

The term BDFL was popularized by Guido van Rossum, creator of the Python programming language[1]. In this model, the creator of an open source project takes the title of "BDFL" and has the final say in all decisions for the project. Examples of projects with BDFLs are the Linux kernel, pandas, and SciPy.

Note: Even in projects with a BDFL, the BDFL rarely makes decisions alone; additional teams or advisors share the responsibility.

Core team

In this model, a group of sustainers[2] acts as the project leaders and final decision-makers. The group may use internal processes like a majority vote or a consensus-based approach to make decisions. Projects with this governance model have pathways for community members to:

  • start contributing,
  • become regular contributors with more privileges, like merge rights, and
  • eventually join the core team, typically through a community nomination or self-nomination followed by discussion and approval by the core team.

In the PyData ecosystem, the pandas and Bokeh projects have a core team governance structure.
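
The two internal processes mentioned for this model, a majority vote and a consensus-based approach, can be contrasted in a few lines of Python. This is only an illustrative sketch: the function names, the "+1"/"0"/"-1" vote encoding, and the rule that a single objection blocks consensus are assumptions made for this example, not the rules of any particular project.

```python
from collections import Counter

# Hypothetical vote encoding for this sketch:
# "+1" approve, "0" abstain, "-1" object.


def majority_vote(votes):
    """Pass if strictly more than half of the non-abstaining votes are "+1"."""
    counts = Counter(votes)
    cast = counts["+1"] + counts["-1"]
    return cast > 0 and counts["+1"] > cast / 2


def consensus(votes):
    """Pass if at least one member approves and nobody objects."""
    counts = Counter(votes)
    return counts["+1"] > 0 and counts["-1"] == 0


votes = ["+1", "+1", "0", "-1"]
print(majority_vote(votes))  # True: 2 of the 3 cast votes approve
print(consensus(votes))      # False: one objection blocks consensus
```

The same set of votes passes under one rule and fails under the other, which is why a project's governance document usually states which rule applies to which kind of decision.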

Elected council

Like the core team, the elected council model also involves a team of sustainers taking on project leadership. The main difference is that they are elected by the community, and council members serve a fixed term. The Python and Kubernetes projects have good implementations of the elected council governance model.

Council & subcommittees

Beyond a particular scale or complexity, projects start having different specialized areas with experts in each area. These areas and experts must be recognized and involved in decisions affecting the whole community.

The council & subcommittee model is a two-tiered approach to governance. The council (usually elected, but sometimes core-team-like) is responsible for the overall direction and community-wide decisions, and subcommittees are created to lead and make decisions for specific project areas. The two tiers work collaboratively, with the council advising the subcommittees and the subcommittees representing their particular areas in community discussions.

Examples of this governance model are the Kubernetes project with its "Special Interest Groups" and Project Jupyter with its "Software Steering Council".
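
As a rough mental model, the two tiers can be pictured as a lookup: area-specific topics go to the matching subcommittee, and everything else goes to the council. The sketch below assumes a hypothetical `Governance` class and made-up group names; it does not describe how Kubernetes, Jupyter, or any other project actually routes decisions.

```python
from dataclasses import dataclass, field


@dataclass
class Governance:
    """Two-tier model: a council plus area-specific subcommittees."""

    council: str
    # Maps a project area to the subcommittee that leads it (hypothetical names).
    subcommittees: dict = field(default_factory=dict)

    def decision_owner(self, area=None):
        """Return the group that decides on a topic.

        Project-wide topics (area=None) and areas without a dedicated
        subcommittee fall back to the council.
        """
        return self.subcommittees.get(area, self.council)


project = Governance(
    council="Steering Council",
    subcommittees={
        "docs": "Documentation subcommittee",
        "release": "Release subcommittee",
    },
)
print(project.decision_owner("docs"))  # Documentation subcommittee
print(project.decision_owner())        # Steering Council
```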

A note on terminology

The terms above, like "core team" and "elected council", are not formal; they are a best approximation for describing and understanding governance models. The spirit behind each model and the differences between them matter more than the labels. Moreover, every project implements governance differently and will have project-specific nuances.

For instance, the NumPy project has a "steering council", but there is no explicit community vote for joining it; instead, the project follows a process similar to the one described in the Core team model.

Enhancement Proposals 📑

The second half of governance is how decisions are made. In the Python and PyData ecosystem, the process for community-wide decisions is usually built around "Enhancement Proposals". An Enhancement Proposal is a structured way for contributors to share an idea for a new feature or major change, describe its details and impact, and gather feedback before implementation. Enhancement Proposals are meant for large-scale or community-wide topics; many day-to-day decisions can be made after a quick discussion on the project's communication platform, such as GitHub issues. Examples of Enhancement Proposals are Python's PEPs, NumPy's NEPs, and Kubernetes' KEPs.

Note: Other terms for Enhancement Proposals are "Advancement Proposals", "Requests for Discussion" (RFDs), or "Requests for Comments" (RFCs).
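
One way to picture the "share, gather feedback, then implement" flow of an Enhancement Proposal is as a small set of allowed status transitions. The status names and transitions below are hypothetical and chosen only for illustration; each project (PEPs, NEPs, KEPs, NAPs) defines its own statuses and rules in its documentation.

```python
# Hypothetical statuses and transitions for this sketch;
# real projects define their own.
ALLOWED = {
    "draft": {"discussion", "withdrawn"},
    "discussion": {"accepted", "rejected", "withdrawn"},
    "accepted": {"implemented"},
}


def advance(status, new_status):
    """Return new_status if the transition is allowed, else raise ValueError."""
    if new_status not in ALLOWED.get(status, set()):
        raise ValueError(f"cannot move from {status!r} to {new_status!r}")
    return new_status


status = "draft"
for step in ("discussion", "accepted", "implemented"):
    status = advance(status, step)
print(status)  # implemented
```

In this sketch a proposal cannot jump from "draft" straight to "implemented", which mirrors the idea that feedback is gathered before implementation.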

🙋 Learner question: Do you know how napari is governed?

Read napari's governance and Napari Advancement Proposal (NAP) process in the project documentation!

Navigating Governance Structures 🧑‍⚖️

You do not need to read the governance documents of every project you contribute to. However, understanding and following the governance structure becomes essential once you start contributing to a project regularly, especially a large-scale one. It's how the community has decided to collaborate, and as a community member, you're expected to respect that. Governance documents are usually found in the project's community documentation, contributor's guide, or repositories dedicated to project management.

Note: The term "governance" is often overloaded, and you may sometimes find a project's contribution guidelines, Code of Conduct, license files, project management notes, and more documented under the governance umbrella.

OSS project management 📁

Unlike corporate and academic projects, open source software projects usually don't have formal project management systems beyond a project roadmap, release milestones, and a few issues (features, tasks, or bugs) marked as important.

Volunteers or enthusiasts contribute to areas of the project that interest them and adopt project management systems (e.g., a Kanban board or Gantt chart) that work best for them. As a contributor or team of contributors, you can do that too, but make sure to communicate and share your system with the broader community and the appropriate governing group.

Interlude: Broader open source software community 🌱

Open source software is critical to our digital infrastructure, powering everything from internet security to household televisions. The Python programming language and the surrounding community are just one piece of this massive ecosystem.

There is a saying in the Python community: "Python is the second-best language for anything", and data backs it up. The Python Developer Survey 2022 shows that people use Python for web development, game development, network programming, and more, in addition to data analytics and machine learning. Several open source projects within the Python community are associated with these use cases. For example, a large and lovely community has formed around the Django project.

Horizontal bar graph of Python usage in 2022 and 2021. Most to least: Data analysis at 51%/51%, Web development 43%/45%, Machine learning 36%/36%, DevOps, Programming of web parsers, Educational purposes, Software testing, Software prototyping, Desktop development, Network programming, Computer graphics, Game development, Embedded development, Mobile development, Multimedia applications development, Other 6%/7%.

Likewise, a vibrant community of people uses Python for scientific research and industry data science. They use Python for data analysis, data visualization, machine learning, high-performance computing, and more. We have foundational projects like NumPy at the core of numerical computing, domain-specific projects like Astropy, and technique-specific projects like scikit-image, as described in this figure from Array programming with NumPy:

Foundation: NumPy (alongside Python and IPython/Jupyter), SciPy, and matplotlib. This branches into technique-specific: scikit-learn (machine learning), pandas & statsmodels (Statistics), scikit-image (image processing), NetworkX (network analysis). This, in turn, branches into domain-specific: Astropy (Astronomy), Biopython (Biology), QuantEcon (Economics), and more. This finally branches into application-specific: SunPy, PyWavelets, MDAnalysis, yt, and more.

These projects are built to interoperate with each other and are most powerful when working together, hence creating an ecosystem. We call this the PyData[3], Scientific Python[4], or Python data science ecosystem.

🙋 Learner question: Where does napari fit in the OSS ecosystem?

napari is a part of the PyData (and hence, broader Python) ecosystem. It's a technique-specific project (high-dimensional imaging) that leans towards bioscience.

Resources 📚

Continue learning 🚥

⬅️ Previous Chapter: 02 Types of Open Source Software | Next Chapter: 04 How Does OSS Relate To The Open Research Movement? ➡️

Footnotes

  1. Guido stepped down as Python's BDFL in 2018, which led to the formation of the Python Steering Council.

  2. The term "maintainers" tends to be associated with code maintainers. So, we use the term "sustainer" instead of "maintainer" in this training to include all kinds of contributions to the OSS project.

  3. PyData is also a meetup and conference series by NumFOCUS.

  4. Scientific Python is also a loose federation of some projects in the ecosystem of Python tools for science and data work.