Skip to content

A Gentle Introduction to mdspan

Mark Hoemmen edited this page May 30, 2023 · 23 revisions

ISO-C++ standard proposals are not the best tutorials for how to use their proposed features. This is intentional—there's enough content to be written in exploring the technical design issues that a tutorial-style introduction is out of place in such documents. However, it also means that proposals such as P0009 for mdspan reach a level of maturity where they are ready for production implementation and use, but without a single tutorial-style introduction. Here's an attempt to do so for mdspan.

What is mdspan?

In its simplest form, std::mdspan1 is a straightforward extension of std::span to enable multidimensional arrays. (The std::span class was added in C++20, and introductions to it can be found all over the internet. Thus, this tutorial assumes basic familiarity with std::span.) You can create them from a pointer to the underlying data, and its extents (the dimensions of the multidimensional array).

int* data = /* ... */

// View data as contiguous memory representing 4 ints
auto s = std::span(data,4);

// View data as contiguous memory representing 2 rows
// of 2 ints each
auto ms = std::mdspan(data,2,2);

In the example above, we use CTAD (constructor template argument deduction, a language feature introduced in C++17) so that we don't have to name the template arguments of the mdspan class. The compiler figures them out for us.

Accessing the data then happens via its [] operator, just as with span or vector.

for(size_t i=0; i<ms.extent(0); i++)
  for(size_t j=0; j<ms.extent(1); j++)
    ms[i, j] = i*1000 + j;

(Note that the mdspan implementation in this repository uses () if the compiler doesn't support multi-argument [] - a feature which was voted into C++23.) mdspans extent function takes an index to indicate which extent the caller wants.

Extents

Just like span, you can use both dynamic extents and static extents with mdspan. Unlike span, mdspan is not directly templated on individual extents, but packs all of them into an explicit extents object. An extents object represents the "shape" of a multidimensional array. It can encode any number and combination of extents as compile-time values (that are part of the type of the extents object), and encodes the remaining extents as run-time values (that must be given as arguments to the extents constructor).

// Create an extents object with one run-time dimension 3,
// and one static (compile-time) extent 4.
// It represents the shape of a 3 x 4 array.
std::extents<int, std::dynamic_extent, 4> exts(3);

The extents class does not only take the actual sizes (or extents) of each dimension, it also takes as an argument the index_type, the extents class - and through it mdspan - will use to store the extents, and do index computation. The total number of arguments after the index_type is also the rank of the extents object, and later the rank of the mdspan.

You can construct an mdspan by providing the individual extents to the constructor (either all of them, or just the dynamic ones), or handing in an extents object.

int* data = /* ... */
int size = /* ... */

int rows = /* ... */
constexpr int cols = 4;

// An extents type using uint32_t to represent each extent,
// with a run-time number of rows, and 4 columns.
using ext_t = std::extents<uint32_t, std::dynamic_extent, cols>;

// When creating the mdspan, we only need to give run-time extents.
// The compile-time extents are already baked into the extents type.
auto ms = std::mdspan<int, ext_t>(data, rows);
assert(ms.extents(0) == rows);
assert(ms.extents(1) == cols);
static_assert(ms.static_extent(1) == cols);
static_assert(decltype(ms)::rank() == 2);

// If we don't specify the mdspan's template parameters,
// then we need to pass in all the extents.
// It will then store all extents as run-time values.
auto msd = std::mdspan(data, rows, cols);
assert(ms.extents(0) == rows);
assert(ms.extents(1) == cols);
static_assert(decltype(msd)::rank() == 2);

Note that dextents is a convenient alias for an extents type with all dynamic_extent.

static_assert<std::is_same_v<std::dextents<uint32_t, 2>,
   std::extents<uint32_t, std::dynamic_extent, std::dynamic_extent>>;

Layout and access customization

The design of mdspan addresses a much broader variety of needs than span. In many scenarios, it even makes more sense to use a one-dimensional mdspan instead of a span. This broader scope of use cases is addressed through two customization points: LayoutPolicy and Accessor.

template <
  class T,
  class Extents,
  class LayoutPolicy = std::layout_right,
  class Accessor = std::default_accessor
>
class mdspan;

The LayoutPolicy defines the data layout, that is, the function that turns a multi-dimensional index i,j,k,... into a one-dimensional offset. The Accessor defines how the array stores its elements, and how to use the offset from the LayoutPolicy to get a reference to an element of the array. (For example, the default Accessor takes a pointer p and an offset i, and returns p[i].)

You can think of these customizations like the same kind of thing as the Hash template parameter on std::unordered_map or the Allocator template parameter on std::vector. These are things that most of the time you don't have to touch, and that most algorithms shouldn't have to care about. (How many times have you written a function that takes a std::vector and had to write a different version of the function for different allocators? Probably not that often, if ever.)

Basics of LayoutPolicy

As stated above the LayoutPolicy controls the mapping from a multi-dimensional index into an offset. One of the reasons why controlling a layout mapping can be useful, is to control an algorithm's data access pattern (how it strides through memory) without changing the algorithm's loop structure.

For example, it's natural to think of a matrix-vector multiply algorithm as a loop over rows, with an inner loop which performs the inner reduction:

for(int i = 0; i < A.extent(0); i++) {
  double sum = 0.;
  for(int j = 0; j < A.extent(1); j++)
    sum += A[i, j] * x[j];
  y[i] = sum;
}

In most cases an optimal data access pattern is achieved if the stride-one access is on j. This also allows for vectorization of the inner loop. However, consider the case where A.extent(1) is small and known at compile time. In that case it might actually be better to have the inner loop fully unrolled, and have a stride-one access on i instead, so that the outer loop can be vectorized.

for(int i = 0; i < A.extent(0); i++) {
  y[i] = A[i, 0] * x[0] +
         A[i, 1] * x[1] +
         A[i, 2] * x[2] +
         A[i, 3] * x[3];
}

With mdspan the first use case can be achieved with layout_right (which is the default) and the second use case might perform better with layout_left:

mdspan<double, extents<uint32_t, dynamic_extent, 4>, layout_left> A(A_ptr, N);
for(int i = 0; i < A.extent(0); i++) {
  y[i] = A[i, 0] * x[0] +
         A[i, 1] * x[1] +
         A[i, 2] * x[2] +
         A[i, 3] * x[3];
}

Another important use case where layouts are needed is for reusing functionality for a subset of data.

For example one might want to perform an operation on a single column of some multi-dimensional array. That column, however, may not be a contiguous array. layout_stride can deal with that in many common cases:

mdspan A(A_ptr, N, M);
for(int c = 0; c < A.extent(1); c++) {
  extents sub_ext(A.extents(0));
  array strides{A.stride(0)};
  layout_right::mapping sub_map(sub_ext, strides);
  mdspan col_c(&A(0,c), sub_map);
  foo_1d_mdspan(col_c);
}

A more advanced tutorial will cover the topics of layout and access customization, but for now, it's enough to know that two simple ones are provided: std::layout_right and std::layout_left, where the right-most and left-most indices are fast-running, respectively. When discussing matrices, these are often described as C-style layout (or row-major) and Fortran-style (or column-major), respectively.

Views and ownership

We say that mdspan is a "view." What does this mean, and how does it relate to ownership?

"Ownership" refers to who frees the memory. "Nonowning" means "it doesn't free the memory."

By convention, const T* and T* are nonowning. span<T> and span<const T> are nonowning. However, vector<T> is owning. When the vector goes away, it frees memory. Standard containers are owning.

"View" means that if you have a view X, and you make a copy of X called Y, then X and Y refer to the same elements. Views don't copy elements. Copying views should also be cheap; it shouldn't depend on the number of elements being viewed. (Ranges has some subtle definition of "view" that permits some exceptions, but this gets at the essence.)

Pointers are views. span is a view. mdspan is a view. vector is not a view; it's a container. If you make a copy of the vector, it will copy all the elements. The elements in the new vector are different elements ("different" in a similar sense as meant by "regular type") than the old vector's elements.

std::shared_ptr is a view, EVEN THOUGH it expresses shared ownership. This matters because mdspan is a view, BUT whether it is owning depends on the accessor. With default_accessor, the mdspan is non-owning. However, you could define an accessor whose data_handle_type is the following.

struct shared_ownership_data_handle {
  shared_ptr<element_type[]> base;
  // This mdspan's data start at base[offset].
  // This exists because shared_ptr<T[]> has no "offset constructor"
  // that expresses shared ownership of an array through a subset.
  size_t offset;
};

That would make different instances of mdspan share ownership of the allocation. When the last "shared owner" falls out of scope, the memory would be deallocated.

Footnotes

1 mdspan has been adopted into C++23. Thus, in this document, we will use the namespace std, since that is where mdspan will end up. However, current implementations put mdspan in a different, user-configurable namespace. This is because it is forbidden for code outside the Standard Library to add features to the std namespace (with a few exceptions). A common default is std::experimental, so it's a common convention among users of these features to put namespace stdex = std::experimental somewhere in their project to reduce verbosity.