NEWS: Added 1.15.0 section. #9391

Merged 4 commits on Sep 28, 2023
Changes from 1 commit
66 changes: 18 additions & 48 deletions NEWS
@@ -11,53 +11,7 @@
### Features:
### Bugfixes:

## 1.15.0-rc6 (September 20, 2023)
### Bugfixes:
#### UCP
* Fixed assertion when sending from noncontig GPU buffer to managed buffer.

## 1.15.0-rc5 (September 12, 2023)
### Bugfixes:
#### UCP
* Fixed the data race on endpoint configurations.

## 1.15.0-rc4 (August 30, 2023)
### Bugfixes:
#### RDMA CORE (IB, ROCE, etc.)
* Fixed dma-buf based memory region registration
* Fixed memory handle data corruption when PCIe relaxed ordering is enabled
#### UCS
* Fixed lane selection, adding bandwidth estimation for Sapphire Rapids family

## 1.15.0-rc3 (August 8, 2023)
### Bugfixes:
#### UCP
* Fixed endpoint reconfiguration issues because of assymetrical selection
#### UCT
* Check dmabuf kernel support in ROCm memory domain
#### UCM
* Fixed conditional jump patching
#### Tools
* Fixed memory access flags in perftest

## 1.15.0-rc2 (July 27, 2023)
### Features:
#### RDMA CORE (IB, ROCE, etc.)
* Implemented is_reachable_v2 for IB interfaces
#### Build
* Enabled build with binutils 2.40
* Added versioned dependency to switch between packages with the same names

### Bugfixes:
#### UCP
* Fixed endpoint reconfiguration error due to wrong locality detection
#### RDMA CORE (IB, ROCE, etc.)
* Fixed performance degradation when indirect atomic key is not supported by the hardware
* Fixed remote access error to strict-order key because of wrong offset
#### GPU (CUDA, ROCM)
* Fixed CUDA IPC performance degradation after libnuma removal

## 1.15.0-rc1 (May 10, 2023)
## 1.15.0 (September 27, 2023)
### Features:
#### UCP
* Added 2-stage pipeline protocol in the new protocol infrastructure
@@ -75,6 +29,7 @@
* Added base implementation of is_reachable_v2 API using intra/inter flag
* Introduced MD capability for non-blocking registration memory types
#### RDMA CORE (IB, ROCE, etc.)
* Implemented is_reachable_v2 for IB interfaces
Contributor

Implemented -> Added format

* Added option to control CQE zipping per CQ RX/TX direction
* Added option to specify how DCI selects port under RoCE LAG
* Added hw_dcs to the list of policies to select DCI by an endpoint
@@ -104,12 +59,17 @@
* Added user-side memcpy option for AM benchmarks in ucx_perftest
* Added wireshark LUA dissectors for some UCX protocols
#### Build
* Enabled build with binutils 2.40
Contributor

Added support for binutils 2.40

* Added versioned dependency to switch between packages with the same names
* Added a separate xpmem deb subpackage
* Added aarch64 support to the binary distribution pipeline
* Removed dependency on libnuma

### Bugfixes:
#### UCP
* Fixed assertion when sending from noncontig GPU buffer to managed buffer
Contributor

non-contiguous

* Fixed the data race on endpoint configurations
Contributor

Fixed the race condition ...

* Fixed endpoint reconfiguration issues because of assymetrical selection
Contributor

issues due to asymmetrical selection

* Fixed endpoint reconfiguration error due to wrong locality detection
* Fixed crash during connection manager cleanup
* Fixed rkey index calculation for rendezvous protocol
* Fixed rcache dump function
@@ -123,20 +83,29 @@
* Fixed CPU/device atomics selection in the new protocol infrastructure
* Multiple fixes in the new protocol infrastructure information output
#### UCT
* Check dmabuf kernel support in ROCm memory domain
Contributor

Added check?

* Fixed exported memh packing
* Fixed an error in checking return status of multi-threaded memory registration function
#### RDMA CORE (IB, ROCE, etc.)
* Fixed dma-buf based memory region registration
* Fixed memory handle data corruption when PCIe relaxed ordering is enabled
* Fixed performance degradation when indirect atomic key is not supported by the hardware
* Fixed remote access error to strict-order key because of wrong offset
Contributor

Fixed remote access error to strict-order keys

* Added check for UAR support to memory domain opening
* Fixed updating port counters for devx qp
* Fixed ibv_create_cq error message on node without Infiniband
* Fixed performance degradation due to using 2 paths on NDR400 by default
* Removed unnecessary async lock which otherwise would block UD progress
#### GPU (CUDA, ROCM)
* Fixed CUDA IPC performance degradation after libnuma removal
Contributor

Fixed CUDA IPC performance degradation due to libnuma removal

#### UCS
* Fixed lane selection, adding bandwidth estimation for Sapphire Rapids family
Contributor

Fixed lane selection and added bandwidth estimation for Sapphire Rapids family

* Fixed displaying wrong environment variable suggestions
* Fixed VFS warning output
* Fixed SEGV in ucs_debug_backtrace_next(), upon previous SEGV handling, due to ENOMEM situation
* Fixed memory corruption when using UCX_MPOOL_FIFO=y
#### UCM
* Fixed conditional jump patching
* Fixed mremap() override
#### GPU (CUDA, ROCM)
* Fixed usage of dmabuf when the buffer is not page-aligned
@@ -148,6 +117,7 @@
#### Tests
* Fixed wrong usage of ep_close in examples
#### Tools
* Fixed memory access flags in perftest
* Removed support for librte from perf
* Fixed worker flush deadlock when using multiple workers in ucx_perftest
#### Build