Merge pull request #6233 from shamisp/topic/v1.10.x/news
NEWS: News update before release
shamisp authored Feb 2, 2021
2 parents 6d2449f + 798d265 commit f609817
Showing 1 changed file with 152 additions and 8 deletions.
160 changes: 152 additions & 8 deletions NEWS
@@ -1,18 +1,162 @@
#
## Copyright (C) Mellanox Technologies Ltd. 2001-2020. ALL RIGHTS RESERVED.
## Copyright (C) Mellanox Technologies Ltd. 2001-2021. ALL RIGHTS RESERVED.
## Copyright (C) UT-Battelle, LLC. 2014-2019. ALL RIGHTS RESERVED.
## Copyright (C) ARM Ltd. 2017-2020. ALL RIGHTS RESERVED.
## Copyright (C) ARM Ltd. 2017-2021. ALL RIGHTS RESERVED.
##
## See file LICENSE for terms.
##
#

## Current
### Features: TBD
#### UCX Core
- Added ucp_tag_msg_recv_nbx routine.
#### UCX Java (API Preview) TBD
### Bugfixes: TBD
## 1.10.0-rc2 (February 2, 2021)
### Features:
#### Core
* Added support for Nvidia HPC SDK
* Added support for latest PGI and Clang
* Added support for ROCM-3.7+ (warning generated if older version detected)
#### Architecture
* Added Arm SVE memcpy()
* Redesigned Arm WFE support
* Improved clear_cache performance for Arm
* Added architecture detection for Zhaoxin CPU
#### CI
* Added release builds on CUDA 11
* Enabled performance validation in gtest
#### UCP
* Added locality awareness to the transport selection logic for GPU devices
* Added put/offload/short and put/offload/zcopy protocols
* Added receive message nbx routine
* Reworked AM implementation and API, which adds support for RNDV semantics
* Added support for multi-lane connection manager over TCP
* Added support for printing AM tls with info log level
* Implemented flush and destroy for UCT EPs on UCP worker
* Reduced UCP request size
* Added support for keepalive protocol
* Added support for multi-fragment protocol
* Added implementation for protocol progress for eager, bcopy, and multicopy
* Improved protocol selection logic
* Added new protocols for UCP get operation
* Added bcopy protocols with support for GPU memory
* Added RNDV protocol implementation for GPU devices (CUDA, ROCm)
* Set SOCKADDR_CM_ENABLE=y by default
* Added support for fast-path short with new tag protocols
* Added a new parameter to control the CM listener's backlog
* Added support for sending AM RTS over short message protocol
* Added support for shared memory multi-lane when CM is used
#### UCT
* Added API for keepalive_timeout value
* Added uct_completion.status
* Allowed transports to access multiple mem_types
* Removed status arg from uct_completion_callback_t
* Restructured uct_mem_alloc/uct_md_mem_alloc to use mem_type
* Updated documentation for uct_listener_params
* Lowered the log level for certain network errors
* Added cuda_copy wakeup feature
* Added wakeup support for shared memory
#### UCS
* Added "inf" and "auto" values to time units
* Added on-stack constructors for array and string buffer
* Added ucs_ptr_map_t data structure
* Added bool CSWAP
* Improved logging
* Added optimization for namespace processing
* Fixes for connection matching functionality
#### RDMA CORE (IB, ROCE, etc.)
* Added support for auto detection of adaptive routing settings
* Added an option to poll TX CQ every progress iteration
* Added local and remote addresses to the reject error message
* Added support for UAR allocation with non-cacheable memory type
* Added support for multiple flush cancel without completion
* Added async events callback support
* Added detection for ConnectX-6, ConnectX-7 and BlueField-1/2 devices
* Added support for connection matching for UD
* Added a check for AM ordering
#### Java (preview)
* Added support for a different javadoc executable path for different java versions
* Added UCS memory type constants
* Added support for building on Java 10+
* Added support for io-vector datatype
#### Tests
* Added CI for CUDA 11
* Added test_ucp_sockaddr_protocols.stream_short
* Reimplemented tests using NBX API
* Added flush(cancel) test
* Added memory_wait mode to perftest
* Added support for clang 10
* Refactored RMA and atomic tests, added memtype support
* Added test for uct_md_mem_query()
* Added request interrupt support
* Added support for connection manager fallbacks
* Added new ucp request test checking for leaks from the ptr_map
#### Documentation
* Added glossaries

### Bugfixes:
#### Portability
* Fixes in print functions to use format string like PRIx64, etc.
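
As an illustration of the portability fix above, here is a minimal sketch of printing a 64-bit value with the `PRIx64` macro from `<inttypes.h>`. The helper name `format_u64_hex` is hypothetical and is not taken from the UCX sources; it only shows the general technique of avoiding hard-coded length modifiers such as `%lx` or `%llx`.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of UCX): writes `value` as "0x<hex>" into
 * `buf` and returns the number of characters snprintf() would have written. */
static int format_u64_hex(char *buf, size_t size, uint64_t value)
{
    /* PRIx64 expands to the conversion specifier that matches uint64_t on
     * the target platform, so the same source builds cleanly on both
     * 32-bit and 64-bit targets without format warnings. */
    return snprintf(buf, size, "0x%" PRIx64, value);
}
```

Calling `format_u64_hex(buf, sizeof(buf), UINT64_C(0xdeadbeefcafe))` with a 32-byte buffer fills it with the string `0xdeadbeefcafe`.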
#### Continuous Integration
* Fixes in Github release flow
* Fixes in docker image
#### Packaging
* Removed deb package dependencies
* Fixes in SPEC to make the RPM relocatable
#### Documentation
* Fixes in documentation for ucp_am_recv_data_nbx
* Fixes in quick start example
* Fixes in installation instructions
#### Tests
* Fixes for failures under valgrind runtime
* Fixes in mmap tests for 0-length RMA
* Fixes in definition of LAST_WQE wait timeout
* Fixes in ROCm for mem_buffer test
* Fixes in test name printing format
* Fixes in tcp_sockcm test
#### UCP
* Fixes in worker cleanup flow
#### CUDA
* Fixes in managed memory support
#### RDMA CORE (IB, ROCE, etc.)
* Fixes in assert definitions
* Fixes in printing an error about invalid AM Bcopy length for UD
* Fixes for thread safety support
* Fixes to get ROCE device name according to GID
* Fixes for SL selection
* Fixes in create STRICT_ORDER key
* Fixes addressing performance degradation in UD transport due to excess async events
#### UGNI
* Fixes in disable logic in config
* Fixes for clang 11 warnings
#### Java
* Fixes in build dependencies
* Fixes in constructing UcpRequest object on error
* Fixes in exception handling on endpoint closure request
* Fixes for segfault in UcpErrorHandler
#### UCP
* Fixes in datatype support for get_zcopy RNDV
* Fixes in connection manager disconnect
* Fixes in assert definitions
* Fixes in completion flow for failed EP
* Fixes in flush error handling flow
* Fixes in latency calculations for wireup protocol
* Fixes in offload completion with inlined data
* Fixes in unpacking flow
* Fixes in error handling for various protocols
#### UCT
* Fixes in flush TX
* Fixes in checks for enabling GPU Direct RDMA
#### UCS
* Fixes for crashes on incorrect value set in config
* Fixes in ptr_array
* Fixes in maximal size for ucs_snprintf_safe()
* Fixes for compilation warnings
* Fixes in ucs_aarch64_dsb(_op) definition
#### TCP
* Fixes in default route interface confirmation flow
* Fixes in PUT protocol
* Fixes in max connection limit and improved error reporting
#### UCM
* Fixes for crash on prevent unload
* Fixes in libucm_rocm
* Fixes for a few race conditions

## 1.9.0 (September 19, 2020)
### Features: