NEWS: News update for v1.12.0 rc1 #7741

Merged (1 commit) on Dec 10, 2021

brminich (Contributor) commented Dec 2, 2021

What

  • News update for v.1.12.0 release
  • Porting v.1.11.x news from v.1.11.x branch

Why ?

Release preparation

How ?

Comparing v1.11.x and v1.12.x branches

brminich (Contributor Author) commented Dec 2, 2021

@Akshay-Venkatesh, can you please add CUDA related features/bug fixes added in ucx 1.12? (if something is missed)

NEWS Outdated
#### Core
* Added initial support for Go language bindings
* Added memory invalidation on error detection
* Added threshold for ep connection matching in UCP
Contributor

ep-> endpoint

NEWS Outdated
* Added memory invalidation on error detection
* Added threshold for ep connection matching in UCP
* Added new objects to VFS (md, component, log_level, etc)
* Added config variable to specify what loadable modules are needed
Contributor

config -> configuration

NEWS Outdated
#### UCP
* Added API for querying UCP library attributes
* Added new sockaddr private data format
* Enabled rendezvous and tag sync for all cases with error handling
Contributor

all cases -> all protocols ?

Contributor Author

No. Previously, RNDV and sync were disabled if the user connected to a worker address with error handling enabled. This restriction is now removed.

Contributor

Enabled rendezvous and tag sync protocols when error handling is enabled on the endpoint

NEWS Outdated
* Added usage of mpool set for unexpected eager message to reduce memory consumption
* Added client_id to ucp_worker_create() and ucp_conn_request_query() APIs
* Added support for modifying UCT and UCS configs by ucp_config_modify() API
* Added address versioning to correctly preserve wire compatibility
Contributor

This is the big one. Probably should be moved to the top. Also, we should explicitly specify from what version we are wire protocol backward compatible.

NEWS Outdated
* Added address versioning to correctly preserve wire compatibility
* Optimized unpacked rkeys memory consumption
* Added request flag to influence latency vs. bandwidth protocol
* Added ucp_worker_address_query() API
Contributor

Can you please group all API changes together ?

NEWS Outdated
#### CUDA
* Added option to set cuda_copy bandwidth
* Added profiling of CUDA runtime function calls
* Added stub for memory invalidation support
Contributor

I'm not sure we need to list "stub" functions; it seems they do not contribute anything (you have a few of these here).

Contributor Author

This is an important change, because it allows these transports to be selected for the RNDV protocol.
Remove?

shamisp (Contributor), Dec 3, 2021

I would rephrase this: "Added stub for memory invalidation support in order to enable CUDA transport selection for rendezvous protocols"

dmitrygx (Member), Dec 3, 2021

What do you think of "Added stub for memory invalidation support to enable CUDA-IPC transport selection for rendezvous protocols when error handling is enabled"?

Contributor

👍

Contributor

Actually, I would make it a bit shorter: "Added stub for memory invalidation support to enable CUDA-IPC transport selection for rendezvous protocols"

Contributor

I would remove this altogether; it's part of the memory invalidation error-flow fix.

NEWS Outdated
* Added process placement option for ucx_info
* Extended parameters correctness check in ucx_perftest
#### CI
* Replaced gtest 1.7 with gtest 1.10
Contributor

Updated gtest 1.7 to 1.10

NEWS Outdated

### Bugfixes
#### Core
* Fixed simultaneous ep close with ucp_hello_world
Contributor

ep -> endpoint

NEWS Outdated
* Suppressed EHOSTUNREACH error in TCP sockcm
* Restricted connecting loop-back to other devices in TCP
#### RDMA CORE (IB, ROCE, etc.)
* Added pkey_index initialization when creating RC QP with DEVX
Contributor

s/Added/Fixed

NEWS Outdated
* Fixes in UCP, UCT, UCS, FAQ and README documentation
#### Tests
* Fixed memory leak in io_demo
* More fixes in io_demo
Contributor

Remove; it sounds duplicated. Maybe replace both lines with "Multiple fixes".

brminich (Contributor Author) commented Dec 3, 2021

@shamisp, fixed

Akshay-Venkatesh (Contributor)

> @Akshay-Venkatesh, can you please add CUDA related features/bug fixes added in ucx 1.12? (if something is missed)

@brminich I'm copying this from Yossi's talk this week. These were the main additions:

  • Added global memtype cache to allow UCT transports to query memory attributes
  • Added capability to select CUDA stream based on source and destination memory type, required for device-memory-based pipelining
  • Auto-register whole CUDA allocations to avoid repeated registration costs
  • Select CUDA-IPC capabilities based on NVLINK topology, to prefer writes vs. reads on specific platforms using NVML
  • Generalized rendezvous fragment pool per mem-type device

There were some recent bug-fixes but they're part of master and not v1.12.x

yosefe (Contributor) commented Dec 4, 2021

@petro-rudenko can you please check that we didn't miss any news for Java?

petro-rudenko (Member)

JUCX:

  • Set worker id and query it from the connection request.
  • ucp_listener_reject functionality.
  • Support setting port 0 for UcpListener to bind on a free port.

yosefe (Contributor) left a comment

It seems a few comments from the previous review were missed.

yosefe (Contributor) commented Dec 7, 2021

@shamisp can you pls take a look?

NEWS Outdated
#### UCP
* Added API for querying UCP library attributes
* Added address versioning to correctly preserve wire compatibility since v1.11.0
Contributor

since -> starting from the version v1.11.0

NEWS Outdated
### Features:
#### Core
* Added beta-level support for Go language bindings
* Added new objects to VFS (md, component, log_level, etc)
Contributor

etc -> etc.

NEWS Outdated
* Added support for user-defined alignment in Active Messages
* Added support for offload tag sync in new protocols
* Updated ucp_atomic_post() to use NBX flow
##### API
Contributor

Please move APIs section before UCP. These are the most visible updates.

Contributor Author

But these are UCP API changes; the heading is nested one level deeper (extra #).

Contributor Author

Or do you mean move it right after #### UCP?

Contributor

Yes.

Contributor Author

Moved to the top, but removed the ##### API line, because:

  1. We do not have such a section for other parts (UCT, UCS), so this is consistent with them: API changes are simply listed at the top.
  2. Otherwise we would need to introduce another caption to separate API changes from plain features.

* Improved accuracy of the topology distance estimation
* Added thread-safe put to ptr_map
* Added prints of leaked callbacks from the callback queue
* Added new ptr_array API for bulk allocation
Contributor

Please move API changes to the top of UCS section

Contributor Author

moved

NEWS Outdated
* Added ucs_ffs32()
* Removed a diagnostic message when fuse thread is stopped
* Added ucs_vsnprintf_safe() which always adds '\0'
* Added API for a per-process aggregate-sum statistics report
Contributor

Please move API changes to the top of UCS section

Contributor Author

moved

NEWS Outdated
* Added support for setting worker id and querying it from the connection request
* Added support to bind on a free port in UcpListener
#### Packaging
* Added cmake support
Contributor

Can you please elaborate on this one? We still use autotools.

Contributor Author

afair #7096

Contributor

Added cmake config files for better integration with external cmake based projects

Contributor Author

updated

brminich (Contributor Author) commented Dec 8, 2021

bot:pipe:retest

* Added selection of CUDA-IPC capabilities based on NVLINK topology
(to prefer writes vs reads for specific platforms using NVML)
* Added option to set cuda_copy bandwidth
* Added profiling of CUDA runtime function calls
bureddy (Contributor), Dec 8, 2021

Can you add the following in CUDA section to reflect #7772
"Added option to limit GPUDirectRDMA size in rendezvous protocol"

Contributor Author

added

brminich (Contributor Author) commented Dec 9, 2021

@shamisp, @yosefe let's keep it open to be able to update if something else gets into v1.12

brminich (Contributor Author) commented Dec 9, 2021

@tonycurtis, can you please take a look?

NEWS Outdated
* Added API for querying UCP library attributes
* Added client_id to ucp_worker_create() and ucp_conn_request_query() APIs
* Added ucp_worker_address_query() API
* Updated ucp_ep_query() API with getting local and remote addresses
Contributor

for getting?

NEWS Outdated
* Added client_id to ucp_worker_create() and ucp_conn_request_query() APIs
* Added ucp_worker_address_query() API
* Updated ucp_ep_query() API with getting local and remote addresses
* Added address versioning to correctly preserve wire compatibility starting from the version v1.11.0
Contributor

remove "the", and the "v" in front of the number seems redundant.

NEWS Outdated
* Added memory limit support to memtrack
#### CUDA
* Added global memtype cache to allow UCT transports to query memory attributes
* Auto-register cuda whole allocations to avoid repeated registration costs
Contributor

cuda -> CUDA

NEWS Outdated
* Added capability to select CUDA stream based on source and destination memory type
(required for device memory based pipelining)
* Added selection of CUDA-IPC capabilities based on NVLINK topology
(to prefer writes vs reads for specific platforms using NVML)
Contributor

vs. (period on end)

NEWS Outdated
### Bugfixes:
* Fixes in Cuda memory hooks
Contributor

Cuda -> CUDA

NEWS Outdated
@@ -68,7 +238,7 @@
#### RDMA CORE (IB, ROCE, etc.)
* Added report of QP info in case of completion with error
* Refactored of FC send operations
Contributor

ah, not added by this issue, but looks like a missing word here

shamisp (Contributor) commented Dec 9, 2021

> @shamisp, @yosefe let's keep it open to be able to update if something else gets into v1.12

Makes sense. You also want to update Authors files (in a separate PR).

brminich (Contributor Author) commented Dec 9, 2021

> > @shamisp, @yosefe let's keep it open to be able to update if something else gets into v1.12
>
> makes sense. You also want to update Authors files (in separate PR)

Probably I was wrong. It would be good to merge it so as not to block rc1.
@tonycurtis, your comments are applied. If you do not have any new comments, I'll squash the changes.

shamisp (Contributor) commented Dec 9, 2021

> > > @shamisp, @yosefe let's keep it open to be able to update if something else gets into v1.12
> >
> > makes sense. You also want to update Authors files (in separate PR)
>
> Probably I was wrong. Would be good to merge it to not block rc1. @tonycurtis, your comments applied. If you do not have any new comments I'd squash the changes

You and @tonycurtis have to decide. In the past we have tried to push NEWS in the first RC, but sometimes those updates were delayed.

tonycurtis (Contributor) commented Dec 9, 2021 via email

brminich (Contributor Author)

IMO, it is better to have the actual NEWS for the RC release, because users may want to try it and check its contents.

shamisp (Contributor) commented Dec 10, 2021

@brminich agree. If you happen to have it ready before the RC, I don't see a good reason not to include it. I have seen a few Linux distros using our RCs.

brminich (Contributor Author)

@yosefe, @tonycurtis, so I'm going to merge it unless you have some objections

tonycurtis (Contributor) commented Dec 10, 2021 via email

@yosefe yosefe merged commit 8ab494b into openucx:v1.12.x Dec 10, 2021

8 participants