Skip to content

UCT/PLUGIN: ib plugin for token query#11277

Open
jeynmann wants to merge 14 commits into
openucx:masterfrom
jeynmann:plugin_query_token
Open

UCT/PLUGIN: ib plugin for token query#11277
jeynmann wants to merge 14 commits into
openucx:masterfrom
jeynmann:plugin_query_token

Conversation

@jeynmann
Copy link
Copy Markdown
Contributor

@jeynmann jeynmann commented Mar 20, 2026

What?

Add plugin support for token query(capability, rx token, tx token, tx token lengh, rx token length).
This is the plugin part of PR#11234

How?

Add weak symbols in uct.
The implementation will be provided by plugin lib.

Comment thread src/uct/ib/plugin/uct_ib_plugin_stubs.c Outdated
}

ucs_status_t __attribute__((weak))
uct_ib_plugin_ep_query(uct_ep_h ep, uct_ep_attr_t *ep_attr)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be possible to change this to:

uct_ib_plugin_query_qp()
?

Comment thread src/uct/ib/mlx5/Makefile.am Outdated
dv/ib_mlx5_dv.c \
dv/ib_mlx5dv_md.c
dv/ib_mlx5dv_md.c \
../plugin/uct_ib_plugin_stubs.c
Copy link
Copy Markdown
Contributor

@roiedanino roiedanino Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a Makefile.am in /plugin and the stub there
(we should also add the .h file to the noinst_HEADERS list)

Comment thread src/uct/ib/Makefile.am Outdated
Comment on lines +17 to +18
pluginincludedir = $(includedir)/uct/ib/plugin
plugininclude_HEADERS = plugin/uct_ib_plugin.h
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be added in plugin/Makefile.am

Comment thread src/uct/ib/plugin/uct_ib_plugin_stubs.c Outdated
Comment on lines +18 to +20
uct_ib_plugin_qp_query(const uct_ib_plugin_qp_query_params_t *params,
uct_ib_plugin_qp_query_attr_t *attr)
{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is it going to be called from?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be invoked by uct_iface_query_v2 and uct_ep_query.
The UCT implementation will be submitted in a separate PR.

Comment thread src/uct/ib/plugin/uct_ib_plugin.h Outdated
* @brief QP query input parameters.
*/
typedef struct uct_ib_plugin_qp_query_params {
uint64_t field_mask;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should also define an enum for the field_mask

Comment thread src/uct/api/uct.h Outdated
Comment on lines +432 to +433
/* IB plugin capability */
#define UCT_IFACE_FLAG_IB_PLUGIN_QUERY_TOKEN UCS_BIT(47) /**< IB plugin for token query is available */
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to getrid of "IB" and "plugin" in UCT API, just "UCT_IFACE_FLAG_QUERY_TOKEN", not implemented/supported => no capability

Comment thread src/uct/ib/plugin/uct_ib_plugin.h Outdated
Comment on lines +46 to +55
/**
* @brief Field mask bits for @ref uct_ib_plugin_qp_query_attr_t.
* Selects which output fields to populate.
*/
enum uct_ib_plugin_qp_query_attr_field {
UCT_IB_PLUGIN_QP_QUERY_ATTR_FIELD_TX_TOKEN_LEN = UCS_BIT(0),
UCT_IB_PLUGIN_QP_QUERY_ATTR_FIELD_RX_TOKEN_LEN = UCS_BIT(1),
UCT_IB_PLUGIN_QP_QUERY_ATTR_FIELD_TX_TOKEN = UCS_BIT(2),
UCT_IB_PLUGIN_QP_QUERY_ATTR_FIELD_RX_TOKEN = UCS_BIT(3)
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll have similar bits for uct_ep_query, so why dont pass the same semantic to the "uct_ib_plugin_ep_query", cast to correct ep_type and access QP.
Then uct_ib_plugin_qp_query_params_t becomes redundant.

Comment thread src/uct/ib/plugin/uct_ib_plugin.h Outdated
Comment on lines +69 to +74
/**
* @brief Return plugin-contributed iface capability flags.
*
* @return Bitmask of UCT_IFACE_FLAG_* contributed by the plugin.
*/
uint64_t uct_ib_plugin_iface_flags(void);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for plugin contribution to defined UCT API

Comment thread src/uct/ib/plugin/uct_ib_plugin.h Outdated
* @return UCS_OK on success, error code otherwise.
*/
ucs_status_t
uct_ib_plugin_qp_query(const uct_ib_plugin_qp_query_params_t *params,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uct_ib_plugin_ep_query?

Copy link
Copy Markdown
Contributor Author

@jeynmann jeynmann Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roiedanino suggests renaming it to uct_ib_plugin_query_qp. We now rely on mlx5dv_devx_obj and ibv_qp rather than the internal UCT structure, which might provide a cleaner and simpler approach.

Will it be possible to change this to: uct_ib_plugin_query_qp() ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it adds inconsistency and overcomplication. Why does iface_flags return native UCT API level flags, but ep_query requires conversion of ep -> qp and qp_attr -> ep_attr converting in base (non-plugin) UCT/IB code?

Comment thread src/uct/ib/plugin/uct_ib_plugin.h Outdated
Comment on lines +51 to +52
UCT_IB_PLUGIN_QP_QUERY_ATTR_FIELD_TX_TOKEN_LEN = UCS_BIT(0),
UCT_IB_PLUGIN_QP_QUERY_ATTR_FIELD_RX_TOKEN_LEN = UCS_BIT(1),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in #11234 these are defined as iface attributes

@roiedanino
Copy link
Copy Markdown
Contributor

Example code for plugin side:

static ucs_status_t uct_ib_plugin_iface_flags(void) {
    ucs_info("executed uct_ib_plugin_iface_flags successfully");
    return UCS_OK;

}
static ucs_status_t
uct_ib_plugin_qp_query(const uct_ib_mlx5_ext_qp_query_attr_t *params,
                       uct_ib_mlx5_ext_qp_query_attr_t *attr) {
    ucs_info("executed uct_ib_plugin_qp_query successfully %p %p", params, attr);
    return UCS_OK;
}

static uct_ib_mlx5_ext_ops_t ucx_ext_plugin_ops = {
    .iface_flags = uct_ib_plugin_iface_flags,
    .qp_query = uct_ib_plugin_qp_query,
};

UCS_MODULE_INIT()
{
   // ...
    uct_ib_mlx5_ext_register(&ucx_ext_plugin_ops);

    return UCS_OK;
}

roiedanino
roiedanino previously approved these changes May 3, 2026
Copy link
Copy Markdown
Member

@yosefe yosefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls fix in next

Comment thread src/uct/api/uct.h Outdated
are no outstanding send operations */

/* Token query capability */
#define UCT_IFACE_FLAG_QUERY_TOKEN UCS_BIT(47) /**< Interface supports token query */
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. let's add this to uct_iface_query_v2
  2. do we need to separate to TX_TOKEN and RX_TOKEN? we need to use TX_TOKEN on 2iface transports like DC

Comment on lines +27 to +28
ucs_assertv(ops->iface_flags != NULL, "ext_resgiter: iface_flags function is NULL");
ucs_assertv(ops->qp_query != NULL, "ext_resgiter: qp_query function is NULL");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Comment thread src/uct/ib/mlx5/ib_mlx5_ext.c Outdated
ucs_assertv(ops->iface_flags != NULL, "ext_resgiter: iface_flags function is NULL");
ucs_assertv(ops->qp_query != NULL, "ext_resgiter: qp_query function is NULL");
uct_ib_mlx5_ext_ops = *ops;
ucs_info("ib mlx5: registered external ops successfully");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make it ucs_debug + details

Comment thread src/uct/ib/mlx5/ib_mlx5_ext.h Outdated
#include <ucs/type/status.h>
#include <ucs/sys/stubs.h>


Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove spaceline

Comment thread src/uct/ib/mlx5/ib_mlx5_ext.h Outdated
Comment on lines +16 to +17
struct ibv_qp;
struct mlx5dv_devx_obj;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can include ib / mlx5dv headers directly

Comment thread src/uct/ib/mlx5/ib_mlx5_ext.h Outdated
Comment on lines +35 to +40
uint64_t field_mask;
size_t tx_token_len;
size_t rx_token_len;
void *tx_token;
void *rx_token;
uint32_t qp_num;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add documentation

Comment on lines +44 to +47
typedef ucs_status_t (*uct_ib_mlx5_ext_iface_flags_func_t)(uint64_t *flags);
typedef ucs_status_t (*uct_ib_mlx5_ext_qp_query_func_t)(
struct ibv_qp *qp, struct mlx5dv_devx_obj *devx_obj,
uct_ib_mlx5_ext_qp_query_attr_t *attr);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add documentation

{
ucs_assertv(ops->iface_flags != NULL, "ext_resgiter: iface_flags function is NULL");
ucs_assertv(ops->qp_query != NULL, "ext_resgiter: qp_query function is NULL");
uct_ib_mlx5_ext_ops = *ops;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably want a list of providers, move uct_ib_mlx5_ext_ops only to C file, and add function that search for the first non-NULL operation in the list and calls the operation or returns NOT_SUPPORTED if not found. maybe adde default provider that returns 0 from iface_flags .

@roiedanino roiedanino force-pushed the plugin_query_token branch from f48724e to a2e4020 Compare May 3, 2026 10:08
@roiedanino roiedanino requested a review from yosefe May 3, 2026 10:08
@openucx openucx deleted a comment from svc-nixl May 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants