Commit 083d6ab

libibverbs: Introduce Completion Counters verbs
Extend verbs interface to support Completion Counters that can be seen as a
light-weight alternative to polling CQ. A completion counter object
separately counts successful and error completions, can be attached to
multiple QPs and be configured to count completions of a subset of operation
types. This is especially useful for batch or credit based workloads running
on accelerators but can serve many other types of applications as well.

Expose supported number of completion counters through query device extended
verb.

Reviewed-by: Yonatan Nachum <ynachum@amazon.com>
Signed-off-by: Michael Margolin <mrgolin@amazon.com>
1 parent 8b9cdb7 commit 083d6ab

8 files changed

Lines changed: 489 additions & 0 deletions

debian/libibverbs1.symbols

Lines changed: 7 additions & 0 deletions
@@ -129,3 +129,10 @@ libibverbs.so.1 libibverbs1 #MINVER#
  ibv_wc_status_str@IBVERBS_1.1 1.1.6
  mbps_to_ibv_rate@IBVERBS_1.1 1.1.8
  mult_to_ibv_rate@IBVERBS_1.0 1.1.6
+ ibv_create_comp_cntr@IBVERBS_1.16 62
+ ibv_destroy_comp_cntr@IBVERBS_1.16 62
+ ibv_set_comp_cntr@IBVERBS_1.16 62
+ ibv_set_err_comp_cntr@IBVERBS_1.16 62
+ ibv_inc_comp_cntr@IBVERBS_1.16 62
+ ibv_inc_err_comp_cntr@IBVERBS_1.16 62
+ ibv_qp_attach_comp_cntr@IBVERBS_1.16 62

libibverbs/examples/devinfo.c

Lines changed: 1 addition & 0 deletions
@@ -585,6 +585,7 @@ static int print_hca_cap(struct ibv_device *ib_dev, uint8_t ib_port)
 			printf("\tmax_srq_sge:\t\t\t%d\n", device_attr.orig_attr.max_srq_sge);
 		}
 		printf("\tmax_pkeys:\t\t\t%d\n", device_attr.orig_attr.max_pkeys);
+		printf("\tmax_comp_cntr:\t\t\t\t%d\n", device_attr.max_comp_cntr);
 		printf("\tlocal_ca_ack_delay:\t\t%d\n", device_attr.orig_attr.local_ca_ack_delay);
 
 		print_odp_caps(&device_attr);

libibverbs/libibverbs.map.in

Lines changed: 7 additions & 0 deletions
@@ -175,6 +175,13 @@ IBVERBS_1.16 {
 	global:
 		ibv_dm_export_dmabuf_fd;
 		ibv_query_port_speed;
+		ibv_create_comp_cntr;
+		ibv_destroy_comp_cntr;
+		ibv_set_comp_cntr;
+		ibv_set_err_comp_cntr;
+		ibv_inc_comp_cntr;
+		ibv_inc_err_comp_cntr;
+		ibv_qp_attach_comp_cntr;
 } IBVERBS_1.15;
 
 /* If any symbols in this stanza change ABI then the entire staza gets a new symbol

libibverbs/man/CMakeLists.txt

Lines changed: 7 additions & 0 deletions
@@ -14,7 +14,9 @@ rdma_man_pages(
   ibv_create_ah.3
   ibv_create_ah_from_wc.3
   ibv_create_comp_channel.3
+  ibv_create_comp_cntr.3.md
   ibv_create_counters.3.md
+  ibv_qp_attach_comp_cntr.3.md
   ibv_create_cq.3
   ibv_create_cq_ex.3
   ibv_modify_cq.3
@@ -98,6 +100,11 @@ rdma_alias_man_pages(
   ibv_create_ah.3 ibv_destroy_ah.3
   ibv_create_ah_from_wc.3 ibv_init_ah_from_wc.3
   ibv_create_comp_channel.3 ibv_destroy_comp_channel.3
+  ibv_create_comp_cntr.3 ibv_destroy_comp_cntr.3
+  ibv_create_comp_cntr.3 ibv_set_comp_cntr.3
+  ibv_create_comp_cntr.3 ibv_set_err_comp_cntr.3
+  ibv_create_comp_cntr.3 ibv_inc_comp_cntr.3
+  ibv_create_comp_cntr.3 ibv_inc_err_comp_cntr.3
   ibv_create_counters.3 ibv_destroy_counters.3
   ibv_create_cq.3 ibv_destroy_cq.3
   ibv_create_flow.3 ibv_destroy_flow.3
Lines changed: 245 additions & 0 deletions
@@ -0,0 +1,245 @@
+---
+date: 2026-02-09
+footer: libibverbs
+header: "Libibverbs Programmer's Manual"
+layout: page
+license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
+section: 3
+title: ibv_create_comp_cntr
+tagline: Verbs
+---
+
+# NAME
+
+**ibv_create_comp_cntr**, **ibv_destroy_comp_cntr** - Create or destroy a
+completion counter
+
+**ibv_set_comp_cntr**, **ibv_set_err_comp_cntr** - Set the value of a
+completion or error counter
+
+**ibv_inc_comp_cntr**, **ibv_inc_err_comp_cntr** - Increment a completion or
+error counter
+
+# SYNOPSIS
+
+```c
+#include <infiniband/verbs.h>
+
+struct ibv_comp_cntr *ibv_create_comp_cntr(struct ibv_context *context,
+                                           struct ibv_comp_cntr_init_attr *cc_attr);
+
+int ibv_destroy_comp_cntr(struct ibv_comp_cntr *comp_cntr);
+
+int ibv_set_comp_cntr(struct ibv_comp_cntr *comp_cntr, uint64_t value);
+int ibv_set_err_comp_cntr(struct ibv_comp_cntr *comp_cntr, uint64_t value);
+int ibv_inc_comp_cntr(struct ibv_comp_cntr *comp_cntr, uint64_t amount);
+int ibv_inc_err_comp_cntr(struct ibv_comp_cntr *comp_cntr, uint64_t amount);
+```
+
+# DESCRIPTION
+
+Completion counters provide a lightweight completion mechanism as an
+alternative or extension to completion queues (CQs). Rather than generating
+individual completion queue entries, a completion counter tracks the aggregate
+number of completed operations. This makes them well suited for applications
+that need to know how many requests have completed without requiring
+per-request details, such as credit based flow control or tracking responses
+from remote peers.
+
+Each completion counter maintains two distinct 64-bit values: a completion
+count that is incremented on successful completions, and an error count that
+is incremented when operations complete in error.
+
+**ibv_create_comp_cntr**() allocates a new completion counter for the RDMA
+device context *context*. The properties of the counter are defined by
+*cc_attr*. On success, the returned **ibv_comp_cntr** structure contains
+pointers to the completion and error count values. The maximum number of
+completion counters a device supports is reported by the *max_comp_cntr*
+field of **ibv_device_attr_ex**.
+
+**ibv_destroy_comp_cntr**() releases all resources associated with the
+completion counter *comp_cntr*. The counter must not be attached to any QP
+when destroyed.
+
+**ibv_set_comp_cntr**() sets the completion count of *comp_cntr* to *value*.
+
+**ibv_set_err_comp_cntr**() sets the error count of *comp_cntr* to *value*.
+
+**ibv_inc_comp_cntr**() increments the completion count of *comp_cntr* by
+*amount*.
+
+**ibv_inc_err_comp_cntr**() increments the error count of *comp_cntr* by
+*amount*.
+
+## External memory
+
+By default, the memory backing the counter values is allocated internally.
+When the **IBV_COMP_CNTR_INIT_WITH_EXTERNAL_MEM** flag is set in
+*ibv_comp_cntr_init_attr.flags*, the application provides its own memory for
+the completion and error counts via the *comp_cntr_ext_mem* and
+*err_cntr_ext_mem* fields. The external memory is described by an
+**ibv_memory_location** structure which supports two modes: a virtual address
+(**IBV_MEMORY_LOCATION_VA**), where the application supplies a direct pointer, or
+a DMA-BUF reference (**IBV_MEMORY_LOCATION_DMABUF**), where the application
+supplies a file descriptor and offset into an exported DMA-BUF. When using
+DMA-BUF, the *ptr* field may also be set to provide a process-accessible
+mapping of the memory; if provided, the *comp_count* and *err_count* pointers
+in the returned **ibv_comp_cntr** will point to it. Using external memory
+allows the counter values to reside in application-managed buffers or in
+memory exported through DMA-BUF, enabling zero-copy observation of completion
+progress by co-located processes or devices.
+
+# ARGUMENTS
+
+## ibv_comp_cntr
+
+```c
+struct ibv_comp_cntr {
+	struct ibv_context *context;
+	uint32_t handle;
+	uint64_t *comp_count;
+	uint64_t *err_count;
+	uint64_t comp_count_max_value;
+	uint64_t err_count_max_value;
+};
+```
+
+*context*
+: Device context associated with the completion counter.
+
+*handle*
+: Kernel object handle for the completion counter.
+
+*comp_count*
+: Pointer to the current successful completion count. When the counter
+is backed by CPU-accessible memory, this pointer may be read directly
+by the application.
+
+*err_count*
+: Pointer to the current error completion count. When the counter is
+backed by CPU-accessible memory, this pointer may be read directly
+by the application.
+
+*comp_count_max_value*
+: The maximum value the completion count can hold. A subsequent
+increment that would exceed this value wraps the counter to zero.
+
+*err_count_max_value*
+: The maximum value the error count can hold. A subsequent increment
+that would exceed this value wraps the counter to zero.
+
+## ibv_comp_cntr_init_attr
+
+```c
+struct ibv_comp_cntr_init_attr {
+	uint32_t comp_mask;
+	uint32_t flags;
+	struct ibv_memory_location comp_cntr_ext_mem;
+	struct ibv_memory_location err_cntr_ext_mem;
+};
+```
+
+*comp_mask*
+: Bitmask specifying what fields in the structure are valid.
+
+*flags*
+: Creation flags. The following flags are supported:
+
+**IBV_COMP_CNTR_INIT_WITH_EXTERNAL_MEM** - Use application-provided
+memory for the counter values, as specified by *comp_cntr_ext_mem*
+and *err_cntr_ext_mem*.
+
+*comp_cntr_ext_mem*
+: Memory location for the completion count when using external memory.
+
+*err_cntr_ext_mem*
+: Memory location for the error count when using external memory.
+
+## ibv_memory_location
+
+```c
+enum ibv_memory_location_type {
+	IBV_MEMORY_LOCATION_VA,
+	IBV_MEMORY_LOCATION_DMABUF,
+};
+
+struct ibv_memory_location {
+	uint8_t *ptr;
+	struct {
+		uint64_t offset;
+		int32_t fd;
+		uint32_t reserved;
+	} dmabuf;
+	uint8_t type;
+	uint8_t reserved[7];
+};
+```
+
+*type*
+: The type of memory location. **IBV_MEMORY_LOCATION_VA** for a virtual
+address, or **IBV_MEMORY_LOCATION_DMABUF** for a DMA-BUF reference.
+
+*ptr*
+: Virtual address pointer. Required when type is
+**IBV_MEMORY_LOCATION_VA**. When type is
+**IBV_MEMORY_LOCATION_DMABUF**, may optionally be set to provide a
+process-accessible mapping of the DMA-BUF memory.
+
+*dmabuf.fd*
+: DMA-BUF file descriptor (used when type is
+**IBV_MEMORY_LOCATION_DMABUF**).
+
+*dmabuf.offset*
+: Offset within the DMA-BUF.
+
+# RETURN VALUE
+
+**ibv_create_comp_cntr**() returns a pointer to the allocated ibv_comp_cntr
+object, or NULL if the request fails (and sets errno to indicate the failure
+reason).
+
+**ibv_destroy_comp_cntr**(), **ibv_set_comp_cntr**(),
+**ibv_set_err_comp_cntr**(), **ibv_inc_comp_cntr**(), and
+**ibv_inc_err_comp_cntr**() return 0 on success, or the value of errno on
+failure (which indicates the failure reason).
+
+# ERRORS
+
+ENOTSUP
+: Completion counters are not supported on this device.
+
+ENOMEM
+: Not enough resources to create the completion counter.
+
+EINVAL
+: Invalid argument(s) passed.
+
+EBUSY
+: The completion counter is still attached to a QP
+(**ibv_destroy_comp_cntr**() only).
+
+# NOTES
+
+Counter values should not be modified directly by writing to the memory
+pointed to by *comp_count* or *err_count*. Applications must use the provided
+API functions (**ibv_set_comp_cntr**(), **ibv_set_err_comp_cntr**(),
+**ibv_inc_comp_cntr**(), **ibv_inc_err_comp_cntr**()) to update counter
+values.
+
+Updates made to counter values (e.g. via **ibv_set_comp_cntr**() or
+**ibv_inc_comp_cntr**()) may not be immediately visible when reading the
+counter. A small delay may occur between the update and the observed value.
+However, the final updated value will eventually be reflected.
+
+Applications should ensure that the counter value is stable before calling
+**ibv_set_comp_cntr**() or **ibv_set_err_comp_cntr**(). Otherwise, concurrent
+updates may be lost.
+
+# SEE ALSO
+
+**ibv_qp_attach_comp_cntr**(3), **ibv_create_cq**(3),
+**ibv_create_cq_ex**(3), **ibv_create_qp**(3)
+
+# AUTHORS
+
+Michael Margolin <mrgolin@amazon.com>
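
The man page above says that an increment past *comp_count_max_value* or *err_count_max_value* wraps the counter to zero. The following is a minimal sketch of how an application might measure progress between two reads of such a counter. It is not part of the commit. It assumes the count advances modulo `max_value + 1` and that fewer than `max_value + 1` completions occur between the two reads. The helper names `comp_cntr_delta` and `comp_cntr_reached` are hypothetical, and the snapshots are plain integers that the caller would obtain by reading `*comp_count` when the counter is backed by CPU-accessible memory.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a libibverbs function): number of completions
 * between two snapshots of a counter whose largest representable value is
 * max_value. Assumes the counter advances modulo (max_value + 1) and wrapped
 * at most once between the snapshots.
 */
static uint64_t comp_cntr_delta(uint64_t prev, uint64_t cur, uint64_t max_value)
{
	if (cur >= prev)
		return cur - prev;

	/* Wrapped: steps from prev up to max_value, one step to zero, then cur. */
	return (max_value - prev) + 1 + cur;
}

/*
 * Hypothetical helper: true once at least `needed` completions have been
 * counted since the snapshot `start`, for example to decide whether a batch
 * has finished or enough credits have been returned.
 */
static bool comp_cntr_reached(uint64_t start, uint64_t cur, uint64_t max_value,
			      uint64_t needed)
{
	return comp_cntr_delta(start, cur, max_value) >= needed;
}
```

A caller would typically snapshot the count before posting a batch and then poll with `comp_cntr_reached()`. Since the man page notes that updates may become visible with a small delay, a result of `false` only means the batch has not been observed as complete yet.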
