
Commit dc64a51

Add packet trimming API (#2077)
When a packet is lost, it can be recovered through fast retransmission (e.g., Go-Back-N in RoCE) or by using timeouts. Retransmission triggered by timeouts typically incurs significant latency. Packet trimming aims to facilitate rapid packet loss notification and, consequently, eliminate slow timeout-based retransmissions.

Signed-off-by: Marian Pritsak <marianp@mellanox.com>
1 parent b2be076 commit dc64a51

6 files changed

Lines changed: 288 additions & 1 deletion


Lines changed: 166 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,166 @@
1+
# Switch Abstraction Interface Change Proposal for Packet Trimming
2+
3+
Title | Packet Trimming
4+
------------|----------------
5+
Authors | Nvidia
6+
Status | In review
7+
Type | Standards track
8+
Created | 8/28/2024
9+
SAI-Version | 1.14
10+
----------
11+
12+
## Overview
13+
When a lossy queue exceeds its buffer threshold, it drops packets without notifying the destination host.
14+
15+
When a packet is lost, it can be recovered through fast retransmission (e.g., Go-Back-N in RoCE) or by using timeouts. Retransmission triggered by timeouts typically incurs significant latency. Packet trimming aims to facilitate rapid packet loss notification and, consequently, eliminate slow timeout-based retransmissions.
16+
17+
To help the host recover data more quickly and accurately, we introduce a packet trimming feature: when a packet fails admission to a shared buffer, the switch
18+
trims it to a configured size and tries to send it on a different queue, delivering a packet drop notification to the end host.
19+
20+
```
21+
22+
┌───────────────┐
23+
│ │
24+
│Trimmed packet │
25+
│ │
26+
└───────────────┘
27+
28+
┌─┬─┬─┬─┬────────┐
29+
│ │ │ │ │ │
30+
│ │ │ │ │ │
31+
┌────────────────► │ │ │ │ │
32+
│ │ │ │ │ │ │ Queue
33+
│ │ │ │ │ │ │
34+
│ │ │ │ │ │ │
35+
│ └─┴─┴─┴─┴────────┘
36+
┌──────────────┐ │
37+
│ │ ┌──────────────────────────────────────────────────────┐ │ ┌─┬─┬─┬─┬─┬─┬─┬─┬┐
38+
│ │ │ │ │ │ │ │ │ │ │ │ │ ││
39+
│ │ │ │ │ \ / │ │ │ │ │ │ │ │ ││
40+
│ │ │ │ │ \ / │ │ │ │ │ │ │ │ ││
41+
│ Packet │ │ Pipeline ┼────┼───────\────────► │ │ │ │ │ │ │ ││ Queue
42+
│ │ │ │ / \ │ │ │ │ │ │ │ │ ││
43+
│ │ │ │ / \ │ │ │ │ │ │ │ │ ││
44+
│ │ └──────────────────────────────────────────────────────┘ └─┴─┴─┴─┴─┴─┴─┴─┴┘
45+
│ │
46+
│ │
47+
│ │
48+
└──────────────┘
49+
```
50+
51+
This feature assumes that forwarding tables are configured properly, and the original packet would be delivered to the destination successfully if not for the congestion.
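The behavior described above can be modeled in a few lines of C. This is a minimal sketch under stated assumptions, not the hardware algorithm: the types and the function `admit_packet` are hypothetical, admission is reduced to a single byte threshold per queue, and the trimmed copy is assumed to pass through the same kind of admission check on the trim queue.

```c
#include <stdint.h>

/* Hypothetical model types for illustration; these are not SAI definitions. */
typedef enum { ADMIT_FAIL_DROP, ADMIT_FAIL_DROP_AND_TRIM } admit_fail_action_t;
typedef enum { RESULT_ENQUEUED, RESULT_DROPPED, RESULT_TRIMMED } admit_result_t;

typedef struct {
    uint32_t occupancy;          /* bytes currently buffered in the queue */
    uint32_t threshold;          /* admission threshold in bytes (assumed static here) */
    admit_fail_action_t action;  /* what to do when admission fails */
} queue_model_t;

/*
 * Try to admit a packet of pkt_len bytes to queue q. If admission fails and
 * trimming is enabled, a copy shortened to at most trim_size bytes is offered
 * to trim_q. *sent_len receives the number of bytes actually enqueued.
 */
admit_result_t admit_packet(queue_model_t *q, queue_model_t *trim_q,
                            uint32_t pkt_len, uint32_t trim_size,
                            uint32_t *sent_len)
{
    *sent_len = 0;

    if ((uint64_t)q->occupancy + pkt_len <= q->threshold) {
        q->occupancy += pkt_len;
        *sent_len = pkt_len;
        return RESULT_ENQUEUED;
    }

    /* Admission failed: the original packet is always dropped. */
    if (q->action != ADMIT_FAIL_DROP_AND_TRIM) {
        return RESULT_DROPPED;
    }

    uint32_t trimmed_len = pkt_len < trim_size ? pkt_len : trim_size;
    if ((uint64_t)trim_q->occupancy + trimmed_len > trim_q->threshold) {
        return RESULT_DROPPED;   /* assumption: the trimmed copy can be dropped too */
    }

    trim_q->occupancy += trimmed_len;
    *sent_len = trimmed_len;
    return RESULT_TRIMMED;
}
```

With a nearly full queue and trimming enabled, a 1500-byte packet would leave this model as a 128-byte notification on the trim queue instead of disappearing silently.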
52+
53+
## Spec
54+
There is a tradeoff between trying to configure a higher threshold in a queue buffer profile and trimming the packet.
55+
56+
If the user chooses to configure higher thresholds for queues, the probability of a drop on a particular queue is lower only if other ports are less congested at the moment.
57+
58+
However, if all the ports are equally utilized, it makes sense to create a different buffer profile for these queues, with a stricter threshold, to improve fairness in the shared buffer.
59+
60+
A static trimming threshold may not be effective with shared buffer switches, where the buffer resources allocated to a queue or port can vary over time. Therefore, we propose adding a new attribute to a buffer profile to allow configuring packet trimming on such stricter profiles:
61+
```
62+
/**
63+
* @brief Enum defining queue actions in case the packet fails to pass the admission control.
64+
*/
65+
typedef enum _sai_buffer_profile_packet_admission_fail_action_t
66+
{
67+
/**
68+
* @brief Drop the packet.
69+
*
70+
* Default action. Packet has nowhere to go
71+
* and will be dropped.
72+
*/
73+
SAI_BUFFER_PROFILE_PACKET_ADMISSION_FAIL_ACTION_DROP,
74+
75+
/**
76+
* @brief Trim the packet.
77+
*
78+
* Try sending a shortened packet over a different
79+
* queue. The original packet will be dropped and a trimmed copy of the packet will be sent.
80+
* The IP length and checksum fields will be updated in a trimmed copy.
81+
* SAI_QUEUE_STAT_DROPPED_PACKETS as well as SAI_QUEUE_STAT_DROPPED_BYTES
82+
* will count the original discarded frames even if they will be trimmed afterwards.
83+
* Interface statistics must show dropped packets.
84+
* Interface statistics may show sent trimmed packets.
85+
*/
86+
SAI_BUFFER_PROFILE_PACKET_ADMISSION_FAIL_ACTION_DROP_AND_TRIM,
87+
} sai_buffer_profile_packet_admission_fail_action_t;
88+
```
89+
```
90+
/**
91+
* @brief Buffer profile discard action
92+
*
93+
* Action to be taken upon packet discard due to
94+
* buffer profile configuration. Applicable only
95+
* when attached to a queue.
96+
*
97+
* @type sai_buffer_profile_packet_admission_fail_action_t
98+
* @flags CREATE_AND_SET
99+
* @default SAI_BUFFER_PROFILE_PACKET_ADMISSION_FAIL_ACTION_DROP
100+
*/
101+
SAI_BUFFER_PROFILE_ATTR_PACKET_ADMISSION_FAIL_ACTION,
102+
```
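As a rough illustration of how an application might request trimming on a buffer profile, the sketch below fills in an attribute with the new action. It is self-contained on purpose, so everything prefixed `example_` or `EXAMPLE_` is a stand-in: real code would include the SAI headers, use `sai_attribute_t` with the enum stored in `value.s32`, and pass the result to `set_buffer_profile_attribute` from `sai_buffer_api_t`. The numeric attribute id is a placeholder, not the real value.

```c
#include <stdint.h>

/* Stand-ins mirroring the shape of the SAI types; not the real definitions. */
typedef struct {
    uint32_t id;                 /* attribute id */
    union {
        int32_t  s32;            /* SAI carries enum values as signed 32-bit */
        uint32_t u32;
        uint8_t  u8;
    } value;
} example_attribute_t;

/* Mirrors the order of sai_buffer_profile_packet_admission_fail_action_t. */
typedef enum {
    EXAMPLE_FAIL_ACTION_DROP,            /* 0: default */
    EXAMPLE_FAIL_ACTION_DROP_AND_TRIM    /* 1 */
} example_fail_action_t;

/* Placeholder id; real code uses SAI_BUFFER_PROFILE_ATTR_PACKET_ADMISSION_FAIL_ACTION. */
#define EXAMPLE_ATTR_ID_PACKET_ADMISSION_FAIL_ACTION 0xFFFFu

/* Prepare the attribute that switches a buffer profile to drop-and-trim. */
void example_make_trim_action_attr(example_attribute_t *attr)
{
    attr->id = EXAMPLE_ATTR_ID_PACKET_ADMISSION_FAIL_ACTION;
    attr->value.s32 = EXAMPLE_FAIL_ACTION_DROP_AND_TRIM;
}
```

Because the attribute is `CREATE_AND_SET`, the same attribute could presumably also be supplied in the attribute list when the profile is created. Per the description above, it only has an effect when the profile is attached to a queue.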
103+
104+
Trimming engine attributes are configured globally.
105+
```
106+
/**
107+
* @brief Trim packets to this size to reduce bandwidth
108+
*
109+
* @type sai_uint32_t
110+
* @flags CREATE_AND_SET
111+
* @default 128
112+
*/
113+
SAI_SWITCH_ATTR_PACKET_TRIM_SIZE,
114+
115+
/**
116+
* @brief New packet trim DSCP value
117+
*
118+
* @type sai_uint8_t
119+
* @flags CREATE_AND_SET
120+
* @default 0
121+
*/
122+
SAI_SWITCH_ATTR_PACKET_TRIM_DSCP_VALUE,
123+
124+
/**
125+
* @brief Queue mapping mode for a trimmed packet
126+
*
127+
* @type sai_packet_trim_queue_resolution_mode_t
128+
* @flags CREATE_AND_SET
129+
* @default SAI_PACKET_TRIM_QUEUE_RESOLUTION_MODE_STATIC
130+
*/
131+
SAI_SWITCH_ATTR_PACKET_TRIM_QUEUE_RESOLUTION_MODE,
132+
133+
/**
134+
* @brief New packet trim queue index
135+
*
136+
* @type sai_uint8_t
137+
* @flags CREATE_AND_SET
138+
* @default 0
139+
* @validonly SAI_SWITCH_ATTR_PACKET_TRIM_QUEUE_RESOLUTION_MODE == SAI_PACKET_TRIM_QUEUE_RESOLUTION_MODE_STATIC
140+
*/
141+
SAI_SWITCH_ATTR_PACKET_TRIM_QUEUE_INDEX,
142+
```
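The interaction between the resolution mode, the queue index, and the DSCP value may be easier to see in code. The following is a sketch only: `trim_config_t`, `trim_config_defaults`, and `resolve_trim_queue` are hypothetical names, and the two lookup tables (64 DSCP values, 8 traffic classes) are an assumed simplification of the QoS maps a real switch would apply.

```c
#include <stdint.h>

/* Hypothetical mirror of the global trimming attributes; not SAI definitions. */
typedef enum { TRIM_QUEUE_MODE_STATIC, TRIM_QUEUE_MODE_DYNAMIC } trim_queue_mode_t;

typedef struct {
    uint32_t trim_size;        /* bytes kept from the original packet */
    uint8_t dscp;              /* DSCP written into the trimmed packet */
    trim_queue_mode_t mode;    /* how the egress queue is chosen */
    uint8_t queue_index;       /* used only in static mode */
} trim_config_t;

/* Defaults as documented in the attribute comments: 128, 0, static, 0. */
trim_config_t trim_config_defaults(void)
{
    trim_config_t cfg = {128, 0, TRIM_QUEUE_MODE_STATIC, 0};
    return cfg;
}

/*
 * Pick the queue for a trimmed packet.
 * Static mode: the configured queue index is used directly.
 * Dynamic mode: the new DSCP is mapped to a traffic class, then to a queue.
 */
uint8_t resolve_trim_queue(const trim_config_t *cfg,
                           const uint8_t dscp_to_tc[64],
                           const uint8_t tc_to_queue[8])
{
    if (cfg->mode == TRIM_QUEUE_MODE_STATIC) {
        return cfg->queue_index;
    }
    uint8_t tc = dscp_to_tc[cfg->dscp & 0x3F];   /* DSCP is a 6-bit field */
    return tc_to_queue[tc & 0x07];               /* assumption: 8 traffic classes */
}
```

In static mode the DSCP value still marks the trimmed packet but plays no part in choosing the queue, which matches the `@validonly` condition on the queue index attribute.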
143+
144+
If more granularity is needed (e.g., trimming only a specific protocol, or only some packets within a protocol), an ACL action is provided that disables trimming for matching packets, even when a packet is eligible because its queue has a buffer profile with trimming enabled.
145+
```
146+
/**
147+
* @brief Disable packet trim for a given match condition.
148+
*
149+
* This rule takes effect only when packet trim is configured on a buffer profile of a queue to which a packet belongs.
150+
*
151+
* @type sai_acl_action_data_t bool
152+
* @flags CREATE_AND_SET
153+
* @default disabled
154+
*/
155+
SAI_ACL_ENTRY_ATTR_ACTION_PACKET_TRIM_DISABLE = SAI_ACL_ENTRY_ATTR_ACTION_START + 0x39,
156+
```
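The resulting eligibility rule is small enough to state as a predicate. This is a sketch of the intended semantics as described here, with a hypothetical function name; it is not taken from any implementation.

```c
#include <stdbool.h>

/*
 * A packet that failed admission is trimmed only if the buffer profile of its
 * queue requests drop-and-trim AND no matching ACL entry disabled trimming.
 * The ACL action can only veto trimming; it never enables it on its own.
 */
bool packet_is_trim_eligible(bool profile_trim_enabled, bool acl_trim_disable)
{
    return profile_trim_enabled && !acl_trim_disable;
}
```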
157+
158+
Both the queue and the port have a packet counter that reflects the number of trimmed packets.
159+
```
160+
/** Packets trimmed due to failed shared buffer admission [uint64_t] */
161+
SAI_PORT_STAT_TRIM_PACKETS,
162+
```
163+
```
164+
/** Packets trimmed due to failed admission [uint64_t] */
165+
SAI_QUEUE_STAT_TRIM_PACKETS = 0x00000028,
166+
```
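Combined with the earlier note that the dropped-packet and dropped-byte statistics still count the original frame, the accounting on a failed admission could look roughly like this. The struct and function are hypothetical; only the counting rule comes from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue counters, loosely named after the SAI statistics. */
typedef struct {
    uint64_t dropped_packets;   /* cf. SAI_QUEUE_STAT_DROPPED_PACKETS */
    uint64_t dropped_bytes;     /* cf. SAI_QUEUE_STAT_DROPPED_BYTES */
    uint64_t trim_packets;      /* cf. SAI_QUEUE_STAT_TRIM_PACKETS */
} queue_counters_t;

/*
 * Account for a packet that failed admission. The original frame is counted
 * as dropped in both cases; the trim counter increases only when a trimmed
 * copy was produced.
 */
void count_admission_failure(queue_counters_t *c, uint32_t original_len, bool trimmed)
{
    c->dropped_packets += 1;
    c->dropped_bytes += original_len;
    if (trimmed) {
        c->trim_packets += 1;
    }
}
```

Under this rule, the gap between dropped packets and trimmed packets on a queue gives the number of losses for which no notification was generated.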

inc/saiacl.h

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -288,6 +288,9 @@ typedef enum _sai_acl_action_type_t
288288

289289
/** Next Chain Group */
290290
SAI_ACL_ACTION_TYPE_CHAIN_REDIRECT = 0x00000038,
291+
292+
/** Disable packet trim */
293+
SAI_ACL_ACTION_TYPE_PACKET_TRIM_DISABLE = 0x00000039,
291294
} sai_acl_action_type_t;
292295

293296
/**
@@ -3289,10 +3292,21 @@ typedef enum _sai_acl_entry_attr_t
32893292
*/
32903293
SAI_ACL_ENTRY_ATTR_ACTION_CHAIN_REDIRECT = SAI_ACL_ENTRY_ATTR_ACTION_START + 0x38,
32913294

3295+
/**
3296+
* @brief Disable packet trim for a given match condition.
3297+
*
3298+
* This rule takes effect only when packet trim is configured on a buffer profile of a queue to which a packet belongs.
3299+
*
3300+
* @type sai_acl_action_data_t bool
3301+
* @flags CREATE_AND_SET
3302+
* @default disabled
3303+
*/
3304+
SAI_ACL_ENTRY_ATTR_ACTION_PACKET_TRIM_DISABLE = SAI_ACL_ENTRY_ATTR_ACTION_START + 0x39,
3305+
32923306
/**
32933307
* @brief End of Rule Actions
32943308
*/
3295-
SAI_ACL_ENTRY_ATTR_ACTION_END = SAI_ACL_ENTRY_ATTR_ACTION_CHAIN_REDIRECT,
3309+
SAI_ACL_ENTRY_ATTR_ACTION_END = SAI_ACL_ENTRY_ATTR_ACTION_PACKET_TRIM_DISABLE,
32963310

32973311
/**
32983312
* @brief End of ACL Entry attributes

inc/saibuffer.h

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -590,6 +590,33 @@ typedef enum _sai_buffer_profile_threshold_mode_t
590590

591591
} sai_buffer_profile_threshold_mode_t;
592592

593+
/**
594+
* @brief Enum defining queue actions in case the packet fails to pass the admission control.
595+
*/
596+
typedef enum _sai_buffer_profile_packet_admission_fail_action_t
597+
{
598+
/**
599+
* @brief Drop the packet.
600+
*
601+
* Default action. Packet has nowhere to go
602+
* and will be dropped.
603+
*/
604+
SAI_BUFFER_PROFILE_PACKET_ADMISSION_FAIL_ACTION_DROP,
605+
606+
/**
607+
* @brief Trim the packet.
608+
*
609+
* Try sending a shortened packet over a different
610+
* queue. The original packet will be dropped and a trimmed copy of the packet will be sent.
611+
* The IP length and checksum fields will be updated in a trimmed copy.
612+
* SAI_QUEUE_STAT_DROPPED_PACKETS as well as SAI_QUEUE_STAT_DROPPED_BYTES
613+
* will count the original discarded frames even if they will be trimmed afterwards.
614+
* Interface statistics must show dropped packets.
615+
* Interface statistics may show sent trimmed packets.
616+
*/
617+
SAI_BUFFER_PROFILE_PACKET_ADMISSION_FAIL_ACTION_DROP_AND_TRIM,
618+
} sai_buffer_profile_packet_admission_fail_action_t;
619+
593620
/**
594621
* @brief Enum defining buffer profile attributes.
595622
*/
@@ -711,6 +738,19 @@ typedef enum _sai_buffer_profile_attr_t
711738
*/
712739
SAI_BUFFER_PROFILE_ATTR_XON_OFFSET_TH,
713740

741+
/**
742+
* @brief Buffer profile discard action
743+
*
744+
* Action to be taken upon packet discard due to
745+
* buffer profile configuration. Applicable only
746+
* when attached to a queue.
747+
*
748+
* @type sai_buffer_profile_packet_admission_fail_action_t
749+
* @flags CREATE_AND_SET
750+
* @default SAI_BUFFER_PROFILE_PACKET_ADMISSION_FAIL_ACTION_DROP
751+
*/
752+
SAI_BUFFER_PROFILE_ATTR_PACKET_ADMISSION_FAIL_ACTION,
753+
714754
/**
715755
* @brief End of attributes
716756
*/

inc/saiport.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3352,6 +3352,9 @@ typedef enum _sai_port_stat_t
33523352
/** Count of total bits corrected by FEC. Counter will increment monotonically. */
33533353
SAI_PORT_STAT_IF_IN_FEC_CORRECTED_BITS,
33543354

3355+
/** Packets trimmed due to failed shared buffer admission [uint64_t] */
3356+
SAI_PORT_STAT_TRIM_PACKETS,
3357+
33553358
/** Port stat in drop reasons range start */
33563359
SAI_PORT_STAT_IN_DROP_REASON_RANGE_BASE = 0x00001000,
33573360

inc/saiqueue.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -418,6 +418,9 @@ typedef enum _sai_queue_stat_t
418418
/** Queue delay watermark in nanoseconds [uint64_t] */
419419
SAI_QUEUE_STAT_DELAY_WATERMARK_NS = 0x00000027,
420420

421+
/** Packets trimmed due to failed admission [uint64_t] */
422+
SAI_QUEUE_STAT_TRIM_PACKETS = 0x00000028,
423+
421424
/** Custom range base value */
422425
SAI_QUEUE_STAT_CUSTOM_RANGE_BASE = 0x10000000
423426

inc/saiswitch.h

Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -610,6 +610,30 @@ typedef enum _sai_switch_hostif_oper_status_update_mode_t
610610

611611
} sai_switch_hostif_oper_status_update_mode_t;
612612

613+
/**
614+
* @brief Attribute data for SAI_SWITCH_ATTR_PACKET_TRIM_QUEUE_RESOLUTION_MODE.
615+
*/
616+
typedef enum _sai_packet_trim_queue_resolution_mode_t
617+
{
618+
/**
619+
* @brief Static queue resolution.
620+
*
621+
* In this mode, a new queue for the trimmed packet is set directly
622+
* by the application.
623+
*/
624+
SAI_PACKET_TRIM_QUEUE_RESOLUTION_MODE_STATIC,
625+
626+
/**
627+
* @brief Dynamic queue resolution.
628+
*
629+
* In this mode, a new queue for the trimmed packet is resolved
630+
* using QOS maps (DSCP value to TC to Queue), applied to a new
631+
* DSCP value that was provided for a trimmed packet.
632+
*/
633+
SAI_PACKET_TRIM_QUEUE_RESOLUTION_MODE_DYNAMIC,
634+
635+
} sai_packet_trim_queue_resolution_mode_t;
636+
613637
/**
614638
* @brief Attribute Id in sai_set_switch_attribute() and
615639
* sai_get_switch_attribute() calls.
@@ -3124,6 +3148,43 @@ typedef enum _sai_switch_attr_t
31243148
*/
31253149
SAI_SWITCH_ATTR_TAM_TEL_TYPE_CONFIG_CHANGE_NOTIFY,
31263150

3151+
/**
3152+
* @brief Trim packets to this size to reduce bandwidth
3153+
*
3154+
* @type sai_uint32_t
3155+
* @flags CREATE_AND_SET
3156+
* @default 128
3157+
*/
3158+
SAI_SWITCH_ATTR_PACKET_TRIM_SIZE,
3159+
3160+
/**
3161+
* @brief New packet trim DSCP value
3162+
*
3163+
* @type sai_uint8_t
3164+
* @flags CREATE_AND_SET
3165+
* @default 0
3166+
*/
3167+
SAI_SWITCH_ATTR_PACKET_TRIM_DSCP_VALUE,
3168+
3169+
/**
3170+
* @brief Queue mapping mode for a trimmed packet
3171+
*
3172+
* @type sai_packet_trim_queue_resolution_mode_t
3173+
* @flags CREATE_AND_SET
3174+
* @default SAI_PACKET_TRIM_QUEUE_RESOLUTION_MODE_STATIC
3175+
*/
3176+
SAI_SWITCH_ATTR_PACKET_TRIM_QUEUE_RESOLUTION_MODE,
3177+
3178+
/**
3179+
* @brief New packet trim queue index
3180+
*
3181+
* @type sai_uint8_t
3182+
* @flags CREATE_AND_SET
3183+
* @default 0
3184+
* @validonly SAI_SWITCH_ATTR_PACKET_TRIM_QUEUE_RESOLUTION_MODE == SAI_PACKET_TRIM_QUEUE_RESOLUTION_MODE_STATIC
3185+
*/
3186+
SAI_SWITCH_ATTR_PACKET_TRIM_QUEUE_INDEX,
3187+
31273188
/**
31283189
* @brief End of attributes
31293190
*/
