There is a bug in the Recalc strategy in how the agent retrieves statistics from the HAProxy stick table: the values are reset every time HAProxy restarts, which leads to underestimated request counters. Resetting on restart is intended HAProxy behavior, but it affects the agent's weight calculation logic.
A partial fix was introduced in commit 15f8a65: the agent now reads the request rate for the last 1-second window directly from the HAProxy stick table. This is not a complete fix, because the rate covers only the most recent second rather than the whole recalculation period (default: 1 minute), but it may be sufficient, since the agent bases its decisions on 1-second data in each recalculation period.
See:
```go
for _, stEntry := range stContent {
	// This stick table contains a single key, "80", which tracks both
	// the number of requests within the given time window and the rate
	// per second during the recalculation period. We only use the rate.
	//
	// Note: Do not use http_req_cnt, as HAProxy restarts every
	// recalculation period, causing all counters to reset.
	//
	// FIXME: The http_req_rate value is taken from the previous
	// 1-second period, not averaged over the entire recalculation
	// period. This is a known limitation of the current strategy.
	strategy.userRates[funcName] = float64(stEntry.HTTPReqRate)
```

```
# Stick table for invocations of function {{$funcName}} from users, not from
# other DFaaS nodes. We just use one row for all clients. Denied requests are
# not counted here.
#
# Among all stick tables, this is the only one used by the DFaaS agent to
# calculate forwarding weights.
backend st_users_func_{{$funcName}}
    stick-table type integer size 10 expire {{$.SecsRecalc}}s store http_req_cnt,http_req_rate(1s)
```
The excerpts above come from:
dfaas/dfaasagent/agent/loadbalancer/recalcstrategy.go (lines 181 to 192 in 15f8a65)
dfaas/dfaasagent/agent/loadbalancer/haproxycfgrecalc.tmpl (lines 25 to 32 in 15f8a65)