Problem
The NewStorage function in connect/connect.go uses a single ctx context.Context parameter for both etcd and TCS connection attempts. When the caller passes a context with a deadline (e.g., context.WithTimeout(ctx, 5*time.Second)), the deadline applies to the entire NewStorage call, not to each individual connection attempt.
This causes a problem when connecting to TCS after etcd fails:
- NewEtcdStorage(ctx, cfg) is called first
- Inside createEtcdClient, it creates statusCtx with context.WithTimeout(ctx, cfg.dialTimeout())
- If etcd is unavailable, the full dialTimeout is consumed
- By the time NewTCSStorage(ctx, cfg) is called, the parent ctx deadline has already expired
- TCS connection fails immediately with context deadline exceeded, even though TCS is available
Example
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
cfg := connect.Config{
	Endpoints:   []string{"localhost:4401"},
	Username:    "client",
	Password:    "secret",
	DialTimeout: 5 * time.Second,
}
// This fails with TCS even though TCS is available,
// because etcd attempt consumed the entire 5s deadline.
stg, cleanup, err := connect.NewStorage(ctx, cfg)
Current Behavior
- With ctx having a 5s deadline and DialTimeout: 5s:
  - etcd attempt takes ~5s (timeout)
  - TCS attempt fails immediately (parent ctx expired)
  - Result: Cannot connect to TCS when etcd is unavailable
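The timeline above can be reproduced without etcd or TCS. The following is a minimal sketch using only the standard library; attempt and sharedDeadlineDemo are hypothetical stand-ins, not functions from the connect package, and the timeouts are shortened to 50ms. It relies on the documented behavior that context.WithTimeout can never extend a parent's deadline.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// attempt simulates one connection attempt (hypothetical stand-in).
// It derives a child timeout from ctx, the same pattern the issue describes
// for createEtcdClient, and would succeed after the given work duration.
func attempt(ctx context.Context, dialTimeout, work time.Duration) error {
	child, cancel := context.WithTimeout(ctx, dialTimeout)
	defer cancel()
	select {
	case <-time.After(work):
		return nil
	case <-child.Done():
		return child.Err()
	}
}

// sharedDeadlineDemo runs two attempts against one parent context whose
// deadline equals the dial timeout. It reports whether each attempt timed out.
func sharedDeadlineDemo() (firstTimedOut, secondTimedOut bool) {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	// First backend is unreachable: it never answers within any timeout.
	first := attempt(ctx, 50*time.Millisecond, time.Hour)
	// Second backend is healthy and would answer in 1ms, but the parent
	// deadline has already passed, so the child context starts out expired.
	second := attempt(ctx, 50*time.Millisecond, time.Millisecond)

	return errors.Is(first, context.DeadlineExceeded),
		errors.Is(second, context.DeadlineExceeded)
}

func main() {
	first, second := sharedDeadlineDemo()
	fmt.Println("first attempt timed out:", first)
	fmt.Println("second attempt timed out:", second)
}
```

Both attempts report a timeout, although the second simulated backend is healthy.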
Expected Behavior
Each connection attempt (etcd and TCS) should have its own independent timeout, so that if etcd fails, TCS still has a full DialTimeout window to connect.
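One possible shape for this behavior is sketched below. It is not the connect package's implementation: dialFunc, attemptContext, and connectWithFallback are hypothetical names, and the sketch assumes Go 1.21 or later for context.WithoutCancel and context.AfterFunc. Each attempt gets a fresh timeout window that ignores the caller's deadline, while explicit cancellation of the caller's context is still forwarded.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// dialFunc is a hypothetical stand-in for one backend connection attempt.
type dialFunc func(ctx context.Context) error

// attemptContext returns a context with a fresh dialTimeout window. It keeps
// the parent's values, ignores the parent's deadline, and forwards explicit
// cancellation of the parent.
func attemptContext(parent context.Context, dialTimeout time.Duration) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithTimeout(context.WithoutCancel(parent), dialTimeout)
	stop := context.AfterFunc(parent, func() {
		// Only an explicit cancel is forwarded; deadline expiry is ignored.
		if errors.Is(parent.Err(), context.Canceled) {
			cancel()
		}
	})
	return ctx, func() { stop(); cancel() }
}

// connectWithFallback tries each dial function in order, giving every
// attempt its own timeout. It stops early if the caller cancels explicitly.
func connectWithFallback(parent context.Context, dialTimeout time.Duration, dials ...dialFunc) error {
	var errs []error
	for _, dial := range dials {
		if errors.Is(parent.Err(), context.Canceled) {
			errs = append(errs, parent.Err())
			break
		}
		ctx, cancel := attemptContext(parent, dialTimeout)
		err := dial(ctx)
		cancel()
		if err == nil {
			return nil
		}
		errs = append(errs, err)
	}
	return errors.Join(errs...)
}

// unreachable never answers; healthy answers after 1ms.
func unreachable(ctx context.Context) error { <-ctx.Done(); return ctx.Err() }

func healthy(ctx context.Context) error {
	select {
	case <-time.After(time.Millisecond):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// fallbackDemo uses a parent deadline equal to the dial timeout.
func fallbackDemo() error {
	parent, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	return connectWithFallback(parent, 50*time.Millisecond, unreachable, healthy)
}

// cancelDemo cancels the parent mid-attempt and reports whether the call
// returned a cancellation error.
func cancelDemo() bool {
	parent, cancel := context.WithCancel(context.Background())
	time.AfterFunc(10*time.Millisecond, cancel)
	err := connectWithFallback(parent, time.Second, unreachable, healthy)
	return errors.Is(err, context.Canceled)
}

func main() {
	fmt.Println("fallback error:", fallbackDemo())
	fmt.Println("cancel observed:", cancelDemo())
}
```

A limitation of this sketch: once the caller's deadline has passed, the caller's context is already done, so a later explicit cancel can no longer be observed through it. Whether the caller's deadline should be ignored entirely or treated as an overall budget is a design decision for the maintainers.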
Workaround
Callers can currently work around this by passing context.Background() (without a deadline) and relying on DialTimeout alone. However, this loses the ability to cancel the operation externally (e.g., on application shutdown).
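A variation that may keep external cancellation is to pass a context that is cancelable but has no deadline, for example one from context.WithCancel or signal.NotifyContext. The sketch below uses a hypothetical dial function in place of the real connection attempts; it assumes, as the issue describes, that each attempt derives its own timeout from the context it receives.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// dial is a hypothetical stand-in for one connection attempt that applies
// its own dialTimeout on top of ctx.
func dial(ctx context.Context, dialTimeout, work time.Duration) error {
	child, cancel := context.WithTimeout(ctx, dialTimeout)
	defer cancel()
	select {
	case <-time.After(work):
		return nil
	case <-child.Done():
		return child.Err()
	}
}

// workaroundDemo passes a cancelable context without a deadline, so each
// attempt gets a full dialTimeout window.
func workaroundDemo() (hasDeadline, firstTimedOut, secondOK bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	_, hasDeadline = ctx.Deadline()

	first := dial(ctx, 50*time.Millisecond, time.Hour)         // unreachable backend
	second := dial(ctx, 50*time.Millisecond, time.Millisecond) // healthy backend
	return hasDeadline, errors.Is(first, context.DeadlineExceeded), second == nil
}

// shutdownDemo cancels the context mid-attempt, as an application shutdown
// handler might, and reports whether the attempt was interrupted.
func shutdownDemo() bool {
	ctx, cancel := context.WithCancel(context.Background())
	time.AfterFunc(10*time.Millisecond, cancel)
	err := dial(ctx, time.Second, time.Hour)
	return errors.Is(err, context.Canceled)
}

func main() {
	hasDeadline, firstTimedOut, secondOK := workaroundDemo()
	fmt.Println(hasDeadline, firstTimedOut, secondOK)
	fmt.Println("interrupted by cancel:", shutdownDemo())
}
```

The trade-off is that the caller gives up an overall time budget: with two backends the call can take up to twice DialTimeout.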