Extreme Performance Variability (CV > 10%) in Function Workloads Despite CPU Pinning #861

@LUOQIHuang

Description

Environment

  • Firecracker Version: v1.1.0
  • Host Kernel: 6.8.0-101-generic
  • Guest OS: Ubuntu 22.04
  • CPU: Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz
  • Setup: FaaS with gRPC, Firecracker + fc_vcpu threads pinned to a dedicated core

Problem Description

When running serverless functions via gRPC in Firecracker VMs, we observe significant performance instability despite strict CPU core pinning. The BFS graph-traversal function shows a coefficient of variation (CV) of roughly 213% in cycle counts for identical inputs, making it impossible to provide performance SLO guarantees.

Performance Data (50 iterations per function)

Function                     Cycles CV     Instructions CV   IPC     Kernel Mode %
fibonacci (compute-bound)    8.90%         15.35%            1.123   98.9%
matmul (compute-bound)       9.26%         13.88%            1.032   98.8%
bfs (memory-intensive)       212.96% ⚠️    325.50% ⚠️        0.661   99.9% ⚠️
image_processing             5.57%         11.63%            0.639   99.7%
video_processing             13.06%        32.79%            0.495   99.3%

Key Observations

  1. Unacceptable variance: BFS shows 3x variation in execution time for identical inputs
  2. Kernel mode dominance: All workloads spend 98-99% of time in kernel mode (expected: <20%)
    • User mode cycles: ~20K
    • Kernel mode cycles: ~1.5M - 27M

Suspected Root Causes

The extreme kernel mode percentage suggests:

  • Excessive VM exits - triggered by gRPC network I/O (likely EPT_VIOLATION, EXTERNAL_INTERRUPT)
  • Interrupt injection overhead - network packets triggering frequent guest interrupts
  • Virtio-net inefficiency - possible lack of interrupt coalescing or ring buffer batching

For BFS specifically:

  • Random memory access patterns amplify timing jitter from VM exits
  • Hypothesis: Unpredictable VM exit timing disrupts cache locality
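One way to test this hypothesis is to attribute the kernel-mode time to specific exit reasons. A minimal sketch using `perf kvm stat` (assuming perf with KVM tracepoint support is available on the host, and `$FC_PID` holds the Firecracker PID as in the reproduction steps):

```shell
# Record KVM events for the Firecracker process for 10 seconds
# while the workload runs, then summarize VM exits by reason
# (e.g. EPT_VIOLATION, EXTERNAL_INTERRUPT, MSR_WRITE).
sudo perf kvm stat record -p $FC_PID -- sleep 10
sudo perf kvm stat report --event vmexit
```

If the BFS runs show a disproportionate EPT_VIOLATION or EXTERNAL_INTERRUPT count relative to the compute-bound functions, that would support the cache-locality hypothesis.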

Reproduction Steps

# Start containerd
firecracker-containerd --config /etc/firecracker-containerd/config.toml

# Start a container
numactl --physcpubind=150 --membind=1 firecracker-ctr --address /run/firecracker-containerd/containerd.sock \
  run \
  --snapshotter devmapper \
  --runtime aws.firecracker \
  --rm --tty --net-host \
  docker.io/library/fibonacci:latest fibonacci-test
 
# Pin Firecracker to core
taskset -cp 222 $FC_PID
for tid in $(ps -T -p $FC_PID | grep fc_vcpu | awk '{print $2}'); do
    taskset -cp 222 $tid
done

# Measure variance (repeat 50 times)
perf stat -e cycles:u,cycles:k,instructions:u,instructions:k -p $FC_PID -- \
  numactl --physcpubind=50 --membind=0 \
  /home/h00918771/vSwarm/tools/bin/grpcurl --proto proto/fibonacci.proto \
  -plaintext -d '{"name":"256"}' ${VM_IP}:50051 fibonacci.Greeter.SayHello
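To collect the 50 samples, the measurement above can be wrapped in a loop; a sketch, assuming `$FC_PID` and `$VM_IP` are set as before and `grpcurl` is on PATH (paths shortened from the full commands above):

```shell
# Run the measurement 50 times, logging machine-readable perf output per run.
for i in $(seq 1 50); do
  perf stat -x, -e cycles:u,cycles:k,instructions:u,instructions:k \
    -p $FC_PID -o "run_$i.csv" -- \
    grpcurl -plaintext -d '{"name":"256"}' ${VM_IP}:50051 fibonacci.Greeter.SayHello
done

# Extract the cycles:k samples (field 1 of perf's CSV output, field 3 is
# the event name) for variance analysis.
awk -F, '$3 == "cycles:k" { print $1 }' run_*.csv
</imports>
```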

Questions

  1. Is this level of kernel mode overhead expected for gRPC workloads?
  2. What approach would you recommend for profiling VM exit reasons? (perf kvm stat?)
  3. Are there virtio-net tuning parameters for latency predictability?
    • Ring sizes, interrupt coalescing, rate limiters?
  4. Should we consider using vhost-user for this use case?

Expected Behavior

For deterministic workloads (identical inputs):

  • CV should be < 20% even for memory-intensive functions
  • Application code should run primarily in user mode
  • Performance should be reproducible across invocations

Next Steps

We are prepared to:

  • Run detailed VM exit profiling with perf kvm stat
  • Test suggested configuration changes
  • Share guest-side profiling data
  • Contribute patches if we identify fixes

Any guidance would be greatly appreciated! 🙏
