feat(stats): Add 1% low FPS tracking #2661
| Filename | Overview |
|---|---|
| Generals/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp | Core FPS metric implementation: migrates history tracking from function-local statics to instance members, adds time-bounded rolling average and 1% low calculation. One static throttle timer (lastLowUpdate) was not migrated with the rest. |
| GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp | Identical implementation to Generals build with the same static lastLowUpdate issue; also contains bundled FPS cap format change (▲ symbols) not present in Generals. |
| Generals/Code/GameEngine/Source/GameClient/InGameUI.cpp | Adds m_renderFpsLowString display string lifecycle (alloc, free, font, draw) and formats the 1% low as (%u). Layout adjustments add half-gap spacing between FPS elements. |
| GeneralsMD/Code/GameEngine/Source/GameClient/InGameUI.cpp | Same as Generals InGameUI but uses ▼%u format for 1% low and also reformats the FPS cap label to ▲%u/▲X — diverges from the Generals implementation. |
| Core/GameEngine/Include/GameClient/Display.h | Adds pure virtual getLow1PercentFPS() to the Display interface; straightforward and correct. |
| Generals/Code/GameEngineDevice/Include/W3DDevice/GameClient/W3DDisplay.h | Declares new methods and the 5000-element history/sort buffers as instance members; replaces the old 30-frame static approach. |
| GeneralsMD/Code/GameEngineDevice/Include/W3DDevice/GameClient/W3DDisplay.h | Mirror of Generals W3DDisplay.h changes; identical and correct. |
| Generals/Code/Tools/GUIEdit/Include/GUIEditDisplay.h | Adds stub getLow1PercentFPS() { return 0; } override to satisfy the new pure virtual; correct. |
| GeneralsMD/Code/Tools/GUIEdit/Include/GUIEditDisplay.h | Same stub override as the Generals GUIEditDisplay; correct. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Draw as W3DDisplay::draw()
    participant UPM as updatePerformanceMetrics()
    participant AFS as addFpsSample()
    participant CAFPS as calculateAverageFPS(0.5s)
    participant CL1 as calculateLow1PercentFPS(3.0s)
    participant IGUI as InGameUI::updateRenderFpsString()
    participant HUD as HUD (m_renderFpsString / m_renderFpsLowString)
    Draw->>UPM: called each frame
    UPM->>AFS: elapsedSeconds
    AFS-->>UPM: updates m_fpsHistory / m_durationHistory
    UPM->>CAFPS: "windowSeconds=0.5"
    CAFPS-->>UPM: m_averageFPS
    UPM->>CL1: every 1000ms via timeGetTime()
    CL1-->>UPM: m_low1PercentFPS (nth_element on m_sortBuffer)
    IGUI->>Draw: getAverageFPS() / getLow1PercentFPS()
    Draw-->>IGUI: m_averageFPS, m_low1PercentFPS
    IGUI->>HUD: setText on m_renderFpsString / m_renderFpsLowString
```
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
Generals/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp:1079-1085
The `lastLowUpdate` throttle timer is declared as a function-local `static`, while the rest of the PR specifically migrated `lastUpdateTime64`, `historyOffset`, and `fpsHistory` from statics to instance members for exactly this reason. Leaving this one as a static means it persists across display resets (e.g., map reloads) and would be shared across multiple `W3DDisplay` instances. It should be an instance member like `m_lastLowUpdateMs` initialized in the constructor and reset in `reset()`.
```suggestion
UnsignedInt now = timeGetTime();
if (now - m_lastLowUpdateMs >= 1000)
{
    m_lastLowUpdateMs = now;
    m_low1PercentFPS = calculateLow1PercentFPS(3.0f);
}
```
### Issue 2 of 3
GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp:1130-1136
Same `static UnsignedInt lastLowUpdate` issue as in the Generals build: this should be an instance member (`m_lastLowUpdateMs`) to stay consistent with the rest of the refactoring and to reset cleanly on display recreation.
```suggestion
UnsignedInt now = timeGetTime();
if (now - m_lastLowUpdateMs >= 1000)
{
    m_lastLowUpdateMs = now;
    m_low1PercentFPS = calculateLow1PercentFPS(3.0f);
}
```
### Issue 3 of 3
GeneralsMD/Code/GameEngine/Source/GameClient/InGameUI.cpp:6176-6191
**Divergent format strings between Generals and GeneralsMD**
The PR description states both implementations are "identical code," but they differ in two ways unique to GeneralsMD: (1) the 1% low FPS label uses `▼%u` (`\x25BC`) here versus `(%u)` in Generals, and (2) GeneralsMD also reformats the FPS cap display from `[%u]` to `▲%u`/`▲X`. The Unicode symbol change for the cap (`▲`) is an undocumented behavioral change bundled into this PR. If the differing formats are intentional design choices for each build, the PR description should reflect that.
Reviews (7): Last reviewed commit: "add icons and slightly brighten lowfps d..."
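The static-versus-member concern raised in Issues 1 and 2 can be illustrated with a small standalone sketch. `ThrottleStatic`, `ThrottleMember` and `demoThrottle` are hypothetical stand-ins rather than code from the PR, and the clock is passed in as a plain millisecond value instead of being read from `timeGetTime()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: the throttle state is a function-local static,
// so every instance shares one timer and nothing can reset it.
struct ThrottleStatic
{
    bool shouldUpdate(std::uint32_t nowMs)
    {
        static std::uint32_t lastUpdateMs = 0;
        if (nowMs - lastUpdateMs >= 1000)
        {
            lastUpdateMs = nowMs;
            return true;
        }
        return false;
    }
};

// Hypothetical stand-in: the throttle state is an instance member that
// starts at zero and is cleared again by reset().
struct ThrottleMember
{
    std::uint32_t m_lastUpdateMs = 0;

    void reset() { m_lastUpdateMs = 0; }

    bool shouldUpdate(std::uint32_t nowMs)
    {
        if (nowMs - m_lastUpdateMs >= 1000)
        {
            m_lastUpdateMs = nowMs;
            return true;
        }
        return false;
    }
};

// Usage: two instances sampled 500 ms apart.
inline void demoThrottle()
{
    ThrottleStatic a, b;
    assert(a.shouldUpdate(5000));  // first call fires
    assert(!b.shouldUpdate(5500)); // b is blocked by a's update: shared timer

    ThrottleMember c, d;
    assert(c.shouldUpdate(5000));
    assert(d.shouldUpdate(5500));  // d has its own timer
    c.reset();
    assert(c.shouldUpdate(5500));  // reset() clears the throttle
}
```

With the member variant, each instance throttles independently and a reset starts from a clean state, which is the behavior the review asks for.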
Move FPS history state into W3DDisplay members. Implement accurate time-based windowing for frame metrics. Use ceiling logic for improved 1% low accuracy. Optimize percentile calculation using efficient selection algorithm. Rename and centralize performance update call sites. Increase history buffer for stable high-FPS monitoring.
Update average FPS math to use time-weighted mean. Move sortBuffer to class members for consistency.
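As a rough illustration of the terms used in these commit messages, the sketch below shows a time-weighted mean and a percentile-style 1% low computed with `std::nth_element` and a ceiling on the sample count. It is not the PR's code: `averageFps` and `low1PercentFps` are hypothetical names, and treating the 1% low as the FPS of the k-th slowest frame is an assumed definition.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Time-weighted mean: frame count divided by total elapsed time, so long
// frames count in proportion to how long they lasted.
inline float averageFps(const std::vector<float>& frameSeconds)
{
    float total = 0.0f;
    for (float s : frameSeconds)
        total += s;
    return total > 0.0f ? static_cast<float>(frameSeconds.size()) / total : 0.0f;
}

// 1% low as a percentile (assumed definition): the FPS of the k-th slowest
// frame, where k = ceil(1% of the sample count), so k is at least 1.
// The samples are taken by value because std::nth_element reorders them.
inline float low1PercentFps(std::vector<float> frameSeconds)
{
    const std::size_t n = frameSeconds.size();
    if (n == 0)
        return 0.0f;

    const std::size_t k = (n + 99) / 100; // integer ceiling of n / 100
    assert(k >= 1 && k <= n);

    // Partial selection: afterwards index k-1 holds the k-th largest frame
    // time. This is O(n) on average, cheaper than a full sort.
    std::nth_element(frameSeconds.begin(),
                     frameSeconds.begin() + (k - 1),
                     frameSeconds.end(),
                     std::greater<float>());

    const float slowest = frameSeconds[k - 1];
    return slowest > 0.0f ? 1.0f / slowest : 0.0f;
}
```

For 99 frames at 10 ms plus one 100 ms hitch, the time-weighted average comes out near 91.7 FPS while the 1% low is about 10 FPS, which is the kind of spike the average alone hides.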
```cpp
Real m_low1PercentFPS; ///<1% low fps.
Real m_currentFPS; ///<current fps value.

enum { FPS_HISTORY_SIZE = 5000 }; // covers 5s at 1000 FPS, degrades gracefully beyond
```
I think this number needs evaluation but I would like input from others.
As it stands, 5000 samples in three Real arrays require 60 kB in memory - just for the FPS trackers.
I think the question should be: given a 3-second timeframe for the 1% low FPS, at what 3-second average FPS is the low FPS no longer relevant? Say you average 300 fps: what is the chance that the 1% low is so low that it is still relevant as a performance metric? If 300 fps is the upper bound, only 900 samples need to be stored. That's only 18% of the memory needed compared to the current setting.
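The arithmetic behind those figures can be checked at compile time. This is a small sketch that assumes `Real` is a 4-byte float, as the 60 kB figure implies; `bytesFor` and the `k`-prefixed constants are hypothetical names.

```cpp
#include <cstddef>

// Assumption: Real is a 4-byte float.
using Real = float;
static_assert(sizeof(Real) == 4, "sketch assumes a 4-byte Real");

// Three Real arrays of equal length, as described in the comment above.
constexpr std::size_t kArrayCount = 3;

constexpr std::size_t bytesFor(std::size_t samples)
{
    return samples * kArrayCount * sizeof(Real);
}

constexpr std::size_t kCurrentBytes = bytesFor(5000);     // current setting
constexpr std::size_t kProposedBytes = bytesFor(3 * 300); // 3 s at 300 FPS

static_assert(kCurrentBytes == 60000, "5000 samples need 60 kB");
static_assert(kProposedBytes == 10800, "900 samples need 10.8 kB");
static_assert(kProposedBytes * 100 / kCurrentBytes == 18, "18% of the current size");
```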
> Say you average 300 fps, what is the chance that the 1% is so low that it is still relevant as a performance metric.
It's pretty common to see 300 average fps and 30 fps lows in this game in large skirmish matches even on vs2022 non-retail.
But definitely would like to hear multiple inputs for this number.
The implementation approach is fine, but the data size requirements for the fps counters are outrageous.
I inquired a bit with Chat Gippy and it would be possible to reduce size requirements by quantizing samples and moving intermediate samples into timed buckets.
Here is a sample FrameBucket with avg and max frame time, stored in 2 bytes each, with a resolution of 16 microseconds.
```cpp
#include <cstdint>

// Compact frametime statistics bucket.
//
// Stores average and maximum frametime values using 16-bit integers
// with fixed-point quantization.
//
// ------------------------------------------------------------------
// Encoding
// ------------------------------------------------------------------
//
// Values are stored in units of 16 microseconds:
//
//     stored_value = frametime_us / 16
//
// This allows a uint16_t to represent:
//
//     65535 * 16 us = 1,048,560 us
//                   = ~1048 ms
//
// which comfortably covers extremely slow frames (~1 FPS).
//
// ------------------------------------------------------------------
// Precision
// ------------------------------------------------------------------
//
// Quantization step:
//
//     16 us = 0.016 ms
//
// Maximum quantization error:
//
//     +/- 8 us
//
// This is effectively negligible for FPS telemetry.
//
// Example:
//
//     16.667 ms frame (60 FPS)
//
// Encoded as:
//
//     16667 us / 16 = 1042
//
// Decoded:
//
//     1042 * 16 = 16672 us
//
// Error:
//
//     +5 us = 0.005 ms
//
// ------------------------------------------------------------------
// Why fixed-point quantization?
// ------------------------------------------------------------------
//
// Advantages:
//
// - extremely compact (4 bytes total per bucket)
// - deterministic runtime cost
// - cache friendly
// - no floating-point storage
// - sufficient precision for telemetry
// - supports >1 second frametimes
//
// ------------------------------------------------------------------
// Intended usage
// ------------------------------------------------------------------
//
// Typical bucket generation:
//
//     avgFrameUs = accumulatedFrameUs / frameCount
//     maxFrameUs = worstFrameUsSeen
//
// Then:
//
//     bucket.SetAvgUs(avgFrameUs);
//     bucket.SetMaxUs(maxFrameUs);
//
// ------------------------------------------------------------------
// Recommended usage pattern
// ------------------------------------------------------------------
//
// Build buckets over fixed time windows:
//
//     e.g. 50 ms or 100 ms
//
// rather than fixed frame counts.
//
// This keeps statistical resolution stable across varying FPS.
//
// ------------------------------------------------------------------
struct FrameBucket
{
    uint16_t avg16us;
    uint16_t max16us;

    static constexpr uint32_t QUANTUM_US = 16;
    static constexpr uint32_t QUANTUM_SHIFT = 4;
    static constexpr uint32_t MAX_US =
        0xFFFFu * QUANTUM_US;

    // Encode microseconds into 16 us fixed-point units.
    //
    // Rounds to nearest unit.
    //
    // Input is clamped to representable range.
    static uint16_t EncodeUs(uint32_t us)
    {
        if (us > MAX_US)
            us = MAX_US;
        return static_cast<uint16_t>(
            (us + (QUANTUM_US / 2)) >> QUANTUM_SHIFT);
    }

    // Decode fixed-point units back into microseconds.
    static uint32_t DecodeUs(uint16_t v)
    {
        return static_cast<uint32_t>(v)
            << QUANTUM_SHIFT;
    }

    void SetAvgUs(uint32_t us)
    {
        avg16us = EncodeUs(us);
    }

    void SetMaxUs(uint32_t us)
    {
        max16us = EncodeUs(us);
    }

    uint32_t GetAvgUs() const
    {
        return DecodeUs(avg16us);
    }

    uint32_t GetMaxUs() const
    {
        return DecodeUs(max16us);
    }

    float GetAvgMs() const
    {
        return static_cast<float>(GetAvgUs()) * 0.001f;
    }

    float GetMaxMs() const
    {
        return static_cast<float>(GetMaxUs()) * 0.001f;
    }

    float GetAvgFPS() const
    {
        const uint32_t us = GetAvgUs();
        return (us > 0)
            ? (1000000.0f / static_cast<float>(us))
            : 0.0f;
    }

    float GetMinFPS() const
    {
        const uint32_t us = GetMaxUs();
        return (us > 0)
            ? (1000000.0f / static_cast<float>(us))
            : 0.0f;
    }
};
```

And then we move these FrameBuckets into a time-based array.
Sample implementation:
```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <iterator>

// Uses the FrameBucket struct defined above.

// ============================================================================
// Rolling 3-second FPS statistics
// ============================================================================
//
// Stores:
//
//     3 seconds @ 10 ms resolution
//
//     = 300 buckets
//
// Each bucket summarizes:
//
// - average frametime
// - worst frametime
//
// over a 10 ms interval.
//
// ============================================================================
class FpsHistory
{
public:
    static constexpr uint32_t BUCKET_INTERVAL_US = 10000; // 10 ms
    static constexpr uint32_t HISTORY_SECONDS = 3;
    static constexpr uint32_t BUCKET_COUNT =
        (HISTORY_SECONDS * 1000000) / BUCKET_INTERVAL_US;

    // ------------------------------------------------------------------------
    void Reset()
    {
        m_writeIndex = 0;
        m_bucketCount = 0;
        m_accumulatedUs = 0;
        m_accumulatedFrames = 0;
        m_maxFrameUs = 0;
        m_bucketElapsedUs = 0;
        std::fill(
            std::begin(m_buckets),
            std::end(m_buckets),
            FrameBucket{});
    }

    // ------------------------------------------------------------------------
    // Add a frame
    //
    // frameUs:
    //     frametime in microseconds
    //
    // Example:
    //
    //     16.667 ms = 16667 us
    //
    // ------------------------------------------------------------------------
    void AddFrame(uint32_t frameUs)
    {
        // Accumulate stats for current bucket.
        m_accumulatedUs += frameUs;
        m_accumulatedFrames++;
        if (frameUs > m_maxFrameUs)
            m_maxFrameUs = frameUs;
        m_bucketElapsedUs += frameUs;

        // Emit one or more buckets if enough time elapsed.
        while (m_bucketElapsedUs >= BUCKET_INTERVAL_US)
        {
            EmitBucket(frameUs);
            m_bucketElapsedUs -= BUCKET_INTERVAL_US;
        }
    }

    // ------------------------------------------------------------------------
    float GetAverageFPS() const
    {
        if (m_bucketCount == 0)
            return 0.0f;

        uint64_t totalUs = 0;
        for (uint32_t i = 0; i < m_bucketCount; ++i)
        {
            totalUs += m_buckets[i].GetAvgUs();
        }

        const float avgUs =
            static_cast<float>(totalUs)
            / static_cast<float>(m_bucketCount);
        return avgUs > 0.0f
            ? (1000000.0f / avgUs)
            : 0.0f;
    }

    // ------------------------------------------------------------------------
    // Approximate 1% low FPS
    //
    // Uses the worst bucket frametimes.
    //
    // This is intentionally approximate and designed for:
    //
    // - low memory use
    // - deterministic runtime
    // - good stutter detection
    //
    // ------------------------------------------------------------------------
    float GetOnePercentLowFPS() const
    {
        if (m_bucketCount == 0)
            return 0.0f;

        uint32_t worstUs = 0;
        // Find worst bucket maximum frametime.
        for (uint32_t i = 0; i < m_bucketCount; ++i)
        {
            worstUs = std::max(
                worstUs,
                m_buckets[i].GetMaxUs());
        }
        return worstUs > 0
            ? (1000000.0f / static_cast<float>(worstUs))
            : 0.0f;
    }

private:
    // ------------------------------------------------------------------------
    // lastFrameUs is the frame that triggered this bucket. A frame longer
    // than the bucket interval spans several buckets, and the accumulators
    // are empty for all but the first of them. Those buckets take the
    // spanning frame's time instead of being stored as zero, which would
    // otherwise inflate the average FPS.
    void EmitBucket(uint32_t lastFrameUs)
    {
        FrameBucket& bucket =
            m_buckets[m_writeIndex];

        // Compute average frametime for bucket.
        const uint32_t avgUs =
            (m_accumulatedFrames > 0)
                ? static_cast<uint32_t>(
                    m_accumulatedUs / m_accumulatedFrames)
                : lastFrameUs;
        const uint32_t maxUs =
            (m_accumulatedFrames > 0)
                ? m_maxFrameUs
                : lastFrameUs;
        bucket.SetAvgUs(avgUs);
        bucket.SetMaxUs(maxUs);

        // Advance ring buffer.
        m_writeIndex =
            (m_writeIndex + 1) % BUCKET_COUNT;
        if (m_bucketCount < BUCKET_COUNT)
            ++m_bucketCount;

        // Reset accumulators.
        m_accumulatedUs = 0;
        m_accumulatedFrames = 0;
        m_maxFrameUs = 0;
    }

private:
    FrameBucket m_buckets[BUCKET_COUNT];
    uint32_t m_writeIndex = 0;
    uint32_t m_bucketCount = 0;
    uint64_t m_accumulatedUs = 0;
    uint32_t m_accumulatedFrames = 0;
    uint32_t m_maxFrameUs = 0;
    uint32_t m_bucketElapsedUs = 0;
};

// ============================================================================
// Example usage
// ============================================================================
int main()
{
    FpsHistory history;
    history.Reset();

    // Simulate ~60 FPS.
    for (int i = 0; i < 500; ++i)
    {
        uint32_t frameUs = 16667;
        // Simulate occasional stutter.
        if ((i % 120) == 0)
        {
            frameUs = 50000; // 50 ms hitch
        }
        history.AddFrame(frameUs);
    }

    printf(
        "Average FPS: %.2f\n",
        history.GetAverageFPS());
    printf(
        "Approx 1%% Low FPS: %.2f\n",
        history.GetOnePercentLowFPS());
    return 0;
}
```

This data organization approach uses just 1200 bytes at most. Bucket intervals could also be longer for even less size.
It loses some value accuracy: it optimizes for size and speed over accuracy.
However, your current implementation would perhaps be more efficient at low frame rates, because fewer values need to be read when frame counts are small. I do like that aspect, because low-FPS runtimes have less work to do.
I suggest looking into this further to find a good balance between size, speed and accuracy. There are many options here.
I do not like the visuals of the new value. Can this look better?
Maybe make the 1% low value a bit brighter. It can be difficult to read in game.
```cpp
updateRenderFpsString();
}

UnsignedInt renderFpsLimit = 0u;
```
Better change the init value to RenderFpsPreset::UncappedFpsValue. Then the code can be simplified.
```cpp
void W3DDisplay::addFpsSample(Real elapsedSeconds)
{
    constexpr const Int FPS_HISTORY_SIZE = 30;
    if (elapsedSeconds <= 0.0f)
```
How is this possible? If this is called with 0 seconds, it indicates an error.
```cpp
Real W3DDisplay::calculateLow1PercentFPS(Real windowSeconds)
{
    if (m_historyCount == 0)
```
```cpp
m_currentFPS = 1.0f/elapsedSeconds;
fpsHistory[historyOffset++] = m_currentFPS;
addFpsSample(elapsedSeconds);
m_averageFPS = calculateAverageFPS(0.5f);
```
This window is just half of the FPS text update interval of 1 second.
```cpp
m_averageFPS = sum / FPS_HISTORY_SIZE;
static UnsignedInt lastLowUpdate = 0;
UnsignedInt now = timeGetTime();
if (now - lastLowUpdate >= 1000)
```
This interval is a bit unfortunate here. Is this for performance reasons? Can we make calculateLow1PercentFPS cheaper instead?
```cpp
// convert elapsed time to seconds
Real elapsedSeconds = (Real)timeDiff/(Real)freq64;
if (m_lastUpdateTime64 == 0)
```
Can we get rid of this condition? Looks like it will almost never be true.
```cpp
Real W3DDisplay::calculateAverageFPS(Real windowSeconds)
{
    if (m_historyCount == 0)
```
Can we get rid of this condition? It looks like it will almost never be true. Perhaps init m_historyCount with 1.
This PR adds a 1% low FPS metric to the existing FPS counter HUD, displayed in parentheses next to the average FPS. The 1% low is a standard performance metric used to surface frame time spikes that the average FPS hides. Inspired by #1942.
- Added 1% low FPS display to HUD counter
- Added m_renderFpsLowString and supporting UI members
- Added RenderFpsLowColor configuration to InGameUI INI
- Increased history to 5,000 time-bounded frames
- Implemented rolling 0.5s window for average FPS
- Implemented rolling 3.0s window for 1% lows
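The time-bounded windows listed above could be sketched roughly as follows. This is a simplified illustration, not the code in `W3DDisplay.cpp`: `RollingFps`, `addSample` and `averageFps` are hypothetical names, and only the 5,000-sample capacity is taken from the PR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical sketch: a ring buffer of frame durations that is queried
// over a time window rather than a fixed number of frames.
class RollingFps
{
public:
    void addSample(float elapsedSeconds)
    {
        if (elapsedSeconds <= 0.0f)
            return; // skip samples that would divide by zero later

        m_durations[m_next] = elapsedSeconds;
        m_next = (m_next + 1) % kCapacity;
        if (m_count < kCapacity)
            ++m_count;
    }

    // Walks back from the newest sample until windowSeconds is covered or
    // the stored history runs out, then returns frames / elapsed time.
    float averageFps(float windowSeconds) const
    {
        assert(windowSeconds > 0.0f);

        float covered = 0.0f;
        std::size_t frames = 0;
        for (std::size_t i = 0; i < m_count && covered < windowSeconds; ++i)
        {
            const std::size_t index = (m_next + kCapacity - 1 - i) % kCapacity;
            covered += m_durations[index];
            ++frames;
        }
        return covered > 0.0f ? static_cast<float>(frames) / covered : 0.0f;
    }

private:
    static constexpr std::size_t kCapacity = 5000;
    std::array<float, kCapacity> m_durations{};
    std::size_t m_next = 0;  // slot the next sample is written to
    std::size_t m_count = 0; // number of valid samples stored
};
```

Because the window is measured in seconds, a short window reacts to recent frames only, while a longer one smooths over more history regardless of how many frames were rendered in that time.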
The following screenshot from AOD Cobalt Rush shows the 1% low FPS overlay compared to CapFrameX (an external benchmarking tool, centered right), demonstrating the value of surfacing this metric separately from the average.
This change was generated with AI assistance. All generated code has been reviewed, tested, and verified for correctness. The implementation went through multiple iterations, including fundamental changes to the underlying approach, as well as passes to apply simplifications, fix inconsistencies, and optimize performance. Both Generals and GeneralsMD implementations are included in this PR with identical code.