feat(stats): keep the hashrate meter alive across jobs (--internal_accumulate_hash)#174
Conversation
2b65ff3 to
4313b96
Compare
|
For slow kernels we already have several options available. Collecting stats on slow kernelsThe The To summarize, the user needs to correctly configure TestPlease redo your tests by adjusting these values, and if you still get bad results could you share your configuration. |
|
I re-ran both algos on the unmodified meter (this PR's changes reverted) on an RX 9070 XT vs live HeroMiners pools, sweeping Setup: Xelis → Results (stock meter):
You're right that Why it behaves this way. The stock meter is a count-gated accumulator tied to job arrival, not a wall clock: Lowering the threshold lets it publish, but each sample is then computed from very few launches over a short, ragged window — so the "good-looking" rows aren't actually good:
So this PR isn't a strict prerequisite — but a hashrate readout is a basic function that shouldn't need per-algorithm flag tuning to show a number at all, and even when tuned the number is jittery, stale, or flickering. A wall-clock window (nonces over elapsed time, decoupled from job resets) removes the dependency: correct out of the box for fast and slow kernels alike, reading 0 only when the GPU is genuinely idle. If the zero-config smoothing isn't something you want here, I'm happy to:
Which direction do you prefer? Happy to share the raw logs/configs. |
|
This PR raises an important point regarding the stats display. We can add a new flag So everything is resolved with a simple flag |
|
A couple of thoughts on the Default value. I'd argue it should default to EWMA vs raw window. I still think the smoothed meter is the better mechanism, and not just cosmetically. Even with accumulation on, the raw Your call on the project, but default-on + EWMA is where I think the better UX lands. |
|
If we add We can therefore set The purpose of And yes, setting |
…cumulate_hash) Slow / memory-hard kernels (Xelis, Octopus, Ergo) complete far fewer than --internal_kernel_count launches per job window, so the count-gated meter never recomputes and the dashboard reads 0 H/s even while valid shares land: updateBatchNonce() zeroes the stored hashrate on every job update. Add a Kernel-section flag --internal_accumulate_hash (default true). When set, job and constant updates no longer reset the hash count or the elapsed window; only a memory rebuild (e.g. DAG) resets. The meter then accumulates a true nonces / elapsed average that spans jobs. Setting it false restores the previous per-job reset behaviour. - config/CLI: new --internal_accumulate_hash bool wired into DeviceOccupancy. - device: gate the epoch/period reset and the updateBatchNonce reset behind (!accumulate || memory update); memory rebuilds always reset. - docs: document the flag in PARAMETERS.md. - test: statistical unit tests for the accumulate-vs-reset invariant.
4313b96 to
6b26a40
Compare
|
Reworked this PR to your With it on, job and constant updates no longer fetch/reset the meter — the hash count and elapsed window accumulate across jobs, and only an One addition beyond the plain flag: the work loop opens the meter window once at start-up, because a non-epoch algorithm's first job is a constant update (not a memory update), so under accumulation its clock would otherwise never start. Verified live on an RX 9070 XT against Octopus (
|
| if (nextjobInfo.epoch != currentJobInfo.epoch || nextjobInfo.period != currentJobInfo.period) | ||
| { | ||
| miningStats.reset(); | ||
| // When accumulating, the meter window spans jobs; only a memory rebuild |
|
|
||
| //////////////////////////////////////////////////////////////////////////// | ||
| updateBatchNonce(); | ||
| // Reset the meter on memory rebuilds always; on plain job/constant updates |
| updateBatchNonce(); | ||
| // Reset the meter on memory rebuilds always; on plain job/constant updates | ||
| // only when not accumulating, so the window can span jobs. | ||
| updateBatchNonce(false == config.occupancy.accumulateHash || true == needUpdateMemory); |
There was a problem hiding this comment.
move false == config.occupancy.accumulateHash || true == needUpdateMemory in variable resetStats
| miningStats.resetHashrate(); | ||
| miningStats.reset(); | ||
|
|
||
| //////////////////////////////////////////////////////////////////////////// |
| computing.store(true, boost::memory_order::seq_cst); | ||
|
|
||
| //////////////////////////////////////////////////////////////////////////// | ||
| // Open the hashrate window once at start-up. When accumulating, a job that |
| #include <gtest/gtest.h> | ||
|
|
||
|
|
||
| using statistical::Statistical; |
There was a problem hiding this comment.
never declare using for namespace out of scope.
| @@ -0,0 +1,82 @@ | |||
| #include "statistical/statistical.hpp" | |||
There was a problem hiding this comment.
use <> instead of ""
Respect ordre include
…test style - device.cpp: remove the explanatory comments flagged in review; hoist the reset predicate into a local 'bool const resetStats' before updateBatchNonce - statistical test: includes use <> in STL -> third-party -> internal order, drop the file-scope 'using statistical::Statistical', and rename the TEST cases to lowerCamelCase per the naming convention
|
Pushed
Line-scoped clang-format-15 over the changed lines is clean. |
Reworks this PR to the
--internal_accumulate_hashdesign we converged on in the thread, replacing the time-windowed EWMA meter. The EWMA implementation is preserved onfeat/hashrate-ewma-meterin case we ever want to revisit smoothing separately.Problem
Slow / memory-hard kernels (Xelis, Octopus, Ergo…) complete far fewer than
--internal_kernel_countlaunches per job window. The meter is count-gated and reset on every job update —updateBatchNonce()zeroes the stored hashrate — so the threshold is never reached,getHashrate()returns 0, and theHASHRATEboard is hidden entirely even while valid shares land.Fix
New Kernel-section flag
--internal_accumulate_hash(defaulttrue):true— job and constant updates no longer reset the hash count or the elapsed window; only a memory rebuild (e.g. DAG) resets. The meter accumulates a truenonces / elapsedaverage that spans jobs, so slow kernels publish a real number out of the box and read0only when the GPU is genuinely idle.false— restores the previous per-job reset behaviour exactly.The work loop also opens the meter window once at start-up: a non-epoch algorithm's first job is a constant update (not a memory update), so under accumulation its clock would otherwise never start.
Changes
config/ CLI: new--internal_accumulate_hashbool wired intoDeviceOccupancy(defaulttrue).device: gate the epoch/period reset and theupdateBatchNoncereset behind(!accumulate || memory update); memory rebuilds always reset; open the meter window once before the loop.PARAMETERS.md.statisticalunit tests for the accumulate-vs-reset invariant.Verification
Unit tests (
unit_test, AMD cross-compile): 119/123 pass. The 4 failures are the pre-existing*AmdTest.allDeviceFindNonceprogpow-family cases, unrelated to this change. The 3 newStatisticalHashrate.*tests pass.Live, RX 9070 XT vs Octopus (
de.conflux.herominers.com:1170), same binary, ~95 s each:--internal_accumulate_hashtrue(default)false