feat: add getMemoryStats endpoint to registry #486

Merged
rvanasa merged 17 commits into main from registry-memory-stats
Apr 14, 2026

@rvanasa rvanasa commented Apr 10, 2026

Contributor

This PR adds an admin-only query endpoint that reports entry counts and Candid-serialized byte counts, to help analyze which data structures account for the most storage overhead in the registry canister.

getMemoryStats is a query method gated behind Utils.isAdmin(caller). It returns:

  • Runtime metrics: rts_heap_size and rts_memory_size from mo:prim
  • Per-structure stats: { count : Nat; bytes : Nat } for all data structures across the main actor, DownloadLog, StorageManager, and Users
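
A minimal sketch of the endpoint shape described above, in Motoko. The local `isAdmin` function is a hypothetical stand-in for the project's `Utils.isAdmin`, the admin principal is a placeholder, and the single `users` field stands in for the full set of per-structure fields, which this description does not enumerate.

```motoko
import Prim "mo:prim"; // runtime metrics, imported under the name used in the PR description
import Principal "mo:base/Principal";

actor {
  type StructureStats = { count : Nat; bytes : Nat };

  // Placeholder admin principal (assumption for this sketch only).
  let admin = Principal.fromText("aaaaa-aa");

  // Hypothetical stand-in for Utils.isAdmin.
  func isAdmin(caller : Principal) : Bool {
    caller == admin
  };

  public query ({ caller }) func getMemoryStats() : async {
    rts_heap_size : Nat;
    rts_memory_size : Nat;
    users : StructureStats; // illustrative; the real record has one field per structure
  } {
    assert (isAdmin(caller)); // traps for non-admin callers
    {
      rts_heap_size = Prim.rts_heap_size();
      rts_memory_size = Prim.rts_memory_size();
      users = { count = 0; bytes = 0 }; // real values come from the stats helpers
    }
  };
};
```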

@rvanasa rvanasa requested a review from Copilot April 10, 2026 16:13
@rvanasa rvanasa marked this pull request as ready for review April 10, 2026 16:14
@rvanasa rvanasa requested a review from a team as a code owner April 10, 2026 16:14

Copilot AI left a comment

Pull request overview

Adds an admin-only getMemoryStats query endpoint to the registry canister to report runtime RTS memory metrics and per-data-structure {count, bytes} estimates (via Candid serialization sizing) for storage overhead analysis.

Changes:

  • Extended the public Candid interface with StructureStats, MemoryStats, and getMemoryStats.
  • Implemented getMemoryStats in the main canister, including RTS heap/memory metrics and per-structure aggregation.
  • Added internal getMemoryStats helpers to DownloadLog, StorageManager, and Users to report their respective structure stats.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

File | Description
cli/declarations/main/main.did | Adds StructureStats/MemoryStats types and the getMemoryStats service method.
backend/main/main-canister.mo | Implements the admin-only query and aggregates stats across in-canister structures and submodules.
backend/main/DownloadLog.mo | Adds stats computation for download maps, snapshot buffers, and temp records.
backend/storage/storage-manager.mo | Adds stats computation for storage maps.
backend/main/Users.mo | Adds stats computation for the users map.

Comment thread backend/main/main-canister.mo Outdated
Comment thread backend/main/DownloadLog.mo Outdated
Comment thread backend/main/Users.mo Outdated
Comment thread backend/main/DownloadLog.mo
Comment thread backend/main/main-canister.mo Outdated
Comment thread cli/declarations/main/main.did
rvanasa added 3 commits April 10, 2026 13:13
- Change getMemoryStats from query to update to avoid hitting query
  instruction limits when serializing all entries via to_candid
- Bump API_VERSION to 1.4 in backend and CLI (paired change)
- Update cli/declarations/main/main.did.js and main.did.d.ts with
  StructureStats and MemoryStats types, and getMemoryStats method
- Add unit tests for getMemoryStats in test/users.test.mo and
  test/download-log.test.mo

Made-with: Cursor
Replace full-iteration to_candid calls with a systematic sample of up
to 10,000 entries per data structure. The sampled byte total is
extrapolated linearly (sampleBytes * total / sampled), so the result
remains a useful approximation while keeping allocations bounded
regardless of registry size.

Made-with: Cursor
Sampling keeps to_candid calls bounded to ~10,000 per structure, so
query instruction limits are no longer a concern.

Made-with: Cursor
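
The sampling and linear extrapolation described in these commits can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `sampleStats` and its signature are hypothetical, and the stride rule is just one simple way to keep the number of encoded entries at or below the 10,000 cap.

```motoko
import Iter "mo:base/Iter";

type StructureStats = { count : Nat; bytes : Nat };

// Cap on sampled entries per structure (value taken from the commit message).
let SAMPLE_SIZE : Nat = 10_000;

// Hypothetical helper: encodes every `stride`-th entry, sums the encoded sizes,
// and scales the sampled total up to the full entry count.
func sampleStats<T>(entries : Iter.Iter<T>, total : Nat, encode : T -> Blob) : StructureStats {
  if (total == 0) return { count = 0; bytes = 0 };
  // Pick a stride so that at most SAMPLE_SIZE entries are encoded.
  let stride = if (total <= SAMPLE_SIZE) 1 else total / SAMPLE_SIZE + 1;
  var index = 0;
  var sampled = 0;
  var sampleBytes = 0;
  for (entry in entries) {
    if (index % stride == 0) {
      sampleBytes += encode(entry).size();
      sampled += 1;
    };
    index += 1;
  };
  // Guard against a `total` that does not match the iterator.
  if (sampled == 0) return { count = total; bytes = 0 };
  // Linear extrapolation: sampleBytes * total / sampled.
  { count = total; bytes = sampleBytes * total / sampled }
};

// Usage: estimate the Candid-encoded size of ten small Nat values.
let stats = sampleStats<Nat>(Iter.range(0, 9), 10, func(n : Nat) : Blob = to_candid (n));
```

Because the helper only encodes sampled entries, its allocation is bounded by the cap, while `count` still reports the caller-supplied total. Each `to_candid` call carries its own Candid header, so the byte figure is an estimate of per-entry serialized size rather than actual heap usage.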

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 11 changed files in this pull request and generated 2 comments.


Comment thread backend/main/main-canister.mo Outdated
Comment thread backend/main/DownloadLog.mo Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 8 comments.


Comment thread backend/main/main-canister.mo Outdated
Comment thread backend/main/main-canister.mo Outdated
Comment thread backend/main/DownloadLog.mo Outdated
Comment thread backend/main/DownloadLog.mo Outdated
Comment thread backend/main/DownloadLog.mo Outdated
Comment thread backend/storage/storage-manager.mo Outdated
Comment thread backend/main/Users.mo Outdated
Comment thread backend/main/Users.mo Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.


Comment thread backend/main/main-canister.mo Outdated
Comment thread backend/main/main-canister.mo Outdated

Kamirus commented Apr 14, 2026

Collaborator

Suggestion: Extract memory stats logic into a shared module

The sampling logic is duplicated across 4 files (DownloadLog.mo, Users.mo, StorageManager.mo, main-canister.mo) with 6+ near-identical sampling functions. I'd like to keep the domain modules as clean as possible — could we extract the shared logic into a dedicated MemoryStats.mo module?

The idea:

1. New backend/main/MemoryStats.mo — owns everything:

  • Type definitions (StructureStats, MemoryStats)
  • Generic sampling helpers (sampleMapBytes, sampleBufferBytes, sampleMapOfBuffersBytes)
  • The collect(...) function that assembles the full MemoryStats record
  • The SAMPLE_SIZE constant

2. Domain modules — each getMemoryStats() becomes a thin wrapper calling the shared helpers:

// Users.mo — before: ~40 lines of inline sampling
// Users.mo — after:
public func getMemoryStats() : { users : MemoryStatsHelper.StructureStats; names : MemoryStatsHelper.StructureStats } {
  {
    users = MemoryStatsHelper.sampleMapBytes(_users, func(k : Principal, v : Types.User) : Blob = to_candid((k, v)));
    names = MemoryStatsHelper.sampleSetBytes(_names, func(k : Text) : Blob = to_candid(k));
  }
};

Similar for DownloadLog and StorageManager — they keep getMemoryStats() because the fields are private, but the actual sampling logic lives in the shared module.

3. Main actor — stays minimal:

public query ({ caller }) func getMemoryStats() : async MemoryStatsHelper.MemoryStats {
  assert (Utils.isAdmin(caller));
  MemoryStatsHelper.collect(
    { packageVersions; packageConfigs; highestConfigs; packagePublications; ownersByPackage; maintainersByPackage; fileIdsByPackage; hashByFileId; packageFileStats; packageTestStats; packageBenchmarks; packageNotes; packageDocsCoverage },
    downloadLog.getMemoryStats(),
    storageManager.getMemoryStats(),
    users.getMemoryStats(),
  );
};

This way:

  • The sampling algorithm is defined once, not 6+ times
  • DownloadLog.mo, Users.mo, StorageManager.mo each get ~5-10 lines added instead of 40-150
  • Changing the sampling strategy (sample size, algorithm) is a single-file change
  • The admin helper code is clearly separated from domain logic

rvanasa added 3 commits April 14, 2026 09:51
Move the duplicated getMemoryStats sampling logic (sampleMapBytes,
sampleBufferBytes, sampleIterBytes, sampleMapOfBuffersBytes) into a
dedicated backend/main/MemoryStats.mo module. Domain modules
(DownloadLog, Users, StorageManager) and the main actor now delegate
to these shared helpers, reducing inline sampling code from ~180 lines
to thin call-site wrappers.

Made-with: Cursor
- Collapse sampleMapBytes/sampleBufferBytes into the single
  sampleIterBytes primitive; keep sampleMapOfBuffersBytes for the
  two-level case
- Add statsForMap, statsForBuffer, statsForIter, statsForMapOfBuffers
  convenience functions that return StructureStats directly
- Update all call sites to use the statsFor* helpers, removing the
  repeated { count = ...; bytes = ... } boilerplate
- Rename MemoryStatsHelper → MemoryStats import in main-canister.mo

Made-with: Cursor
Comment thread backend/main/main-canister.mo Outdated
Comment thread backend/main/main-canister.mo Outdated
Comment thread backend/main/main-canister.mo
- Move MemoryStats type definition into MemoryStats.mo module
- Use record update syntax ({ dlStats and smStats and uStats with ... })
  to eliminate manual field forwarding from sub-module stats
- Remove inaccurate comment about packageOwners stable storage clearing

Made-with: Cursor
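
For readers unfamiliar with the record update syntax mentioned in this commit, here is a small sketch of how Motoko's `and` and `with` combine records. The sub-module records and their field names (`downloads`, `users`) are illustrative assumptions; only `rts_heap_size` and `rts_memory_size` come from the PR description.

```motoko
type StructureStats = { count : Nat; bytes : Nat };

// Hypothetical sub-module results; the field names are illustrative.
let dlStats : { downloads : StructureStats } = { downloads = { count = 3; bytes = 120 } };
let uStats : { users : StructureStats } = { users = { count = 2; bytes = 64 } };

// `and` merges the fields of both records; `with` adds further fields.
let combined = { dlStats and uStats with rts_heap_size = 0; rts_memory_size = 0 };

assert combined.downloads.count == 3;
assert combined.users.bytes == 64;
```

The records joined by `and` must not share field names, which is why each sub-module returning its own distinctly named fields makes this forwarding possible without listing them one by one.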

@Kamirus Kamirus left a comment

Collaborator

Go for it, just one nit that might raise some warnings

Comment thread backend/main/main-canister.mo Outdated
@rvanasa rvanasa enabled auto-merge (squash) April 14, 2026 15:15
@rvanasa rvanasa merged commit dca7892 into main Apr 14, 2026
25 checks passed
@rvanasa rvanasa deleted the registry-memory-stats branch April 14, 2026 15:20

3 participants