
spec : save the dynamic/static ngram cache file #22055

Open
petersid2022 wants to merge 1 commit into ggml-org:master from petersid2022:self-speculation-save-cache

@petersid2022
Contributor

petersid2022 commented Apr 17, 2026

Overview

  • When the COMMON_SPECULATIVE_TYPE_NGRAM_CACHE speculative implementation is selected, create_state_ngram_cache creates a new common_speculative_state_ngram_cache state. The parameters of that state (e.g., n_draft, save_static and save_dynamic) are currently hardcoded.

  • This PR instead extends common_params_speculative to carry those options.

  • It also attempts to implement the save_static / save_dynamic behavior by calling common_ngram_cache_save when the state object is destroyed.
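
To illustrate the save-on-destruction idea from the list above, here is a minimal, self-contained sketch. It is not the code in this PR: ngram_cache_params, toy_cache, toy_cache_save, toy_cache_load, ngram_cache_state and the path_static / path_dynamic fields are hypothetical stand-ins, and the text file format is invented for the example. Only the option names n_draft, save_static and save_dynamic come from the description.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the options this PR moves into common_params_speculative.
struct ngram_cache_params {
    int         n_draft      = 8;      // maximum number of tokens to draft (value chosen for the example)
    bool        save_static  = false;  // write the static cache on destruction
    bool        save_dynamic = false;  // write the dynamic cache on destruction
    std::string path_static;           // hypothetical field: output file for the static cache
    std::string path_dynamic;          // hypothetical field: output file for the dynamic cache
};

// Toy cache: n-gram key (token ids joined by commas) -> count. Not the llama.cpp layout.
using toy_cache = std::map<std::string, int32_t>;

// Stand-in for common_ngram_cache_save; writes one "key count" pair per line.
static bool toy_cache_save(const toy_cache & cache, const std::string & path) {
    std::ofstream out(path, std::ios::trunc);
    if (!out) {
        return false;
    }
    for (const auto & kv : cache) {
        out << kv.first << ' ' << kv.second << '\n';
    }
    return static_cast<bool>(out);
}

// Reads back the file written by toy_cache_save; a missing file yields an empty cache.
static toy_cache toy_cache_load(const std::string & path) {
    toy_cache     cache;
    std::ifstream in(path);
    std::string   key;
    int32_t       count = 0;
    while (in >> key >> count) {
        cache[key] = count;
    }
    return cache;
}

// State object that persists its caches when it goes out of scope.
class ngram_cache_state {
public:
    explicit ngram_cache_state(ngram_cache_params p) : params(std::move(p)) {
        assert(params.n_draft > 0);
    }

    // Non-copyable so that only one object ever writes a given cache file.
    ngram_cache_state(const ngram_cache_state &) = delete;
    ngram_cache_state & operator=(const ngram_cache_state &) = delete;

    ~ngram_cache_state() {
        // A destructor should not throw, so a failed save is only reported.
        if (params.save_static && !toy_cache_save(cache_static, params.path_static)) {
            std::fprintf(stderr, "failed to save static cache to '%s'\n", params.path_static.c_str());
        }
        if (params.save_dynamic && !toy_cache_save(cache_dynamic, params.path_dynamic)) {
            std::fprintf(stderr, "failed to save dynamic cache to '%s'\n", params.path_dynamic.c_str());
        }
    }

    toy_cache cache_static;
    toy_cache cache_dynamic;

private:
    ngram_cache_params params;
};

// Returns true if an entry added inside a scope is found on disk after the scope ends.
static bool demo_save_on_destruction(const std::string & path) {
    ngram_cache_params params;
    params.save_dynamic = true;
    params.path_dynamic = path;
    {
        ngram_cache_state state(params);
        state.cache_dynamic["1,2,3"] = 4;
    } // the destructor runs here and writes the file

    const toy_cache loaded = toy_cache_load(path);
    const auto      it     = loaded.find("1,2,3");
    return it != loaded.end() && it->second == 4;
}
```

One trade-off of saving in a destructor is that a failed write can only be logged, not propagated to the caller. An explicit save call before teardown would be an alternative if the caller needs to react to errors.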

Additional information

Add self‑speculative decoding (no draft model required) #18471

Requirements

petersid2022 force-pushed the self-speculation-save-cache branch from 2e1c956 to 430c0ca on April 18, 2026 13:14
petersid2022 marked this pull request as ready for review on April 18, 2026 13:37
petersid2022 requested a review from a team as a code owner on April 18, 2026 13:37
petersid2022 force-pushed the self-speculation-save-cache branch 2 times, most recently from d5448ea to ba99720 on April 20, 2026 05:49
petersid2022 requested review from a team, CISC, IMbackK, ggerganov and pwilkin as code owners on April 20, 2026 05:49
petersid2022 force-pushed the self-speculation-save-cache branch 4 times, most recently from cf7a308 to 8ae6c04 on April 20, 2026 06:59
CISC removed request for a team, CISC, IMbackK and pwilkin on April 20, 2026 08:03
petersid2022 force-pushed the self-speculation-save-cache branch 2 times, most recently from afc3295 to dc2ab62 on April 20, 2026 18:29
petersid2022 force-pushed the self-speculation-save-cache branch 5 times, most recently from c402b3d to 9da23a4 on April 21, 2026 18:34
petersid2022 changed the title from "spec: save the dynamic/static ngram cache file" to "spec : save the dynamic/static ngram cache file" on Apr 21, 2026
petersid2022 force-pushed the self-speculation-save-cache branch 9 times, most recently from 89b10b8 to 5c5bea4 on April 29, 2026 11:12
Comment thread common/common.h
 };

-struct common_params_speculative_ngram_cache {
+struct common_params_speculative_ngram_cache : common_params_speculative_ngram_map {
Contributor Author

This is probably the wrong way of going about this, but I am curious whether the same concept of m-gram speculative tokens can be applied in the ngram-cache implementation.

petersid2022 force-pushed the self-speculation-save-cache branch from 5c5bea4 to cee5400 on April 29, 2026 14:48
* fix todo on providing n_draft, save_static and save_dynamic from common/common.h

* implement the functionality by saving the cache at the common_speculative_state_ngram_cache destruction
petersid2022 force-pushed the self-speculation-save-cache branch from cee5400 to 4d256ae on April 30, 2026 05:48