
[None][fix] Remove HF cache dependency from Skywork-R1V2 unit tests #242

Merged: govind-ramnarayan merged 2 commits into feat/paperclip_maximizer from gramnarayan/skywork-test-fix on Mar 12, 2026

govind-ramnarayan commented Mar 12, 2026


Summary

  • Replace SkyworkChatConfig (loaded via AutoConfig.from_pretrained with trust_remote_code, requiring the 38B checkpoint in local HF cache) with a plain Qwen2Config passed directly to SkyworkR1V2ForConditionalGeneration
  • Remove module-level pytest.skip that skipped all tests when the checkpoint was absent — all tests now run in CI without any checkpoint
  • Drop _create_small_chat_config() helper; all tests use _create_small_llm_config() directly

The model's __init__ already has a getattr(config, "llm_config", config) fallback, so Qwen2Config works as-is. No vision tower is instantiated (no vision_config attr on Qwen2Config).
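
A minimal sketch of that fallback, for illustration only. `resolve_configs` and the `SimpleNamespace` objects are hypothetical stand-ins, not the model's actual code or the real config classes; the only line taken from the source is the `getattr(config, "llm_config", config)` expression.

```python
from types import SimpleNamespace


def resolve_configs(config):
    """Return (llm_config, vision_config) for either config shape."""
    # A wrapper config exposes .llm_config; a bare LLM config is used as-is.
    llm_config = getattr(config, "llm_config", config)
    # Assumption: a missing vision_config attribute means no vision tower.
    vision_config = getattr(config, "vision_config", None)
    return llm_config, vision_config


# Hypothetical stand-ins for a bare LLM config and a wrapper config.
bare = SimpleNamespace(hidden_size=64, num_hidden_layers=2)
wrapped = SimpleNamespace(
    llm_config=bare,
    vision_config=SimpleNamespace(hidden_size=32),
)

llm_from_bare, vision_from_bare = resolve_configs(bare)
llm_from_wrapped, vision_from_wrapped = resolve_configs(wrapped)
```

Both calls resolve to the same `llm_config` object; only the wrapper path yields a non-`None` `vision_config`.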

Test plan

  • All existing block/layer/full-model/export equivalence tests pass unchanged
  • New test_skywork_r1v2_config_qwen2_fallback verifies the direct Qwen2Config path
  • test_skywork_r1v2_state_dict_keys updated to only assert language_model.* keys (no vision)
  • No HF checkpoint required to run any test
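
One possible reading of the `language_model.*` check, sketched without any model or checkpoint. The helper name and the key names below are hypothetical and chosen for illustration; the real test may assert something different.

```python
def unexpected_keys(keys, prefix="language_model."):
    """Return, sorted, the keys that do not sit under the given prefix."""
    return sorted(k for k in keys if not k.startswith(prefix))


# Hypothetical state-dict key names, not taken from the real model.
llm_only = [
    "language_model.model.embed_tokens.weight",
    "language_model.lm_head.weight",
]
with_vision = llm_only + ["vision_model.embeddings.class_embedding"]

llm_only_extra = unexpected_keys(llm_only)
with_vision_extra = unexpected_keys(with_vision)
```

With a text-only config the list of unexpected keys should be empty; any vision weight would show up in it.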

🤖 Generated with Claude Code

Replace SkyworkChatConfig (loaded via AutoConfig with trust_remote_code,
which requires the checkpoint in the local HF cache) with a plain
Qwen2Config passed directly to SkyworkR1V2ForConditionalGeneration.

The model's __init__ already has a fallback:
  llm_config = getattr(config, "llm_config", config)
so passing Qwen2Config directly works without any wrapper config.
The vision tower is simply not instantiated (no vision_config attr).

This makes all tests runnable in CI without requiring the full 38B
checkpoint in the local HF cache.

Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>

Replace AutoConfig.from_pretrained(..., local_files_only=True) with
minimal faithful copies of SkyworkChatConfig and SkyworkVisionConfig
defined in the test file (same pattern used for HF modeling classes not
in transformers).

This removes the module-level pytest.skip that silently skipped all
Skywork tests in CI when the 38B checkpoint was absent. Tests now run
without any HF checkpoint, while still exercising the real config-
wrapping behavior (nested llm_config, vision_config, vision weight keys).

Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
@govind-ramnarayan govind-ramnarayan merged commit 54a2eeb into feat/paperclip_maximizer Mar 12, 2026
2 of 3 checks passed
bmarimuthu-nv pushed a commit that referenced this pull request Mar 13, 2026
bmarimuthu-nv pushed a commit that referenced this pull request Mar 13, 2026
bmarimuthu-nv pushed a commit that referenced this pull request Mar 14, 2026
bmarimuthu-nv pushed a commit that referenced this pull request Mar 18, 2026
bmarimuthu-nv pushed a commit that referenced this pull request Mar 25, 2026
bmarimuthu-nv pushed a commit that referenced this pull request Apr 1, 2026