Skip to content

Identity User/robot management skills #522

Open
t-hsia wants to merge 10 commits intomainfrom
feat/identitySkills
Open

Identity User/robot management skills #522
t-hsia wants to merge 10 commits intomainfrom
feat/identitySkills

Conversation

@t-hsia
Copy link
Copy Markdown

@t-hsia t-hsia commented May 1, 2026

No description provided.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

Claude finished @t-hsia's task in 4m 37s —— View job


Summary

This PR adds a new uipath-admin skill for Identity Server management via uip admin identity. It includes a well-structured SKILL.md, 5 reference files covering users, groups, robot accounts, external apps, and onboarding workflows, plus 8 test tasks (5 smoke, 3 e2e). Overall a solid first skill with good test coverage — a few convention issues to address.

Change-by-Change Review

1. skills/uipath-admin/SKILL.md

Severity: Low

Well-structured SKILL.md. Frontmatter is valid, name matches folder, description is 305 chars (well under 1024 cap). Front-loads identity and unique signals (uip admin identity, Users, groups, robot accounts). Includes redirects for uipath-platform and uipath-rpa. Preview status correctly placed in body callout, not in frontmatter. Critical Rules, Quick Start, Task Navigation, Anti-patterns, and Troubleshooting sections all present and well-ordered.

Minor: The Quick Start section (lines 44-49) has 4 commands but no surrounding prose explaining what the agent should do with the output. Adding a one-liner like "Verify login, then discover existing resources:" would help.

2. skills/uipath-admin/references/identity-commands.md

Severity: Medium

Comprehensive 452-line CLI reference. Every command is documented with flags, required/optional markers, and output codes. Good structure.

Issue — Significant content overlap with domain-specific reference files. Every command documented here (e.g., users list, groups create, external-apps generate-secret) is re-documented with nearly identical bash examples in user-management.md, group-management.md, robot-account-management.md, and external-app-management.md. An agent following the Task Navigation table will read identity-commands.md AND the domain file, ingesting duplicate command documentation. The domain files should link to identity-commands.md for flag details and focus only on workflow orchestration (step ordering, verification, decision logic).

Fix this →

3. skills/uipath-admin/references/onboarding-workflows.md

Severity: High

This file references the uipath-platform skill 4 times in its body (lines 75, 85, 96, 179), violating the self-containment rule. Skills must not reference or depend on other skills.

  • Line 75: "These commands use the uipath-platform skill's Orchestrator commands."
  • Line 85: "> For detailed Orchestrator folder and permission management, use the uipath-platform skill."
  • Line 96: "> Machine template management is handled via the uipath-platform skill or the Orchestrator REST API."
  • Line 179: "> For folder permission management, use the uipath-platform skill."

Additionally, Steps 4 and 5 of Workflow 1 (lines 77-97) contain placeholder code blocks with only comments and no actual commands — they're essentially stubs pointing to another skill. These incomplete steps will confuse an agent that attempts to follow the full onboarding workflow.

Fix: Either (a) remove Steps 4-5 and note that folder/machine setup is outside the scope of this skill, with the agent stopping after credential generation, or (b) include the actual uip CLI commands for folder assignment if they exist under the uip admin namespace. Remove all uipath-platform skill name references from the body.

Fix this →

4. skills/uipath-admin/references/user-management.md

Severity: Low

Clean workflow-oriented guide. Error handling table present. Code block comments like # List all users in the organization (line 10), # Search for a specific user (line 13), # First page (20 users) (line 93) are borderline per token optimization rules — they differentiate command variants, so they're acceptable here.

5. skills/uipath-admin/references/group-management.md

Severity: OK

Good workflow structure. Group types table (line 7) is useful. Correctly documents the built-in vs custom group distinction. Membership workflow correctly emphasizes using user IDs, not usernames.

6. skills/uipath-admin/references/robot-account-management.md

Severity: Low

The "Robot Account vs External App" table (lines 84-90) is helpful disambiguation. One issue: line 43 links to external-app-management.md with See [external-app-management.md](external-app-management.md) for credential generation — this is fine as an intra-skill link, but the phrasing could be tighter. The link navigates correctly.

7. skills/uipath-admin/references/external-app-management.md

Severity: OK

Well-structured. Key Concepts section (lines 8-13) is concise. The scopes table (lines 16-25) is useful reference. The warning about scopes being replaced, not merged (line 101) is an important non-obvious behavior — good inclusion. Error handling table present.

8. CODEOWNERS

Severity: Medium

Entry for /skills/uipath-admin/ is present (line 85). However, no entry for /tests/tasks/uipath-admin/ — other skills with tests (e.g., uipath-platform, uipath-data-fabric, uipath-tasks, uipath-human-in-the-loop) include CODEOWNERS for their test paths. Add:

/tests/tasks/uipath-admin/ @UiPath/Identity

Fix this →

9. tests/tasks/uipath-admin/smoke/*.yaml (5 files)

Severity: Low

Good smoke test coverage for list operations (users, groups, robot-accounts) and create operations (robot-account, external-app). All tasks correctly:

  • Tag uipath-admin as first tag ✓
  • Use valid test types (smoke) ✓
  • Follow task_id pattern skill-admin-identity-*
  • Validate login status check, correct command patterns, and --output json usage ✓

Minor: The smoke tests define their own agent block with lower max_turns (20-25) and no plugins field. If these run under the default experiment config, plugin loading should be inherited. If the task-level agent block replaces rather than merges with experiment defaults, the plugin might not load. Verify that coder_eval deep-merges agent config.

10. tests/tasks/uipath-admin/e2e/*.yaml (3 files)

Severity: OK

Strong e2e coverage for group membership lifecycle, human onboarding, and robot onboarding. Good success criteria that validate multi-step workflows (discover → create → verify → modify). The group_membership_management.yaml tests the full add/remove cycle with 8 success criteria. The robot_account_onboarding.yaml validates the cross-cutting workflow across robot-accounts and external-apps.

What's Missing

  • CODEOWNERS for test path/tests/tasks/uipath-admin/ needs an ownership entry (see finding feat(CodedAgents): llm and interruption patterns #8).
  • users invite missing --organization flag? — In identity-commands.md (lines 108-120), the users invite command does not list --organization as a flag, while every other command requires it. If this is intentional (invite operates at account level, not partition level), documenting why would help. If it's a documentation gap, add the flag.
  • No integration test tasks — Only smoke and e2e types are present. This is acceptable per the review criteria (both required types are covered), but an integration-level test for individual CRUD operations (e.g., create then delete a user) would strengthen coverage.

Area Ratings

Area Status Notes
Frontmatter ✅ OK Valid YAML, 305-char description, correct redirects
E2E Tests ✅ OK 5 smoke + 3 e2e tasks with good success criteria
Skill Body ⚠️ Issue Cross-skill references in onboarding-workflows.md
References & Assets ⚠️ Issue Significant duplication between identity-commands.md and domain files; placeholder steps in onboarding
Repo Hygiene ⚠️ Issue Missing CODEOWNERS entry for test path

Issues for Manual Review

  1. CLI command accuracy — Unable to verify that uip admin identity commands, flags, and output codes are correct against the actual CLI. The commands look well-structured but should be validated against uip admin identity --help output.
  2. users invite and --organization flag — Is invite intentionally organization-agnostic? This differs from every other identity command and could confuse the agent.
  3. Plugin loading in smoke tests — Verify that task-level agent blocks (which omit plugins) correctly inherit plugin config from the experiment defaults in coder_eval.
  4. Scope completeness — The skill covers Identity Server CRUD well. If there are other uip admin subcommands beyond identity (e.g., uip admin settings, uip admin audit), the skill name uipath-admin might be too broad for just identity management, or the skill should note its current scope explicitly.

Conclusion

Request changes. The skill is well-built with strong structure, good test coverage, and clear documentation. Three items need fixing before merge:

  1. High — Remove cross-skill references to uipath-platform in onboarding-workflows.md and resolve placeholder Steps 4-5
  2. Medium — Add CODEOWNERS entry for /tests/tasks/uipath-admin/
  3. Medium — Reduce duplication between identity-commands.md and domain-specific reference files (domain files should link to the command reference for flag details)

@t-hsia t-hsia changed the title Identity skills Add full Identity CLI support for user, robot, and groups. Partial for external apps May 4, 2026
@t-hsia t-hsia changed the title Add full Identity CLI support for user, robot, and groups. Partial for external apps Identity User/robot management skills May 4, 2026
Comment thread skills/uipath-admin/SKILL.md Outdated

> **Preview** — Under active development. Command coverage will expand.

Identity Server management via `uip admin identity`. Users, groups, robot accounts, external OAuth2 apps.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding "uipath-admin" folder name, I would suggest you to consult with @tomasz Religa if this is ok.
Because, we will have other admin commands.

@bai-uipath
Copy link
Copy Markdown
Contributor

bai-uipath commented May 5, 2026

I ran the AI linter on the new tasks - results below.


Scope: 8 task YAMLs under tests/tasks/uipath-admin/

Summary

Metric Count
Total linted 8
Critical 0
High 6
Medium 2
Low 0
OK 0

Per-Task Verdicts

Task Verdict Notes
group_membership_management_e2e.yaml Medium Unique issue + Theme 2, 3
human_user_onboarding_e2e.yaml Medium Unique issue + Theme 2, 3
identity_create_external_app_smoke.yaml High Theme-captured (Theme 1, 3)
identity_create_robot_account_smoke.yaml High Theme-captured (Theme 1, 3)
identity_list_groups_smoke.yaml High Theme-captured (Theme 1, 3)
identity_list_robot_accounts_smoke.yaml High Theme-captured (Theme 1, 3)
identity_list_users_smoke.yaml High Theme-captured (Theme 1, 3)
robot_account_onboarding_e2e.yaml High Theme-captured (Theme 1, 2, 3)

Per-Task Findings

group_membership_management_e2e.yaml — Medium

Issues

  • [Medium] Meaningful coverage / Could pass for the wrong reason (lines 35–105) — e2e criteria are all command_executed regex matches; nothing verifies the group was created, members added, or the second user removed at the platform side. A run where every CLI fails (auth issue) but pattern-matches still passes.

Suggested fixes

  • After create/add/remove commands, add run_command criteria invoking uip admin identity groups get-members --group-id <X> --output json and json_check the resulting member list (count after add, count after remove).
  • Drop Use --output json on all commands. (line 30); the cross-cutting command_executed check on lines 99–105 is sufficient.

human_user_onboarding_e2e.yaml — Medium

Issues

  • [Medium] Meaningful coverage / Could pass for the wrong reason (lines 34–94) — e2e criteria are all command_executed regex matches; nothing verifies the user was actually invited or actually added to the "Automation Developer" group at the platform side.

Suggested fixes

  • Add a run_command post-condition listing "Automation Developer" group members and json_check for the invited user, OR a follow-up uip admin identity users list filtered by email.
  • Drop the --output json prescription on line 29; cross-cutting check on lines 88–94 stays.

identity_create_external_app_smoke.yaml — High

Theme-captured; see Theme 1, Theme 3.


identity_create_robot_account_smoke.yaml — High

Theme-captured; see Theme 1, Theme 3.


identity_list_groups_smoke.yaml — High

Theme-captured; see Theme 1, Theme 3.


identity_list_robot_accounts_smoke.yaml — High

Theme-captured; see Theme 1, Theme 3.


identity_list_users_smoke.yaml — High

Theme-captured; see Theme 1, Theme 3.


robot_account_onboarding_e2e.yaml — High

Theme-captured; see Theme 1, Theme 2, Theme 3.

Themes

Theme 1 — [Critical] Self-report anti-pattern (6 tasks)

File Prompt lines Criteria lines
identity_create_external_app_smoke.yaml 20–27 62–87
identity_create_robot_account_smoke.yaml 19–26 69–86
identity_list_groups_smoke.yaml 18–23 58–70
identity_list_robot_accounts_smoke.yaml 17–22 49–61
identity_list_users_smoke.yaml 19–25 60–85
robot_account_onboarding_e2e.yaml 32–37 92–109

Each prompt instructs the agent to write a report.json recording the very things the criteria then grade — command_used, commands_attempted, app_name, scopes_used, robot_name, listed_first, steps_completed. The file_exists, file_contains, and json_check criteria all read this self-written file. Combined with the explicit "no live tenant — commands will fail" caveat, a lazy agent satisfies criteria by running each CLI once (failing) and writing the expected JSON. Skill is not exercised against ground truth.

Fix: Remove the report.json requirement entirely. Grade on the CLI invocation itself:

  • Replace file_exists / file_contains / json_check on report.json with run_command blocks that re-execute the canonical CLI and assert via expected_stdout / stdout_match against the real (failing) error string when no tenant is connected — proves the agent constructed the right command.
  • Or, if smoke cannot reach a tenant, drop the file checks and tighten command_pattern regexes to pin exact flags + values (e.g. --display-name "Invoice Processing Bot", --scope "?OR\.(Folders|Jobs)").

Theme 2 — [High] Prompt over-specification: recipe-style numbered prompts (3 tasks)

File Prompt lines Recipe length
group_membership_management_e2e.yaml 22–30 5 steps
human_user_onboarding_e2e.yaml 22–29 3 steps
robot_account_onboarding_e2e.yaml 25–37 2 steps

Initial prompts enumerate the procedure step-by-step (1. Create… 2. List… 3. Add…). The skill's onboarding-workflows reference is supposed to teach this ordering; if the prompt enumerates it, an agent can pass without invoking the skill at all.

Fix: State the goal, not the procedure. Examples:

  • group_membership_management_e2e: "Set up a new 'Invoice Processing Team' group containing the first two users in the org, verify membership, then remove the second user."
  • human_user_onboarding_e2e: "Onboard john.doe@example.com (John Doe) and add them to the 'Automation Developer' group once provisioned."
  • robot_account_onboarding_e2e: "Onboard an unattended invoice-processing robot ('invoice-processor' / 'Invoice Processing Robot') with appropriate group access."

Theme 3 — [Low] Prompt over-specification: --output json prescription (all 8 tasks)

All prompts contain "Use --output json on (all|every uip admin identity) commands." Procedural — the skill already teaches --output json. Cross-cutting command_executed criteria already verify usage, so the prompt-level prescription is redundant and slightly leaks procedure.

Fix: Drop the --output json instruction from each initial_prompt. Keep the cross-cutting command_executed criterion that pattern-matches --output\s+json so the test still rewards correct flag use.

Top 3 Priority Fixes (whole PR)

  1. Remove the report.json self-report pattern from all 6 smoke + e2e tasks (Theme 1). Most expensive issue — these tests grade the agent's own narrative, not platform behavior.
  2. Convert numbered-step recipes in the 3 onboarding/management e2e prompts to goal statements (Theme 2). Lets the skill carry workflow knowledge.
  3. Add real platform side-effect checks (run_command + json_check of get-members / users list filtered output) to group_membership_management_e2e and human_user_onboarding_e2e. They're the only two tasks that can hit a live tenant; right now they only verify the agent typed the right commands.

t-hsia and others added 2 commits May 5, 2026 11:53
Align skill docs and tests with CLI refactor (075b3a8d):
- get-members → members list
- add-members → members add
- remove-members → members revoke
- GroupMembersRemoved → GroupMembersRevoked

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 5, 2026

Claude finished @t-hsia's task in 3m 7s —— View job


Coder-eval task lint (advisory)

8 task YAMLs changed; verdicts: 0 Critical, 6 High, 2 Medium, 0 Low, 0 OK.

Rubric: .claude/commands/lint-task.md. This check is advisory and never blocks merge.

Evidence of passing run

High — PR body does not claim the changed tasks have been run and passed. Please edit the PR description to add a line like: Ran skill-admin-identity-* locally and it passed.

Per-task lint

tests/tasks/uipath-admin/group_membership_management_e2e.yaml — verdict: Medium

Issues:

  • [Medium] Meaningful coverage (lines 35–105): all 8 success_criteria are command_executed regex matches — no run_command, json_check, or output validation. Proves the agent typed the right commands but not that the group was created or members were added/removed.
  • [Medium] Could pass for the wrong reason (lines 35–105): on a live tenant, if commands fail silently (e.g. auth issue), criteria still pass as long as the command string matched. No platform side-effect verification.

Suggested fixes:

  • After create/add/remove commands, add run_command criteria invoking uip admin groups get-members --group-id <X> --output json and json_check the resulting member list (count after add, count after remove).
  • Alternatively, add json_check criteria on the stdout of the groups create command to verify the group name in the response.

tests/tasks/uipath-admin/human_user_onboarding_e2e.yaml — verdict: Medium

Issues:

  • [Medium] Meaningful coverage (lines 34–94): all 7 success_criteria are command_executed — no verification that the user was actually invited or added to the group at the platform side.
  • [Medium] Could pass for the wrong reason (lines 34–94): a run where every CLI fails but pattern-matches the regex still passes.

Suggested fixes:

  • Add a run_command post-condition listing "Automation Developer" group members and json_check for the invited user, or a follow-up uip admin users list filtered by email with output validation.

tests/tasks/uipath-admin/identity_create_external_app_smoke.yaml — verdict: High

Theme-captured; see Theme 1, 3.


tests/tasks/uipath-admin/identity_create_robot_account_smoke.yaml — verdict: High

Theme-captured; see Theme 1, 3.


tests/tasks/uipath-admin/identity_list_groups_smoke.yaml — verdict: High

Theme-captured; see Theme 1, 3.


tests/tasks/uipath-admin/identity_list_robot_accounts_smoke.yaml — verdict: High

Theme-captured; see Theme 1, 3.


tests/tasks/uipath-admin/identity_list_users_smoke.yaml — verdict: High

Theme-captured; see Theme 1, 3.


tests/tasks/uipath-admin/robot_account_onboarding_e2e.yaml — verdict: High

Theme-captured; see Theme 1, 2, 3.

Within-PR duplicates

No duplicate clusters detected. The 3 list-smoke tasks and 2 create-smoke tasks share scaffolding but exercise materially distinct CLI subcommands (users list, groups list, robot-accounts list, etc.) — this is good template reuse, not duplication.

Themes

Theme 1 — [Critical] Self-report anti-pattern (6 tasks)

File Prompt lines Criteria lines
identity_create_external_app_smoke.yaml 20–27 62–87
identity_create_robot_account_smoke.yaml 19–26 69–86
identity_list_groups_smoke.yaml 18–23 58–70
identity_list_robot_accounts_smoke.yaml 17–22 49–61
identity_list_users_smoke.yaml 19–25 60–85
robot_account_onboarding_e2e.yaml 32–37 92–109

Each prompt instructs the agent to write a report.json recording the very things the criteria then grade — command_used, commands_attempted, app_name, scopes_used, robot_name, listed_first, steps_completed. The file_exists, file_contains, and json_check criteria all read this self-written file. Combined with the explicit "no live tenant — commands will fail" caveat, a lazy agent satisfies criteria by running each CLI once (failing) and writing the expected JSON. Skill is not exercised against ground truth.

Fix: Remove the report.json requirement entirely. Grade on the CLI invocation itself:

  • Replace file_exists / file_contains / json_check on report.json with tighter command_executed regexes that pin exact flags and values (e.g. --display-name "Invoice Processing Bot", --scope "?OR\.(Folders|Jobs)").
  • Or, if the sandbox can reach a tenant, use run_command blocks that re-execute the canonical CLI and assert via expected_stdout / stdout_match.

Fix this →

Theme 2 — [High] Prompt over-specification: recipe-style numbered steps (3 tasks)

File Prompt lines Step count
group_membership_management_e2e.yaml 22–30 5 steps
human_user_onboarding_e2e.yaml 22–27 3 steps
robot_account_onboarding_e2e.yaml 25–31 2 steps

Prompts enumerate the procedure step-by-step. The skill's onboarding-workflows and group-management references are supposed to teach ordering; if the prompt enumerates it, an agent passes without invoking the skill.

Fix: State the goal, not the procedure:

  • group_membership_management_e2e: "Set up a new 'Invoice Processing Team' group containing the first two users in the org, verify membership, then remove the second user."
  • human_user_onboarding_e2e: "Onboard john.doe@example.com (John Doe) and add them to the 'Automation Developer' group once provisioned."
  • robot_account_onboarding_e2e: "Onboard an unattended invoice-processing robot ('invoice-processor' / 'Invoice Processing Robot') with appropriate group access."

Fix this →

Theme 3 — [Low] Prompt over-specification: --output json prescription (all 8 tasks)

All prompts contain Use --output json on (all|every) uip admin commands. The skill already teaches --output json usage. Cross-cutting command_executed criteria already verify the flag, so the prompt-level prescription is redundant and slightly leaks procedure.

Fix: Drop the --output json instruction from each initial_prompt. Keep the cross-cutting command_executed criterion that pattern-matches --output\s+json.

Conclusion

⚠ 8 task(s) have issues, max severity Critical (at theme level; High at per-task level after theme downgrade). Advisory only — not blocking merge.

Top 3 priority fixes:

  1. Remove report.json self-report pattern from all 6 affected tasks (Theme 1) — most impactful issue; these tests grade the agent's own narrative, not platform behavior.
  2. Convert numbered-step recipes to goal statements in the 3 e2e/onboarding prompts (Theme 2) — lets the skill carry workflow knowledge.
  3. Add platform side-effect checks (run_command + json_check) to group_membership_management_e2e and human_user_onboarding_e2e — the only two tasks expected to hit a live tenant; currently they only verify command strings were typed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants