Skip to content

Revamp the experiment list UI#1982

Open
deep1401 wants to merge 9 commits intomainfrom
add/experiments-ui
Open

Revamp the experiment list UI#1982
deep1401 wants to merge 9 commits intomainfrom
add/experiments-ui

Conversation

@deep1401
Copy link
Copy Markdown
Member

@deep1401 deep1401 commented May 4, 2026

No description provided.

@netlify
Copy link
Copy Markdown

netlify Bot commented May 4, 2026

Deploy Preview for transformerlab canceled.

Name Link
🔨 Latest commit b74810b
🔍 Latest deploy log https://app.netlify.com/projects/transformerlab/deploys/69fa62b8e024b900075c5738

@paragon-review
Copy link
Copy Markdown

paragon-review Bot commented May 4, 2026

Paragon Review Skipped

Hi @deep1401! Your Polarity credit balance is insufficient to complete this review.

Please visit https://app.paragon.run to finish your review.

@sentry
Copy link
Copy Markdown

sentry Bot commented May 4, 2026

Codecov Report

❌ Patch coverage is 57.64706% with 36 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...pi/transformerlab/routers/experiment/experiment.py 25.00% 35 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Member

@aliasaria aliasaria left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

This PR adds a per-user "recently opened experiments" feature, an experiments manager modal, rename/share modals, and a new user_experiment_access table. The shape is solid — service-layer separation is clean, route ordering for /recent vs /{id} is correct, and require_permission is wired in properly. The most important issues are a race condition in the touch_experiment upsert, missing cleanup of access records on experiment delete, an unhandled ValueError path on rename, and a few client-side robustness/typing issues.


Must Fix (blocks merge)

  • [Correctness] Race condition in touch_experiment SELECT-then-INSERT. Two concurrent calls (e.g. fast double-click, navigation + manual touch) both observe record is None, both session.add(...), the second commit raises IntegrityError against uq_user_experiment_access. The route returns 500 and the touch silently fails. This is a frontload-on-startup endpoint, so concurrency isn't theoretical.

    Use a portable upsert pattern that works on both SQLite and Postgres. Either retry-on-IntegrityError, or update-then-insert-if-zero-rows:

    async def touch_experiment(session, user_id, team_id, experiment_id):
        now = datetime.now(timezone.utc)
        result = await session.execute(
            update(UserExperimentAccess)
            .where(
                UserExperimentAccess.user_id == user_id,
                UserExperimentAccess.team_id == team_id,
                UserExperimentAccess.experiment_id == experiment_id,
            )
            .values(last_opened_at=now)
        )
        if result.rowcount == 0:
            try:
                session.add(UserExperimentAccess(
                    user_id=user_id, team_id=team_id,
                    experiment_id=experiment_id, last_opened_at=now,
                ))
                await session.commit()
            except IntegrityError:
                await session.rollback()  # lost the race — the other writer won, that's fine
        else:
            await session.commit()
  • [Correctness] Stale access records leak when experiments are deleted. experiment_delete (in experiment_service.py) does not delete the matching user_experiment_access rows. Per CLAUDE.md the project intentionally avoids FK constraints, so cleanup must be explicit. The /recent endpoint then returns ids that no longer exist; experiment_service.experiment_get(exp_id) returns falsy and they are filtered out — so the user-visible result is a "recent" list of fewer than 3 items (or empty, falling through to fallback). It also bloats the table over time.

    Delete the rows in experiment_delete:

    async def experiment_delete(id):
        try:
            exp = await Experiment.get(id)
            await exp.delete()
            # Clean up per-user access records (no FK in this DB by convention).
            async with get_async_session_ctx() as session:
                await session.execute(
                    delete(UserExperimentAccess).where(
                        UserExperimentAccess.experiment_id == str(id)
                    )
                )
                await session.commit()
            await cache.invalidate("experiments")
        ...
  • [Correctness] experiment_rename raises ValueError → 500 on sanitized-empty input. Router validates new_name is non-empty, but secure_filename("..."), secure_filename("\$\$\$"), secure_filename("../"), etc. return "" and the service raises ValueError("Invalid experiment name") — which propagates as a 500. Should be a 422.

    Catch in the router (or raise HTTPException from the service):

    try:
        result = await experiment_service.experiment_rename(id, new_name)
    except ValueError as e:
        raise HTTPException(status_code=422, detail=str(e))
    if result is None:
        raise HTTPException(status_code=404, detail=f"Experiment {id} not found")

Should Fix (important but not blocking)

  • [Performance] N+1 permission checks + N+1 experiment reads in /recent fallback. When a non-owner with no recent records hits /recent, you iterate every experiment, calling check_permission for each, until you collect 3. For workspaces with hundreds of experiments and a non-owner user, that's hundreds of DB roundtrips on every dropdown open. A simpler, faster pattern: reuse experiments_get_all's already-filtered list and slice the first 3. The same applies to the recent path — fetch all permitted experiments once, intersect with recent_ids, preserve order.

  • [Maintainability] Owner column shows raw UUID prefix instead of email. ExperimentsManagerModal.tsx:664-668 falls back to created_by.slice(0, 8) + '…' when the creator isn't the current user. You already have members loaded — look up the email there:

    ```ts
    const ownerLabel =
    exp.config?.created_by === currentUserId
    ? 'you'
    : members.find((m) => m.user_id === exp.config?.created_by)?.email
    ?? 'unknown';
    ```

  • [Correctness] ShareExperimentModal silently swallows delete errors. onRemove's catch {} leaves the row in the UI even when the request fails. At minimum, surface via setError like onAdd does, and don't optimistically remove from rules until the response is OK.

  • [Maintainability] Stale "recent" mutate after rename only — not after share or permission changes. mutateRecent is plumbed through rename and delete but not after sharing. If a teammate shares an experiment with the user, the recent list won't reflect access changes until next refresh. Either pass mutateRecent to ShareExperimentModal (onClose would call it) or use SWR key-based revalidation.

  • [Style] any types violate the CLAUDE.md "Avoid any" rule. Several spots: (exp: any) at SelectExperimentMenu.tsx, err: any / e: any in RenameExperimentModal.tsx and ShareExperimentModal.tsx. Catch handlers should use unknown and narrow.

  • [Correctness] Frontend missing last_opened_at update on the open path of ExperimentsManagerModal. handleOpen calls onExperimentSelect(exp.id)createHandleClose/touch + mutate() of the Recent SWR. But the manager's own useSWR(GetAll()) cache isn't invalidated, so reopening the manager shows stale last_opened_at. Call mutate() from the manager too, or share the cache key.


Consider Improving

  • The migration adds idx_user_experiment_access_user_team but get_recent_experiment_ids orders by last_opened_at DESC. With limit=3 and per-user data this is fine today; a composite (user_id, team_id, last_opened_at DESC) index would be tidier if the table grows.
  • experiment_access_service.touch_experiment calls await session.commit() from inside a service that receives the session as a dependency. The session lifecycle is owned by the FastAPI dependency; committing inside the service couples it to that lifecycle. Consider letting the caller commit (consistent with how some other services in this repo are structured).
  • formatRelativeTime uses Math.floor((Date.now() - date) / 86400000) — wall-clock-day rough math, not calendar days. "Yesterday" is wrong across DST boundaries. Minor, but date-fns/formatDistanceToNow would be sturdier.
  • The new LEVEL_ACTIONS.admin = ['read', 'write', 'execute', 'delete', 'admin'] is defined inline; verify it matches the canonical VALID_ACTIONS set in permission_service.py so the share UI stays in sync with the backend.

What's Working Well

  • Route ordering in experiment.py is correct: /recent is registered before /{id}, so it won't be shadowed by the dynamic-id route. Easy mistake to make; it was avoided here.
  • Service-layer split (experiment_access_service.py) follows the project's "business logic in services, not routers" convention from CLAUDE.md.
  • The migration uses table_exists + if_exists=True on drops — idempotent and consistent with existing migration patterns.
  • The unit tests in test_experiment_access_service.py correctly test the upsert branches at the service layer (rather than going through the full router stack), which matches the testing guidance in CLAUDE.md.

@greninja
Copy link
Copy Markdown
Contributor

greninja commented May 5, 2026

I was trying this out -- I can validate the renaming issue. nothing happens on renaming the experiment from the list.

@deep1401
Copy link
Copy Markdown
Member Author

deep1401 commented May 5, 2026

Fixed everything now!

Renaming was working but the experiments were cached. (@greninja could you check again once?)

@deep1401
Copy link
Copy Markdown
Member Author

deep1401 commented May 5, 2026

Fixed one more issue to mutate current experimentinfo upon rename as well

@greninja
Copy link
Copy Markdown
Contributor

greninja commented May 5, 2026

rename is still not working, had a chat with claude about this and I manually traced the path and verified, it seems:

the experiment_rename writes the new name to index.json["config"]["name"], but the readers in experiment_get_all and experiment_get both read the top-level name field, so maybe rename never reachs the UI.

@greninja
Copy link
Copy Markdown
Contributor

greninja commented May 5, 2026

I also saw another potential issue but it seems a bit non-deterministic so not able to reproduce -- when i open the drop down list to see all the experiments on the top left, I sometimes see 3, sometimes 2, sometimes just 1 even though all these times I had atleast 6 in my workspace

@deep1401
Copy link
Copy Markdown
Member Author

deep1401 commented May 5, 2026

I also saw another potential issue but it seems a bit non-deterministic so not able to reproduce -- when i open the drop down list to see all the experiments on the top left, I sometimes see 3, sometimes 2, sometimes just 1 even though all these times I had atleast 6 in my workspace

This is not an issue. We see the 3 most recently used ones. If there's no recent then we show them alphabetically but if there's atleast one then we show them as-is

@greninja
Copy link
Copy Markdown
Contributor

greninja commented May 5, 2026

and also when creating a new experiment with an id that already exists it throws this error (for "test" as the id):

  File "/home/shadab/projects/transformerlab/transformerlab-app/api/transformerlab/services/experime
nt_service.py", line 107, in experiment_create
    await Experiment.create_with_config(name, config)
  File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/experiment.py", line 141, in create_with_config
    exp = await cls.create(name)
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/labresource.py", line 31, in create
    await newobj._initialize()
  File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/experiment.py", line 116, in _initialize
    await super()._initialize()
  File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/labresource.py", line 70, in _initialize
    raise FileExistsError(f"{type(self).__name__}with id '{self.id}' already exists")
FileExistsError: Experiment with id 'test' already exists

in the logs + UI error

but still allowed me to create create the experiment, was that intentional? or should we not allow duplicate names/ids

@greninja
Copy link
Copy Markdown
Contributor

greninja commented May 5, 2026

I also saw another potential issue but it seems a bit non-deterministic so not able to reproduce -- when i open the drop down list to see all the experiments on the top left, I sometimes see 3, sometimes 2, sometimes just 1 even though all these times I had atleast 6 in my workspace

This is not an issue. We see the 3 most recently used ones. If there's no recent then we show them alphabetically but if there's atleast one then we show them as-is

oh I thought we show 3 no matter what as long as there are atleast 3

@deep1401
Copy link
Copy Markdown
Member Author

deep1401 commented May 5, 2026

@greninja Okay sorry about this, I disabled the experiment rename functionality because it sort of required a migration of the folder if things were on aws and it required moving most files. So thats why I just removed it instead of making it more confusing while waiting for the migration

@deep1401
Copy link
Copy Markdown
Member Author

deep1401 commented May 5, 2026

File "/home/shadab/projects/transformerlab/transformerlab-app/api/transformerlab/services/experime
nt_service.py", line 107, in experiment_create
await Experiment.create_with_config(name, config)
File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/experiment.py", line 141, in create_with_config
exp = await cls.create(name)
^^^^^^^^^^^^^^^^^^^^^^
File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/labresource.py", line 31, in create
await newobj._initialize()
File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/experiment.py", line 116, in _initialize
await super()._initialize()
File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/labresource.py", line 70, in _initialize
raise FileExistsError(f"{type(self).name}with id '{self.id}' already exists")
FileExistsError: Experiment with id 'test' already exists

Thanks fixed this, it was silently swallowing the error. Fixed it

@greninja
Copy link
Copy Markdown
Contributor

greninja commented May 5, 2026

File "/home/shadab/projects/transformerlab/transformerlab-app/api/transformerlab/services/experime
nt_service.py", line 107, in experiment_create
await Experiment.create_with_config(name, config)
File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/experiment.py", line 141, in create_with_config
exp = await cls.create(name)
^^^^^^^^^^^^^^^^^^^^^^
File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/labresource.py", line 31, in create
await newobj._initialize()
File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/experiment.py", line 116, in _initialize
await super()._initialize()
File "/home/shadab/projects/transformerlab/transformerlab-app/lab-sdk/src/lab/labresource.py", line 70, in _initialize
raise FileExistsError(f"{type(self).name}with id '{self.id}' already exists")
FileExistsError: Experiment with id 'test' already exists

Thanks fixed this, it was silently swallowing the error. Fixed it

yeah this fails loudly now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants