
store_new_application lacks location normalization, causing duplicate components #72

@anticomputer

Description


The `store_new_application` MCP tool in `repo_context.py` performs an upsert
via `filter_by(repo=repo, location=location)`, but the `location` field is
passed through as-is from the LLM agent. When the agent calls the tool
multiple times for the same root-level component with slightly different
location strings (e.g. `.`, `..`, `...`), each call creates a new row
because the exact-match check treats them as distinct.

There is also no `UNIQUE` constraint on `(repo, location)` in the
`Application` model, so the database does not prevent the duplicates.
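A minimal sketch of the failure mode, using a plain dict keyed by `(repo, location)` in place of the real table (the names and upsert shape here are illustrative, not the actual `repo_context.py` implementation):

```python
# Illustrative stand-in for the upsert: the key is an exact match on
# (repo, location), with no normalization of the location string.
rows = {}

def store_new_application(repo: str, location: str, notes: str) -> None:
    key = (repo, location)  # exact-match key, as in filter_by(repo=..., location=...)
    rows[key] = notes       # upsert: insert new row or overwrite existing

# The agent refers to the same root-level component three different ways.
for loc in (".", "..", "..."):
    store_new_application("demo-repo", loc, "Single-component Go web application")

print(len(rows))  # 3 rows for one component
```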

This causes downstream amplification in taskflows that use
`repeat_prompt: true` with `async: true` (e.g. `classify_application_local`):
`get_components` returns all rows, and the taskflow spawns a parallel agent
for each duplicate.

Observed on a single-file Go application (one component), where
identify_applications stored 3 rows:

| id | location | notes (truncated)                        |
|----|----------|------------------------------------------|
| 1  | `.`      | Single-component Go web application ...  |
| 2  | `..`     | Single-component Go web application ...  |
| 3  | `...`    | Single-component Go web application ...  |

This resulted in classify_application_local running 3 identical analyses
in parallel instead of 1.

Possible fixes:

  • Normalize `location` before the upsert (strip trailing dots/slashes, canonicalize `.` variants)
  • Add a `UNIQUE(repo, location)` constraint to the `Application` table
  • Both
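A sketch of the normalization half of the fix, as a pure stdlib helper. The function name and the rule that dot-only strings collapse to the repo root are assumptions for illustration, not the actual codebase:

```python
import posixpath

def normalize_location(location: str) -> str:
    """Canonicalize a repo-relative location string before the upsert.

    Hypothetical helper: collapses the dot-only variants the agent emits
    for the repo root ('.', '..', '...') to '.', strips trailing slashes,
    and normalizes internal './' segments.
    """
    loc = location.strip().rstrip("/")
    # Treat empty or dot-only strings as the canonical root marker.
    if loc == "" or set(loc) == {"."}:
        return "."
    # posixpath.normpath collapses './src', 'src/./api', etc.
    return posixpath.normpath(loc)
```

Calling this on `location` before `filter_by(repo=repo, location=location)` would make the three variants from the example hit the same row. Pairing it with a `UniqueConstraint` on `(repo, location)` in the `Application` model would additionally enforce deduplication at the database layer, catching any variants the normalizer misses.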

Labels: bug
