Merged
8 changes: 4 additions & 4 deletions CLUSTERING.md
@@ -2,7 +2,7 @@

## Overview

-The clustering mechanism groups similar reports within each domain using unsupervised machine learning (SBERT embeddings and agglomerative clustering) and creates a bucket for each cluster.
+The clustering mechanism groups similar reports within each domain using unsupervised machine learning (SBERT embeddings and agglomerative clustering) and creates a bucket for each cluster.

## Running Full Clustering
Note that running full clustering **will delete existing clusters and cluster-based buckets** and recreate them from scratch. Generally we'll need to do it only once.
@@ -11,10 +11,10 @@ Note that running full clustering **will delete existing clusters and cluster-ba

```bash
# Cluster reports for a specific domain only
-uv run -p 3.12 --extra=server server/manage.py cluster_reports --domain example.com
+uv run --extra=server server/manage.py cluster_reports --domain example.com

# Cluster all reports across all domains
-uv run -p 3.12 --extra=server server/manage.py cluster_reports cluster_reports
+uv run --extra=server server/manage.py cluster_reports cluster_reports

```

@@ -78,7 +78,7 @@ Low-quality reports skip clustering and go directly to domain-based buckets.

You can also run triage manually:
```bash
-uv run -p 3.12 --extra=server server/manage.py triage_new_reports
+uv run --extra=server server/manage.py triage_new_reports
```

Note: This command requires at least one successful full clustering run to have occurred first.
14 changes: 7 additions & 7 deletions README.md
@@ -37,12 +37,12 @@ The server is expected to be run using [`uv`](https://docs.astral.sh/uv/)
To setup the server, run the following commands:

```
-$ uv run -p 3.12 --extra=server server/manage.py migrate
+$ uv run --extra=server server/manage.py migrate
```

Create the webcompatmanager user.
```
-$ uv run -p 3.12 --extra=server server/manage.py createsuperuser
+$ uv run --extra=server server/manage.py createsuperuser
Username (leave blank to use 'user'): webcompatmanager
Email address: webcompatmanager@internal.com
Password:
@@ -52,7 +52,7 @@ Superuser created successfully.

It is now possible to run the development server locally:
```
-$ uv run -p 3.12 --extra=server server/manage.py runserver
+$ uv run --extra=server server/manage.py runserver
```

Log in using the credentials created above.
@@ -62,17 +62,17 @@ Log in using the credentials created above.
Lints are run with pre-commit. This can be installed as a Git hook, or run manually using:

```
-uv run --extra=dev -p 3.12 pre-commit run --all
+uv run --extra=dev pre-commit run --all
```

Tests are run using tox:
```
-uv run --extra=dev -p 3.12 tox
+uv run --extra=dev tox
```

End-to-end tests can be run separately using:
```
-uv run --extra=dev -p 3.12 tox -e e2e
+uv run --extra=dev tox -e e2e
```

Note: E2E tests require the frontend to be built first. Run `npm run build` or `npm run start` in `server/frontend/` before running E2E tests.
@@ -97,7 +97,7 @@ with:

This requires first authenticating with gcloud.

-```uv run -p 3.12 --extra=server server/manage.py import_reports_from_bigquery --since <date>```
+```uv run --extra=server server/manage.py import_reports_from_bigquery --since <date>```


### Important changes in settings.py