
[DEV-14703] Create NL Search Django Model#4645

Open
AHasan-FRB wants to merge 12 commits into qat from ftr/dev-14561-nl-django-models


AHasan-FRB commented Apr 30, 2026

Description:

Create the Django models and corresponding migrations for the NL search assistant, ensuring the correct tables and relationships are created in Postgres.

Technical Details:

  • To load fixture data, you can load each fixture separately or, alternatively, load them all automatically with the command python manage.py load_llm_fixtures
  • New tests created to ensure the new models load successfully
  • Models define a __str__ method for easier admin debugging
  • Most models define a default ordering, for easier indexing and searching

Requirements for PR Merge:

  1. Unit & integration tests updated
  2. API documentation updated (examples listed below)
    1. API Contracts
    2. API UI
    3. Comments
  3. Data validation completed (examples listed below)
    1. Does this work well with the current frontend? Or is the frontend aware of a needed change?
    2. Is performance impacted in the changes (e.g., API, pipeline, downloads, etc.)?
    3. Is the expected data returned with the expected format?
  4. Appropriate Operations ticket(s) created
  5. Jira Ticket(s)
    1. DEV-14703

Explain N/A in above checklist:

Contributor

zachflanders-frb left a comment


Looking good! I am requesting changes to remove the LLMSearchQuery model and to add the migrations file to the commit.


Can we add Nova Pro? (amazon.nova-pro-v1:0)

        db_table = "ai_model"
        ordering = ["-id"]

class Prompts(models.Model):

I wonder about adding a name field to this model, so that we have an easier way to look up prompts than by the id or the full description?

    system_prompt = models.ForeignKey(Prompts, on_delete=models.SET_NULL, null=True, related_name="sessions")
    started_at = models.DateTimeField(auto_now_add=True)
    ended_at = models.DateTimeField(null=True, blank=True)
    feedback = models.BooleanField(default=None, null=True, blank=True, help_text="positive=True, negative=False")

Based on the UX mocks, it looks like we are going to have a short survey if someone gives feedback. This makes me think that feedback could be its own model with an is_positive field and potentially a survey JSON field to represent survey questions and answers, or even separate survey question and survey response models to fully model out the survey. OTOH, the survey might not use the API at all and might use some other tool to collect feedback.

    created_at = models.DateTimeField(auto_now_add=True)
    input_tokens = models.IntegerField(default=0)
    output_tokens = models.IntegerField(default=0)
    total_tokens = models.IntegerField(default=0)

In my initial testing, total_tokens might not be that important to keep track of, since input tokens and output tokens are priced differently and this is primarily to keep track of usage and cost.
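To illustrate the point about pricing, here is a minimal sketch of a cost estimate that relies only on the separate input and output counts. The per-1K-token prices and the function name `estimate_cost` are hypothetical placeholders, not real rates for any model, and nothing here is taken from the PR's code.

```python
from decimal import Decimal

# Hypothetical per-1K-token prices in USD (assumptions for illustration only;
# real rates depend on the model and provider).
INPUT_PRICE_PER_1K = Decimal("0.0008")
OUTPUT_PRICE_PER_1K = Decimal("0.0032")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate cost from separately priced input and output token counts."""
    input_cost = Decimal(input_tokens) / 1000 * INPUT_PRICE_PER_1K
    output_cost = Decimal(output_tokens) / 1000 * OUTPUT_PRICE_PER_1K
    return input_cost + output_cost


# Two messages with the same total token count can cost different amounts,
# so a stored total alone would not be enough to recover the cost.
all_input = estimate_cost(2000, 0)
all_output = estimate_cost(0, 2000)
```

If a total is ever needed for reporting, it can be derived as input_tokens + output_tokens at query time rather than stored.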

Comment thread usaspending_api/llm/models/db_models.py Outdated
Comment on lines +93 to +107
class LLMSearchQuery(models.Model):
    user_query = models.TextField()
    session = models.ForeignKey(Session, on_delete=models.CASCADE, related_name="search_queries")
    created_at = models.DateTimeField(auto_now_add=True)

    def __str__(self):
        preview = self.user_query[:75] + "..." if len(self.user_query) > 75 else self.user_query
        return f"Query {self.id}: {preview}"

    class Meta:
        db_table = "llm_search_query"
        indexes = [
            models.Index(fields=["-created_at"]),
        ]

I found that this model is redundant because the first message in the session will be a user message that includes the user query, so I would recommend that we do not keep this model.
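As a rough sketch of why the separate model may be unnecessary, the snippet below recovers the original query from a session's messages. It uses plain dataclasses rather than the Django models in this PR, and the `Message` class, its `role` and `content` fields, and the `initial_user_query` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    # Hypothetical shape of a stored chat message.
    role: str  # e.g. "user" or "assistant"
    content: str


@dataclass
class Session:
    # Messages are assumed to be kept in chronological order.
    messages: list[Message] = field(default_factory=list)


def initial_user_query(session: Session) -> Optional[str]:
    """Return the content of the first user message, or None if there is none."""
    for message in session.messages:
        if message.role == "user":
            return message.content
    return None


session = Session(
    messages=[
        Message(role="user", content="grants awarded in Ohio last year"),
        Message(role="assistant", content="Here are the matching filters."),
    ]
)
query = initial_user_query(session)
```

With real Django models the same lookup would presumably be a filtered, ordered query on the session's messages, but that depends on field names not shown in this thread.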

    selectedRecipientLocations: dict[str, Any] = Field(default_factory=dict)
    awardType: list[str] = Field(default_factory=list)
    selectedAwardIDs: dict[str, Any] = Field(default_factory=dict)
    awardAmounts: dict[str, list[int]] = Field(default_factory=dict)

I added this to the POC branch to give the LLM more context and understanding of the awardAmounts field.

Suggested change
    awardAmounts: dict[str, list[int]] = Field(default_factory=dict)
    awardAmounts: dict[str, list[int | None]] = Field(
        default_factory=dict,
        description=(
            "Dictionary of award amount ranges for filtering. "
            "Each value is a two-element list: [min_amount, max_amount]. "
            "Use `None` for unbounded ranges.\n\n"
            "TWO MUTUALLY EXCLUSIVE MODES:\n\n"
            "MODE 1 - STANDARD RANGES (can select multiple):\n"
            "- 'range-0': [None, 1000000] - Awards up to $1M\n"
            "- 'range-1': [1000000, 25000000] - Awards $1M to $25M\n"
            "- 'range-2': [25000000, 100000000] - Awards $25M to $100M\n"
            "- 'range-3': [100000000, 500000000] - Awards $100M to $500M\n"
            "- 'range-4': [500000000, None] - Awards over $500M\n\n"
            "MODE 2 - SPECIFIC RANGE (must be alone):\n"
            "- 'specific': [min, max] - Specify exact dollar amounts\n\n"
            "CRITICAL RULES:\n"
            "1. You can use multiple standard ranges together (range-0 through range-4)\n"
            "2. You can use ONE specific range with specific min/max values\n"
            "3. NEVER mix standard ranges with specific range\n"
            "4. When using 'specific', it must be the ONLY key in the dictionary"
        ),
        json_schema_extra={
            "examples": [
                # Example 1: Multiple standard ranges
                {"range-0": [None, 1000000], "range-2": [25000000, 100000000]},
                # Example 2: Single standard range
                {"range-3": [100000000, 500000000]},
                # Example 3: Custom range with both bounds
                {"specific": [5000000, 50000000]},
                # Example 4: Custom range unbounded above
                {"specific": [10000000, None]},
                # Example 5: Custom range unbounded below
                {"specific": [None, 75000000]},
            ]
        },
    )
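Since the description only asks the LLM to follow these rules, it might also be worth checking them in code. Below is a minimal sketch of such a check in plain Python, independent of the schema class above. The function name `validate_award_amounts` is hypothetical, and the min-not-greater-than-max check is an extra assumption that the description does not state.

```python
from typing import Optional

STANDARD_KEYS = {f"range-{i}" for i in range(5)}  # range-0 through range-4
SPECIFIC_KEY = "specific"


def validate_award_amounts(award_amounts: dict[str, list[Optional[int]]]) -> list[str]:
    """Return a list of rule violations; an empty list means the value is valid."""
    errors = []
    keys = set(award_amounts)

    # Only the five standard ranges and 'specific' are recognized keys.
    unknown = keys - STANDARD_KEYS - {SPECIFIC_KEY}
    if unknown:
        errors.append(f"unknown keys: {sorted(unknown)}")

    # Rules 3 and 4: 'specific' must not be mixed with any other key.
    if SPECIFIC_KEY in keys and len(keys) > 1:
        errors.append("'specific' must be the only key")

    for key, bounds in award_amounts.items():
        # Each value is a two-element list: [min_amount, max_amount].
        if len(bounds) != 2:
            errors.append(f"{key}: expected [min_amount, max_amount]")
            continue
        low, high = bounds
        # Assumption (not in the description): a bounded range needs min <= max.
        if low is not None and high is not None and low > high:
            errors.append(f"{key}: min exceeds max")
    return errors


ok_standard = validate_award_amounts(
    {"range-0": [None, 1000000], "range-2": [25000000, 100000000]}
)
mixed = validate_award_amounts(
    {"specific": [5000000, 50000000], "range-1": [1000000, 25000000]}
)
```

The same logic could presumably live in a validator on the schema itself, so that a response mixing the two modes is rejected before it reaches the search API.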


Can we add the migrations file to this PR?

2 participants