Commit 2d7736d

Assistant improvements: Table annotations, and few-shot examples (#664)
* Table annotations
* Few-shot prompt examples
* View Assistant history
* Better 'relevant table' detection and UI
* Improved prompts
* Cmd+shift+F shortcut for formatting SQL
1 parent 84c2eef commit 2d7736d

43 files changed

Lines changed: 1532 additions & 294 deletions

HISTORY.rst

Lines changed: 40 additions & 7 deletions
@@ -7,13 +7,44 @@ This project adheres to `Semantic Versioning <https://semver.org/>`_.
 
 vNext
 ===========================
-* `#660`_: Userspace connection migration. This should be an invisible change, but represents a significant refactor of how connections function.
-  Instead of a weird blend of DatabaseConnection models and underlying Django models (which were the original Explorer connections),
-  this migrates all connections to DatabaseConnection models and implements proper foreign keys to them on the Query and QueryLog models.
-  A data migration creates new DatabaseConnection models based on the configured settings.EXPLORER_CONNECTIONS.
-  Going forward, admins can create new Django-backed DatabaseConnection models by registering the connection in EXPLORER_CONNECTIONS, and then creating a
-  DatabaseConnection model using the Django admin or the user-facing /connections/new/ form, and entering the Django DB alias and setting the connection type to "Django Connection"
-
+* `#664`_: Improvements to the AI SQL Assistant:
+
+  - Table Annotations: Write persistent table annotations with descriptive information that will get injected into the
+    prompt for the assistant. For example, if a table is commonly joined to another table through a non-obvious foreign
+    key, you can tell the assistant about it in plain English, as an annotation to that table. Every time that table is
+    deemed 'relevant' to an assistant request, that annotation will be included alongside the schema and sample data.
+  - Few-Shot Examples: Using the small checkbox on the bottom-right of any saved queries, you can designate certain
+    queries as 'few-shot examples'. When making an assistant request, any designated few-shot examples that reference
+    the same tables as your assistant request will get included as 'reference sql' in the prompt for the LLM.
+  - Autocomplete / multiselect when selecting tables info to send to the SQL Assistant. Much easier and more keyboard
+    focused.
+  - Relevant tables are added client-side visually, in real time, based on what's in the SQL editor and/or any tables
+    mentioned in the assistant request. The dependency on sql_metadata is therefore removed, as server-side SQL parsing
+    is no longer necessary.
+  - Ability to view Assistant request/response history.
+  - Improved system prompt that emphasizes the particular SQL dialect being used.
+  - Addresses issue #657.
+
+* `#660`_: Userspace connection migration.
+
+  - This should be an invisible change, but represents a significant refactor of how connections function. Instead of a
+    weird blend of DatabaseConnection models and underlying Django models (which were the original Explorer
+    connections), this migrates all connections to DatabaseConnection models and implements proper foreign keys to them
+    on the Query and QueryLog models. A data migration creates new DatabaseConnection models based on the configured
+    settings.EXPLORER_CONNECTIONS. Going forward, admins can create new Django-backed DatabaseConnection models by
+    registering the connection in EXPLORER_CONNECTIONS, and then creating a DatabaseConnection model using the Django
+    admin or the user-facing /connections/new/ form, and entering the Django DB alias and setting the connection type
+    to "Django Connection".
+  - The Query.connection and QueryLog.connection fields are deprecated and will be removed in a future release. They
+    are kept around in this release in case there is an unforeseen issue with the migration. Preserving the fields for
+    now ensures there is no data loss in the event that a rollback to an earlier version is required.
+
+* Fixed a bug when validating connections to uploaded files. Also added basic locking when downloading files from S3.
+
+* Keyboard shortcut for formatting the SQL in the editor.
+
+  - Cmd+Shift+F (Windows: Ctrl+Shift+F)
+  - The format button has been moved to be a small icon towards the bottom-right of the SQL editor.
 
 `5.2.0`_ (2024-08-19)
 ===========================
@@ -643,6 +674,8 @@ Initial Release
 .. _#651: https://github.com/explorerhq/sql-explorer/pull/651
 .. _#659: https://github.com/explorerhq/sql-explorer/pull/659
 .. _#662: https://github.com/explorerhq/sql-explorer/pull/662
+.. _#660: https://github.com/explorerhq/sql-explorer/pull/660
+.. _#664: https://github.com/explorerhq/sql-explorer/pull/664
 
 .. _#269: https://github.com/explorerhq/sql-explorer/issues/269
 .. _#288: https://github.com/explorerhq/sql-explorer/issues/288
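
The changelog entry says relevant tables are now detected client-side from the SQL editor contents and the assistant request, with no server-side SQL parsing. The commit does this in the browser; the Python sketch below only illustrates the general idea of matching known table names as whole identifiers. The function name `find_relevant_tables` and the matching rule are assumptions for illustration, not the project's implementation.

```python
import re


def find_relevant_tables(known_tables, sql_text, request_text=""):
    """Return the known table names mentioned in the SQL or the request, sorted."""
    haystack = f"{sql_text}\n{request_text}".lower()
    found = set()
    for table in known_tables:
        # Match the name as a whole identifier so "orders" does not match "order_items".
        pattern = r"(?<!\w)" + re.escape(table.lower()) + r"(?!\w)"
        if re.search(pattern, haystack):
            found.add(table)
    return sorted(found)


tables = ["orders", "order_items", "users"]
relevant = find_relevant_tables(
    tables,
    "SELECT * FROM orders o JOIN users u ON u.id = o.user_id",
    "how many order_items per user?",
)
print(relevant)
```

A plain name match like this needs no SQL parser, which is consistent with the note that the sql_metadata dependency could be dropped, but it can produce false positives when a table name is also a common word.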

docs/features.rst

Lines changed: 14 additions & 3 deletions
@@ -5,7 +5,19 @@ SQL Assistant
 -------------
 - Built in integration with OpenAI (or the LLM of your choosing)
   to quickly get help with your query, with relevant schema
-  automatically injected into the prompt. Simple, effective.
+  automatically injected into the prompt.
+- The assistant tries hard to get relevant context into the prompt to the LLM, alongside your explicit request. You
+  can choose tables to include explicitly (and any tables you reference in your SQL will be included as
+  well). When a table is "included", the prompt will include the schema of the table, 3 sample rows, any Table
+  Annotations you have added, and any designated "few shot examples". More on each of those below.
+- Table Annotations: Write persistent table annotations with descriptive information that will get injected into the
+  prompt for the assistant. For example, if a table is commonly joined to another table through a non-obvious foreign
+  key, you can tell the assistant about it in plain English, as an annotation to that table. Every time that table is
+  deemed 'relevant' to an assistant request, that annotation will be included alongside the schema and sample data.
+- Few-shot examples: Using the small checkbox on the bottom-right of any saved query, you can designate queries as
+  "Assistant Examples". When making an assistant request, the 'included tables' are intersected with tables referenced
+  by designated Example queries, and those queries are injected into the prompt, and the LLM is told that these
+  are good reference queries.
 
 Database Support
 ----------------
@@ -222,8 +234,7 @@ Power tips
   view.
 - Command+Enter and Ctrl+Enter will execute a query when typing in
   the SQL editor area.
-- Hit the "Format" button to format and clean up your SQL (this is
-  non-validating -- just formatting).
+- Cmd+Shift+F (Windows: Ctrl+Shift+F) to format the SQL in the editor.
 - Use the Query Logs feature to share one-time queries that aren't
   worth creating a persistent query for. Just run your SQL in the
   playground, then navigate to ``/logs`` and share the link
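
The features text describes few-shot selection as an intersection between the 'included tables' and the tables referenced by designated example queries. A minimal sketch of that selection and of assembling the extra prompt context follows. `SavedQuery`, `select_few_shot_examples`, and `build_context` are hypothetical names, and schema and sample rows are left out for brevity; only the `few_shot` flag name comes from this commit.

```python
from dataclasses import dataclass, field


@dataclass
class SavedQuery:
    title: str
    sql: str
    tables: set = field(default_factory=set)  # tables the query references
    few_shot: bool = False                    # designated as an assistant example


def select_few_shot_examples(queries, included_tables):
    """Keep designated examples that share at least one table with the request."""
    included = set(included_tables)
    return [q for q in queries if q.few_shot and q.tables & included]


def build_context(included_tables, annotations, examples):
    """Join table annotations and reference queries into one prompt fragment."""
    parts = []
    for table in included_tables:
        note = annotations.get(table)
        if note:
            parts.append(f"Annotation for {table}: {note}")
    for q in examples:
        parts.append(f"Reference SQL ({q.title}):\n{q.sql}")
    return "\n\n".join(parts)


queries = [
    SavedQuery("Orders per day", "SELECT date, COUNT(*) FROM orders GROUP BY date", {"orders"}, True),
    SavedQuery("All users", "SELECT * FROM users", {"users"}, True),
    SavedQuery("Scratch", "SELECT 1 FROM orders", {"orders"}, False),
]
examples = select_few_shot_examples(queries, ["orders"])
print(build_context(["orders"], {"orders": "Joins to users on buyer_ref"}, examples))
```

Only "Orders per day" is selected here: "All users" touches no included table, and "Scratch" is not flagged as an example.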

explorer/admin.py

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
 
 @admin.register(Query)
 class QueryAdmin(admin.ModelAdmin):
-    list_display = ("title", "description", "created_by_user",)
+    list_display = ("title", "description", "created_by_user", "few_shot")
     list_filter = ("title",)
     raw_id_fields = ("created_by_user",)
     actions = [generate_report_action()]

explorer/app_settings.py

Lines changed: 6 additions & 0 deletions
@@ -152,11 +152,17 @@
 EXPLORER_AI_API_KEY = getattr(settings, "EXPLORER_AI_API_KEY", None)
 
 EXPLORER_ASSISTANT_BASE_URL = getattr(settings, "EXPLORER_ASSISTANT_BASE_URL", "https://api.openai.com/v1")
+
+# Deprecated. Will be removed in a future release. Please use EXPLORER_ASSISTANT_MODEL_NAME instead
 EXPLORER_ASSISTANT_MODEL = getattr(settings, "EXPLORER_ASSISTANT_MODEL",
                                    # Return the model name and max_tokens it supports
                                    {"name": "gpt-4o",
                                     "max_tokens": 128000})
 
+EXPLORER_ASSISTANT_MODEL_NAME = getattr(settings, "EXPLORER_ASSISTANT_MODEL_NAME",
+                                        EXPLORER_ASSISTANT_MODEL["name"])
+
+
 EXPLORER_DB_CONNECTIONS_ENABLED = getattr(settings, "EXPLORER_DB_CONNECTIONS_ENABLED", False)
 EXPLORER_USER_UPLOADS_ENABLED = getattr(settings, "EXPLORER_USER_UPLOADS_ENABLED", False)
 EXPLORER_PRUNE_LOCAL_UPLOAD_COPY_DAYS_INACTIVITY = getattr(settings,
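
The hunk above makes the new EXPLORER_ASSISTANT_MODEL_NAME setting fall back to the "name" key of the deprecated EXPLORER_ASSISTANT_MODEL dict. The sketch below reproduces that lookup order against a stand-in settings object, so the three cases can be checked without a Django project. `resolve_model_name` and the use of `SimpleNamespace` are assumptions for illustration.

```python
from types import SimpleNamespace

DEFAULT_MODEL = {"name": "gpt-4o", "max_tokens": 128000}


def resolve_model_name(settings):
    """Prefer the new setting, then the deprecated dict, then the built-in default."""
    legacy = getattr(settings, "EXPLORER_ASSISTANT_MODEL", DEFAULT_MODEL)
    return getattr(settings, "EXPLORER_ASSISTANT_MODEL_NAME", legacy["name"])


# Nothing configured: the built-in default applies.
unset = resolve_model_name(SimpleNamespace())

# Only the deprecated dict is configured: its "name" is still honored.
legacy_only = resolve_model_name(
    SimpleNamespace(EXPLORER_ASSISTANT_MODEL={"name": "legacy-model", "max_tokens": 8000})
)

# Both configured: the new setting wins.
both = resolve_model_name(
    SimpleNamespace(
        EXPLORER_ASSISTANT_MODEL={"name": "legacy-model", "max_tokens": 8000},
        EXPLORER_ASSISTANT_MODEL_NAME="new-model",
    )
)
print(unset, legacy_only, both)
```

Existing deployments that only set the old dict keep working under this ordering, which fits the deprecation note in the diff.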

explorer/assistant/forms.py

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+from django import forms
+from explorer.assistant.models import TableDescription
+from explorer.ee.db_connections.utils import default_db_connection
+
+
+class TableDescriptionForm(forms.ModelForm):
+    class Meta:
+        model = TableDescription
+        fields = "__all__"
+        widgets = {
+            "database_connection": forms.Select(attrs={"class": "form-select"}),
+            "description": forms.Textarea(attrs={"class": "form-control", "rows": 3}),
+        }
+
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        if not self.instance.pk:  # Check if this is a new instance
+            # Set the default value for database_connection
+            self.fields["database_connection"].initial = default_db_connection()
+
+        if self.instance and self.instance.table_name:
+            choices = [(self.instance.table_name, self.instance.table_name)]
+        else:
+            choices = []
+
+        f = forms.ChoiceField(
+            choices=choices,
+            widget=forms.Select(attrs={"class": "form-select", "data-placeholder": "Select table"})
+        )
+
+        # We don't actually care about validating the 'choices' that the ChoiceField does by default.
+        # Really we are just using that field type in order to get a valid pre-populated Select widget on the client
+        # But also it can't be blank!
+        def valid_value_new(v):
+            return bool(v)
+
+        f.valid_value = valid_value_new
+
+        self.fields["table_name"] = f
+
+        if self.instance and self.instance.table_name:
+            self.fields["table_name"].initial = self.instance.table_name

explorer/assistant/models.py

Lines changed: 17 additions & 0 deletions
@@ -1,5 +1,6 @@
 from django.db import models
 from django.conf import settings
+from explorer.ee.db_connections.models import DatabaseConnection
 
 
 class PromptLog(models.Model):
@@ -8,6 +9,7 @@ class Meta:
         app_label = "explorer"
 
     prompt = models.TextField(blank=True)
+    user_request = models.TextField(blank=True)
     response = models.TextField(blank=True)
     run_by_user = models.ForeignKey(
         settings.AUTH_USER_MODEL,
@@ -19,3 +21,18 @@ class Meta:
     duration = models.FloatField(blank=True, null=True)  # seconds
     model = models.CharField(blank=True, max_length=128, default="")
     error = models.TextField(blank=True, null=True)
+    database_connection = models.ForeignKey(to=DatabaseConnection, on_delete=models.SET_NULL, blank=True, null=True)
+
+
+class TableDescription(models.Model):
+
+    class Meta:
+        app_label = "explorer"
+        unique_together = ("database_connection", "table_name")
+
+    database_connection = models.ForeignKey(to=DatabaseConnection, on_delete=models.CASCADE)
+    table_name = models.CharField(max_length=512)
+    description = models.TextField()
+
+    def __str__(self):
+        return f"{self.database_connection.alias} - {self.table_name}"
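
The model changes above combine three rules: PromptLog rows keep existing with a null connection when their DatabaseConnection is deleted (SET_NULL), TableDescription rows are deleted along with it (CASCADE), and a connection may hold only one description per table name (unique_together). The sketch below reproduces those rules with plain SQLite constraints so the effect is visible without a Django project. The table and column names are simplified stand-ins. Django normally applies on_delete in the ORM rather than through database constraints, so this shows the resulting behavior and not the mechanism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE database_connection (id INTEGER PRIMARY KEY, alias TEXT);
CREATE TABLE prompt_log (
    id INTEGER PRIMARY KEY,
    database_connection_id INTEGER
        REFERENCES database_connection(id) ON DELETE SET NULL
);
CREATE TABLE table_description (
    id INTEGER PRIMARY KEY,
    database_connection_id INTEGER NOT NULL
        REFERENCES database_connection(id) ON DELETE CASCADE,
    table_name TEXT NOT NULL,
    description TEXT NOT NULL,
    UNIQUE (database_connection_id, table_name)
);
""")

conn.execute("INSERT INTO database_connection (id, alias) VALUES (1, 'default')")
conn.execute("INSERT INTO prompt_log (id, database_connection_id) VALUES (1, 1)")
conn.execute(
    "INSERT INTO table_description (database_connection_id, table_name, description) "
    "VALUES (1, 'orders', 'One row per order')"
)

# A second annotation for the same (connection, table) pair violates the unique rule.
try:
    conn.execute(
        "INSERT INTO table_description (database_connection_id, table_name, description) "
        "VALUES (1, 'orders', 'duplicate')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting the connection nulls the log's reference and removes the annotation.
conn.execute("DELETE FROM database_connection WHERE id = 1")
remaining_logs = conn.execute("SELECT database_connection_id FROM prompt_log").fetchall()
remaining_descriptions = conn.execute("SELECT COUNT(*) FROM table_description").fetchone()[0]
print(duplicate_rejected, remaining_logs, remaining_descriptions)
```

The split looks deliberate: prompt history stays useful as an audit trail after a connection is removed, while an annotation has no meaning without the connection it describes.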

explorer/assistant/urls.py

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+from django.urls import path
+from explorer.assistant.views import (TableDescriptionListView,
+                                      TableDescriptionCreateView,
+                                      TableDescriptionUpdateView,
+                                      TableDescriptionDeleteView,
+                                      AssistantHelpView,
+                                      AssistantHistoryApiView)
+
+assistant_urls = [
+    path("assistant/", AssistantHelpView.as_view(), name="assistant"),
+    path("assistant/history/", AssistantHistoryApiView.as_view(), name="assistant_history"),
+    path("table-descriptions/", TableDescriptionListView.as_view(), name="table_description_list"),
+    path("table-descriptions/new/", TableDescriptionCreateView.as_view(), name="table_description_create"),
+    path("table-descriptions/<int:pk>/update/", TableDescriptionUpdateView.as_view(), name="table_description_update"),
+    path("table-descriptions/<int:pk>/delete/", TableDescriptionDeleteView.as_view(), name="table_description_delete"),
+]
