SafeQuery Explained
The StringFormat class offers compile-time safety with interpolation-style string formatting, using named placeholders.
Do not use it to build SQL queries, however: interpolating untrusted input into a query string can lead to SQL injection attacks. For example:
private static final StringFormat FIND_USER_BY_ID =
    new StringFormat("SELECT * FROM Users WHERE user_id = '{user_id}'");
...
String query = FIND_USER_BY_ID.format(userIdInput);
If userIdInput comes from an untrusted source, it can be used to steal information about other users. An attacker can send a string like "' OR user_id = 'victim", which turns the condition into user_id = '' OR user_id = 'victim' and exposes all information about the victim.
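The attack can be reproduced with nothing but the JDK. The sketch below is a hypothetical stand-in for naive interpolation (the class InjectionDemo and the method naiveFormat are made up for illustration and are not part of Mug); it substitutes the value verbatim, which is the behavior that makes the query injectable.

```java
// Hypothetical demo class; it mimics unescaped interpolation using only the JDK.
final class InjectionDemo {
  private static final String TEMPLATE =
      "SELECT * FROM Users WHERE user_id = '{user_id}'";

  // Substitutes the value verbatim, with no escaping at all.
  static String naiveFormat(String userId) {
    return TEMPLATE.replace("{user_id}", userId);
  }

  public static void main(String[] args) {
    // A benign input yields the intended query.
    System.out.println(naiveFormat("alice"));
    // The malicious input closes the string literal and appends a second condition.
    System.out.println(naiveFormat("' OR user_id = 'victim"));
  }
}
```

The second call prints SELECT * FROM Users WHERE user_id = '' OR user_id = 'victim', a query that matches the victim's row rather than the caller's.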
Instead, consider using the com.google.mu.safesql package:
private static final StringFormat.To<SafeQuery> FIND_USER_BY_ID =
    SafeQuery.template("SELECT * FROM Users WHERE user_id = '{user_id}'");
...
SafeQuery query = FIND_USER_BY_ID.with(userIdInput);
Benefits provided by the safesql package:
- SafeQuery automatically escapes special characters to prevent injection attacks.
- You can interpolate not only literal values, but also tables, columns and even sub-queries.
- The same set of compile-time checks ensures that you can't make human errors (like passing the user password as the user id).
- Extra SQL-aware compile-time checks ensure that your SQL template is sane.
- You can compose smaller SafeQuery objects to create larger SafeQuery objects, making it easier to manage complex queries.
- The GoogleSql class provides extra GoogleSQL conversions such as translating java.time.Instant to GoogleSQL's TIMESTAMP() function. You can use GoogleSql to generate SafeQuery objects for e.g. BigQuery.
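To make the first benefit concrete, here is a minimal sketch of what escaping a string value can look like. It is not SafeQuery's implementation: the class EscapeSketch and its methods are hypothetical, and the rule it applies (prefixing backslashes and single quotes with a backslash) is an assumption that fits dialects with backslash escapes in string literals, such as GoogleSQL.

```java
// Hypothetical illustration of literal escaping; not the library's actual code.
final class EscapeSketch {
  // Prefixes every backslash and single quote with a backslash,
  // so the value cannot terminate the surrounding string literal.
  static String escapeSingleQuoted(String value) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      if (c == '\\' || c == '\'') {
        out.append('\\');
      }
      out.append(c);
    }
    return out.toString();
  }

  // Builds the query with the escaped value placed inside the quotes.
  static String findUserById(String userId) {
    return "SELECT * FROM Users WHERE user_id = '" + escapeSingleQuoted(userId) + "'";
  }
}
```

With this escaping, the input ' OR user_id = 'victim stays inside a single string literal, so it is compared against user_id as plain text instead of changing the structure of the query.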
If you are using Maven, make sure to put the mug-errorprone artifact in your annotationProcessorPaths in order to get the compile-time guardrails.
Then refer to the javadoc of SafeQuery and GoogleSql for the specifics of the API.
Overall, there are a few best practices and examples worth mentioning.
Create a client library that wraps access to the underlying database. Its interface should accept SafeQuery as the input instead of a raw String, so that the protection is enforced by the type system. Internally, the client library can call SafeQuery#toString() to get the wrapped query, but callers of the client library must all pass in SafeQuery. Example:
class BigQueryClient {
  public BigQueryResponse sendQuery(SafeQuery query) {
    // ...
  }
}
...
// a caller that calls BigQuery
private static final StringFormat.To<SafeQuery> JOBS_TIMELINE_BY_PROJECT_ID =
    GoogleSql.template(
        """
        SELECT period_start, period_slot_ms
        FROM INFORMATION_SCHEMA.JOBS_TIMELINE
        WHERE period_start BETWEEN {start_time} AND {end_time}
          AND project_id = '{project_id}'
        """);
private final BigQueryClient bigQuery;
// ...
Instant startTime = ...;
Instant endTime = ...;
String projectId = ...;
var response = bigQuery.sendQuery(
    JOBS_TIMELINE_BY_PROJECT_ID.with(startTime, endTime, projectId));
If the query expects a string literal, quote the placeholder in the template, like '{project_id}' in the example above. An unquoted string placeholder fails to compile, and even if you manage to fool the compile-time checks, SafeQuery will throw IllegalArgumentException.
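The quoting rule can be pictured as a runtime check. The following is a rough sketch under stated assumptions, not the library's code: QuotedPlaceholderCheck and fillStringLiteral are hypothetical names, the template is assumed to contain the placeholder once, and the escaping rule (backslash before backslashes and single quotes) is an assumption.

```java
// Hypothetical sketch of "a string placeholder must be quoted"; not part of Mug.
final class QuotedPlaceholderCheck {
  // Fills {name} with a string value, but only when the template wraps it in single quotes.
  static String fillStringLiteral(String template, String name, String value) {
    String placeholder = "{" + name + "}";
    int start = template.indexOf(placeholder);
    if (start < 0) {
      throw new IllegalArgumentException("Template has no placeholder " + placeholder);
    }
    int end = start + placeholder.length();
    boolean quoted =
        start > 0
            && template.charAt(start - 1) == '\''
            && end < template.length()
            && template.charAt(end) == '\'';
    if (!quoted) {
      throw new IllegalArgumentException(
          "String placeholder " + placeholder + " must be written as '" + placeholder + "'");
    }
    // Escape backslashes first, then single quotes, so the value stays inside the literal.
    String escaped = value.replace("\\", "\\\\").replace("'", "\\'");
    return template.substring(0, start) + escaped + template.substring(end);
  }
}
```

Rejecting the unquoted form outright, rather than silently adding quotes, keeps the template readable as SQL: what the author wrote is what the database sees, apart from the escaped value.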
To parameterize by table name, column names etc., use backticks to quote around the placeholder (or else you get a compile-time error).
For example:
private static final StringFormat.To<SafeQuery> SLOT_MS_PER_PERIOD =
    GoogleSql.template(
        """
        SELECT sum(period_slot_ms) as total_slot_ms FROM `{timeline_table}`
        GROUP BY period_start
        """);
// ...
SafeQuery query = SLOT_MS_PER_PERIOD.with("JOBS_TIMELINE_BY_PROJECT");
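Backtick-quoted placeholders need a different kind of protection from string literals, because an identifier that contains a backtick could close the quoting early. One common approach is allowlist validation, sketched below. This is an assumption about how such a check could work, not a description of what SafeQuery does; IdentifierCheck, backtickQuote and the allowed character set are all hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical allowlist check for identifiers placed inside backticks; not part of Mug.
final class IdentifierCheck {
  // Letters, digits, underscores and dots only (an assumed, deliberately strict set).
  private static final Pattern SAFE_IDENTIFIER = Pattern.compile("[A-Za-z0-9_.]+");

  // Returns the identifier wrapped in backticks, or throws if it contains anything unexpected.
  static String backtickQuote(String identifier) {
    if (!SAFE_IDENTIFIER.matcher(identifier).matches()) {
      throw new IllegalArgumentException("Illegal identifier: " + identifier);
    }
    return "`" + identifier + "`";
  }
}
```

For example, backtickQuote("JOBS_TIMELINE_BY_PROJECT") returns the name wrapped in backticks, while an input such as "t` WHERE 1=1 --" is rejected with IllegalArgumentException.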