Skip to content

Joins#763

Open
fogelito wants to merge 315 commits intomainfrom
joins8
Open

Joins#763
fogelito wants to merge 315 commits intomainfrom
joins8

Conversation

@fogelito
Copy link
Copy Markdown
Contributor

@fogelito fogelito commented Nov 20, 2025

Summary by CodeRabbit

  • New Features

    • Added support for database joins (inner, left, and right joins) enabling queries across related collections.
    • Introduced context-aware query processing for improved permission and relationship handling.
  • Refactor

    • Restructured query API to use context-driven parameters, affecting how selections and filters are specified.
    • Redesigned database adapter interface signatures for more flexible query parameter composition.

Comment on lines +698 to +712
if ($options['relationType'] === Database::RELATION_ONE_TO_ONE && $options['twoWay'] === false && $options['side'] === Database::RELATION_SIDE_CHILD) {
throw new \Exception('Cannot query on virtual relationship attribute');
}

if ($options['relationType'] === Database::RELATION_ONE_TO_MANY && $options['side'] === Database::RELATION_SIDE_PARENT) {
throw new \Exception('Cannot query on virtual relationship attribute');
}

if ($options['relationType'] === Database::RELATION_MANY_TO_ONE && $options['side'] === Database::RELATION_SIDE_CHILD) {
throw new \Exception('Cannot query on virtual relationship attribute');
}

if ($options['relationType'] === Database::RELATION_MANY_TO_MANY) {
throw new \Exception('Cannot query on virtual relationship attribute');
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is possible now

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are still in the main branch in Filter.php class 🤔

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Apr 6, 2026

Greptile Summary

This PR adds SQL JOIN support (inner, left, right) to utopia-php/database. It introduces QueryContext to carry multi-collection state through query processing, refactors find/count/getDocument adapter signatures to accept pre-partitioned query arrays, and rewrites casting()/decode() to resolve attributes through alias-aware select context. The new Validator/Queries/Queries.php replaces the old Documents/IndexedQueries hierarchy with a unified validator that handles joins, relation-equality, and fulltext-index validation in one place.

Key issues found:

  • A commented-out removeNullKeys call in Mongo::createDocument() (line 1277) causes null-valued fields to be stored explicitly in MongoDB, silently breaking isNull/isNotNull and $exists query semantics for all newly inserted documents
  • SQL::getAttributeProjection() returns an empty string when all provided selects are $collection (always skipped), producing a SQL syntax error — the validator allows select('$collection') so this is reachable
  • The new casting() creates a fresh Document and silently drops any attribute not found in a collection schema, changing the previous behavior of preserving unknown keys
  • filterDocumentsBySelections now only auto-preserves $id, stripping $sequence, $createdAt, $updatedAt, and $permissions from nested relationship documents when explicit selects are used
  • IndexedQueries.php and Queries/Documents.php should be deleted rather than left as large comment blocks

Confidence Score: 3/5

Not safe to merge yet — two P1 runtime bugs (null key insertion in MongoDB, potential SQL syntax error from empty projection) must be fixed first

The join architecture is well-designed and has good test coverage (JoinsTests.php). However, two concrete runtime bugs were identified: the commented-out removeNullKeys in the Mongo adapter will silently corrupt document storage semantics for all new inserts, and getAttributeProjection can produce invalid SQL for a valid user input. Additionally, casting() now silently drops unrecognised attributes which is a breaking behavioral change that needs explicit documentation or a fix.

src/Database/Adapter/Mongo.php (line 1277: commented-out removeNullKeys), src/Database/Adapter/SQL.php (getAttributeProjection empty string fallback missing), src/Database/Database.php (casting() attribute dropping, filterDocumentsBySelections change)

Important Files Changed

Filename Overview
src/Database/Adapter/Mongo.php Adapts find/count/getDocument to new QueryContext API; critical: removeNullKeys commented out on line 1277, causing null values to be persisted in all new MongoDB documents
src/Database/Adapter/SQL.php Rewrites find/count/getDocument to use QueryContext and per-query selects; getAttributeProjection returns empty string when all selects are $collection, producing invalid SQL
src/Database/Database.php Adds join support via QueryContext; casting() now creates new document silently dropping unrecognised attributes; filterDocumentsBySelections drops all internal attributes except $id
src/Database/QueryContext.php New context class holding multi-collection state (aliases, per-collection auth-skip flags) used throughout find/count/decode
src/Database/Query.php Adds join query types (innerJoin, leftJoin, rightJoin), alias/attributeRight/as properties, and helper extraction methods
src/Database/Validator/Queries/Queries.php New unified query validator replacing Documents/IndexedQueries; correctly validates fulltext indexes, join queries, relation equality, and alias constraints
src/Database/Validator/IndexedQueries.php Entire class body commented out — file should be deleted to avoid confusion about whether the class is still active
src/Database/Validator/Queries/Documents.php Entire class body commented out — file should be deleted
src/Database/Adapter/Pool.php Forwarding adapter updated for new find/count signatures and getSupportForJoins delegation
tests/e2e/Adapter/Scopes/JoinsTests.php New comprehensive test suite for join queries across collections with permission, alias, and multi-join scenarios
src/Database/Validator/Alias.php New validator for table alias names used in join and select queries
src/Database/Validator/AsQuery.php New validator for 'as' alias in select queries; correctly rejects wildcards with aliases

Comments Outside Diff (1)

  1. src/Database/Database.php, line 5493-5511 (link)

    P2 filterDocumentsBySelections no longer preserves internal $-prefixed attributes on nested documents

    The old code kept every internal attribute plus any $-prefixed key even when not explicitly selected:

    // OLD
    foreach ($internalKeys as $key) { $attributesToKeep[$key] = true; }
    if (!isset($attributesToKeep[$attrKey]) && !\str_starts_with($attrKey, '$')) {
        $doc->removeAttribute($attrKey);
    }

    The new code only preserves $id:

    // NEW
    $attributesToKeep['$id'] = true;
    if (!isset($attributesToKeep[$attrKey])) {
        $doc->removeAttribute($attrKey);
    }

    Nested documents in relationship population will now have $sequence, $createdAt, $updatedAt, and $permissions stripped unless explicitly selected. Please confirm that populateDocumentsRelationships and any cursor logic operating on nested results do not depend on those attributes being present.

Reviews (1): Last reviewed commit: "Merge branch 'main' of github.com:utopia..." | Re-trigger Greptile

Comment on lines +1277 to +1278
//$record = $this->removeNullKeys($record);
$result = $this->insertDocument($name, $record, $options);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Commented-out removeNullKeys left in production code

The removeNullKeys call was commented out, meaning MongoDB documents are now inserted with null-value keys stored explicitly:

//$record = $this->removeNullKeys($record);
$result = $this->insertDocument($name, $record, $options);

This is a significant behavioral change:

  1. isNull() / isNotNull() queries may now match (or stop matching) documents that previously behaved differently — in MongoDB a stored null key is distinct from a missing key.
  2. $exists / $notExists semantics are affected for the same reason.
  3. Document size grows for every null-valued attribute in every newly inserted document.

This looks like debug code accidentally left in. If the intent is to stop stripping nulls, the line should be deleted (not commented out) with an explanatory comment. If it was accidentally removed, restore it:

Suggested change
//$record = $this->removeNullKeys($record);
$result = $this->insertDocument($name, $record, $options);
$record = $this->removeNullKeys($record);
$result = $this->insertDocument($name, $record, $options);

Comment on lines +2411 to +2441
$string = '';
foreach ($selects as $select) {
if ($select->getAttribute() === '$collection') {
continue;
}

$selections = \array_diff($selections, [...$internalKeys, '$collection']);
$alias = $select->getAlias();
$alias = $this->filter($alias);

foreach ($internalKeys as $internalKey) {
$selections[] = $this->getInternalKeyForAttribute($internalKey);
}
$attribute = $this->getInternalKeyForAttribute($select->getAttribute());


if ($attribute !== '*') {
$attribute = $this->filter($attribute);
$attribute = $this->quote($attribute);
}

$as = $select->getAs();

if (!empty($as)) {
$as = ' as '.$this->quote($this->filter($as));
}

$projections = [];
foreach ($selections as $selection) {
$filteredSelection = $this->filter($selection);
$quotedSelection = $this->quote($filteredSelection);
$projections[] = "{$this->quote($prefix)}.{$quotedSelection}";
if (!empty($string)) {
$string .= ', ';
}

$string .= "{$this->quote($alias)}.{$attribute}{$as}";
}

return \implode(',', $projections);
return $string;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 getAttributeProjection() can return an empty string, producing invalid SQL

The method skips $collection selects but will return '' if all provided selects resolve to $collection:

$string = '';
foreach ($selects as $select) {
    if ($select->getAttribute() === '$collection') {
        continue; // all entries skipped
    }
    $string .= ...;
}
return $string; // '' when all selects were '$collection'

When called with [Query::select('$collection')], the generated SQL becomes SELECT FROM \table` AS `main` …— a syntax error that throws an unhandledPDOException. The validator permits select('$collection')via thevalidateAttributeExist` early-return, so this path is reachable by callers.

Add a fallback after the loop:

Suggested change
$string = '';
foreach ($selects as $select) {
if ($select->getAttribute() === '$collection') {
continue;
}
$selections = \array_diff($selections, [...$internalKeys, '$collection']);
$alias = $select->getAlias();
$alias = $this->filter($alias);
foreach ($internalKeys as $internalKey) {
$selections[] = $this->getInternalKeyForAttribute($internalKey);
}
$attribute = $this->getInternalKeyForAttribute($select->getAttribute());
if ($attribute !== '*') {
$attribute = $this->filter($attribute);
$attribute = $this->quote($attribute);
}
$as = $select->getAs();
if (!empty($as)) {
$as = ' as '.$this->quote($this->filter($as));
}
$projections = [];
foreach ($selections as $selection) {
$filteredSelection = $this->filter($selection);
$quotedSelection = $this->quote($filteredSelection);
$projections[] = "{$this->quote($prefix)}.{$quotedSelection}";
if (!empty($string)) {
$string .= ', ';
}
$string .= "{$this->quote($alias)}.{$attribute}{$as}";
}
return \implode(',', $projections);
return $string;
if (empty($string)) {
return $this->quote(Query::DEFAULT_ALIAS).'.*';
}
return $string;

Comment on lines 8959 to +8963

$attributes = $collection->getAttribute('attributes', []);
$internals = [];
$schema = [];

foreach ($this->getInternalAttributes() as $attribute) {
$attributes[] = $attribute;
foreach (Database::INTERNAL_ATTRIBUTES as $attribute) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 casting() silently drops all attributes not found in any collection schema

The rewritten casting() creates a brand-new Document and skips any key it cannot resolve against $internals or $schema:

$attribute = $internals[$key] ?? null;
if (is_null($attribute)) {
    $attribute = $schema[$collection->getId()][$this->adapter->filter($key)] ?? null;
}
if (is_null($attribute)) {
    continue; // silently dropped from the new document
}

The old implementation mutated the original document, preserving unknown keys. Because getSupportForCasting() is true for all SQL adapters, any runtime-injected attribute or join-sourced column that cannot be resolved through the select-alias lookup is now silently lost before decode() runs.

If this is intentional (all join columns are guaranteed to be matched via the $selects loop), please add a comment making that contract explicit and add a test that covers a column not present in any schema. Otherwise, consider preserving unresolved attributes similar to how decode() handles schemaless databases.

Comment on lines 8408 to +8410

if (!empty($cursor)) {
foreach ($orderAttributes as $order) {
if ($cursor->getAttribute($order) === null) {
foreach ($orders as $order) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Misleading comment contradicts the actual code behaviour

$orders[] = Query::orderAsc(); // In joins we should not add a default order, we should validate when using a cursor we should have a unique order

The comment says joins should not get a default order, but Query::orderAsc() (which defaults to $sequence) is unconditionally added for all queries, including joins. Either:

  • Remove the comment if a default $sequence order is intentional for joins too, or
  • Guard with if (empty($joins)) so join queries behave differently, and document what cursor users must supply instead.

Comment on lines 1 to +10
<?php

namespace Utopia\Database\Validator;

use Exception;
use Utopia\Database\Database;
use Utopia\Database\Document;
use Utopia\Database\Query;
use Utopia\Database\Validator\Query\Base;

class IndexedQueries extends Queries
{
/**
* @var array<Document>
*/
protected array $attributes = [];

/**
* @var array<Document>
*/
protected array $indexes = [];

/**
* Expression constructor
*
* This Queries Validator filters indexes for only available indexes
*
* @param array<Document> $attributes
* @param array<Document> $indexes
* @param array<Base> $validators
* @throws Exception
*/
public function __construct(array $attributes = [], array $indexes = [], array $validators = [])
{
$this->attributes = $attributes;

$this->indexes[] = new Document([
'type' => Database::INDEX_UNIQUE,
'attributes' => ['$id']
]);

$this->indexes[] = new Document([
'type' => Database::INDEX_KEY,
'attributes' => ['$createdAt']
]);

$this->indexes[] = new Document([
'type' => Database::INDEX_KEY,
'attributes' => ['$updatedAt']
]);

foreach ($indexes as $index) {
$this->indexes[] = $index;
}

parent::__construct($validators);
}

/**
* Count vector queries across entire query tree
*
* @param array<Query> $queries
* @return int
*/
private function countVectorQueries(array $queries): int
{
$count = 0;

foreach ($queries as $query) {
if (in_array($query->getMethod(), Query::VECTOR_TYPES)) {
$count++;
}

if ($query->isNested()) {
$count += $this->countVectorQueries($query->getValues());
}
}

return $count;
}

/**
* @param mixed $value
* @return bool
* @throws Exception
*/
public function isValid($value): bool
{
if (!parent::isValid($value)) {
return false;
}
$queries = [];
foreach ($value as $query) {
if (! $query instanceof Query) {
try {
$query = Query::parse($query);
} catch (\Throwable $e) {
$this->message = 'Invalid query: '.$e->getMessage();

return false;
}
}

if ($query->isNested()) {
if (! self::isValid($query->getValues())) {
return false;
}
}

$queries[] = $query;
}

$vectorQueryCount = $this->countVectorQueries($queries);
if ($vectorQueryCount > 1) {
$this->message = 'Cannot use multiple vector queries in a single request';
return false;
}

$grouped = Query::groupByType($queries);
$filters = $grouped['filters'];

foreach ($filters as $filter) {
if (
$filter->getMethod() === Query::TYPE_SEARCH ||
$filter->getMethod() === Query::TYPE_NOT_SEARCH
) {
$matched = false;

foreach ($this->indexes as $index) {
if (
$index->getAttribute('type') === Database::INDEX_FULLTEXT
&& $index->getAttribute('attributes') === [$filter->getAttribute()]
) {
$matched = true;
}
}

if (!$matched) {
$this->message = "Searching by attribute \"{$filter->getAttribute()}\" requires a fulltext index.";
return false;
}
}
}

return true;
}
}
//
//namespace Utopia\Database\Validator;
//
//use Exception;
//use Utopia\Database\Database;
//use Utopia\Database\Document;
//use Utopia\Database\Query;
//use Utopia\Database\Validator\Query\Base;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Commented-out class bodies should be deleted, not left as dead code

Both IndexedQueries.php and Queries/Documents.php now consist entirely of commented-out class bodies (148 and ~80 lines respectively). This creates confusion about whether these classes are still partially active, temporarily disabled, or permanently superseded.

If Queries/Queries.php is the permanent replacement, please delete these files so git history cleanly records the removal. If they need to stay temporarily, add a @deprecated docblock at the top of each file pointing to the replacement class.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants