Commit b914765

Merge pull request #2728 from dolthub/nicktobey/dolt_json_diff
Document DOLT_JSON_DIFF table function
2 parents 092be84 + af53fa1 commit b914765

1 file changed

Lines changed: 118 additions & 0 deletions

packages/dolt/content/reference/sql/version-control/dolt-sql-functions.md

@@ -634,6 +634,124 @@ With result of single row:
+-----------------+---------------+-----------+-------------+---------------+
```

## `DOLT_JSON_DIFF()`

The `DOLT_JSON_DIFF()` table function returns a summary of the changes between two JSON documents.

### Options

```sql
DOLT_JSON_DIFF(<from_document>, <to_document>)
```

The `DOLT_JSON_DIFF()` table function takes two arguments:

- `from_document`: the document for the start of the diff. This
  argument is required. This may be a value from a JSON column, or a string that can
  be converted to JSON.
- `to_document`: the document for the end of the diff. This
  argument is required. This may be a value from a JSON column, or a string that can
  be converted to JSON.
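
Because both arguments accept strings that can be converted to JSON, the function can presumably also be called directly on literals, without reading from a table. A minimal sketch, in which both documents are hypothetical and the output is omitted rather than guessed:

```sql
-- Illustrative only: both documents are made-up string literals.
-- The selected columns are the ones listed in the Schema section.
SELECT diff_type, path, from_value, to_value
FROM DOLT_JSON_DIFF('{"a": 1, "b": 2}', '{"a": 1, "b": 3}');
```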

### Schema

```text
+-----------------+---------+
| field           | type    |
+-----------------+---------+
| diff_type       | TEXT    |
| path            | TEXT    |
| from_value      | JSON    |
| to_value        | JSON    |
+-----------------+---------+
```

### Example

Suppose we start with a table `inventory` in a database on the `main` branch.

Here is the schema of `inventory` at the tip of `main`:

```text
+----------+-------------+------+-----+---------+-------+
| Field    | Type        | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+-------+
| pk       | int         | NO   | PRI | NULL    |       |
| name     | varchar(50) | YES  |     | NULL    |       |
| metadata | json        | YES  |     | NULL    |       |
+----------+-------------+------+-----+---------+-------+
```

And here's the initial state of `inventory` at the tip of `main`:

```text
+----+-------+----------------------------------------------------------------+
| pk | name  | metadata                                                       |
+----+-------+----------------------------------------------------------------+
| 1  | shirt | {"colors": ["red"] }                                           |
| 2  | shoes | {"colors": ["black"], "size": "small" }                        |
| 3  | pants | {"colors": ["blue", "beige"], "materials": ["denim", "silk"] } |
| 4  | tie   | {"colours": ["red"], "clip-on": true }                         |
+----+-------+----------------------------------------------------------------+
```

We then make (but don't stage) several changes, resulting in a working set that looks like this:

```text
+----+-------+-------------------------------------------------------------+
| pk | name  | metadata                                                    |
+----+-------+-------------------------------------------------------------+
| 1  | shirt | {"colors": ["red", "blue"], "types": ["tee", "hawaiian"] }  |
| 2  | shoes | {"colors": ["white"], "size": "medium" }                    |
| 3  | pants | {"colors": ["blue"] }                                       |
| 4  | tie   | {"colors": ["red"], "clip-on": false }                      |
+----+-------+-------------------------------------------------------------+
```

We added values to the "shirt" document, edited data in the "shoes" document, deleted data from the "pants" document, and renamed a key in the "tie" document.

If we want to get a list of every unstaged change made to any value in the `metadata` column, we can combine the `DOLT_JSON_DIFF()` table function with the `DOLT_WORKSPACE_inventory` table, via a lateral join:

```sql
SELECT
    to_pk AS pk,
    to_name AS name,
    json_diff.diff_type AS json_diff_type,
    row_diff.from_metadata,
    row_diff.to_metadata,
    path,
    json_diff.from_value,
    json_diff.to_value
FROM
    DOLT_WORKSPACE_inventory AS row_diff
JOIN
    LATERAL (SELECT * FROM DOLT_JSON_DIFF(from_metadata, to_metadata)) json_diff
WHERE row_diff.diff_type = 'modified' AND row_diff.staged = false;
```
732+
The results of the query provide a summary of only the parts of the JSON documents that have changed between the staged version and the working set:
733+
734+
```
735+
+----+-------+----------------+----------------------------------------------------------+------------------------------------------------------+-------------+------------------+--------------------+
736+
| pk | name | json_diff_type | from_metadata | to_metadata | path | from_value | to_value |
737+
+----+-------+----------------+----------------------------------------------------------+------------------------------------------------------+-------------+------------------+--------------------+
738+
| 0 | shirt | added | {"colors":["red"]} | {"colors":["red","blue"],"types":["tee","hawaiian"]} | $.colors[1] | NULL | "blue" |
739+
| 0 | shirt | added | {"colors":["red"]} | {"colors":["red","blue"],"types":["tee","hawaiian"]} | $.types | NULL | ["tee","hawaiian"] |
740+
| 1 | shoes | modified | {"colors":["black"],"size":"small"} | {"colors":["white"],"size":"medium"} | $.colors[0] | "black" | "white" |
741+
| 1 | shoes | modified | {"colors":["black"],"size":"small"} | {"colors":["white"],"size":"medium"} | $.size | "small" | "medium" |
742+
| 2 | pants | removed | {"colors":["blue","beige"],"materials":["denim","silk"]} | {"colors":["blue"]} | $.colors[1] | "beige" | NULL |
743+
| 2 | pants | removed | {"colors":["blue","beige"],"materials":["denim","silk"]} | {"colors":["blue"]} | $.materials | ["denim","silk"] | NULL |
744+
| 3 | tie | modified | {"clip-on":true,"colours":["red"]} | {"clip-on":false,"colors":["red"]} | $.clip-on | true | false |
745+
| 3 | tie | added | {"clip-on":true,"colours":["red"]} | {"clip-on":false,"colors":["red"]} | $.colors | NULL | ["red"] |
746+
| 3 | tie | removed | {"clip-on":true,"colours":["red"]} | {"clip-on":false,"colors":["red"]} | $.colours | ["red"] | NULL |
747+
+----+-------+----------------+----------------------------------------------------------+------------------------------------------------------+-------------+------------------+--------------------+
748+
```

Note how multiple changes in a single row of the `inventory` table are rendered as multiple rows in the result. When multiple keys in the same object have changed, `DOLT_JSON_DIFF` reports an individual diff for each key, instead of reporting a single diff for the entire object.

Arrays are diffed by considering each index of the array separately. This means that inserting or removing values
in an array anywhere other than the end will shift the indexes of each element, and will be reported as a modification at each index where the value changed.
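
To see the index-by-index behavior in isolation, one could diff two array literals where the first element has been removed. This is a sketch with hypothetical documents. Going only by the rule described above, one would expect modifications at `$[0]` and `$[1]` plus a removal at `$[2]`, rather than a single removal at `$[0]`. That expectation is inferred from the description, not from captured output.

```sql
-- Illustrative only: hypothetical documents.
-- The first element of a three-element array is removed,
-- so every remaining element shifts down by one index.
SELECT diff_type, path, from_value, to_value
FROM DOLT_JSON_DIFF('["a", "b", "c"]', '["b", "c"]');
```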

## `DOLT_LOG()`

The `DOLT_LOG` table function gets the commit log for all commits reachable from the
