@@ -634,6 +634,124 @@ With result of single row:
634634+-----------------+---------------+-----------+-------------+---------------+
635635```
636636
637+ ## ` DOLT_JSON_DIFF() `
638+
639+ The ` DOLT_JSON_DIFF() ` table function returns a summary of the changes between two JSON documents.
640+
641+ ### Options
642+
643+ ``` sql
644+ DOLT_JSON_DIFF(<from_document>, <to_document>)
645+ ```
646+
647+ The ` DOLT_JSON_DIFF() ` table function takes two arguments:
648+
649+ - ` from_document ` — the document for the start of the diff. This
650+ argument is required. This may be a value from a JSON column, or a string that can
651+ be converted to JSON.
652+ - ` to_document ` — the document for the end of the diff. This
653+ argument is required. This may be a value from a JSON column, or a string that can
654+ be converted to JSON.
655+
656+ ### Schema
657+
658+ ``` text
659+ +-----------------+---------+
660+ | field | type |
661+ +-----------------+---------+
662+ | diff_type | TEXT |
663+ | path | TEXT |
664+ | from_value | JSON |
665+ | to_value | JSON |
666+ +-----------------+---------+
667+ ```
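
As a minimal sketch of the call shape, both arguments can be passed as string literals, assuming the function accepts them directly as the argument descriptions above suggest. The two documents here are hypothetical and exist only for illustration.

```sql
-- Hypothetical literals: each string is converted to JSON before diffing.
-- The selected columns are the four listed in the schema above.
SELECT diff_type, path, from_value, to_value
FROM DOLT_JSON_DIFF('{"size": "small"}', '{"size": "medium"}');
```

Since only one key differs between the two documents, this should produce a single row describing the change at that key.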
668+
669+ ### Example
670+
671+ Suppose we start with a table ` inventory ` in a database on the ` main ` branch.
672+
673+ Here is the schema of ` inventory ` at the tip of ` main ` :
674+
675+ ``` text
676+ +----------+-------------+------+-----+---------+-------+
677+ | Field | Type | Null | Key | Default | Extra |
678+ +----------+-------------+------+-----+---------+-------+
679+ | pk | int | NO | PRI | NULL | |
680+ | name | varchar(50) | YES | | NULL | |
681+ | metadata | json | YES | | NULL | |
682+ +----------+-------------+------+-----+---------+-------+
683+ ```
684+
685+ And here's the initial state of ` inventory ` at the tip of ` main ` :
686+
687+ ``` text
688+ +----+-------+----------------------------------------------------------------+
689+ | pk | name | metadata |
690+ +----+-------+----------------------------------------------------------------+
691+ | 1 | shirt | {"colors": ["red"] } |
692+ | 2 | shoes | {"colors": ["black"], "size": "small" } |
693+ | 3 | pants | {"colors": ["blue", "beige"], "materials": ["denim", "silk"] } |
694+ | 4 | tie | {"colours": ["red"], "clip-on": true } |
695+ +----+-------+----------------------------------------------------------------+
696+ ```
697+
698+ We then create (but don't stage) a number of different changes, resulting in a working set that looks like this:
699+
700+ ``` text
701+ +----+-------+-------------------------------------------------------------+
702+ | pk | name | metadata |
703+ +----+-------+-------------------------------------------------------------+
704+ | 1 | shirt | {"colors": ["red", "blue"], "types": ["tee", "hawaiian"] } |
705+ | 2 | shoes | {"colors": ["white"], "size": "medium" } |
706+ | 3 | pants | {"colors": ["blue"] } |
707+ | 4 | tie | { "colors": ["red"], "clip-on": false } |
708+ +----+-------+-------------------------------------------------------------+
709+ ```
710+
711+ We added values to the "shirt" document, edited data in the "shoes" document, deleted data from the "pants" document, and renamed a key in the "tie" document.
712+
713+ If we want to get a list of every unstaged change made to any value in the ` metadata ` column, we can combine the ` DOLT_JSON_DIFF() ` table function with the ` DOLT_WORKSPACE_inventory ` table, via a lateral join:
714+
715+ ``` sql
716+ SELECT
717+ to_pk AS pk,
718+ to_name AS name,
719+ json_diff.diff_type AS json_diff_type,
720+ row_diff.from_metadata,
721+ row_diff.to_metadata,
722+ json_diff.path,
723+ json_diff.from_value,
724+ json_diff.to_value
725+ FROM
726+ DOLT_WORKSPACE_inventory AS row_diff
727+ JOIN
728+ LATERAL (SELECT * FROM DOLT_JSON_DIFF(from_metadata, to_metadata)) json_diff
729+ WHERE row_diff.diff_type = 'modified' AND row_diff.staged = false;
730+ ```
731+
732+ The results of the query provide a summary of only the parts of the JSON documents that have changed between the staged version and the working set:
733+
734+ ``` text
735+ +----+-------+----------------+----------------------------------------------------------+------------------------------------------------------+-------------+------------------+--------------------+
736+ | pk | name | json_diff_type | from_metadata | to_metadata | path | from_value | to_value |
737+ +----+-------+----------------+----------------------------------------------------------+------------------------------------------------------+-------------+------------------+--------------------+
738+ | 1 | shirt | added | {"colors":["red"]} | {"colors":["red","blue"],"types":["tee","hawaiian"]} | $.colors[1] | NULL | "blue" |
739+ | 1 | shirt | added | {"colors":["red"]} | {"colors":["red","blue"],"types":["tee","hawaiian"]} | $.types | NULL | ["tee","hawaiian"] |
740+ | 2 | shoes | modified | {"colors":["black"],"size":"small"} | {"colors":["white"],"size":"medium"} | $.colors[0] | "black" | "white" |
741+ | 2 | shoes | modified | {"colors":["black"],"size":"small"} | {"colors":["white"],"size":"medium"} | $.size | "small" | "medium" |
742+ | 3 | pants | removed | {"colors":["blue","beige"],"materials":["denim","silk"]} | {"colors":["blue"]} | $.colors[1] | "beige" | NULL |
743+ | 3 | pants | removed | {"colors":["blue","beige"],"materials":["denim","silk"]} | {"colors":["blue"]} | $.materials | ["denim","silk"] | NULL |
744+ | 4 | tie | modified | {"clip-on":true,"colours":["red"]} | {"clip-on":false,"colors":["red"]} | $.clip-on | true | false |
745+ | 4 | tie | added | {"clip-on":true,"colours":["red"]} | {"clip-on":false,"colors":["red"]} | $.colors | NULL | ["red"] |
746+ | 4 | tie | removed | {"clip-on":true,"colours":["red"]} | {"clip-on":false,"colors":["red"]} | $.colours | ["red"] | NULL |
747+ +----+-------+----------------+----------------------------------------------------------+------------------------------------------------------+-------------+------------------+--------------------+
748+ ```
749+
750+ Note how multiple changes in a single row of the ` inventory ` table are rendered as multiple rows in the result. When multiple keys in the same object have changed, ` DOLT_JSON_DIFF ` reports an individual diff for each key, instead of reporting a single diff for the entire object.
751+
752+ Arrays are diffed by considering each index of the array separately. This means that inserting or removing values
753+ in an array anywhere other than the end shifts the index of every subsequent element, and is reported as a modification at each index where the value changed.
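
To make the index-shift behavior concrete, here is a small sketch that removes the first element of an array. The literals are hypothetical, and the call assumes the function accepts top-level JSON arrays passed as strings.

```sql
-- Removing "a" moves "b" and "c" down by one index.
-- Each index is compared separately, so the shift is visible in the diff.
SELECT diff_type, path, from_value, to_value
FROM DOLT_JSON_DIFF('["a", "b", "c"]', '["b", "c"]');
```

Going by the description above, one would expect the first two indexes to be reported as modified and the last index as removed, rather than a single removal of `"a"`. Appending to or truncating the end of an array avoids this, since earlier indexes keep their values.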
754+
637755## ` DOLT_LOG() `
638756
639757The ` DOLT_LOG ` table function gets the commit log for all commits reachable from the