Use dolt JSON encoding #2639

Merged
zachmu merged 82 commits into main from zachmu/json-enc on Jun 5, 2026

@zachmu zachmu commented Apr 24, 2026

Member

This gives us the storage, merge, and perf benefits for JSON in doltgres.

This change also alters the behavior of ORDER BY on JSONB columns. Postgres stores JSON documents in a particular b-tree order so that it can perform its lookups efficiently. Dolt's method for efficient JSON retrieval is quite different, and has nothing to do with the order in which documents are stored in a primary index.
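To make the ordering point concrete, here is a minimal sketch of how two different total orders over JSON documents can disagree on a comparison such as `j > '{"p":1}'`. `pg_like_key` is a simplified model of the type ranking that the PostgreSQL documentation gives for jsonb (Object > Array > Boolean > Number > String > null, with larger containers sorting after smaller ones). `text_key` is a purely hypothetical alternative that compares serialized text; it is not Dolt's actual ordering, which this PR does not describe. All names and sample documents are illustrative.

```python
import json


def pg_like_key(doc):
    """Simplified sort key modeled on PostgreSQL's documented jsonb ordering.

    Ties inside containers are broken by serialized text, which is a
    simplification and not what PostgreSQL really does.
    """
    if doc is None:
        return (0, 0, "")
    if isinstance(doc, str):
        return (1, 0, doc)
    if isinstance(doc, bool):  # checked before numbers: bool is an int subclass
        return (3, 0, int(doc))
    if isinstance(doc, (int, float)):
        return (2, 0, doc)
    if isinstance(doc, list):
        return (4, len(doc), json.dumps(doc))
    if isinstance(doc, dict):
        return (5, len(doc), json.dumps(doc, sort_keys=True))
    raise TypeError(f"unsupported JSON value: {type(doc).__name__}")


def text_key(doc):
    """Hypothetical alternative order: compare the serialized document text."""
    return json.dumps(doc, sort_keys=True)


def count_greater(docs, pivot, key):
    """Count documents that sort strictly after the pivot under the given key."""
    pivot_key = key(pivot)
    return sum(1 for d in docs if key(d) > pivot_key)


# Illustrative documents, not taken from the testjsonb table.
DOCS = [{"a": 1, "b": 2}, {"z": 1}, [1, 2, 3], "hello", 5, None]
PIVOT = {"p": 1}

if __name__ == "__main__":
    print("pg-like order:", count_greater(DOCS, PIVOT, pg_like_key))
    print("text order:   ", count_greater(DOCS, PIVOT, text_key))
```

On this sample the two orders return different counts for the same predicate, so queries that depend on JSONB ordering or range comparison should not be expected to match Postgres row for row.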

github-actions Bot commented Apr 24, 2026

Contributor

| Benchmark | Main | PR | Change |
| --- | --- | --- | --- |
| covering_index_scan_postgres | 1954.13/s | 1989.03/s | +1.7% |
| groupby_scan_postgres | 132.37/s | 131.79/s | -0.5% |
| index_join_postgres | 678.80/s | 680.55/s | +0.2% |
| index_join_scan_postgres | 865.95/s | 864.35/s | -0.2% |
| index_scan_postgres | 24.45/s | 24.31/s | -0.6% |
| oltp_delete_insert_postgres | 843.40/s | 883.26/s | +4.7% |
| oltp_insert | 726.93/s | 742.02/s | +2.0% |
| oltp_point_select | 3094.35/s | 3029.66/s | -2.1% |
| oltp_read_only | 3085.28/s | 3002.08/s | -2.7% |
| oltp_read_write | 2375.67/s | 2391.43/s | +0.6% |
| oltp_update_index | 773.37/s | 764.51/s | -1.2% |
| oltp_update_non_index | 816.50/s | 814.36/s | -0.3% |
| oltp_write_only | 1807.27/s | 1820.56/s | +0.7% |
| select_random_points | 1975.47/s | 1960.74/s | -0.8% |
| select_random_ranges | 1158.28/s | 1151.96/s | -0.6% |
| table_scan_postgres | 23.72/s | 23.74/s | 0.0% |
| types_delete_insert_postgres | 811.56/s | 790.67/s | -2.6% |
| types_table_scan_postgres | 10.35/s | ${\color{red}8.49/s}$ | ${\color{red}-18.0\%}$ |
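The change column appears to be the relative difference of the PR throughput against the main-branch baseline. The short sketch below recomputes two rows under that assumption; the function name `percent_change` is illustrative and is not taken from the benchmark tooling.

```python
def percent_change(main, pr):
    """Relative change of the PR result against the main baseline, in percent."""
    return (pr - main) / main * 100.0


# Throughput figures (queries per second) copied from the benchmark results.
RESULTS = {
    "oltp_delete_insert_postgres": (843.40, 883.26),
    "types_table_scan_postgres": (10.35, 8.49),
}

if __name__ == "__main__":
    for name, (main, pr) in RESULTS.items():
        print(f"{name}: {percent_change(main, pr):+.1f}%")
```

Both rows reproduce the reported figures (+4.7% and -18.0%), which supports reading the flagged types_table_scan_postgres result as a slowdown of roughly 18% relative to main.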

@zachmu zachmu requested a review from nicktobey April 24, 2026 23:30
github-actions Bot commented Apr 24, 2026

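Contributor

| | Main | PR |
| --- | --- | --- |
| Total | 42090 | 42090 |
| Successful | 18218 | 18242 |
| Failures | 23872 | 23848 |
| Partial Successes¹ | 5303 | 5328 |

| | Main | PR |
| --- | --- | --- |
| Successful | 43.2834% | 43.3405% |
| Failures | 56.7166% | 56.6595% |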
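As a quick consistency check on these totals, the sketch below recomputes the success rates and compares the net gain in successful tests with the regression and progression counts reported in this comment (7 and 31). The variable and function names are illustrative only.

```python
TOTAL = 42090
MAIN_SUCCESS, PR_SUCCESS = 18218, 18242
REGRESSIONS, PROGRESSIONS = 7, 31


def rate(count, total):
    """Share of tests in percent."""
    return count / total * 100.0


if __name__ == "__main__":
    print(f"main: {rate(MAIN_SUCCESS, TOTAL):.4f}%")
    print(f"PR:   {rate(PR_SUCCESS, TOTAL):.4f}%")
    # The net gain in successes should equal progressions minus regressions.
    print("net gain:", PR_SUCCESS - MAIN_SUCCESS)
    print("progressions - regressions:", PROGRESSIONS - REGRESSIONS)
```

The net gain of 24 successful tests matches 31 progressions minus 7 regressions.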

${\color{red}Regressions (7)}$

json

QUERY:          select '{"a": {"b":{"c": "foo"}}}'::json #>> array['a', ''];
RECEIVED ERROR: row sets differ:
    Postgres:
        {�}
    Doltgres:
        {"{\"b\": {\"c\": \"foo\"}}"}

json_encoding

QUERY:          SELECT '"\u0000"'::json;
RECEIVED ERROR: row sets differ:
    Postgres:
        {"�"}
    Doltgres:
        {"\"�\""}

jsonb

QUERY:          select '{"a": {"b":{"c": "foo"}}}'::jsonb #> array['a', ''];
RECEIVED ERROR: row sets differ:
    Postgres:
        {�}
    Doltgres:
        {[123 34 98 34 58 123 34 99 34 58 34 102 111 111 34 125 125]}
QUERY:          select '{"a": {"b":{"c": "foo"}}}'::jsonb #>> array['a', ''];
RECEIVED ERROR: row sets differ:
    Postgres:
        {�}
    Doltgres:
        {"{\"b\": {\"c\": \"foo\"}}"}
QUERY:          SELECT count(*) FROM testjsonb WHERE j > '{"p":1}';
RECEIVED ERROR: row sets differ:
    Postgres:
        {884}
    Doltgres:
        {894}
QUERY:          select '12345.0000000000000000000000000000000000000000000005'::jsonb::numeric;
RECEIVED ERROR: row sets differ:
    Postgres:
        {"12345.0000000000000000000000000000000000000000000005"}
    Doltgres:
        {"12345"}
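Two of the jsonb regressions above are easier to read with a little decoding. The sketch below (all names are illustrative) first turns the decimal byte dump from the `#>` result back into text, then shows how a long numeric literal collapses to `12345` if it passes through a 64-bit float at any point. The PR does not state the cause of the numeric difference, so the float64 explanation is an assumption, not a diagnosis.

```python
import json
from decimal import Decimal

# Byte values copied from the Doltgres result of the `#>` regression.
BYTE_DUMP = [123, 34, 98, 34, 58, 123, 34, 99, 34, 58,
             34, 102, 111, 111, 34, 125, 125]

# Literal copied from the `::jsonb::numeric` regression.
LITERAL = "12345.0000000000000000000000000000000000000000000005"


def decode_byte_dump(values):
    """Interpret a list of decimal byte values as UTF-8 text."""
    return bytes(values).decode("utf-8")


def parse_as_float(text):
    """Parse a JSON number the default way, as a binary float64."""
    return json.loads(text)


def parse_as_decimal(text):
    """Parse a JSON number while keeping every digit of the literal."""
    return json.loads(text, parse_float=Decimal)


if __name__ == "__main__":
    print(decode_byte_dump(BYTE_DUMP))
    print(parse_as_float(LITERAL))
    print(parse_as_decimal(LITERAL))
```

The byte dump decodes to `{"b":{"c":"foo"}}`, the sub-document at key `a`. The float parse yields `12345.0`, while the decimal parse keeps the full literal, which is the behavior the Postgres result shows.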

random

QUERY:          (SELECT unique1 AS random
  FROM onek ORDER BY random() LIMIT 1)
INTERSECT
(SELECT unique1 AS random
  FROM onek ORDER BY random() LIMIT 1)
INTERSECT
(SELECT unique1 AS random
  FROM onek ORDER BY random() LIMIT 1);
RECEIVED ERROR: expected row count 0 but received 1

${\color{lightgreen}Progressions (31)}$

json

QUERY: SELECT test_json -> 'x'
FROM test_json
WHERE json_type = 'scalar';
QUERY: SELECT test_json -> 'x'
FROM test_json
WHERE json_type = 'array';
QUERY: SELECT test_json -> 'x'
FROM test_json
WHERE json_type = 'object';
QUERY: SELECT test_json -> 2
FROM test_json
WHERE json_type = 'scalar';
QUERY: SELECT test_json -> 2
FROM test_json
WHERE json_type = 'object';
QUERY: select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> 1;
QUERY: select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> -1;
QUERY: select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> 'z';
QUERY: select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json -> '';
QUERY: select '[{"b": "c"}, {"b": "cc"}]'::json -> 3;
QUERY: select '[{"b": "c"}, {"b": "cc"}]'::json -> 'z';
QUERY: select '"foo"'::json -> 1;
QUERY: select '"foo"'::json -> 'z';
QUERY: select '{"a": {"b":{"c": "foo"}}}'::json #> array['a', null];
QUERY: select '{"a": {"b":{"c": "foo"}}}'::json #> array['a','b','c','d'];
QUERY: select '{"a": {"b":{"c": "foo"}}}'::json #> array['a','z','c'];
QUERY: select '{"a": [{"b": "c"}, {"b": "cc"}]}'::json #> array['a','z','b'];
QUERY: select '[{"b": "c"}, {"b": "cc"}]'::json #> array['z','b'];
QUERY: select '"foo"'::json #> array['z'];
QUERY: select '42'::json #> array['f2'];
QUERY: select '42'::json #> array['0'];

json_encoding

QUERY: SELECT '"\u000g"'::json;
QUERY: SELECT '"\u000g"'::jsonb;
QUERY: SELECT jsonb '{ "a":  "dollar \\u0024 character" }' as not_an_escape;
QUERY: SELECT jsonb '{ "a":  "null \\u0000 escape" }' as not_an_escape;

jsonb

QUERY: SELECT count(distinct j) FROM testjsonb;
QUERY: SELECT '["a","b","c",[1,2],null]'::jsonb -> -6;
QUERY: SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,3}';
QUERY: SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-1}';
QUERY: SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-3}';
QUERY: SELECT '{"a":"b","c":[1,2,3]}'::jsonb #> '{c,-4}';

Footnotes

  1. These are tests that we're marking as Successful, however they do not match the expected output in some way. This is due to small differences, such as different wording on the error messages, or the column names being incorrect while the data itself is correct.

@fulghum fulghum left a comment

Contributor

LGTM!

Comment thread testing/go/issues_test.go Outdated
Co-authored-by: Jason Fulghum <jason.fulghum@gmail.com>
@zachmu zachmu merged commit 5bf6ad6 into main Jun 5, 2026
20 of 21 checks passed
@zachmu zachmu deleted the zachmu/json-enc branch June 5, 2026 23:08