fix incorrect scan position during bitmap index words scan (#13479)
PR #11377 fixed the bitmap index scan for the case where a concurrent insert updates a full
bitmap page. It resolved the following problem, in which a read on a bitmap index while an
insert is running in the backend may cause the bitmap scan to read the wrong TID:
1. Query on bitmap: a query starts and reads the bitmap pages up to PAGE_FULL, advancing
the next TID to fetch and releasing the lock after reading each page.
2. Concurrent insert: inserting a TID into PAGE_FULL expands compressed words into new
words, and some words are rearranged into PAGE_NEXT.
3. Query on bitmap: the query fetches PAGE_NEXT and expects its first TID to equal the
saved next TID. But PAGE_NEXT now contains words that used to belong to PAGE_FULL, so the
real next TID is less than the expected next TID. The scan keeps advancing from the wrong
TID, which leads to a wrong result.
That PR used the _bitmap_catchup_to_next_tid() function to adjust result->lastScanWordNo
to the correct position when a concurrent read/write caused words to be rearranged into the
next bitmap index page. It compared words->firstTid with result->nextTid to judge whether
such a rearrangement had happened. This is not entirely right.
This is because BMIterateResult can only store 16*1024=16384 TIDs, but BMBatchWords
can hold 3968 words by default. A word is 64 bits, so even in the worst case (no word is
compressed), BMBatchWords can hold 3968 * 64 = 253952 TIDs. So a BMBatchWords filled from a
PAGE_FULL must be scanned more than once, but BMBatchWords->firstTid is not updated on each
scan; it is only updated when new bitmap index pages are read.
Therefore, even in the absence of concurrent read/write, scanning the same BMBatchWords
multiple times leads to a wrong scan position and hence to wrong query results, as in #13446.
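The capacity mismatch described above can be checked with a few lines of C. This is only a sketch of the arithmetic: the constants are the values quoted in this message, while the macro and function names are illustrative and are not the identifiers used in the Greenplum source. It also assumes each scan fills BMIterateResult completely.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the commit message; names are hypothetical. */
#define ITER_RESULT_MAX_TIDS (16 * 1024) /* TIDs one BMIterateResult can store */
#define BATCH_MAX_WORDS      3968        /* default word capacity of BMBatchWords */
#define BITS_PER_WORD        64          /* one bit per TID in an uncompressed word */

/* TIDs covered by a full batch when no word is compressed (the minimum). */
static uint64_t batch_min_tids(void)
{
    return (uint64_t) BATCH_MAX_WORDS * BITS_PER_WORD;
}

/* Number of scans needed to drain `tids` TIDs at `per_scan` TIDs per scan. */
static uint64_t scans_needed(uint64_t tids, uint64_t per_scan)
{
    assert(per_scan > 0);
    return (tids + per_scan - 1) / per_scan; /* ceiling division */
}
```

Under these assumptions, `scans_needed(batch_min_tids(), ITER_RESULT_MAX_TIDS)` is 16, and already 16385 matching TIDs (one more than BMIterateResult holds) need two scans of the same batch, with firstTid unchanged in between.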
To summarize, we only need to check for the rearranged condition when a new bitmap index
page is read from disk; we should do nothing when we scan the same BMBatchWords again.
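The rule in the summary can be sketched as a small predicate. This is not the code of the patch: the struct types, the new_page_read flag and the function name are hypothetical, and the comparison firstTid < nextTid is inferred from the description above (after a rearrangement the real next TID is less than the expected one).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two fields the check looks at. */
typedef struct
{
    uint64_t first_tid; /* cf. words->firstTid: set only when a page is read */
} BatchSketch;

typedef struct
{
    uint64_t next_tid;  /* cf. result->nextTid: advances on every scan */
} ResultSketch;

/*
 * Decide whether the scan position must be caught up.
 * A re-scan of the same batch (new_page_read == false) never qualifies,
 * even though first_tid is then naturally behind next_tid.
 */
static bool needs_catchup(const BatchSketch *words,
                          const ResultSketch *result,
                          bool new_page_read)
{
    if (!new_page_read)
        return false;
    return words->first_tid < result->next_tid;
}
```

Without the new_page_read guard, the second scan of an unchanged batch would look exactly like a rearrangement, which is the situation reported in #13446.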
src/test/regress/expected/bitmap_index.out (20 additions & 0 deletions)
@@ -1088,3 +1088,23 @@ select * from bm_test where b = 1;
 
 -- clean up
 drop table bm_test;
+-- test the scenario that we need read the same batch words many times
+-- more detials can be found at https://github.com/greenplum-db/gpdb/issues/13446
+SET enable_seqscan = OFF;
+SET enable_bitmapscan = OFF;
+create table foo_13446(a int, b int);
+NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'a' as the Greenplum Database data distribution key for this table.
+HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
+create index idx_13446 on foo_13446 using bitmap(b);
+insert into foo_13446 select 1, 1 from generate_series(0, 16384);
+-- At current implementation, BMIterateResult can only store 16*1024=16384 TIDs,
+-- if we have 13685 TIDs to read, it must scan same batch words twice, that's what we want
src/test/regress/expected/bitmap_index_optimizer.out (20 additions & 0 deletions)
@@ -1099,3 +1099,23 @@ select * from bm_test where b = 1;
 
 -- clean up
 drop table bm_test;
+-- test the scenario that we need read the same batch words many times
+-- more detials can be found at https://github.com/greenplum-db/gpdb/issues/13446
+SET enable_seqscan = OFF;
+SET enable_bitmapscan = OFF;
+create table foo_13446(a int, b int);
+NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'a' as the Greenplum Database data distribution key for this table.
+HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
+create index idx_13446 on foo_13446 using bitmap(b);
+insert into foo_13446 select 1, 1 from generate_series(0, 16384);
+-- At current implementation, BMIterateResult can only store 16*1024=16384 TIDs,
+-- if we have 13685 TIDs to read, it must scan same batch words twice, that's what we want