Skip to content

Commit 8c74fd2

Browse files
MuhammadTahaNaveedZigzagAKjrgemignani
authored
Update PG16 prior to 1.7.0 part 2 (#2375)
* Add index on id columns (#2117) - Whenever a label will be created, indices on id columns will be created by default. In case of vertex, a unique index on id column will be created, which will also serve as a unique constraint. In case of edge, a non-unique index on start_id and end_id columns will be created. - This change is expected to improve the performance of queries that involve joins. From some performance tests, it was observed that the performance of queries improved alot. - Loader was updated to insert tuples in indices as well. This has caused to slow the loader down a bit, but it was necessary. - A bug related to command ids in cypher_delete executor was also fixed. * Fix possible memory and file descriptors leaks (#2258) - Used postgres memory allocation functions instead of standard ones. - Wrapped main loop of csv loader in PG_TRY block for better error handling. * Restrict age_load commands (#2274) This PR applies restrictions to the following age_load commands - load_labels_from_file() load_edges_from_file() They are now tied to a specific root directory and are required to have a specific file extension to eliminate any attempts to force them to access any other files. Nothing else has changed with the actual command formats or parameters, only that they work out of the /tmp/age directory and only access files with an extension of .csv. Added regression tests and updated the location of the csv files for those regression tests. modified: regress/expected/age_load.out modified: regress/sql/age_load.sql modified: src/backend/utils/load/age_load.c * Fix and improve index.sql regression test coverage (#2300) NOTE: This PR was created with AI tools and a human. - Remove unused copy command (leftover from deleted agload_test_graph test) - Replace broken Section 4 that referenced non-existent graph with comprehensive WHERE clause tests covering string, int, bool, and float properties with AND/OR/NOT operators - Add EXPLAIN tests to verify index usage: - Section 3: Validate GIN indices (load_city_gin_idx, load_country_gin_idx) show Bitmap Index Scan for property matching - Section 4: Validate all expression indices (city_country_code_idx, city_id_idx, city_west_coast_idx, country_life_exp_idx) show Index Scan for WHERE clause filtering All indices now have EXPLAIN verification confirming they are used as expected. modified: regress/expected/index.out modified: regress/sql/index.sql * Fix and improve index.sql addendum (#2301) NOTE: This PR was created with the help of AI tools and a human. Added additional requested regression tests - *EXPLAIN for pattern with WHERE clause *EXPLAIN for pattern with filters on both country and city modified: regress/expected/index.out modified: regress/sql/index.sql * Replace libcsv with pg COPY for csv loading (#2310) - Commit also adds permission checks - Resolves a critical memory spike issue on loading large file - Use pg's COPY infrastructure (BeginCopyFrom, NextCopyFromRawFields) for 64KB buffered CSV parsing instead of libcsv - Add byte based flush threshold (64KB) matching COPY behavior for memory safety - Use heap_multi_insert with BulkInsertState for optimized batch inserts - Add per batch memory context to prevent memory growth during large loads - Remove libcsv dependency (libcsv.c, csv.h) - Improves loading performance by 15-20% - No previous regression tests were impacted - Added regression tests for permissions/rls Assisted-by AI --------- Co-authored-by: Aleksey Konovkin <alkon2000@mail.ru> Co-authored-by: John Gemignani <jrgemignani@gmail.com>
1 parent ba8d6c4 commit 8c74fd2

24 files changed

Lines changed: 2042 additions & 1603 deletions

Makefile

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -69,7 +69,6 @@ OBJS = src/backend/age.o \
6969
src/backend/utils/load/ag_load_labels.o \
7070
src/backend/utils/load/ag_load_edges.o \
7171
src/backend/utils/load/age_load.o \
72-
src/backend/utils/load/libcsv.o \
7372
src/backend/utils/name_validation.o \
7473
src/backend/utils/ag_guc.o
7574

regress/expected/age_load.out

Lines changed: 232 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,9 @@
1616
* specific language governing permissions and limitations
1717
* under the License.
1818
*/
19-
\! cp -r regress/age_load/data regress/instance/data/age_load
19+
\! rm -rf /tmp/age/age_load
20+
\! mkdir -p /tmp/age
21+
\! cp -r regress/age_load/data /tmp/age/age_load
2022
LOAD 'age';
2123
SET search_path TO ag_catalog;
2224
-- Create a country using CREATE clause
@@ -43,13 +45,6 @@ SELECT load_labels_from_file('agload_test_graph', 'Country',
4345

4446
(1 row)
4547

46-
-- A temporary table should have been created with 54 ids; 1 from CREATE and 53 from file
47-
SELECT COUNT(*)=54 FROM "_agload_test_graph_ag_vertex_ids";
48-
?column?
49-
----------
50-
t
51-
(1 row)
52-
5348
-- Sequence should be equal to max entry id i.e. 248
5449
SELECT currval('agload_test_graph."Country_id_seq"')=248;
5550
?column?
@@ -74,13 +69,6 @@ NOTICE: VLabel "City" has been created
7469

7570
(1 row)
7671

77-
-- Temporary table should have 54+72485 rows now
78-
SELECT COUNT(*)=54+72485 FROM "_agload_test_graph_ag_vertex_ids";
79-
?column?
80-
----------
81-
t
82-
(1 row)
83-
8472
-- Sequence should be equal to max entry id i.e. 146941
8573
SELECT currval('agload_test_graph."City_id_seq"')=146941;
8674
?column?
@@ -415,6 +403,43 @@ SELECT * FROM cypher('agload_conversion', $$ MATCH ()-[e:Edges2]->() RETURN prop
415403
{"bool": "false", "string": "nUll", "numeric": "3.14"}
416404
(6 rows)
417405

406+
--
407+
-- Check sandbox
408+
--
409+
-- check null file name
410+
SELECT load_labels_from_file('agload_conversion', 'Person1', NULL, true, true);
411+
ERROR: file path must not be NULL
412+
SELECT load_edges_from_file('agload_conversion', 'Edges1', NULL, true);
413+
ERROR: file path must not be NULL
414+
-- check no file name
415+
SELECT load_labels_from_file('agload_conversion', 'Person1', '', true, true);
416+
ERROR: file name cannot be zero length
417+
SELECT load_edges_from_file('agload_conversion', 'Edges1', '', true);
418+
ERROR: file name cannot be zero length
419+
-- check for file/path does not exist
420+
SELECT load_labels_from_file('agload_conversion', 'Person1', 'age_load_xxx/conversion_vertices.csv', true, true);
421+
ERROR: File or path does not exist [/tmp/age/age_load_xxx/conversion_vertices.csv]
422+
SELECT load_edges_from_file('agload_conversion', 'Edges1', 'age_load_xxx/conversion_edges.csv', true);
423+
ERROR: File or path does not exist [/tmp/age/age_load_xxx/conversion_edges.csv]
424+
SELECT load_labels_from_file('agload_conversion', 'Person1', 'age_load/conversion_vertices.txt', true, true);
425+
ERROR: File or path does not exist [/tmp/age/age_load/conversion_vertices.txt]
426+
SELECT load_edges_from_file('agload_conversion', 'Edges1', 'age_load/conversion_edges.txt', true);
427+
ERROR: File or path does not exist [/tmp/age/age_load/conversion_edges.txt]
428+
-- check wrong extension
429+
\! touch /tmp/age/age_load/conversion_vertices.txt
430+
\! touch /tmp/age/age_load/conversion_edges.txt
431+
SELECT load_labels_from_file('agload_conversion', 'Person1', 'age_load/conversion_vertices.txt', true, true);
432+
ERROR: You can only load files with extension [.csv].
433+
SELECT load_edges_from_file('agload_conversion', 'Edges1', 'age_load/conversion_edges.txt', true);
434+
ERROR: You can only load files with extension [.csv].
435+
-- check outside sandbox directory
436+
SELECT load_labels_from_file('agload_conversion', 'Person1', '../../etc/passwd', true, true);
437+
ERROR: You can only load files located in [/tmp/age/].
438+
SELECT load_edges_from_file('agload_conversion', 'Edges1', '../../etc/passwd', true);
439+
ERROR: You can only load files located in [/tmp/age/].
440+
--
441+
-- Cleanup
442+
--
418443
SELECT drop_graph('agload_conversion', true);
419444
NOTICE: drop cascades to 6 other objects
420445
DETAIL: drop cascades to table agload_conversion._ag_label_vertex
@@ -429,3 +454,195 @@ NOTICE: graph "agload_conversion" has been dropped
429454

430455
(1 row)
431456

457+
--
458+
-- Test security and permissions
459+
--
460+
SELECT create_graph('agload_security');
461+
NOTICE: graph "agload_security" has been created
462+
create_graph
463+
--------------
464+
465+
(1 row)
466+
467+
SELECT create_vlabel('agload_security', 'Person1');
468+
NOTICE: VLabel "Person1" has been created
469+
create_vlabel
470+
---------------
471+
472+
(1 row)
473+
474+
SELECT create_vlabel('agload_security', 'Person2');
475+
NOTICE: VLabel "Person2" has been created
476+
create_vlabel
477+
---------------
478+
479+
(1 row)
480+
481+
SELECT create_elabel('agload_security', 'SecEdge');
482+
NOTICE: ELabel "SecEdge" has been created
483+
create_elabel
484+
---------------
485+
486+
(1 row)
487+
488+
--
489+
-- Test 1: File read permission (pg_read_server_files role)
490+
--
491+
-- Create a user without pg_read_server_files role
492+
CREATE USER load_test_user;
493+
GRANT USAGE ON SCHEMA ag_catalog TO load_test_user;
494+
-- This should fail because load_test_user doesn't have pg_read_server_files
495+
SET ROLE load_test_user;
496+
SELECT load_labels_from_file('agload_security', 'Person1', 'age_load/conversion_vertices.csv', true);
497+
ERROR: permission denied to LOAD from a file
498+
DETAIL: Only roles with privileges of the "pg_read_server_files" role may LOAD from a file.
499+
SELECT load_edges_from_file('agload_security', 'SecEdge', 'age_load/conversion_edges.csv');
500+
ERROR: permission denied to LOAD from a file
501+
DETAIL: Only roles with privileges of the "pg_read_server_files" role may LOAD from a file.
502+
RESET ROLE;
503+
-- Grant pg_read_server_files and try again - should fail on table permission now
504+
GRANT pg_read_server_files TO load_test_user;
505+
--
506+
-- Test 2: Table INSERT permission (ACL_INSERT)
507+
--
508+
-- User has file read permission but no INSERT on the label table
509+
SET ROLE load_test_user;
510+
SELECT load_labels_from_file('agload_security', 'Person1', 'age_load/conversion_vertices.csv', true);
511+
ERROR: permission denied for table Person1
512+
SELECT load_edges_from_file('agload_security', 'SecEdge', 'age_load/conversion_edges.csv');
513+
ERROR: permission denied for table SecEdge
514+
RESET ROLE;
515+
-- Grant INSERT permission and try again - should succeed
516+
GRANT USAGE ON SCHEMA agload_security TO load_test_user;
517+
GRANT INSERT ON agload_security."Person1" TO load_test_user;
518+
GRANT INSERT ON agload_security."SecEdge" TO load_test_user;
519+
GRANT UPDATE ON SEQUENCE agload_security."Person1_id_seq" TO load_test_user;
520+
GRANT UPDATE ON SEQUENCE agload_security."SecEdge_id_seq" TO load_test_user;
521+
GRANT SELECT ON ag_catalog.ag_label TO load_test_user;
522+
GRANT SELECT ON ag_catalog.ag_graph TO load_test_user;
523+
SET ROLE load_test_user;
524+
SELECT load_labels_from_file('agload_security', 'Person1', 'age_load/conversion_vertices.csv', true);
525+
load_labels_from_file
526+
-----------------------
527+
528+
(1 row)
529+
530+
SELECT load_edges_from_file('agload_security', 'SecEdge', 'age_load/conversion_edges.csv');
531+
load_edges_from_file
532+
----------------------
533+
534+
(1 row)
535+
536+
RESET ROLE;
537+
-- Verify data was loaded
538+
SELECT COUNT(*) FROM agload_security."Person1";
539+
count
540+
-------
541+
6
542+
(1 row)
543+
544+
SELECT COUNT(*) FROM agload_security."SecEdge";
545+
count
546+
-------
547+
6
548+
(1 row)
549+
550+
-- cleanup
551+
DELETE FROM agload_security."Person1";
552+
DELETE FROM agload_security."SecEdge";
553+
--
554+
-- Test 3: Row-Level Security (RLS)
555+
--
556+
-- Enable RLS on the label tables
557+
ALTER TABLE agload_security."Person1" ENABLE ROW LEVEL SECURITY;
558+
ALTER TABLE agload_security."SecEdge" ENABLE ROW LEVEL SECURITY;
559+
-- Switch to load_test_user
560+
SET ROLE load_test_user;
561+
-- Loading should fail when RLS is enabled
562+
SELECT load_labels_from_file('agload_security', 'Person1', 'age_load/conversion_vertices.csv', true);
563+
ERROR: LOAD from file is not supported with row-level security
564+
HINT: Use Cypher CREATE clause instead.
565+
SELECT load_edges_from_file('agload_security', 'SecEdge', 'age_load/conversion_edges.csv');
566+
ERROR: LOAD from file is not supported with row-level security
567+
HINT: Use Cypher CREATE clause instead.
568+
RESET ROLE;
569+
-- Disable RLS and try again - should succeed
570+
ALTER TABLE agload_security."Person1" DISABLE ROW LEVEL SECURITY;
571+
ALTER TABLE agload_security."SecEdge" DISABLE ROW LEVEL SECURITY;
572+
SELECT load_labels_from_file('agload_security', 'Person1', 'age_load/conversion_vertices.csv', true);
573+
load_labels_from_file
574+
-----------------------
575+
576+
(1 row)
577+
578+
SELECT load_edges_from_file('agload_security', 'SecEdge', 'age_load/conversion_edges.csv');
579+
load_edges_from_file
580+
----------------------
581+
582+
(1 row)
583+
584+
-- Verify data was loaded
585+
SELECT COUNT(*) FROM agload_security."Person1";
586+
count
587+
-------
588+
6
589+
(1 row)
590+
591+
SELECT COUNT(*) FROM agload_security."SecEdge";
592+
count
593+
-------
594+
6
595+
(1 row)
596+
597+
-- cleanup
598+
DELETE FROM agload_security."Person1";
599+
DELETE FROM agload_security."SecEdge";
600+
--
601+
-- Test 4: Constraint checking (CHECK constraint)
602+
--
603+
-- Add constraint on vertex properties - fail if bool property is false
604+
ALTER TABLE agload_security."Person1" ADD CONSTRAINT check_bool_true
605+
CHECK ((properties->>'"bool"')::boolean = true);
606+
-- This should fail - constraint violation
607+
SELECT load_labels_from_file('agload_security', 'Person1', 'age_load/conversion_vertices.csv', true);
608+
ERROR: new row for relation "Person1" violates check constraint "check_bool_true"
609+
DETAIL: Failing row contains (844424930131970, {"id": "2", "bool": "false", "__id__": 2, "string": "John", "num...).
610+
-- Add constraint on edge properties - fail if bool property is false
611+
ALTER TABLE agload_security."SecEdge" ADD CONSTRAINT check_bool_true
612+
CHECK ((properties->>'"bool"')::boolean = true);
613+
-- This should fail - some edges have bool = false
614+
SELECT load_edges_from_file('agload_security', 'SecEdge', 'age_load/conversion_edges.csv');
615+
ERROR: new row for relation "SecEdge" violates check constraint "check_bool_true"
616+
DETAIL: Failing row contains (1407374883553294, 844424930131969, 1125899906842625, {"bool": "false", "string": "John", "numeric": "-2"}).
617+
-- cleanup
618+
ALTER TABLE agload_security."Person1" DROP CONSTRAINT check_bool_true;
619+
ALTER TABLE agload_security."SecEdge" DROP CONSTRAINT check_bool_true;
620+
--
621+
-- Cleanup
622+
--
623+
REVOKE ALL ON agload_security."Person1" FROM load_test_user;
624+
REVOKE ALL ON agload_security."SecEdge" FROM load_test_user;
625+
REVOKE ALL ON SEQUENCE agload_security."Person1_id_seq" FROM load_test_user;
626+
REVOKE ALL ON SEQUENCE agload_security."SecEdge_id_seq" FROM load_test_user;
627+
REVOKE ALL ON ag_catalog.ag_label FROM load_test_user;
628+
REVOKE ALL ON ag_catalog.ag_graph FROM load_test_user;
629+
REVOKE ALL ON SCHEMA agload_security FROM load_test_user;
630+
REVOKE ALL ON SCHEMA ag_catalog FROM load_test_user;
631+
REVOKE pg_read_server_files FROM load_test_user;
632+
DROP USER load_test_user;
633+
SELECT drop_graph('agload_security', true);
634+
NOTICE: drop cascades to 5 other objects
635+
DETAIL: drop cascades to table agload_security._ag_label_vertex
636+
drop cascades to table agload_security._ag_label_edge
637+
drop cascades to table agload_security."Person1"
638+
drop cascades to table agload_security."Person2"
639+
drop cascades to table agload_security."SecEdge"
640+
NOTICE: graph "agload_security" has been dropped
641+
drop_graph
642+
------------
643+
644+
(1 row)
645+
646+
--
647+
-- End
648+
--

0 commit comments

Comments
 (0)