#59269 Add support for adding metadata in bulk#5128
Conversation
This reverts commit bde79b0.
Test using WordPress PlaygroundThe changes in this pull request can previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation. |
|
Should we have a way to allow for adding multiple entries for the same meta key? |
|
Maybe? I haven't had a need for it myself. Could be useful, but it would complicate the function signature as it wouldn't be able to support an associative array of meta key => meta value elements. |
|
This ticket was featured in today's 6.9 Bug Scrub. I'd be happy if you could check out @aaronjorbin's feedback. |
peterwilsoncc
left a comment
There was a problem hiding this comment.
A few notes inline, some questions, some nitpicks
| * @param array<string,mixed> $meta_fields Metadata values keyed by their meta key. Values must be serializable if non-scalar. | ||
| * @return array<string,int>|false Array of meta IDs keyed by their meta key on success, false on failure. | ||
| */ | ||
| function bulk_add_site_meta( int $site_id, array $meta_fields ) { |
There was a problem hiding this comment.
Site meta is weird, might be best not to make it weirder.
It's intended not to accept duplicates and be interacted with via the site option functions, see https://core.trac.wordpress.org/ticket/61467
There was a problem hiding this comment.
Thanks for the ticket link. This new function uses the same filters and the same cache key(s) as the existing add_site_meta() function.
Until https://core.trac.wordpress.org/ticket/61467 is resolved do you think it's best to not introduce a new bulk_add_site_meta() function?
There was a problem hiding this comment.
Until https://core.trac.wordpress.org/ticket/61467 is resolved do you think it's best to not introduce a new
bulk_add_site_meta()function?
I'm inclined not to add the new function while the other ticket is worked out, just to avoid adding to the surface area.
There was a problem hiding this comment.
Pull request overview
This PR introduces bulk metadata addition functionality to improve performance when adding multiple meta fields. The new bulk_add_metadata() function and its wrapper functions (bulk_add_post_meta(), bulk_add_user_meta(), bulk_add_term_meta(), bulk_add_site_meta(), and bulk_add_comment_meta()) use a single database query instead of multiple queries, providing significant performance improvements (61-96% faster based on benchmarks).
Changes:
- Added
wpdb::insert_multiple()method for bulk row insertion - Added
bulk_add_metadata()core function and wrapper functions for each meta type - Updated
wp_insert_post(),wp_insert_user(), andwp_insert_comment()to use bulk functions for new object creation
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
src/wp-includes/class-wpdb.php |
Added insert_multiple() method for bulk database inserts |
src/wp-includes/meta.php |
Added bulk_add_metadata() core function |
src/wp-includes/post.php |
Added bulk_add_post_meta() wrapper and integrated into wp_insert_post() |
src/wp-includes/user.php |
Added bulk_add_user_meta() wrapper and integrated into wp_insert_user() |
src/wp-includes/taxonomy.php |
Added bulk_add_term_meta() wrapper with shared term validation |
src/wp-includes/ms-site.php |
Added bulk_add_site_meta() wrapper |
src/wp-includes/comment.php |
Added bulk_add_comment_meta() wrapper and integrated into wp_insert_comment() |
tests/phpunit/tests/meta/bulkAddMetadata.php |
Added tests for bulk metadata functionality |
tests/phpunit/tests/db/insertMultiple.php |
Added tests for insert_multiple() method |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| <?php | ||
|
|
||
| /** | ||
| * Test the insertion of multiple rows. | ||
| * | ||
| * @group wpdb | ||
| * | ||
| * @covers wpdb::insert_multiple | ||
| */ | ||
| class Tests_DB_InsertMultiple extends WP_UnitTestCase { | ||
|
|
||
| /** | ||
| * @ticket 59269 | ||
| */ | ||
| public function test_correct_rows_are_inserted() { | ||
| global $wpdb; | ||
|
|
||
| $table = $wpdb->postmeta; | ||
|
|
||
| $columns = array( | ||
| 'post_id', | ||
| 'meta_key', | ||
| 'meta_value', | ||
| ); | ||
|
|
||
| $datas = array( | ||
| array( 1, 'key1', 'value1' ), | ||
| array( 2, 'key2', 'value2' ), | ||
| array( 3, 'key3', 'value3' ), | ||
| ); | ||
|
|
||
| $format = array( | ||
| '%d', | ||
| '%s', | ||
| '%s', | ||
| ); | ||
|
|
||
| $query_count_before = $wpdb->num_queries; | ||
|
|
||
| $inserted = $wpdb->insert_multiple( | ||
| $table, | ||
| $columns, | ||
| $datas, | ||
| $format | ||
| ); | ||
|
|
||
| $query_count_after_insert = $wpdb->num_queries; | ||
|
|
||
| $rows = $wpdb->get_results( | ||
| $wpdb->prepare( | ||
| 'SELECT post_id, meta_key, meta_value FROM %i ORDER BY post_id ASC', | ||
| $table | ||
| ), | ||
| ARRAY_A | ||
| ); | ||
|
|
||
| $expected_rows = array( | ||
| array( | ||
| 'post_id' => '1', | ||
| 'meta_key' => 'key1', | ||
| 'meta_value' => 'value1', | ||
| ), | ||
| array( | ||
| 'post_id' => '2', | ||
| 'meta_key' => 'key2', | ||
| 'meta_value' => 'value2', | ||
| ), | ||
| array( | ||
| 'post_id' => '3', | ||
| 'meta_key' => 'key3', | ||
| 'meta_value' => 'value3', | ||
| ), | ||
| ); | ||
|
|
||
| $queries_used = $query_count_after_insert - $query_count_before; | ||
|
|
||
| $this->assertSame( 3, $inserted ); | ||
| $this->assertSame( $expected_rows, $rows ); | ||
|
|
||
| // insert_multiple should use at most 2 queries: | ||
| // 1 for charset info (if not cached) + 1 for the bulk insert. | ||
| // When run in isolation, it uses 2 queries. When run as part of the full wpdb test suite, | ||
| // charset info is cached from previous tests, so it only uses 1 query. | ||
| $this->assertLessThanOrEqual( 2, $queries_used, 'Expected insert_multiple to use at most 2 queries for bulk insert' ); | ||
| } | ||
| } |
There was a problem hiding this comment.
The test only covers the successful insertion case. Edge cases such as empty rows array, empty columns array, mismatched row/column counts, or database errors are not tested. These scenarios should be covered to ensure robust error handling.
| $added_keys[] = $meta_key; | ||
|
|
||
| $data[] = array( | ||
| $object_id, | ||
| $meta_key, | ||
| maybe_serialize( $meta_value ), | ||
| ); | ||
| } | ||
|
|
||
| // If there is no data to save, don't attempt to save it. | ||
| if ( ! $data ) { | ||
| return $return; | ||
| } | ||
|
|
||
| $inserted = $wpdb->insert_multiple( | ||
| $table, | ||
| array( | ||
| $column, | ||
| 'meta_key', | ||
| 'meta_value', | ||
| ), | ||
| $data | ||
| ); | ||
|
|
||
| if ( ! $inserted ) { | ||
| return false; | ||
| } | ||
|
|
||
| $first_mid = (int) $wpdb->insert_id; | ||
| $inserted_mids = range( $first_mid, $first_mid + $inserted - 1 ); | ||
| $keyed_mids = array_combine( $added_keys, $inserted_mids ); |
There was a problem hiding this comment.
When the filter short-circuits a meta key (line 238-240), the function continues to the next key without tracking it in $added_keys. However, if there are duplicate meta keys in $meta_fields, this could cause issues where $added_keys and the actual inserted keys get out of sync, leading to incorrect meta ID mapping in the returned array. Consider validating that all keys in $meta_fields are unique, or handle duplicate keys explicitly.
| <?php | ||
|
|
||
| /** | ||
| * @group meta | ||
| * | ||
| * @covers bulk_add_metadata | ||
| */ | ||
| class Tests_Meta_BulkAddMetadata extends WP_UnitTestCase { | ||
|
|
||
| /* | ||
| * It is important to test with both even and odd numbered slashes, | ||
| * as KSES does a strip-then-add slashes in some of its function calls. | ||
| */ | ||
|
|
||
| const SLASH_1 = 'String with 1 slash \\'; | ||
| const SLASH_2 = 'String with 2 slashes \\\\'; | ||
| const SLASH_3 = 'String with 3 slashes \\\\\\'; | ||
| const SLASH_4 = 'String with 4 slashes \\\\\\\\'; | ||
| const SLASH_5 = 'String with 5 slashes \\\\\\\\\\'; | ||
| const SLASH_6 = 'String with 6 slashes \\\\\\\\\\\\'; | ||
| const SLASH_7 = 'String with 7 slashes \\\\\\\\\\\\\\'; | ||
|
|
||
| /** | ||
| * @ticket 59269 | ||
| * @dataProvider data_meta_types | ||
| */ | ||
| public function test_all_meta_fields_should_be_added( string $meta_type ) { | ||
| global $wpdb; | ||
|
|
||
| // Create the object. | ||
| $object_id = self::factory()->{$meta_type}->create(); | ||
|
|
||
| // Prepare the meta values to add in bulk. | ||
| $meta = array( | ||
| 'key1' => '1', | ||
| 'key2' => '2', | ||
| 'key3' => '3', | ||
| ); | ||
|
|
||
| // Track the mid before adding new meta. | ||
| $next_mid = self::get_autoincrement( $meta_type ); | ||
|
|
||
| // Set up mock actions and filters to track calls. | ||
| $add_metadata_filter = new MockAction(); | ||
| $add_meta_action = new MockAction(); | ||
| $added_meta_action = new MockAction(); | ||
| add_filter( "add_{$meta_type}_metadata", array( $add_metadata_filter, 'filter' ), 10, 5 ); | ||
| add_action( "add_{$meta_type}_meta", array( $add_meta_action, 'action' ) ); | ||
| add_action( "added_{$meta_type}_meta", array( $added_meta_action, 'action' ) ); | ||
|
|
||
| // Bulk add the meta. | ||
| $result = bulk_add_metadata( $meta_type, $object_id, $meta ); | ||
|
|
||
| // Read back the actual meta values. | ||
| $actual_vals = array( | ||
| 'key1' => get_metadata( $meta_type, $object_id, 'key1', true ), | ||
| 'key2' => get_metadata( $meta_type, $object_id, 'key2', true ), | ||
| 'key3' => get_metadata( $meta_type, $object_id, 'key3', true ), | ||
| ); | ||
|
|
||
| // Prepare expected meta IDs. | ||
| $expected_mids = array( | ||
| 'key1' => ( $next_mid ), | ||
| 'key2' => ( $next_mid + 1 ), | ||
| 'key3' => ( $next_mid + 2 ), | ||
| ); | ||
|
|
||
| $this->assertSame( $meta, $actual_vals, 'Actual meta values should match expected values.' ); | ||
| $this->assertSame( $expected_mids, $result, 'Actual meta IDs should match expected meta IDs.' ); | ||
| $this->assertSame( 3, $add_metadata_filter->get_call_count(), "'add_{$meta_type}_metadata' filter should be called the correct number of times." ); | ||
| $this->assertSame( 3, $add_meta_action->get_call_count(), "'add_{$meta_type}_meta' action should be called the correct number of times." ); | ||
| $this->assertSame( 3, $added_meta_action->get_call_count(), "'added_{$meta_type}_meta' action should be called the correct number of times." ); | ||
| } | ||
|
|
||
| /** | ||
| * @ticket 59269 | ||
| * @dataProvider data_meta_types | ||
| */ | ||
| public function test_correct_mids_should_be_returned_when_filter_is_in_place( string $meta_type ) { | ||
| global $wpdb; | ||
|
|
||
| // Set up a filter to modify the mid for 'key2'. | ||
| add_filter( | ||
| "add_{$meta_type}_metadata", | ||
| static function ( $check, $object_id, $meta_key ) { | ||
| return ( 'key2' === $meta_key ) ? 123456 : $check; | ||
| }, | ||
| 10, | ||
| 3 | ||
| ); | ||
|
|
||
| // Create the object. | ||
| $object_id = self::factory()->{$meta_type}->create(); | ||
|
|
||
| // Prepare the meta values to add in bulk. | ||
| $meta = array( | ||
| 'key1' => '1', | ||
| 'key2' => '2', | ||
| 'key3' => '3', | ||
| ); | ||
|
|
||
| // Track the mid before adding new meta. | ||
| $next_mid = self::get_autoincrement( $meta_type ); | ||
|
|
||
| // Set up mock actions and filters to track calls. | ||
| $add_metadata_filter = new MockAction(); | ||
| $add_meta_action = new MockAction(); | ||
| $added_meta_action = new MockAction(); | ||
| add_filter( "add_{$meta_type}_metadata", array( $add_metadata_filter, 'filter' ), 10, 5 ); | ||
| add_action( "add_{$meta_type}_meta", array( $add_meta_action, 'action' ) ); | ||
| add_action( "added_{$meta_type}_meta", array( $added_meta_action, 'action' ) ); | ||
|
|
||
| // Bulk add the meta. | ||
| $result = bulk_add_metadata( $meta_type, $object_id, $meta ); | ||
|
|
||
| // Prepare expected meta values. | ||
| $expected_vals = array( | ||
| 'key1' => '1', | ||
| 'key2' => '', | ||
| 'key3' => '3', | ||
| ); | ||
|
|
||
| // Read back the actual meta values. | ||
| $actual_vals = array( | ||
| 'key1' => get_metadata( $meta_type, $object_id, 'key1', true ), | ||
| 'key2' => get_metadata( $meta_type, $object_id, 'key2', true ), | ||
| 'key3' => get_metadata( $meta_type, $object_id, 'key3', true ), | ||
| ); | ||
|
|
||
| // Prepare expected meta IDs. | ||
| $expected_mids = array( | ||
| 'key2' => 123456, | ||
| 'key1' => ( $next_mid ), | ||
| 'key3' => ( $next_mid + 1 ), | ||
| ); | ||
|
|
||
| $this->assertSame( $expected_vals, $actual_vals, 'Actual meta values should match expected values.' ); | ||
| $this->assertSame( $expected_mids, $result, 'Actual meta IDs should match expected meta IDs.' ); | ||
| $this->assertSame( 3, $add_metadata_filter->get_call_count(), "'add_{$meta_type}_metadata' filter should be called the correct number of times." ); | ||
| $this->assertSame( 2, $add_meta_action->get_call_count(), "'add_{$meta_type}_meta' action should be called the correct number of times." ); | ||
| $this->assertSame( 2, $added_meta_action->get_call_count(), "'added_{$meta_type}_meta' action should be called the correct number of times." ); | ||
| } | ||
|
|
||
| /** | ||
| * @ticket 59269 | ||
| * @dataProvider data_meta_types | ||
| */ | ||
| public function test_slashed_data_should_be_handled_correctly( string $meta_type ) { | ||
| // Create the object. | ||
| $object_id = self::factory()->{$meta_type}->create(); | ||
|
|
||
| // Prepare the meta values to add in bulk. | ||
| $meta = array( | ||
| 'key1' => addslashes( self::SLASH_1 ), | ||
| 'key2' => addslashes( self::SLASH_2 ), | ||
| 'key3' => addslashes( self::SLASH_3 ), | ||
| 'key4' => addslashes( self::SLASH_4 ), | ||
| 'key5' => addslashes( self::SLASH_5 ), | ||
| 'key6' => addslashes( self::SLASH_6 ), | ||
| 'key7' => addslashes( self::SLASH_7 ), | ||
| ); | ||
|
|
||
| // Bulk add the meta. | ||
| $result = bulk_add_metadata( $meta_type, $object_id, $meta ); | ||
|
|
||
| // Read back the actual meta values. | ||
| $actual_vals = array( | ||
| 'key1' => get_metadata( $meta_type, $object_id, 'key1', true ), | ||
| 'key2' => get_metadata( $meta_type, $object_id, 'key2', true ), | ||
| 'key3' => get_metadata( $meta_type, $object_id, 'key3', true ), | ||
| 'key4' => get_metadata( $meta_type, $object_id, 'key4', true ), | ||
| 'key5' => get_metadata( $meta_type, $object_id, 'key5', true ), | ||
| 'key6' => get_metadata( $meta_type, $object_id, 'key6', true ), | ||
| 'key7' => get_metadata( $meta_type, $object_id, 'key7', true ), | ||
| ); | ||
|
|
||
| // Prepare expected meta values. | ||
| $expected_vals = array( | ||
| 'key1' => self::SLASH_1, | ||
| 'key2' => self::SLASH_2, | ||
| 'key3' => self::SLASH_3, | ||
| 'key4' => self::SLASH_4, | ||
| 'key5' => self::SLASH_5, | ||
| 'key6' => self::SLASH_6, | ||
| 'key7' => self::SLASH_7, | ||
| ); | ||
|
|
||
| $this->assertCount( 7, $result, 'The correct number of meta IDs should be returned.' ); | ||
| $this->assertSame( $expected_vals, $actual_vals, 'Actual meta values should match expected values.' ); | ||
| } | ||
|
|
||
| /** | ||
| * @return array<string,array<string>> | ||
| */ | ||
| public function data_meta_types(): array { | ||
| $types = array( | ||
| 'post' => array( 'post' ), | ||
| 'user' => array( 'user' ), | ||
| 'comment' => array( 'comment' ), | ||
| 'term' => array( 'term' ), | ||
| ); | ||
|
|
||
| if ( is_multisite() ) { | ||
| $types['blog'] = array( 'blog' ); | ||
| } | ||
|
|
||
| return $types; | ||
| } | ||
|
|
||
| private static function get_autoincrement( string $meta_type ): int { | ||
| global $wpdb; | ||
|
|
||
| $table = "{$meta_type}meta"; | ||
|
|
||
| return (int) $wpdb->get_var( | ||
| $wpdb->prepare( | ||
| ' | ||
| SELECT AUTO_INCREMENT | ||
| FROM INFORMATION_SCHEMA.TABLES | ||
| WHERE TABLE_SCHEMA = DATABASE() | ||
| AND TABLE_NAME = %s | ||
| ', | ||
| $wpdb->$table | ||
| ) | ||
| ); | ||
| } | ||
| } |
There was a problem hiding this comment.
The tests do not cover important edge cases such as: empty meta_fields array, invalid object_id (0 or negative), invalid meta_type, database insert failure, or behavior when multiple items have the same meta key. These edge cases should be tested to ensure the function handles them correctly and returns appropriate values.
| $values_sql = array(); | ||
| $values = array( | ||
| $table, | ||
| ); | ||
|
|
||
| foreach ( $rows as $row ) { | ||
| $data = $this->process_fields( $table, array_combine( $columns, $row ), $format ); |
There was a problem hiding this comment.
The function does not validate that the $columns and $rows arrays are not empty, or that all rows have the same number of elements as the $columns array. If a row has fewer or more elements than columns, the array_combine() call on line 2584 will return false, which is then checked on line 2585. However, if $rows or $columns is empty, the function will generate an invalid SQL query. These edge cases should be validated and return false early.
| $values_sql = array(); | |
| $values = array( | |
| $table, | |
| ); | |
| foreach ( $rows as $row ) { | |
| $data = $this->process_fields( $table, array_combine( $columns, $row ), $format ); | |
| // Validate that we have at least one column and one row to insert. | |
| if ( empty( $columns ) || empty( $rows ) ) { | |
| return false; | |
| } | |
| $values_sql = array(); | |
| $values = array( | |
| $table, | |
| ); | |
| foreach ( $rows as $row ) { | |
| // Each row must be an array with the same number of elements as the columns. | |
| if ( ! is_array( $row ) || count( $row ) !== count( $columns ) ) { | |
| return false; | |
| } | |
| $combined = array_combine( $columns, $row ); | |
| if ( false === $combined ) { | |
| return false; | |
| } | |
| $data = $this->process_fields( $table, $combined, $format ); |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Trac ticket: https://core.trac.wordpress.org/ticket/59269
This introduces a
bulk_add_metadata()function along with wrapper functions for post, user, site, term, and comment meta. This function accepts an array of key value pairs for inserting metadata in bulk. The metadata is inserted using a single database query and a single flush of the term cache, which is more performant than multiple calls toadd_{type}_meta(). The performance benefit increases linearly as the number of meta fields increases.Example
$meta = [ 'key1' => 'value1', 'key2' => 'value2', 'key3' => 'value3', ]; - foreach ( $meta as $key => $value ) { - add_user_meta( $user_id, $key, $value ); - } + bulk_add_user_meta( $user_id, $meta );Performance
The below table compares the peak memory usage and time taken reported by PHPBench to add various sets of data using multiple calls to
add_post_meta()versus one call tobulk_add_post_meta(). Absolute times vary depending on the system being used to run the tests (eg. my local Docker environment is slower than the GitHub Actions workflow runs), but the relative changes remain quite consistent.Various data sizes have been tested and can be seen in the
provideMetaData()method in the tests/phpbench/BulkAddMetaBench.php file. PHPBench is configured to run each test set 30 times and report the peak memory usage and the mode time taken.Memory usage remains unchanged in the majority of cases. It peaks somewhat at the extreme end for multiple extra large data sets where meta values are between 10kb and 200kb in size. I have yet to investigate what causes this, but it's not a big concern.
Timing decreases significantly for all test cases, as expected.
Notes
add_multiple_metadata()is a better name, there are a few other functions in core that usemultiplein their name.add_{type}_meta()multiple times:add_{$meta_type}_metadatais called once for every keyadd_{$meta_type}_metaandadded_{$meta_type}_metaare called once for every key when the above filter hasn't short-circuited the corresponding value for a key