Skip to content

#59269 Add support for adding metadata in bulk#5128

Open
johnbillion wants to merge 66 commits into
WordPress:trunkfrom
johnbillion:59269
Open

#59269 Add support for adding metadata in bulk#5128
johnbillion wants to merge 66 commits into
WordPress:trunkfrom
johnbillion:59269

Conversation

@johnbillion
Copy link
Copy Markdown
Member

@johnbillion johnbillion commented Sep 1, 2023

Trac ticket: https://core.trac.wordpress.org/ticket/59269

This introduces a bulk_add_metadata() function along with wrapper functions for post, user, site, term, and comment meta. This function accepts an array of key value pairs for inserting metadata in bulk. The metadata is inserted using a single database query and a single flush of the term cache, which is more performant than multiple calls to add_{type}_meta(). The performance benefit increases linearly as the number of meta fields increases.

Example

 $meta = [
     'key1' => 'value1',
     'key2' => 'value2',
     'key3' => 'value3',
 ];

- foreach ( $meta as $key => $value ) {
-     add_user_meta( $user_id, $key, $value );
- }
+ bulk_add_user_meta( $user_id, $meta );

Performance

The below table compares the peak memory usage and time taken reported by PHPBench to add various sets of data using multiple calls to add_post_meta() versus one call to bulk_add_post_meta(). Absolute times vary depending on the system being used to run the tests (eg. my local Docker environment is slower than the GitHub Actions workflow runs), but the relative changes remain quite consistent.

Various data sizes have been tested and can be seen in the provideMetaData() method in the tests/phpbench/BulkAddMetaBench.php file. PHPBench is configured to run each test set 30 times and report the peak memory usage and the mode time taken.

Memory usage remains unchanged in the majority of cases. It peaks somewhat at the extreme end for multiple extra large data sets where meta values are between 10kb and 200kb in size. I have yet to investigate what causes this, but it's not a big concern.

Timing decreases significantly for all test cases, as expected.

set mem_peak mode
3 metas, 10-30 bytes 30.383mb -0.00% 928.586μs -61.37%
20 metas, 10-200 bytes 30.393mb -0.00% 1.401ms -91.85%
50 metas, 5-250 bytes 30.405mb -0.00% 2.018ms -95.30%
100 metas, 5-500 bytes 30.688mb -0.00% 3.642ms -96.11%
10 metas, 0.1-1 Kb 30.396mb -0.00% 1.193ms -86.82%
10 metas, 1-10 Kb 30.506mb -0.00% 2.393ms -86.66%
3 metas, 30-90 Kb 30.934mb +1.99% 4.162ms -51.67%
10 metas, 10-100 Kb 32.193mb +9.70% 10.563ms -62.09%
20 metas, 10-200 Kb 37.426mb +35.22% 36.750ms -45.43%

Notes

  • The PHPBench code and its workflow won't be included when this gets committed. It's there to demonstrate the performance improvements while working on this PR.
  • Naming is hard. Perhaps add_multiple_metadata() is a better name, there are a few other functions in core that use multiple in their name.
  • The same actions and filters are fired as get fired when calling add_{type}_meta() multiple times:
    • add_{$meta_type}_metadata is called once for every key
    • add_{$meta_type}_meta and added_{$meta_type}_meta are called once for every key when the above filter hasn't short-circuited the corresponding value for a key

Comment thread composer.json Outdated
@johnbillion johnbillion changed the title 59269 Add support for adding metadata in bulk #59269 Add support for adding metadata in bulk Feb 21, 2025
@github-actions
Copy link
Copy Markdown

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

@TimothyBJacobs
Copy link
Copy Markdown
Member

Should we have a way to allow for adding multiple entries for the same meta key?

@johnbillion
Copy link
Copy Markdown
Member Author

Maybe? I haven't had a need for it myself. Could be useful, but it would complicate the function signature as it wouldn't be able to support an associative array of meta key => meta value elements.

Comment thread tests/phpunit/tests/meta/bulkAddMetadata.php
Comment thread .gitignore Outdated
Comment thread composer.json Outdated
Comment thread phpbench.json.dist Outdated
Comment thread .github/workflows/phpbench-tests.yml Outdated
Comment thread tests/phpbench/AddMetaBenchBase.php Outdated
Comment thread tests/phpbench/BulkAddMetaBench.php Outdated
Comment thread tests/phpbench/SingleAddMetaBench.php Outdated
Comment thread src/wp-includes/class-wpdb.php Outdated
@t-hamano
Copy link
Copy Markdown
Contributor

t-hamano commented Oct 6, 2025

This ticket was featured in today's 6.9 Bug Scrub. I'd be happy if you could check out @aaronjorbin's feedback.

Copy link
Copy Markdown
Contributor

@peterwilsoncc peterwilsoncc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few notes inline, some questions, some nitpicks

Comment thread tests/phpunit/tests/meta/bulkAddMetadata.php Outdated
Comment thread tests/phpunit/tests/meta/bulkAddMetadata.php Outdated
Comment thread tests/phpunit/tests/db/insertMultiple.php
* @param array<string,mixed> $meta_fields Metadata values keyed by their meta key. Values must be serializable if non-scalar.
* @return array<string,int>|false Array of meta IDs keyed by their meta key on success, false on failure.
*/
function bulk_add_site_meta( int $site_id, array $meta_fields ) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Site meta is weird, might be best not to make it weirder.

It's intended not to accept duplicates and be interacted with via the site option functions, see https://core.trac.wordpress.org/ticket/61467

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the ticket link. This new function uses the same filters and the same cache key(s) as the existing add_site_meta() function.

Until https://core.trac.wordpress.org/ticket/61467 is resolved do you think it's best to not introduce a new bulk_add_site_meta() function?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until https://core.trac.wordpress.org/ticket/61467 is resolved do you think it's best to not introduce a new bulk_add_site_meta() function?

I'm inclined not to add the new function while the other ticket is worked out, just to avoid adding to the surface area.

Comment thread src/wp-includes/meta.php
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces bulk metadata addition functionality to improve performance when adding multiple meta fields. The new bulk_add_metadata() function and its wrapper functions (bulk_add_post_meta(), bulk_add_user_meta(), bulk_add_term_meta(), bulk_add_site_meta(), and bulk_add_comment_meta()) use a single database query instead of multiple queries, providing significant performance improvements (61-96% faster based on benchmarks).

Changes:

  • Added wpdb::insert_multiple() method for bulk row insertion
  • Added bulk_add_metadata() core function and wrapper functions for each meta type
  • Updated wp_insert_post(), wp_insert_user(), and wp_insert_comment() to use bulk functions for new object creation

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
src/wp-includes/class-wpdb.php Added insert_multiple() method for bulk database inserts
src/wp-includes/meta.php Added bulk_add_metadata() core function
src/wp-includes/post.php Added bulk_add_post_meta() wrapper and integrated into wp_insert_post()
src/wp-includes/user.php Added bulk_add_user_meta() wrapper and integrated into wp_insert_user()
src/wp-includes/taxonomy.php Added bulk_add_term_meta() wrapper with shared term validation
src/wp-includes/ms-site.php Added bulk_add_site_meta() wrapper
src/wp-includes/comment.php Added bulk_add_comment_meta() wrapper and integrated into wp_insert_comment()
tests/phpunit/tests/meta/bulkAddMetadata.php Added tests for bulk metadata functionality
tests/phpunit/tests/db/insertMultiple.php Added tests for insert_multiple() method

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/wp-includes/ms-site.php Outdated
Comment thread src/wp-includes/class-wpdb.php Outdated
Comment on lines +1 to +86
<?php

/**
* Test the insertion of multiple rows.
*
* @group wpdb
*
* @covers wpdb::insert_multiple
*/
class Tests_DB_InsertMultiple extends WP_UnitTestCase {

/**
* @ticket 59269
*/
public function test_correct_rows_are_inserted() {
global $wpdb;

$table = $wpdb->postmeta;

$columns = array(
'post_id',
'meta_key',
'meta_value',
);

$datas = array(
array( 1, 'key1', 'value1' ),
array( 2, 'key2', 'value2' ),
array( 3, 'key3', 'value3' ),
);

$format = array(
'%d',
'%s',
'%s',
);

$query_count_before = $wpdb->num_queries;

$inserted = $wpdb->insert_multiple(
$table,
$columns,
$datas,
$format
);

$query_count_after_insert = $wpdb->num_queries;

$rows = $wpdb->get_results(
$wpdb->prepare(
'SELECT post_id, meta_key, meta_value FROM %i ORDER BY post_id ASC',
$table
),
ARRAY_A
);

$expected_rows = array(
array(
'post_id' => '1',
'meta_key' => 'key1',
'meta_value' => 'value1',
),
array(
'post_id' => '2',
'meta_key' => 'key2',
'meta_value' => 'value2',
),
array(
'post_id' => '3',
'meta_key' => 'key3',
'meta_value' => 'value3',
),
);

$queries_used = $query_count_after_insert - $query_count_before;

$this->assertSame( 3, $inserted );
$this->assertSame( $expected_rows, $rows );

// insert_multiple should use at most 2 queries:
// 1 for charset info (if not cached) + 1 for the bulk insert.
// When run in isolation, it uses 2 queries. When run as part of the full wpdb test suite,
// charset info is cached from previous tests, so it only uses 1 query.
$this->assertLessThanOrEqual( 2, $queries_used, 'Expected insert_multiple to use at most 2 queries for bulk insert' );
}
}
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test only covers the successful insertion case. Edge cases such as empty rows array, empty columns array, mismatched row/column counts, or database errors are not tested. These scenarios should be covered to ensure robust error handling.

Copilot uses AI. Check for mistakes.
Comment thread src/wp-includes/comment.php
Comment thread src/wp-includes/meta.php Outdated
Comment thread src/wp-includes/meta.php
Comment on lines +246 to +276
$added_keys[] = $meta_key;

$data[] = array(
$object_id,
$meta_key,
maybe_serialize( $meta_value ),
);
}

// If there is no data to save, don't attempt to save it.
if ( ! $data ) {
return $return;
}

$inserted = $wpdb->insert_multiple(
$table,
array(
$column,
'meta_key',
'meta_value',
),
$data
);

if ( ! $inserted ) {
return false;
}

$first_mid = (int) $wpdb->insert_id;
$inserted_mids = range( $first_mid, $first_mid + $inserted - 1 );
$keyed_mids = array_combine( $added_keys, $inserted_mids );
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the filter short-circuits a meta key (line 238-240), the function continues to the next key without tracking it in $added_keys. However, if there are duplicate meta keys in $meta_fields, this could cause issues where $added_keys and the actual inserted keys get out of sync, leading to incorrect meta ID mapping in the returned array. Consider validating that all keys in $meta_fields are unique, or handle duplicate keys explicitly.

Copilot uses AI. Check for mistakes.
Comment thread src/wp-includes/meta.php Outdated
Comment on lines +1 to +227
<?php

/**
* @group meta
*
* @covers bulk_add_metadata
*/
class Tests_Meta_BulkAddMetadata extends WP_UnitTestCase {

/*
* It is important to test with both even and odd numbered slashes,
* as KSES does a strip-then-add slashes in some of its function calls.
*/

const SLASH_1 = 'String with 1 slash \\';
const SLASH_2 = 'String with 2 slashes \\\\';
const SLASH_3 = 'String with 3 slashes \\\\\\';
const SLASH_4 = 'String with 4 slashes \\\\\\\\';
const SLASH_5 = 'String with 5 slashes \\\\\\\\\\';
const SLASH_6 = 'String with 6 slashes \\\\\\\\\\\\';
const SLASH_7 = 'String with 7 slashes \\\\\\\\\\\\\\';

/**
* @ticket 59269
* @dataProvider data_meta_types
*/
public function test_all_meta_fields_should_be_added( string $meta_type ) {
global $wpdb;

// Create the object.
$object_id = self::factory()->{$meta_type}->create();

// Prepare the meta values to add in bulk.
$meta = array(
'key1' => '1',
'key2' => '2',
'key3' => '3',
);

// Track the mid before adding new meta.
$next_mid = self::get_autoincrement( $meta_type );

// Set up mock actions and filters to track calls.
$add_metadata_filter = new MockAction();
$add_meta_action = new MockAction();
$added_meta_action = new MockAction();
add_filter( "add_{$meta_type}_metadata", array( $add_metadata_filter, 'filter' ), 10, 5 );
add_action( "add_{$meta_type}_meta", array( $add_meta_action, 'action' ) );
add_action( "added_{$meta_type}_meta", array( $added_meta_action, 'action' ) );

// Bulk add the meta.
$result = bulk_add_metadata( $meta_type, $object_id, $meta );

// Read back the actual meta values.
$actual_vals = array(
'key1' => get_metadata( $meta_type, $object_id, 'key1', true ),
'key2' => get_metadata( $meta_type, $object_id, 'key2', true ),
'key3' => get_metadata( $meta_type, $object_id, 'key3', true ),
);

// Prepare expected meta IDs.
$expected_mids = array(
'key1' => ( $next_mid ),
'key2' => ( $next_mid + 1 ),
'key3' => ( $next_mid + 2 ),
);

$this->assertSame( $meta, $actual_vals, 'Actual meta values should match expected values.' );
$this->assertSame( $expected_mids, $result, 'Actual meta IDs should match expected meta IDs.' );
$this->assertSame( 3, $add_metadata_filter->get_call_count(), "'add_{$meta_type}_metadata' filter should be called the correct number of times." );
$this->assertSame( 3, $add_meta_action->get_call_count(), "'add_{$meta_type}_meta' action should be called the correct number of times." );
$this->assertSame( 3, $added_meta_action->get_call_count(), "'added_{$meta_type}_meta' action should be called the correct number of times." );
}

/**
* @ticket 59269
* @dataProvider data_meta_types
*/
public function test_correct_mids_should_be_returned_when_filter_is_in_place( string $meta_type ) {
global $wpdb;

// Set up a filter to modify the mid for 'key2'.
add_filter(
"add_{$meta_type}_metadata",
static function ( $check, $object_id, $meta_key ) {
return ( 'key2' === $meta_key ) ? 123456 : $check;
},
10,
3
);

// Create the object.
$object_id = self::factory()->{$meta_type}->create();

// Prepare the meta values to add in bulk.
$meta = array(
'key1' => '1',
'key2' => '2',
'key3' => '3',
);

// Track the mid before adding new meta.
$next_mid = self::get_autoincrement( $meta_type );

// Set up mock actions and filters to track calls.
$add_metadata_filter = new MockAction();
$add_meta_action = new MockAction();
$added_meta_action = new MockAction();
add_filter( "add_{$meta_type}_metadata", array( $add_metadata_filter, 'filter' ), 10, 5 );
add_action( "add_{$meta_type}_meta", array( $add_meta_action, 'action' ) );
add_action( "added_{$meta_type}_meta", array( $added_meta_action, 'action' ) );

// Bulk add the meta.
$result = bulk_add_metadata( $meta_type, $object_id, $meta );

// Prepare expected meta values.
$expected_vals = array(
'key1' => '1',
'key2' => '',
'key3' => '3',
);

// Read back the actual meta values.
$actual_vals = array(
'key1' => get_metadata( $meta_type, $object_id, 'key1', true ),
'key2' => get_metadata( $meta_type, $object_id, 'key2', true ),
'key3' => get_metadata( $meta_type, $object_id, 'key3', true ),
);

// Prepare expected meta IDs.
$expected_mids = array(
'key2' => 123456,
'key1' => ( $next_mid ),
'key3' => ( $next_mid + 1 ),
);

$this->assertSame( $expected_vals, $actual_vals, 'Actual meta values should match expected values.' );
$this->assertSame( $expected_mids, $result, 'Actual meta IDs should match expected meta IDs.' );
$this->assertSame( 3, $add_metadata_filter->get_call_count(), "'add_{$meta_type}_metadata' filter should be called the correct number of times." );
$this->assertSame( 2, $add_meta_action->get_call_count(), "'add_{$meta_type}_meta' action should be called the correct number of times." );
$this->assertSame( 2, $added_meta_action->get_call_count(), "'added_{$meta_type}_meta' action should be called the correct number of times." );
}

/**
* @ticket 59269
* @dataProvider data_meta_types
*/
public function test_slashed_data_should_be_handled_correctly( string $meta_type ) {
// Create the object.
$object_id = self::factory()->{$meta_type}->create();

// Prepare the meta values to add in bulk.
$meta = array(
'key1' => addslashes( self::SLASH_1 ),
'key2' => addslashes( self::SLASH_2 ),
'key3' => addslashes( self::SLASH_3 ),
'key4' => addslashes( self::SLASH_4 ),
'key5' => addslashes( self::SLASH_5 ),
'key6' => addslashes( self::SLASH_6 ),
'key7' => addslashes( self::SLASH_7 ),
);

// Bulk add the meta.
$result = bulk_add_metadata( $meta_type, $object_id, $meta );

// Read back the actual meta values.
$actual_vals = array(
'key1' => get_metadata( $meta_type, $object_id, 'key1', true ),
'key2' => get_metadata( $meta_type, $object_id, 'key2', true ),
'key3' => get_metadata( $meta_type, $object_id, 'key3', true ),
'key4' => get_metadata( $meta_type, $object_id, 'key4', true ),
'key5' => get_metadata( $meta_type, $object_id, 'key5', true ),
'key6' => get_metadata( $meta_type, $object_id, 'key6', true ),
'key7' => get_metadata( $meta_type, $object_id, 'key7', true ),
);

// Prepare expected meta values.
$expected_vals = array(
'key1' => self::SLASH_1,
'key2' => self::SLASH_2,
'key3' => self::SLASH_3,
'key4' => self::SLASH_4,
'key5' => self::SLASH_5,
'key6' => self::SLASH_6,
'key7' => self::SLASH_7,
);

$this->assertCount( 7, $result, 'The correct number of meta IDs should be returned.' );
$this->assertSame( $expected_vals, $actual_vals, 'Actual meta values should match expected values.' );
}

/**
* @return array<string,array<string>>
*/
public function data_meta_types(): array {
$types = array(
'post' => array( 'post' ),
'user' => array( 'user' ),
'comment' => array( 'comment' ),
'term' => array( 'term' ),
);

if ( is_multisite() ) {
$types['blog'] = array( 'blog' );
}

return $types;
}

private static function get_autoincrement( string $meta_type ): int {
global $wpdb;

$table = "{$meta_type}meta";

return (int) $wpdb->get_var(
$wpdb->prepare(
'
SELECT AUTO_INCREMENT
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = DATABASE()
AND TABLE_NAME = %s
',
$wpdb->$table
)
);
}
}
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests do not cover important edge cases such as: empty meta_fields array, invalid object_id (0 or negative), invalid meta_type, database insert failure, or behavior when multiple items have the same meta key. These edge cases should be tested to ensure the function handles them correctly and returns appropriate values.

Copilot uses AI. Check for mistakes.
Comment on lines +2578 to +2584
$values_sql = array();
$values = array(
$table,
);

foreach ( $rows as $row ) {
$data = $this->process_fields( $table, array_combine( $columns, $row ), $format );
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function does not validate that the $columns and $rows arrays are not empty, or that all rows have the same number of elements as the $columns array. If a row has fewer or more elements than columns, the array_combine() call on line 2584 will return false, which is then checked on line 2585. However, if $rows or $columns is empty, the function will generate an invalid SQL query. These edge cases should be validated and return false early.

Suggested change
$values_sql = array();
$values = array(
$table,
);
foreach ( $rows as $row ) {
$data = $this->process_fields( $table, array_combine( $columns, $row ), $format );
// Validate that we have at least one column and one row to insert.
if ( empty( $columns ) || empty( $rows ) ) {
return false;
}
$values_sql = array();
$values = array(
$table,
);
foreach ( $rows as $row ) {
// Each row must be an array with the same number of elements as the columns.
if ( ! is_array( $row ) || count( $row ) !== count( $columns ) ) {
return false;
}
$combined = array_combine( $columns, $row );
if ( false === $combined ) {
return false;
}
$data = $this->process_fields( $table, $combined, $format );

Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants