MDEV-39061 mariadb-backup compatible wrappers for BACKUP SERVER#5140
MDEV-39061 mariadb-backup compatible wrappers for BACKUP SERVER#5140Thirunarayanan wants to merge 15 commits into
Conversation
|
|
There was a problem hiding this comment.
Code Review
This pull request introduces server-side backup support (BACKUP SERVER) for MariaDB, implementing backup and log archiving mechanisms for both the InnoDB and Aria storage engines. It also includes a compatibility wrapper script (mariabackup.sh) to map legacy mariabackup CLI commands to the new server-side SQL interface. The code review identified several critical issues, including a potential server crash in backup_innodb.cc due to invalid format arguments in error reporting, null pointer dereferences and assertion failures in both engines when backup steps are executed out of order or fail to initialize, and security vulnerabilities (command injection) and argument parsing bugs in the shell wrapper script.
| { | ||
| fail: | ||
| fail= 1; | ||
| my_error(ER_ERROR_ON_RENAME, MYF(ME_ERROR_LOG), s, d, errno); |
There was a problem hiding this comment.
Passing s and d (which are int file descriptors on Unix) to my_error with the ER_ERROR_ON_RENAME format specifier will cause a segmentation fault/crash because ER_ERROR_ON_RENAME expects string arguments (const char*). They should be replaced with the actual file names.
| my_error(ER_ERROR_ON_RENAME, MYF(ME_ERROR_LOG), s, d, errno); | |
| my_error(ER_ERROR_ON_RENAME, MYF(ME_ERROR_LOG), "ib_logfile101", dst.c_str(), errno); |
| int aria_backup_end(THD *thd, bool abort) noexcept | ||
| { | ||
| int ret_val= aria_backup->end(thd, abort); | ||
| aria_backup.reset(); | ||
| return ret_val; | ||
| } |
There was a problem hiding this comment.
If aria_backup_start was never called (e.g., because another plugin's backup_start failed earlier in plugin_foreach_with_mask), aria_backup will be nullptr. When plugin_foreach_with_mask subsequently calls backup_end for all plugins, aria_backup_end will dereference aria_backup without a null check, causing a server crash. A null check should be added.
int aria_backup_end(THD *thd, bool abort) noexcept
{
if (!aria_backup)
return 0;
int ret_val= aria_backup->end(thd, abort);
aria_backup.reset();
return ret_val;
}| int end(THD *thd, bool abort) noexcept | ||
| { | ||
| int fail= 0; | ||
| log_sys.latch.wr_lock(); |
There was a problem hiding this comment.
If backup_start was never called or failed, max_first_lsn is 0. In this case, calling InnoDB_backup::end will trigger assertions like ut_ad(log_sys.archive) which can fail and crash the debug server. We should return early if max_first_lsn is 0.
int end(THD *thd, bool abort) noexcept
{
int fail= 0;
log_sys.latch.wr_lock();
if (!max_first_lsn)
{
log_sys.latch.wr_unlock();
return 0;
}| if (!max_first_lsn) | ||
| { | ||
| ut_ad(queue.empty()); | ||
| if (old_size) | ||
| delete_logs(); | ||
| logs.clear(); | ||
| } | ||
| log_sys.latch.wr_unlock(); |
There was a problem hiding this comment.
If backup_start was never called or failed, max_first_lsn is 0. In this case, calling InnoDB_backup::fini will check the transaction state and trigger the assertion ut_ad("invalid state" == 0) because the transaction state is not TRX_STATE_BACKUP. We should return early if max_first_lsn is 0.
| if (!max_first_lsn) | |
| { | |
| ut_ad(queue.empty()); | |
| if (old_size) | |
| delete_logs(); | |
| logs.clear(); | |
| } | |
| log_sys.latch.wr_unlock(); | |
| if (!max_first_lsn) | |
| { | |
| ut_ad(queue.empty()); | |
| if (old_size) | |
| delete_logs(); | |
| logs.clear(); | |
| log_sys.latch.wr_unlock(); | |
| return 0; | |
| } | |
| log_sys.latch.wr_unlock(); |
| if [[ -n "$FINAL_INCLUDE" ]]; then | ||
| echo "Setting backup_include='$FINAL_INCLUDE'" >&2 | ||
| mariadb $MARIADB_OPTS -e "SET GLOBAL backup_include='$FINAL_INCLUDE'" | ||
| fi | ||
|
|
||
| if [[ -n "$FINAL_EXCLUDE" ]]; then | ||
| echo "Setting backup_exclude='$FINAL_EXCLUDE'" >&2 | ||
| mariadb $MARIADB_OPTS -e "SET GLOBAL backup_exclude='$FINAL_EXCLUDE'" | ||
| fi | ||
|
|
||
| # Execute BACKUP SERVER | ||
| SQL="BACKUP SERVER TO '$TARGET_DIR'" | ||
| echo "Executing: $SQL" >&2 | ||
| mariadb $MARIADB_OPTS -e "$SQL" |
There was a problem hiding this comment.
The parameters FINAL_INCLUDE, FINAL_EXCLUDE, and TARGET_DIR are interpolated into shell commands. To effectively prevent command injection, we should validate these inputs using a strict whitelist of allowed characters (such as alphanumeric characters and a limited set of safe symbols like '-', '_', '.') instead of relying on a blacklist or simple escaping.
| if [[ -n "$FINAL_INCLUDE" ]]; then | |
| echo "Setting backup_include='$FINAL_INCLUDE'" >&2 | |
| mariadb $MARIADB_OPTS -e "SET GLOBAL backup_include='$FINAL_INCLUDE'" | |
| fi | |
| if [[ -n "$FINAL_EXCLUDE" ]]; then | |
| echo "Setting backup_exclude='$FINAL_EXCLUDE'" >&2 | |
| mariadb $MARIADB_OPTS -e "SET GLOBAL backup_exclude='$FINAL_EXCLUDE'" | |
| fi | |
| # Execute BACKUP SERVER | |
| SQL="BACKUP SERVER TO '$TARGET_DIR'" | |
| echo "Executing: $SQL" >&2 | |
| mariadb $MARIADB_OPTS -e "$SQL" | |
| if [[ -n "$FINAL_INCLUDE" && ! "$FINAL_INCLUDE" =~ ^[a-zA-Z0-9_.-]+$ ]]; then | |
| echo "Error: Invalid characters in FINAL_INCLUDE" >&2 | |
| exit 1 | |
| fi |
References
- When validating input that will be interpolated into shell commands, use a strict whitelist of allowed characters (e.g., alphanumeric and a limited set of safe symbols like '-', '_', '.') instead of a blacklist to effectively prevent command injection.
| TABLES_PATTERN="" | ||
| TABLES_EXCLUDE_PATTERN="" | ||
| TABLES_FILE="" | ||
| MARIADB_OPTS="" |
| --user=*|--password=*|--host=*|--port=*|--socket=*) | ||
| # Connection options (long form) - pass through to mariadb client | ||
| MARIADB_OPTS="$MARIADB_OPTS $1" | ||
| shift | ||
| ;; | ||
| -u|-p|-h|-P|-S) | ||
| # Short form. If the next argv looks like a value (not another | ||
| # flag), consume it as well — supports both `-u USER` and bare | ||
| # `-p` (which mariadb client treats as a password prompt). | ||
| if [[ -n "${2-}" && "$2" != -* ]]; then | ||
| MARIADB_OPTS="$MARIADB_OPTS $1 $2" | ||
| shift 2 | ||
| else | ||
| MARIADB_OPTS="$MARIADB_OPTS $1" | ||
| shift | ||
| fi | ||
| ;; | ||
| -u*|-p*|-h*|-P*|-S*) | ||
| # Bundled short form: -uUSER, -pSECRET, -hHOST, ... | ||
| MARIADB_OPTS="$MARIADB_OPTS $1" | ||
| shift | ||
| ;; | ||
| --defaults-file=*|--defaults-extra-file=*) | ||
| # Config file options - pass through to mariadb client | ||
| MARIADB_OPTS="$MARIADB_OPTS $1" | ||
| shift | ||
| ;; |
There was a problem hiding this comment.
We should append connection options to the MARIADB_OPTS array instead of concatenating them into a flat string to correctly handle arguments with spaces or special characters.
--user=*|--password=*|--host=*|--port=*|--socket=*)
# Connection options (long form) - pass through to mariadb client
MARIADB_OPTS+=("$1")
shift
;;
-u|-p|-h|-P|-S)
# Short form. If the next argv looks like a value (not another
# flag), consume it as well — supports both `-u USER` and bare
# `-p` (which mariadb client treats as a password prompt).
if [[ -n "${2-}" && "$2" != -* ]]; then
MARIADB_OPTS+=("$1" "$2")
shift 2
else
MARIADB_OPTS+=("$1")
shift
fi
;;
-u*|-p*|-h*|-P*|-S*)
# Bundled short form: -uUSER, -pSECRET, -hHOST, ...
MARIADB_OPTS+=("$1")
shift
;;
--defaults-file=*|--defaults-extra-file=*)
# Config file options - pass through to mariadb client
MARIADB_OPTS+=("$1")
shift
;;| } | ||
|
|
||
| #ifndef _WIN32 | ||
| int dir= open(target.str, O_DIRECTORY); |
| explicit Aria_backup(THD *thd, Target target) noexcept | ||
| : target(target) | ||
| #ifndef _WIN32 | ||
| , datadir_fd(open(maria_data_root, O_DIRECTORY)) |
|
Hey @Thirunarayanan , have you thought of how this is going to end up in packaging? It is replacing the original mariadb-backup? It needs some INSTALL/INSTALL_SCRIPT cmake directives around this to give it a install location and a cmake component. debian installation/packaging would need the relevant debian/{package}.install to include te script. |
8d0dc79 to
051f681
Compare
dr-m
left a comment
There was a problem hiding this comment.
Please rebase this on the current branch.
| @@ -0,0 +1,82 @@ | |||
| --source include/have_mariabackup_wrapper.inc | |||
There was a problem hiding this comment.
Can this be implemented in a way that is compatible with the .combinations logic?
I would like most of --suite=mariabackup to be run both with the genuine mariadb-backup and with the BACKUP SERVER based wrapper scripts. Currently, this is the only test that exercises the wrappers, and it fails to test the original executables to demonstrate compatibility between them and the wrappers.
| --echo # | ||
| --echo # --stream=mbstream emits a valid tar archive to stdout | ||
| --echo # | ||
| --let $targetdir=$MYSQLTEST_VARDIR/tmp/bk_stream | ||
| --let $streamfile=$MYSQLTEST_VARDIR/tmp/bk_stream.tar | ||
| --exec $XTRABACKUP $defaults --backup --target-dir=$targetdir --stream=mbstream > $streamfile 2>$logfile | ||
| --exec tar -tf $streamfile > /dev/null | ||
| --let SEARCH_PATTERN=Creating tar stream | ||
| --source include/search_pattern_in_file.inc | ||
| --rmdir $targetdir | ||
| --remove_file $streamfile |
There was a problem hiding this comment.
I’m not sure if this should be in the current scope. We don’t have any MDEV-38362 streaming backup yet. What should be more in the scope is demonstrating that the mbstream wrapper script is syntactically compatible with its namesake utility.
| @@ -0,0 +1,19 @@ | |||
| #!/bin/bash | |||
There was a problem hiding this comment.
Can we write this for the POSIX shell? Bourne Again Shell (bash) is not available in all environments, for good reasons: reduced resource usage as well as attack surface; remember https://en.wikipedia.org/wiki/Shellshock_(software_bug)
| -p|--parallel) | ||
| SKIP_NEXT=1 | ||
| ;; | ||
| -p*) | ||
| ;; | ||
| --parallel=*) | ||
| ;; |
There was a problem hiding this comment.
These are necessary but not sufficient rules. mbstream -xp4 would be a valid invocation of the original utility (equivalent to -x -p4), but not of GNU tar.
I did test that mbstream indeed treats anything after -p as a single argument. For example, -p4t and -p4x would report the following, respectively:
Warning: option 'parallel': signed value 4398046511104 adjusted to 2147483647
Unknown suffix 'x' used for variable 'parallel' (value '4x'). Legal suffix characters are: K, M, G, T, P, E
mbstream: Error while setting value '4x' to 'parallel'
Please add a test file that covers both the real mbstream and this wrapper. Both with some invalid and valid invocation. For example, mbstream -t is not allowed, while tar -t is.
051f681 to
6e37d3a
Compare
| --parallel=*|--throttle=*|--no-lock|--safe-slave-backup) | ||
| # Handled server-side by BACKUP SERVER. | ||
| shift ;; |
There was a problem hiding this comment.
The --parallel option will have to be passed through to non-streaming backup.
8077134 in #4817 implements multi-threaded backup to a mounted file system. By default, only 1 thread will be used.
For streaming backup (which is not implemented yet except as stubs), I think that we must limit the backup to a single thread, because unlike the old xbstream or mbstream format, BACKUP SERVER will generate one stream per thread. So, the only way to end up with one backup stream is to use a single thread.
6e37d3a to
3427e04
Compare
This introduces a basic driver Sql_cmd_backup, storage engine interfaces, and basic copying of InnoDB data files. TODO: Complete the implementation of streaming backup in GNU tar --format=oldgnu backup_target: A structured data type to represent a target directory. On Microsoft Windows, we must use directory paths because there is no variant of CopyFileEx() that would work on file handles. backup_sink: Wraps a per-thread output stream as well as storage engine specific context. handlerton::backup_start(), handlerton::backup_end(): Invoked at the start or end of a backup phase, in the thread that executes a BACKUP SERVER statement. handlerton::backup_step(): A backup step that can be invoked from multiple threads concurrently, between the execution of the corresponding handlerton::backup_start() and handlerton::backup_end() of the same phase. copy_entire_file(): A file copying service for POSIX systems. copy_file(): A sparse file-copying service for all systems. InnoDB_backup::context: Backup context, attached to backup_sink so that context can continue to exist between the time a BACKUP SERVER releases all locks and another BACKUP SERVER starts executing, with innodb_backup pointing to the new backup, while the old backup is still being finished. fil_space_t::write_or_backup: Keep track of in-flight page writes and pending backup operation. We must not allow them concurrently, because that could lead into torn pages in the backup. fil_space_t::backup_end: The first page number that is not being backed up (by default 0, to indicate that no backup is in progress). fil_space_t::BACKUP_BATCH_SIZE: The number of preceding pages that will be covered by fil_space_t::backup_end. This is the unit of "page range locking" during InnoDB backup. log_sys.backup: Whether BACKUP SERVER is in progress. The purpose of this is to make BACKUP SERVER prevent the concurrent execution of SET GLOBAL innodb_log_archive=OFF or SET GLOBAL innodb_log_file_size when innodb_log_archive=OFF. log_sys.archived_checkpoint: Keep track of the earliest available checkpoint, corresponding to log_sys.archived_lsn. This reflects SET GLOBAL innodb_log_recovery_start (which is settable now), for incremental backup. buf_flush_list_space(): Check for concurrent backup before writing each page. This is inefficient, but this function may be invoked from multiple threads concurrently, and it cannot be changed easily, especially for fil_crypt_thread().
551dd2b to
b1a28e9
Compare
TODO: Test with actual scripts and extract with tar
scripts/mariabackup/mariabackup.sh: a drop-in, mariadb-backup compatible POSIX sh wrapper that drives the server-side BACKUP SERVER command, so existing mariadb-backup invocations keep working without changing user scripts. --backup translates into "BACKUP SERVER TO '<dir>'" over the mariadb client, forwarding connection options; --parallel=N becomes the "<N> CONCURRENT" clause. After a successful backup the wrapper writes backup-prepare.cnf into the target directory, recording the running server's mariadbd path and InnoDB layout (page size, data file path, undo tablespaces, checksum algorithm, log file size). When the server is encrypted it also records how to reload the key-management plugin (plugin name, key-file path, algorithm). --prepare runs mariadbd --bootstrap on the backup with backup-prepare.cnf as its defaults file, replays the archived redo over the [recovery_start, recovery_target] window read from backup.cnf, then replaces the archived log with a freshly built circular ib_logfile0 so an ordinary server can start on the directory. --copy-back / --move-back place a prepared backup into the datadir with cp / mv, create the datadir if missing, refuse a non-empty datadir unless --force-non-empty-directories is given, and print the post-restore chown reminder. wrapper with these options are not supported yet --incremental backup/prepare, --apply-log-only, --rollback-xa, partial backup (--databases/--tables, which would need server-side backup_include/backup_exclude), and output streaming/compression/encryption. --export is accepted but warns and runs a plain recovery. extra/mariabackup/CMakeLists.txt: install the wrapper as mariadb-backup-server behind a new WITH_MARIABACKUP_WRAPPER option (OFF by default) scripts/mariabackup/README.md documents the modes, the unsupported options, and the backup.cnf / backup-prepare.cnf formats. include/have_mariabackup_wrapper.inc redirects $XTRABACKUP to the wrapper so a test opts in by sourcing one file, skipping when the wrapper, sh, or the mariadb client is unavailable. include/have_mariabackup_combination.inc runs a test under both the [OLD] mariadb-backup binary and the [NEW] wrapper
Fill in the anomaly shutdown paths of InnoDB to include call to buf_mem_pressure_shutdown now that MDEV-39585 provides some proper calls shutdown InnoDB and other plugins during the shutdown under --bootstrap. Alternate: destructor attribute on buf_mem_pressure_shutdown would acheive the same thing and given Linux compilers are capabile of this. This is possible as mem_pressure is currently implemented in Linux onny.
Don't perform mysqld_win_initiate_shutdown under --bootstrap when triggered by SHUTDOWN. With this we don't perform any service interactions. Then the shutdown can proceeded without then hard process termination in mysqld_win_initiate_shutdown. This previously occurred because the handle_connections_win() was never called in --bootstrap and therefore startup_complete() was false. Thanks Vladislav Vaintroub for investigation and providing implementation guidance.
mariadbd under --bootstrap failed to preform plugin deinitialization. The sleep(2);exit is removed and replaced to a goto termination label to perform the same shutdown procedure of the server after all the connection closing. To prevent a compile error about char *user being uninitialized this sql_print_information(ER_DEFAULT(ER_NORMAL_SHUTDOWN)) is moved to its own block. The memory free did need to occur in the bootstrap mode too to avoid memory leak errors. wait_for_signal_thread_to_end(), was previously in close_connections() however its required too for --bootstrap.
9ce9751 to
1fcb44e
Compare
dr-m
left a comment
There was a problem hiding this comment.
Now that #4817 includes support for streaming (see the test backup.backup_stream, which passes on Linux, FreeBSD, Microsoft Windows and Apple macOS), it would be good to add support for streaming backup.
When I tested the merge of this to the current #4817, I got one test failure:
CURRENT_TEST: mariabackup.wrapper_basic
NOT FOUND /Executing: BACKUP SERVER TO/ in wrapper.log
mysqltest: In included file "./include/search_pattern_in_file.inc":
included from /mariadb/main/mysql-test/suite/mariabackup/wrapper_basic.test at line 19:
At line 65: command "perl" failed with error: 255 my_errno: 0 errno: 0
I couldn’t figure this out. At the very end of scripts/mariabackup/mariabackup.sh we are writing a configuration file, and that is what happens. I even added set -eux or set -eu to the start of the script, and it did not complain about any uninitialized variables. I think that set -eu would be a good safety measure.
Note: I am going to squash #4817 shortly.
| if ($errno) | ||
| { | ||
| --skip mariabackup.sh wrapper unavailable (script or sh missing) | ||
| } |
There was a problem hiding this comment.
/bin/sh can’t possibly be missing on a POSIX system; it is what system(3), popen(3) and friends will invoke.
| # | ||
| # $XTRABACKUP — now points at mariabackup.sh | ||
|
|
||
| --source include/linux.inc |
There was a problem hiding this comment.
I think we only want to exclude non-POSIX environments, more specifically, Microsoft Windows. For that, I would suggest the following:
--source include/not_windows.incor if it cannot be used for some reason:
if ($MARIADB_UPGRADE_EXE) {
--skip Need POSIX
}
87036f8 to
4769a43
Compare
scripts/mariabackup/mariabackup.sh: a drop-in wrapper that
lets existing mariabackup --backup invocations drive the server-side
BACKUP SERVER command without changing user scripts.