Skip to content

MDEV-39153 Fix sporadic main.change_master_default mismatch#5221

Open
FarihaIS wants to merge 1 commit into
MariaDB:12.3from
FarihaIS:mdev-39153
Open

MDEV-39153 Fix sporadic main.change_master_default mismatch#5221
FarihaIS wants to merge 1 commit into
MariaDB:12.3from
FarihaIS:mdev-39153

Conversation

@FarihaIS

@FarihaIS FarihaIS commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Description

The test main.change_master_default sporadically fails with slave_heartbeat_period showing 60.000 instead of 0.000. The test used restart_abort: commands in the expect file, which MTR never recognized - they fell through to the else branch, which deleted restart_opts and started the server with defaults.

Replace restart_abort with direct --exec $MYSQLD calls to properly test invalid heartbeat options

Release Notes

N/A

How can this PR be tested?

Execute the main.change_master_default test in mysql-test-run.

Before this change:

There were sporadic failures like so:

@@ -209,9 +209,9 @@
 SELECT connection_name, slave_heartbeat_period
 FROM information_schema.slave_status ORDER BY connection_name;
 connection_name	slave_heartbeat_period
-	0.000
-defaulted	0.000
-unset	0.000
+	60.000
+defaulted	60.000
+unset	60.000
 CALL mtr.add_suppression('.*master-heartbeat-period.+0.*disabl.+');
 FOUND 1 /\[Warning\] .*master-heartbeat-period.+0.*disabl.+/ in mysqld.1.err
 # restart: --skip-slave-start --skip-master-ssl --master-ssl --skip-master-ssl-verify-server-cert --master-ssl-verify-server-cert --master-use-gtid=NO --autoset-master-use-gtid --master-heartbeat-period=45 --autoset-master-heartbeat-period
Result length mismatch

After this change:

Running this test 50 times consecutively with --repeat=50 in mysql-test-run shows no failures:

==============================================================================

TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[01] Using MTR_BUILD_THREAD 300, with reserved ports 19000..19029
worker[01] mysql-test-run: WARNING: running this script as _root_ will cause some tests to be skipped
main.change_master_default               [ pass ]   4428
main.change_master_default               [ 2 pass ]   4370
main.change_master_default               [ 3 pass ]   4412
main.change_master_default               [ 4 pass ]   4506
main.change_master_default               [ 5 pass ]   4413
...
main.change_master_default               [ 50 pass ]   4425
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 221.983 of 248 seconds executing testcases

Completed: All 50 tests were successful.

Basing the PR against the correct MariaDB version

  • This is a bug fix, and the PR is based against the branch 12.3.

Copyright

All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for a new 'restart_abort:' command in 'mariadb-test-run.pl' to handle cases where a server is expected to abort during restart. However, the regular expression used to parse this command is too restrictive, as it requires a colon and at least one option character to match. If the command is written without options, it will fail to match and fall back to a normal restart, potentially causing a race condition. A more robust regex and fallback handling have been suggested to resolve this issue.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread mysql-test/mariadb-test-run.pl Outdated
@FarihaIS FarihaIS marked this pull request as ready for review June 11, 2026 19:26
@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Jun 12, 2026
@gkodinov gkodinov self-assigned this Jun 12, 2026

@gkodinov gkodinov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution! This is a preliminary review.

LGTM. Please stand by for the final reviewer.

@gkodinov gkodinov requested a review from sanja-byelkin June 12, 2026 11:46
@gkodinov gkodinov assigned sanja-byelkin and unassigned gkodinov Jun 12, 2026
@vuvova

vuvova commented Jun 12, 2026

Copy link
Copy Markdown
Member

first, I think, mtr should not ignore unknown commands, but fail on them.
then, restart_bindir isn't used anywhere as far as I can see, this dead code can be removed.
then, restart_abort was first ever used in MDEV-28302, before that it was never used. And I don't see why it was used, we have tons of tests where a server fails on startup and they don't use restart_abort.

So personally I don't understand why restart_abort is needed and at my current level of understanding the fix would be not to use it in the test. Oh, and make mtr to fail on unknown commands for this to never happen again.

@FarihaIS

Copy link
Copy Markdown
Contributor Author

@vuvova thank you for your feedback, agreed! Here are the changes I've made to the PR:

  • Replace restart_abort with direct --exec $MYSQLD calls (matching bad_startup_options.test pattern)
  • Remove restart_bindir dead code from both MTR and start_mysqld.inc
  • Make MTR mtr_error() on unknown expect file commands instead of silently ignoring them

Could you please let me know if this is what you meant? Thank you

@FarihaIS FarihaIS marked this pull request as draft June 12, 2026 17:38
@FarihaIS FarihaIS force-pushed the mdev-39153 branch 2 times, most recently from bb6ae60 to c3c1204 Compare June 12, 2026 18:26

@ParadoxV5 ParadoxV5 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Remove restart_bindir dead code from both MTR and start_mysqld.inc
  • Make MTR mtr_error() on unknown expect file commands instead of silently ignoring them

Could you please let me know if this is what you meant? Thank you

Thank you, but these two modifications are tangent to the MDEV-39153 fix, so please extract them to a separate PR or two.

@FarihaIS FarihaIS force-pushed the mdev-39153 branch 2 times, most recently from 660ba01 to f2076c1 Compare June 12, 2026 19:34
The test used "restart_abort:" in the expect file, which MTR never
recognized. It fell through to the else branch, deleted restart_opts,
and started the server with defaults (heartbeat_period=60 instead of 0).

Replace restart_abort with direct --exec $MYSQLD calls.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
@FarihaIS

Copy link
Copy Markdown
Contributor Author

Thank you, but these two modifications are tangent to the MDEV-39153 fix, so please extract them to a separate PR or two.

@ParadoxV5 sounds good, I extracted the unrelated changes requested from this Jira, created a new Jira for this (https://jira.mariadb.org/browse/MDEV-40023), and moved these changes to a separate PR (#5229).

Is this workflow fine?

@FarihaIS FarihaIS marked this pull request as ready for review June 12, 2026 23:21
@FarihaIS FarihaIS requested a review from ParadoxV5 June 13, 2026 00:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

5 participants