
MDEV-34431: Avoid spin loops on page I/O waits #4168


Draft: wants to merge 2 commits into base 10.11 from the MDEV-34431 branch

Conversation


@dr-m (Contributor) commented on Jul 4, 2025

  • The Jira issue number for this PR is: MDEV-34431

Description

Recent performance tests suggest that spin loops on index_lock and block_lock hurt performance, especially when progress is blocked by an I/O wait. The simplest case is that the waited-for block is io-fixed (being read into the buffer pool or written back to data files). A more complex case is that a thread that is waiting for I/O on another block, or on the log file, holds a conflicting latch on an index tree or a buffer page.

Disabling the spin loops altogether for index_lock and block_lock seems to yield the best overall result.

The best scalability was observed with thread_handling=pool-of-threads, which defaults to creating my_getncpus() connection handler threads.

Note: For latches that are not held while waiting for any I/O, such as trx_t::mutex, spin loops may still be helpful. In fact, we tried disabling all spin loops, and performance got worse.
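To make the distinction concrete, here is a generic spin-then-sleep acquisition in the style that remains useful for short-held latches; this is an illustrative sketch in portable C++20, not the InnoDB implementation, and spin_then_sleep_lock()/spin_then_sleep_unlock() are hypothetical names:

#include <atomic>

// Illustration only: a latch whose holder never sleeps inside the
// critical section can often be acquired during a brief spin,
// avoiding a kernel round trip.
inline void spin_then_sleep_lock(std::atomic_flag &m, unsigned rounds)
{
  for (unsigned spin= rounds; spin; spin--)
    if (!m.test_and_set(std::memory_order_acquire))
      return;                     // acquired on the spinning fast path
  while (m.test_and_set(std::memory_order_acquire))
    m.wait(true);                 // C++20 futex-style sleep until cleared
}

inline void spin_then_sleep_unlock(std::atomic_flag &m)
{
  m.clear(std::memory_order_release);
  m.notify_one();                 // wake one sleeping waiter, if any
}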

On Microsoft Windows, we will ignore the parameter innodb_sync_spin_loops and rely on the built-in user space spin loop logic in SRWLOCK and WaitOnAddress().
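For illustration, a minimal Windows-only sketch of the futex-style primitive involved; this is not MariaDB code, and wait_while_equal()/wake_all() are hypothetical helpers:

#ifdef _WIN32
# include <windows.h>
// Link with Synchronization.lib for WaitOnAddress()/WakeByAddressAll().

// Block until *word no longer equals old. WaitOnAddress() performs some
// user-space spinning of its own before sleeping in the kernel, which is
// why an extra caller-side spin loop would be redundant on Windows.
void wait_while_equal(volatile LONG *word, LONG old)
{
  while (*word == old)
    WaitOnAddress(word, &old, sizeof old, INFINITE);
}

// Wake all threads that are sleeping on word.
void wake_all(volatile LONG *word)
{
  WakeByAddressAll((PVOID) word);
}
#endif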

ssux_lock_impl<false>::rd_lock(): Jump straight to rd_wait() without one step of spinning.

ssux_lock_impl<false>::rd_wait(): Invoke rd_lock_try() before entering the wait.
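A minimal sketch of these two changes, with member details simplified; block_until_readable() is a hypothetical stand-in for the futex/WaitOnAddress() based sleep:

template<bool spinloop>
struct ssux_lock_impl
{
  bool rd_lock_try() noexcept;           // one atomic attempt
  void block_until_readable() noexcept;  // hypothetical kernel wait

  // With spinloop=false, rd_lock() jumps straight to rd_wait():
  void rd_lock() noexcept { if (!spinloop || !rd_lock_try()) rd_wait(); }

  // rd_wait() now invokes rd_lock_try() before each wait:
  void rd_wait() noexcept
  {
    while (!rd_lock_try())
      block_until_readable();
  }
};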

Release Notes

Spin loops on waits for InnoDB index tree or buffer page latches have been disabled, because they are often a waste of CPU when running I/O bound workloads.

On Microsoft Windows, the parameter innodb_sync_spin_loops will be ignored.

How can this PR be tested?

Run Sysbench or HammerDB on a workload that is larger than the buffer pool. The impact should be most prominent when there are synchronous waits for pages in the buffer pool.

Measuring the CPU usage while running the test case in MDEV-32067 should show some impact.

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@dr-m dr-m self-assigned this Jul 4, 2025
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@dr-m dr-m force-pushed the MDEV-34431 branch 2 times, most recently from d2f77d6 to acfc1d6 Compare July 8, 2025 09:44

dr-m commented Jul 8, 2025

I filed #4172 for a 11.4 version of this, comprising this change as well as a second change, which I originally developed and tested on 10.6 as d2f77d6. I think that only the first change has an acceptable level of risk for the 10.6 release. The 11.4 version of the latter change is simpler, thanks to some earlier code removal and cleanup.

@dr-m dr-m force-pushed the MDEV-34431 branch 3 times, most recently from 773b5e6 to 2567411 Compare July 14, 2025 07:54
@dr-m dr-m changed the base branch from 10.6 to 10.11 July 17, 2025 12:02
- void rd_lock() noexcept { if (!rd_lock_try()) rd_wait(); }
+ void rd_lock() noexcept { if (!spinloop || !rd_lock_try()) rd_wait(); }
@dr-m (Contributor, Author) commented:

Note: On an Intel Xeon system where I tested, this change (as well as the corresponding adjustment to rd_wait()) shrank the CMAKE_BUILD_TYPE=RelWithDebInfo sql/mariadbd executable size by about 30 KiB. The rd_lock_try() would translate into a loop around lock cmpxchg. I did not observe a significant performance difference; this patch could slightly improve performance.
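For context, a hedged sketch of why rd_lock_try() compiles to a loop around lock cmpxchg on x86; the lock-word layout below is an assumption for illustration, not the actual InnoDB definition:

#include <atomic>
#include <cstdint>

// Assumed layout: the most significant bit marks an exclusive holder,
// the remaining bits count shared holders.
static constexpr uint32_t WRITER= 1U << 31;

bool rd_lock_try(std::atomic<uint32_t> &lk) noexcept
{
  uint32_t l= lk.load(std::memory_order_relaxed);
  do
    if (l & WRITER)
      return false;               // conflict: give up without spinning
  while (!lk.compare_exchange_weak(l, l + 1,  // x86: lock cmpxchg
                                   std::memory_order_acquire,
                                   std::memory_order_relaxed));
  return true;
}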

The Microsoft Windows implementations of SRWLOCK and WaitOnAddress()
include some spin loop logic before entering the operating system kernel.
Let us avoid duplicating some of that spin loop logic.

Thanks to Vladislav Vaintroub for this fix.
@dr-m dr-m marked this pull request as draft July 21, 2025 06:41
While waiting for I/O completion, let us skip spin loops.
Even on fast storage, reading a page into the buffer pool takes
so long that a spin loop would only end up wasting CPU time.

block_lock::s_lock_nospin(): A no-spin variant of acquiring a shared
buffer page latch. Regular s_lock() always involves a spin loop.

ssux_lock_impl::rd_lock_spin(), ssux_lock_impl::rd_lock_nospin():
Split from rd_wait().

ssux_lock_impl::rd_lock(): Invoke either rd_lock_nospin() or
rd_lock_try() and rd_lock_spin().

buf_page_get_low(): After acquiring a page latch on an io-fixed block,
try to optimize operations on the page latch.
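A hedged sketch of the split described in this commit message; srw_pause_delay(), srw_pause() and srv_n_spin_wait_rounds appear in the actual code (see the snippet in a later comment), while block_until_readable() is a hypothetical stand-in for the sleeping wait:

// Bounded spinning, then fall through to the sleeping wait.
void rd_lock_spin() noexcept
{
  const unsigned delay= srw_pause_delay();
  for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
  {
    srw_pause(delay);
    if (rd_lock_try())
      return;
  }
  rd_lock_nospin();
}

// No spinning at all: one atomic attempt per wakeup. A waiter should
// use this when the block is io-fixed, because the I/O will outlast
// any reasonable spin loop.
void rd_lock_nospin() noexcept
{
  while (!rd_lock_try())
    block_until_readable();       // hypothetical futex-style wait
}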

dr-m commented Jul 21, 2025

I suspect that the reason for the timeouts on AppVeyor is 9d34322, which @vaintroub suggested. We seem to need a more sophisticated fix to disable the additional spin loop layer on Microsoft Windows. For instance, the following template specializations would need to skip the for loops altogether on _WIN32 if AcquireSRWLockShared() and AcquireSRWLockExclusive() include spin loop logic of their own:

#if defined _WIN32 || defined SUX_LOCK_GENERIC
template<> void srw_lock_<true>::rd_wait() noexcept
{
  const unsigned delay= srw_pause_delay();

  for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
  {
    srw_pause(delay);
    if (rd_lock_try())
      return;
  }

  IF_WIN(AcquireSRWLockShared(&lk), rw_rdlock(&lk));
}

template<> void srw_lock_<true>::wr_wait() noexcept
{
  const unsigned delay= srw_pause_delay();

  for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
  {
    srw_pause(delay);
    if (wr_lock_try())
      return;
  }

  IF_WIN(AcquireSRWLockExclusive(&lk), rw_wrlock(&lk));
}
#endif
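One way to express that skip, sketched here under the stated assumption that the SRWLOCK implementation spins internally (wr_wait() would be analogous):

template<> void srw_lock_<true>::rd_wait() noexcept
{
#ifndef _WIN32
  // Keep the caller-side spin loop only where the underlying primitive
  // does not spin on its own.
  const unsigned delay= srw_pause_delay();
  for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
  {
    srw_pause(delay);
    if (rd_lock_try())
      return;
  }
#endif
  IF_WIN(AcquireSRWLockShared(&lk), rw_rdlock(&lk));
}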


dr-m commented Jul 21, 2025

https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435045 and https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435868 (for the revert 87e2123) prove that @vaintroub’s Windows-specific patch indeed has a problem. Several tests were timing out (240 seconds) when the patch was present.

@vaintroub (Member) commented:

> https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435045 and https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435868 (for the revert 87e2123) prove that @vaintroub’s Windows-specific patch indeed has a problem. Several tests were timing out (240 seconds) when the patch was present.

There is a hidden endless loop in the current code, in srw_mutex_impl::wait_and_lock(); that is the reason.
