
MDEV-34431: Avoid spin loops on page I/O waits #4168


Draft: wants to merge 2 commits into base 10.11 from the MDEV-34431 branch

Conversation


@dr-m (Contributor) commented on Jul 4, 2025

  • The Jira issue number for this PR is: MDEV-34431

Description

Recent performance tests suggest that spin loops on index_lock and block_lock hurt performance, especially when progress is blocked by an I/O wait. The simplest case is that the waited-for block is io-fixed (being read into the buffer pool or written back to data files). A more complex case is that a thread that is waiting for I/O on another block, or on the log file, holds a conflicting latch on an index tree or a buffer page.

Disabling the spin loops altogether for index_lock and block_lock seems to yield the best overall result.

The best scalability was observed with thread_handling=pool-of-threads, which defaults to creating my_getncpus() connection handler threads.

Note: For latches that are not held while waiting for any I/O, such as trx_t::mutex, spin loops may still be helpful. In fact, we tried disabling all spin loops, and performance got worse.
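To make the distinction concrete, here is a generic spin-then-sleep acquisition in the style that remains useful for short-held latches; this is an illustrative sketch in portable C++20, not the InnoDB implementation, and spin_then_sleep_lock()/spin_then_sleep_unlock() are hypothetical names:

#include <atomic>

// Illustration only: a latch whose holder never sleeps inside the
// critical section can often be acquired during a brief spin,
// avoiding a kernel round trip.
inline void spin_then_sleep_lock(std::atomic_flag &m, unsigned rounds)
{
  for (unsigned spin= rounds; spin; spin--)
    if (!m.test_and_set(std::memory_order_acquire))
      return;                     // acquired on the spinning fast path
  while (m.test_and_set(std::memory_order_acquire))
    m.wait(true);                 // C++20 futex-style sleep until cleared
}

inline void spin_then_sleep_unlock(std::atomic_flag &m)
{
  m.clear(std::memory_order_release);
  m.notify_one();                 // wake one sleeping waiter, if any
}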

On Microsoft Windows, we will ignore the parameter innodb_sync_spin_loops and rely on the built-in user space spin loop logic in SRWLOCK and WaitOnAddress().
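For illustration, a minimal Windows-only sketch of the futex-style primitive involved; this is not MariaDB code, and wait_while_equal()/wake_all() are hypothetical helpers:

#ifdef _WIN32
# include <windows.h>
// Link with Synchronization.lib for WaitOnAddress()/WakeByAddressAll().

// Block until *word no longer equals old. WaitOnAddress() performs some
// user-space spinning of its own before sleeping in the kernel, which is
// why an extra caller-side spin loop would be redundant on Windows.
void wait_while_equal(volatile LONG *word, LONG old)
{
  while (*word == old)
    WaitOnAddress(word, &old, sizeof old, INFINITE);
}

// Wake all threads that are sleeping on word.
void wake_all(volatile LONG *word)
{
  WakeByAddressAll((PVOID) word);
}
#endif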

ssux_lock_impl<false>::rd_lock(): Jump straight to rd_wait() without one step of spinning.

ssux_lock_impl<false>::rd_wait(): Invoke rd_lock_try() before entering the wait.
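A minimal sketch of these two changes, with member details simplified; block_until_readable() is a hypothetical stand-in for the futex/WaitOnAddress() based sleep:

template<bool spinloop>
struct ssux_lock_impl
{
  bool rd_lock_try() noexcept;           // one atomic attempt
  void block_until_readable() noexcept;  // hypothetical kernel wait

  // With spinloop=false, rd_lock() jumps straight to rd_wait():
  void rd_lock() noexcept { if (!spinloop || !rd_lock_try()) rd_wait(); }

  // rd_wait() now invokes rd_lock_try() before each wait:
  void rd_wait() noexcept
  {
    while (!rd_lock_try())
      block_until_readable();
  }
};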

Release Notes

Spin loops on waits for InnoDB index tree or buffer page latches have been disabled, because they are often a waste of CPU when running I/O bound workloads.

On Microsoft Windows, the parameter innodb_sync_spin_loops will be ignored.

How can this PR be tested?

Run Sysbench or HammerDB on a workload that is larger than the buffer pool. The impact should be most prominent when there are synchronous waits for pages in the buffer pool.

Measuring the CPU usage while running the test case in MDEV-32067 should show some impact.

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

@dr-m dr-m self-assigned this Jul 4, 2025
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@dr-m dr-m force-pushed the MDEV-34431 branch 2 times, most recently from d2f77d6 to acfc1d6 Compare July 8, 2025 09:44

dr-m commented Jul 8, 2025

I filed #4172 for a 11.4 version of this, comprising this change as well as a second change, which I originally developed and tested on 10.6 as d2f77d6. I think that only the first change has an acceptable level of risk for the 10.6 release. The 11.4 version of the latter change is simpler, thanks to some earlier code removal and cleanup.

@dr-m dr-m force-pushed the MDEV-34431 branch 3 times, most recently from 773b5e6 to 2567411 Compare July 14, 2025 07:54
@dr-m dr-m changed the base branch from 10.6 to 10.11 July 17, 2025 12:02
- void rd_lock() noexcept { if (!rd_lock_try()) rd_wait(); }
+ void rd_lock() noexcept { if (!spinloop || !rd_lock_try()) rd_wait(); }
@dr-m (Contributor, Author) commented:

Note: On an Intel Xeon system where I tested, this change (as well as the corresponding adjustment to rd_wait()) shrank the CMAKE_BUILD_TYPE=RelWithDebInfo sql/mariadbd executable size by about 30 KiB. The rd_lock_try() would translate into a loop around lock cmpxchg. I did not observe a significant performance difference; this patch could slightly improve performance.
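For context, a hedged sketch of why rd_lock_try() compiles to a loop around lock cmpxchg on x86; the lock-word layout below is an assumption for illustration, not the actual InnoDB definition:

#include <atomic>
#include <cstdint>

// Assumed layout: the most significant bit marks an exclusive holder,
// the remaining bits count shared holders.
static constexpr uint32_t WRITER= 1U << 31;

bool rd_lock_try(std::atomic<uint32_t> &lk) noexcept
{
  uint32_t l= lk.load(std::memory_order_relaxed);
  do
    if (l & WRITER)
      return false;               // conflict: give up without spinning
  while (!lk.compare_exchange_weak(l, l + 1,  // x86: lock cmpxchg
                                   std::memory_order_acquire,
                                   std::memory_order_relaxed));
  return true;
}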

The Microsoft Windows implementations of SRWLOCK and WaitOnAddress()
include some spin loop logic before entering the operating system kernel.
Let us avoid duplicating some of that spin loop logic.

Thanks to Vladislav Vaintroub for this fix.
@dr-m dr-m marked this pull request as draft July 21, 2025 06:41
While waiting for I/O completion, let us skip spin loops.
Even on fast storage, reading a page into the buffer pool takes
so long that a spin loop would only end up wasting CPU time.

block_lock::s_lock_nospin(): A no-spin variant of acquiring a shared
buffer page latch. Regular s_lock() always involves a spin loop.

ssux_lock_impl::rd_lock_spin(), ssux_lock_impl::rd_lock_nospin():
Split from rd_wait().

ssux_lock_impl::rd_lock(): Invoke either rd_lock_nospin() or
rd_lock_try() and rd_lock_spin().

buf_page_get_low(): After acquiring a page latch on an io-fixed block,
try to optimize operations on the page latch.
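A hedged sketch of the split described in this commit message; srw_pause_delay(), srw_pause() and srv_n_spin_wait_rounds appear in the actual code (see the snippet in a later comment), while block_until_readable() is a hypothetical stand-in for the sleeping wait:

// Bounded spinning, then fall through to the sleeping wait.
void rd_lock_spin() noexcept
{
  const unsigned delay= srw_pause_delay();
  for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
  {
    srw_pause(delay);
    if (rd_lock_try())
      return;
  }
  rd_lock_nospin();
}

// No spinning at all: one atomic attempt per wakeup. A waiter should
// use this when the block is io-fixed, because the I/O will outlast
// any reasonable spin loop.
void rd_lock_nospin() noexcept
{
  while (!rd_lock_try())
    block_until_readable();       // hypothetical futex-style wait
}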

dr-m commented Jul 21, 2025

I suspect that the reason for the timeouts on AppVeyor is 9d34322, which @vaintroub suggested. We seem to need a more sophisticated fix to disable the additional spin loop layer on Microsoft Windows. For instance, the following template specializations would need to skip the for loops altogether on _WIN32 if AcquireSRWLockShared() and AcquireSRWLockExclusive() include spin loop logic of their own:

#if defined _WIN32 || defined SUX_LOCK_GENERIC
template<> void srw_lock_<true>::rd_wait() noexcept
{
  const unsigned delay= srw_pause_delay();

  for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
  {
    srw_pause(delay);
    if (rd_lock_try())
      return;
  }

  IF_WIN(AcquireSRWLockShared(&lk), rw_rdlock(&lk));
}

template<> void srw_lock_<true>::wr_wait() noexcept
{
  const unsigned delay= srw_pause_delay();

  for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
  {
    srw_pause(delay);
    if (wr_lock_try())
      return;
  }

  IF_WIN(AcquireSRWLockExclusive(&lk), rw_wrlock(&lk));
}
#endif
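One way to express that skip, sketched here under the stated assumption that the SRWLOCK implementation spins internally (wr_wait() would be analogous):

template<> void srw_lock_<true>::rd_wait() noexcept
{
#ifndef _WIN32
  // Keep the caller-side spin loop only where the underlying primitive
  // does not spin on its own.
  const unsigned delay= srw_pause_delay();
  for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
  {
    srw_pause(delay);
    if (rd_lock_try())
      return;
  }
#endif
  IF_WIN(AcquireSRWLockShared(&lk), rw_rdlock(&lk));
}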


dr-m commented Jul 21, 2025

https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435045 and https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435868 (for the revert 87e2123) prove that @vaintroub’s Windows-specific patch indeed has a problem. Several tests were timing out (240 seconds) when the patch was present.

@vaintroub (Member) commented:

> https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435045 and https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435868 (for the revert 87e2123) prove that @vaintroub’s Windows-specific patch indeed has a problem. Several tests were timing out (240 seconds) when the patch was present.

There is a hidden endless loop in the current code, in srw_mutex_impl::wait_and_lock(); that is the reason.
