MDEV-34431: Avoid spin loops on page I/O waits #4168
base: 10.11
Conversation
Force-pushed d2f77d6 to acfc1d6
I filed #4172 for an 11.4 version of this, comprising this change as well as a second change, which I originally developed and tested on 10.6 as d2f77d6. I think that only the first change has an acceptable level of risk for the 10.6 release. The 11.4 version of the latter change is simpler, thanks to some earlier code removal and cleanup.
Force-pushed 773b5e6 to 2567411
storage/innobase/include/srw_lock.h (outdated)
-  void rd_lock() noexcept { if (!rd_lock_try()) rd_wait(); }
+  void rd_lock() noexcept { if (!spinloop || !rd_lock_try()) rd_wait(); }
Note: On an Intel Xeon system where I tested, this change (as well as the corresponding adjustment to rd_wait()) shrunk the CMAKE_BUILD_TYPE=RelWithDebInfo sql/mariadbd executable size by about 30 KiB. The rd_lock_try() would translate into a loop around lock cmpxchg. I didn’t observe a significant performance difference; if anything, this patch could slightly improve performance.
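To illustrate why each inlined rd_lock_try() contributes to code size, here is a hedged sketch (not the actual InnoDB code; the class and its bit encoding are hypothetical) of a try-acquire of a shared latch written as a compare-and-swap loop. On x86-64, compare_exchange_weak typically compiles to lock cmpxchg, and the retry loop around it is the code that gets duplicated at every inlined call site:

```cpp
#include <atomic>
#include <cstdint>

// Illustrative sketch only: a shared/exclusive latch word where the
// low 31 bits count readers and the top bit marks an exclusive holder.
class rw_latch_sketch
{
  static constexpr uint32_t WRITER= 1U << 31;
  std::atomic<uint32_t> word{0};
public:
  bool rd_lock_try() noexcept
  {
    uint32_t l= word.load(std::memory_order_relaxed);
    do
      if (l & WRITER)
        return false;  // a writer holds the latch; give up immediately
    while (!word.compare_exchange_weak(l, l + 1,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed));
    return true;       // reader count was incremented
  }
  void rd_unlock() noexcept { word.fetch_sub(1, std::memory_order_release); }
  bool wr_lock_try() noexcept
  {
    uint32_t expected= 0;  // only lockable exclusively when totally free
    return word.compare_exchange_strong(expected, WRITER,
                                        std::memory_order_acquire);
  }
  void wr_unlock() noexcept { word.store(0, std::memory_order_release); }
};
```

Removing one such inlined try from every rd_lock() call site is consistent with the ~30 KiB reduction observed above.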
The Microsoft Windows implementations of SRWLOCK and WaitOnAddress() include some spin loop logic before entering the operating system kernel. Let us avoid duplicating some of that spin loop logic. Thanks to Vladislav Vaintroub for this fix.
While waiting for I/O completion, let us skip spin loops. Even on fast storage, reading a page into the buffer pool takes so long that a spin loop would only end up wasting CPU time.

block_lock::s_lock_nospin(): A no-spin variant of acquiring a shared buffer page latch. The regular s_lock() always involves a spin loop.

ssux_lock_impl::rd_lock_spin(), ssux_lock_impl::rd_lock_nospin(): Split from rd_wait().

ssux_lock_impl::rd_lock(): Invoke either rd_lock_nospin(), or rd_lock_try() and rd_lock_spin().

buf_page_get_low(): After acquiring a page latch on an io-fixed block, try to optimize operations on the page latch.
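The acquisition policy described in the commit message can be sketched as follows. This is a simplified model, not the real ssux_lock_impl: the class name, the bounded spin count, and the yield-based blocking are all illustrative stand-ins (the real no-spin path would block in a futex-style wait). It shows the dispatch: when spinning is disabled, rd_lock() jumps straight to the wait path; otherwise it tries once and spins before blocking.

```cpp
#include <atomic>
#include <thread>

// Hedged sketch of the spin/no-spin dispatch; names are hypothetical.
template<bool spinloop>
class shared_latch_sketch
{
  std::atomic<int> readers{0};  // >= 0: reader count; negative: excluded
public:
  bool rd_lock_try() noexcept
  {
    int r= readers.load(std::memory_order_relaxed);
    while (r >= 0)
      if (readers.compare_exchange_weak(r, r + 1,
                                        std::memory_order_acquire))
        return true;
    return false;
  }
  void rd_lock() noexcept
  {
    // Mirrors the patched rd_lock(): with spinning compiled out,
    // do not even attempt rd_lock_try() here; go straight to the wait.
    if (!spinloop || !rd_lock_try())
      rd_wait();
  }
  void rd_unlock() noexcept
  { readers.fetch_sub(1, std::memory_order_release); }
private:
  void rd_wait() noexcept
  {
    if (spinloop)
      for (int spin= 30; spin; spin--)  // bounded spin, like rd_lock_spin()
      {
        if (rd_lock_try())
          return;
        std::this_thread::yield();
      }
    while (!rd_lock_try())              // no-spin path, like rd_lock_nospin();
      std::this_thread::yield();        // real code would futex-wait here
  }
};
```

For latches guarding page I/O, the `spinloop == false` instantiation avoids burning CPU on a wait that will last for the duration of a storage read anyway.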
I suspect that the reason for timeouts on AppVeyor is 9d34322, which @vaintroub suggested. We would seem to need a more sophisticated fix to disable the additional spin loop layer on Microsoft Windows. For instance, template specializations like the following would be needed to skip the spin loop:
#if defined _WIN32 || defined SUX_LOCK_GENERIC
template<> void srw_lock_<true>::rd_wait() noexcept
{
const unsigned delay= srw_pause_delay();
for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
{
srw_pause(delay);
if (rd_lock_try())
return;
}
IF_WIN(AcquireSRWLockShared(&lk), rw_rdlock(&lk));
}
template<> void srw_lock_<true>::wr_wait() noexcept
{
const unsigned delay= srw_pause_delay();
for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
{
srw_pause(delay);
if (wr_lock_try())
return;
}
IF_WIN(AcquireSRWLockExclusive(&lk), rw_wrlock(&lk));
}
#endif
https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435045 and https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435868 (for the revert 87e2123) prove that @vaintroub’s Windows-specific patch indeed has a problem. Several tests were timing out (240 seconds) when the patch was present.
There is a hidden endless loop in the existing code, in srw_mutex_impl::wait_and_lock(); that is the reason.
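The shape of that loop can be sketched as below. This is an illustrative model, not the real srw_mutex_impl (the class name and the yield-based blocking are stand-ins; the real code would use WaitOnAddress() on Windows or a futex wait on Linux). The point of the discussion above is that acquisition retries forever inside an outer loop, and if the operating system wait primitive already spins internally before entering the kernel, an additional user-space spin layer on top of it is largely redundant work:

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hedged sketch of a wait_and_lock()-style acquisition.
class mutex_sketch
{
  std::atomic<uint32_t> lock_word{0};  // 0 = free, 1 = held
  bool try_lock() noexcept
  {
    uint32_t expected= 0;
    return lock_word.compare_exchange_strong(expected, 1,
                                             std::memory_order_acquire);
  }
public:
  void wait_and_lock(unsigned spin_rounds) noexcept
  {
    for (;;)  // the "endless" retry loop: runs until acquisition succeeds
    {
      for (unsigned spin= spin_rounds; spin; spin--)
        if (try_lock())
          return;  // acquired during the user-space spin phase
      // Block until the word looks free. In real code this would be
      // WaitOnAddress() or a futex wait, which may add internal spinning
      // of their own on top of the loop above.
      while (lock_word.load(std::memory_order_relaxed) != 0)
        std::this_thread::yield();
      if (try_lock())
        return;
      // Lost the race to another waiter: loop around and wait again.
    }
  }
  void unlock() noexcept { lock_word.store(0, std::memory_order_release); }
};
```

With `spin_rounds == 0` the outer spin phase disappears entirely and the function degenerates to a pure wait-and-retry loop, which is the behavior this PR wants for I/O-bound latches.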
Description

Recent performance tests suggest that spin loops on index_lock and block_lock are hurting performance, especially in cases when progress is blocked by an I/O wait. The simplest case would be that the waited-for block is io-fixed (being read into the buffer pool or written back to data files). A more complex case is that a thread that is waiting for I/O on another block or on the log file is holding a conflicting latch on an index tree or a buffer page.

Disabling the spin loops altogether for index_lock and block_lock seems to yield the best overall result. The best scalability was observed with thread_handling=pool-of-threads, which defaults to creating my_getncpus() connection handler threads.

Note: For latches that are not being held while waiting for any I/O, such as trx_t::mutex, spin loops may still be helpful. In fact, we tried disabling all spin loops altogether, and got worse performance.

On Microsoft Windows, we will ignore the parameter innodb_sync_spin_loops and rely on the built-in user space spin loop logic in SRWLOCK and WaitOnAddress().

ssux_lock_impl<false>::rd_lock(): Jump straight to rd_wait() without one step of spinning.

ssux_lock_impl<false>::rd_wait(): Invoke rd_lock_try() before entering the wait.

Release Notes
Spin loops on waits for InnoDB index tree or buffer page latches have been disabled, because they are often a waste of CPU when running I/O bound workloads.
On Microsoft Windows, the parameter innodb_sync_spin_loops will be ignored.

How can this PR be tested?
Sysbench or HammerDB on a workload that is larger than the buffer pool. This should be most prominent when there are synchronous waits for pages in the buffer pool.
Measuring the CPU usage while running the test case in MDEV-32067 should show some impact.
Basing the PR against the correct MariaDB version

This PR is based against the 10.11 branch, not the main branch.

PR quality check