这是indexloc提供的服务,不要输入任何密码
Skip to content

Improve debuggability of data race diagnostics with additional logging of data race detector state and weak memory effects #4638

@sandersaares

Description

@sandersaares

I was recently debugging a data race (faern/oneshot#69) that proved tricky to pin down. To understand what was happening, I found myself hacking in some extra diagnostic logging to Miri. I describe here what I found useful, and request that similar capabilities be supported in Miri natively.

Weak memory effects were a precondition of this data race. The -Zmiri-track-weak-memory-loads flag is useful but additional clues were obtained by emitting the full list of candidates (and their VTimestamps) when reading an outdated value. Example:

4ms INFO miri::concurrency::weak_memory fetch_store: is_seqcst=false, found 2 candidate(s)
4ms INFO miri::concurrency::weak_memory   candidate[0]: store_thread=2, store_timestamp=9, is_seqcst=false, val=Some(0x04)
4ms INFO miri::concurrency::weak_memory   candidate[1]: store_thread=0, store_timestamp=0, is_seqcst=false, val=Some(0x03)
5ms INFO miri::concurrency::weak_memory fetch_store: chose store from thread 0 at timestamp 0 (recency="Outdated")

What proved important was to observe the changes in all the vector clock timestamps over time as the suspect allocation was accessed. The valuable information proved to be:

  • When did the current thread's foreign thread timestamps get updated and what triggered the update?
  • On every access to the tracked allocation, what was the data race tracking state before and after?

The main questions I needed to answer from these logs were:

  • What was the data race detection state when the earlier racing operation was started?
  • What other operations happened between the two racing operations?
  • What was the data race tracking state before/after each operation?

I ended up adding the following logging:

6ms INFO miri::concurrency::data_race Allocation alloc43683 created with timestamp 27 at index 2
6ms INFO miri::concurrency::data_race Atomic op(atomic read-modify-write) with ordering Acquire on 0x12ac68[alloc40955]<140765> (alloc_id=alloc40955, offset=16, size=1)
6ms INFO miri::concurrency::data_race Atomic read detect: thread 2 updating read_vector from VClock([0, 0, 25, 10]), thread_clock=VClock([68, 44, 27, 9])
7ms INFO miri::concurrency::data_race Atomic read detect: thread 2 read_vector updated to VClock([0, 0, 27, 10])
7ms INFO miri::concurrency::data_race Atomic read detect: write_was_before=true, write=(1, 13), na_write=(1, 13), na_read=VClock([])
8ms INFO miri::concurrency::data_race Atomic load acquire: thread 2 acquiring sync_clock VClock([]) into clock VClock([68, 44, 27, 9])
8ms INFO miri::concurrency::data_race Atomic load acquire: thread 2 clock after join: VClock([68, 44, 27, 9])
8ms INFO miri::concurrency::data_race Atomic write detect: thread 2 updating write_vector from VClock([0, 0, 26, 10]), thread_clock=VClock([68, 44, 27, 9])
9ms INFO miri::concurrency::data_race Atomic write detect: thread 2 write_vector updated to VClock([0, 0, 27, 10])
9ms INFO miri::concurrency::data_race Atomic write detect: write_was_before=true, reads_before=true, na_write=(1, 13), na_read=VClock([]), thread_clock=VClock([68, 44, 27, 9])
9ms INFO miri::concurrency::data_race Atomic RMW relaxed: thread 2 joining fence_release VClock([]) with sync_vector VClock([])
10ms INFO miri::concurrency::data_race Atomic RMW relaxed: thread 2 sync_vector after join: VClock([])
10ms INFO miri::concurrency::data_race Incremented clock at index 2 from 27 to 28

This feels like overkill - I just did not know what exactly I would need. The useful information from here was:

  • What type of operation is being observed (this was very important to cross-reference the operations trace with code to understand what line was being executed; the code span references associated with the reads/writes often seemed to just point to some function call further up in the stack).
  • The clock values associated with each operation.
  • When synchronization happens and we increment the current thread's clock to that of a foreign thread.

It would be most helpful if Miri in a future version could provide some information of this nature without needing custom logging added. Perhaps as some sort of -Zmiri-track-alloc-accesses-mode=detailed to opt-in to? Perhaps also alongside -Zmiri-track-weak-memory-loads-mode=detailed for the former? I am assuming that it is desirable to keep such details out of any default data set.

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-data-raceArea: data race detectorA-diagnosticserrors and warnings emitted by miriC-enhancementCategory: a PR with an enhancement or an issue tracking an accepted enhancement

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions