
init function in .map_init called unexpectedly many times. #1214

@Neutron3529


The main idea is that I want a counter that records the frequency of every possible result. Since I know each result is a usize below a small bound (for example, 300), I could allocate a vector of length 300 and accumulate the counts in it. The problem is that there is no such .count() function, and creating vectors, with the memory allocation that involves, might be slow.

Would it be possible to provide a counter (.count_with_vec(vec_length) / .count_with_hash()) for quickly summarizing the results?
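To make the request concrete, here is a minimal sketch of the idea using only the standard library: each worker fills its own local histogram, and the histograms are merged at the end. The name count_with_vec is borrowed from the request above and is hypothetical, as are the bins and workers parameters; this is not an existing rayon API. The same shape (a per-split accumulator followed by an element-wise merge) should map onto rayon's fold and reduce adaptors, though that is untested here.

```rust
use std::thread;

/// Counts how often each value in `0..bins` occurs in `data`, using one
/// local histogram per worker thread and merging them at the end.
/// `count_with_vec` is a hypothetical name taken from the request above.
fn count_with_vec(data: &[usize], bins: usize, workers: usize) -> Vec<usize> {
    let workers = workers.max(1);
    // Ceiling division so every element lands in exactly one chunk.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        // One allocation per worker, not one per item.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    let mut local = vec![0usize; bins];
                    for &value in chunk {
                        local[value] += 1; // panics if value >= bins
                    }
                    local
                })
            })
            .collect();

        // Merge the per-worker histograms element-wise.
        let mut total = vec![0usize; bins];
        for handle in handles {
            let local = handle.join().expect("worker panicked");
            for (sum, part) in total.iter_mut().zip(local) {
                *sum += part;
            }
        }
        total
    })
}
```

Calling count_with_vec(&results, 300, 16) would return a vector whose entry i is the number of times i appears in results. The number of allocations is bounded by the worker count plus one for the merged total.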


Old content, which I thought might be a bug; duplicate of #742 and #718:

use std::sync::atomic::{AtomicUsize, Ordering};
use rayon::prelude::*;
fn main() {
    let item = AtomicUsize::new(0);
    println!(
        "got {}",
        (0..1_0000_0000)
            .into_par_iter()
            .map_init(|| item.fetch_add(1, Ordering::SeqCst), |_, _| 1)
            .sum::<usize>()
    );
    println!("init called: {item:?}")
}

With cargo run --release, I found that item reaches 3000~5000 after the iterator finishes, which is unexpectedly high.

$ cargo rr
    Finished `release` profile [optimized] target(s) in 0.01s
     Running `target/release/rayon-bug`
got 100000000
init called: 3498
$ cargo rr
    Finished `release` profile [optimized] target(s) in 0.01s
     Running `target/release/rayon-bug`
got 100000000
init called: 3154
$ cargo rr
    Finished `release` profile [optimized] target(s) in 0.01s
     Running `target/release/rayon-bug`
got 100000000
init called: 4342
$ cargo rr
    Finished `release` profile [optimized] target(s) in 0.01s
     Running `target/release/rayon-bug`
got 100000000
init called: 3670

Even when init is a heavy function, the result does not change:

use std::{
    sync::atomic::{AtomicUsize, Ordering},
    thread,
    time::Duration,
};

use rayon::prelude::*;
fn main() {
    rayon::ThreadPoolBuilder::new()
        .num_threads(16)
        .build_global()
        .unwrap();
    let item = AtomicUsize::new(0);
    println!(
        "got {}",
        (0..10_0000_0000)
            .into_par_iter()
            .map_init(
                || {
                    item.fetch_add(1, Ordering::SeqCst);
                    thread::sleep(Duration::from_millis(10)); // suppose we do a lot of things here.
                },
                |_, _| 1
            )
            .sum::<usize>()
    );
    println!("init called: {item:?}")
}

$ cargo rr
   Compiling rayon-bug v0.1.0 (/me/rayon-bug)
    Finished `release` profile [optimized] target(s) in 0.81s
     Running `target/release/rayon-bug`
got 1000000000
init called: 4018

I have no idea how to prevent init from being called so many times.
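One possible workaround, not confirmed anywhere in this thread, is to keep the expensive state in a thread-local so that it is built at most once per thread, no matter how finely the work is split. As far as I understand, map_init calls init as needed for each group of items in a rayon job rather than once per thread, which would explain counts in the thousands. The sketch below uses only the standard library; SCRATCH, INIT_CALLS, process and run are hypothetical names introduced for illustration.

```rust
use std::cell::RefCell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Counts how many times the expensive setup actually ran.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

thread_local! {
    // Hypothetical per-thread scratch buffer: initialized lazily on first
    // access, then reused for the lifetime of the thread.
    static SCRATCH: RefCell<Vec<usize>> = RefCell::new({
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
        vec![0; 300]
    });
}

/// Per-item work that borrows the thread's scratch buffer instead of
/// receiving a freshly initialized value.
fn process(value: usize) -> usize {
    SCRATCH.with(|scratch| {
        let mut scratch = scratch.borrow_mut();
        scratch[value % 300] += 1;
        1
    })
}

/// Runs `process` over `0..items` on `workers` threads and returns
/// (sum of results, number of init calls made during this run).
fn run(items: usize, workers: usize) -> (usize, usize) {
    let workers = workers.max(1);
    let before = INIT_CALLS.load(Ordering::SeqCst);
    let sum = thread::scope(|scope| {
        let handles: Vec<_> = (0..workers)
            .map(|w| {
                scope.spawn(move || {
                    // Worker `w` takes every `workers`-th item.
                    (w..items).step_by(workers).map(process).sum::<usize>()
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<usize>()
    });
    (sum, INIT_CALLS.load(Ordering::SeqCst) - before)
}
```

With this layout the init count is bounded by the number of threads that actually process an item, not by the number of splits. In a rayon pipeline the closure passed to a plain .map could call process in the same way, assuming the pool threads are long-lived.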
