Blocking behavior on Windows when allocating inside a loop on multiple threads #112744

Closed
c-antin opened this issue Jun 17, 2023 · 2 comments
Labels: C-bug (Category: This is a bug.), O-windows (Operating system: Windows)

Comments

c-antin commented Jun 17, 2023

I've been porting a simple PHP benchmark of repeated str_replace/preg_replace calls to Rust (scaled up so that it runs for about 1s on my machine). At first I thought the problem was in the regex crate (more details can be found here).
But it seems to be an issue on Windows when the allocator is shared between threads (as opposed to processes).

use std::time::Instant;

use regex::Regex;

fn main() {
    // Number of worker threads, taken from the first command-line argument.
    let n: usize = std::env::args()
        .nth(1)
        .expect("arg1 missing!")
        .parse()
        .unwrap();
    let ts = Instant::now();
    let mut handles = Vec::with_capacity(n);
    for _ in 0..n {
        handles.push(std::thread::spawn(|| {
            let mut subject = "#".to_string();
            let search = Regex::new(&regex::escape("#")).unwrap();
            let replace = "benchmark#";

            let ts = Instant::now();
            for _ in 0..38000 {
                subject = search.replace_all(&subject, replace).to_string();
            }
            println!("thread {}", ts.elapsed().as_secs_f32());
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
    println!("total {}", ts.elapsed().as_secs_f32());
}

I expected to see this happen: total runtime at most that of the slowest thread

Instead, this happened: total runtime is as if all threads were run in sequence

I've tested it on 3 distinct Windows machines (same result) and on a macOS machine (no sequential runtime).
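The comparison between per-thread time and total time can also be tried without the regex crate. The following is a minimal sketch using only the standard library; `run_workers` is a hypothetical helper name, and `str::replace` stands in for `Regex::replace_all` on the assumption that the repeated allocation of a growing string is what matters. Whether it shows the same slowdown on a particular Windows machine has not been verified here.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: spawns `n` threads that each rebuild a growing
/// `String` `iters` times, so nearly all the work is allocation and copying.
/// Returns the elapsed time of every thread and the total wall-clock time.
pub fn run_workers(n: usize, iters: usize) -> (Vec<Duration>, Duration) {
    let start = Instant::now();
    let handles: Vec<_> = (0..n)
        .map(|_| {
            thread::spawn(move || {
                let ts = Instant::now();
                let mut subject = String::from("#");
                for _ in 0..iters {
                    // Each pass allocates a new, larger buffer and frees the old one.
                    subject = subject.replace('#', "benchmark#");
                }
                // Keep the result alive so the loop is not optimized away.
                std::hint::black_box(&subject);
                ts.elapsed()
            })
        })
        .collect();

    let per_thread: Vec<Duration> = handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .collect();
    (per_thread, start.elapsed())
}
```

Calling `run_workers(4, 38000)` from `main` and printing both values gives the same two numbers as the program above: if threads run in parallel, the total should be close to the slowest thread; if allocation is serialized, the total approaches the sum.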

Meta

rustc --version --verbose:

rustc 1.70.0 (90c541806 2023-05-31)
binary: rustc
commit-hash: 90c541806f23a127002de5b4038be731ba1458ca
commit-date: 2023-05-31
host: x86_64-pc-windows-msvc
release: 1.70.0
LLVM version: 16.0.2

ChrisDenton (Member) commented

If you don't override it, then std on Windows uses HeapAlloc with the default process heap. This is indeed serialized.

I'd suggest trying with mimalloc and seeing if the results are better.
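For readers unfamiliar with how the allocator is overridden, here is a minimal sketch of the `#[global_allocator]` hook using only the standard library. `CountingAlloc`, `ALLOCATIONS` and `allocations_during` are hypothetical names for illustration: the wrapper just forwards to `std::alloc::System` and counts calls, so it does not change the locking behavior. It only shows where a replacement allocator plugs in.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper that forwards to the system allocator and counts calls.
pub struct CountingAlloc;

/// Number of allocation requests seen so far.
pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the program now goes through `CountingAlloc`.
// With the `mimalloc` crate added as a dependency, the equivalent line would be
// `static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;`.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Returns how many allocation requests were made while `f` ran.
pub fn allocations_during<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    f();
    ALLOCATIONS.load(Ordering::Relaxed) - before
}
```

Only one `#[global_allocator]` static may exist in a program, so swapping in mimalloc means replacing the `GLOBAL` definition rather than adding a second one.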

c-antin commented Jun 17, 2023

The results with mimalloc are significantly better, thanks!

KittyBorgX added the O-windows label Jun 18, 2023
c-antin closed this as completed Jun 19, 2023