-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Clang segfault when using C++ modules and a precompiled header #105994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I've also just confirmed this on x86_64. Same repro instructions (inside docker). Compiler: Ubuntu clang version 18.1.3 (1ubuntu1) |
Just so that people don't have to download a .zip file to see the sources, I'm also trying to include the files themselves directly: cmake_minimum_required(VERSION 3.28)
project(PCH-CPPM-problem-repro VERSION 0.0.0 LANGUAGES CXX)
set(CMAKE_CXX_STANDARD 23)
add_executable(pch-cppm-problem-repro main.cpp)
set_target_properties(pch-cppm-problem-repro PROPERTIES CXX_SCAN_FOR_MODULES 1)
target_precompile_headers(pch-cppm-problem-repro PRIVATE stdafx.h)
target_sources(pch-cppm-problem-repro PRIVATE FILE_SET CXX_MODULES FILES
Mod.cppm Part1.cppm Part2.cppm Part3.cppm Part4.cppm
) main.cpp: import mod;
int main() {
} Mod.cppm: export module mod;
export import :part1;
export import :part2;
export import :part3;
export import :part4; Part1.cppm: export module mod:part1; Part2.cppm: export module mod:part2; Part3.cppm: export module mod:part3; Part4.cppm: export module mod:part4; stdafx.h: #pragma once
#include <iostream> |
Having 4 partitions exported in Mod.cppm seems key, removing one of them and segfault stops happening. And funny thing is, it has problems both with the Ubuntu iostream file and the MacOS one as well. |
@llvm/issue-subscribers-clang-modules Author: None (ShaderKeeper)
Hello,
when compiling a project that uses 4 module partitions and a precompiled header with ```#include <iostream>``` , I get a segfault.
I've originally encountered this problem on MacOS with llvm clang-18.1.8 . I've also been able to reproduce it in an Ubuntu docker container (on the same machine) with: Ubuntu clang version 18.1.3 (1ubuntu1) Target: aarch64-unknown-linux-gnu. The machine this happens on is Apple Silicon M2 Pro (mac mini). Host architecture is aarch64. Reproduction instructions using docker:
Should get the following output:
|
Link for reproduction in compiler explorer: https://godbolt.org/z/cMhdaTszb |
I'm trying to look at this. Notable, this happens during processing of the first (and only) input file which is a .pcm file. Even trying to dump info about this .pcm file with -module-file-info segfaults clang. Command line: Offending .pcm file attached |
When in Mod.cppm I remove the last |
I have no idea if this actually fixes it, but this makes the build finish successfully in the test project: Subject: [PATCH] Try to prevent segfault (#105994)
---
Index: clang/lib/Serialization/ASTReader.cpp
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
--- a/clang/lib/Serialization/ASTReader.cpp (revision 33f3ebc86e7d3afcb65c551feba5bbc2421b42ed)
+++ b/clang/lib/Serialization/ASTReader.cpp (revision 2d7041961e94ed641bb372f1281f433dd5b72262)
@@ -9187,6 +9187,9 @@
auto I = GlobalSubmoduleMap.find(getGlobalSubmoduleID(M, ID >> 1));
return I == GlobalSubmoduleMap.end() ? nullptr : I->second;
} else {
+ if(ID == (std::numeric_limits<decltype(ID)>::max()-1)) {
+ return nullptr;
+ }
// It's a prefix (preamble, PCH, ...). Look it up by index.
unsigned IndexFromEnd = ID >> 1;
assert(IndexFromEnd && "got reference to unknown module file");
|
-module-file-info for mod.pcm after applying this patch gets outputted successfully. It would probably be good to somehow check if this doesn't still contain unwanted garbage |
Well, all 4 partitions are exported. But previously, the PCH was mentioned here. Now it isn't. But maybe that's desirable? Because a PCH is not a module partition (I would think)
|
In my fork, I've force pushed 9d6010beef7539cfbc2ee935c90a82f20eed298e , this is rebased against the last commit in the upstream main branch that passed github actions checks. I've compiled this commit in Release mode and tried it on the project where I discovered this problem (with ~80 CPP files, Vulkan, Objective C++, etc.) and it seems to build and run fine for me |
CC @ChuanqiXu9 |
Thanks for the analysis. And the "fix" shows there are some overflow for the ID. The ID is emitted in ASTWrtier.cpp |
@ChuanqiXu9 hello, thank you for the hint. I'll try to get some time after work (probably again over the weekend) and try to debug and find out what's actually happening. |
Hello, I have a progress report. I looked into this and found that during the compilation, clang++ runs 2 separate invocations of clang-20 . The first one is the one that generates the module with the garbage reference. I'll continue trying to find why this is there but it will probably take some time again. I've placed a breakpoint where you've said @ChuanqiXu9 and found that ASTDeclContextNameLookupTrait::EmitFileRef runs 4 times. The structure seems to me to be suspiciously empty? And also importantly Writer.getChain()->getModuleFileID(F) during this function call evaluates to the previously mentioned canary value. Here you can see it well to comparison of one of the C++ module references. (the blue values are changed since one of the other calls). You can see this structure seems to have more reasonable values now. |
I think I've found the problem. This function only gets called when MultiOnDiskHashTable is getting 'merged' for performance reasons (which is triggered by adding a 4th submodule). Otherwise the 'file ref' doesn't ever get emitted. So when it is getting 'merged', this gets triggered: It tries to set the 'base module' of submodules to the first module. Which might be true when you don't use PCH. When you use PCH then the first module(s) might be PCH. I don't still understand this 100% but I think this explains the behavior that is seen. |
I believe this finally fixes it. When the main module .pcm file is being written, the sides of the subtraction to generate an offset from the end were swapped while generating the 'index from end' of the PCH module in ModuleManager. During de-serialization on the other hand, the wrong operator[] was used that only supports unsigned int as argument, despite being passed a negative value. The only way to trigger this bug as far as I can tell is:
if (auto *Merged = Base ? Base->getMergedTable() : nullptr) {
// Write list of overridden files.
Writer.write<uint32_t>(Merged->Files.size());
for (const auto &F : Merged->Files)
Info.EmitFileRef(OutStream, F);
// Add all merged entries from Base to the generator.
for (auto &KV : Merged->Data) {
if (!Gen.contains(KV.first, Info))
Gen.insert(KV.first, Info.ImportData(KV.second), Info);
}
} else {
Writer.write<uint32_t>(0);
} I don't understand how this gets triggered. It seems to be when there is an 'UPDATE_VISIBLE' block being written, which I can't find what it is. (ASTWriter::WriteDeclContextVisibleUpdate). The test setup in this issue triggers it. Fix patch: Subject: [PATCH] [clang][modules] Fix serialization and de-serialization of PCH module file refs (#105994)
---
Index: clang/lib/Serialization/ASTReader.cpp
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
--- a/clang/lib/Serialization/ASTReader.cpp (revision 357bd61744bb8cc2b9b07447294fa977e5758550)
+++ b/clang/lib/Serialization/ASTReader.cpp (revision 36c965fb4feed7b8392ff7d2a60b682a6d4234fc)
@@ -9217,7 +9217,7 @@
// It's a prefix (preamble, PCH, ...). Look it up by index.
unsigned IndexFromEnd = ID >> 1;
assert(IndexFromEnd && "got reference to unknown module file");
- return getModuleManager().pch_modules().end()[-IndexFromEnd];
+ return getModuleManager().pch_modules().end()[-static_cast<int>(IndexFromEnd)];
}
}
@@ -9235,7 +9235,7 @@
auto PCHModules = getModuleManager().pch_modules();
auto I = llvm::find(PCHModules, M);
assert(I != PCHModules.end() && "emitting reference to unknown file");
- return (I - PCHModules.end()) << 1;
+ return std::distance(I, PCHModules.end()) << 1;
}
std::optional<ASTSourceDescriptor> ASTReader::getSourceDescriptor(unsigned ID) { |
@ShaderKeeper thanks for your work and the analysis! You only need to get a test to make this a patch.
And from this message, it looks you "only" need to replace
It is triggered by the last line of |
Here is a reduced test case: https://godbolt.org/z/7741TzTqT
#pragma once
class a {
virtual ~a();
a() {}
}; @ShaderKeeper Do you want to create a PR with this test code? |
Hello, thank you for looking into this. I didn't have time to continue. I've also never done a unit test in LLVM, so if you already know how, please feel free to open a PR and take full credit for my part of the patch. |
…H module file refs (llvm#105994) Co-authored-by: ShaderKeeper <no-reply@shaderkeeper.com>
It is also my first LLVM test, but here is it: #132802. |
…H module file refs (llvm#105994) Co-authored-by: ShaderKeeper <no-reply@shaderkeeper.com>
…H module file refs (#105994) (#132802) The File ID is incorrectly calculated, resulting in an out-of-bounds access. The test code is more complex because the File fetching only happens in specific scenarios. --------- Co-authored-by: ShaderKeeper <no-reply@shaderkeeper.com> Co-authored-by: Chuanqi Xu <yedeng.yd@linux.alibaba.com>
…H module file refs (llvm#105994) (llvm#132802) The File ID is incorrectly calculated, resulting in an out-of-bounds access. The test code is more complex because the File fetching only happens in specific scenarios. --------- Co-authored-by: ShaderKeeper <no-reply@shaderkeeper.com> Co-authored-by: Chuanqi Xu <yedeng.yd@linux.alibaba.com> (cherry picked from commit cca0f81)
Looks like this is fixed in trunk: https://godbolt.org/z/hYh7hKMhn |
I think we can close this. In case, there are still problems, we can reopen it. |
Hello,
when compiling a project that uses 4 module partitions and a precompiled header with
#include <iostream>
, I get a segfault.I've originally encountered this problem on MacOS with llvm clang-18.1.8 . I've also been able to reproduce it in an Ubuntu docker container (on the same machine) with: Ubuntu clang version 18.1.3 (1ubuntu1) Target: aarch64-unknown-linux-gnu.
The machine this happens on is Apple Silicon M2 Pro (mac mini). Host architecture is aarch64.
Reproduction instructions using docker:
Should get the following output:
The text was updated successfully, but these errors were encountered: