Skip to content

GCC passes and returns complex numbers in registers #2

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
chbessonova opened this issue Feb 4, 2019 · 1 comment
Closed

GCC passes and returns complex numbers in registers #2

chbessonova opened this issue Feb 4, 2019 · 1 comment
Labels
abi msp430 MSP430 target

Comments

@chbessonova
Copy link
Contributor

I noticed that GCC passes and returns complex numbers (complex.h) in registers while user-defined structures, even contain 2 floats are still passed and returned by reference.

#include <complex.h>

struct s { float a[2]; };

void f1(struct s S); 
void f2(float complex C); 

extern struct s SS; 
extern float complex CC; 

int main() {
  f1(SS);
  f2(CC);
}
main:
	MOV.W	#SS, R12
	CALL	#f1
	MOV.W	&CC, R12
	MOV.W	&CC+2, R13
	MOV.W	&CC+4, R14
	MOV.W	&CC+6, R15
	CALL	#f2
	MOV.B	#0, R12

TI CCS passes everything by reference:

main:
        MOV.W     #SS+0,r12
        CALL      #f1 
        MOV.W     #CC+0,r12  
        CALL      #f2  
        MOV.W     #0,r12 
        RET      
@asl asl transferred this issue from access-softek/msp430-clang Apr 6, 2020
@asl asl transferred this issue from another repository Apr 10, 2020
@asl asl transferred this issue from access-softek/msp430-clang Apr 10, 2020
@asl asl added the abi label Apr 16, 2020
atrosinenko pushed a commit that referenced this issue Jul 18, 2020
… Solaris/x86

A dozen 32-bit `AddressSanitizer` testcases FAIL on the latest beta of Solaris 11.4/x86, e.g.
`AddressSanitizer-i386-sunos :: TestCases/null_deref.cpp` produces

  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==29274==ERROR: AddressSanitizer: stack-overflow on address 0x00000028 (pc 0x08135efd bp 0xfeffdfd8 sp 0x00000000 T0)
      #0 0x8135efd in NullDeref(int*) /vol/llvm/src/llvm-project/dist/compiler-rt/test/asan/TestCases/null_deref.cpp:15:10
      #1 0x8135ea6 in main /vol/llvm/src/llvm-project/dist/compiler-rt/test/asan/TestCases/null_deref.cpp:21:3
      #2 0x8084b85 in _start (null_deref.cpp.tmp+0x8084b85)

   SUMMARY: AddressSanitizer: stack-overflow /vol/llvm/src/llvm-project/dist/compiler-rt/test/asan/TestCases/null_deref.cpp:15:10 in NullDeref(int*)
  ==29274==ABORTING

instead of the expected

  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==29276==ERROR: AddressSanitizer: SEGV on unknown address 0x00000028 (pc 0x08135f1f bp 0xfeffdf48 sp 0xfeffdf40 T0)
  ==29276==The signal is caused by a WRITE memory access.
  ==29276==Hint: address points to the zero page.
      #0 0x8135f1f in NullDeref(int*) /vol/llvm/src/llvm-project/local/compiler-rt/test/asan/TestCases/null_deref.cpp:15:10
      #1 0x8135efa in main /vol/llvm/src/llvm-project/local/compiler-rt/test/asan/TestCases/null_deref.cpp:21:3
      #2 0x8084be5 in _start (null_deref.cpp.tmp+0x8084be5)

  AddressSanitizer can not provide additional info.
   SUMMARY: AddressSanitizer: SEGV /vol/llvm/src/llvm-project/local/compiler-rt/test/asan/TestCases/null_deref.cpp:15:10 in NullDeref(int*)
  ==29276==ABORTING

I managed to trace this to a change in `<sys/regset.h>`: previously the header would
primarily define the short register indices (like `UESP`). While they are required by the
i386 psABI, they are only required in `<ucontext.h>` and could previously leak into
unsuspecting user code, polluting the namespace and requiring elaborate workarounds
like that in `llvm/include/llvm/Support/Solaris/sys/regset.h`. The change fixed that by restricting
the definition of the short forms appropriately, at the same time defining all `REG_` prefixed
forms for compatiblity with other systems.  This exposed a bug in `compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp`, however:
Previously, the index for the user stack pointer would be hardcoded if `REG_ESP`
wasn't defined. Now with that definition present, it turned out that `REG_ESP` was the wrong index to use: the previous value 17 (and `REG_SP`) corresponds to `REG_UESP`
instead.

With that change, the failures are all gone.

Tested on `amd-pc-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D83664
@atrosinenko
Copy link
Contributor

The 67422e4 commit is expected to make sure _Complex types are laid out identically by both gcc and clang.

Hmm, now for the test.c with -Os optimization level, clang generates (R1 == SP)

main: 
        sub     #8, r1
        mov     &SS+6, 6(r1)
        mov     &SS+4, 4(r1)
        mov     &SS+2, 2(r1)
        mov     &SS, 0(r1)
        call    #f1
        mov     &CC+6, r15
        mov     &CC+4, r14
        mov     &CC+2, r13
        mov     &CC, r12
        call    #f2
        clr     r12
        add     #8, r1
        ret

And gcc generates

main:
        MOV.W   #SS, R12
        CALL    #f1
        MOV.W   &CC, R12
        MOV.W   &CC+2, R13
        MOV.W   &CC+4, R14
        MOV.W   &CC+6, R15
        CALL    #f2
        MOV.B   #0, R12
        RET

So it seems there is no difference for f2 but something strange is generated for f1 and needs further investigation.

atrosinenko pushed a commit that referenced this issue Aug 6, 2020
…RM` was undefined after definition.

`PP->getMacroInfo()` returns nullptr for undefined macro, which leads to null-dereference at `MI->tockens().back()`.
Stack dump:
```
 #0 0x000000000217d15a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/llvm-project/build/bin/clang-tidy+0x217d15a)
 #1 0x000000000217b17c llvm::sys::RunSignalHandlers() (/llvm-project/build/bin/clang-tidy+0x217b17c)
 #2 0x000000000217b2e3 SignalHandler(int) (/llvm-project/build/bin/clang-tidy+0x217b2e3)
 #3 0x00007f39be5b1390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390)
 #4 0x0000000000593532 clang::tidy::bugprone::BadSignalToKillThreadCheck::check(clang::ast_matchers::MatchFinder::MatchResult const&) (/llvm-project/build/bin/clang-tidy+0x593532)
```

Reviewed By: hokein

Differential Revision: https://reviews.llvm.org/D85401
atrosinenko pushed a commit that referenced this issue Aug 6, 2020
…RM` is not a literal.

If `SIGTERM` is not a literal (e.g. `#define SIGTERM ((unsigned)15)`) bugprone-bad-signal-to-kill-thread check crashes.
Stack dump:
```
 #0 0x000000000217d15a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/llvm-project/build/bin/clang-tidy+0x217d15a)
 #1 0x000000000217b17c llvm::sys::RunSignalHandlers() (/llvm-project/build/bin/clang-tidy+0x217b17c)
 #2 0x000000000217b2e3 SignalHandler(int) (/llvm-project/build/bin/clang-tidy+0x217b2e3)
 #3 0x00007f6a7efb1390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390)
 #4 0x000000000212ac9b llvm::StringRef::getAsInteger(unsigned int, llvm::APInt&) const (/llvm-project/build/bin/clang-tidy+0x212ac9b)
 #5 0x0000000000593501 clang::tidy::bugprone::BadSignalToKillThreadCheck::check(clang::ast_matchers::MatchFinder::MatchResult const&) (/llvm-project/build/bin/clang-tidy+0x593501)
```

Reviewed By: hokein

Differential Revision: https://reviews.llvm.org/D85398
atrosinenko pushed a commit that referenced this issue Aug 13, 2020
… when `__STDC_WANT_LIB_EXT1__` was undefined after definition.

PP->getMacroInfo() returns nullptr for undefined macro, so we need to check this return value before dereference.
Stack dump:
```
 #0 0x0000000002185e6a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/llvm-project/build/bin/clang-tidy+0x2185e6a)
 #1 0x0000000002183e8c llvm::sys::RunSignalHandlers() (/llvm-project/build/bin/clang-tidy+0x2183e8c)
 #2 0x0000000002183ff3 SignalHandler(int) (/llvm-project/build/bin/clang-tidy+0x2183ff3)
 #3 0x00007f37df9b1390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390)
 #4 0x000000000052054e clang::tidy::bugprone::NotNullTerminatedResultCheck::check(clang::ast_matchers::MatchFinder::MatchResult const&) (/llvm-project/build/bin/clang-tidy+0x52054e)
```

Reviewed By: hokein

Differential Revision: https://reviews.llvm.org/D85523
atrosinenko pushed a commit that referenced this issue Aug 13, 2020
… when `__STDC_WANT_LIB_EXT1__` is not a literal.

If `__STDC_WANT_LIB_EXT1__` is not a literal (e.g. `#define __STDC_WANT_LIB_EXT1__ ((unsigned)1)`) bugprone-not-null-terminated-result check crashes.
Stack dump:
```
 #0 0x0000000002185e6a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/llvm-project/build/bin/clang-tidy+0x2185e6a)
 #1 0x0000000002183e8c llvm::sys::RunSignalHandlers() (/llvm-project/build/bin/clang-tidy+0x2183e8c)
 #2 0x0000000002183ff3 SignalHandler(int) (/llvm-project/build/bin/clang-tidy+0x2183ff3)
 #3 0x00007f08d91b1390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390)
 #4 0x00000000021338bb llvm::StringRef::getAsInteger(unsigned int, llvm::APInt&) const (/llvm-project/build/bin/clang-tidy+0x21338bb)
 #5 0x000000000052051c clang::tidy::bugprone::NotNullTerminatedResultCheck::check(clang::ast_matchers::MatchFinder::MatchResult const&) (/llvm-project/build/bin/clang-tidy+0x52051c)
```

Reviewed By: hokein

Differential Revision: https://reviews.llvm.org/D85525
atrosinenko pushed a commit that referenced this issue Aug 25, 2020
When `Target::GetEntryPointAddress()` calls `exe_module->GetObjectFile()->GetEntryPointAddress()`, and the returned
`entry_addr` is valid, it can immediately be returned.

However, just before that, an `llvm::Error` value has been setup, but in this case it is not consumed before returning, like is done further below in the function.

In https://bugs.freebsd.org/248745 we got a bug report for this, where a very simple test case aborts and dumps core:

```
* thread #1, name = 'testcase', stop reason = breakpoint 1.1
    frame #0: 0x00000000002018d4 testcase`main(argc=1, argv=0x00007fffffffea18) at testcase.c:3:5
   1	int main(int argc, char *argv[])
   2	{
-> 3	    return 0;
   4	}
(lldb) p argc
Program aborted due to an unhandled Error:
Error value was Success. (Note: Success values must still be checked prior to being destroyed).

Thread 1 received signal SIGABRT, Aborted.
thr_kill () at thr_kill.S:3
3	thr_kill.S: No such file or directory.
(gdb) bt
#0  thr_kill () at thr_kill.S:3
#1  0x00000008049a0004 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
#2  0x0000000804916229 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
#3  0x000000000451b5f5 in fatalUncheckedError () at /usr/src/contrib/llvm-project/llvm/lib/Support/Error.cpp:112
#4  0x00000000019cf008 in GetEntryPointAddress () at /usr/src/contrib/llvm-project/llvm/include/llvm/Support/Error.h:267
#5  0x0000000001bccbd8 in ConstructorSetup () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:67
#6  0x0000000001bcd2c0 in ThreadPlanCallFunction () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:114
#7  0x00000000020076d4 in InferiorCallMmap () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/Utility/InferiorCallPOSIX.cpp:97
#8  0x0000000001f4be33 in DoAllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/FreeBSD/ProcessFreeBSD.cpp:604
#9  0x0000000001fe51b9 in AllocatePage () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:347
#10 0x0000000001fe5385 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:383
#11 0x0000000001974da2 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2301
#12 CanJIT () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2331
#13 0x0000000001a1bf3d in Evaluate () at /usr/src/contrib/llvm-project/lldb/source/Expression/UserExpression.cpp:190
#14 0x00000000019ce7a2 in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Target/Target.cpp:2372
#15 0x0000000001ad784c in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:414
#16 0x0000000001ad86ae in DoExecute () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:646
#17 0x0000000001a5e3ed in Execute () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandObject.cpp:1003
#18 0x0000000001a6c4a3 in HandleCommand () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:1762
#19 0x0000000001a6f98c in IOHandlerInputComplete () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2760
#20 0x0000000001a90b08 in Run () at /usr/src/contrib/llvm-project/lldb/source/Core/IOHandler.cpp:548
#21 0x00000000019a6c6a in ExecuteIOHandlers () at /usr/src/contrib/llvm-project/lldb/source/Core/Debugger.cpp:903
#22 0x0000000001a70337 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2946
#23 0x0000000001d9d812 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/API/SBDebugger.cpp:1169
#24 0x0000000001918be8 in MainLoop () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:675
#25 0x000000000191a114 in main () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:890```

Fix the incorrect error catch by only instantiating an `Error` object if it is necessary.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D86355
atrosinenko pushed a commit that referenced this issue Mar 6, 2021
Using the same strategy as c38e425.

D69825 revealed (introduced?) a problem when building with ASan, and
some memory leaks somewhere. More details are available in the original
patch.

Looks like we missed one failing tests, this patch adds the workaround
to this test as well.

(cherry picked from commit e174da4)
atrosinenko pushed a commit that referenced this issue Mar 24, 2021
Somewhat surprisingly, signature help is emitted as a side-effect of
computing the expected type of a function argument.
The reason is that both actions require enumerating the possible
function signatures and running partial overload resolution, and doing
this twice would be wasteful and complicated.

Change #1: document this, it's subtle :-)

However, sometimes we need to compute the expected type without having
reached the code completion cursor yet - in particular to allow
completion of designators.
eb4ab33 did this but introduced a
regression - it emits signature help in the wrong location as a side-effect.

Change #2: only emit signature help if the code completion cursor was reached.

Currently there is PP.isCodeCompletionReached(), but we can't use it
because it's set *after* running code completion.
It'd be nice to set this implicitly when the completion token is lexed,
but ConsumeCodeCompletionToken() makes this complicated.

Change #3: call cutOffParsing() *first* when seeing a completion token.

After this, the fact that the Sema::Produce*SignatureHelp() functions
are even more confusing, as they only sometimes do that.
I don't want to rename them in this patch as it's another large
mechanical change, but we should soon.

Change #4: prepare to rename ProduceSignatureHelp() to GuessArgumentType() etc.

Differential Revision: https://reviews.llvm.org/D98488
atrosinenko pushed a commit that referenced this issue Mar 24, 2021
…ocation functions

Elimination of bitcasts with void pointer arguments results in GEPs with pure byte indexes. These GEPs do not preserve struct/array information and interrupt phi address translation in later pipeline stages.

Here is the original motivation for this patch:

```
#include<stdio.h>
#include<malloc.h>

typedef struct __Node{

  double f;
  struct __Node *next;

} Node;

void foo () {
  Node *a = (Node*) malloc (sizeof(Node));
  a->next = NULL;
  a->f = 11.5f;

  Node *ptr = a;
  double sum = 0.0f;
  while (ptr) {
    sum += ptr->f;
    ptr = ptr->next;
  }
  printf("%f\n", sum);
}
```
By explicit assignment  `a->next = NULL`, we can infer the length of the link list is `1`. In this case we can eliminate while loop traversal entirely. This elimination is supposed to be performed by GVN/MemoryDependencyAnalysis/PhiTranslation .

The final IR before this patch:

```
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
entry:
  %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2
  %next = getelementptr inbounds i8, i8* %call, i64 8
  %0 = bitcast i8* %next to %struct.__Node**
  store %struct.__Node* null, %struct.__Node** %0, align 8, !tbaa !2
  %f = bitcast i8* %call to double*
  store double 1.150000e+01, double* %f, align 8, !tbaa !8
  %tobool12 = icmp eq i8* %call, null
  br i1 %tobool12, label %while.end, label %while.body.lr.ph

while.body.lr.ph:                                 ; preds = %entry
  %1 = bitcast i8* %call to %struct.__Node*
  br label %while.body

while.body:                                       ; preds = %while.body.lr.ph, %while.body
  %sum.014 = phi double [ 0.000000e+00, %while.body.lr.ph ], [ %add, %while.body ]
  %ptr.013 = phi %struct.__Node* [ %1, %while.body.lr.ph ], [ %3, %while.body ]
  %f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0
  %2 = load double, double* %f1, align 8, !tbaa !8
  %add = fadd contract double %sum.014, %2
  %next2 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 1
  %3 = load %struct.__Node*, %struct.__Node** %next2, align 8, !tbaa !2
  %tobool = icmp eq %struct.__Node* %3, null
  br i1 %tobool, label %while.end, label %while.body

while.end:                                        ; preds = %while.body, %entry
  %sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add, %while.body ]
  %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa)
  ret void
}
```

Final IR after this patch:
```
; Function Attrs: nofree nounwind
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
while.end:
  %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double 1.150000e+01)
  ret void
}
```

IR before GVN before this patch:
```
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
entry:
  %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2
  %next = getelementptr inbounds i8, i8* %call, i64 8
  %0 = bitcast i8* %next to %struct.__Node**
  store %struct.__Node* null, %struct.__Node** %0, align 8, !tbaa !2
  %f = bitcast i8* %call to double*
  store double 1.150000e+01, double* %f, align 8, !tbaa !8
  %tobool12 = icmp eq i8* %call, null
  br i1 %tobool12, label %while.end, label %while.body.lr.ph

while.body.lr.ph:                                 ; preds = %entry
  %1 = bitcast i8* %call to %struct.__Node*
  br label %while.body

while.body:                                       ; preds = %while.body.lr.ph, %while.body
  %sum.014 = phi double [ 0.000000e+00, %while.body.lr.ph ], [ %add, %while.body ]
  %ptr.013 = phi %struct.__Node* [ %1, %while.body.lr.ph ], [ %3, %while.body ]
  %f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0
  %2 = load double, double* %f1, align 8, !tbaa !8
  %add = fadd contract double %sum.014, %2
  %next2 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 1
  %3 = load %struct.__Node*, %struct.__Node** %next2, align 8, !tbaa !2
  %tobool = icmp eq %struct.__Node* %3, null
  br i1 %tobool, label %while.end.loopexit, label %while.body

while.end.loopexit:                               ; preds = %while.body
  %add.lcssa = phi double [ %add, %while.body ]
  br label %while.end

while.end:                                        ; preds = %while.end.loopexit, %entry
  %sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add.lcssa, %while.end.loopexit ]
  %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa)
  ret void
}
```
IR before GVN after this patch:
```
define dso_local void @foo(i32* nocapture readnone %r) local_unnamed_addr #0 {
entry:
  %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16) #2
  %0 = bitcast i8* %call to %struct.__Node*
  %next = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 1
  store %struct.__Node* null, %struct.__Node** %next, align 8, !tbaa !2
  %f = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 0
  store double 1.150000e+01, double* %f, align 8, !tbaa !8
  %tobool12 = icmp eq i8* %call, null
  br i1 %tobool12, label %while.end, label %while.body.preheader

while.body.preheader:                             ; preds = %entry
  br label %while.body

while.body:                                       ; preds = %while.body.preheader, %while.body
  %sum.014 = phi double [ %add, %while.body ], [ 0.000000e+00, %while.body.preheader ]
  %ptr.013 = phi %struct.__Node* [ %2, %while.body ], [ %0, %while.body.preheader ]
  %f1 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 0
  %1 = load double, double* %f1, align 8, !tbaa !8
  %add = fadd contract double %sum.014, %1
  %next2 = getelementptr inbounds %struct.__Node, %struct.__Node* %ptr.013, i64 0, i32 1
  %2 = load %struct.__Node*, %struct.__Node** %next2, align 8, !tbaa !2
  %tobool = icmp eq %struct.__Node* %2, null
  br i1 %tobool, label %while.end.loopexit, label %while.body

while.end.loopexit:                               ; preds = %while.body
  %add.lcssa = phi double [ %add, %while.body ]
  br label %while.end

while.end:                                        ; preds = %while.end.loopexit, %entry
  %sum.0.lcssa = phi double [ 0.000000e+00, %entry ], [ %add.lcssa, %while.end.loopexit ]
  %call3 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double %sum.0.lcssa)
  ret void
}
```

The phi translation fails before this patch and it prevents GVN to remove the loop. The reason for this failure is in InstCombine. When the Instruction combining pass decides to convert:
```
 %call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16)
  %0 = bitcast i8* %call to %struct.__Node*
  %next = getelementptr inbounds %struct.__Node, %struct.__Node* %0, i64 0, i32 1
  store %struct.__Node* null, %struct.__Node** %next
```
to
```
%call = tail call noalias dereferenceable_or_null(16) i8* @malloc(i64 16)
  %next = getelementptr inbounds i8, i8* %call, i64 8
  %0 = bitcast i8* %next to %struct.__Node**
  store %struct.__Node* null, %struct.__Node** %0

```

GEP instructions with pure byte indexes (e.g. `getelementptr inbounds i8, i8* %call, i64 8`) are obstacles for address translation. address translation is looking for structural similarity between GEPs and these GEPs usually do not match since they have different structure.

This change will cause couple of failures in LLVM-tests. However, in all cases we need to change expected result by the test. I will update those tests as soon as I get green light on this patch.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D96881
atrosinenko pushed a commit that referenced this issue Apr 8, 2021
This reverts commit 9be8f8b.
This breaks tsan on Ubuntu 16.04:

    $ cat tiny_race.c
    #include <pthread.h>
    int Global;
    void *Thread1(void *x) {
      Global = 42;
      return x;
    }
    int main() {
      pthread_t t;
      pthread_create(&t, NULL, Thread1, NULL);
      Global = 43;
      pthread_join(t, NULL);
      return Global;
    }
    $ out/gn/bin/clang -fsanitize=thread -g -O1 tiny_race.c --sysroot ~/src/chrome/src/build/linux/debian_sid_amd64-sysroot/
    $ docker run -v $PWD:/foo ubuntu:xenial /foo/a.out
    FATAL: ThreadSanitizer CHECK failed: ../../compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp:447 "((thr_beg)) >= ((tls_addr))" (0x7fddd76beb80, 0xfffffffffffff980)
        #0 <null> <null> (a.out+0x4960b6)
        #1 <null> <null> (a.out+0x4b677f)
        #2 <null> <null> (a.out+0x49cf94)
        #3 <null> <null> (a.out+0x499bd2)
        #4 <null> <null> (a.out+0x42aaf1)
        #5 <null> <null> (libpthread.so.0+0x76b9)
        #6 <null> <null> (libc.so.6+0x1074dc)

(Get the sysroot from here: https://commondatastorage.googleapis.com/chrome-linux-sysroot/toolchain/500976182686961e34974ea7bdc0a21fca32be06/debian_sid_amd64_sysroot.tar.xz)

Also reverts follow-on commits:
This reverts commit 58c62fd.
This reverts commit 31e541e.
atrosinenko pushed a commit that referenced this issue May 8, 2021
This patch implements expansion of llvm.vp.* intrinsics
(https://llvm.org/docs/LangRef.html#vector-predication-intrinsics).

VP expansion is required for targets that do not implement VP code
generation. Since expansion is controllable with TTI, targets can switch
on the VP intrinsics they do support in their backend offering a smooth
transition strategy for VP code generation (VE, RISC-V V, ARM SVE,
AVX512, ..).

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D78203
atrosinenko pushed a commit that referenced this issue May 8, 2021
This reverts the revert 02c5ba8

Fix:

Pass was registered as DUMMY_FUNCTION_PASS causing the newpm-pass
functions to be doubly defined. Triggered in -DLLVM_ENABLE_MODULE=1
builds.

Original commit:

This patch implements expansion of llvm.vp.* intrinsics
(https://llvm.org/docs/LangRef.html#vector-predication-intrinsics).

VP expansion is required for targets that do not implement VP code
generation. Since expansion is controllable with TTI, targets can switch
on the VP intrinsics they do support in their backend offering a smooth
transition strategy for VP code generation (VE, RISC-V V, ARM SVE,
AVX512, ..).

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D78203
atrosinenko pushed a commit that referenced this issue May 8, 2021
…ing it

Having nested macros in the C code could cause clangd to fail an assert in clang::Preprocessor::setLoadedMacroDirective() and crash.

 #1 0x00000000007ace30 PrintStackTraceSignalHandler(void*) /qdelacru/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1
 #2 0x00000000007aaded llvm::sys::RunSignalHandlers() /qdelacru/llvm-project/llvm/lib/Support/Signals.cpp:76:20
 #3 0x00000000007ac7c1 SignalHandler(int) /qdelacru/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
 #4 0x00007f096604db20 __restore_rt (/lib64/libpthread.so.0+0x12b20)
 #5 0x00007f0964b307ff raise (/lib64/libc.so.6+0x377ff)
 #6 0x00007f0964b1ac35 abort (/lib64/libc.so.6+0x21c35)
 #7 0x00007f0964b1ab09 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21b09)
 #8 0x00007f0964b28de6 (/lib64/libc.so.6+0x2fde6)
 #9 0x0000000001004d1a clang::Preprocessor::setLoadedMacroDirective(clang::IdentifierInfo*, clang::MacroDirective*, clang::MacroDirective*) /qdelacru/llvm-project/clang/lib/Lex/PPMacroExpansion.cpp:116:5

An example of the code that causes the assert failure:
```
...
```

During code completion in clangd, the macros will be loaded in loadMainFilePreambleMacros() by iterating over the macro names and calling PreambleIdentifiers->get(). Since these macro names are store in a StringSet (has StringMap underlying container), the order of the iterator is not guaranteed to be same as the order seen in the source code.

When clangd is trying to resolve nested macros it sometimes attempts to load them out of order which causes a macro to be stored twice. In the example above, ECHO2 macro gets resolved first, but since it uses another macro that has not been resolved it will try to resolve/store that as well. Now there are two MacroDirectives stored in the Preprocessor, ECHO and ECHO2. When clangd tries to load the next macro, ECHO, the preprocessor fails an assert in clang::Preprocessor::setLoadedMacroDirective() because there is already a MacroDirective stored for that macro name.

In this diff, I check if the macro is already inside the IdentifierTable and if it is skip it so that it is not resolved twice.

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D101870
atrosinenko pushed a commit that referenced this issue Jul 13, 2023
Running this on Amazon Ubuntu the final backtrace is:
```
(lldb) thread backtrace
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
  * frame #0: 0x0000aaaaaaaa07d0 a.out`func_c at main.c:10:3
    frame #1: 0x0000aaaaaaaa07c4 a.out`func_b at main.c:14:3
    frame #2: 0x0000aaaaaaaa07b4 a.out`func_a at main.c:18:3
    frame #3: 0x0000aaaaaaaa07a4 a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:22:3
    frame #4: 0x0000fffff7b373fc libc.so.6`___lldb_unnamed_symbol2962 + 108
    frame #5: 0x0000fffff7b374cc libc.so.6`__libc_start_main + 152
    frame #6: 0x0000aaaaaaaa06b0 a.out`_start + 48
```
This causes the test to fail because of the extra ___lldb_unnamed_symbol2962 frame
(an inlined function?).

To fix this, strictly check all the frames in main.c then for the rest
just check we find __libc_start_main and _start in that order regardless
of other frames in between.

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D154204
atrosinenko pushed a commit that referenced this issue Jul 14, 2023
…tput

The crash happens in clang::driver::tools::SplitDebugName when Output is
InputInfo::Nothing. It doesn't happen with standalone clang driver because
output is created in Driver::BuildJobsForActionNoCache.

Example backtrace:
```
* thread #1, name = 'clangd', stop reason = hit program assert
  * frame #0: 0x00007ffff5c4eacf libc.so.6`raise + 271
    frame #1: 0x00007ffff5c21ea5 libc.so.6`abort + 295
    frame #2: 0x00007ffff5c21d79 libc.so.6`__assert_fail_base.cold.0 + 15
    frame #3: 0x00007ffff5c47426 libc.so.6`__assert_fail + 70
    frame #4: 0x000055555dc0923c clangd`clang::driver::InputInfo::getFilename(this=0x00007fffffff9398) const at InputInfo.h:84:5
    frame #5: 0x000055555dcd0d8d clangd`clang::driver::tools::SplitDebugName(JA=0x000055555f6c6a50, Args=0x000055555f6d0b80, Input=0x00007fffffff9678, Output=0x00007fffffff9398) at CommonArgs.cpp:1275:40
    frame #6: 0x000055555dc955a5 clangd`clang::driver::tools::Clang::ConstructJob(this=0x000055555f6c69d0, C=0x000055555f6c64a0, JA=0x000055555f6c6a50, Output=0x00007fffffff9398, Inputs=0x00007fffffff9668, Args=0x000055555f6d0b80, LinkingOutput=0x0000000000000000) const at Clang.cpp:5690:33
    frame #7: 0x000055555dbf6b54 clangd`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fffffffb5e0, C=0x000055555f6c64a0, A=0x000055555f6c6a50, TC=0x000055555f6c4be0, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=1, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:5618:10
    frame #8: 0x000055555dbf4ef0 clangd`clang::driver::Driver::BuildJobsForAction(this=0x00007fffffffb5e0, C=0x000055555f6c64a0, A=0x000055555f6c6a50, TC=0x000055555f6c4be0, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=1, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:5306:26
    frame #9: 0x000055555dbeb590 clangd`clang::driver::Driver::BuildJobs(this=0x00007fffffffb5e0, C=0x000055555f6c64a0) const at Driver.cpp:4844:5
    frame #10: 0x000055555dbe6b0f clangd`clang::driver::Driver::BuildCompilation(this=0x00007fffffffb5e0, ArgList=ArrayRef<const char *> @ 0x00007fffffffb268) at Driver.cpp:1496:3
    frame #11: 0x000055555b0cc0d9 clangd`clang::createInvocation(ArgList=ArrayRef<const char *> @ 0x00007fffffffbb38, Opts=CreateInvocationOptions @ 0x00007fffffffbb90) at CreateInvocationFromCommandLine.cpp:53:52
    frame #12: 0x000055555b378e7b clangd`clang::clangd::buildCompilerInvocation(Inputs=0x00007fffffffca58, D=0x00007fffffffc158, CC1Args=size=0) at Compiler.cpp:116:44
    frame #13: 0x000055555895a6c8 clangd`clang::clangd::(anonymous namespace)::Checker::buildInvocation(this=0x00007fffffffc760, TFS=0x00007fffffffe570, Contents= Has Value=false ) at Check.cpp:212:9
    frame #14: 0x0000555558959cec clangd`clang::clangd::check(File=(Data = "build/test.cpp", Length = 64), TFS=0x00007fffffffe570, Opts=0x00007fffffffe600) at Check.cpp:486:34
    frame #15: 0x000055555892164a clangd`main(argc=4, argv=0x00007fffffffecd8) at ClangdMain.cpp:993:12
    frame #16: 0x00007ffff5c3ad85 libc.so.6`__libc_start_main + 229
    frame #17: 0x00005555585bbe9e clangd`_start + 46
```

Test Plan: ninja ClangDriverTests && tools/clang/unittests/Driver/ClangDriverTests

Differential Revision: https://reviews.llvm.org/D154602
asl pushed a commit that referenced this issue Jul 24, 2023
ParmVarDecl of BlockDecl is unnecessarily dumped twice.
Remove this duplication as other FunctionDecls.

Fixes llvm/llvm-project#64005 (#2)

Differential Revision: https://reviews.llvm.org/D155985
asl pushed a commit that referenced this issue Aug 25, 2023
ThreadSanitizer reports the following issue:

```
  Write of size 8 at 0x00010a70abb0 by thread T3 (mutexes: write M0):
    #0 lldb_private::ThreadList::Update(lldb_private::ThreadList&) ThreadList.cpp:741 (liblldb.18.0.0git.dylib:arm64+0x5dedf4) (BuildId: 9bced2aafa373580ae9d750d9cf79a8f32000000200000000100000000000e00)
    #1 lldb_private::Process::UpdateThreadListIfNeeded() Process.cpp:1212 (liblldb.18.0.0git.dylib:arm64+0x53bbec) (BuildId: 9bced2aafa373580ae9d750d9cf79a8f32000000200000000100000000000e00)

  Previous read of size 8 at 0x00010a70abb0 by main thread (mutexes: write M1):
    #0 lldb_private::ThreadList::GetMutex() const ThreadList.cpp:785 (liblldb.18.0.0git.dylib:arm64+0x5df138) (BuildId: 9bced2aafa373580ae9d750d9cf79a8f32000000200000000100000000000e00)
    #1 lldb_private::ThreadList::DidResume() ThreadList.cpp:656 (liblldb.18.0.0git.dylib:arm64+0x5de5c0) (BuildId: 9bced2aafa373580ae9d750d9cf79a8f32000000200000000100000000000e00)
    #2 lldb_private::Process::PrivateResume() Process.cpp:3130 (liblldb.18.0.0git.dylib:arm64+0x53cd7c) (BuildId: 9bced2aafa373580ae9d750d9cf79a8f32000000200000000100000000000e00)
```

Fix this by only using the mutex in ThreadList and removing the one in
process entirely.

Differential Revision: https://reviews.llvm.org/D158034
asl pushed a commit that referenced this issue Aug 25, 2023
Replace `BPFMIPeepholeTruncElim` by adding an overload for
`TargetLowering::isZExtFree()` aware that zero extension is
free for `ISD::LOAD`.

Short description
=================

The `BPFMIPeepholeTruncElim` handles two patterns:

Pattern #1:

    %1 = LDB %0, ...              %1 = LDB %0, ...
    %2 = AND_ri %1, 0xff      ->  %2 = MOV_ri %1    <-- (!)

Pattern #2:

    bb.1:                         bb.1:
      %a = LDB %0, ...              %a = LDB %0, ...
      br %bb3                       br %bb3
    bb.2:                         bb.2:
      %b = LDB %0, ...        ->    %b = LDB %0, ...
      br %bb3                       br %bb3
    bb.3:                         bb.3:
      %1 = PHI %a, %b               %1 = PHI %a, %b
      %2 = AND_ri %1, 0xff          %2 = MOV_ri %1  <-- (!)

Plus variations:
- AND_ri_32 instead of AND_ri
- SLL/SLR instead of AND_ri
- LDH, LDW, LDB32, LDH32, LDW32

Both patterns could be handled by built-in transformations at
instruction selection phase if suitable `isZExtFree()` implementation
is provided. The idea is borrowed from `ARMTargetLowering::isZExtFree`.

When evaluating on BPF kernel selftests and remove_truncate_*.ll LLVM
test cases this revisions performs slightly better than
BPFMIPeepholeTruncElim, see "Impact" section below for details.

Commit also adds a few test cases to make sure that patterns in
question are handled.

Long description
================

Why this works: Pattern #1
--------------------------

Consider the following example:

    define i1 @foo(ptr %p) {
    entry:
      %a = load i8, ptr %p, align 1
      %cond = icmp eq i8 %a, 0
      ret i1 %cond
    }

Log for `llc -mcpu=v2 -mtriple=bpfel -debug-only=isel` command:

    ...
    Type-legalized selection DAG: %bb.0 'foo:entry'
    SelectionDAG has 13 nodes:
      t0: ch,glue = EntryToken
              t2: i64,ch = CopyFromReg t0, Register:i64 %0
            t16: i64,ch = load<(load (s8) from %ir.p), anyext from i8> t0, t2, undef:i64
          t19: i64 = and t16, Constant:i64<255>
        t17: i64 = setcc t19, Constant:i64<0>, seteq:ch
      t11: ch,glue = CopyToReg t0, Register:i64 $r0, t17
      t12: ch = BPFISD::RET_GLUE t11, Register:i64 $r0, t11:1
    ...
    Replacing.1 t19: i64 = and t16, Constant:i64<255>
    With: t16: i64,ch = load<(load (s8) from %ir.p), anyext from i8> t0, t2, undef:i64
     and 0 other values
    ...
    Optimized type-legalized selection DAG: %bb.0 'foo:entry'
    SelectionDAG has 11 nodes:
      t0: ch,glue = EntryToken
            t2: i64,ch = CopyFromReg t0, Register:i64 %0
          t20: i64,ch = load<(load (s8) from %ir.p), zext from i8> t0, t2, undef:i64
        t17: i64 = setcc t20, Constant:i64<0>, seteq:ch
      t11: ch,glue = CopyToReg t0, Register:i64 $r0, t17
      t12: ch = BPFISD::RET_GLUE t11, Register:i64 $r0, t11:1
    ...

Note:
- Optimized type-legalized selection DAG:
  - `t19 = and t16, 255` had been replaced by `t16` (load).
  - Patterns like `(and (load ... i8), 255)` are replaced by `load`
    in `DAGCombiner::BackwardsPropagateMask` called from
    `DAGCombiner::visitAND`.
  - Similarly patterns like `(shl (srl ..., 56), 56)` are replaced by
    `(and ..., 255)` in `DAGCombiner::visitSRL` (this function is huge,
    look for `TLI.shouldFoldConstantShiftPairToMask()` call).

Why this works: Pattern #2
--------------------------

Consider the following example:

    define i1 @foo(ptr %p) {
    entry:
      %a = load i8, ptr %p, align 1
      br label %next

    next:
      %cond = icmp eq i8 %a, 0
      ret i1 %cond
    }

Consider log for `llc -mcpu=v2 -mtriple=bpfel -debug-only=isel` command.
Log for first basic block:

    Initial selection DAG: %bb.0 'foo:entry'
    SelectionDAG has 9 nodes:
      t0: ch,glue = EntryToken
      t3: i64 = Constant<0>
            t2: i64,ch = CopyFromReg t0, Register:i64 %1
          t5: i8,ch = load<(load (s8) from %ir.p)> t0, t2, undef:i64
        t6: i64 = zero_extend t5
      t8: ch = CopyToReg t0, Register:i64 %0, t6
    ...
    Replacing.1 t6: i64 = zero_extend t5
    With: t9: i64,ch = load<(load (s8) from %ir.p), zext from i8> t0, t2, undef:i64
     and 0 other values
    ...
    Optimized lowered selection DAG: %bb.0 'foo:entry'
    SelectionDAG has 7 nodes:
      t0: ch,glue = EntryToken
          t2: i64,ch = CopyFromReg t0, Register:i64 %1
        t9: i64,ch = load<(load (s8) from %ir.p), zext from i8> t0, t2, undef:i64
      t8: ch = CopyToReg t0, Register:i64 %0, t9

Note:
- Initial selection DAG:
  - `%a = load ...` is lowered as `t6 = (zero_extend (load ...))`
    w/o special `isZExtFree()` overload added by this commit
    it is instead lowered as `t6 = (any_extend (load ...))`.
  - The decision to generate `zero_extend` or `any_extend` is
    done in `RegsForValue::getCopyToRegs` called from
    `SelectionDAGBuilder::CopyValueToVirtualRegister`:
    - if `isZExtFree()` for load returns true `zero_extend` is used;
    - `any_extend` is used otherwise.
- Optimized lowered selection DAG:
  - `t6 = (any_extend (load ...))` is replaced by
    `t9 = load ..., zext from i8`
    This is done by `DagCombiner.cpp:tryToFoldExtOfLoad()` called from
    `DAGCombiner::visitZERO_EXTEND`.

Log for second basic block:

    Initial selection DAG: %bb.1 'foo:next'
    SelectionDAG has 13 nodes:
      t0: ch,glue = EntryToken
                t2: i64,ch = CopyFromReg t0, Register:i64 %0
              t4: i64 = AssertZext t2, ValueType:ch:i8
            t5: i8 = truncate t4
          t8: i1 = setcc t5, Constant:i8<0>, seteq:ch
        t9: i64 = any_extend t8
      t11: ch,glue = CopyToReg t0, Register:i64 $r0, t9
      t12: ch = BPFISD::RET_GLUE t11, Register:i64 $r0, t11:1
    ...
    Replacing.2 t18: i64 = and t4, Constant:i64<255>
    With: t4: i64 = AssertZext t2, ValueType:ch:i8
    ...
    Type-legalized selection DAG: %bb.1 'foo:next'
    SelectionDAG has 13 nodes:
      t0: ch,glue = EntryToken
              t2: i64,ch = CopyFromReg t0, Register:i64 %0
            t4: i64 = AssertZext t2, ValueType:ch:i8
          t18: i64 = and t4, Constant:i64<255>
        t16: i64 = setcc t18, Constant:i64<0>, seteq:ch
      t11: ch,glue = CopyToReg t0, Register:i64 $r0, t16
      t12: ch = BPFISD::RET_GLUE t11, Register:i64 $r0, t11:1
    ...
    Optimized type-legalized selection DAG: %bb.1 'foo:next'
    SelectionDAG has 11 nodes:
      t0: ch,glue = EntryToken
            t2: i64,ch = CopyFromReg t0, Register:i64 %0
          t4: i64 = AssertZext t2, ValueType:ch:i8
        t16: i64 = setcc t4, Constant:i64<0>, seteq:ch
      t11: ch,glue = CopyToReg t0, Register:i64 $r0, t16
      t12: ch = BPFISD::RET_GLUE t11, Register:i64 $r0, t11:1
    ...

Note:
- Initial selection DAG:
  - `t0` is an input value for this basic block, it corresponds load
    instruction (`t9`) from the first basic block.
  - It is accessed within basic block via
    `t4` (AssertZext (CopyFromReg t0, ...)).
  - The `AssertZext` is generated by RegsForValue::getCopyFromRegs
    called from SelectionDAGBuilder::getCopyFromRegs, it is generated
    only when `LiveOutInfo` with known number of leading zeros is
    present for `t0`.
  - Known register bits in `LiveOutInfo` are computed by
    `SelectionDAG::computeKnownBits` called from
    `SelectionDAGISel::ComputeLiveOutVRegInfo`.
  - `computeKnownBits()` generates leading zeros information for
    `(load ..., zext from ...)` but *does not* generate leading zeros
    information for `(load ..., anyext from ...)`.
    This is why `isZExtFree()` added in this commit is important.
- Type-legalized selection DAG:
  - `t5 = truncate t4` is replaced by `t18 = and t4, 255`
- Optimized type-legalized selection DAG:
  - `t18 = and t4, 255` is replaced by `t4`, this is done by
    `DAGCombiner::SimplifyDemandedBits` called from
    `DAGCombiner::visitAND`, which simplifies patterns like
    `(and (assertzext ...))`

Impact
------

This change covers all remove_truncate_*.ll test cases:
- for -mcpu=v4 there are no changes in the generated code;
- for -mcpu=v2 code generated for remove_truncate_7 and
  remove_truncate_8 improved slightly, for other tests it is
  unchanged.

For remove_truncate_7:

    Before this revision                 After this revision
    --------------------                 -------------------
        r1 <<= 0x20                          r1 <<= 0x20
        r1 >>= 0x20                          r1 >>= 0x20
        if r1 == 0x0 goto +0x2 <LBB0_2>      if r1 == 0x0 goto +0x2 <LBB0_2>
        r1 = *(u32 *)(r2 + 0x0)              r0 = *(u32 *)(r2 + 0x0)
        goto +0x1 <LBB0_3>                   goto +0x1 <LBB0_3>
    <LBB0_2>:                            <LBB0_2>:
        r1 = *(u32 *)(r2 + 0x4)              r0 = *(u32 *)(r2 + 0x4)
    <LBB0_3>:                            <LBB0_3>:
        r0 = r1                              exit
        exit

For remove_truncate_8:

    Before this revision                 After this revision
    --------------------                 -------------------
        r2 = *(u32 *)(r1 + 0x0)              r2 = *(u32 *)(r1 + 0x0)
        r3 = r2                              r3 = r2
        r3 <<= 0x20                          r3 <<= 0x20
        r4 = r3                              r3 s>>= 0x20
        r4 s>>= 0x20
        if r4 s> 0x2 goto +0x5 <LBB0_3>      if r3 s> 0x2 goto +0x4 <LBB0_3>
        r4 = *(u32 *)(r1 + 0x4)              r3 = *(u32 *)(r1 + 0x4)
        r3 >>= 0x20
        if r3 >= r4 goto +0x2 <LBB0_3>       if r2 >= r3 goto +0x2 <LBB0_3>
        r2 += 0x2                            r2 += 0x2
        *(u32 *)(r1 + 0x0) = r2              *(u32 *)(r1 + 0x0) = r2
    <LBB0_3>:                            <LBB0_3>:
        r0 = 0x3                             r0 = 0x3
        exit                                 exit

For kernel BPF selftests statistics is as follows: (-mcpu=v4):
- For -mcpu=v4: 9 out of 655 object files have differences,
  in all cases total number of instructions marginally decreased
  (-27 instructions).
- For -mcpu=v2: 9 out of 655 object files have differences:
  - For 19 object files number of instruction decreased
    (-129 instruction in total): some redundant `rX &= 0xffff`
    and register to register assignments removed;
  - For 2 object files number of instructions increased +2
    instructions in each file.

Both -mcpu=v2 instruction increases could be reduced to the same
example:

    define void @foo(ptr %p) {
    entry:
      %a = load i32, ptr %p, align 4
      %b = sext i32 %a to i64
      %c = icmp ult i64 1, %b
      br i1 %c, label %next, label %end

    next:
      call void inttoptr (i64 62 to ptr)(i32 %a)
      br label %end

    end:
      ret void
    }

Note that this example uses value loaded to `%a` both as a sign
extended (`%b`) and as zero extended (`%a` passed as parameter).
Here is the difference in final assembly code:

    Before this revision          After this revision
    --------------------          -------------------
        r1 = *(u32 *)(r1 + 0)         r1 = *(u32 *)(r1 + 0)
        r1 <<= 32                     r1 <<= 32
        r1 s>>= 32                    r1 s>>= 32
        if r1 < 2 goto <LBB0_2>       if r1 < 2 goto <LBB0_2>
                                      r1 <<= 32
                                      r1 >>= 32
        call 62                       call 62
    <LBB0_2>:                     <LBB0_2>:
        exit                          exit

Before this commit `%a` is passed to call as a sign extended value,
after this commit `%a` is passed to call as a zero extended value,
both are correct as 32-bit sub-register is the same.

The difference comes from `DAGCombiner` operation on the initial DAG:

Initial selection DAG before this commit:

    t5: i32,ch = load<(load (s32) from %ir.p)> t0, t2, undef:i64
          t6: i64 = any_extend t5         <--------------------- (1)
        t8: ch = CopyToReg t0, Register:i64 %0, t6
            t9: i64 = sign_extend t5
          t12: i1 = setcc Constant:i64<1>, t9, setult:ch

Initial selection DAG after this commit:

    t5: i32,ch = load<(load (s32) from %ir.p)> t0, t2, undef:i64
          t6: i64 = zero_extend t5        <--------------------- (2)
        t8: ch = CopyToReg t0, Register:i64 %0, t6
            t9: i64 = sign_extend t5
          t12: i1 = setcc Constant:i64<1>, t9, setult:ch

The node `t9` is processed before node `t6` and `load` instruction is
combined to load with sign extension:

    Replacing.1 t9: i64 = sign_extend t5
    With: t30: i64,ch = load<(load (s32) from %ir.p), sext from i32> t0, t2, undef:i64
     and 0 other values
    Replacing.1 t5: i32,ch = load<(load (s32) from %ir.p)> t0, t2, undef:i64
    With: t31: i32 = truncate t30
     and 1 other values

This is done by `DAGCombiner.cpp:tryToFoldExtOfLoad` called from
`DAGCombiner::visitSIGN_EXTEND`. Note that `t5` is used by `t6` which
is `any_extend` in (1) and `zero_extend` in (2).
`tryToFoldExtOfLoad()` rewrites such uses of `t5` differently:
- `any_extend` is simply removed
- `zero_extend` is replaced by `and t30, 0xffffffff`, which is later
  converted to a pair of shifts. This pair of shifts survives till the
  end of translation.

Differential Revision: https://reviews.llvm.org/D157870
atrosinenko pushed a commit that referenced this issue Aug 28, 2023
This reverts commit 0e63f1a.

clang-format started to crash with contents like:
a.h:
```
```
$ clang-format a.h
```
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: ../llvm/build/bin/clang-format a.h
 #0 0x0000560b689fe177 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x0000560b689fbfbe llvm::sys::RunSignalHandlers() /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Signals.cpp:106:18
 #2 0x0000560b689feaca SignalHandler(int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x00007f030405a540 (/lib/x86_64-linux-gnu/libc.so.6+0x3c540)
 #4 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/include/clang/Lex/Token.h:98:44
 #5 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:562:51
 #6 0x0000560b68a9a980 startsSequenceInternal<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:831:9
 #7 0x0000560b68a9a980 startsSequence<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:600:12
 #8 0x0000560b68a9a980 getFunctionName /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3131:17
 #9 0x0000560b68a9a980 clang::format::TokenAnnotator::annotate(clang::format::AnnotatedLine&) /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3191:17
Segmentation fault
```
asl pushed a commit that referenced this issue Sep 13, 2023
The new ACLE PR#225[1] now combines the slice parameters for some
builtins. This patch is the #2 of 3 patches to update the interface.

Slice specifies the ZA slice number directly and needs to be explicity
implemented by the "user" with the base register plus the immediate
offset

[1]https://github.com/ARM-software/acle/pull/225/files
@asl asl added the msp430 MSP430 target label Nov 13, 2023
@asl asl closed this as completed Nov 13, 2023
atrosinenko pushed a commit that referenced this issue Nov 15, 2023
…tePluginObject

After llvm/llvm-project#68052 this function changed from returning
a nullptr with `return {};` to returning Expected and hitting `llvm_unreachable` before it could
do so.

I gather that we're never supposed to call this function, but on Windows we actually do call
this function because `interpreter->CreateScriptedProcessInterface()` returns
`ScriptedProcessInterface` not `ScriptedProcessPythonInterface`. Likely because
`target_sp->GetDebugger().GetScriptInterpreter()` also does not return a Python related class.

The previously XFAILed test crashed with:
```
 # .---command stderr------------
 # | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
 # | Stack dump:
 # | 0.  Program arguments: c:\\users\\tcwg\\david.spickett\\build-llvm\\bin\\lldb-test.exe ir-memory-map C:\\Users\\tcwg\\david.spickett\\build-llvm\\tools\\lldb\\test\\Shell\\Expr\\Output\\TestIRMemoryMapWindows.test.tmp C:\\Users\\tcwg\\david.spickett\\llvm-project\\lldb\\test\\Shell\\Expr/Inputs/ir-memory-map-basic
 # | 1.  HandleCommand(command = "run")
 # | Exception Code: 0xC000001D
 # | #0 0x00007ff696b5f588 lldb_private::ScriptedProcessInterface::CreatePluginObject(class llvm::StringRef, class lldb_private::ExecutionContext &, class std::shared_ptr<class lldb_private::StructuredData::Dictionary>, class lldb_private::StructuredData::Generic *) C:\Users\tcwg\david.spickett\llvm-project\lldb\include\lldb\Interpreter\Interfaces\ScriptedProcessInterface.h:28:0
 # | #1 0x00007ff696b1d808 llvm::Expected<std::shared_ptr<lldb_private::StructuredData::Generic> >::operator bool C:\Users\tcwg\david.spickett\llvm-project\llvm\include\llvm\Support\Error.h:567:0
 # | #2 0x00007ff696b1d808 lldb_private::ScriptedProcess::ScriptedProcess(class std::shared_ptr<class lldb_private::Target>, class std::shared_ptr<class lldb_private::Listener>, class lldb_private::ScriptedMetadata const &, class lldb_private::Status &) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Plugins\Process\scripted\ScriptedProcess.cpp:115:0
 # | #3 0x00007ff696b1d124 std::shared_ptr<lldb_private::ScriptedProcess>::shared_ptr C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1478:0
 # | #4 0x00007ff696b1d124 lldb_private::ScriptedProcess::CreateInstance(class std::shared_ptr<class lldb_private::Target>, class std::shared_ptr<class lldb_private::Listener>, class lldb_private::FileSpec const *, bool) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Plugins\Process\scripted\ScriptedProcess.cpp:61:0
 # | #5 0x00007ff69699c8f4 std::_Ptr_base<lldb_private::Process>::_Move_construct_from C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1237:0
 # | #6 0x00007ff69699c8f4 std::shared_ptr<lldb_private::Process>::shared_ptr C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1534:0
 # | #7 0x00007ff69699c8f4 std::shared_ptr<lldb_private::Process>::operator= C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1594:0
 # | #8 0x00007ff69699c8f4 lldb_private::Process::FindPlugin(class std::shared_ptr<class lldb_private::Target>, class llvm::StringRef, class std::shared_ptr<class lldb_private::Listener>, class lldb_private::FileSpec const *, bool) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Target\Process.cpp:396:0
 # | #9 0x00007ff6969bd708 std::_Ptr_base<lldb_private::Process>::_Move_construct_from C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1237:0
 # | #10 0x00007ff6969bd708 std::shared_ptr<lldb_private::Process>::shared_ptr C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1534:0
 # | #11 0x00007ff6969bd708 std::shared_ptr<lldb_private::Process>::operator= C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1594:0
 # | #12 0x00007ff6969bd708 lldb_private::Target::CreateProcess(class std::shared_ptr<class lldb_private::Listener>, class llvm::StringRef, class lldb_private::FileSpec const *, bool) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Target\Target.cpp:215:0
 # | #13 0x00007ff696b13af0 std::_Ptr_base<lldb_private::Process>::_Ptr_base C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1230:0
 # | #14 0x00007ff696b13af0 std::shared_ptr<lldb_private::Process>::shared_ptr C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1524:0
 # | #15 0x00007ff696b13af0 lldb_private::PlatformWindows::DebugProcess(class lldb_private::ProcessLaunchInfo &, class lldb_private::Debugger &, class lldb_private::Target &, class lldb_private::Status &) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Plugins\Platform\Windows\PlatformWindows.cpp:495:0
 # | #16 0x00007ff6969cf590 std::_Ptr_base<lldb_private::Process>::_Move_construct_from C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1237:0
 # | #17 0x00007ff6969cf590 std::shared_ptr<lldb_private::Process>::shared_ptr C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1534:0
 # | #18 0x00007ff6969cf590 std::shared_ptr<lldb_private::Process>::operator= C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1594:0
 # | #19 0x00007ff6969cf590 lldb_private::Target::Launch(class lldb_private::ProcessLaunchInfo &, class lldb_private::Stream *) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Target\Target.cpp:3274:0
 # | #20 0x00007ff696fff82c CommandObjectProcessLaunch::DoExecute(class lldb_private::Args &, class lldb_private::CommandReturnObject &) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Commands\CommandObjectProcess.cpp:258:0
 # | #21 0x00007ff696fab6c0 lldb_private::CommandObjectParsed::Execute(char const *, class lldb_private::CommandReturnObject &) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Interpreter\CommandObject.cpp:751:0
 # `-----------------------------
 # error: command failed with exit status: 0xc000001d
```

That might be a bug on the Windows side, or an artifact of how our build is setup,
but whatever it is, having `CreatePluginObject` return an error and
the caller check it, fixes the failing test.

The built lldb can run the script command to use Python, but I'm not sure if that means
anything.
atrosinenko pushed a commit that referenced this issue Nov 15, 2023
…e defintion if available (#71004)"

This reverts commit ef3feba.

This caused an LLDB test failure on Linux for `lang/cpp/symbols/TestSymbols.test_dwo`:

```
make: Leaving directory '/home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/lang/cpp/symbols/TestSymbols.test_dwo'
runCmd: expression -- D::i
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	HandleCommand(command = "expression -- D::i")
1.	<user expression 0>:1:4: current parser token 'i'
2.	<lldb wrapper prefix>:44:1: parsing function body '$__lldb_expr'
3.	<lldb wrapper prefix>:44:1: in compound statement ('{}')
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  _lldb.cpython-39-x86_64-linux-gnu.so 0x00007fbcfcb08b87
1  _lldb.cpython-39-x86_64-linux-gnu.so 0x00007fbcfcb067ae
2  _lldb.cpython-39-x86_64-linux-gnu.so 0x00007fbcfcb0923f
3  libpthread.so.0                      0x00007fbd07ab7140
```

And a failure in `TestCallStdStringFunction.py` on Linux aarch64:
```
--
Exit Code: -11

Command Output (stdout):
--
lldb version 18.0.0git (https://github.com/llvm/llvm-project.git revision ef3feba)
  clang revision ef3feba
  llvm revision ef3feba

--
Command Output (stderr):
--
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      HandleCommand(command = "expression str")
1.      <lldb wrapper prefix>:45:34: current parser token ';'
2.      <lldb wrapper prefix>:44:1: parsing function body '$__lldb_expr'
3.      <lldb wrapper prefix>:44:1: in compound statement ('{}')
  #0 0x0000ffffb72a149c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_[lldb.cpython-38-aarch64-linux-gnu.so](http://lldb.cpython-38-aarch64-linux-gnu.so/)+0x58c749c)
  #1 0x0000ffffb729f458 llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_[lldb.cpython-38-aarch64-linux-gnu.so](http://lldb.cpython-38-aarch64-linux-gnu.so/)+0x58c5458)
  #2 0x0000ffffb72a1bd0 SignalHandler(int) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_[lldb.cpython-38-aarch64-linux-gnu.so](http://lldb.cpython-38-aarch64-linux-gnu.so/)+0x58c7bd0)
  #3 0x0000ffffbdd9e7dc (linux-vdso.so.1+0x7dc)
  #4 0x0000ffffb71799d8 lldb_private::plugin::dwarf::SymbolFileDWARF::FindGlobalVariables(lldb_private::ConstString, lldb_private::CompilerDeclContext const&, unsigned int, lldb_private::VariableList&) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_[lldb.cpython-38-aarch64-linux-gnu.so](http://lldb.cpython-38-aarch64-linux-gnu.so/)+0x579f9d8)
  #5 0x0000ffffb7197508 DWARFASTParserClang::FindConstantOnVariableDefinition(lldb_private::plugin::dwarf::DWARFDIE) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_[lldb.cpython-38-aarch64-linux-gnu.so](http://lldb.cpython-38-aarch64-linux-gnu.so/)+0x57bd508)
```
atrosinenko pushed a commit that referenced this issue Nov 15, 2023
The const.cpp testcase fails when running in MSVC mode, while it does
succeed in MinGW mode.

In MSVC mode, there are more constructor invocations than expected, as
the printout looks like this:

    A(1), this = 0000025597930000
    A(1), this = 0000025597930000
    f: this = 0000025597930000, val = 1
    A(1), this = 0000025597930000
    f: this = 0000025597930000, val = 1
    ~A, this = 0000025597930000, val = 1
    ~A, this = 0000025597930000, val = 1
    ~A, this = 0000025597930000, val = 1

While the expected printout looks like this:

    A(1), this = 000002C903E10000
    f: this = 000002C903E10000, val = 1
    f: this = 000002C903E10000, val = 1
    ~A, this = 000002C903E10000, val = 1

Reapplying #70991 with the XFAIL changed to check the host triple, not
the target triple. On an MSVC based build of Clang, but with the default
target triple set to PS4/PS5, we will still see the failure. And a Linux
based build of Clang that targets PS4/PS5 won't see the issue.
atrosinenko pushed a commit that referenced this issue Nov 15, 2023
…ooking options for a custom subcommand (#71975)

…ooking options for a custom subcommand. (#71776)"

This reverts commit b88308b.

The build-bot is unhappy
(https://lab.llvm.org/buildbot/#/builders/186/builds/13096),
`GroupingAndPrefix` fails after `TopLevelOptInSubcommand` (the newly
added test).

Revert while I look into this (might be related with test sharding but
not sure)

```

[----------] 3 tests from CommandLineTest
[ RUN      ] CommandLineTest.TokenizeWindowsCommandLine2
[       OK ] CommandLineTest.TokenizeWindowsCommandLine2 (0 ms)
[ RUN      ] CommandLineTest.TopLevelOptInSubcommand
[       OK ] CommandLineTest.TopLevelOptInSubcommand (0 ms)
[ RUN      ] CommandLineTest.GroupingAndPrefix
 #0 0x00ba8118 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x594118)
 #1 0x00ba5914 llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x591914)
 #2 0x00ba89c4 SignalHandler(int) (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5949c4)
 #3 0xf7828530 __default_sa_restorer /build/glibc-9MGTF6/glibc-2.31/signal/../sysdeps/unix/sysv/linux/arm/sigrestorer.S:67:0
 #4 0x00af91f0 (anonymous namespace)::CommandLineParser::ResetAllOptionOccurrences() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x4e51f0)
 #5 0x00af8e1c llvm::cl::ResetCommandLineParser() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x4e4e1c)
 #6 0x0077cda0 (anonymous namespace)::CommandLineTest_GroupingAndPrefix_Test::TestBody() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x168da0)
 #7 0x00bc5adc testing::Test::Run() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5b1adc)
 #8 0x00bc6cc0 testing::TestInfo::Run() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5b2cc0)
 #9 0x00bc7880 testing::TestSuite::Run() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5b3880)
#10 0x00bd7974 testing::internal::UnitTestImpl::RunAllTests() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5c3974)
#11 0x00bd6ebc testing::UnitTest::Run() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5c2ebc)
#12 0x00bb1058 main (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x59d058)
#13 0xf78185a4 __libc_start_main /build/glibc-9MGTF6/glibc-2.31/csu/libc-start.c:342:3
```
atrosinenko pushed a commit that referenced this issue Nov 15, 2023
… functions (#72069)

Fixes a bug introduced by
commit f95b2f1 ("Reland [InstrProf][compiler-rt] Enable MC/DC
Support in LLVM Source-based Code Coverage (1/3)")

The InstrProfiling pass was refactored when introducing support for
MC/DC such that the creation of the data variable was abstracted and
called only once per function from ::run(). Because ::run() only
iterated over functions there were not fully inlined, and because it
only created the data variable for the first intrinsic that it saw, data
variables corresponding to functions fully inlined into other
instrumented callers would end up without a data variable, resulting in
loss of coverage information. This patch does the following:

1.) Move the call of createDataVariable() to getOrCreateRegionCounters()
so that the creation of the data variable will happen indirectly either
from ::new() or during profile intrinsic lowering when it is needed.
This effectively restores the behavior prior to the refactor and ensures
that all data variables are created when needed (and not duplicated).

2.) Process all MC/DC bitmap parameter intrinsics in ::run() prior to
calling getOrCreateRegionCounters(). This ensures bitmap regions are
created for each function including functions that are fully inlined. It
also ensures that the bitmap region is created for each function prior
to the creation of the data variable because it is referenced by the
data variable. Again, duplication is prevented if the same parameter
intrinsic is inlined into multiple functions.

3.) No longer pass the MC/DC intrinsic to createDataVariable(). This
decouples the creation of the data variable from a specific MC/DC
intrinsic. Instead, with #2 above, store the number of bitmap bytes
required in the PerFunctionProfileData in the ProfileDataMap along with
the function's CounterRegion and BitmapRegion variables. This ties the
bitmap information directly to the function to which it belongs, and the
data variable created for that function can reference that.
atrosinenko pushed a commit that referenced this issue Dec 4, 2023
…(#73463)

Despite CWG2497 not being resolved, it is reasonable to expect the
following code to compile (and which is supported by other compilers)

```cpp
  template<typename T> constexpr T f();
  constexpr int g() { return f<int>(); } // #1
  template<typename T> constexpr T f() { return 123; }
  int k[g()];
  // #2
```

To that end, we eagerly instantiate all referenced specializations of
constexpr functions when they are defined.

We maintain a map of (pattern, [instantiations]) independent of
`PendingInstantiations` to avoid having to iterate that list after each
function definition.

We should apply the same logic to constexpr variables, but I wanted to
keep the PR small.

Fixes #73232
kovdan01 pushed a commit that referenced this issue Mar 14, 2024
TestCases/Misc/Linux/sigaction.cpp fails because dlsym() may call malloc
on failure. And then the wrapped malloc appears to access thread local
storage using global dynamic accesses, thus calling
___interceptor___tls_get_addr, before REAL(__tls_get_addr) has
been set, so we get a crash inside ___interceptor___tls_get_addr. For
example, this can happen when looking up __isoc23_scanf which might not
exist in some libcs.

Fix this by marking the thread local variable accessed inside the
debug checks as "initial-exec", which does not require __tls_get_addr.

This is probably a better alternative to llvm/llvm-project#83886.

This fixes a different crash but is related to llvm/llvm-project#46204.

Backtrace:
```
#0 0x0000000000000000 in ?? ()
#1 0x00007ffff6a9d89e in ___interceptor___tls_get_addr (arg=0x7ffff6b27be8) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:2759
#2 0x00007ffff6a46bc6 in __sanitizer::CheckedMutex::LockImpl (this=0x7ffff6b27be8, pc=140737331846066) at /path/to/llvm/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:218
#3 0x00007ffff6a448b2 in __sanitizer::CheckedMutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:129
#4 __sanitizer::Mutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:167
#5 0x00007ffff6abdbb2 in __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock (mu=0x730000000580, this=<optimized out>) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:383
#6 __sanitizer::SizeClassAllocator64<__tsan::AP64>::GetFromAllocator (this=0x7ffff7487dc0 <__tsan::allocator_placeholder>, stat=stat@entry=0x7ffff570db68, class_id=11, chunks=chunks@entry=0x7ffff5702cc8, n_chunks=n_chunks@entry=128) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_primary64.h:207
#7 0x00007ffff6abdaa0 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Refill (this=<optimized out>, c=c@entry=0x7ffff5702cb8, allocator=<optimized out>, class_id=<optimized out>)
 at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:103
#8 0x00007ffff6abd731 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Allocate (this=0x7ffff6b27be8, allocator=0x7ffff5702cc8, class_id=140737311157448)
 at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:39
#9 0x00007ffff6abc397 in __sanitizer::CombinedAllocator<__sanitizer::SizeClassAllocator64<__tsan::AP64>, __sanitizer::LargeMmapAllocatorPtrArrayDynamic>::Allocate (this=0x7ffff5702cc8, cache=0x7ffff6b27be8, size=<optimized out>, size@entry=175, alignment=alignment@entry=16)
 at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_combined.h:69
#10 0x00007ffff6abaa6a in __tsan::user_alloc_internal (thr=0x7ffff7ebd980, pc=140737331499943, sz=sz@entry=175, align=align@entry=16, signal=true) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:198
#11 0x00007ffff6abb0d1 in __tsan::user_alloc (thr=0x7ffff6b27be8, pc=140737331846066, sz=11, sz@entry=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:223
#12 0x00007ffff6a693b5 in ___interceptor_malloc (size=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:666
#13 0x00007ffff7fce7f2 in malloc (size=175) at ../include/rtld-malloc.h:56
#14 __GI__dl_exception_create_format (exception=exception@entry=0x7fffffffd0d0, objname=0x7ffff7fc3550 "/path/to/llvm/compiler-rt/cmake-build-all-sanitizers/lib/linux/libclang_rt.tsan-x86_64.so",
 fmt=fmt@entry=0x7ffff7ff2db9 "undefined symbol: %s%s%s") at ./elf/dl-exception.c:157
#15 0x00007ffff7fd50e8 in _dl_lookup_symbol_x (undef_name=0x7ffff6af868b "__isoc23_scanf", undef_map=<optimized out>, ref=0x7fffffffd148, symbol_scope=<optimized out>, version=<optimized out>, type_class=0, flags=2, skip_map=0x7ffff7fc35e0) at ./elf/dl-lookup.c:793
--Type <RET> for more, q to quit, c to continue without paging--
#16 0x00007ffff656d6ed in do_sym (handle=<optimized out>, name=0x7ffff6af868b "__isoc23_scanf", who=0x7ffff6a3bb84 <__interception::InterceptFunction(char const*, unsigned long*, unsigned long, unsigned long)+36>, vers=vers@entry=0x0, flags=flags@entry=2) at ./elf/dl-sym.c:146
#17 0x00007ffff656d9dd in _dl_sym (handle=<optimized out>, name=<optimized out>, who=<optimized out>) at ./elf/dl-sym.c:195
#18 0x00007ffff64a2854 in dlsym_doit (a=a@entry=0x7fffffffd3b0) at ./dlfcn/dlsym.c:40
#19 0x00007ffff7fcc489 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffd310, operate=0x7ffff64a2840 <dlsym_doit>, args=0x7fffffffd3b0) at ./elf/dl-catch.c:237
#20 0x00007ffff7fcc5af in _dl_catch_error (objname=0x7fffffffd368, errstring=0x7fffffffd370, mallocedp=0x7fffffffd367, operate=<optimized out>, args=<optimized out>) at ./elf/dl-catch.c:256
#21 0x00007ffff64a2257 in _dlerror_run (operate=operate@entry=0x7ffff64a2840 <dlsym_doit>, args=args@entry=0x7fffffffd3b0) at ./dlfcn/dlerror.c:138
#22 0x00007ffff64a28e5 in dlsym_implementation (dl_caller=<optimized out>, name=<optimized out>, handle=<optimized out>) at ./dlfcn/dlsym.c:54
#23 ___dlsym (handle=<optimized out>, name=<optimized out>) at ./dlfcn/dlsym.c:68
#24 0x00007ffff6a3bb84 in __interception::GetFuncAddr (name=0x7ffff6af868b "__isoc23_scanf", trampoline=140737311157448) at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:42
#25 __interception::InterceptFunction (name=0x7ffff6af868b "__isoc23_scanf", ptr_to_real=0x7ffff74850e8 <__interception::real___isoc23_scanf>, func=11, trampoline=140737311157448)
 at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:61
#26 0x00007ffff6a9f2d9 in InitializeCommonInterceptors () at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_common_interceptors.inc:10315
```

Reviewed By: vitalybuka, MaskRay

Pull Request: llvm/llvm-project#83890
kovdan01 pushed a commit that referenced this issue Mar 22, 2024
This reverts commit daebe5c.

This commit causes the following asan issue:

```
<snip>/llvm-project/build/bin/mlir-opt <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir | <snip>/llvm-project/build/bin/FileCheck <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir
# executed command: <snip>/llvm-project/build/bin/mlir-opt <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir
# .---command stderr------------
# | =================================================================
# | ==2772558==ERROR: AddressSanitizer: stack-use-after-return on address 0x7fd2c2c42b90 at pc 0x55e406d54614 bp 0x7ffc810e4070 sp 0x7ffc810e4068
# | READ of size 8 at 0x7fd2c2c42b90 thread T0
# |     #0 0x55e406d54613 in operator()<long int const*> /usr/include/c++/13/bits/predefined_ops.h:318
# |     #1 0x55e406d54613 in __count_if<long int const*, __gnu_cxx::__ops::_Iter_pred<mlir::verifyListOfOperandsOrIntegers(Operation*, llvm::StringRef, unsigned int, llvm::ArrayRef<long int>, ValueRange)::<lambda(int64_t)> > > /usr/include/c++/13/bits/stl_algobase.h:2125
# |     #2 0x55e406d54613 in count_if<long int const*, mlir::verifyListOfOperandsOrIntegers(Operation*, 
...
```
kovdan01 pushed a commit that referenced this issue Mar 22, 2024
…oint. (#83821)"

This reverts commit c2c1e6e. It creates
a use after free.

==8342==ERROR: AddressSanitizer: heap-use-after-free on address 0x50f000001760 at pc 0x55b9fb84a8fb bp 0x7ffc18468a10 sp 0x7ffc18468a08
READ of size 1 at 0x50f000001760 thread T0
 #0 0x55b9fb84a8fa in dropPoisonGeneratingFlags llvm/lib/Transforms/Vectorize/VPlan.h:1040:13
 #1 0x55b9fb84a8fa in llvm::VPlanTransforms::dropPoisonGeneratingRecipes(llvm::VPlan&, llvm::function_ref<bool (llvm::BasicBlock*)>)::$_0::operator()(llvm::VPRecipeBase*) const llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:1236:23
 #2 0x55b9fb84a196 in llvm::VPlanTransforms::dropPoisonGeneratingRecipes(llvm::VPlan&, llvm::function_ref<bool (llvm::BasicBlock*)>) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

Can be reproduced with asan on
Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll
Transforms/LoopVectorize/X86/pr81872.ll
Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
kovdan01 pushed a commit that referenced this issue May 13, 2024
…e exception specification of a function (#90760)

[temp.deduct.general] p6 states:
> At certain points in the template argument deduction process it is
necessary to take a function type that makes use of template parameters
and replace those template parameters with the corresponding template
arguments.
This is done at the beginning of template argument deduction when any
explicitly specified template arguments are substituted into the
function type, and again at the end of template argument deduction when
any template arguments that were deduced or obtained from default
arguments are substituted.

[temp.deduct.general] p7 goes on to say:
> The _deduction substitution loci_ are
> - the function type outside of the _noexcept-specifier_,
> - the explicit-specifier,
> - the template parameter declarations, and
> - the template argument list of a partial specialization
>
> The substitution occurs in all types and expressions that are used in
the deduction substitution loci. [...]

Consider the following:
```cpp
struct A
{
    static constexpr bool x = true;
};

template<typename T, typename U>
void f(T, U) noexcept(T::x); // #1

template<typename T, typename U>
void f(T, U*) noexcept(T::y); // #2

template<>
void f<A>(A, int*) noexcept; // clang currently accepts, GCC and EDG reject
```

Currently, `Sema::SubstituteExplicitTemplateArguments` will substitute
into the _noexcept-specifier_ when deducing template arguments from a
function declaration or when deducing template arguments for taking the
address of a function template (and the substitution is treated as a
SFINAE context). In the above example, `#1` is selected as the primary
template because substitution of the explicit template arguments into
the _noexcept-specifier_ of `#2` failed, which resulted in the candidate
being ignored.

This behavior is incorrect ([temp.deduct.general] note 4 says as much), and
this patch corrects it by deferring all substitution into the
_noexcept-specifier_ until it is instantiated.

As part of the necessary changes to make this patch work, the
instantiation of the exception specification of a function template
specialization when taking the address of a function template is changed
to only occur for the function selected by overload resolution per
[except.spec] p13.1 (as opposed to being instantiated for every candidate).
kovdan01 pushed a commit that referenced this issue May 13, 2024
…ined member functions & member function templates (#88963)

Consider the following snippet from the discussion of CWG2847 on the core reflector:
```
template<typename T>
concept C = sizeof(T) <= sizeof(long);

template<typename T>
struct A 
{
    template<typename U>
    void f(U) requires C<U>; // #1, declares a function template 

    void g() requires C<T>; // #2, declares a function

    template<>
    void f(char);  // #3, an explicit specialization of a function template that declares a function
};

template<>
template<typename U>
void A<short>::f(U) requires C<U>; // #4, an explicit specialization of a function template that declares a function template

template<>
template<>
void A<int>::f(int); // #5, an explicit specialization of a function template that declares a function

template<>
void A<long>::g(); // #6, an explicit specialization of a function that declares a function
```

A number of problems exist:
- Clang rejects `#4` because the trailing _requires-clause_ has `U`
substituted with the wrong template parameter depth when
`Sema::AreConstraintExpressionsEqual` is called to determine whether it
matches the trailing _requires-clause_ of the implicitly instantiated
function template.
- Clang rejects `#5` because the function template specialization
instantiated from `A<int>::f` has a trailing _requires-clause_, but `#5`
does not (nor can it have one as it isn't a templated function).
- Clang rejects `#6` for the same reasons it rejects `#5`.

This patch resolves these issues by making the following changes:
- To fix `#4`, `Sema::AreConstraintExpressionsEqual` is passed
`FunctionTemplateDecl`s when comparing the trailing _requires-clauses_
of `#4` and the function template instantiated from `#1`.
- To fix `#5` and `#6`, the trailing _requires-clauses_ are not compared
for explicit specializations that declare functions.

In addition to these changes, `CheckMemberSpecialization` now considers
constraint satisfaction/constraint partial ordering when determining
which member function is specialized by an explicit specialization of a
member function for an implicit instantiation of a class template (we
previously would select the first function that has the same type as the
explicit specialization). With constraints taken under consideration, we
match EDG's behavior for these declarations.
kovdan01 pushed a commit that referenced this issue May 27, 2024
...which caused issues like

> ==42==ERROR: AddressSanitizer failed to deallocate 0x32 (50) bytes at
address 0x117e0000 (error code: 28)
> ==42==Cannot dump memory map on emscriptenAddressSanitizer: CHECK
failed: sanitizer_common.cpp:81 "((0 && "unable to unmmap")) != (0)"
(0x0, 0x0) (tid=288045824)
> #0 0x14f73b0c in __asan::CheckUnwind()+0x14f73b0c
(this.program+0x14f73b0c)
> #1 0x14f8a3c2 in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)+0x14f8a3c2
(this.program+0x14f8a3c2)
> #2 0x14f7d6e1 in __sanitizer::ReportMunmapFailureAndDie(void*,
unsigned long, int, bool)+0x14f7d6e1 (this.program+0x14f7d6e1)
> #3 0x14f81fbd in __sanitizer::UnmapOrDie(void*, unsigned
long)+0x14f81fbd (this.program+0x14f81fbd)
> #4 0x14f875df in __sanitizer::SuppressionContext::ParseFromFile(char
const*)+0x14f875df (this.program+0x14f875df)
> #5 0x14f74eab in __asan::InitializeSuppressions()+0x14f74eab
(this.program+0x14f74eab)
> #6 0x14f73a1a in __asan::AsanInitInternal()+0x14f73a1a
(this.program+0x14f73a1a)

when trying to use an ASan suppressions file under Emscripten: Even
though it would be considered OK by SUSv4, the Emscripten runtime states
"We don't support partial munmapping" (see

<emscripten-core/emscripten@f4115eb>
"Implement MAP_ANONYMOUS on top of malloc in STANDALONE_WASM mode
(#16289)").

Co-authored-by: Stephan Bergmann <stephan.bergmann@allotropia.de>
kovdan01 pushed a commit that referenced this issue May 27, 2024
…ication as used during partial ordering (#91534)

We do not deduce template arguments from the exception specification
when determining the primary template of a function template
specialization or when taking the address of a function template.
Therefore, this patch changes `isAtLeastAsSpecializedAs` such that we do
not mark template parameters in the exception specification as 'used'
during partial ordering (per [temp.deduct.partial]
p12) to prevent the following from being ambiguous:

```
template<typename T, typename U>
void f(U) noexcept(noexcept(T())); // #1

template<typename T>
void f(T*) noexcept; // #2

template<>
void f<int>(int*) noexcept; // currently ambiguous, selects #2 with this patch applied 
```

Although there is no corresponding wording in the standard (see core issue filed here
cplusplus/CWG#537), this seems
to be the intended behavior given the definition of _deduction
substitution loci_ in [temp.deduct.general] p7 (and EDG does the same thing).
kovdan01 pushed a commit that referenced this issue May 27, 2024
…erSize (#67657)"

This reverts commit f0b3654.

This commit triggers UB by reading an uninitialized variable.

`UP.PartialThreshold` is used uninitialized in `getUnrollingPreferences()` when
it is called from `LoopVectorizationPlanner::executePlan()`. In this case the
`UP` variable is created on the stack and its fields are not initialized.

```
==8802==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x557c0b081b99 in llvm::BasicTTIImplBase<llvm::X86TTIImpl>::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) llvm-project/llvm/include/llvm/CodeGen/BasicTTIImpl.h
    #1 0x557c0b07a40c in llvm::TargetTransformInfo::Model<llvm::X86TTIImpl>::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) llvm-project/llvm/include/llvm/Analysis/TargetTransformInfo.h:2277:17
    #2 0x557c0f5d69ee in llvm::TargetTransformInfo::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) const llvm-project/llvm/lib/Analysis/TargetTransformInfo.cpp:387:19
    #3 0x557c0e6b96a0 in llvm::LoopVectorizationPlanner::executePlan(llvm::ElementCount, unsigned int, llvm::VPlan&, llvm::InnerLoopVectorizer&, llvm::DominatorTree*, bool, llvm::DenseMap<llvm::SCEV const*, llvm::Value*, llvm::DenseMapInfo<llvm::SCEV const*, void>, llvm::detail::DenseMapPair<llvm::SCEV const*, llvm::Value*>> const*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7624:7
    #4 0x557c0e6e4b63 in llvm::LoopVectorizePass::processLoop(llvm::Loop*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10253:13
    #5 0x557c0e6f2429 in llvm::LoopVectorizePass::runImpl(llvm::Function&, llvm::ScalarEvolution&, llvm::LoopInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::BlockFrequencyInfo*, llvm::TargetLibraryInfo*, llvm::DemandedBits&, llvm::AssumptionCache&, llvm::LoopAccessInfoManager&, llvm::OptimizationRemarkEmitter&, llvm::ProfileSummaryInfo*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10344:30
    #6 0x557c0e6f2f97 in llvm::LoopVectorizePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10383:9

[...]

  Uninitialized value was created by an allocation of 'UP' in the stack frame
    #0 0x557c0e6b961e in llvm::LoopVectorizationPlanner::executePlan(llvm::ElementCount, unsigned int, llvm::VPlan&, llvm::InnerLoopVectorizer&, llvm::DominatorTree*, bool, llvm::DenseMap<llvm::SCEV const*, llvm::Value*, llvm::DenseMapInfo<llvm::SCEV const*, void>, llvm::detail::DenseMapPair<llvm::SCEV const*, llvm::Value*>> const*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7623:3
```
kovdan01 pushed a commit that referenced this issue May 27, 2024
…0820)

This solves some ambuguity introduced in P0522 regarding how
template template parameters are partially ordered, and should reduce
the negative impact of enabling `-frelaxed-template-template-args`
by default.

When performing template argument deduction, a template template
parameter
containing no packs should be more specialized than one that does.

Given the following example:
```C++
template<class T2> struct A;
template<template<class ...T3s> class TT1, class T4> struct A<TT1<T4>>; // #1
template<template<class    T5 > class TT2, class T6> struct A<TT2<T6>>; // #2

template<class T1> struct B;
template struct A<B<char>>;
```

Prior to P0522, candidate `#2` would be more specialized.
After P0522, neither is more specialized, so this becomes ambiguous.
With this change, `#2` becomes more specialized again,
maintaining compatibility with pre-P0522 implementations.

The problem is that in P0522, candidates are at least as specialized
when matching packs to fixed-size lists both ways, whereas before,
a fixed-size list is more specialized.

This patch keeps the original behavior when checking template arguments
outside deduction, but restores this aspect of pre-P0522 matching
during deduction.

---

Since this changes provisional implementation of CWG2398 which has
not been released yet, and already contains a changelog entry,
we don't provide a changelog entry here.
kovdan01 pushed a commit that referenced this issue May 27, 2024
'reduction' has a few restrictions over normal 'var-list' clauses:

1- On parallel, a num_gangs can only have 1 argument when combined with
reduction. These two aren't able to be combined on any other of the
compute constructs however.

2- The vars all must be 'numerical data types' types of some sort, or a
'composite of numerical data types'. A list of types is given in the
standard as a minimum, so we choose 'isScalar', which covers all of
these types and keeps types that are actually numeric. Other compilers
don't seem to implement the 'composite of numerical data types', though
we do.

3- Because of the above restrictions, member-of-composite is not
allowed, so any access via a memberexpr is disallowed. Array-element and
sub-arrays (aka array sections) are both permitted, so long as they meet
the requirements of #2.

This patch implements all of these for compute constructs.
kovdan01 pushed a commit that referenced this issue May 27, 2024
… (#92855)

This solves some ambuguity introduced in P0522 regarding how template
template parameters are partially ordered, and should reduce the
negative impact of enabling `-frelaxed-template-template-args` by
default.

When performing template argument deduction, we extend the provisional
wording introduced in llvm/llvm-project#89807 so
it also covers deduction of class templates.

Given the following example:
```C++
template <class T1, class T2 = float> struct A;
template <class T3> struct B;

template <template <class T4> class TT1, class T5> struct B<TT1<T5>>;   // #1
template <class T6, class T7>                      struct B<A<T6, T7>>; // #2

template struct B<A<int>>;
```
Prior to P0522, `#2` was picked. Afterwards, this became ambiguous. This
patch restores the pre-P0522 behavior, `#2` is picked again.

This has the beneficial side effect of making the following code valid:
```C++
template<class T, class U> struct A {};
A<int, float> v;
template<template<class> class TT> void f(TT<int>);

// OK: TT picks 'float' as the default argument for the second parameter.
void g() { f(v); }
```

---

Since this changes provisional implementation of CWG2398 which has not
been released yet, and already contains a changelog entry, we don't
provide a changelog entry here.
kovdan01 pushed a commit that referenced this issue Jun 19, 2024
On macOS, to make DYLD_INSERT_LIBRARIES and the Python shim work
together, we have a workaroud that copies the "real" Python interpreter
into the build directory. This doesn't work when running in a virtual
environment, as the copied interpreter cannot find the packages
installed in the virtual environment relative to itself.

Address this issue by copying the Python interpreter into the virtual
environment's `bin` folder, rather than the build folder, when the test
suite detects that it's being run inside a virtual environment.

I'm not thrilled about this solution because it puts a file outside the
build directory. However, given virtual environments are considered
disposable, this seems reasonable.
kovdan01 pushed a commit that referenced this issue Jun 24, 2024
….5)

The Python interpreter in Xcode cannot be copied because of a relative
RPATH. Our workaround would just use that Python interpreter directly
when it detects this. For the reasons explained in my previous commit,
that doesn't work in a virtual environment. Address this case by
creating a symlink to the "real" interpreter in the virtual environment.
kovdan01 pushed a commit that referenced this issue Jun 24, 2024
…on (#94752)

Fixes #62925.

The following code:
```cpp
#include <map>

int main() {
   std::map m1 = {std::pair{"foo", 2}, {"bar", 3}}; // guide #2
   std::map m2(m1.begin(), m1.end()); // guide #1
}
```
Is rejected by clang, but accepted by both gcc and msvc:
https://godbolt.org/z/6v4fvabb5 .

So basically CTAD with copy-list-initialization is rejected.

Note that this exact code is also used in a cppreference article:
https://en.cppreference.com/w/cpp/container/map/deduction_guides

I checked the C++11 and C++20 standard drafts to see whether suppressing
user conversion is the correct thing to do for user conversions. Based
on the standard I don't think that it is correct.

```
13.3.1.4 Copy-initialization of class by user-defined conversion [over.match.copy]
Under the conditions specified in 8.5, as part of a copy-initialization of an object of class type, a user-defined
conversion can be invoked to convert an initializer expression to the type of the object being initialized.
Overload resolution is used to select the user-defined conversion to be invoked
```
So we could use user defined conversions according to the standard.

```
If a narrowing conversion is required to initialize any of the elements, the
program is ill-formed.
```
We should not do narrowing.

```
In copy-list-initialization, if an explicit constructor is chosen, the initialization is ill-formed.
```
We should not use explicit constructors.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
abi msp430 MSP430 target
Projects
None yet
Development

No branches or pull requests

3 participants