invoke of fcn ptr under -O0 isnt correctly restoring spilled reg before callq #17635

llvmbot · 2013-09-17T03:50:19Z


Bugzilla Link	17261
Version	unspecified
OS	MacOS X
Attachments	isolate.invoke.ll: LLVM input using invoke, isolate.call.ll: LLVM input using call, isolate.invoke.ll.s: assembly output when using invoke, isolate.call.ll.s: assembly output when using call (works)
Reporter	LLVM Bugzilla Contributor

Extended Description

This bug is an attempt to isolate the problem identified on the Rust project here:
rust-lang/rust#9129 (comment)

Summary bullet points for reproduction:

It appears to be a problem with invoke reloading a spilled call target; using call fixes the problem.
You need llc -O0 to see the problem; higher optimization levels mask the problem.
This bug report is written based on assembly code analysis, but the problem was discovered (as you might expect) based on actual crashes of compiled code.

The plot:

I have found a standalone .ll input for which an invoke instruction compiles via llc into code that loads from a location on the stack to which no value has been previously assigned. When I substitute a call instruction for the same spot, llc generates code that correctly loads a spilled value from the stack back into a register in preparation for the call.

I am running llc via command lines like this:
llc -o isolate.invoke.ll.s isolate.invoke.ll -O0 -filetype=asm

In particular, to reproduce the bug, you need to use -O0.

I am running with llc compiled from:
"""
commit 107cfa2
Author: Tim Northover tnorthover@apple.com
Date: Mon Sep 16 17:33:40 2013 +0000

  TableGen: fix constness of new comparison function.
  
  libc++ didn't seem to like a non-const call operator.
  
  git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190797 91177308-0d34-0410-b5e6-96231b3b80d8

"""

I have not been able to come up with a C++ program that generates the offending code. I am hoping that people will be able to see the same problem on their own systems by inspecting the assembly (as I have been) and then be able to figure out why invoke is not being compiled correctly.

I'm attaching four files: two are the input .ll files (one using call, and the other using invoke), and the other two are the output .s files.

For ease of reference, I am posting the most relevant bits of generated assembly here.

The variant that uses the call instruction generates correct machine code that looks like this (snippet):

## BB#0:                                ## %function top level
subq	$56, %rsp

Ltmp1:
.cfi_def_cfa_offset 64
addq $8, %rdi
movq %rdi, 8(%rsp) ## 8-byte Spill
callq __ZN9breakdown17haf444eb869dbfb724v0.0E
movq %rax, 32(%rsp)
movq 32(%rsp), %rdi
movq 8(%rsp), %rax ## 8-byte Reload
callq *%rax

In particular, the spill and reload are both referring to the same stack location, 8(%rsp)

Conversely, the code that is generated from using invoke looks like this:

BB#0: ## %function top level

subq	$88, %rsp

Ltmp4:
.cfi_def_cfa_offset 96
addq $8, %rdi
Ltmp0:
movq %rdi, 40(%rsp) ## 8-byte Spill
callq __ZN9breakdown17haf444eb869dbfb724v0.0E
Ltmp1:
movq %rax, 32(%rsp) ## 8-byte Spill
jmp LBB1_1
LBB1_1: ## %normal return
movq 32(%rsp), %rax ## 8-byte Reload
movq %rax, 64(%rsp)
movq 64(%rsp), %rdi
movq 24(%rsp), %rcx ## 8-byte Reload
callq *%rcx

Note that the reload that establishes %rcx is coming from 24(%rsp), a stack location that has not been established by any register spills.

(Note: The attached code is only meant to illustrate incorrect assembly+machine code generation; it is the result of me iteratively reducing an initial function from the Rust bug linked above until I came up with something small enough for reasonable submission, i.e. <50 lines. But it does not run on its own, and at this point I removed so much of the original computation that it almost certainly would break if you did actually run it.)

The text was updated successfully, but these errors were encountered:

llvmbot · 2013-09-17T04:22:24Z

Okay, so llc --print-{before,after}-all provides us with this nugget:

* IR Dump Before Prologue/Epilogue Insertion & Frame Finalization *:

...
BB#1: derived from LLVM BB %"normal return"
Predecessors according to CFG: BB#0
%RAX = MOV64rm <fi#4>, 1, %noreg, 0, %noreg; mem:LD8[FixedStack4]
MOV64mr <fi#1>, 1, %noreg, 0, %noreg, %RAX
%RDI = MOV64rm <fi#1>, 1, %noreg, 0, %noreg
ADJCALLSTACKDOWN64 0, %RSP, %EFLAGS, %RSP
%RCX = MOV64rm <fi#5>, 1, %noreg, 0, %noreg; mem:LD8[FixedStack5]
CALL64r %RCX, , %RSP, %RDI<imp-use,kill>
ADJCALLSTACKUP64 0, 0, %RSP, %EFLAGS, %RSP
RET

...

End machine code for function main_anon_expr.

* IR Dump After Prologue/Epilogue Insertion & Frame Finalization *:

Machine code for function main_anon_expr: Post SSA

...

BB#1: derived from LLVM BB %"normal return"
Predecessors according to CFG: BB#0
%RAX = MOV64rm %RSP, 1, %noreg, 32, %noreg; mem:LD8[FixedStack4]
MOV64mr %RSP, 1, %noreg, 64, %noreg, %RAX
%RDI = MOV64rm %RSP, 1, %noreg, 64, %noreg
%RCX = MOV64rm %RSP, 1, %noreg, 24, %noreg; mem:LD8[FixedStack5]
CALL64r %RCX, , %RSP, %RDI<imp-use,kill>
%RSP<def,tied1> = ADD64ri8 %RSP, 88, %EFLAGS<imp-def,dead>
RET
...

End machine code for function main_anon_expr.

In particular, I notice in the above that the ADJCALLSTACKDOWN64 is appearing after the load into %RDI and before the load into %RCX. In the final output, there is no actual code emitted for the ADJCALLSTACKDOWN64 at that location; but is the Prologue/Epilogue Insertion & Frame Finalization getting confused and trying to compensate for the ADJCALLSTACKDOWN64 in some manner when it determines the stack location for the spilled mem:LD8[FixedStack5]? (Then again, this is all new material for me, so I might be misinterpreting the instrumentation.)

Endilll · 2023-08-08T21:30:00Z

A comment in downstream issue claims this to be fixed.

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 9, 2021

Endilll closed this as completed Aug 8, 2023

Endilll added llvm:codegen miscompilation labels Aug 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

invoke of fcn ptr under -O0 isnt correctly restoring spilled reg before callq #17635

invoke of fcn ptr under -O0 isnt correctly restoring spilled reg before callq #17635

llvmbot commented Sep 17, 2013

llvmbot commented Sep 17, 2013

Endilll commented Aug 8, 2023

invoke of fcn ptr under -O0 isnt correctly restoring spilled reg before callq #17635

invoke of fcn ptr under -O0 isnt correctly restoring spilled reg before callq #17635

Comments

llvmbot commented Sep 17, 2013

Extended Description

BB#0: ## %function top level

llvmbot commented Sep 17, 2013

*** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***:

End machine code for function main_anon_expr.

*** IR Dump After Prologue/Epilogue Insertion & Frame Finalization ***:

Machine code for function main_anon_expr: Post SSA

End machine code for function main_anon_expr.

Endilll commented Aug 8, 2023

* IR Dump Before Prologue/Epilogue Insertion & Frame Finalization *:

* IR Dump After Prologue/Epilogue Insertion & Frame Finalization *: