Describe the bug
Compiling and linking with LLD's LTO causes one of two problems: (1) debug (or at least unexpected) output together with ineffective optimization, or (2) a compiler crash.
Expected behavior
- Compilation and linking should print no extra debug information and should optimize away calls to empty functions.
- The Cilksan version should compile, link, and optimize successfully.
Note: cilksort.out (instrumented with cilkprace) asserting almost immediately is correct behavior, at least as far as the compiler is concerned.
OpenCilk version
Built from source (with modifications):
opencilk-project: cilkprace af4ec7f
cheetah: 5a80d8a534e8d56a5e354997bbfea23257c1c36b
productivity-tools: cilkprace af31596
System information
- OS: Linux Mint 20.1
- CPU: Intel(R) Xeon(R) Gold 5220R CPU @ 2.20GHz
Steps to reproduce (include relevant output)
- Clone with infrastructure to build from source
- Set git urls to the above projects and branches and pull
- In the build directory, enable LLD as a subproject and recompile. We also build clang with LLD and split-dwarf, but that is unlikely to be load-bearing.
- Compile with -fuse-ld=lld and -flto
Build command:
../build/bin/clang -fopencilk -fcilktool=cilkprace -g -O3 -DTIMING_COUNT=3 -fuse-ld=lld -flto -lm cilksort.c ktiming.c getoptions.c -o cilksort.out
Sample unexpected output:
<unknown>:0:0: 5 instructions in function
<unknown>:0:0: 1 instructions in function
cilksort.c:189:0: 272 instructions in function
cilksort.c:305:0: 554 instructions in function
cilksort.c:367:0: 391 instructions in function
cilksort.c:442:0: 1877 instructions in function
<unknown>:0:0: 5 instructions in function
<unknown>:0:0: 1 instructions in function
ktiming.c:41:0: 66 instructions in function
<unknown>:0:0: 5 instructions in function
<unknown>:0:0: 1 instructions in function
getoptions.c:9:0: 1523 instructions in function
cilksort.c:367:0: 112 instructions in function
cilksort.c:367:0: 109 instructions in function
cilksort.c:367:0: 109 instructions in function
cilksort.c:367:0: 109 instructions in function
cilksort.c:305:0: 112 instructions in function
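To compare this output between builds (for example with and without -flto), it can help to reduce it to two numbers. The following is a minimal sketch, assuming the lines above have been captured to a file or pipe; summarize_instr_lines is a hypothetical helper name, not part of OpenCilk or LLVM.

```shell
# Hypothetical helper: summarize "<loc>: N instructions in function" lines
# read from stdin. Prints "<number of such lines> <sum of N>".
summarize_instr_lines() {
    # In a matching line the count N is the fourth field from the end:
    #   cilksort.c:189:0: 272 instructions in function
    awk '/ instructions in function$/ { n++; total += $(NF-3) }
         END { printf "%d %d\n", n, total }'
}
```

Usage, assuming the build output was saved to a file named build.log (an assumed name): summarize_instr_lines < build.log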
Calling objdump to check for LTO
objdump -D cilksort.out
The code is littered with calls to __csi_after_store
15438: e8 93 6f ff ff callq c3d0 <__csi_after_store>
These calls are prime targets for LTO to optimize away, since the callee is an empty function:
000000000000c3d0 <__csi_after_store>:
c3d0: c3 retq
c3d1: 66 66 66 66 66 66 2e data16 data16 data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
c3d8: 0f 1f 84 00 00 00 00
c3df: 00
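The manual check above can be turned into a count, which makes it easier to see whether a change to the build affects the number of remaining calls. This is a rough sketch, not an official tool: count_calls_to is a hypothetical helper name, and the pattern assumes x86 objdump output in the style shown above (call or callq, then the target address, then the symbol in angle brackets).

```shell
# Hypothetical helper: count call instructions to a given symbol in
# objdump-style disassembly read from stdin.
count_calls_to() {
    sym="$1"
    # grep -c prints the number of matching lines; it exits non-zero when the
    # count is 0, so "|| true" keeps the helper from reporting failure.
    grep -E -c "call[a-z]*[[:space:]]+[0-9a-f]+ <${sym}>" || true
}
```

Usage: objdump -d cilksort.out | count_calls_to __csi_after_store

If LTO had inlined the empty hook everywhere, one would expect this count to be 0; a large count is consistent with the observation in this report.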
Instrumenting the same code with cilksan instead gives us a compiler crash:
../build/bin/clang -fopencilk -fsanitize=cilk -mllvm -enable-static-race-detection=false -g -O3 -DTIMING_COUNT=3 -lm -g -O3 -DTIMING_COUNT=3 -fuse-ld=lld -flto cilksort.c ktiming.c getoptions.c -o cilksort.csan
cilksort backtrace.txt
Requested attachments (extra .txt extensions to appease GitHub):
cilksort-d7b45f.sh.txt
getoptions-ff4d8b.c.txt
ktiming-2759e6.c.txt
cilksort-d7b45f.c.txt
Additionally, using LTO to compile and link cilksort without either cilktool gives the unexpected output and does not crash. I haven't verified if LTO is being applied properly.
Working example code
cilksort.c, ktiming.c, and getoptions.c can be found in the cilkprace subdirectory of cilktools: cilkprace/bench_cheetah
There's also a makefile there, but its "all" target doesn't work. I'd recommend running make cilksort.out and make cilksort.csan instead of a bare make.