TSAN deadlock(?) with signals and atomics on FreeBSD #92437

Open
tavianator opened this issue May 16, 2024 · 1 comment

#include <signal.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/time.h>

static atomic_size_t i = 0;
static atomic_size_t j = 0;

static void handler(int sig) {
        atomic_fetch_add_explicit(&i, 1, memory_order_relaxed);
}

int main(void) {
        signal(SIGALRM, handler);

        struct itimerval ival = {0};
        ival.it_value.tv_usec = 100;
        ival.it_interval.tv_usec = 100;
        setitimer(ITIMER_REAL, &ival, NULL);

        while (atomic_load_explicit(&i, memory_order_relaxed) < 1000) {
                atomic_store_explicit(&j, 1, memory_order_release);
        }

        ival.it_value.tv_usec = 0;
        setitimer(ITIMER_REAL, &ival, NULL);
        return 0;
}

tavianator@muon $ clang18 -fsanitize=thread foo.c -o foo
tavianator@muon $ ./foo
^C^\[2]    45828 killed     ./foo

I had to kill it from another terminal. GDB gives this stack trace:

(gdb) bt
#0  _umtx_op () at _umtx_op.S:4
#1  0x00000000002616ea in Wait () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:35
#2  0x00000000002d91c9 in Lock () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:196
#3  SlotLock () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp:366
#4  0x00000000002e95a3 in SlotLocker () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_rtl.h:641
#5  Acquire () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp:448
#6  0x000000000028a499 in CallUserSignalHandler () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:2071
#7  0x000000000028a2eb in ProcessPendingSignalsImpl () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:2142
#8  0x0000000000281883 in ProcessPendingSignals () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_rtl.h:674
#9  ~ScopedInterceptor () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:304
#10 0x0000000000280740 in ___interceptor_memcpy ()
    at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc:115
#11 0x0000000800352587 in handle_signal (actp=actp@entry=0x7fffffffcd40, sig=sig@entry=14, info=info@entry=0x7fffffffd130, ucp=ucp@entry=0x7fffffffcdc0)
    at /usr/src/lib/libthr/thread/thr_sig.c:311
#12 0x0000000800351afb in thr_sighandler (sig=14, info=0x7fffffffd130, _ucp=0x7fffffffcdc0) at /usr/src/lib/libthr/thread/thr_sig.c:244
#13 <signal handler called>
#14 TraceSkipGap () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp:929
#15 TraceSwitchPart () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp:941
#16 0x00000000002dd462 in TraceEvent<__tsan::EventTime> () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_rtl.h:738
#17 TraceTime () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp:145
#18 0x00000000002c1f9b in AtomicStore<unsigned long long> () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp:281
#19 __tsan_atomic64_store () at /wrkdirs/usr/ports/devel/llvm18/work-default/llvm-project-18.1.3.src/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp:541
#20 0x00000000002f41a0 in main ()

@tavianator commented:

My guess is that tsan wants to defer the signal handler until after the atomic op completes, but then this line in FreeBSD's threading library calls memcpy(): https://github.com/freebsd/freebsd-src/blob/75529910f77a1623b83599de0518d39c5fb789df/lib/libthr/thread/thr_sig.c#L310.

That call happens after the actual signal handler has been invoked, but before sigreturn(). memcpy() is intercepted, and the tsan runtime decides that now is a good time to run the deferred signal handler. But we are still inside the actual signal handler that interrupted __tsan_atomic64_store(), so it deadlocks.

tavianator added a commit to tavianator/bfs that referenced this issue May 16, 2024
ThreadSanitizer has some FreeBSD-specific bugs that are too difficult to
work around.  In particular, deadlock is possible if any signal with a
user-defined handler interrupts an atomic operation.

Link: llvm/llvm-project#92313
Link: llvm/llvm-project#92437