Because of the heap-use-after-free race condition that was rather easily reproducible with AddressSanitizer (-fsanitize=address), I thought that I should finally try to learn to use ThreadSanitizer (TSAN, -fsanitize=thread in GCC and clang).
https://clang.llvm.org/docs/ThreadSanitizer.html
Because VDR makes use of POSIX thread synchronization primitives, no additional instrumentation via <sanitizer/tsan_interface.h> should be necessary.
Before C++11 defined a memory model for multi-threaded applications, semantics around shared data structures were rather unclear, and I would guess that most multi-threaded pre-C++11 code bases would trip ThreadSanitizer. Also, multi-threaded CPUs were rare in the early days, and the Total Store Order of the x86 is very forgiving, compared to the weak memory model of ARM (see https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html for some examples).
To silence one prominent source of TSAN warnings in the cThread destructor, I applied the attached patch. It is not ideal, because std::atomic defaults to std::memory_order_seq_cst while std::memory_order_relaxed would likely suffice here.
Even after applying the first patch, a simple test with no DVB receiver device and no valid data directory would still produce a couple of data race reports (see the end of this message). I recorded a trace of such a run with "rr record" (https://rr-project.org) and debugged it in "rr replay". Unsurprisingly, I did not find actual evidence of a race condition.
Finally, I figured out what is causing the first report: cThread::description is not protected by cThread::mutex. Possibly, most cThread data members (including cThread::active) should be protected by cThread::mutex?
With both attached patches applied, the report of the first race will disappear. The second race is apparently about some memory that is allocated inside opendir(). I did not figure it out yet.
Related to this, cThread::Cancel() especially when invoked with WaitSeconds=-1 looks problematic to me, and I see that VDR is invoking pthread_detach() and never invoking pthread_join(). The second patch includes an attempt to clean this up as well.
Both patches are just a proof of concept; I did not test them beyond the simple failing-to-start VDR run under TSAN. Unfortunately, TSAN is not available for my primary VDR hardware, running on 32-bit ARM.
With best regards,
Marko
vdr: error while reading '/var/lib/vdr/sources.conf' vdr: error while reading '/var/lib/vdr/channels.conf' vdr: no primary device found - using first device! ================== WARNING: ThreadSanitizer: data race on vptr (ctor/dtor vs virtual call) (pid=96847) Write of size 8 at 0x7ffc7f773e60 by main thread: #0 cThread::~cThread() /dev/shm/vdr/thread.c:249 (vdr+0x1d72b8) #1 cEpgDataReader::~cEpgDataReader() /dev/shm/vdr/epg.h:236 (vdr+0xa956b) #2 main /dev/shm/vdr/vdr.c:731 (vdr+0xa956b)
Previous read of size 8 at 0x7ffc7f773e60 by thread T2: #0 cThread::StartThread(cThread*) /dev/shm/vdr/thread.c:293 (vdr+0x1d76d9)
Location is stack of main thread.
Location is global '<null>' at 0x000000000000 ([stack]+0x1fe60)
Thread T2 'epg data reader' (tid=96855, finished) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x5e686) #1 cThread::Start() /dev/shm/vdr/thread.c:316 (vdr+0x1d6fe0) #2 main /dev/shm/vdr/vdr.c:804 (vdr+0xaa477)
SUMMARY: ThreadSanitizer: data race on vptr (ctor/dtor vs virtual call) /dev/shm/vdr/thread.c:249 in cThread::~cThread() ================== ================== WARNING: ThreadSanitizer: data race (pid=96847) Write of size 8 at 0x7b04000005a0 by main thread: #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:706 (libtsan.so.2+0x47e82) #1 cString::~cString() /dev/shm/vdr/tools.c:1097 (vdr+0x1e52df) #2 cxa_at_exit_wrapper ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:389 (libtsan.so.2+0x2dee3)
Previous read of size 8 at 0x7b04000005a0 by thread T1: #0 opendir ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:3271 (libtsan.so.2+0x4c641) #1 cReadDir::cReadDir(char const*) /dev/shm/vdr/tools.c:1553 (vdr+0x1ea8bd) #2 cVideoDirectoryScannerThread::ScanVideoDir(char const*, int, int) /dev/shm/vdr/recording.c:1439 (vdr+0x180620) #3 cVideoDirectoryScannerThread::Action() /dev/shm/vdr/recording.c:1433 (vdr+0x180bfc) #4 cThread::StartThread(cThread*) /dev/shm/vdr/thread.c:293 (vdr+0x1d76ea)
Thread T1 'video directory' (tid=96854, running) created by main thread at: #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x5e686) #1 cThread::Start() /dev/shm/vdr/thread.c:316 (vdr+0x1d6fe0) #2 cRecordings::Update(bool) /dev/shm/vdr/recording.c:1554 (vdr+0x175387) #3 main /dev/shm/vdr/vdr.c:788 (vdr+0xaa3f8)
SUMMARY: ThreadSanitizer: data race /dev/shm/vdr/tools.c:1097 in cString::~cString() ================== ThreadSanitizer: reported 2 warnings