Valgrind-developer notes, re the MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

JRS 22 Mar 09: re these comments in m_libc* and m_debuglog:

/* IMPORTANT: on Darwin it is essential to use the _nocancel versions
   of syscalls rather than the vanilla version, if a _nocancel version
   is available.  See docs/internals/Darwin-notes.txt for the reason
   why. */

When Valgrind does (for its own purposes, not for the client)
read/write/open/close etc syscalls, it really is critical to use the
_nocancel versions of syscalls rather than the vanilla versions.  This
holds throughout the entire code base: whenever V does a syscall for
its own purposes, we must use the _nocancel version if it exists.
This is of course most prevalent in m_libc* since all of our
own-purpose (non-client) syscalls should get routed through there.

Why?  Because on Darwin, pthread cancellation is done within the
kernel (unlike on Linux, iiuc).  And read/write/open/close and a whole
bunch of other syscalls to do with stream I/O are cancellation points.
So what can happen is, client informs the kernel that a given thread
is to be cancelled.  Then at the next (eg) VG_(printf) call by that
thread, which leads to a sys_write, the write syscall gets hit by the
cancellation request, and is duly nuked by the kernel.  Of course from
the outside it looks as if the thread had mysteriously disappeared off
the radar for no reason.

In short, we need to use _nocancel versions in order to ensure that
cancellation requests only take effect at the places where the client
does a syscall, and not the places where Valgrind does syscalls.

How observed: using the standard pipe-based implementation in
coregrind/m_scheduler/sema.c, none/tests/pth_cancel1 would hang
(compared to succeeding using native Darwin semaphores).  And if the
"pause()" call in said test is turned into a spin ("while (1) ;") then
the entire Valgrind run mysteriously disappears, rather than spinning
as it does when using native Darwin semaphores.

Because the pipe-based semaphore intensively uses sys_read/sys_write,
it is not surprising that it was inadvertently eating up cancellation
requests directed to client threads.  With the above-mentioned change
in force the pipe-based semaphore appears to work correctly.



Valgrind-developer notes, things removed from the original MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There was a broken debugstub implementation.  It was removed over several
commits: r9477, which removed most of it, and r9711, r9759, and r10012,
which cleaned up remaining bits.

There was machinery to read function names from Dwarf3 debug info.  But we
already read function names from the symbol tables, so this was duplicated
functionality.  Furthermore, a Darwin-specific hack was required in
storage.c to choose between symbol table names vs. Dwarf3 names.  So this
machinery was removed in r10155.


Valgrind-developer notes, todos re the MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* m_syswrap/syscall-amd64-darwin.S
  - correct signal mask is not applied during syscall
  - restart-labels are completely bogus

* m_syswrap/syswrap-darwin.c:
  - PRE(sys_posix_spawn) completely ignores signal issues, and
    also ignores the file_actions argument

* env var handling w/ exec on Darwin: is there something odd?  Compare
  "valgrind env" on Darwin and Linux.  On the former there are
  settings VALGRIND_LIB and VALGRIND_LIB_INNER, but not on the
  latter.
  There's a suspicious-looking "#if defined(VGO_darwin)" in
  VG_(env_remove_valgrind_env_stuff).  Maybe related?

* Cleanups: sort wrappers in syswrap-darwin.c and priv_syswrap-darwin.h
  alphabetically.  Also, some aren't properly implemented -- check and
  print warnings

* Cleanups: m_scheduler/sema.c: use pipe implementation
  (but this apparently causes none/tests/pth_cancel1 to hang.
  I have no idea why, despite quite some investigation).

* Cleanups: m_debugstub: move to attic

* syswrap-darwin.c: sys_{f,}chmod_extended: handling of ARG5 is way
  wrong

* Cleanups (Linux,AIX5): bogus launcher-path mangling logic in
  PRE(sys_execve)

* Cleanups (ALL PLATFORMS): m_signals.c: are the _MY_SIGRETURN
  assembly stubs actually necessary for anything?  I don't know.

* Cleanups: check that changes to VG_(stat) and VG_(stat64) have
  not broken 64-bit statting on 32-bit Linux

* Cleanups: #if !HAVE_PROC in m_main (to do with /proc/<pid>/cmdline)

--------

m_main doesn't read symbols for the valgrind exe itself, which is
annoying.  On minimal investigation it seems that the executable isn't
even listed by aspacem.  This is very strange and not in accordance
with the Linux or AIX ports.


m_main: relatedly, Darwin version does not collect/give out
initial debuginfo handles; hence ptrcheck won't work


m_main: Darwin port relies on blocking out big sections of address
space with mmap at startup.  We know from history that this is a bad
idea.  (It's also really slow on 64-bit builds, taking 3--4 seconds.)
Also, startup is not done on the interim startup stack -- why not?


VG_(di_notify_mmap): Linux version is also used for Darwin, and
contains some ifdeffery.  Clean up.


PRE(sys_fork), #ifdeffery


syswrap-generic.c: VG_(init_preopened_fds) is #ifdefd for Darwin


scheduler.c: #ifdeffery in VG_(get_thread_out_of_syscall)


look at notes in coregrind/Makefile.am re Mach RPC interface
definitions.  See if we can get rid of any more stuff now that
m_debugstub is gone.
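

--------

Illustration for the _nocancel note at the top of this file: the
cancellation-point behaviour it describes can be seen outside Valgrind
with a small standalone program.  This is only a sketch under stated
assumptions, not Valgrind code: it uses plain POSIX calls, and the
names blocked_reader and reader_was_cancelled are made up for the
example.  A thread that has a cancellation request pending is
terminated as soon as it enters the ordinary, cancellable read(), which
is the same fate a Valgrind-internal read/write would meet if it did
not use a _nocancel variant.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper thread: blocks in read() on a pipe that never
   receives any data.  read() is a POSIX cancellation point. */
static void *blocked_reader(void *arg)
{
    int fd = *(int *)arg;
    char c;
    ssize_t n = read(fd, &c, 1);  /* a pending cancel takes effect here */
    (void)n;
    return NULL;                  /* not reached if the thread is cancelled */
}

/* Returns 1 if the thread was cancelled inside read(), 0 if it
   returned normally, -1 if setting up the pipe or thread failed. */
static int reader_was_cancelled(void)
{
    int fds[2];
    pthread_t t;
    void *res = NULL;

    if (pipe(fds) != 0)
        return -1;
    if (pthread_create(&t, NULL, blocked_reader, &fds[0]) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Request cancellation; with the default deferred cancel type it
       is only acted on at the thread's next cancellation point. */
    pthread_cancel(t);

    /* The write end is still open, so read() could never return on
       its own: join only completes because the thread was cancelled. */
    int rc = pthread_join(t, &res);
    close(fds[0]);
    close(fds[1]);
    if (rc != 0)
        return -1;
    return res == PTHREAD_CANCELED ? 1 : 0;
}
```

Calling reader_was_cancelled() from a main() and printing the result is
enough to try it (build with -pthread where the platform needs it).
The thread never gets past its read(), which from the outside is the
"mysteriously disappeared" symptom described in the note.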