Demonstrations of funclatency, the Linux eBPF/bcc version.


Timing the do_sys_open() kernel function until Ctrl-C:

# ./funclatency do_sys_open
Tracing do_sys_open... Hit Ctrl-C to end.
^C
     nsecs               : count     distribution
         0 -> 1          : 0        |                                      |
         2 -> 3          : 0        |                                      |
         4 -> 7          : 0        |                                      |
         8 -> 15         : 0        |                                      |
        16 -> 31         : 0        |                                      |
        32 -> 63         : 0        |                                      |
        64 -> 127        : 0        |                                      |
       128 -> 255        : 0        |                                      |
       256 -> 511        : 0        |                                      |
       512 -> 1023       : 0        |                                      |
      1024 -> 2047       : 0        |                                      |
      2048 -> 4095       : 124      |****************                      |
      4096 -> 8191       : 291      |**************************************|
      8192 -> 16383      : 36       |****                                  |
     16384 -> 32767      : 16       |**                                    |
     32768 -> 65535      : 8        |*                                     |
     65536 -> 131071     : 0        |                                      |
    131072 -> 262143     : 0        |                                      |
    262144 -> 524287     : 0        |                                      |
    524288 -> 1048575    : 0        |                                      |
   1048576 -> 2097151    : 0        |                                      |
   2097152 -> 4194303    : 1        |                                      |
Detaching...

The output shows a histogram of function latency (call time), measured from when
the function began executing (was called) to when it finished (returned).

This example output shows that most of the time, do_sys_open() took between
2048 and 65536 nanoseconds (2 to 65 microseconds). The peak of this distribution
shows 291 calls of between 4096 and 8191 nanoseconds. There was also one
occurrence, an outlier, in the 2 to 4 millisecond range.

How this works: the function entry and return are traced using the kernel kprobe
and kretprobe tracer. Timestamps are collected and the delta time calculated,
which is then bucketized and stored as an in-kernel histogram for efficiency.
The histogram is visible in the output: it's the "count" column; everything else
is decoration. Only the count column is copied to user-level on output. This is
an efficient way to time kernel functions and examine their latency distribution.
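
The same technique can be written as a small bcc program. What follows is a
minimal sketch of the approach described above, not the actual funclatency
source: a kprobe and a kretprobe are attached to the target function, an entry
timestamp is stored per thread, and the delta is bucketized into a log2
histogram in kernel context, so only the bucket counts are copied to user
level. It assumes bcc is installed and that do_sys_open() exists on your
kernel (newer kernels may provide do_sys_openat2() instead):

#!/usr/bin/python
# latency_sketch.py: minimal illustration of the kprobe/kretprobe timing
# approach (a sketch, not the real funclatency tool).
from __future__ import print_function
from bcc import BPF
from time import sleep

prog = """
#include <uapi/linux/ptrace.h>

BPF_HASH(start, u32, u64);      // entry timestamp, keyed by thread id
BPF_HISTOGRAM(dist);            // log2 histogram of latency, in nanoseconds

int trace_entry(struct pt_regs *ctx)
{
    u32 tid = bpf_get_current_pid_tgid();    // lower 32 bits: thread id
    u64 ts = bpf_ktime_get_ns();
    start.update(&tid, &ts);
    return 0;
}

int trace_return(struct pt_regs *ctx)
{
    u32 tid = bpf_get_current_pid_tgid();
    u64 *tsp = start.lookup(&tid);
    if (tsp == 0)
        return 0;               // missed the entry probe
    u64 delta = bpf_ktime_get_ns() - *tsp;
    dist.increment(bpf_log2l(delta));        // bucketize in kernel context
    start.delete(&tid);
    return 0;
}
"""

b = BPF(text=prog)
b.attach_kprobe(event="do_sys_open", fn_name="trace_entry")
b.attach_kretprobe(event="do_sys_open", fn_name="trace_return")

print("Tracing do_sys_open()... Hit Ctrl-C to end.")
try:
    sleep(99999999)
except KeyboardInterrupt:
    pass
b["dist"].print_log2_hist("nsecs")  # only the bucket counts cross to user level

The real tool adds argument parsing, wildcard and library support, optional
unit scaling, interval output, and per-function histograms on top of this core.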


Now trace a user function, pthread_mutex_lock in libpthread, to determine if
there is considerable lock contention:

# ./funclatency pthread:pthread_mutex_lock -p $(pidof contentions)
Tracing 1 function for "pthread:pthread_mutex_lock"... Hit Ctrl-C to end.
     nsecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 508967   |****************************************|
      4096 -> 8191       : 70072    |*****                                   |
      8192 -> 16383      : 27686    |**                                      |
     16384 -> 32767      : 5075     |                                        |
     32768 -> 65535      : 2318     |                                        |
     65536 -> 131071     : 581      |                                        |
    131072 -> 262143     : 38       |                                        |
    262144 -> 524287     : 5        |                                        |
    524288 -> 1048575    : 1        |                                        |
   1048576 -> 2097151    : 9        |                                        |
Detaching...

It seems that most calls to pthread_mutex_lock completed rather quickly (in
under 4us), but there were some cases of considerable contention, sometimes
over a full millisecond.
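
User-level functions can be timed the same way; only the attachment step
changes. Below is a minimal sketch (again, not the real tool) that times
pthread_mutex_lock() for one process with a uprobe and uretprobe, converting
the delta to microseconds in kernel context, which is essentially what the -u
option does. The library name "pthread" is resolved by bcc; on newer glibc
builds the symbol may live in libc instead, so the name may need adjusting:

#!/usr/bin/python
# ulock_sketch.py: time pthread_mutex_lock() for one process, in microseconds.
# A sketch of the uprobe/uretprobe approach, not the real funclatency tool.
# Usage: ./ulock_sketch.py PID
from __future__ import print_function
from bcc import BPF
from time import sleep
import sys

pid = int(sys.argv[1])          # the process to trace, as with -p above

prog = """
#include <uapi/linux/ptrace.h>

BPF_HASH(start, u32, u64);      // entry timestamp, keyed by thread id
BPF_HISTOGRAM(dist);            // log2 histogram of latency, in microseconds

int trace_entry(struct pt_regs *ctx)
{
    u32 tid = bpf_get_current_pid_tgid();
    u64 ts = bpf_ktime_get_ns();
    start.update(&tid, &ts);
    return 0;
}

int trace_return(struct pt_regs *ctx)
{
    u32 tid = bpf_get_current_pid_tgid();
    u64 *tsp = start.lookup(&tid);
    if (tsp == 0)
        return 0;
    u64 delta = bpf_ktime_get_ns() - *tsp;
    dist.increment(bpf_log2l(delta / 1000));   // nanoseconds -> microseconds
    start.delete(&tid);
    return 0;
}
"""

b = BPF(text=prog)
# "pthread" names the pthread library mapped by the target process; the pid
# argument restricts the probes to that process, like -p.
b.attach_uprobe(name="pthread", sym="pthread_mutex_lock",
                fn_name="trace_entry", pid=pid)
b.attach_uretprobe(name="pthread", sym="pthread_mutex_lock",
                   fn_name="trace_return", pid=pid)

print("Tracing pthread_mutex_lock() in PID %d... Hit Ctrl-C to end." % pid)
try:
    sleep(99999999)
except KeyboardInterrupt:
    pass
b["dist"].print_log2_hist("usecs")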


Run a quick-and-dirty profiler over all the functions in an executable:
# ./funclatency /home/user/primes:* -p $(pidof primes) -F
Tracing 15 functions for "/home/user/primes:*"... Hit Ctrl-C to end.
^C

Function = is_prime [6556]
     nsecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 1495322  |****************************************|
      4096 -> 8191       : 95744    |**                                      |
      8192 -> 16383      : 9926     |                                        |
     16384 -> 32767      : 3070     |                                        |
     32768 -> 65535      : 1415     |                                        |
     65536 -> 131071     : 112      |                                        |
    131072 -> 262143     : 9        |                                        |
    262144 -> 524287     : 3        |                                        |
    524288 -> 1048575    : 0        |                                        |
   1048576 -> 2097151    : 8        |                                        |

Function = insert_result [6556]
     nsecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 111047   |****************************************|
      4096 -> 8191       : 3998     |*                                       |
      8192 -> 16383      : 720      |                                        |
     16384 -> 32767      : 238      |                                        |
     32768 -> 65535      : 106      |                                        |
     65536 -> 131071     : 5        |                                        |
    131072 -> 262143     : 4        |                                        |
Detaching...

From the results, we can see that the is_prime function has something resembling
an exponential distribution -- a few numbers take a very long time to test,
while most are verified as prime or composite in less than 4us. The
insert_result function exhibits a similar long tail, likely due to contention
over a shared results container.


Now vfs_read() is traced, and a microsecond histogram is printed:

# ./funclatency -u vfs_read
Tracing vfs_read... Hit Ctrl-C to end.
^C
     usecs               : count     distribution
         0 -> 1          : 1143     |**************************************|
         2 -> 3          : 420      |*************                         |
         4 -> 7          : 159      |*****                                 |
         8 -> 15         : 295      |*********                             |
        16 -> 31         : 25       |                                      |
        32 -> 63         : 5        |                                      |
        64 -> 127        : 1        |                                      |
       128 -> 255        : 0        |                                      |
       256 -> 511        : 0        |                                      |
       512 -> 1023       : 0        |                                      |
      1024 -> 2047       : 1        |                                      |
      2048 -> 4095       : 0        |                                      |
      4096 -> 8191       : 5        |                                      |
      8192 -> 16383      : 0        |                                      |
     16384 -> 32767      : 0        |                                      |
     32768 -> 65535      : 0        |                                      |
     65536 -> 131071     : 7        |                                      |
    131072 -> 262143     : 7        |                                      |
    262144 -> 524287     : 3        |                                      |
    524288 -> 1048575    : 7        |                                      |
Detaching...

This shows a bimodal distribution. Many vfs_read() calls were faster than 15
microseconds; however, a small handful took between 65 milliseconds and 1
second, as seen at the bottom of the table. These are likely network reads
from SSH, waiting on interactive keystrokes.


Tracing do_nanosleep() in milliseconds:

# ./funclatency -m do_nanosleep
Tracing do_nanosleep... Hit Ctrl-C to end.
^C
     msecs               : count     distribution
         0 -> 1          : 0        |                                      |
         2 -> 3          : 0        |                                      |
         4 -> 7          : 0        |                                      |
         8 -> 15         : 0        |                                      |
        16 -> 31         : 0        |                                      |
        32 -> 63         : 0        |                                      |
        64 -> 127        : 0        |                                      |
       128 -> 255        : 0        |                                      |
       256 -> 511        : 0        |                                      |
       512 -> 1023       : 328      |**************************************|
      1024 -> 2047       : 0        |                                      |
      2048 -> 4095       : 0        |                                      |
      4096 -> 8191       : 32       |***                                   |
      8192 -> 16383      : 0        |                                      |
     16384 -> 32767      : 0        |                                      |
     32768 -> 65535      : 2        |                                      |
Detaching...

This looks like it has found threads that are sleeping every 1, 5, and 60
seconds.


An interval can be provided using -i, and timestamps added using -T. For
example, tracing vfs_read() latency in milliseconds and printing output
every 5 seconds:

# ./funclatency -mTi 5 vfs_read
Tracing vfs_read... Hit Ctrl-C to end.

20:10:08
     msecs               : count     distribution
         0 -> 1          : 1500     |*************************************+|
         2 -> 3          : 3        |                                      |
         4 -> 7          : 1        |                                      |
         8 -> 15         : 2        |                                      |
        16 -> 31         : 0        |                                      |
        32 -> 63         : 0        |                                      |
        64 -> 127        : 4        |                                      |
       128 -> 255        : 3        |                                      |
       256 -> 511        : 1        |                                      |
       512 -> 1023       : 7        |                                      |

20:10:13
     msecs               : count     distribution
         0 -> 1          : 1251     |*************************************+|
         2 -> 3          : 3        |                                      |
         4 -> 7          : 2        |                                      |
         8 -> 15         : 0        |                                      |
        16 -> 31         : 2        |                                      |
        32 -> 63         : 3        |                                      |
        64 -> 127        : 5        |                                      |
       128 -> 255        : 5        |                                      |
       256 -> 511        : 3        |                                      |
       512 -> 1023       : 6        |                                      |
      1024 -> 2047       : 2        |                                      |

20:10:18
     msecs               : count     distribution
         0 -> 1          : 1265     |*************************************+|
         2 -> 3          : 0        |                                      |
         4 -> 7          : 5        |                                      |
         8 -> 15         : 9        |                                      |
        16 -> 31         : 7        |                                      |
        32 -> 63         : 1        |                                      |
        64 -> 127        : 2        |                                      |
       128 -> 255        : 3        |                                      |
       256 -> 511        : 5        |                                      |
       512 -> 1023       : 5        |                                      |
      1024 -> 2047       : 0        |                                      |
      2048 -> 4095       : 1        |                                      |
^C
20:10:20
     msecs               : count     distribution
         0 -> 1          : 249      |*************************************+|
         2 -> 3          : 0        |                                      |
         4 -> 7          : 0        |                                      |
         8 -> 15         : 1        |                                      |
        16 -> 31         : 0        |                                      |
        32 -> 63         : 0        |                                      |
        64 -> 127        : 0        |                                      |
       128 -> 255        : 0        |                                      |
       256 -> 511        : 0        |                                      |
       512 -> 1023       : 1        |                                      |
Detaching...


A single process can be traced, which filters in-kernel for efficiency. Here,
the vfs_read() function is timed in milliseconds for PID 17064, which is a
bash shell:

# ./funclatency -mp 17064 vfs_read
Tracing vfs_read... Hit Ctrl-C to end.
^C
     msecs               : count     distribution
         0 -> 1          : 1        |**                                    |
         2 -> 3          : 0        |                                      |
         4 -> 7          : 0        |                                      |
         8 -> 15         : 1        |**                                    |
        16 -> 31         : 2        |*****                                 |
        32 -> 63         : 0        |                                      |
        64 -> 127        : 13       |**************************************|
       128 -> 255        : 10       |*****************************         |
       256 -> 511        : 4        |***********                           |
Detaching...

The distribution between 64 and 511 milliseconds shows keystroke latency.
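
The in-kernel PID filter works by compiling the target PID into the BPF
program, so events from other processes are discarded in kernel context and
never copied to user level. Below is a minimal sketch of that idea; the script
name and the simple call counter are made up for this illustration (the real
tool substitutes the PID into its latency-histogram program instead):

#!/usr/bin/python
# pid_filter_sketch.py: illustrate an in-kernel PID filter with bcc.
# Usage: ./pid_filter_sketch.py PID
from __future__ import print_function
from bcc import BPF
from time import sleep
import sys

target_pid = int(sys.argv[1])   # e.g. the bash PID used with -p above

prog = """
#include <uapi/linux/ptrace.h>

BPF_HASH(counts, u32, u64);     // calls seen, keyed by process id

int trace_entry(struct pt_regs *ctx)
{
    u32 pid = bpf_get_current_pid_tgid() >> 32;  // upper 32 bits: process id
    if (pid != TARGET_PID)
        return 0;               // filtered here, in the kernel: other
                                // processes cost almost nothing
    counts.increment(pid);      // a real tool would record a timestamp and
                                // build a latency histogram instead
    return 0;
}
""".replace("TARGET_PID", str(target_pid))

b = BPF(text=prog)
b.attach_kprobe(event="vfs_read", fn_name="trace_entry")
print("Counting vfs_read() calls for PID %d... Hit Ctrl-C to end." % target_pid)
try:
    sleep(99999999)
except KeyboardInterrupt:
    pass
for k, v in b["counts"].items():
    print("PID %d: %d calls" % (k.value, v.value))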


The -F option can be used to print a histogram per function. E.g.:

# ./funclatency -uF 'vfs_r*'
Tracing 5 functions for "vfs_r*"... Hit Ctrl-C to end.
^C

Function = vfs_read
     usecs               : count     distribution
         0 -> 1          : 1044     |****************************************|
         2 -> 3          : 383      |**************                          |
         4 -> 7          : 76       |**                                      |
         8 -> 15         : 41       |*                                       |
        16 -> 31         : 26       |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 1        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 4        |                                        |
      4096 -> 8191       : 2        |                                        |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 0        |                                        |
     32768 -> 65535      : 2        |                                        |
     65536 -> 131071     : 5        |                                        |
    131072 -> 262143     : 5        |                                        |
    262144 -> 524287     : 3        |                                        |
    524288 -> 1048575    : 7        |                                        |

Function = vfs_rename
     usecs               : count     distribution
         0 -> 1          : 2        |****                                    |
         2 -> 3          : 2        |****                                    |
         4 -> 7          : 2        |****                                    |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 6        |*************                           |
        32 -> 63         : 18       |****************************************|
Detaching...


USAGE message:

# ./funclatency -h
usage: funclatency [-h] [-p PID] [-i INTERVAL] [-d DURATION] [-T] [-u] [-m]
                   [-F] [-r] [-v]
                   pattern

Time functions and print latency as a histogram

positional arguments:
  pattern               search expression for functions

optional arguments:
  -h, --help            show this help message and exit
  -p PID, --pid PID     trace this PID only
  -i INTERVAL, --interval INTERVAL
                        summary interval, in seconds
  -d DURATION, --duration DURATION
                        total duration of trace, in seconds
  -T, --timestamp       include timestamp on output
  -u, --microseconds    microsecond histogram
  -m, --milliseconds    millisecond histogram
  -F, --function        show a separate histogram per function
  -r, --regexp          use regular expressions. Default is "*" wildcards
                        only.
  -v, --verbose         print the BPF program (for debugging purposes)

examples:
    ./funclatency do_sys_open       # time the do_sys_open() kernel function
    ./funclatency c:read            # time the read() C library function
    ./funclatency -u vfs_read       # time vfs_read(), in microseconds
    ./funclatency -m do_nanosleep   # time do_nanosleep(), in milliseconds
    ./funclatency -i 2 -d 10 c:open # output every 2 seconds, for duration 10s
    ./funclatency -mTi 5 vfs_read   # output every 5 seconds, with timestamps
    ./funclatency -p 181 vfs_read   # time process 181 only
    ./funclatency 'vfs_fstat*'      # time both vfs_fstat() and vfs_fstatat()
    ./funclatency 'c:*printf'       # time the *printf family of functions
    ./funclatency -F 'vfs_r*'       # show one histogram per matched function
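
The wildcard and regular-expression patterns in the examples above are expanded
to a set of matching functions before probes are attached (with -r, the pattern
is used as a regular expression directly). The sketch below shows how a
shell-style wildcard can be turned into a regular expression and attached to
every matching kernel function with bcc; the script name and the crude
translation are illustrative only, and the probe body does nothing useful, so
it only demonstrates the matching:

#!/usr/bin/python
# pattern_sketch.py: attach one kprobe handler to every kernel function
# matching a wildcard pattern. Illustrative sketch only.
from __future__ import print_function
from bcc import BPF
import sys

pattern = sys.argv[1] if len(sys.argv) > 1 else "vfs_fstat*"
regex = "^" + pattern.replace("*", ".*") + "$"   # crude wildcard-to-regex

b = BPF(text="""
#include <uapi/linux/ptrace.h>

int trace_entry(struct pt_regs *ctx)
{
    return 0;   // a real tool records a timestamp here, as in the sketches above
}
""")
b.attach_kprobe(event_re=regex, fn_name="trace_entry")
print("Attached to %d kernel functions matching %s" %
      (b.num_open_kprobes(), pattern))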