\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename netperf.info
@settitle Care and Feeding of Netperf 2.6.X
@c %**end of header

@copying
This is Rick Jones' feeble attempt at a Texinfo-based manual for the
netperf benchmark.

Copyright @copyright{} 2005-2012 Hewlett-Packard Company
@quotation
Permission is granted to copy, distribute and/or modify this document
per the terms of the netperf source license, a copy of which can be
found in the file @file{COPYING} of the basic netperf distribution.
@end quotation
@end copying

@titlepage
@title Care and Feeding of Netperf
@subtitle Versions 2.6.0 and Later
@author Rick Jones @email{rick.jones2@@hp.com}
@c this is here to start the copyright page
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@c begin with a table of contents
@contents

@ifnottex
@node Top, Introduction, (dir), (dir)
@top Netperf Manual

@insertcopying
@end ifnottex

@menu
* Introduction::                An introduction to netperf - what it
                                  is and what it is not.
* Installing Netperf::          How to go about installing netperf.
* The Design of Netperf::
* Global Command-line Options::
* Using Netperf to Measure Bulk Data Transfer::
* Using Netperf to Measure Request/Response::
* Using Netperf to Measure Aggregate Performance::
* Using Netperf to Measure Bidirectional Transfer::
* The Omni Tests::
* Other Netperf Tests::
* Address Resolution::
* Enhancing Netperf::
* Netperf4::
* Concept Index::
* Option Index::
@end menu

@node Introduction, Installing Netperf, Top, Top
@chapter Introduction

@cindex Introduction

Netperf is a benchmark that can be used to measure various aspects of
networking performance.  The primary foci are bulk (aka
unidirectional) data transfer and request/response performance using
either TCP or UDP and the Berkeley Sockets interface.  As of this
writing, the tests available either unconditionally or conditionally
include:

@itemize @bullet
@item
TCP and UDP unidirectional transfer and request/response over IPv4 and
IPv6 using the Sockets interface.
@item
TCP and UDP unidirectional transfer and request/response over IPv4
using the XTI interface.
@item
Link-level unidirectional transfer and request/response using the DLPI
interface.
@item
Unix domain sockets.
@item
SCTP unidirectional transfer and request/response over IPv4 and IPv6
using the sockets interface.
@end itemize

While not every revision of netperf will work on every platform
listed, the intention is that at least some version of netperf will
work on the following platforms:

@itemize @bullet
@item
Unix - at least all the major variants.
@item
Linux
@item
Windows
@item
Others
@end itemize

Netperf is maintained and informally supported primarily by Rick
Jones, who can perhaps be best described as Netperf Contributing
Editor.  Non-trivial and much-appreciated assistance comes from others
in the network performance community, who are too numerous to mention
here.  While it is often used by them, netperf is NOT supported via
any of the formal Hewlett-Packard support channels.  You should feel
free to make enhancements and modifications to netperf to suit your
nefarious porpoises, so long as you stay within the guidelines of the
netperf copyright.  If you feel so inclined, you can send your changes
to @email{netperf-feedback@@netperf.org,netperf-feedback} for possible
inclusion into subsequent versions of netperf.

It is the Contributing Editor's belief that the netperf license walks
like open source and talks like open source.  However, the license was
never submitted for ``certification'' as an open source license.  If
you would prefer to make contributions to a networking benchmark using
a certified open source license, please consider netperf4, which is
distributed under the terms of the GPLv2.

The @email{netperf-talk@@netperf.org,netperf-talk} mailing list is
available to discuss the care and feeding of netperf with others who
share your interest in network performance benchmarking.  The
netperf-talk mailing list is a closed list (to deal with spam) and you
must first subscribe by sending email to
@email{netperf-talk-request@@netperf.org,netperf-talk-request}.


@menu
* Conventions::
@end menu

@node Conventions, , Introduction, Introduction
@section Conventions

A @dfn{sizespec} is a one or two item, comma-separated list used as an
argument to a command-line option that can set one or two related
netperf parameters.  If you wish to set both parameters to separate
values, items should be separated by a comma:

@example
parameter1,parameter2
@end example

If you wish to set the first parameter without altering the value of
the second from its default, you should follow the first item with a
comma:

@example
parameter1,
@end example

Likewise, precede the item with a comma if you wish to set only the
second parameter:

@example
,parameter2
@end example

An item with no commas:

@example
parameter1and2
@end example

will set both parameters to the same value.  This last mode is one of
the most frequently used.

There is another variant of the comma-separated, two-item list called
an @dfn{optionspec} which is like a sizespec with the exception that a
single item with no comma:

@example
parameter1
@end example

will set only the value of the first parameter and will leave the
second parameter at its default value.

Netperf has two types of command-line options.  The first are global
command-line options.  They are essentially any option not tied to a
particular test or group of tests.  An example of a global
command-line option is the one which sets the test type - @option{-t}.

The second type of options are test-specific options.  These are
options which are only applicable to a particular test or set of
tests.  An example of a test-specific option would be the send socket
buffer size for a TCP_STREAM test.

Global command-line options are specified first, with test-specific
options following after a @code{--} as in:

@example
netperf <global> -- <test-specific>
@end example


@node Installing Netperf, The Design of Netperf, Introduction, Top
@chapter Installing Netperf

@cindex Installation

Netperf's primary form of distribution is source code.  This allows
installation on systems other than those to which the authors have
ready access and thus the ability to create binaries.  There are two
styles of netperf installation.  The first runs the netperf server
program - netserver - as a child of inetd.  This requires the
installer to have sufficient privileges to edit the files
@file{/etc/services} and @file{/etc/inetd.conf} or their
platform-specific equivalents.

The second style is to run netserver as a standalone daemon.  This
second method does not require edit privileges on @file{/etc/services}
and @file{/etc/inetd.conf} but does mean you must remember to run the
netserver program explicitly after every system reboot.

This manual assumes that those wishing to measure networking
performance already know how to use anonymous FTP and/or a web
browser.  It is also expected that you have at least a passing
familiarity with the networking protocols and interfaces involved.
In all honesty, if you do not have such familiarity, likely as not you
have some experience to gain before attempting network performance
measurements.  The excellent texts by authors such as Stevens, Fenner
and Rudoff and/or Stallings would be good starting points.  There are
likely other excellent sources out there as well.

@menu
* Getting Netperf Bits::
* Installing Netperf Bits::
* Verifying Installation::
@end menu

@node Getting Netperf Bits, Installing Netperf Bits, Installing Netperf, Installing Netperf
@section Getting Netperf Bits

Gzipped tar files of netperf sources can be retrieved via
@uref{ftp://ftp.netperf.org/netperf,anonymous FTP}
for ``released'' versions of the bits.  Pre-release versions of the
bits can be retrieved via anonymous FTP from the
@uref{ftp://ftp.netperf.org/netperf/experimental,experimental} subdirectory.

For convenience and ease of remembering, a link to the download site
is provided via the
@uref{http://www.netperf.org/,Netperf Page}.

The bits corresponding to each discrete release of netperf are
@uref{http://www.netperf.org/svn/netperf2/tags,tagged} for retrieval
via subversion.  For example, there is a tag for the first version
corresponding to this version of the manual -
@uref{http://www.netperf.org/svn/netperf2/tags/netperf-2.6.0,netperf
2.6.0}.  Those wishing to be on the bleeding edge of netperf
development can use subversion to grab the
@uref{http://www.netperf.org/svn/netperf2/trunk,top of trunk}.  When
fixing bugs or making enhancements, patches against the top-of-trunk
are preferred.

There are likely other places around the Internet from which one can
download netperf bits.  These may be simple mirrors of the main
netperf site, or they may be local variants on netperf.  As with
anything one downloads from the Internet, take care to make sure it is
what you really wanted and isn't some malicious Trojan or whatnot.
Caveat downloader.

As a general rule, binaries of netperf and netserver are not
distributed from ftp.netperf.org.  From time to time a kind soul or
souls has packaged netperf as a Debian package available via the
apt-get mechanism or as an RPM.  I would be most interested in
learning how to enhance the makefiles to make that easier for people.

@node Installing Netperf Bits, Verifying Installation, Getting Netperf Bits, Installing Netperf
@section Installing Netperf Bits

Once you have downloaded the tar file of netperf sources onto your
system(s), it is necessary to unpack the tar file, cd to the netperf
directory, run configure and then make.  Most of the time it should be
sufficient to just:

@example
gzcat netperf-<version>.tar.gz | tar xf -
cd netperf-<version>
./configure
make
make install
@end example

Most of the ``usual'' configure script options should be present,
dealing with where to install binaries and whatnot.
@example
./configure --help
@end example
should list all of those and more.  You may find the @code{--prefix}
option helpful in deciding where the binaries and such will be put
during the @code{make install}.

@vindex --enable-cpuutil, Configure
If the netperf configure script does not know how to automagically
detect which CPU utilization mechanism to use on your platform you may
want to add a @code{--enable-cpuutil=mumble} option to the configure
command.  If you have knowledge and/or experience to contribute to
that area, feel free to contact @email{netperf-feedback@@netperf.org}.

@vindex --enable-xti, Configure
@vindex --enable-unixdomain, Configure
@vindex --enable-dlpi, Configure
@vindex --enable-sctp, Configure
Similarly, if you want tests using the XTI interface, Unix Domain
Sockets, DLPI or SCTP it will be necessary to add one or more
@code{--enable-[xti|unixdomain|dlpi|sctp]=yes} options to the
configure command.  As of this writing, the configure script will not
include those tests automagically.

@vindex --enable-omni, Configure
Starting with version 2.5.0, netperf began migrating most of the
``classic'' netperf tests found in @file{src/nettest_bsd.c} to the
so-called ``omni'' tests (aka ``two routines to run them all'') found
in @file{src/nettest_omni.c}.  This migration enables a number of new
features such as greater control over what output is included, and new
things to output.  The ``omni'' test is enabled by default in 2.5.0
and a number of the classic tests are migrated - you can tell if a
test has been migrated from the presence of @code{MIGRATED} in the
test banner.  If you encounter problems with either the omni or
migrated tests, please first attempt to obtain resolution via
@email{netperf-talk@@netperf.org} or
@email{netperf-feedback@@netperf.org}.  If that is unsuccessful, you
can add a @code{--enable-omni=no} to the configure command; the omni
tests will not be compiled-in and the classic tests will not be
migrated.

Starting with version 2.5.0, netperf includes the ``burst mode''
functionality in a default compilation of the bits.  If you encounter
problems with this, please first attempt to obtain help via
@email{netperf-talk@@netperf.org} or
@email{netperf-feedback@@netperf.org}.  If that is unsuccessful, you
can add a @code{--enable-burst=no} to the configure command and the
burst mode functionality will not be compiled-in.
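Pulling a few of the optional features described above together, a
configure invocation might look like the following.  This is purely an
illustrative sketch, not a recommended configuration; the
@code{--prefix} value is just an example and any of the other
@code{--enable-*} options mentioned above can be added in the same
way.

```shell
# Illustrative only: pick an install prefix and enable the Unix domain
# and SCTP tests, which are not included automagically.
./configure --prefix=$HOME/netperf-install \
            --enable-unixdomain=yes \
            --enable-sctp=yes
make
make install
```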
On some platforms, it may be necessary to precede the configure
command with a CFLAGS and/or LIBS variable as the netperf configure
script is not yet smart enough to set them itself.  Whenever possible,
these requirements will be found in @file{README.@var{platform}}
files.  Expertise and assistance in making that more automagic in the
configure script would be most welcome.

@cindex Limiting Bandwidth
@cindex Bandwidth Limitation
@vindex --enable-intervals, Configure
@vindex --enable-histogram, Configure
Other optional configure-time settings include
@code{--enable-intervals=yes} to give netperf the ability to ``pace''
its _STREAM tests and @code{--enable-histogram=yes} to have netperf
keep a histogram of interesting times.  Each of these will have some
effect on the measured result.  If your system supports
@code{gethrtime()} the effect of the histogram measurement should be
minimized but probably still measurable.  For example, the histogram
of a netperf TCP_RR test will be of the individual transaction times:
@example
netperf -t TCP_RR -H lag -v 2
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET : histogram
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    3538.82
32768  32768
Alignment      Offset
Local  Remote  Local  Remote
Send   Recv    Send   Recv
    8      0       0      0
Histogram of request/response times
UNIT_USEC     :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
TEN_USEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
HUNDRED_USEC  :    0: 34480:  111:   13:   12:    6:    9:    3:    4:    7
UNIT_MSEC     :    0:   60:   50:   51:   44:   44:   72:  119:  100:  101
TEN_MSEC      :    0:  105:    0:    0:    0:    0:    0:    0:    0:    0
HUNDRED_MSEC  :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
UNIT_SEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
TEN_SEC       :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
>100_SECS: 0
HIST_TOTAL:      35391
@end example

The histogram you see above is basically a base-10 log histogram where
we can see that most of the transaction times were in the one hundred
to one hundred ninety-nine microsecond range, but they were
occasionally as long as ten to nineteen milliseconds.

The @option{--enable-demo=yes} configure option will cause code to be
included to report interim results during a test run.  The rate at
which interim results are reported can then be controlled via the
global @option{-D} option.  Here is an example of @option{-D} output:

@example
$ src/netperf -D 1.35 -H tardy.hpl.hp.com -f M
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com (15.9.116.144) port 0 AF_INET : demo
Interim result:    5.41 MBytes/s over 1.35 seconds ending at 1308789765.848
Interim result:   11.07 MBytes/s over 1.36 seconds ending at 1308789767.206
Interim result:   16.00 MBytes/s over 1.36 seconds ending at 1308789768.566
Interim result:   20.66 MBytes/s over 1.36 seconds ending at 1308789769.922
Interim result:   22.74 MBytes/s over 1.36 seconds ending at 1308789771.285
Interim result:   23.07 MBytes/s over 1.36 seconds ending at 1308789772.647
Interim result:   23.77 MBytes/s over 1.37 seconds ending at 1308789774.016
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    MBytes/sec

 87380  16384  16384    10.06      17.81
@end example

Notice how the units of the interim result track that requested by the
@option{-f} option.  Also notice that sometimes the interval will be
longer than the value specified in the @option{-D} option.  This is
normal and stems from how demo mode is implemented: not by relying on
interval timers or frequent calls to get the current time, but by
calculating how many units of work must be performed to take at least
the desired interval.

Those familiar with this option in earlier versions of netperf will
note the addition of the ``ending at'' text.  This is the time as
reported by a @code{gettimeofday()} call (or its emulation) with a
@code{NULL} timezone pointer.  This addition is intended to make it
easier to insert interim results into an
@uref{http://oss.oetiker.ch/rrdtool/doc/rrdtool.en.html,rrdtool}
Round-Robin Database (RRD).  A likely bug-riddled example of doing so
can be found in @file{doc/examples/netperf_interim_to_rrd.sh}.  The
time is reported out to milliseconds rather than microseconds because
that is the most rrdtool understands as of the time of this writing.

As of this writing, a @code{make install} will not actually update the
files @file{/etc/services} and/or @file{/etc/inetd.conf} or their
platform-specific equivalents.  It remains necessary to perform that
bit of installation magic by hand.  Patches to the makefile sources to
effect an automagic editing of the necessary files to have netperf
installed as a child of inetd would be most welcome.

Starting the netserver as a standalone daemon should be as easy as:
@example
$ netserver
Starting netserver at port 12865
Starting netserver at hostname 0.0.0.0 port 12865 and family 0
@end example

Over time the specifics of the messages netserver prints to the screen
may change but the gist will remain the same.

If the compilation of netperf or netserver happens to fail, feel free
to contact @email{netperf-feedback@@netperf.org} or join and ask in
@email{netperf-talk@@netperf.org}.  However, it is quite important
that you include the actual compilation errors and perhaps even the
configure log in your email.  Otherwise, it will be that much more
difficult for someone to assist you.

@node Verifying Installation, , Installing Netperf Bits, Installing Netperf
@section Verifying Installation

Basically, once netperf is installed and netserver is configured as a
child of inetd, or launched as a standalone daemon, simply typing:
@example
netperf
@end example
should result in output similar to the following:
@example
$ netperf
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    2997.84
@end example


@node The Design of Netperf, Global Command-line Options, Installing Netperf, Top
@chapter The Design of Netperf

@cindex Design of Netperf

Netperf is designed around a basic client-server model.  There are
two executables - netperf and netserver.  Generally you will only
execute the netperf program, with the netserver program being invoked
by the remote system's inetd or having been previously started as its
own standalone daemon.

When you execute netperf it will establish a ``control connection'' to
the remote system.  This connection will be used to pass test
configuration information and results to and from the remote system.
Regardless of the type of test to be run, the control connection will
be a TCP connection using BSD sockets.  The control connection can use
either IPv4 or IPv6.

Once the control connection is up and the configuration information
has been passed, a separate ``data'' connection will be opened for the
measurement itself using the APIs and protocols appropriate for the
specified test.  When the test is completed, the data connection will
be torn-down and results from the netserver will be passed-back via
the control connection and combined with netperf's result for display
to the user.

Netperf places no traffic on the control connection while a test is in
progress.  Certain TCP options, such as SO_KEEPALIVE, if set as your
systems' default, may put packets out on the control connection while
a test is in progress.  Generally speaking this will have no effect on
the results.

@menu
* CPU Utilization::
@end menu

@node CPU Utilization, , The Design of Netperf, The Design of Netperf
@section CPU Utilization
@cindex CPU Utilization

CPU utilization is an important, and alas all-too-infrequently
reported, component of networking performance.
Unfortunately, it can be one of the most difficult metrics to measure
accurately and portably.  Netperf will do its level best to report
accurate CPU utilization figures, but some combinations of processor,
OS and configuration may make that difficult.

CPU utilization in netperf is reported as a value between 0 and 100%
regardless of the number of CPUs involved.  In addition to CPU
utilization, netperf will report a metric called a @dfn{service
demand}.  The service demand is the normalization of CPU utilization
and work performed.  For a _STREAM test it is the microseconds of CPU
time consumed to transfer one KB (K == 1024) of data.  For a _RR test
it is the microseconds of CPU time consumed processing a single
transaction.  For both CPU utilization and service demand, lower is
better.

Service demand can be particularly useful when trying to gauge the
effect of a performance change.  It is essentially a measure of
efficiency, with smaller values being more efficient and thus
``better.''

Netperf is coded to be able to use one of several, generally
platform-specific, CPU utilization measurement mechanisms.  Single
letter codes will be included in the CPU portion of the test banner to
indicate which mechanism was used on each of the local (netperf) and
remote (netserver) systems.

As of this writing those codes are:

@table @code
@item U
The CPU utilization measurement mechanism was unknown to netperf or
netperf/netserver was not compiled to include CPU utilization
measurements.  The code for the null CPU utilization mechanism can be
found in @file{src/netcpu_none.c}.
@item I
An HP-UX-specific CPU utilization mechanism whereby the kernel
incremented a per-CPU counter by one for each trip through the idle
loop.  This mechanism was only available on specially-compiled HP-UX
kernels prior to HP-UX 10 and is mentioned here only for the sake of
historical completeness and perhaps as a suggestion to those who might
be altering other operating systems.  While rather simple, perhaps
even simplistic, this mechanism was quite robust and was not affected
by the concerns of statistical methods, or of methods attempting to
track time in each of user, kernel, interrupt and idle modes, which
require quite careful accounting.  It can be thought-of as the
in-kernel version of the looper @code{L} mechanism without the context
switch overhead.  This mechanism required calibration.
@item P
An HP-UX-specific CPU utilization mechanism whereby the kernel keeps
track of time (in the form of CPU cycles) spent in the kernel idle
loop (HP-UX 10.0 to 11.31 inclusive), or where the kernel keeps track
of time spent in idle, user, kernel and interrupt processing (HP-UX
11.23 and later).  The former requires calibration, the latter does
not.  Values in either case are retrieved via one of the pstat(2)
family of calls, hence the use of the letter @code{P}.  The code for
these mechanisms is found in @file{src/netcpu_pstat.c} and
@file{src/netcpu_pstatnew.c} respectively.
@item K
A Solaris-specific CPU utilization mechanism whereby the kernel keeps
track of ticks (eg HZ) spent in the idle loop.  This method is
statistical and is known to be inaccurate when the interrupt rate is
above epsilon as time spent processing interrupts is not subtracted
from idle.  The value is retrieved via a kstat() call - hence the use
of the letter @code{K}.  Since this mechanism uses units of ticks (HZ)
the calibration value should invariably match HZ (eg 100).  The code
for this mechanism is implemented in @file{src/netcpu_kstat.c}.
@item M
A Solaris-specific mechanism, available on Solaris 10 and later, which
uses the new microstate accounting mechanisms.  There are two, alas
overlapping, mechanisms.  The first tracks nanoseconds spent in user,
kernel, and idle modes.  The second mechanism tracks nanoseconds spent
in interrupt.  Since the mechanisms overlap, netperf goes through some
hand-waving to try to ``fix'' the problem.  Since the accuracy of the
hand-waving cannot be completely determined, one must presume that
while better than the @code{K} mechanism, this mechanism too is not
without issues.  The values are retrieved via kstat() calls, but the
letter code is set to @code{M} to distinguish this mechanism from the
even less accurate @code{K} mechanism.  The code for this mechanism is
implemented in @file{src/netcpu_kstat10.c}.
@item L
A mechanism based on ``looper'' or ``soaker'' processes which sit in
tight loops counting as fast as they possibly can.  This mechanism
starts a looper process for each known CPU on the system.  The effect
of processor hyperthreading on the mechanism is not yet known.  This
mechanism definitely requires calibration.  The code for the
``looper'' mechanism can be found in @file{src/netcpu_looper.c}.
@item N
A Microsoft Windows-specific mechanism, the code for which can be
found in @file{src/netcpu_ntperf.c}.  This mechanism too is based on
what appears to be a form of micro-state accounting and requires no
calibration.  On laptops, or other systems which may dynamically alter
the CPU frequency to minimize power consumption, it has been suggested
that this mechanism may become slightly confused, in which case using
BIOS/uEFI settings to disable the power saving would be indicated.

@item S
This mechanism uses @file{/proc/stat} on Linux to retrieve time
(ticks) spent in idle mode.  It is thought but not known to be
reasonably accurate.  The code for this mechanism can be found in
@file{src/netcpu_procstat.c}.
@item C
A mechanism somewhat similar to @code{S} but using the sysctl() call
on BSD-like operating systems (*BSD and MacOS X).  The code for this
mechanism can be found in @file{src/netcpu_sysctl.c}.
@item Others
Other mechanisms included in netperf in the past have included using
the times() and getrusage() calls.  These calls are actually rather
poorly suited to the task of measuring CPU overhead for networking as
they tend to be process-specific and much network-related processing
can happen outside the context of a process, in places where it is not
a given it will be charged to the correct process, or to any process
at all.  They are mentioned here as a warning to anyone seeing those
mechanisms used in other networking benchmarks.  These mechanisms are
not available in netperf 2.4.0 and later.
@end table

For many platforms, the configure script will choose the best
available CPU utilization mechanism.  However, some platforms have no
particularly good mechanisms.  On those platforms, it is probably best
to use the ``LOOPER'' mechanism, which is basically some number of
processes (as many as there are processors) sitting in tight little
loops counting as fast as they can.  The rate at which the loopers
count when the system is believed to be idle is compared with the rate
when the system is running netperf and the ratio is used to compute
CPU utilization.

In the past, netperf included some mechanisms that only reported CPU
time charged to the calling process.  Those mechanisms have been
removed from netperf versions 2.4.0 and later because they are
hopelessly inaccurate.  Networking can and often does result in CPU
time being spent in places - such as interrupt contexts - that do not
get charged to the correct process, or to any process at all.
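To make the looper arithmetic concrete, here is a small sketch using
entirely made-up figures: the counting rates, CPU count, test duration
and kilobytes moved are all hypothetical, and the service-demand line
simply applies the manual's definition of microseconds of CPU time per
KB transferred; netperf's internal computation may differ in detail.

```shell
# Sketch of the LOOPER arithmetic with hypothetical figures: loopers
# counted to 2,000,000 per second on an idle system but only to
# 1,400,000 per second during a 10 second test that moved
# 1,000,000 KB of data on a 2-CPU system.
awk 'BEGIN {
  idle_rate = 2000000; test_rate = 1400000
  cpus = 2; secs = 10; kbytes = 1000000

  # utilization is the fraction of counting capacity lost to the test
  util = 100 * (1 - test_rate / idle_rate)

  # service demand: microseconds of CPU time per KB transferred
  demand = (util / 100) * secs * cpus * 1000000 / kbytes
  printf "%.1f%% CPU, %.1f usec/KB\n", util, demand
}'
```

With these figures the loopers lose 30% of their counting capacity to
the test, so the mechanism would report 30% CPU utilization and a
service demand of 6 microseconds per KB.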
652 653In fact, time spent in the processing of interrupts is a common issue 654for many CPU utilization mechanisms. In particular, the ``PSTAT'' 655mechanism was eventually known to have problems accounting for certain 656interrupt time prior to HP-UX 11.11 (11iv1). HP-UX 11iv2 and later 657are known/presumed to be good. The ``KSTAT'' mechanism is known to 658have problems on all versions of Solaris up to and including Solaris 65910. Even the microstate accounting available via kstat in Solaris 10 660has issues, though perhaps not as bad as those of prior versions. 661 662The /proc/stat mechanism under Linux is in what the author would 663consider an ``uncertain'' category as it appears to be statistical, 664which may also have issues with time spent processing interrupts. 665 666In summary, be sure to ``sanity-check'' the CPU utilization figures 667with other mechanisms. However, platform tools such as top, vmstat or 668mpstat are often based on the same mechanisms used by netperf. 669 670@menu 671* CPU Utilization in a Virtual Guest:: 672@end menu 673 674@node CPU Utilization in a Virtual Guest, , CPU Utilization, CPU Utilization 675@subsection CPU Utilization in a Virtual Guest 676 677The CPU utilization mechanisms used by netperf are ``inline'' in that 678they are run by the same netperf or netserver process as is running 679the test itself. This works just fine for ``bare iron'' tests but 680runs into a problem when using virtual machines. 681 682The relationship between virtual guest and hypervisor can be thought 683of as being similar to that between a process and kernel in a bare 684iron system. As such, (m)any CPU utilization mechanisms used in the 685virtual guest are similar to ``process-local'' mechanisms in a bare 686iron situation. However, just as with bare iron and process-local 687mechanisms, much networking processing happens outside the context of 688the virtual guest. 
It takes place in the hypervisor, and is not 689visible to mechanisms running in the guest(s). For this reason, one 690should not really trust CPU utilization figures reported by netperf or 691netserver when running in a virtual guest. 692 693If one is looking to measure the added overhead of a virtualization 694mechanism, rather than rely on CPU utilization, one can rely instead 695on netperf _RR tests - path-lengths and overheads can be a significant 696fraction of the latency, so increases in overhead should appear as 697decreases in transaction rate. Whatever you do, @b{DO NOT} rely on 698the throughput of a _STREAM test. Achieving link-rate can be done via 699a multitude of options that mask overhead rather than eliminate it. 700 701@node Global Command-line Options, Using Netperf to Measure Bulk Data Transfer, The Design of Netperf, Top 702@chapter Global Command-line Options 703 704This section describes each of the global command-line options 705available in the netperf and netserver binaries. Essentially, it is 706an expanded version of the usage information displayed by netperf or 707netserver when invoked with the @option{-h} global command-line 708option. 709 710@menu 711* Command-line Options Syntax:: 712* Global Options:: 713@end menu 714 715@node Command-line Options Syntax, Global Options, Global Command-line Options, Global Command-line Options 716@comment node-name, next, previous, up 717@section Command-line Options Syntax 718 719Revision 1.8 of netperf introduced enough new functionality to overrun 720the English alphabet for mnemonic command-line option names, and the 721author was not and is not quite ready to switch to the contemporary 722@option{--mumble} style of command-line options. (Call him a Luddite 723if you wish :). 724 725For this reason, the command-line options were split into two parts - 726the first are the global command-line options. They are options that 727affect nearly any and every test type of netperf. 
The second type are the test-specific command-line options. Both are entered on the same command line, but they must be separated from one another by a @code{--} for correct parsing. Global command-line options come first, followed by the @code{--} and then test-specific command-line options. If there are no test-specific options to be set, the @code{--} may be omitted. If there are no global command-line options to be set, test-specific options must still be preceded by a @code{--}. For example:
@example
netperf <global> -- <test-specific>
@end example
sets both global and test-specific options:
@example
netperf <global>
@end example
sets just global options and:
@example
netperf -- <test-specific>
@end example
sets just test-specific options.

@node Global Options, , Command-line Options Syntax, Global Command-line Options
@comment node-name, next, previous, up
@section Global Options

@table @code
@vindex -a, Global
@item -a <sizespec>
This option allows you to alter the alignment of the buffers used in the sending and receiving calls on the local system. Changing the alignment of the buffers can force the system to use different copy schemes, which can have a measurable effect on performance. If the page size for the system were 4096 bytes, and you want to pass page-aligned buffers beginning on page boundaries, you could use @samp{-a 4096}. By default the units are bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes respectively. [Default: 8 bytes]

@vindex -A, Global
@item -A <sizespec>
This option is identical to the @option{-a} option with the difference being it affects alignments for the remote system.
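As an illustration of those sizespec suffix rules - upper-case suffixes being powers of two and lower-case being powers of ten - here is a minimal shell sketch. The @code{parse_sizespec} function name is purely illustrative and is not part of netperf itself:

```shell
# Illustrative sketch of netperf's sizespec suffix arithmetic:
# G/M/K mean 2^30/2^20/2^10, g/m/k mean 10^9/10^6/10^3.
parse_sizespec() {
    value=${1%[GMKgmk]}               # strip any trailing suffix letter
    case $1 in
        *G) echo $((value * 1024 * 1024 * 1024)) ;;
        *M) echo $((value * 1024 * 1024)) ;;
        *K) echo $((value * 1024)) ;;
        *g) echo $((value * 1000000000)) ;;
        *m) echo $((value * 1000000)) ;;
        *k) echo $((value * 1000)) ;;
        *)  echo $((value)) ;;        # no suffix: plain bytes
    esac
}

parse_sizespec 4096   # a page-sized alignment, as in -a 4096
parse_sizespec 32K    # 32768 bytes
parse_sizespec 32k    # 32000 bytes
```

Note the difference between @samp{32K} (32768 bytes) and @samp{32k} (32000 bytes); mixing the two up is a common source of small discrepancies in results.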
@vindex -b, Global
@item -b <size>
This option is only present when netperf has been configured with --enable-intervals=yes prior to compilation. It sets the size of the burst of send calls in a _STREAM test. When used in conjunction with the @option{-w} option it can cause the rate at which data is sent to be ``paced.''

@vindex -B, Global
@item -B <string>
This option will cause @option{<string>} to be appended to the brief (see -P) output of netperf.

@vindex -c, Global
@item -c [rate]
This option will ask that CPU utilization and service demand be calculated for the local system. For those CPU utilization mechanisms requiring calibration, the optional rate parameter may be specified to preclude running another calibration step, saving 40 seconds of time. For those CPU utilization mechanisms requiring no calibration, the optional rate parameter will be utterly and completely ignored. [Default: no CPU measurements]

@vindex -C, Global
@item -C [rate]
This option requests CPU utilization and service demand calculations for the remote system. It is otherwise identical to the @option{-c} option.

@vindex -d, Global
@item -d
Each instance of this option will increase the quantity of debugging output displayed during a test. If the debugging output level is set high enough, it may have a measurable effect on performance. Debugging information for the local system is printed to stdout. Debugging information for the remote system is sent by default to the file @file{/tmp/netperf.debug}. [Default: no debugging output]

@vindex -D, Global
@item -D [interval,units]
This option is only available when netperf is configured with --enable-demo=yes. When set, it will cause netperf to emit periodic reports of performance during the run. [@var{interval},@var{units}] follow the semantics of an optionspec.
If specified, @var{interval} gives the minimum interval in real seconds; it does not have to be whole seconds. The @var{units} value can be used for the first guess as to how many units of work (bytes or transactions) must be done to take at least @var{interval} seconds. If omitted, @var{interval} defaults to one second and @var{units} to values specific to each test type.

@vindex -f, Global
@item -f G|M|K|g|m|k|x
This option can be used to change the reporting units for _STREAM tests. Arguments of ``G,'' ``M,'' or ``K'' will set the units to 2^30, 2^20 or 2^10 bytes/s respectively (eg power-of-two GB, MB or KB). Arguments of ``g,'' ``m'' or ``k'' will set the units to 10^9, 10^6 or 10^3 bits/s respectively. An argument of ``x'' requests the units be transactions per second and is only meaningful for a request-response test. [Default: ``m'' or 10^6 bits/s]

@vindex -F, Global
@item -F <fillfile>
This option specifies the file from which the send buffers will be pre-filled. While the buffers will contain data from the specified file, the file is not fully transferred to the remote system, as the receiving end of the test will not write the contents of what it receives to a file. This can be used to pre-fill the send buffers with data having different compressibility and so is useful when measuring performance over mechanisms which perform compression.

While previously required for a TCP_SENDFILE test, later versions of netperf removed that restriction, creating a temporary file as needed. While the author cannot recall exactly when that took place, it is known to be unnecessary in version 2.5.0 and later.

@vindex -h, Global
@item -h
This option causes netperf to display its ``global'' usage string and exit, to the exclusion of all else.
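Converting between the @option{-f} reporting units is easy to get wrong, since ``m'' is a power of ten measured in bits while ``M'' is a power of two measured in bytes. A hedged sketch of the conversion, using a made-up throughput figure rather than actual netperf output:

```shell
# Convert a figure reported in the default -f m units (10^6 bits/s)
# into -f M units (2^20 bytes/s). The 941.25 figure is illustrative
# only, not a measurement.
mbit=941.25
awk -v r="$mbit" 'BEGIN { printf "%.2f\n", r * 1000000 / 8 / (1024 * 1024) }'
```

So a link reported at 941.25 * 10^6 bits/s is carrying roughly 112.21 * 2^20 bytes/s.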
@vindex -H, Global
@item -H <optionspec>
This option will set the name of the remote system and/or the address family used for the control connection. For example:
@example
-H linger,4
@end example
will set the name of the remote system to ``linger'' and tell netperf to use IPv4 addressing only.
@example
-H ,6
@end example
will leave the name of the remote system at its default, and request that only IPv6 addresses be used for the control connection.
@example
-H lag
@end example
will set the name of the remote system to ``lag'' and leave the address family at AF_UNSPEC, which means selection of IPv4 vs IPv6 is left to the system's address resolution.

A value of ``inet'' can be used in place of ``4'' to request IPv4-only addressing. Similarly, a value of ``inet6'' can be used in place of ``6'' to request IPv6-only addressing. A value of ``0'' can be used to request either IPv4 or IPv6 addressing as name resolution dictates.

By default, the options set with the global @option{-H} option are inherited by the test for its data connection, unless a test-specific @option{-H} option is specified.

If a @option{-H} option follows either the @option{-4} or @option{-6} options, the family setting specified with the -H option will override the @option{-4} or @option{-6} options for the remote address family. If no address family is specified, settings from a previous @option{-4} or @option{-6} option will remain. In a nutshell, the last explicit global command-line option wins.

[Default: ``localhost'' for the remote name/IP address and ``0'' (eg AF_UNSPEC) for the remote address family.]

@vindex -I, Global
@item -I <optionspec>
This option enables the calculation of confidence intervals and sets the confidence and width parameters, with the first half of the optionspec being either 99 or 95 for 99% or 95% confidence respectively.
The second value of the optionspec specifies the width of the desired confidence interval. For example
@example
-I 99,5
@end example
asks netperf to be 99% confident that the measured mean values for throughput and CPU utilization are within +/- 2.5% of the ``real'' mean values. If the @option{-i} option is specified and the @option{-I} option is omitted, the confidence defaults to 99% and the width to 5% (giving +/- 2.5%).

If a classic netperf test calculates that the desired confidence intervals have not been met, it emits a noticeable warning that cannot be suppressed with the @option{-P} or @option{-v} options:

@example
netperf -H tardy.cup -i 3 -I 99,5
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.cup.hp.com (15.244.44.58) port 0 AF_INET : +/-2.5% @@ 99% conf.
!!! WARNING
!!! Desired confidence was not achieved within the specified iterations.
!!! This implies that there was variability in the test environment that
!!! must be investigated before going further.
!!! Confidence intervals: Throughput      :  6.8%
!!!                       Local CPU util  :  0.0%
!!!                       Remote CPU util :  0.0%

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 32768  16384  16384    10.01      40.23
@end example

In the example above we see that netperf did not meet the desired confidence intervals. Instead of being 99% confident it was within +/- 2.5% of the real mean value of throughput, it is only confident it was within +/- 3.4%. In this example, increasing the @option{-i} option (described below) and/or increasing the iteration length with the @option{-l} option might resolve the situation.

In an explicit ``omni'' test, failure to meet the confidence intervals will not result in netperf emitting a warning.
To verify whether or not the confidence intervals were hit, one will need to include them as part of an @ref{Omni Output Selection,output selection} in the test-specific @option{-o}, @option{-O} or @option{-k} output selection options. The warning about not hitting the confidence intervals will remain in a ``migrated'' classic netperf test.

@vindex -i, Global
@item -i <sizespec>
This option enables the calculation of confidence intervals and sets the minimum and maximum number of iterations to run in attempting to achieve the desired confidence interval. The first value sets the maximum number of iterations to run, the second, the minimum. The maximum number of iterations is silently capped at 30 and the minimum is silently floored at 3. Netperf repeats the measurement the minimum number of iterations and continues until it reaches either the desired confidence interval, or the maximum number of iterations, whichever comes first. A classic or migrated netperf test will not display the actual number of iterations run. An @ref{The Omni Tests,omni test} will emit the number of iterations run if the @code{CONFIDENCE_ITERATION} output selector is included in the @ref{Omni Output Selection,output selection}.

If the @option{-I} option is specified and the @option{-i} option omitted, the maximum number of iterations is set to 10 and the minimum to three.

Output of a warning upon not hitting the desired confidence intervals follows the description provided for the @option{-I} option.

The total test time will be somewhere between the minimum and maximum number of iterations multiplied by the test length supplied by the @option{-l} option.

@vindex -j, Global
@item -j
This option instructs netperf to keep additional timing statistics when explicitly running an @ref{The Omni Tests,omni test}.
These can be output when the test-specific @option{-o}, @option{-O} or @option{-k} @ref{Omni Output Selectors,output selectors} include one or more of:

@itemize
@item MIN_LATENCY
@item MAX_LATENCY
@item P50_LATENCY
@item P90_LATENCY
@item P99_LATENCY
@item MEAN_LATENCY
@item STDDEV_LATENCY
@end itemize

These statistics will be based on an expanded (100 buckets per row rather than 10) histogram of times rather than a terribly long list of individual times. As such, there will be some slight error thanks to the bucketing. However, the reduction in storage and processing overheads is well worth it. When running a request/response test, one might get some idea of the error by comparing the @ref{Omni Output Selectors,@code{MEAN_LATENCY}} calculated from the histogram with the @code{RT_LATENCY} calculated from the number of request/response transactions and the test run time.

In the case of a request/response test the latencies will be transaction latencies. In the case of a receive-only test they will be time spent in the receive call. In the case of a send-only test they will be time spent in the send call. The units will be microseconds. Added in netperf 2.5.0.

@vindex -l, Global
@item -l testlen
This option controls the length of any @b{one} iteration of the requested test. A positive value for @var{testlen} will run each iteration of the test for at least @var{testlen} seconds. A negative value for @var{testlen} will run each iteration for the absolute value of @var{testlen} transactions for a _RR test or bytes for a _STREAM test. Certain tests, notably those using UDP, can only be timed; they cannot be limited by transaction or byte count. This limitation may be relaxed in an @ref{The Omni Tests,omni} test.
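The interplay between the @option{-i} iteration bounds and the @option{-l} test length described above amounts to simple arithmetic, sketched here with illustrative figures (not measurements):

```shell
# With -i 30,3 (max 30 iterations, min 3) and -l 10 (ten seconds per
# iteration), the total test time falls between min*testlen and
# max*testlen seconds, depending on when confidence is reached.
max_iter=30
min_iter=3
testlen=10
echo "between $((min_iter * testlen)) and $((max_iter * testlen)) seconds"
```

So a @samp{netperf -i 30,3 -l 10 ...} invocation could take anywhere from half a minute to five minutes to complete.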
In some situations, individual iterations of a test may run for longer than the number of seconds specified by the @option{-l} option. In particular, this may occur for those tests where the socket buffer size(s) are significantly larger than the bandwidth-delay product of the link(s) over which the data connection passes, or those tests where there may be non-trivial numbers of retransmissions.

If confidence intervals are enabled via either @option{-I} or @option{-i}, the total length of the netperf test will be somewhere between the minimum and maximum iteration count multiplied by @var{testlen}.

@vindex -L, Global
@item -L <optionspec>
This option is identical to the @option{-H} option with the difference being it sets the _local_ hostname/IP and/or address family information. This option is generally unnecessary, but can be useful when you wish to make sure that the netperf control and data connections go via different paths. It can also come in handy if one is trying to run netperf through those evil, end-to-end breaking things known as firewalls.

[Default: 0.0.0.0 (eg INADDR_ANY) for IPv4 and ::0 for IPv6 for the local name. AF_UNSPEC for the local address family.]

@vindex -n, Global
@item -n numcpus
This option tells netperf how many CPUs it should ass-u-me are active on the system running netperf. In particular, this is used for the @ref{CPU Utilization,CPU utilization} and service demand calculations. On certain systems, netperf is able to determine the number of CPUs automagically. This option will override any number netperf might be able to determine on its own.

Note that this option does _not_ set the number of CPUs on the system running netserver.
When netperf/netserver cannot automagically determine the number of CPUs, that can only be set for netserver via a netserver @option{-n} command-line option.

As it is almost universally possible for netperf/netserver to determine the number of CPUs on the system automagically, 99 times out of 10 this option should not be necessary and may be removed in a future release of netperf.

@vindex -N, Global
@item -N
This option tells netperf to forgo establishing a control connection. This makes it possible to run some limited netperf tests without a corresponding netserver on the remote system.

With this option set, the test to be run will get all the addressing information it needs to establish its data connection from the command line or internal defaults. If not otherwise specified by test-specific command line options, the data connection for a ``STREAM'' or ``SENDFILE'' test will be to the ``discard'' port, an ``RR'' test will be to the ``echo'' port, and a ``MAERTS'' test will be to the chargen port.

The response size of an ``RR'' test will be silently set to be the same as the request size. Otherwise the test would hang if the response size was larger than the request size, or would report an incorrect, inflated transaction rate if the response size was less than the request size.

Since there is no control connection when this option is specified, it is not possible to set ``remote'' properties such as socket buffer size and the like via the netperf command line. Nor is it possible to retrieve such interesting remote information as CPU utilization. These items will be displayed as values which should make it immediately obvious that was the case.
The only way to change remote characteristics such as socket buffer size or to obtain information such as CPU utilization is to employ platform-specific methods on the remote system. Frankly, if one has access to the remote system to employ those methods one ought to be able to run a netserver there. However, that ability may not be present in certain ``support'' situations, hence the addition of this option.

Added in netperf 2.4.3.

@vindex -o, Global
@item -o <sizespec>
The value(s) passed-in with this option will be used as an offset added to the alignment specified with the @option{-a} option. For example:
@example
-o 3 -a 4096
@end example
will cause the buffers passed to the local (netperf) send and receive calls to begin three bytes past an address aligned to 4096 bytes. [Default: 0 bytes]

@vindex -O, Global
@item -O <sizespec>
This option behaves just as the @option{-o} option but on the remote (netserver) system and in conjunction with the @option{-A} option. [Default: 0 bytes]

@vindex -p, Global
@item -p <optionspec>
The first value of the optionspec passed-in with this option tells netperf the port number at which it should expect the remote netserver to be listening for control connections. The second value of the optionspec will request netperf to bind to that local port number before establishing the control connection. For example
@example
-p 12345
@end example
tells netperf that the remote netserver is listening on port 12345 and leaves selection of the local port number for the control connection up to the local TCP/IP stack whereas
@example
-p ,32109
@end example
leaves the remote netserver port at the default value of 12865 and causes netperf to bind to the local port number 32109 before connecting to the remote netserver.
In general, setting the local port number is only necessary when one is looking to run netperf through those evil, end-to-end breaking things known as firewalls.

@vindex -P, Global
@item -P 0|1
A value of ``1'' for the @option{-P} option will enable display of the test banner. A value of ``0'' will disable display of the test banner. One might want to disable display of the test banner when running the same basic test type (eg TCP_STREAM) multiple times in succession where the test banners would then simply be redundant and unnecessarily clutter the output. [Default: 1 - display test banners]

@vindex -s, Global
@item -s <seconds>
This option will cause netperf to sleep @samp{<seconds>} before actually transferring data over the data connection. This may be useful in situations where one wishes to start a great many netperf instances and does not want the earlier ones affecting the ability of the later ones to get established.

Added somewhere between versions 2.4.3 and 2.5.0.

@vindex -S, Global
@item -S
This option will cause an attempt to be made to set SO_KEEPALIVE on the data socket of a test using the BSD sockets interface. The attempt will be made on the netperf side of all tests, and will be made on the netserver side of an @ref{The Omni Tests,omni} or @ref{Migrated Tests,migrated} test. No indication of failure is given unless debug output is enabled with the global @option{-d} option.

Added in version 2.5.0.

@vindex -t, Global
@item -t testname
This option is used to tell netperf which test you wish to run.
As of this writing, valid values for @var{testname} include:
@itemize
@item
@ref{TCP_STREAM}, @ref{TCP_MAERTS}, @ref{TCP_SENDFILE}, @ref{TCP_RR}, @ref{TCP_CRR}, @ref{TCP_CC}
@item
@ref{UDP_STREAM}, @ref{UDP_RR}
@item
@ref{XTI_TCP_STREAM}, @ref{XTI_TCP_RR}, @ref{XTI_TCP_CRR}, @ref{XTI_TCP_CC}
@item
@ref{XTI_UDP_STREAM}, @ref{XTI_UDP_RR}
@item
@ref{SCTP_STREAM}, @ref{SCTP_RR}
@item
@ref{DLCO_STREAM}, @ref{DLCO_RR}, @ref{DLCL_STREAM}, @ref{DLCL_RR}
@item
@ref{Other Netperf Tests,LOC_CPU}, @ref{Other Netperf Tests,REM_CPU}
@item
@ref{The Omni Tests,OMNI}
@end itemize
Not all tests are always compiled into netperf. In particular, the ``XTI,'' ``SCTP,'' ``UNIXDOMAIN,'' and ``DL*'' tests are only included in netperf when configured with @option{--enable-[xti|sctp|unixdomain|dlpi]=yes}.

Netperf only runs one type of test no matter how many @option{-t} options may be present on the command-line. The last @option{-t} global command-line option will determine the test to be run. [Default: TCP_STREAM]

@vindex -T, Global
@item -T <optionspec>
This option controls the CPU, and probably by extension memory, affinity of netperf and/or netserver.
@example
netperf -T 1
@end example
will bind both netperf and netserver to ``CPU 1'' on their respective systems.
@example
netperf -T 1,
@end example
will bind just netperf to ``CPU 1'' and will leave netserver unbound.
@example
netperf -T ,2
@end example
will leave netperf unbound and will bind netserver to ``CPU 2.''
@example
netperf -T 1,2
@end example
will bind netperf to ``CPU 1'' and netserver to ``CPU 2.''

This can be particularly useful when investigating performance issues involving where processes run relative to where NIC interrupts are processed or where NICs allocate their DMA buffers.
@vindex -v, Global
@item -v verbosity
This option controls how verbose netperf will be in its output, and is often used in conjunction with the @option{-P} option. If the verbosity is set to a value of ``0'' then only the test's SFM (Single Figure of Merit) is displayed. If local @ref{CPU Utilization,CPU utilization} is requested via the @option{-c} option then the SFM is the local service demand. Otherwise, if remote CPU utilization is requested via the @option{-C} option then the SFM is the remote service demand. If neither local nor remote CPU utilization are requested the SFM will be the measured throughput or transaction rate as implied by the test specified with the @option{-t} option.

If the verbosity level is set to ``1'' then the ``normal'' netperf result output for each test is displayed.

If the verbosity level is set to ``2'' then ``extra'' information will be displayed. This may include, but is not limited to, the number of send or recv calls made and the average number of bytes per send or recv call, or a histogram of the time spent in each send() call or for each transaction if netperf was configured with @option{--enable-histogram=yes}. [Default: 1 - normal verbosity]

In an @ref{The Omni Tests,omni} test the verbosity setting is largely ignored, save for when asking for the time histogram to be displayed. In version 2.5.0 and later there is no @ref{Omni Output Selectors,output selector} for the histogram and so it remains displayed only when the verbosity level is set to 2.

@vindex -V, Global
@item -V
This option displays the netperf version and then exits.

Added in netperf 2.4.4.
@vindex -w, Global
@item -w time
If netperf was configured with @option{--enable-intervals=yes} then this value will set the inter-burst time to @var{time} milliseconds, and the @option{-b} option will set the number of sends per burst. The actual inter-burst time may vary depending on the system's timer resolution.

@vindex -W, Global
@item -W <sizespec>
This option controls the number of buffers in the send (first or only value) and/or receive (second or only value) buffer rings. Unlike some benchmarks, netperf does not continuously send or receive from a single buffer. Instead it rotates through a ring of buffers. [Default: one more than the size of the send or receive socket buffer sizes (@option{-s} and/or @option{-S} options) divided by the send @option{-m} or receive @option{-M} buffer size respectively]

@vindex -4, Global
@item -4
Specifying this option will set both the local and remote address families to AF_INET - that is, use only IPv4 addresses on the control connection. This can be overridden by a subsequent @option{-6}, @option{-H} or @option{-L} option. Basically, the last option explicitly specifying an address family wins. Unless overridden by a test-specific option, this will be inherited for the data connection as well.

@vindex -6, Global
@item -6
Specifying this option will set both the local and remote address families to AF_INET6 - that is, use only IPv6 addresses on the control connection. This can be overridden by a subsequent @option{-4}, @option{-H} or @option{-L} option. Basically, the last address family explicitly specified wins. Unless overridden by a test-specific option, this will be inherited for the data connection as well.
@end table


@node Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Request/Response , Global Command-line Options, Top
@chapter Using Netperf to Measure Bulk Data Transfer

The most commonly measured aspect of networked system performance is that of bulk or unidirectional transfer performance. Everyone wants to know how many bits or bytes per second they can push across the network. The classic netperf convention for a bulk data transfer test name is to tack a ``_STREAM'' suffix onto a test name.

@menu
* Issues in Bulk Transfer::
* Options common to TCP UDP and SCTP tests::
@end menu

@node Issues in Bulk Transfer, Options common to TCP UDP and SCTP tests, Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Bulk Data Transfer
@comment node-name, next, previous, up
@section Issues in Bulk Transfer

There are any number of things which can affect the performance of a bulk transfer test.

Certainly, absent compression, bulk-transfer tests can be limited by the speed of the slowest link in the path from the source to the destination. If testing over a gigabit link, you will not see more than a gigabit :) Such situations can be described as being @dfn{network-limited} or @dfn{NIC-limited}.

CPU utilization can also affect the results of a bulk-transfer test. If the networking stack requires a certain number of instructions or CPU cycles per KB of data transferred, and the CPU is limited in the number of instructions or cycles it can provide, then the transfer can be described as being @dfn{CPU-bound}.

A bulk-transfer test can be CPU bound even when netperf reports less than 100% CPU utilization. This can happen on an MP system where one or more of the CPUs saturate at 100% but other CPUs remain idle.
Typically, a single flow of data, such as that from a single instance of a netperf _STREAM test, cannot make use of much more than the power of one CPU. Exceptions to this generally occur when netperf and/or netserver run on CPU(s) other than the CPU(s) taking interrupts from the NIC(s). In that case, one might see as much as two CPUs' worth of processing being used to service the flow of data.

Distance and the speed-of-light can affect performance for a bulk-transfer; often this can be mitigated by using larger windows. One common limit to the performance of a transport using window-based flow-control is:
@example
Throughput <= WindowSize/RoundTripTime
@end example
as the sender can only have a window's worth of data outstanding on the network at any one time, and the soonest the sender can receive a window update from the receiver is one RoundTripTime (RTT). TCP and SCTP are examples of such protocols.

Packet losses and their effects can be particularly bad for performance. This is especially true if the packet losses result in retransmission timeouts for the protocol(s) involved. By the time a retransmission timeout has happened, the flow or connection has sat idle for a considerable length of time.

On many platforms, some variant on the @command{netstat} command can be used to retrieve statistics about packet loss and retransmission. For example:
@example
netstat -p tcp
@end example
will retrieve TCP statistics on the HP-UX Operating System. On other platforms, it may not be possible to retrieve statistics for a specific protocol and something like:
@example
netstat -s
@end example
would be used instead.

Many times, such network statistics are kept from the time the stack started, and we are only really interested in statistics from when netperf was running.
In such situations something along the lines of:
@example
netstat -p tcp > before
netperf -t TCP_mumble...
netstat -p tcp > after
@end example
is indicated. The @uref{ftp://ftp.cup.hp.com/dist/networking/tools/,beforeafter} utility can be used to subtract the statistics in @file{before} from the statistics in @file{after}:
@example
beforeafter before after > delta
@end example
and then one can look at the statistics in @file{delta}. Beforeafter is distributed in source form so one can compile it on the platform(s) of interest.

If running a version 2.5.0 or later ``omni'' test under Linux one can include either or both of:
@itemize
@item LOCAL_TRANSPORT_RETRANS
@item REMOTE_TRANSPORT_RETRANS
@end itemize

in the values provided via a test-specific @option{-o}, @option{-O}, or @option{-k} output selection option and netperf will report the retransmissions experienced on the data connection, as reported via a @code{getsockopt(TCP_INFO)} call. If confidence intervals have been requested via the global @option{-I} or @option{-i} options, the reported value(s) will be for the last iteration. If the test is over a protocol other than TCP, or on a platform other than Linux, the results are undefined.

While it was written with HP-UX's netstat in mind, the @uref{ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_netstat.txt,annotated netstat} writeup may be helpful with other platforms as well.

@node Options common to TCP UDP and SCTP tests, , Issues in Bulk Transfer, Using Netperf to Measure Bulk Data Transfer
@comment node-name, next, previous, up
@section Options common to TCP UDP and SCTP tests

Many ``test-specific'' options are actually common across the different tests.
For those tests involving TCP, UDP and SCTP, whether
using the BSD Sockets or the XTI interface, those common options
include:

@table @code
@vindex -h, Test-specific
@item -h
Display the test-suite-specific usage string and exit. For a TCP_ or
UDP_ test this will be the usage string from the source file
@file{nettest_bsd.c}. For an XTI_ test, this will be the usage string
from the source file @file{nettest_xti.c}. For an SCTP test, this
will be the usage string from the source file @file{nettest_sctp.c}.

@vindex -H, Test-specific
@item -H <optionspec>
Normally, the remote hostname|IP and address family information is
inherited from the settings for the control connection (eg global
command-line @option{-H}, @option{-4} and/or @option{-6} options).
The test-specific @option{-H} will override those settings for the
data (aka test) connection only. Settings for the control connection
are left unchanged.

@vindex -L, Test-specific
@item -L <optionspec>
The test-specific @option{-L} option is identical to the test-specific
@option{-H} option except it affects the local hostname|IP and address
family information. As with its global command-line counterpart, this
is generally only useful when measuring through those evil, end-to-end
breaking things called firewalls.

@vindex -m, Test-specific
@item -m bytes
Set the size of the buffer passed-in to the ``send'' calls of a
_STREAM test. Note that this may have only an indirect effect on the
size of the packets sent over the network, and certain Layer 4
protocols do _not_ preserve or enforce message boundaries, so setting
@option{-m} for the send size does not necessarily mean the receiver
will receive that many bytes at any one time. By default the units are
bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units to
be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.
A suffix of ``g,''
``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
respectively. For example:
@example
@code{-m 32K}
@end example
will set the size to 32KB or 32768 bytes. [Default: the local send
socket buffer size for the connection - either the system's default or
the value set via the @option{-s} option.]

@vindex -M, Test-specific
@item -M bytes
Set the size of the buffer passed-in to the ``recv'' calls of a
_STREAM test. This will be an upper bound on the number of bytes
received per receive call. By default the units are bytes, but a
suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes respectively.
For example:
@example
@code{-M 32K}
@end example
will set the size to 32KB or 32768 bytes. [Default: the remote receive
socket buffer size for the data connection - either the system's
default or the value set via the @option{-S} option.]

@vindex -P, Test-specific
@item -P <optionspec>
Set the local and/or remote port numbers for the data connection.

@vindex -s, Test-specific
@item -s <sizespec>
This option sets the local (netperf) send and receive socket buffer
sizes for the data connection to the value(s) specified. Often, this
will affect the advertised and/or effective TCP or other window, but
on some platforms it may not. By default the units are bytes, but a
suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
respectively. For example:
@example
@code{-s 128K}
@end example
will request the local send and receive socket buffer sizes to be
128KB or 131072 bytes.

While the historic expectation is that setting the socket buffer size
has a direct effect on say the TCP window, today that may not hold
true for all stacks. Further, while the historic expectation is that
the value specified in a @code{setsockopt()} call will be the value
returned via a @code{getsockopt()} call, at least one stack is known
to deliberately ignore history. When running under Windows a value of
0 may be used which will be an indication to the stack the user wants
to enable a form of copy avoidance. [Default: -1 - use the system's
default socket buffer sizes]

@vindex -S, Test-specific
@item -S <sizespec>
This option sets the remote (netserver) send and/or receive socket
buffer sizes for the data connection to the value(s) specified.
Often, this will affect the advertised and/or effective TCP or other
window, but on some platforms it may not. By default the units are
bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units
to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of
``g,'' ``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
respectively. For example:
@example
@code{-S 128K}
@end example
will request the remote send and receive socket buffer sizes to be
128KB or 131072 bytes.

While the historic expectation is that setting the socket buffer size
has a direct effect on say the TCP window, today that may not hold
true for all stacks. Further, while the historic expectation is that
the value specified in a @code{setsockopt()} call will be the value
returned via a @code{getsockopt()} call, at least one stack is known
to deliberately ignore history. When running under Windows a value of
0 may be used which will be an indication to the stack the user wants
to enable a form of copy avoidance.
[Default: -1 - use the system's default socket
buffer sizes]

@vindex -4, Test-specific
@item -4
Set the local and remote address family for the data connection to
AF_INET - ie use IPv4 addressing only. Just as with their global
command-line counterparts, the last of the @option{-4}, @option{-6},
@option{-H} or @option{-L} options wins for their respective address
families.

@vindex -6, Test-specific
@item -6
This option is identical to its @option{-4} cousin, but requests IPv6
addresses for the local and remote ends of the data connection.

@end table


@menu
* TCP_STREAM::
* TCP_MAERTS::
* TCP_SENDFILE::
* UDP_STREAM::
* XTI_TCP_STREAM::
* XTI_UDP_STREAM::
* SCTP_STREAM::
* DLCO_STREAM::
* DLCL_STREAM::
* STREAM_STREAM::
* DG_STREAM::
@end menu

@node TCP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests, Options common to TCP UDP and SCTP tests
@subsection TCP_STREAM

The TCP_STREAM test is the default test in netperf. It is quite
simple, transferring some quantity of data from the system running
netperf to the system running netserver. While time spent
establishing the connection is not included in the throughput
calculation, time spent flushing the last of the data to the remote at
the end of the test is. This is how netperf knows that all the data
it sent was received by the remote. In addition to the @ref{Options
common to TCP UDP and SCTP tests,options common to STREAM tests}, the
following test-specific options can be included to possibly alter the
behavior of the test:

@table @code
@item -C
This option will set TCP_CORK mode on the data connection on those
systems where TCP_CORK is defined (typically Linux).
A full
description of TCP_CORK is beyond the scope of this manual, but in a
nutshell it forces sub-MSS sends to be buffered so every segment sent
is a full Maximum Segment Size (MSS) unless the application performs
an explicit flush operation or the connection is closed. At present
netperf does not perform any explicit flush operations. Setting
TCP_CORK may improve the bitrate of tests where the ``send size''
(@option{-m} option) is smaller than the MSS. It should also improve
(make smaller) the service demand.

The Linux tcp(7) manpage states that TCP_CORK cannot be used in
conjunction with TCP_NODELAY (set via the @option{-D} option), however
netperf does not validate command-line options to enforce that.

@item -D
This option will set TCP_NODELAY on the data connection on those
systems where TCP_NODELAY is defined. This disables something known
as the Nagle Algorithm, which is intended to make the segments TCP
sends as large as reasonably possible. Setting TCP_NODELAY for a
TCP_STREAM test should either have no effect when the send size
(@option{-m} option) is larger than the MSS, or will decrease reported
bitrate and increase service demand when the send size is smaller than
the MSS. This stems from TCP_NODELAY causing each sub-MSS send to be
its own TCP segment rather than being aggregated with other small
sends. This means more trips up and down the protocol stack per KB of
data transferred, which means greater CPU utilization.

If setting TCP_NODELAY with @option{-D} affects throughput and/or
service demand for tests where the send size (@option{-m}) is larger
than the MSS, it suggests the TCP/IP stack's implementation of the
Nagle Algorithm _may_ be broken, perhaps interpreting the Nagle
Algorithm on a segment-by-segment basis rather than the proper user
send by user send basis.
However, a better test of this can be
achieved with the @ref{TCP_RR} test.

@end table

Here is an example of a basic TCP_STREAM test, in this case from a
Debian Linux (2.6 kernel) system to an HP-UX 11iv2 (HP-UX 11.23)
system:

@example
$ netperf -H lag
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 32768  16384  16384    10.00      80.42
@end example

We see that the default receive socket buffer size for the receiver
(lag - HP-UX 11.23) is 32768 bytes, and the default socket send buffer
size for the sender (Debian 2.6 kernel) is 16384 bytes. However, Linux
does ``auto tuning'' of socket buffer and TCP window sizes, which
means the send socket buffer size may be different at the end of the
test than it was at the beginning. This is addressed in the @ref{The
Omni Tests,omni tests} added in version 2.5.0 and @ref{Omni Output
Selection,output selection}. Throughput is expressed as 10^6 (aka
Mega) bits per second, and the test ran for 10 seconds. IPv4
addresses (AF_INET) were used.

@node TCP_MAERTS, TCP_SENDFILE, TCP_STREAM, Options common to TCP UDP and SCTP tests
@comment node-name, next, previous, up
@subsection TCP_MAERTS

A TCP_MAERTS (MAERTS is STREAM backwards) test is ``just like'' a
@ref{TCP_STREAM} test except the data flows from the netserver to
netperf. The global command-line @option{-F} option is ignored for
this test type. The test-specific command-line @option{-C} option is
ignored for this test type.

Here is an example of a TCP_MAERTS test between the same two systems
as in the example for the @ref{TCP_STREAM} test.
This time we request
larger socket buffers with the @option{-s} and @option{-S} options:

@example
$ netperf -H lag -t TCP_MAERTS -- -s 128K -S 128K
TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

221184 131072  131072   10.03      81.14
@end example

Here we see that Linux, unlike HP-UX, may not return the same value
in a @code{getsockopt()} as was requested in the prior
@code{setsockopt()}.

This test is included more for benchmarking convenience than anything
else.

@node TCP_SENDFILE, UDP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests
@comment node-name, next, previous, up
@subsection TCP_SENDFILE

The TCP_SENDFILE test is ``just like'' a @ref{TCP_STREAM} test except
netperf uses the platform's @code{sendfile()} call instead of calling
@code{send()}. Often this results in a @dfn{zero-copy} operation
where data is sent directly from the filesystem buffer cache. This
_should_ result in lower CPU utilization and possibly higher
throughput. If it does not, then you may want to contact your
vendor(s) because they have a problem on their hands.

Zero-copy mechanisms may also alter the characteristics of the
packets passed to the NIC (the size and number of buffers per
packet). In many stacks, when a copy is performed, the stack can
``reserve'' space at the beginning of the destination buffer for
things like TCP, IP and Link headers. This leaves the packet
contained in a single buffer which can be easier to DMA to the NIC.
When no copy is performed, there is no opportunity to reserve space
for headers and so a packet will be contained in two or more buffers.

As of some time before version 2.5.0, the @ref{Global Options,global
@option{-F} option} is no longer required for this test. If it is not
specified, netperf will create a temporary file, which it will delete
at the end of the test. If the @option{-F} option is specified it
must reference a file of at least the size of the send ring
(@pxref{Global Options,the global @option{-W} option}) multiplied by
the send size (@pxref{Options common to TCP UDP and SCTP tests,the
test-specific @option{-m} option}). All other TCP-specific options
remain available and optional.

In this first example:
@example
$ netperf -H lag -F ../src/netperf -t TCP_SENDFILE -- -s 128K -S 128K
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
alloc_sendfile_buf_ring: specified file too small.
file must be larger than send_width * send_size
@end example

we see what happens when the file is too small. Here:

@example
$ netperf -H lag -F /boot/vmlinuz-2.6.8-1-686 -t TCP_SENDFILE -- -s 128K -S 128K
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

131072 221184  221184   10.02      81.83
@end example

we resolve that issue by selecting a larger file.


@node UDP_STREAM, XTI_TCP_STREAM, TCP_SENDFILE, Options common to TCP UDP and SCTP tests
@subsection UDP_STREAM

A UDP_STREAM test is similar to a @ref{TCP_STREAM} test except UDP is
used as the transport rather than TCP.

@cindex Limiting Bandwidth
A UDP_STREAM test has no end-to-end flow control - UDP provides none
and neither does netperf.
However, if you wish, you can configure
netperf with @code{--enable-intervals=yes} to enable the global
command-line @option{-b} and @option{-w} options to pace bursts of
traffic onto the network.

This has a number of implications.

The biggest of these implications is that the data which is sent
might not be received by the remote. For this reason, the output of a
UDP_STREAM test shows both the sending and receiving throughput. On
some platforms, it may be possible for the sending throughput to be
reported as a value greater than the maximum rate of the link. This
is common when the CPU(s) are faster than the network and there is no
@dfn{intra-stack} flow-control.

Here is an example of a UDP_STREAM test between two systems connected
by a 10 Gigabit Ethernet link:
@example
$ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 32768
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

124928   32768   10.00      105672      0     2770.20
135168           10.00      104844            2748.50

@end example

The first line of numbers shows statistics from the sending (netperf)
side. The second line of numbers is from the receiving (netserver)
side. In this case, 105672 - 104844 or 828 messages did not make it
all the way to the remote netserver process.
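
If netperf has been configured with @code{--enable-intervals=yes} as
described above, a paced variant of such a UDP_STREAM test might look
something like the following sketch. The burst size, interval and
send size shown are hypothetical values chosen for illustration, not
taken from the example above:

@example
# hypothetical pacing values - requires --enable-intervals=yes
$ netperf -t UDP_STREAM -H 192.168.2.125 -b 1 -w 10 -- -m 1472
@end example

Here @option{-b 1} asks for one send per burst and @option{-w 10}
asks for an interval of 10 milliseconds between bursts, pacing the
test at roughly 100 sends per second. A send size of 1472 bytes
keeps each message within a typical 1500-byte Ethernet MTU, avoiding
IP fragmentation.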

If the value of the @option{-m} option is larger than the local send
socket buffer size (@option{-s} option) netperf will likely abort with
an error message about how the send call failed:

@example
$ netperf -t UDP_STREAM -H 192.168.2.125
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
udp_send: data send error: Message too long
@end example

If the value of the @option{-m} option is larger than the remote
socket receive buffer, the reported receive throughput will likely be
zero as the remote UDP will discard the messages as being too large to
fit into the socket buffer.

@example
$ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 65000 -S 32768
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

124928   65000   10.00       53595      0     2786.99
 65536            10.00           0              0.00
@end example

The example above was between a pair of systems running a Linux
kernel. Notice that the remote Linux system returned a value larger
than that passed-in to the @option{-S} option. In fact, this value
was larger than the message size set with the @option{-m} option.
That the remote socket buffer size is reported as 65536 bytes would
suggest to any sane person that a message of 65000 bytes would fit,
but the socket isn't _really_ 65536 bytes, even though Linux is
telling us so. Go figure.

@node XTI_TCP_STREAM, XTI_UDP_STREAM, UDP_STREAM, Options common to TCP UDP and SCTP tests
@subsection XTI_TCP_STREAM

An XTI_TCP_STREAM test is simply a @ref{TCP_STREAM} test using the XTI
rather than BSD Sockets interface.
The test-specific @option{-X
<devspec>} option can be used to specify the name of the local and/or
remote XTI device files, which is required by the @code{t_open()} call
made by netperf XTI tests.

The XTI_TCP_STREAM test is only present if netperf was configured with
@code{--enable-xti=yes}. The remote netserver must have also been
configured with @code{--enable-xti=yes}.

@node XTI_UDP_STREAM, SCTP_STREAM, XTI_TCP_STREAM, Options common to TCP UDP and SCTP tests
@subsection XTI_UDP_STREAM

An XTI_UDP_STREAM test is simply a @ref{UDP_STREAM} test using the XTI
rather than BSD Sockets interface. The test-specific @option{-X
<devspec>} option can be used to specify the name of the local and/or
remote XTI device files, which is required by the @code{t_open()} call
made by netperf XTI tests.

The XTI_UDP_STREAM test is only present if netperf was configured with
@code{--enable-xti=yes}. The remote netserver must have also been
configured with @code{--enable-xti=yes}.

@node SCTP_STREAM, DLCO_STREAM, XTI_UDP_STREAM, Options common to TCP UDP and SCTP tests
@subsection SCTP_STREAM

An SCTP_STREAM test is essentially a @ref{TCP_STREAM} test using SCTP
rather than TCP. The @option{-D} option will set SCTP_NODELAY, which
is much like the TCP_NODELAY option for TCP. The @option{-C} option
is not applicable to an SCTP test as there is no corresponding
SCTP_CORK option. The author is still figuring-out what the
test-specific @option{-N} option does :)

The SCTP_STREAM test is only present if netperf was configured with
@code{--enable-sctp=yes}. The remote netserver must have also been
configured with @code{--enable-sctp=yes}.
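
Assuming both netperf and netserver were built with
@code{--enable-sctp=yes}, a basic SCTP_STREAM test with SCTP_NODELAY
set might be invoked along the following lines. The hostname and
send size here are hypothetical values for illustration:

@example
# hypothetical invocation - requires --enable-sctp=yes on both ends
$ netperf -H remotehost -t SCTP_STREAM -- -m 16K -D
@end example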

@node DLCO_STREAM, DLCL_STREAM, SCTP_STREAM, Options common to TCP UDP and SCTP tests
@subsection DLCO_STREAM

A DLPI Connection Oriented Stream (DLCO_STREAM) test is very similar
in concept to a @ref{TCP_STREAM} test. Both use reliable,
connection-oriented protocols. The DLPI test differs from the TCP
test in that its protocol operates only at the link-level and does not
include TCP-style segmentation and reassembly. This last difference
means that the value passed-in with the @option{-m} option must be
less than the interface MTU. Otherwise, the @option{-m} and
@option{-M} options are just like their TCP/UDP/SCTP counterparts.

Other DLPI-specific options include:

@table @code
@item -D <devspec>
This option is used to provide the fully-qualified names for the local
and/or remote DLPI device files. The syntax is otherwise identical to
that of a @dfn{sizespec}.
@item -p <ppaspec>
This option is used to specify the local and/or remote DLPI PPA(s).
The PPA is used to identify the interface over which traffic is to be
sent/received. The syntax of a @dfn{ppaspec} is otherwise the same as
a @dfn{sizespec}.
@item -s sap
This option specifies the 802.2 SAP for the test. A SAP is somewhat
like either the port field of a TCP or UDP header or the protocol
field of an IP header. The specified SAP should not conflict with any
other active SAPs on the specified PPA(s) (@option{-p} option).
@item -w <sizespec>
This option specifies the local send and receive window sizes in units
of frames on those platforms which support setting such things.
@item -W <sizespec>
This option specifies the remote send and receive window sizes in
units of frames on those platforms which support setting such things.
@end table

The DLCO_STREAM test is only present if netperf was configured with
@code{--enable-dlpi=yes}.
The remote netserver must have also been
configured with @code{--enable-dlpi=yes}.


@node DLCL_STREAM, STREAM_STREAM, DLCO_STREAM, Options common to TCP UDP and SCTP tests
@subsection DLCL_STREAM

A DLPI ConnectionLess Stream (DLCL_STREAM) test is analogous to a
@ref{UDP_STREAM} test in that both make use of unreliable/best-effort,
connection-less transports. The DLCL_STREAM test differs from the
@ref{UDP_STREAM} test in that the message size (@option{-m} option)
must always be less than the link MTU, as there is no IP-like
fragmentation and reassembly available and netperf does not presume to
provide one.

The test-specific command-line options for a DLCL_STREAM test are the
same as those for a @ref{DLCO_STREAM} test.

The DLCL_STREAM test is only present if netperf was configured with
@code{--enable-dlpi=yes}. The remote netserver must have also been
configured with @code{--enable-dlpi=yes}.

@node STREAM_STREAM, DG_STREAM, DLCL_STREAM, Options common to TCP UDP and SCTP tests
@comment node-name, next, previous, up
@subsection STREAM_STREAM

A Unix Domain Stream Socket Stream test (STREAM_STREAM) is similar in
concept to a @ref{TCP_STREAM} test, but using Unix Domain sockets. It
is, naturally, limited to intra-machine traffic. A STREAM_STREAM test
shares the @option{-m}, @option{-M}, @option{-s} and @option{-S}
options of the other _STREAM tests. In a STREAM_STREAM test the
@option{-p} option sets the directory in which the pipes will be
created rather than setting a port number. The default is to create
the pipes in the system-default directory used by the
@code{tempnam()} call.

The STREAM_STREAM test is only present if netperf was configured with
@code{--enable-unixdomain=yes}. The remote netserver must have also
been configured with @code{--enable-unixdomain=yes}.
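
Since a STREAM_STREAM test is limited to intra-machine traffic, both
netperf and netserver run on the same system. Assuming netperf was
built with @code{--enable-unixdomain=yes}, an invocation might look
something like the following sketch; the send size is a hypothetical
value chosen for illustration:

@example
# hypothetical invocation - requires --enable-unixdomain=yes
$ netserver
$ netperf -t STREAM_STREAM -- -m 4096
@end example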

@node DG_STREAM, , STREAM_STREAM, Options common to TCP UDP and SCTP tests
@comment node-name, next, previous, up
@subsection DG_STREAM

A Unix Domain Datagram Socket Stream test (DG_STREAM) is very much
like a @ref{TCP_STREAM} test except that message boundaries are
preserved. In this way, it may also be considered similar to certain
flavors of SCTP test which can also preserve message boundaries.

All the options of a @ref{STREAM_STREAM} test are applicable to a
DG_STREAM test.

The DG_STREAM test is only present if netperf was configured with
@code{--enable-unixdomain=yes}. The remote netserver must have also
been configured with @code{--enable-unixdomain=yes}.


@node Using Netperf to Measure Request/Response , Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bulk Data Transfer, Top
@chapter Using Netperf to Measure Request/Response

Request/response performance is often overlooked, yet it is just as
important as bulk-transfer performance. While things like larger
socket buffers and TCP windows, and stateless offloads like TSO and
LRO can cover a multitude of latency and even path-length sins, those
sins cannot easily hide from a request/response test. The convention
for a request/response test is to have an _RR suffix. There are
however a few ``request/response'' tests that have other suffixes.

A request/response test, particularly a synchronous, one transaction
at a time test such as those found by default in netperf, is
particularly sensitive to the path-length of the networking stack. An
_RR test can also uncover those platforms where the NICs are strapped
by default with overbearing interrupt avoidance settings in an
attempt to increase the bulk-transfer performance (or rather,
decrease the CPU utilization of a bulk-transfer test).
This sensitivity is most acute
for small request and response sizes, such as the single-byte default
for a netperf _RR test.

While a bulk-transfer test reports its results in units of bits or
bytes transferred per second, by default a mumble_RR test reports
transactions per second where a transaction is defined as the
completed exchange of a request and a response. One can invert the
transaction rate to arrive at the average round-trip latency. If one
is confident about the symmetry of the connection, the average one-way
latency can be taken as one-half the average round-trip latency. As of
version 2.5.0 (actually slightly before) netperf still does not do the
latter, but will do the former if one sets the verbosity to 2 for a
classic netperf test, or includes the appropriate @ref{Omni Output
Selectors,output selector} in an @ref{The Omni Tests,omni test}. It
will also allow the user to switch the throughput units from
transactions per second to bits or bytes per second with the global
@option{-f} option.

@menu
* Issues in Request/Response::
* Options Common to TCP UDP and SCTP _RR tests::
@end menu

@node Issues in Request/Response, Options Common to TCP UDP and SCTP _RR tests, Using Netperf to Measure Request/Response , Using Netperf to Measure Request/Response 
@comment node-name, next, previous, up
@section Issues in Request/Response

Most if not all the @ref{Issues in Bulk Transfer} apply to
request/response. The issue of round-trip latency is even more
important as netperf generally only has one transaction outstanding at
a time.

A single instance of a one transaction outstanding _RR test should
_never_ completely saturate the CPU of a system.
If testing between
otherwise evenly matched systems, the symmetric nature of a _RR test
with equal request and response sizes should result in equal CPU
loading on both systems. However, this may not hold true on MP
systems, particularly if netperf and netserver are bound to CPUs
differently via the global @option{-T} option.

For smaller request and response sizes packet loss is a bigger issue
as there is no opportunity for a @dfn{fast retransmit} or
retransmission prior to a retransmission timer expiring.

Virtualization may considerably increase the effective path length of
a networking stack. While this may not preclude achieving link-rate
on a comparatively slow link (eg 1 Gigabit Ethernet) on a _STREAM
test, it can show up as measurably fewer transactions per second on an
_RR test. However, this may still be masked by interrupt coalescing
in the NIC/driver.

Certain NICs have ways to minimize the number of interrupts sent to
the host. If these are strapped badly they can significantly reduce
the performance of something like a single-byte request/response test.
Such setups are distinguished by seriously low reported CPU
utilization and what seems like a low (even if in the thousands)
transaction per second rate. Also, if you run such an OS/driver
combination on faster or slower hardware and do not see a
corresponding change in the transaction rate, chances are good that
the driver is strapping the NIC with aggressive interrupt avoidance
settings. Good for bulk throughput, but bad for latency.

Some drivers may try to automagically adjust the interrupt avoidance
settings. If they are not terribly good at it, you will see
considerable run-to-run variation in reported transaction rates,
particularly if you ``mix-up'' _STREAM and _RR tests.
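
As a hypothetical worked example of converting a transaction rate
into latency, as described earlier in this chapter, suppose a
single-byte _RR test reports 5000 transactions per second:

@example
average round-trip latency  =  1 / 5000 sec  =  200 microseconds
average one-way latency    ~=  200 / 2       =  100 microseconds
@end example

The one-way figure is only meaningful if one is confident the path
between the two systems is symmetric.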


@node Options Common to TCP UDP and SCTP _RR tests, , Issues in Request/Response, Using Netperf to Measure Request/Response 
@comment node-name, next, previous, up
@section Options Common to TCP UDP and SCTP _RR tests

Many ``test-specific'' options are actually common across the
different tests. For those tests involving TCP, UDP and SCTP, whether
using the BSD Sockets or the XTI interface, those common options
include:

@table @code
@vindex -h, Test-specific
@item -h
Display the test-suite-specific usage string and exit. For a TCP_ or
UDP_ test this will be the usage string from the source file
@file{src/nettest_bsd.c}. For an XTI_ test, this will be the usage
string from the source file @file{src/nettest_xti.c}. For an SCTP
test, this will be the usage string from the source file
@file{src/nettest_sctp.c}.

@vindex -H, Test-specific
@item -H <optionspec>
Normally, the remote hostname|IP and address family information is
inherited from the settings for the control connection (eg global
command-line @option{-H}, @option{-4} and/or @option{-6} options).
The test-specific @option{-H} will override those settings for the
data (aka test) connection only. Settings for the control connection
are left unchanged. This might be used to cause the control and data
connections to take different paths through the network.

@vindex -L, Test-specific
@item -L <optionspec>
The test-specific @option{-L} option is identical to the test-specific
@option{-H} option except it affects the local hostname|IP and address
family information. As with its global command-line counterpart, this
is generally only useful when measuring through those evil, end-to-end
breaking things called firewalls.

@vindex -P, Test-specific
@item -P <optionspec>
Set the local and/or remote port numbers for the data connection.
@vindex -r, Test-specific
@item -r <sizespec>
This option sets the request (first value) and/or response (second
value) sizes for an _RR test. By default the units are bytes, but a
suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
respectively. For example:
@example
@code{-r 128,16K}
@end example
will set the request size to 128 bytes and the response size to 16 KB
or 16384 bytes. [Default: 1 - a single-byte request and response]

@vindex -s, Test-specific
@item -s <sizespec>
This option sets the local (netperf) send and receive socket buffer
sizes for the data connection to the value(s) specified. Often, this
will affect the advertised and/or effective TCP or other window, but
on some platforms it may not. By default the units are bytes, but a
suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
respectively. For example:
@example
@code{-s 128K}
@end example
will request the local (netperf) send and receive socket buffer sizes
to be 128 KB or 131072 bytes.

While the historic expectation is that setting the socket buffer size
has a direct effect on say the TCP window, today that may not hold
true for all stacks. When running under Windows a value of 0 may be
used, which will be an indication to the stack that the user wants to
enable a form of copy avoidance. [Default: -1 - use the system's
default socket buffer sizes]

@vindex -S, Test-specific
@item -S <sizespec>
This option sets the remote (netserver) send and/or receive socket
buffer sizes for the data connection to the value(s) specified.
Often, this will affect the advertised and/or effective TCP or other
window, but on some platforms it may not. By default the units are
bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units
to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of
``g,'' ``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
respectively. For example:
@example
@code{-S 128K}
@end example
will request the remote (netserver) send and receive socket buffer
sizes to be 128 KB or 131072 bytes.

While the historic expectation is that setting the socket buffer size
has a direct effect on say the TCP window, today that may not hold
true for all stacks. When running under Windows a value of 0 may be
used, which will be an indication to the stack that the user wants to
enable a form of copy avoidance. [Default: -1 - use the system's
default socket buffer sizes]

@vindex -4, Test-specific
@item -4
Set the local and remote address family for the data connection to
AF_INET - ie use IPv4 addressing only. Just as with their global
command-line counterparts, the last of the @option{-4}, @option{-6},
@option{-H} or @option{-L} options wins for their respective address
families.

@vindex -6, Test-specific
@item -6
This option is identical to its @option{-4} cousin, but requests IPv6
addresses for the local and remote ends of the data connection.
@end table

@menu
* TCP_RR::
* TCP_CC::
* TCP_CRR::
* UDP_RR::
* XTI_TCP_RR::
* XTI_TCP_CC::
* XTI_TCP_CRR::
* XTI_UDP_RR::
* DLCL_RR::
* DLCO_RR::
* SCTP_RR::
@end menu

@node TCP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests, Options Common to TCP UDP and SCTP _RR tests
@subsection TCP_RR
@cindex Measuring Latency
@cindex Latency, Request-Response

A TCP_RR (TCP Request/Response) test is requested by passing a value
of ``TCP_RR'' to the global @option{-t} command-line option. A TCP_RR
test can be thought-of as a user-space to user-space @code{ping} with
no think time - it is by default a synchronous, one transaction at a
time, request/response test.

The transaction rate is the number of complete transactions exchanged
divided by the length of time it took to perform those transactions.

If the two Systems Under Test are otherwise identical, a TCP_RR test
with the same request and response size should be symmetric - it
should not matter which way the test is run, and the CPU utilization
measured should be virtually the same on each system. If it is not,
it suggests that the CPU utilization mechanism being used may have
some, well, issues measuring CPU utilization completely and
accurately.

Time to establish the TCP connection is not counted in the result. If
you want connection setup overheads included, you should consider the
@ref{TCP_CC,TCP_CC} or @ref{TCP_CRR,TCP_CRR} tests.

If specifying the @option{-D} option to set TCP_NODELAY and disable
the Nagle Algorithm increases the transaction rate reported by a
TCP_RR test, it implies the stack(s) over which the TCP_RR test is
running have a broken implementation of the Nagle Algorithm. Likely
as not they are interpreting Nagle on a segment by segment basis
rather than a user send by user send basis.
You should contact your stack vendor(s) to report the problem to them.

Here is an example of two systems running a basic TCP_RR test over a
10 Gigabit Ethernet link:

@example
netperf -t TCP_RR -H 192.168.2.125
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    29150.15
16384  87380
@end example

In this example the request and response sizes were one byte, the
socket buffers were left at their defaults, and the test ran for all
of 10 seconds. The transaction per second rate was rather good for
the time :)

@node TCP_CC, TCP_CRR, TCP_RR, Options Common to TCP UDP and SCTP _RR tests
@subsection TCP_CC
@cindex Connection Latency
@cindex Latency, Connection Establishment

A TCP_CC (TCP Connect/Close) test is requested by passing a value of
``TCP_CC'' to the global @option{-t} option. A TCP_CC test simply
measures how fast the pair of systems can open and close connections
between one another in a synchronous (one at a time) manner. While
this is considered an _RR test, no request or response is exchanged
over the connection.

@cindex Port Reuse
@cindex TIME_WAIT
The issue of TIME_WAIT reuse is an important one for a TCP_CC test.
Basically, TIME_WAIT reuse is when a pair of systems churn through
connections fast enough that they wrap the 16-bit port number space in
less time than the length of the TIME_WAIT state. While it is indeed
theoretically possible to ``reuse'' a connection in TIME_WAIT, the
conditions under which such reuse is possible are rather rare. An
attempt to reuse a connection in TIME_WAIT can result in a non-trivial
delay in connection establishment.
Basically, any time the connection churn rate approaches:

Sizeof(clientportspace) / Lengthof(TIME_WAIT)

there is the risk of TIME_WAIT reuse. To minimize the chances of this
happening, netperf will by default select its own client port numbers
from the range of 5000 to 65535. On systems with a 60 second
TIME_WAIT state, this should allow roughly 1000 transactions per
second. The size of the client port space used by netperf can be
controlled via the test-specific @option{-p} option, which takes a
@dfn{sizespec} as a value setting the minimum (first value) and
maximum (second value) port numbers used by netperf at the client end.

Since no requests or responses are exchanged during a TCP_CC test,
only the @option{-H}, @option{-L}, @option{-4} and @option{-6} of the
``common'' test-specific options are likely to have an effect, if any,
on the results. The @option{-s} and @option{-S} options _may_ have
some effect if they alter the number and/or type of options carried in
the TCP SYNchronize segments, such as Window Scaling or Timestamps.
The @option{-P} and @option{-r} options are utterly ignored.

Since connection establishment and tear-down for TCP is not symmetric,
a TCP_CC test is not symmetric in its loading of the two systems under
test.

@node TCP_CRR, UDP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests
@subsection TCP_CRR
@cindex Latency, Connection Establishment
@cindex Latency, Request-Response

The TCP Connect/Request/Response (TCP_CRR) test is requested by
passing a value of ``TCP_CRR'' to the global @option{-t} command-line
option. A TCP_CRR test is like a merger of a @ref{TCP_RR} and
@ref{TCP_CC} test which measures the performance of establishing a
connection, exchanging a single request/response transaction, and
tearing-down that connection.
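The ceiling implied by that ratio is easy to compute. A quick sketch
using netperf's default client port range of 5000 to 65535 and an
assumed 60 second TIME_WAIT state (the length of TIME_WAIT varies by
stack):

```shell
# Rough ceiling on sustainable connection churn before TIME_WAIT
# reuse becomes a risk: sizeof(clientportspace) / lengthof(TIME_WAIT)
port_min=5000
port_max=65535
time_wait=60   # seconds; an assumed, stack-dependent value
echo $(( (port_max - port_min) / time_wait )) connections/sec
```

which works out to just over 1000 connections per second, matching the
rough figure above.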
This is very much like what happens in an HTTP 1.0 or HTTP 1.1
connection when HTTP Keepalives are not used. In fact, the TCP_CRR
test was added to netperf to simulate just that.

Since a request and response are exchanged, the @option{-r},
@option{-s} and @option{-S} options can have an effect on the
performance.

The issue of TIME_WAIT reuse exists for the TCP_CRR test just as it
does for the TCP_CC test. Similarly, since connection establishment
and tear-down is not symmetric, a TCP_CRR test is not symmetric even
when the request and response sizes are the same.

@node UDP_RR, XTI_TCP_RR, TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
@subsection UDP_RR
@cindex Latency, Request-Response
@cindex Packet Loss

A UDP Request/Response (UDP_RR) test is requested by passing a value
of ``UDP_RR'' to the global @option{-t} option. It is very much the
same as a TCP_RR test except UDP is used rather than TCP.

UDP does not provide for retransmission of lost UDP datagrams, and
netperf does not add anything for that either. This means that if
_any_ request or response is lost, the exchange of requests and
responses will stop from that point until the test timer expires.
Netperf will not really ``know'' this has happened - the only symptom
will be a low transaction per second rate. If @option{--enable-burst}
was included in the @code{configure} command and a test-specific
@option{-b} option used, the UDP_RR test will ``survive'' the loss of
requests and responses until the sum is one more than the value passed
via the @option{-b} option. It will though almost certainly run more
slowly.

The netperf side of a UDP_RR test will call @code{connect()} on its
data socket and thenceforth use the @code{send()} and @code{recv()}
socket calls.
The netserver side of a UDP_RR test will not call
@code{connect()} and will use @code{recvfrom()} and @code{sendto()}
calls. This means that even if the request and response sizes are the
same, a UDP_RR test is _not_ symmetric in its loading of the two
systems under test.

Here is an example of a UDP_RR test between two otherwise
identical two-CPU systems joined via a 1 Gigabit Ethernet network:

@example
$ netperf -T 1 -H 192.168.1.213 -t UDP_RR -c -C
UDP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.213 (192.168.1.213) port 0 AF_INET
Local /Remote
Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
Send   Recv   Size    Size   Time    Rate     local  remote local   remote
bytes  bytes  bytes   bytes  secs.   per sec  % I    % I    us/Tr   us/Tr

65535  65535  1       1      10.01   15262.48 13.90  16.11  18.221  21.116
65535  65535
@end example

This example includes the @option{-c} and @option{-C} options to
enable CPU utilization reporting and shows the asymmetry in CPU
loading. The @option{-T} option was used to make sure netperf and
netserver ran on a given CPU and did not move around during the test.

@node XTI_TCP_RR, XTI_TCP_CC, UDP_RR, Options Common to TCP UDP and SCTP _RR tests
@subsection XTI_TCP_RR
@cindex Latency, Request-Response

An XTI_TCP_RR test is essentially the same as a @ref{TCP_RR} test only
using the XTI rather than BSD Sockets interface. It is requested by
passing a value of ``XTI_TCP_RR'' to the @option{-t} global
command-line option.

The test-specific options for an XTI_TCP_RR test are the same as those
for a TCP_RR test with the addition of the @option{-X <devspec>} option to
specify the names of the local and/or remote XTI device file(s).
@node XTI_TCP_CC, XTI_TCP_CRR, XTI_TCP_RR, Options Common to TCP UDP and SCTP _RR tests
@comment  node-name,  next,  previous,  up
@subsection XTI_TCP_CC
@cindex Latency, Connection Establishment

An XTI_TCP_CC test is essentially the same as a @ref{TCP_CC,TCP_CC}
test, only using the XTI rather than BSD Sockets interface.

The test-specific options for an XTI_TCP_CC test are the same as those
for a TCP_CC test with the addition of the @option{-X <devspec>} option to
specify the names of the local and/or remote XTI device file(s).

@node XTI_TCP_CRR, XTI_UDP_RR, XTI_TCP_CC, Options Common to TCP UDP and SCTP _RR tests
@comment  node-name,  next,  previous,  up
@subsection XTI_TCP_CRR
@cindex Latency, Connection Establishment
@cindex Latency, Request-Response

The XTI_TCP_CRR test is essentially the same as a
@ref{TCP_CRR,TCP_CRR} test, only using the XTI rather than BSD Sockets
interface.

The test-specific options for an XTI_TCP_CRR test are the same as those
for a TCP_CRR test with the addition of the @option{-X <devspec>} option to
specify the names of the local and/or remote XTI device file(s).

@node XTI_UDP_RR, DLCL_RR, XTI_TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
@subsection XTI_UDP_RR
@cindex Latency, Request-Response

An XTI_UDP_RR test is essentially the same as a UDP_RR test only using
the XTI rather than BSD Sockets interface. It is requested by passing
a value of ``XTI_UDP_RR'' to the @option{-t} global command-line
option.

The test-specific options for an XTI_UDP_RR test are the same as those
for a UDP_RR test with the addition of the @option{-X <devspec>}
option to specify the name of the local and/or remote XTI device
file(s).
@node DLCL_RR, DLCO_RR, XTI_UDP_RR, Options Common to TCP UDP and SCTP _RR tests
@comment  node-name,  next,  previous,  up
@subsection DLCL_RR
@cindex Latency, Request-Response

@node DLCO_RR, SCTP_RR, DLCL_RR, Options Common to TCP UDP and SCTP _RR tests
@comment  node-name,  next,  previous,  up
@subsection DLCO_RR
@cindex Latency, Request-Response

@node SCTP_RR, , DLCO_RR, Options Common to TCP UDP and SCTP _RR tests
@comment  node-name,  next,  previous,  up
@subsection SCTP_RR
@cindex Latency, Request-Response

@node Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Request/Response , Top
@comment  node-name,  next,  previous,  up
@chapter Using Netperf to Measure Aggregate Performance
@cindex Aggregate Performance
@vindex --enable-burst, Configure

Ultimately, @ref{Netperf4,Netperf4} will be the preferred benchmark to
use when one wants to measure aggregate performance because netperf
has no support for explicit synchronization of concurrent tests. Until
netperf4 is ready for prime time, one can make use of the heuristics
and procedures mentioned here for the 85% solution.

There are a few ways to measure aggregate performance with netperf.
The first is to run multiple, concurrent netperf tests; this can be
applied to any of the netperf tests. The second is to configure
netperf with @code{--enable-burst} and is applicable to the TCP_RR
test. The third is a variation on the first.
@menu
* Running Concurrent Netperf Tests::
* Using --enable-burst::
* Using --enable-demo::
@end menu

@node Running Concurrent Netperf Tests, Using --enable-burst, Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Aggregate Performance
@comment  node-name,  next,  previous,  up
@section Running Concurrent Netperf Tests

@ref{Netperf4,Netperf4} is the preferred benchmark to use when one
wants to measure aggregate performance because netperf has no support
for explicit synchronization of concurrent tests. This leaves
netperf2 results vulnerable to @dfn{skew} errors.

However, since there are times when netperf4 is unavailable it may be
necessary to run netperf. The skew error can be minimized by making
use of the confidence interval functionality. Then one simply
launches multiple tests from the shell using a @code{for} loop or the
like:

@example
for i in 1 2 3 4
do
netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 &
done
@end example

which will run four, concurrent @ref{TCP_STREAM,TCP_STREAM} tests from
the system on which it is executed to tardy.cup.hp.com. Each
concurrent netperf will iterate 10 times thanks to the @option{-i}
option and will omit the test banners (option @option{-P}) for
brevity. The output looks something like this:

@example
 87380  16384  16384    10.03       235.15
 87380  16384  16384    10.03       235.09
 87380  16384  16384    10.03       235.38
 87380  16384  16384    10.03       233.96
@end example

We can take the sum of the results and be reasonably confident that
the aggregate performance was 940 Mbits/s. This method does not need
to be limited to one system speaking to one other system. It can be
extended to one system talking to N other systems.
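Summing the individual results is easily scripted. A small sketch
using the four throughputs from the output above:

```shell
# Sum the per-instance TCP_STREAM throughputs (10^6 bits/s) to get
# the aggregate. The figures are taken from the example output above.
printf '%s\n' 235.15 235.09 235.38 233.96 |
awk '{ sum += $1 } END { printf "aggregate: %.2f 10^6bits/s\n", sum }'
```

In practice one would pipe the collected netperf output through the
same sort of awk one-liner, summing the throughput column.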
It could be as simple as:
@example
for host in foo bar baz bing
do
netperf -t TCP_STREAM -H $host -i 10 -P 0 &
done
@end example
A more complicated/sophisticated example can be found in
@file{doc/examples/runemomniagg2.sh}.

If you see warnings about netperf not achieving the confidence
intervals, the best thing to do is to increase the number of
iterations with @option{-i} and/or increase the run length of each
iteration with @option{-l}.

You can also enable local (@option{-c}) and/or remote (@option{-C})
CPU utilization:

@example
for i in 1 2 3 4
do
netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 -c -C &
done

87380  16384  16384    10.03       235.47   3.67     5.09     10.226  14.180
87380  16384  16384    10.03       234.73   3.67     5.09     10.260  14.225
87380  16384  16384    10.03       234.64   3.67     5.10     10.263  14.231
87380  16384  16384    10.03       234.87   3.67     5.09     10.253  14.215
@end example

If the CPU utilizations reported for the same system are the same or
very, very close you can be reasonably confident that skew error is
minimized. Presumably one could then omit @option{-i} but that is
not advised, particularly when/if the CPU utilization approaches 100
percent. In the example above we see that the CPU utilization on the
local system remains the same for all four tests, and is only off by
0.01 out of 5.09 on the remote system. As the number of CPUs in the
system increases, and so too the odds of saturating a single CPU, the
accuracy of similar CPU utilization implying little skew error is
diminished. This is also the case for those increasingly rare single
CPU systems if the utilization is reported as 100% or very close to
it.

@quotation
@b{NOTE: It is very important to remember that netperf is calculating
system-wide CPU utilization.
When calculating the service demand
(those last two columns in the output above) each netperf assumes it
is the only thing running on the system. This means that for
concurrent tests the service demands reported by netperf will be
wrong. One has to compute service demands for concurrent tests by
hand.}
@end quotation

If you wish you can add a unique, global @option{-B} option to each
command line to append the given string to the output:

@example
for i in 1 2 3 4
do
netperf -t TCP_STREAM -H tardy.cup.hp.com -B "this is test $i" -i 10 -P 0 &
done

87380  16384  16384    10.03       234.90   this is test 4
87380  16384  16384    10.03       234.41   this is test 2
87380  16384  16384    10.03       235.26   this is test 1
87380  16384  16384    10.03       235.09   this is test 3
@end example

You will notice that the tests completed in an order other than they
were started from the shell. This underscores why there is a threat
of skew error and why netperf4 will eventually be the preferred tool
for aggregate tests. Even if you see the Netperf Contributing Editor
acting to the contrary!-)

@menu
* Issues in Running Concurrent Tests::
@end menu

@node Issues in Running Concurrent Tests,  , Running Concurrent Netperf Tests, Running Concurrent Netperf Tests
@subsection Issues in Running Concurrent Tests

In addition to the aforementioned issue of skew error, there can be
other issues to consider when running concurrent netperf tests.

For example, when running concurrent tests over multiple interfaces,
one is not always assured that the traffic one thinks went over a
given interface actually did so. In particular, the Linux networking
stack takes a particularly strong stance on its following the so
called @samp{weak end system model}. As such, it is willing to answer
ARP requests for any of its local IP addresses on any of its
interfaces.
If multiple interfaces are connected to the same
broadcast domain, then even if they are configured into separate IP
subnets there is no a priori way of knowing which interface was
actually used for which connection(s). This can be addressed by
setting the @samp{arp_ignore} sysctl before configuring interfaces.

As it is quite important, we will repeat that it is very important to
remember that each concurrent netperf instance is calculating
system-wide CPU utilization. When calculating the service demand each
netperf assumes it is the only thing running on the system. This
means that for concurrent tests the service demands reported by
netperf @b{will be wrong}. One has to compute service demands for
concurrent tests by hand.

Running concurrent tests can also become difficult when there is no
one ``central'' node. Running tests between pairs of systems may be
more difficult, calling for remote shell commands in the for loop
rather than netperf commands. This introduces more skew error, which
the confidence intervals may not be able to sufficiently mitigate.
One possibility is to actually run three consecutive netperf tests on
each node - the first being a warm-up, the last being a cool-down.
The idea then is to ensure that the time it takes to get all the
netperfs started is less than the length of the first netperf command
in the sequence of three. Similarly, it assumes that all ``middle''
netperfs will complete before the first of the ``last'' netperfs
complete.
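One way to do that by-hand computation: each netperf charged the
entire system-wide CPU time against its own throughput alone, so for
N roughly equal concurrent streams the corrected service demand is the
reported figure scaled by that stream's share of the aggregate
throughput. A sketch of the arithmetic, using the reported local
service demand and the four throughputs from the concurrent
@option{-c}/@option{-C} example earlier in this section (illustrative
only; it assumes the streams are symmetric):

```shell
# Correct a reported service demand for concurrent tests: scale by
# the share of the aggregate throughput this stream represents.
reported_sdem=10.226   # us/KB, reported by one of the four instances
awk -v sdem="$reported_sdem" 'BEGIN {
    own   = 235.47                              # this instance, 10^6bits/s
    total = 235.47 + 234.73 + 234.64 + 234.87   # all four instances
    printf "corrected: %.3f us/KB\n", sdem * own / total
}'
```

For four symmetric streams this works out to roughly one quarter of
the reported value, as one would expect.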
@node Using --enable-burst, Using --enable-demo, Running Concurrent Netperf Tests, Using Netperf to Measure Aggregate Performance
@comment  node-name,  next,  previous,  up
@section Using --enable-burst

Starting in version 2.5.0 @code{--enable-burst=yes} is the default,
which means one no longer must:

@example
configure --enable-burst
@end example

to have burst-mode functionality present in netperf. This enables a
test-specific @option{-b num} option in @ref{TCP_RR,TCP_RR},
@ref{UDP_RR,UDP_RR} and @ref{The Omni Tests,omni} tests.

Normally, netperf will attempt to ramp-up the number of outstanding
requests to @option{num} plus one transactions in flight at one time.
The ramp-up is to avoid transactions being smashed together into a
smaller number of segments when the transport's congestion window (if
any) is smaller at the time than what netperf wants to have
outstanding at one time. If, however, the user specifies a negative
value for @option{num} this ramp-up is bypassed and the burst of sends
is made without consideration of transport congestion window.

This burst-mode is used as an alternative to or even in conjunction
with multiple-concurrent _RR tests and as a way to implement a
single-connection, bidirectional bulk-transfer test.
When run with
just a single instance of netperf, increasing the burst size can
determine the maximum number of transactions per second which can be
serviced by a single process:

@example
for b in 0 1 2 4 8 16 32
do
 netperf -v 0 -t TCP_RR -B "-b $b" -H hpcpc108 -P 0 -- -b $b
done

9457.59 -b 0
9975.37 -b 1
10000.61 -b 2
20084.47 -b 4
29965.31 -b 8
71929.27 -b 16
109718.17 -b 32
@end example

The global @option{-v} and @option{-P} options were used to minimize
the output to the single figure of merit, which in this case is the
transaction rate. The global @option{-B} option was used to more
clearly label the output, and the test-specific @option{-b} option
enabled by @code{--enable-burst} was used to increase the number of
transactions in flight at one time.

Now, since the test-specific @option{-D} option was not specified to
set TCP_NODELAY, the stack was free to ``bundle'' requests and/or
responses into TCP segments as it saw fit, and since the default
request and response size is one byte, there could have been some
considerable bundling even in the absence of transport congestion
window issues. If one wants to try to achieve a closer to
one-to-one correspondence between a request and response and a TCP
segment, add the test-specific @option{-D} option:

@example
for b in 0 1 2 4 8 16 32
do
 netperf -v 0 -t TCP_RR -B "-b $b -D" -H hpcpc108 -P 0 -- -b $b -D
done

 8695.12 -b 0 -D
 19966.48 -b 1 -D
 20691.07 -b 2 -D
 49893.58 -b 4 -D
 62057.31 -b 8 -D
 108416.88 -b 16 -D
 114411.66 -b 32 -D
@end example

You can see that this has a rather large effect on the reported
transaction rate. In this particular instance, the author believes it
relates to interactions between the test and interrupt coalescing
settings in the driver for the NICs used.
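Since a @option{-b num} test keeps roughly num plus one transactions
in flight, the implied time per transaction at each burst level is
(num + 1) / rate. A small sketch over the first sweep above
(illustrative arithmetic only; it assumes netperf actually sustained
the full num plus one outstanding transactions, which the ramp-up does
not guarantee):

```shell
# Implied per-transaction time at each burst level:
#   latency_us = (b + 1) * 1e6 / tps
printf '%s\n' '0 9457.59' '1 9975.37' '2 10000.61' '4 20084.47' \
              '8 29965.31' '16 71929.27' '32 109718.17' |
awk '{ printf "-b %-2d  %8.1f usec/transaction\n", $1, ($1 + 1) * 1e6 / $2 }'
```

If the implied per-transaction time stays roughly constant as the
burst size grows, added concurrency is being absorbed without adding
latency; when it starts to climb, the path is saturating.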
@quotation
@b{NOTE: Even if you set the @option{-D} option that is still not a
guarantee that each transaction is in its own TCP segment. You
should get into the habit of verifying the relationship between the
transaction rate and the packet rate via other means.}
@end quotation

You can also combine @code{--enable-burst} functionality with
concurrent netperf tests. This would then be an ``aggregate of
aggregates'' if you like:

@example

for i in 1 2 3 4
do
 netperf -H hpcpc108 -v 0 -P 0 -i 10 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done

 46668.38 aggregate 4 -b 8 -D
 44890.64 aggregate 2 -b 8 -D
 45702.04 aggregate 1 -b 8 -D
 46352.48 aggregate 3 -b 8 -D

@end example

Since each netperf did hit the confidence intervals, we can be
reasonably certain that the aggregate transaction per second rate was
the sum of all four concurrent tests, or something just shy of 184,000
transactions per second. To get some idea if that was also the packet
per second rate, we could bracket that @code{for} loop with something
to gather statistics and run the results through
@uref{ftp://ftp.cup.hp.com/dist/networking/tools,beforeafter}:

@example
/usr/sbin/ethtool -S eth2 > before
for i in 1 2 3 4
do
 netperf -H 192.168.2.108 -l 60 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done
wait
/usr/sbin/ethtool -S eth2 > after

 52312.62 aggregate 2 -b 8 -D
 50105.65 aggregate 4 -b 8 -D
 50890.82 aggregate 1 -b 8 -D
 50869.20 aggregate 3 -b 8 -D

beforeafter before after > delta

grep packets delta
     rx_packets: 12251544
     tx_packets: 12251550

@end example

This example uses @code{ethtool} because the system being used is
running Linux.
Other platforms have other tools - for example HP-UX
has lanadmin:

@example
lanadmin -g mibstats <ppa>
@end example

and of course one could instead use @code{netstat}.

The @code{wait} is important because we are launching concurrent
netperfs in the background. Without it, the second ethtool command
would be run before the tests finished and perhaps even before the
last of them got started!

The sum of the reported transaction rates is 204178 over 60 seconds,
which is a total of 12250680 transactions. Each transaction is the
exchange of a request and a response, so we multiply that by 2 to
arrive at 24501360.

The sum of the ethtool stats is 24503094 packets which matches what
netperf was reporting very well.

Had the request or response size differed, we would need to know how
it compared with the @dfn{MSS} for the connection.

Just for grins, here is the exercise repeated, using @code{netstat}
instead of @code{ethtool}:

@example
netstat -s -t > before
for i in 1 2 3 4
do
 netperf -l 60 -H 192.168.2.108 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done
wait
netstat -s -t > after

 51305.88 aggregate 4 -b 8 -D
 51847.73 aggregate 2 -b 8 -D
 50648.19 aggregate 3 -b 8 -D
 53605.86 aggregate 1 -b 8 -D

beforeafter before after > delta

grep segments delta
    12445708 segments received
    12445730 segments send out
    1 segments retransmited
    0 bad segments received.
@end example

The sums are left as an exercise to the reader :)

Things become considerably more complicated if there are non-trivial
packet losses and/or retransmissions.

Of course all this checking is unnecessary if the test is a UDP_RR
test because UDP ``never'' aggregates multiple sends into the same UDP
datagram, and there are no ACKnowledgements in UDP.
The loss of a
single request or response will not bring a ``burst'' UDP_RR test to a
screeching halt, but it will reduce the number of transactions
outstanding at any one time. A ``burst'' UDP_RR test @b{will} come to a
halt if the sum of the lost requests and responses reaches the value
specified in the test-specific @option{-b} option.

@node Using --enable-demo,  , Using --enable-burst, Using Netperf to Measure Aggregate Performance
@section Using --enable-demo

One can
@example
configure --enable-demo
@end example
and compile netperf to enable netperf to emit ``interim results'' at
semi-regular intervals. This enables a global @code{-D} option which
takes a reporting interval as an argument. With that specified, the
output of netperf will then look something like

@example
$ src/netperf -D 1.25
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain () port 0 AF_INET : demo
Interim result: 25425.52 10^6bits/s over 1.25 seconds ending at 1327962078.405
Interim result: 25486.82 10^6bits/s over 1.25 seconds ending at 1327962079.655
Interim result: 25474.96 10^6bits/s over 1.25 seconds ending at 1327962080.905
Interim result: 25523.49 10^6bits/s over 1.25 seconds ending at 1327962082.155
Interim result: 25053.57 10^6bits/s over 1.27 seconds ending at 1327962083.429
Interim result: 25349.64 10^6bits/s over 1.25 seconds ending at 1327962084.679
Interim result: 25292.84 10^6bits/s over 1.25 seconds ending at 1327962085.932
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    25375.66
@end example
The units of the ``Interim result'' lines will follow the units
selected via the global @code{-f} option.
If the test-specific
@code{-o} option is specified on the command line, the format will be
CSV:
@example
...
2978.81,MBytes/s,1.25,1327962298.035
...
@end example

If the test-specific @code{-k} option is used the format will be
keyval with each keyval being given an index:
@example
...
NETPERF_INTERIM_RESULT[2]=25.00
NETPERF_UNITS[2]=10^9bits/s
NETPERF_INTERVAL[2]=1.25
NETPERF_ENDING[2]=1327962357.249
...
@end example
The expectation is that it may be easier to utilize the keyvals if
they have indices.

But how does this help with aggregate tests?  Well, what one can do is
start the netperfs via a script, giving each a Very Long (tm) run
time.  Direct the output to a file per instance.  Then, once all the
netperfs have been started, take a timestamp and wait for some desired
test interval.  Once that interval expires take another timestamp and
then start terminating the netperfs by sending them a SIGALRM signal
via the likes of the @code{kill} or @code{pkill} command.  The
netperfs will terminate and emit the rest of the ``usual'' output, and
you can then bring the files to a central location for post
processing to find the aggregate performance over the ``test interval.''

This method has the advantage that it does not require advance
knowledge of how long it takes to get netperf tests started and/or
stopped.  It does though require sufficiently synchronized clocks on
all the test systems.

While calls to get the current time can be inexpensive, that neither
has been nor is universally true.  For that reason netperf tries to
minimize the number of such ``timestamping'' calls (eg
@code{gettimeofday}) it makes when in demo mode.  Rather than
take a timestamp after each @code{send} or @code{recv} call completes,
netperf tries to guess how many units of work will be performed over
the desired interval.
Only once that many units of work have been
completed will netperf check the time.  If the reporting interval has
passed, netperf will emit an ``interim result.''  If the interval has
not passed, netperf will update its estimate for units and continue.

After a bit of thought one can see that if things ``speed-up'' netperf
will still honor the interval.  However, if things ``slow-down''
netperf may be late with an ``interim result.''  Here is an example of
both of those happening during a test - with the interval being
honored while throughput increases, and then about half-way through
when another netperf (not shown) is started we see things slowing down
and netperf not hitting the interval as desired.
@example
$ src/netperf -D 2 -H tardy.hpl.hp.com -l 20
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com () port 0 AF_INET : demo
Interim result: 36.46 10^6bits/s over 2.01 seconds ending at 1327963880.565
Interim result: 59.19 10^6bits/s over 2.00 seconds ending at 1327963882.569
Interim result: 73.39 10^6bits/s over 2.01 seconds ending at 1327963884.576
Interim result: 84.01 10^6bits/s over 2.03 seconds ending at 1327963886.603
Interim result: 75.63 10^6bits/s over 2.21 seconds ending at 1327963888.814
Interim result: 55.52 10^6bits/s over 2.72 seconds ending at 1327963891.538
Interim result: 70.94 10^6bits/s over 2.11 seconds ending at 1327963893.650
Interim result: 80.66 10^6bits/s over 2.13 seconds ending at 1327963895.777
Interim result: 86.42 10^6bits/s over 2.12 seconds ending at 1327963897.901
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    20.34    68.87
@end example
So long as your post-processing mechanism can account for that, there
should be no problem.
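For example, if the ``Interim result'' lines are extracted into
throughput/interval pairs, a weighted average that accounts for the
varying interval lengths needs nothing more than @code{awk}.  A sketch
using the figures from the run above:

```shell
# Weight each interim result by its measured interval length to get
# the throughput over the window the interim results cover.
printf '%s\n' \
  "36.46 2.01" "59.19 2.00" "73.39 2.01" "84.01 2.03" "75.63 2.21" \
  "55.52 2.72" "70.94 2.11" "80.66 2.13" "86.42 2.12" |
awk '{ sum += $1 * $2; secs += $2 }
     END { printf "%.2f 10^6bits/s over %.2f seconds\n", sum/secs, secs }'
```

This reports 68.90 10^6bits/s over the 19.34 seconds covered by the
interim results, in good agreement with the 68.87 10^6bits/s netperf
itself reported over the full 20.34 seconds.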
As time passes there may be changes to try to
improve netperf's honoring of the interval, but one should not
ass-u-me it will always do so.  Nor should one assume the precision
will remain fixed - future versions may change it - perhaps going
beyond tenths of seconds in reporting the interval length etc.

@node Using Netperf to Measure Bidirectional Transfer, The Omni Tests, Using Netperf to Measure Aggregate Performance, Top
@comment  node-name,  next,  previous,  up
@chapter Using Netperf to Measure Bidirectional Transfer

There are two ways to use netperf to measure the performance of
bidirectional transfer.  The first is to run concurrent netperf tests
from the command line.  The second is to configure netperf with
@code{--enable-burst} and use a single instance of the
@ref{TCP_RR,TCP_RR} test.

While neither method is more ``correct'' than the other, each measures
bidirectional transfer in a different way, and that has possible
implications.  For instance, using the concurrent netperf test
mechanism means that multiple TCP connections and multiple processes
are involved, whereas with a single instance of TCP_RR there is only
one TCP connection and one process on each end.  They may behave
differently, especially on an MP system.
@menu
* Bidirectional Transfer with Concurrent Tests::
* Bidirectional Transfer with TCP_RR::
* Implications of Concurrent Tests vs Burst Request/Response::
@end menu

@node Bidirectional Transfer with Concurrent Tests, Bidirectional Transfer with TCP_RR, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Bidirectional Transfer
@comment  node-name,  next,  previous,  up
@section Bidirectional Transfer with Concurrent Tests

If we had two hosts Fred and Ethel, we could simply run a netperf
@ref{TCP_STREAM,TCP_STREAM} test on Fred pointing at Ethel, and a
concurrent netperf TCP_STREAM test on Ethel pointing at Fred, but
since there are no mechanisms to synchronize netperf tests and we
would be starting tests from two different systems, there is a
considerable risk of skew error.

Far better would be to run simultaneous TCP_STREAM and
@ref{TCP_MAERTS,TCP_MAERTS} tests from just @b{one} system, using the
concepts and procedures outlined in @ref{Running Concurrent Netperf
Tests,Running Concurrent Netperf Tests}.  Here then is an example:

@example
for i in 1
do
  netperf -H 192.168.2.108 -t TCP_STREAM -B "outbound" -i 10 -P 0 -v 0 \
    -- -s 256K -S 256K &
  netperf -H 192.168.2.108 -t TCP_MAERTS -B "inbound" -i 10 -P 0 -v 0 \
    -- -s 256K -S 256K &
done

 892.66 outbound
 891.34 inbound
@end example

We have used a @code{for} loop in the shell with just one iteration
because it is @b{much} easier to get both tests started at more or
less the same time that way than by hand.  The global @option{-P} and
@option{-v} options are used because we aren't interested in anything
other than the throughput, and the global @option{-B} option is used
to tag each output so we know which was inbound and which outbound
relative to the system on which we were running netperf.
Of course
that sense is switched on the system running netserver :)  The use of
the global @option{-i} option is explained in @ref{Running Concurrent
Netperf Tests,Running Concurrent Netperf Tests}.

Beginning with version 2.5.0 we can accomplish a similar result with
@ref{The Omni Tests,the omni tests} and @ref{Omni Output
Selectors,output selectors}:

@example
for i in 1
do
  netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
    -d stream -s 256K -S 256K -o throughput,direction &
  netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
    -d maerts -s 256K -S 256K -o throughput,direction &
done

805.26,Receive
828.54,Send
@end example

@node Bidirectional Transfer with TCP_RR, Implications of Concurrent Tests vs Burst Request/Response, Bidirectional Transfer with Concurrent Tests, Using Netperf to Measure Bidirectional Transfer
@comment  node-name,  next,  previous,  up
@section Bidirectional Transfer with TCP_RR

Starting with version 2.5.0 the @code{--enable-burst} configure option
defaults to @code{yes}, and starting some time before version 2.5.0
but after 2.4.0 the global @option{-f} option would affect the
``throughput'' reported by request/response tests.  If one uses the
test-specific @option{-b} option to have several ``transactions'' in
flight at one time and the test-specific @option{-r} option to
increase their size, the test looks more and more like a
single-connection bidirectional transfer than a simple
request/response test.

So, putting it all together one can do something like:

@example
netperf -f m -t TCP_RR -H 192.168.1.3 -v 2 -- -b 6 -r 32K -s 256K -S 256K
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.3 (192.168.1.3) port 0 AF_INET : interval : first burst 6
Local /Remote
Socket Size   Request  Resp.   Elapsed
Send   Recv   Size     Size    Time     Throughput
bytes  Bytes  bytes    bytes   secs.    10^6bits/sec

16384  87380  32768    32768   10.00    1821.30
524288 524288
Alignment      Offset         RoundTrip  Trans     Throughput
Local  Remote  Local  Remote  Latency    Rate      10^6bits/s
Send   Recv    Send   Recv    usec/Tran  per sec   Outbound   Inbound
    8      0      0      0    2015.402   3473.252  910.492    910.492
@end example

to get a bidirectional bulk-throughput result.  As one can see, the
-v 2 output will include a number of interesting, related values.  For
example, with the test-specific @option{-b} option set to 6 there are
seven transactions in flight at a time, and seven divided by the
round-trip latency of 2015.402 microseconds yields the 3473.252
transactions per second shown; that rate times the 32768 bytes
transferred in each direction accounts for the Outbound and Inbound
throughputs of 910.492 10^6bits/s apiece.

@quotation
@b{NOTE: The logic behind @code{--enable-burst} is very simple, and there
are no calls to @code{poll()} or @code{select()} which means we want
to make sure that the @code{send()} calls will never block, or we run
the risk of deadlock with each side stuck trying to call @code{send()}
and neither calling @code{recv()}.}
@end quotation

Fortunately, this is easily accomplished by setting a ``large enough''
socket buffer size with the test-specific @option{-s} and @option{-S}
options.  Presently this must be performed by the user.  Future
versions of netperf might attempt to do this automagically, but there
are some issues to be worked-out.

@node Implications of Concurrent Tests vs Burst Request/Response, , Bidirectional Transfer with TCP_RR, Using Netperf to Measure Bidirectional Transfer
@section Implications of Concurrent Tests vs Burst Request/Response

There are perhaps subtle but important differences between using
concurrent unidirectional tests and using a burst-mode
request/response test to measure bidirectional performance.

Broadly speaking, a single ``connection'' or ``flow'' of traffic
cannot make use of the services of more than one or two CPUs at either
end.
Whether one or two CPUs will be used processing a flow will
depend on the specifics of the stack(s) involved and whether or not
the global @option{-T} option has been used to bind netperf/netserver
to specific CPUs.

When using concurrent tests there will be two concurrent connections
or flows, which means that upwards of four CPUs will be employed
processing the packets when the global @option{-T} option is used (no
more than two if it is not).  With just a single, bidirectional
request/response test, however, no more than two CPUs will be employed
(only one if the global @option{-T} option is not used).

If there is a CPU bottleneck on either system this may result in
rather different results between the two methods.

Also, with a bidirectional request/response test there is something of
a natural balance or synchronization between inbound and outbound - a
response will not be sent until a request is received, and (once the
burst level is reached) a subsequent request will not be sent until a
response is received.  This may mask favoritism in the NIC between
inbound and outbound processing.

With two concurrent unidirectional tests there is no such
synchronization or balance and any favoritism in the NIC may be exposed.

@node The Omni Tests, Other Netperf Tests, Using Netperf to Measure Bidirectional Transfer, Top
@chapter The Omni Tests

Beginning with version 2.5.0, netperf begins a migration to the
@samp{omni} tests or ``Two routines to measure them all.''  The code for
the omni tests can be found in @file{src/nettest_omni.c} and the goal
is to make it easier for netperf to support multiple protocols and
report a great many additional things about the systems under test.
Additionally, a flexible output selection mechanism is present which
allows the user to choose specifically what values she wishes to have
reported and in what format.
The omni tests are included by default in version 2.5.0.  To disable
them, one must:
@example
./configure --enable-omni=no ...
@end example

and remake netperf.  Remaking netserver is optional because even in
2.5.0 it has ``unmigrated'' netserver-side routines for the classic
(eg @file{src/nettest_bsd.c}) tests.

@menu
* Native Omni Tests::
* Migrated Tests::
* Omni Output Selection::
@end menu

@node Native Omni Tests, Migrated Tests, The Omni Tests, The Omni Tests
@section Native Omni Tests

One accesses the omni tests ``natively'' by using a value of ``OMNI''
with the global @option{-t} test-selection option.  This will then
cause netperf to use the code in @file{src/nettest_omni.c} and in
particular the test-specific options parser for the omni tests.  The
test-specific options for the omni tests are a superset of those for
``classic'' tests.  The options added by the omni tests are:

@table @code
@vindex -c, Test-specific
@item -c
This explicitly declares that the test is to include connection
establishment and tear-down as in either a TCP_CRR or TCP_CC test.

@vindex -d, Test-specific
@item -d <direction>
This option sets the direction of the test relative to the netperf
process.  As of version 2.5.0 one can use the following in a
case-insensitive manner:

@table @code
@item send, stream, transmit, xmit or 2
Any of which will cause netperf to send to the netserver.
@item recv, receive, maerts or 4
Any of which will cause netserver to send to netperf.
@item rr or 6
Either of which will cause a request/response test.
@end table

Additionally, one can specify two directions separated by a '|'
character and they will be OR'ed together.
In this way one can use
the ``Send|Recv'' that will be emitted by the @ref{Omni Output
Selectors,DIRECTION} @ref{Omni Output Selection,output selector} when
used with a request/response test.

@vindex -k, Test-specific
@item -k [@ref{Omni Output Selection,output selector}]
This option sets the style of output to ``keyval'' where each line of
output has the form:
@example
key=value
@end example
For example:
@example
$ netperf -t omni -- -d rr -k "THROUGHPUT,THROUGHPUT_UNITS"
OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
THROUGHPUT=59092.65
THROUGHPUT_UNITS=Trans/s
@end example

Using the @option{-k} option will override any previous, test-specific
@option{-o} or @option{-O} option.

@vindex -o, Test-specific
@item -o [@ref{Omni Output Selection,output selector}]
This option sets the style of output to ``CSV'' where there will be
one line of comma-separated values, preceded by one line of column
names unless the global @option{-P} option is used with a value of 0:
@example
$ netperf -t omni -- -d rr -o "THROUGHPUT,THROUGHPUT_UNITS"
OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Throughput,Throughput Units
60999.07,Trans/s
@end example

Using the @option{-o} option will override any previous, test-specific
@option{-k} or @option{-O} option.
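The CSV form lends itself to further scripting.  As a sketch, the
values line shown above can be split into shell variables with nothing
more than POSIX parameter expansion:

```shell
# Split a "-o THROUGHPUT,THROUGHPUT_UNITS" CSV line (with -P 0 only
# the values line is emitted) into shell variables.
line="60999.07,Trans/s"
throughput=${line%%,*}    # everything before the first comma
units=${line#*,}          # everything after the first comma
echo "throughput=$throughput units=$units"
```

With more columns selected, @code{cut} or @code{awk} with a comma
field separator would be the natural next step.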
@vindex -O, Test-specific
@item -O [@ref{Omni Output Selection,output selector}]
This option sets the style of output to ``human readable'' which will
look quite similar to classic netperf output:
@example
$ netperf -t omni -- -d rr -O "THROUGHPUT,THROUGHPUT_UNITS"
OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Throughput Throughput
           Units


60492.57   Trans/s
@end example

Using the @option{-O} option will override any previous, test-specific
@option{-k} or @option{-o} option.

@vindex -t, Test-specific
@item -t
This option explicitly sets the socket type for the test's data
connection.  As of version 2.5.0 the known socket types include
``stream'' and ``dgram'' for SOCK_STREAM and SOCK_DGRAM respectively.

@vindex -T, Test-specific
@item -T <protocol>
This option is used to explicitly set the protocol used for the
test.  It is case-insensitive.  As of version 2.5.0 the protocols known
to netperf include:
@table @code
@item TCP
Select the Transmission Control Protocol
@item UDP
Select the User Datagram Protocol
@item SDP
Select the Sockets Direct Protocol
@item DCCP
Select the Datagram Congestion Control Protocol
@item SCTP
Select the Stream Control Transmission Protocol
@item udplite
Select UDP Lite
@end table

The default is implicit based on other settings.
@end table

The omni tests also extend the interpretation of some of the classic,
test-specific options for the BSD Sockets tests:

@table @code
@item -m <optionspec>
This can set the send size for either or both of the netperf and
netserver sides of the test:
@example
-m 32K
@end example
sets only the netperf-side send size to 32768 bytes, and or's-in
transmit for the direction.
This is effectively the same behaviour as
for the classic tests.
@example
-m ,32K
@end example
sets only the netserver side send size to 32768 bytes and or's-in
receive for the direction.
@example
-m 16K,32K
@end example
sets the netperf side send size to 16384 bytes, the netserver side
send size to 32768 bytes and the direction will be "Send|Recv."
@item -M <optionspec>
This can set the receive size for either or both of the netperf and
netserver sides of the test:
@example
-M 32K
@end example
sets only the netserver side receive size to 32768 bytes and or's-in
send for the test direction.
@example
-M ,32K
@end example
sets only the netperf side receive size to 32768 bytes and or's-in
receive for the test direction.
@example
-M 16K,32K
@end example
sets the netserver side receive size to 16384 bytes and the netperf
side receive size to 32768 bytes and the direction will be "Send|Recv."
@end table

@node Migrated Tests, Omni Output Selection, Native Omni Tests, The Omni Tests
@section Migrated Tests

As of version 2.5.0 several tests have been migrated to use the omni
code in @file{src/nettest_omni.c} for the core of their testing.  A
migrated test retains all its previous output code and so should still
``look and feel'' just like a pre-2.5.0 test with one exception - the
first line of the test banners will include the word ``MIGRATED'' at
the beginning as in:

@example
$ netperf
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    27175.27
@end example

The tests migrated in version 2.5.0 are:
@itemize
@item TCP_STREAM
@item TCP_MAERTS
@item TCP_RR
@item TCP_CRR
@item UDP_STREAM
@item UDP_RR
@end itemize

It is expected that future releases will have additional tests
migrated to use the ``omni'' functionality.

If one uses ``omni-specific'' test-specific options in conjunction
with a migrated test, instead of using the classic output code, the
new omni output code will be used.  For example if one uses the
@option{-k} test-specific option with a value of
``THROUGHPUT,THROUGHPUT_UNITS'' with a migrated TCP_RR test one will see:

@example
$ netperf -t tcp_rr -- -k THROUGHPUT,THROUGHPUT_UNITS
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
THROUGHPUT=60074.74
THROUGHPUT_UNITS=Trans/s
@end example
rather than:
@example
$ netperf -t tcp_rr
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    59421.52
16384  87380
@end example

@node Omni Output Selection, , Migrated Tests, The Omni Tests
@section Omni Output Selection

The omni test-specific @option{-k}, @option{-o} and @option{-O}
options take an optional @code{output selector} by which the user can
configure what values are reported.  The output selector can take
several forms:

@table @code
@item @file{filename}
The output selections will be read from the named file.  Within the
file there can be up to four lines of comma-separated output
selectors.
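For instance, a file with the two lines (hypothetical contents, using
selectors described below):

```
THROUGHPUT,THROUGHPUT_UNITS,ELAPSED_TIME
LSS_SIZE,RSR_SIZE
```

would, when given to the @option{-O} option, produce two blocks of
output.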
This controls how many multi-line blocks of output are emitted
when the @option{-O} option is used.  This output, while not identical to
``classic'' netperf output, is inspired by it.  Multiple lines have no
effect for the @option{-k} and @option{-o} options.  Putting output
selections in a file can be useful when the list of selections is long.
@item comma and/or semi-colon-separated list
The output selections will be parsed from a comma and/or
semi-colon-separated list of output selectors.  When the list is given
to a @option{-O} option a semi-colon specifies a new output block
should be started.  Semi-colons have the same meaning as commas when
used with the @option{-k} or @option{-o} options.  Depending on the
command interpreter being used, the semi-colon may have to be escaped
somehow to keep it from being interpreted by the command interpreter.
This can often be done by enclosing the entire list in quotes.
@item all
If the keyword @b{all} is specified it means that all known output
values should be displayed at the end of the test.  This can be a
great deal of output.  As of version 2.5.0 there are 157 different
output selectors.
@item ?
If a ``?'' is given as the output selection, the list of all known
output selectors will be displayed and no test actually run.  When
passed to the @option{-O} option they will be listed one per
line.  Otherwise they will be listed as a comma-separated list.  It may
be necessary to protect the ``?'' from the command interpreter by
escaping it or enclosing it in quotes.
@item no selector
If nothing is given to the @option{-k}, @option{-o} or @option{-O}
option then the code selects a default set of output selectors
inspired by classic netperf output.  The format will be the @samp{human
readable} format emitted by the test-specific @option{-O} option.
@end table

The order of evaluation will first check for an output selection.  If
none is specified with the @option{-k}, @option{-o} or @option{-O}
option netperf will select a default based on the characteristics of the
test.  If there is an output selection, the code will first check for
@samp{?}, then check to see if it is the magic @samp{all} keyword.
After that it will check for either @samp{,} or @samp{;} in the
selection and take that to mean it is a comma and/or
semi-colon-separated list.  If none of those checks match, netperf will then
assume the output specification is a filename and attempt to open and
parse the file.

@menu
* Omni Output Selectors::
@end menu

@node Omni Output Selectors, , Omni Output Selection, Omni Output Selection
@subsection Omni Output Selectors

As of version 2.5.0 the output selectors are:

@table @code
@item OUTPUT_NONE
This is essentially a null output.  For @option{-k} output it will
simply add a line that reads ``OUTPUT_NONE='' to the output.  For
@option{-o} it will cause an empty ``column'' to be included.  For
@option{-O} output it will cause extra spaces to separate ``real'' output.
@item SOCKET_TYPE
This will cause the socket type (eg SOCK_STREAM, SOCK_DGRAM) for the
data connection to be output.
@item PROTOCOL
This will cause the protocol used for the data connection to be displayed.
@item DIRECTION
This will display the data flow direction relative to the netperf
process.  Units: Send or Recv for a unidirectional bulk-transfer test,
or Send|Recv for a request/response test.
@item ELAPSED_TIME
This will display the elapsed time in seconds for the test.
@item THROUGHPUT
This will display the throughput for the test.  Units: As requested via
the global @option{-f} option and displayed by the THROUGHPUT_UNITS
output selector.
@item THROUGHPUT_UNITS
This will display the units for what is displayed by the
@code{THROUGHPUT} output selector.
@item LSS_SIZE_REQ
This will display the local (netperf) send socket buffer size (aka
SO_SNDBUF) requested via the command line.  Units: Bytes.
@item LSS_SIZE
This will display the local (netperf) send socket buffer size
(SO_SNDBUF) immediately after the data connection socket was created.
Peculiarities of different networking stacks may lead to this
differing from the size requested via the command line.  Units: Bytes.
@item LSS_SIZE_END
This will display the local (netperf) send socket buffer size
(SO_SNDBUF) immediately before the data connection socket is closed.
Peculiarities of different networking stacks may lead this to differ
from the size requested via the command line and/or the size
immediately after the data connection socket was created.  Units: Bytes.
@item LSR_SIZE_REQ
This will display the local (netperf) receive socket buffer size (aka
SO_RCVBUF) requested via the command line.  Units: Bytes.
@item LSR_SIZE
This will display the local (netperf) receive socket buffer size
(SO_RCVBUF) immediately after the data connection socket was created.
Peculiarities of different networking stacks may lead to this
differing from the size requested via the command line.  Units: Bytes.
@item LSR_SIZE_END
This will display the local (netperf) receive socket buffer size
(SO_RCVBUF) immediately before the data connection socket is closed.
Peculiarities of different networking stacks may lead this to differ
from the size requested via the command line and/or the size
immediately after the data connection socket was created.  Units: Bytes.
@item RSS_SIZE_REQ
This will display the remote (netserver) send socket buffer size (aka
SO_SNDBUF) requested via the command line.  Units: Bytes.
@item RSS_SIZE
This will display the remote (netserver) send socket buffer size
(SO_SNDBUF) immediately after the data connection socket was created.
Peculiarities of different networking stacks may lead to this
differing from the size requested via the command line.  Units: Bytes.
@item RSS_SIZE_END
This will display the remote (netserver) send socket buffer size
(SO_SNDBUF) immediately before the data connection socket is closed.
Peculiarities of different networking stacks may lead this to differ
from the size requested via the command line and/or the size
immediately after the data connection socket was created.  Units: Bytes.
@item RSR_SIZE_REQ
This will display the remote (netserver) receive socket buffer size (aka
SO_RCVBUF) requested via the command line.  Units: Bytes.
@item RSR_SIZE
This will display the remote (netserver) receive socket buffer size
(SO_RCVBUF) immediately after the data connection socket was created.
Peculiarities of different networking stacks may lead to this
differing from the size requested via the command line.  Units: Bytes.
@item RSR_SIZE_END
This will display the remote (netserver) receive socket buffer size
(SO_RCVBUF) immediately before the data connection socket is closed.
Peculiarities of different networking stacks may lead this to differ
from the size requested via the command line and/or the size
immediately after the data connection socket was created.  Units: Bytes.
@item LOCAL_SEND_SIZE
This will display the size of the buffers netperf passed in any
``send'' calls it made on the data connection for a
non-request/response test.  Units: Bytes.
@item LOCAL_RECV_SIZE
This will display the size of the buffers netperf passed in any
``receive'' calls it made on the data connection for a
non-request/response test.  Units: Bytes.
@item REMOTE_SEND_SIZE
This will display the size of the buffers netserver passed in any
``send'' calls it made on the data connection for a
non-request/response test.  Units: Bytes.
@item REMOTE_RECV_SIZE
This will display the size of the buffers netserver passed in any
``receive'' calls it made on the data connection for a
non-request/response test.  Units: Bytes.
@item REQUEST_SIZE
This will display the size of the requests netperf sent in a
request-response test.  Units: Bytes.
@item RESPONSE_SIZE
This will display the size of the responses netserver sent in a
request-response test.  Units: Bytes.
@item LOCAL_CPU_UTIL
This will display the overall CPU utilization during the test as
measured by netperf.  Units: 0 to 100 percent.
@item LOCAL_CPU_PERCENT_USER
This will display the CPU fraction spent in user mode during the test
as measured by netperf.  Only supported by netcpu_procstat.  Units: 0 to
100 percent.
@item LOCAL_CPU_PERCENT_SYSTEM
This will display the CPU fraction spent in system mode during the test
as measured by netperf.  Only supported by netcpu_procstat.  Units: 0 to
100 percent.
@item LOCAL_CPU_PERCENT_IOWAIT
This will display the fraction of time waiting for I/O to complete
during the test as measured by netperf.  Only supported by
netcpu_procstat.  Units: 0 to 100 percent.
@item LOCAL_CPU_PERCENT_IRQ
This will display the fraction of time servicing interrupts during the
test as measured by netperf.  Only supported by netcpu_procstat.  Units:
0 to 100 percent.
@item LOCAL_CPU_PERCENT_SWINTR
This will display the fraction of time servicing softirqs during the
test as measured by netperf.  Only supported by netcpu_procstat.  Units:
0 to 100 percent.
@item LOCAL_CPU_METHOD
This will display the method used by netperf to measure CPU
utilization.  Units: single character denoting method.
@item LOCAL_SD
This will display the service demand, or units of CPU consumed per
unit of work, as measured by netperf.  Units: microseconds of CPU
consumed per either KB (K==1024) of data transferred or request/response
transaction.
@item REMOTE_CPU_UTIL
This will display the overall CPU utilization during the test as
measured by netserver.  Units: 0 to 100 percent.
@item REMOTE_CPU_PERCENT_USER
This will display the CPU fraction spent in user mode during the test
as measured by netserver.  Only supported by netcpu_procstat.  Units: 0 to
100 percent.
@item REMOTE_CPU_PERCENT_SYSTEM
This will display the CPU fraction spent in system mode during the test
as measured by netserver.  Only supported by netcpu_procstat.  Units: 0 to
100 percent.
@item REMOTE_CPU_PERCENT_IOWAIT
This will display the fraction of time waiting for I/O to complete
during the test as measured by netserver.  Only supported by
netcpu_procstat.  Units: 0 to 100 percent.
@item REMOTE_CPU_PERCENT_IRQ
This will display the fraction of time servicing interrupts during the
test as measured by netserver.  Only supported by netcpu_procstat.  Units:
0 to 100 percent.
@item REMOTE_CPU_PERCENT_SWINTR
This will display the fraction of time servicing softirqs during the
test as measured by netserver.  Only supported by netcpu_procstat.  Units:
0 to 100 percent.
@item REMOTE_CPU_METHOD
This will display the method used by netserver to measure CPU
utilization.  Units: single character denoting method.
@item REMOTE_SD
This will display the service demand, or units of CPU consumed per
unit of work, as measured by netserver.  Units: microseconds of CPU
consumed per either KB (K==1024) of data transferred or
request/response transaction.
@item SD_UNITS
This will display the units for LOCAL_SD and REMOTE_SD.
@item CONFIDENCE_LEVEL
This will display the confidence level requested by the user either
explicitly via the global @option{-I} option, or implicitly via the
global @option{-i} option. The value will be either 95 or 99 if
confidence intervals have been requested or 0 if they were not. Units:
Percent.
@item CONFIDENCE_INTERVAL
This will display the width of the confidence interval requested
either explicitly via the global @option{-I} option or implicitly via
the global @option{-i} option. Units: Width in percent of mean value
computed. A value of -1.0 means that confidence intervals were not requested.
@item CONFIDENCE_ITERATION
This will display the number of test iterations netperf undertook,
perhaps while attempting to achieve the requested confidence interval
and level. If confidence intervals were requested via the command line
then the value will be between 3 and 30. If confidence intervals were
not requested the value will be 1. Units: Iterations.
@item THROUGHPUT_CONFID
This will display the width of the confidence interval actually
achieved for @code{THROUGHPUT} during the test. Units: Width of
interval as percentage of reported throughput value.
@item LOCAL_CPU_CONFID
This will display the width of the confidence interval actually
achieved for overall CPU utilization on the system running netperf
(@code{LOCAL_CPU_UTIL}) during the test, if CPU utilization measurement
was enabled. Units: Width of interval as percentage of reported CPU
utilization.
@item REMOTE_CPU_CONFID
This will display the width of the confidence interval actually
achieved for overall CPU utilization on the system running netserver
(@code{REMOTE_CPU_UTIL}) during the test, if CPU utilization
measurement was enabled.
Units: Width of interval as percentage of
reported CPU utilization.
@item TRANSACTION_RATE
This will display the transaction rate in transactions per second for
a request/response test even if the user has requested a throughput in
units of bits or bytes per second via the global @option{-f}
option. It is undefined for a non-request/response test. Units:
Transactions per second.
@item RT_LATENCY
This will display the average round-trip latency for a
request/response test, accounting for the number of transactions in flight
at one time. It is undefined for a non-request/response test. Units:
Microseconds per transaction.
@item BURST_SIZE
This will display the ``burst size'' or added transactions in flight
in a request/response test as requested via a test-specific
@option{-b} option. The number of transactions in flight at one time
will be one greater than this value. It is undefined for a
non-request/response test. Units: added transactions in flight.
@item LOCAL_TRANSPORT_RETRANS
This will display the number of retransmissions experienced on the
data connection during the test as determined by netperf. A value of
-1 means the attempt to determine the number of retransmissions failed
or the concept was not valid for the given protocol or the mechanism
is not known for the platform. A value of -2 means it was not
attempted. As of version 2.5.0 the meanings of the values are in flux
and subject to change. Units: number of retransmissions.
@item REMOTE_TRANSPORT_RETRANS
This will display the number of retransmissions experienced on the
data connection during the test as determined by netserver. A value
of -1 means the attempt to determine the number of retransmissions
failed or the concept was not valid for the given protocol or the
mechanism is not known for the platform. A value of -2 means it was
not attempted.
As of version 2.5.0 the meanings of the values are in flux
and subject to change. Units: number of retransmissions.
@item TRANSPORT_MSS
This will display the Maximum Segment Size (aka MSS) or its equivalent
for the protocol being used during the test. A value of -1 means
either the concept of an MSS did not apply to the protocol being used,
or there was an error in retrieving it. Units: Bytes.
@item LOCAL_SEND_THROUGHPUT
The throughput as measured by netperf for the successful ``send''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item LOCAL_RECV_THROUGHPUT
The throughput as measured by netperf for the successful ``receive''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item REMOTE_SEND_THROUGHPUT
The throughput as measured by netserver for the successful ``send''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item REMOTE_RECV_THROUGHPUT
The throughput as measured by netserver for the successful ``receive''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item LOCAL_CPU_BIND
The CPU to which netperf was bound, if at all, during the test. A
value of -1 means that netperf was not explicitly bound to a CPU
during the test. Units: CPU ID.
@item LOCAL_CPU_COUNT
The number of CPUs (cores, threads) detected by netperf. Units: CPU count.
@item LOCAL_CPU_PEAK_UTIL
The utilization of the CPU most heavily utilized during the test, as
measured by netperf.
This can be used to see if any one CPU of a
multi-CPU system was saturated even though the overall CPU utilization
as reported by @code{LOCAL_CPU_UTIL} was low. Units: 0 to 100%
@item LOCAL_CPU_PEAK_ID
The id of the CPU most heavily utilized during the test as determined
by netperf. Units: CPU ID.
@item LOCAL_CPU_MODEL
Model information for the processor(s) present on the system running
netperf. Assumes all processors in the system (as perceived by
netperf) on which netperf is running are the same model. Units: Text.
@item LOCAL_CPU_FREQUENCY
The frequency of the processor(s) on the system running netperf, at
the time netperf made the call. Assumes that all processors present
in the system running netperf are running at the same
frequency. Units: MHz.
@item REMOTE_CPU_BIND
The CPU to which netserver was bound, if at all, during the test. A
value of -1 means that netserver was not explicitly bound to a CPU
during the test. Units: CPU ID.
@item REMOTE_CPU_COUNT
The number of CPUs (cores, threads) detected by netserver. Units: CPU
count.
@item REMOTE_CPU_PEAK_UTIL
The utilization of the CPU most heavily utilized during the test, as
measured by netserver. This can be used to see if any one CPU of a
multi-CPU system was saturated even though the overall CPU utilization
as reported by @code{REMOTE_CPU_UTIL} was low. Units: 0 to 100%
@item REMOTE_CPU_PEAK_ID
The id of the CPU most heavily utilized during the test as determined
by netserver. Units: CPU ID.
@item REMOTE_CPU_MODEL
Model information for the processor(s) present on the system running
netserver. Assumes all processors in the system (as perceived by
netserver) on which netserver is running are the same model. Units:
Text.
@item REMOTE_CPU_FREQUENCY
The frequency of the processor(s) on the system running netserver, at
the time netserver made the call.
Assumes that all processors present
in the system running netserver are running at the same
frequency. Units: MHz.
@item SOURCE_PORT
The port ID/service name to which the data socket created by netperf
was bound. A value of 0 means the data socket was not explicitly
bound to a port number. Units: ASCII text.
@item SOURCE_ADDR
The name/address to which the data socket created by netperf was
bound. A value of 0.0.0.0 means the data socket was not explicitly
bound to an address. Units: ASCII text.
@item SOURCE_FAMILY
The address family to which the data socket created by netperf was
bound. A value of 0 means the data socket was not explicitly bound to
a given address family. Units: ASCII text.
@item DEST_PORT
The port ID to which the data socket created by netserver was bound. A
value of 0 means the data socket was not explicitly bound to a port
number. Units: ASCII text.
@item DEST_ADDR
The name/address of the data socket created by netserver. Units:
ASCII text.
@item DEST_FAMILY
The address family to which the data socket created by netserver was
bound. A value of 0 means the data socket was not explicitly bound to
a given address family. Units: ASCII text.
@item LOCAL_SEND_CALLS
The number of successful ``send'' calls made by netperf against its
data socket. Units: Calls.
@item LOCAL_RECV_CALLS
The number of successful ``receive'' calls made by netperf against its
data socket. Units: Calls.
@item LOCAL_BYTES_PER_RECV
The average number of bytes per ``receive'' call made by netperf
against its data socket. Units: Bytes.
@item LOCAL_BYTES_PER_SEND
The average number of bytes per ``send'' call made by netperf against
its data socket. Units: Bytes.
@item LOCAL_BYTES_SENT
The number of bytes successfully sent by netperf through its data
socket. Units: Bytes.
@item LOCAL_BYTES_RECVD
The number of bytes successfully received by netperf through its data
socket. Units: Bytes.
@item LOCAL_BYTES_XFERD
The sum of bytes sent and received by netperf through its data
socket. Units: Bytes.
@item LOCAL_SEND_OFFSET
The offset from the alignment of the buffers passed by netperf in its
``send'' calls. Specified via the global @option{-o} option and
defaults to 0. Units: Bytes.
@item LOCAL_RECV_OFFSET
The offset from the alignment of the buffers passed by netperf in its
``receive'' calls. Specified via the global @option{-o} option and
defaults to 0. Units: Bytes.
@item LOCAL_SEND_ALIGN
The alignment of the buffers passed by netperf in its ``send'' calls
as specified via the global @option{-a} option. Defaults to 8. Units:
Bytes.
@item LOCAL_RECV_ALIGN
The alignment of the buffers passed by netperf in its ``receive''
calls as specified via the global @option{-a} option. Defaults to
8. Units: Bytes.
@item LOCAL_SEND_WIDTH
The ``width'' of the ring of buffers through which netperf cycles as
it makes its ``send'' calls. Defaults to one more than the local send
socket buffer size divided by the send size as determined at the time
the data socket is created. Can be used to make netperf more processor
data cache unfriendly. Units: number of buffers.
@item LOCAL_RECV_WIDTH
The ``width'' of the ring of buffers through which netperf cycles as
it makes its ``receive'' calls. Defaults to one more than the local
receive socket buffer size divided by the receive size as determined
at the time the data socket is created. Can be used to make netperf
more processor data cache unfriendly. Units: number of buffers.
@item LOCAL_SEND_DIRTY_COUNT
The number of bytes to ``dirty'' (write to) before netperf makes a
``send'' call.
Specified via the global @option{-k} option, which
requires that --enable-dirty=yes was specified with the configure
command prior to building netperf. Units: Bytes.
@item LOCAL_RECV_DIRTY_COUNT
The number of bytes to ``dirty'' (write to) before netperf makes a
``recv'' call. Specified via the global @option{-k} option which
requires that --enable-dirty was specified with the configure command
prior to building netperf. Units: Bytes.
@item LOCAL_RECV_CLEAN_COUNT
The number of bytes netperf should read ``cleanly'' before making a
``receive'' call. Specified via the global @option{-k} option which
requires that --enable-dirty was specified with the configure command
prior to building netperf. Clean reads start where dirty writes ended.
Units: Bytes.
@item LOCAL_NODELAY
Indicates whether or not setting the test protocol-specific ``no
delay'' (eg TCP_NODELAY) option on the data socket used by netperf was
requested by the test-specific @option{-D} option and
successful. Units: 0 means no, 1 means yes.
@item LOCAL_CORK
Indicates whether or not TCP_CORK was set on the data socket used by
netperf as requested via the test-specific @option{-C} option. 1 means
yes, 0 means no/not applicable.
@item REMOTE_SEND_CALLS
@item REMOTE_RECV_CALLS
@item REMOTE_BYTES_PER_RECV
@item REMOTE_BYTES_PER_SEND
@item REMOTE_BYTES_SENT
@item REMOTE_BYTES_RECVD
@item REMOTE_BYTES_XFERD
@item REMOTE_SEND_OFFSET
@item REMOTE_RECV_OFFSET
@item REMOTE_SEND_ALIGN
@item REMOTE_RECV_ALIGN
@item REMOTE_SEND_WIDTH
@item REMOTE_RECV_WIDTH
@item REMOTE_SEND_DIRTY_COUNT
@item REMOTE_RECV_DIRTY_COUNT
@item REMOTE_RECV_CLEAN_COUNT
@item REMOTE_NODELAY
@item REMOTE_CORK
These are all like their ``LOCAL_'' counterparts only for the
netserver rather than netperf.
@item LOCAL_SYSNAME
The name of the OS (eg ``Linux'') running on the system on which
netperf was running. Units: ASCII Text.
@item LOCAL_SYSTEM_MODEL
The model name of the system on which netperf was running. Units:
ASCII Text.
@item LOCAL_RELEASE
The release name/number of the OS running on the system on which
netperf was running. Units: ASCII Text.
@item LOCAL_VERSION
The version number of the OS running on the system on which netperf
was running. Units: ASCII Text.
@item LOCAL_MACHINE
The machine architecture of the machine on which netperf was
running. Units: ASCII Text.
@item REMOTE_SYSNAME
@item REMOTE_SYSTEM_MODEL
@item REMOTE_RELEASE
@item REMOTE_VERSION
@item REMOTE_MACHINE
These are all like their ``LOCAL_'' counterparts only for the
netserver rather than netperf.
@item LOCAL_INTERFACE_NAME
The name of the probable egress interface through which the data
connection went on the system running netperf. Example: eth0. Units:
ASCII Text.
@item LOCAL_INTERFACE_VENDOR
The vendor ID of the probable egress interface through which traffic
on the data connection went on the system running netperf. Units:
Hexadecimal IDs as might be found in a @file{pci.ids} file or at
@uref{http://pciids.sourceforge.net/,the PCI ID Repository}.
@item LOCAL_INTERFACE_DEVICE
The device ID of the probable egress interface through which traffic
on the data connection went on the system running netperf. Units:
Hexadecimal IDs as might be found in a @file{pci.ids} file or at
@uref{http://pciids.sourceforge.net/,the PCI ID Repository}.
@item LOCAL_INTERFACE_SUBVENDOR
The sub-vendor ID of the probable egress interface through which
traffic on the data connection went on the system running
netperf. Units: Hexadecimal IDs as might be found in a @file{pci.ids}
file or at @uref{http://pciids.sourceforge.net/,the PCI ID
Repository}.
@item LOCAL_INTERFACE_SUBDEVICE
The sub-device ID of the probable egress interface through which
traffic on the data connection went on the system running
netperf. Units: Hexadecimal IDs as might be found in a @file{pci.ids}
file or at @uref{http://pciids.sourceforge.net/,the PCI ID
Repository}.
@item LOCAL_DRIVER_NAME
The name of the driver used for the probable egress interface through
which traffic on the data connection went on the system running
netperf. Units: ASCII Text.
@item LOCAL_DRIVER_VERSION
The version string for the driver used for the probable egress
interface through which traffic on the data connection went on the
system running netperf. Units: ASCII Text.
@item LOCAL_DRIVER_FIRMWARE
The firmware version for the driver used for the probable egress
interface through which traffic on the data connection went on the
system running netperf. Units: ASCII Text.
@item LOCAL_DRIVER_BUS
The bus address of the probable egress interface through which traffic
on the data connection went on the system running netperf. Units:
ASCII Text.
@item LOCAL_INTERFACE_SLOT
The slot ID of the probable egress interface through which traffic
on the data connection went on the system running netperf. Units:
ASCII Text.
@item REMOTE_INTERFACE_NAME
@item REMOTE_INTERFACE_VENDOR
@item REMOTE_INTERFACE_DEVICE
@item REMOTE_INTERFACE_SUBVENDOR
@item REMOTE_INTERFACE_SUBDEVICE
@item REMOTE_DRIVER_NAME
@item REMOTE_DRIVER_VERSION
@item REMOTE_DRIVER_FIRMWARE
@item REMOTE_DRIVER_BUS
@item REMOTE_INTERFACE_SLOT
These are all like their ``LOCAL_'' counterparts only for the
netserver rather than netperf.
@item LOCAL_INTERVAL_USECS
The interval at which bursts of operations (sends, receives,
transactions) were attempted by netperf.
Specified by the
global @option{-w} option which requires --enable-intervals to have
been specified with the configure command prior to building
netperf. Units: Microseconds (though specified by default in
milliseconds on the command line).
@item LOCAL_INTERVAL_BURST
The number of operations (sends, receives, transactions depending on
the test) which were attempted by netperf each LOCAL_INTERVAL_USECS
units of time. Specified by the global @option{-b} option which
requires --enable-intervals to have been specified with the configure
command prior to building netperf. Units: number of operations per burst.
@item REMOTE_INTERVAL_USECS
The interval at which bursts of operations (sends, receives,
transactions) were attempted by netserver. Specified by the
global @option{-w} option which requires --enable-intervals to have
been specified with the configure command prior to building
netperf. Units: Microseconds (though specified by default in
milliseconds on the command line).
@item REMOTE_INTERVAL_BURST
The number of operations (sends, receives, transactions depending on
the test) which were attempted by netserver each REMOTE_INTERVAL_USECS
units of time. Specified by the global @option{-b} option which
requires --enable-intervals to have been specified with the configure
command prior to building netperf. Units: number of operations per burst.
@item LOCAL_SECURITY_TYPE_ID
@item LOCAL_SECURITY_TYPE
@item LOCAL_SECURITY_ENABLED_NUM
@item LOCAL_SECURITY_ENABLED
@item LOCAL_SECURITY_SPECIFIC
@item REMOTE_SECURITY_TYPE_ID
@item REMOTE_SECURITY_TYPE
@item REMOTE_SECURITY_ENABLED_NUM
@item REMOTE_SECURITY_ENABLED
@item REMOTE_SECURITY_SPECIFIC
Information about what sort of security mechanisms (eg SELinux) were
enabled on the systems during the test.
@item RESULT_BRAND
The string specified by the user with the global @option{-B}
option. Units: ASCII Text.
@item UUID
The universally unique identifier associated with this test, either
generated automagically by netperf, or passed to netperf via an omni
test-specific @option{-u} option. Note: Future versions may make this
a global command-line option. Units: ASCII Text.
@item MIN_LATENCY
The minimum ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item MAX_LATENCY
The maximum ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item P50_LATENCY
The 50th percentile value of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item P90_LATENCY
The 90th percentile value of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item P99_LATENCY
The 99th percentile value of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item MEAN_LATENCY
The average ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item STDDEV_LATENCY
The standard deviation of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item COMMAND_LINE
The full command line used when invoking netperf. Units: ASCII Text.
@item OUTPUT_END
While emitted with the list of output selectors, it is ignored when
specified as an output selector.
@end table

@node Other Netperf Tests, Address Resolution, The Omni Tests, Top
@chapter Other Netperf Tests

Apart from the typical performance tests, netperf contains some tests
which can be used to streamline measurements and reporting. These
include CPU rate calibration (present) and host identification (future
enhancement).

@menu
* CPU rate calibration::
* UUID Generation::
@end menu

@node CPU rate calibration, UUID Generation, Other Netperf Tests, Other Netperf Tests
@section CPU rate calibration

Some of the CPU utilization measurement mechanisms of netperf work by
comparing the rate at which some counter increments when the system is
idle with the rate at which that same counter increments when the
system is running a netperf test. The ratio of those rates is used to
arrive at a CPU utilization percentage.

This means that netperf must know the rate at which the counter
increments when the system is presumed to be ``idle.'' If it does not
know the rate, netperf will measure it before starting a data transfer
test. This calibration step takes 40 seconds for each of the local and
remote systems, and if repeated for each netperf test would make taking
repeated measurements rather slow.

Thus, the netperf CPU utilization options @option{-c} and
@option{-C} can take an optional calibration value.
This value is
used as the ``idle rate'' and the calibration step is not
performed. To determine the idle rate, netperf can be used to run
special tests which only report the value of the calibration - they
are the LOC_CPU and REM_CPU tests. These return the calibration value
for the local and remote system respectively. A common way to use
these tests is to store their results into environment variables and
use them in subsequent netperf commands:

@example
LOC_RATE=`netperf -t LOC_CPU`
REM_RATE=`netperf -H <remote> -t REM_CPU`
netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
...
netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
@end example

If you are going to use netperf to measure aggregate results, it is
important to use the LOC_CPU and REM_CPU tests to get the calibration
values first to avoid issues with some of the aggregate netperf tests
transferring data while others are ``idle'' and getting bogus
calibration values. When running aggregate tests, it is very
important to remember that any one instance of netperf does not know
about the other instances of netperf. It will report global CPU
utilization and will calculate service demand believing it was the
only thing causing that CPU utilization. So, you can use the CPU
utilization reported by netperf in an aggregate test, but you have to
calculate service demands by hand.

@node UUID Generation, , CPU rate calibration, Other Netperf Tests
@section UUID Generation

Beginning with version 2.5.0 netperf can generate Universally Unique
IDentifiers (UUIDs).
This can be done explicitly via the ``UUID''
test:
@example
$ netperf -t UUID
2c8561ae-9ebd-11e0-a297-0f5bfa0349d0
@end example

In and of itself, this is not terribly useful, but used in conjunction
with the test-specific @option{-u} option of an ``omni'' test to set
the UUID emitted by the @ref{Omni Output Selectors,UUID} output
selector, it can be used to tie together the separate instances of an
aggregate netperf test - say, for instance, if the results were
inserted into a database of some sort.

@node Address Resolution, Enhancing Netperf, Other Netperf Tests, Top
@comment node-name, next, previous, up
@chapter Address Resolution

Netperf versions 2.4.0 and later have merged IPv4 and IPv6 tests so
the functionality of the tests in @file{src/nettest_ipv6.c} has been
subsumed into the tests in @file{src/nettest_bsd.c}. This has been
accomplished in part by switching from @code{gethostbyname()} to
@code{getaddrinfo()} exclusively. While it was theoretically possible
to get multiple results for a hostname from @code{gethostbyname()} it
was generally unlikely and netperf's ignoring of the second and later
results was not much of an issue.

Now with @code{getaddrinfo} and particularly with AF_UNSPEC it is
increasingly likely that a given hostname will have multiple
associated addresses. The @code{establish_control()} routine of
@file{src/netlib.c} will indeed attempt to choose from among all the
matching IP addresses when establishing the control connection.
Netperf does not _really_ care if the control connection is IPv4 or
IPv6 or even mixed on either end.

However, the individual tests still ass-u-me that the first result in
the address list is the one to be used. Whether or not this will
turn out to be an issue has yet to be determined.
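
One way to reduce the ambiguity when a hostname resolves to both IPv4
and IPv6 addresses is to constrain the address family with the global
@option{-4} or @option{-6} options, which restrict netperf to AF_INET
or AF_INET6 respectively. A sketch, where @code{dualstackhost} is a
hypothetical name with both A and AAAA records:

@example
netperf -4 -H dualstackhost -t TCP_STREAM
netperf -6 -H dualstackhost -t TCP_STREAM
@end example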

If you do run into problems with this, the easiest workaround is to
specify IP addresses for the data connection explicitly in the
test-specific @option{-H} and @option{-L} options. At some point, the
netperf tests _may_ try to be more sophisticated in their parsing of
returns from @code{getaddrinfo()} - straw-man patches to
@email{netperf-feedback@@netperf.org} would of course be most welcome
:)

Netperf has leveraged code from other open-source projects with
amenable licensing to provide a replacement @code{getaddrinfo()} call
on those platforms where the @command{configure} script believes there
is no native getaddrinfo call. As of this writing, the replacement
@code{getaddrinfo()} has been tested on HP-UX 11.0 and then presumed to
run elsewhere.

@node Enhancing Netperf, Netperf4, Address Resolution, Top
@comment node-name, next, previous, up
@chapter Enhancing Netperf

Netperf is constantly evolving. If you find you want to make
enhancements to netperf, by all means do so. If you wish to add a new
``suite'' of tests to netperf the general idea is to:

@enumerate
@item
Add files @file{src/nettest_mumble.c} and @file{src/nettest_mumble.h}
where mumble is replaced with something meaningful for the test-suite.
@item
Add support for an appropriate @option{--enable-mumble} option in
@file{configure.ac}.
@item
Edit @file{src/netperf.c}, @file{netsh.c}, and @file{netserver.c} as
required, using #ifdef WANT_MUMBLE.
@item
Compile and test.
@end enumerate

However, with the addition of the ``omni'' tests in version 2.5.0 it
is preferred that one attempt to make the necessary changes to
@file{src/nettest_omni.c} rather than adding new source files, unless
this would make the omni tests entirely too complicated.
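
Should one go the new-suite route anyway, the @file{configure.ac} step
above might be sketched along these lines; the names here are
illustrative for a hypothetical ``mumble'' suite and are not taken
from the actual netperf sources:

@example
AC_ARG_ENABLE([mumble],
              [AS_HELP_STRING([--enable-mumble],
                              [include the mumble test suite])])
AS_IF([test "x$enable_mumble" = xyes],
      [AC_DEFINE([WANT_MUMBLE], [1],
                 [Define to compile-in the mumble tests])])
@end example

The sources would then wrap their test-specific code in
@code{#ifdef WANT_MUMBLE} as described above.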

If you wish to submit your changes for possible inclusion into the
mainline sources, please try to base your changes on the latest
available sources (@pxref{Getting Netperf Bits}) and then send email
describing the changes at a high level to
@email{netperf-feedback@@netperf.org} or perhaps
@email{netperf-talk@@netperf.org}. If the consensus is positive, then
sending context @command{diff} results to
@email{netperf-feedback@@netperf.org} is the next step. From that
point, it is a matter of pestering the Netperf Contributing Editor
until he gets the changes incorporated :)

@node Netperf4, Concept Index, Enhancing Netperf, Top
@comment node-name, next, previous, up
@chapter Netperf4

Netperf4 is the shorthand name given to version 4.X.X of netperf.
This is really a separate benchmark more than a newer version of
netperf, but it is a descendant of netperf so the netperf name is
kept. The facetious way to describe netperf4 is to say it is the
egg-laying-woolly-milk-pig version of netperf :) The more respectful
way to describe it is to say it is the version of netperf with support
for synchronized, multiple-thread, multiple-test, multiple-system,
network-oriented benchmarking.

Netperf4 is still undergoing evolution. Those wishing to work with or
on netperf4 are encouraged to join the
@uref{http://www.netperf.org/cgi-bin/mailman/listinfo/netperf-dev,netperf-dev}
mailing list and/or peruse the
@uref{http://www.netperf.org/svn/netperf4/trunk,current sources}.

@node Concept Index, Option Index, Netperf4, Top
@unnumbered Concept Index

@printindex cp

@node Option Index, , Concept Index, Top
@comment node-name, next, previous, up
@unnumbered Option Index

@printindex vr
@bye

@c LocalWords: texinfo setfilename settitle titlepage vskip pt filll ifnottex
@c LocalWords: insertcopying cindex dfn uref printindex cp