This is netperf.info, produced by makeinfo version 4.13 from
netperf.texi.

This is Rick Jones' feeble attempt at a Texinfo-based manual for the
netperf benchmark.

   Copyright (C) 2005-2012 Hewlett-Packard Company

     Permission is granted to copy, distribute and/or modify this
     document per the terms of the netperf source license, a copy of
     which can be found in the file `COPYING' of the basic netperf
     distribution.


File: netperf.info,  Node: Top,  Next: Introduction,  Prev: (dir),  Up: (dir)

Netperf Manual
**************

This is Rick Jones' feeble attempt at a Texinfo-based manual for the
netperf benchmark.

   Copyright (C) 2005-2012 Hewlett-Packard Company

     Permission is granted to copy, distribute and/or modify this
     document per the terms of the netperf source license, a copy of
     which can be found in the file `COPYING' of the basic netperf
     distribution.

* Menu:

* Introduction::                An introduction to netperf - what it
is and what it is not.
* Installing Netperf::          How to go about installing netperf.
* The Design of Netperf::
* Global Command-line Options::
* Using Netperf to Measure Bulk Data Transfer::
* Using Netperf to Measure Request/Response::
* Using Netperf to Measure Aggregate Performance::
* Using Netperf to Measure Bidirectional Transfer::
* The Omni Tests::
* Other Netperf Tests::
* Address Resolution::
* Enhancing Netperf::
* Netperf4::
* Concept Index::
* Option Index::

File: netperf.info,  Node: Introduction,  Next: Installing Netperf,  Prev: Top,  Up: Top

1 Introduction
**************

Netperf is a benchmark that can be used to measure various aspects of
networking performance.  The primary foci are bulk (aka unidirectional)
data transfer and request/response performance using either TCP or UDP
and the Berkeley Sockets interface.  As of this writing, the tests
available either unconditionally or conditionally include:

   * TCP and UDP unidirectional transfer and request/response over IPv4
     and IPv6 using the Sockets interface.

   * TCP and UDP unidirectional transfer and request/response over IPv4
     using the XTI interface.

   * Link-level unidirectional transfer and request/response using the
     DLPI interface.

   * Unix domain sockets

   * SCTP unidirectional transfer and request/response over IPv4 and
     IPv6 using the sockets interface.

   While not every revision of netperf will work on every platform
listed, the intention is that at least some version of netperf will
work on the following platforms:

   * Unix - at least all the major variants.

   * Linux

   * Windows

   * Others

   Netperf is maintained and informally supported primarily by Rick
Jones, who can perhaps be best described as Netperf Contributing
Editor.  Non-trivial and very appreciated assistance comes from others
in the network performance community, who are too numerous to mention
here.  While netperf is widely used, it is NOT supported via any of
the formal Hewlett-Packard support channels.  You should feel free to
make enhancements and modifications to netperf to suit your nefarious
porpoises, so long as you stay within the guidelines of the netperf
copyright.  If you feel so inclined, you can send your changes to
netperf-feedback <netperf-feedback@netperf.org> for possible inclusion
into subsequent versions of netperf.

   It is the Contributing Editor's belief that the netperf license walks
like open source and talks like open source.  However, the license was
never submitted for "certification" as an open source license.  If you
would prefer to make contributions to a networking benchmark using a
certified open source license, please consider netperf4, which is
distributed under the terms of the GPLv2.

   The netperf-talk <netperf-talk@netperf.org> mailing list is
available to discuss the care and feeding of netperf with others who
share your interest in network performance benchmarking.  The
netperf-talk mailing list is a closed list (to deal with spam) and you
must first subscribe by sending email to netperf-talk-request
<netperf-talk-request@netperf.org>.

* Menu:

* Conventions::

File: netperf.info,  Node: Conventions,  Prev: Introduction,  Up: Introduction

1.1 Conventions
===============

A "sizespec" is a one- or two-item, comma-separated list used as an
argument to a command-line option that can set one or two related
netperf parameters.  If you wish to set both parameters to separate
values, the items should be separated by a comma:

     parameter1,parameter2

   If you wish to set the first parameter without altering the value of
the second from its default, you should follow the first item with a
comma:

     parameter1,

   Likewise, precede the item with a comma if you wish to set only the
second parameter:

     ,parameter2

   An item with no commas:

     parameter1and2

   will set both parameters to the same value.  This last mode is one of
the most frequently used.

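   For instance, using the test-specific `-s' socket buffer size
option described later in this manual (the values shown are purely
illustrative):

     netperf -- -s 57344,32768

   would request a 57344-byte send and a 32768-byte receive socket
buffer on the local system, while

     netperf -- -s 32768

   would request 32768 bytes for both.
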
   There is another variant of the comma-separated, two-item list called
an "optionspec" which is like a sizespec except that a single item with
no comma:

     parameter1

   will set only the value of the first parameter, leaving the second
parameter at its default value.

   Netperf has two types of command-line options.  The first are global
command-line options.  They are essentially any option not tied to a
particular test or group of tests.  An example of a global command-line
option is the one which sets the test type - `-t'.

   The second type of options are test-specific options.  These are
options which are only applicable to a particular test or set of tests.
An example of a test-specific option would be the send socket buffer
size for a TCP_STREAM test.

   Global command-line options are specified first with test-specific
options following after a `--' as in:

     netperf <global> -- <test-specific>

File: netperf.info,  Node: Installing Netperf,  Next: The Design of Netperf,  Prev: Introduction,  Up: Top

2 Installing Netperf
********************

Netperf's primary form of distribution is source code.  This allows
installation on systems other than those to which the authors have
ready access and thus the ability to create binaries.  There are two
styles of netperf installation.  The first runs the netperf server
program - netserver - as a child of inetd.  This requires the installer
to have sufficient privileges to edit the files `/etc/services' and
`/etc/inetd.conf' or their platform-specific equivalents.

   The second style is to run netserver as a standalone daemon.  This
second method does not require edit privileges on `/etc/services' and
`/etc/inetd.conf' but does mean you must remember to run the netserver
program explicitly after every system reboot.

   This manual assumes that those wishing to measure networking
performance already know how to use anonymous FTP and/or a web browser.
It is also expected that you have at least a passing familiarity with
the networking protocols and interfaces involved.  In all honesty, if
you do not have such familiarity, likely as not you have some
experience to gain before attempting network performance measurements.
The excellent texts by authors such as Stevens, Fenner and Rudoff
and/or Stallings would be good starting points.  There are likely other
excellent sources out there as well.

* Menu:

* Getting Netperf Bits::
* Installing Netperf Bits::
* Verifying Installation::

File: netperf.info,  Node: Getting Netperf Bits,  Next: Installing Netperf Bits,  Prev: Installing Netperf,  Up: Installing Netperf

2.1 Getting Netperf Bits
========================

Gzipped tar files of netperf sources can be retrieved via anonymous FTP
(ftp://ftp.netperf.org/netperf) for "released" versions of the bits.
Pre-release versions of the bits can be retrieved via anonymous FTP
from the experimental (ftp://ftp.netperf.org/netperf/experimental)
subdirectory.

   For convenience and ease of remembering, a link to the download site
is provided via the NetperfPage (http://www.netperf.org/).

   The bits corresponding to each discrete release of netperf are
tagged (http://www.netperf.org/svn/netperf2/tags) for retrieval via
subversion.  For example, there is a tag for the first version
corresponding to this version of the manual - netperf 2.6.0
(http://www.netperf.org/svn/netperf2/tags/netperf-2.6.0).  Those
wishing to be on the bleeding edge of netperf development can use
subversion to grab the top of trunk
(http://www.netperf.org/svn/netperf2/trunk).  When fixing bugs or
making enhancements, patches against the top-of-trunk are preferred.

   There are likely other places around the Internet from which one can
download netperf bits.  These may be simple mirrors of the main netperf
site, or they may be local variants on netperf.  As with anything one
downloads from the Internet, take care to make sure it is what you
really wanted and isn't some malicious Trojan or whatnot.  Caveat
downloader.

   As a general rule, binaries of netperf and netserver are not
distributed from ftp.netperf.org.  From time to time a kind soul or
souls has packaged netperf as a Debian package available via the
apt-get mechanism or as an RPM.  I would be most interested in learning
how to enhance the makefiles to make that easier for people.

File: netperf.info,  Node: Installing Netperf Bits,  Next: Verifying Installation,  Prev: Getting Netperf Bits,  Up: Installing Netperf

2.2 Installing Netperf
======================

Once you have downloaded the tar file of netperf sources onto your
system(s), it is necessary to unpack the tar file, cd to the netperf
directory, run configure and then make.  Most of the time it should be
sufficient to just:

     gzcat netperf-<version>.tar.gz | tar xf -
     cd netperf-<version>
     ./configure
     make
     make install

   Most of the "usual" configure script options, dealing with where to
install binaries and whatnot, should be present.
     ./configure --help
   should list all of those and more.  You may find the `--prefix'
option helpful in deciding where the binaries and such will be put
during the `make install'.

   If the netperf configure script does not know how to automagically
detect which CPU utilization mechanism to use on your platform you may
want to add a `--enable-cpuutil=mumble' option to the configure
command.  If you have knowledge and/or experience to contribute to
that area, feel free to contact <netperf-feedback@netperf.org>.

   Similarly, if you want tests using the XTI interface, Unix Domain
Sockets, DLPI or SCTP it will be necessary to add one or more
`--enable-[xti|unixdomain|dlpi|sctp]=yes' options to the configure
command.  As of this writing, the configure script will not include
those tests automagically.
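
   For example, a configure command enabling several of the optional
test suites at once might look like the following sketch; pick only
the options relevant to your platform:

     ./configure --enable-unixdomain=yes --enable-dlpi=yes --enable-sctp=yes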

   Starting with version 2.5.0, netperf began migrating most of the
"classic" netperf tests found in `src/nettest_bsd.c' to the so-called
"omni" tests (aka "two routines to run them all") found in
`src/nettest_omni.c'.  This migration enables a number of new features
such as greater control over what output is included, and new things to
output.  The "omni" test is enabled by default in 2.5.0 and a number of
the classic tests are migrated - you can tell if a test has been
migrated from the presence of `MIGRATED' in the test banner.  If you
encounter problems with either the omni or migrated tests, please first
attempt to obtain resolution via <netperf-talk@netperf.org> or
<netperf-feedback@netperf.org>.  If that is unsuccessful, you can add a
`--enable-omni=no' to the configure command and the omni tests will not
be compiled-in and the classic tests will not be migrated.

   Starting with version 2.5.0, netperf includes the "burst mode"
functionality in a default compilation of the bits.  If you encounter
problems with this, please first attempt to obtain help via
<netperf-talk@netperf.org> or <netperf-feedback@netperf.org>.  If that
is unsuccessful, you can add a `--enable-burst=no' to the configure
command and the burst mode functionality will not be compiled-in.

   On some platforms, it may be necessary to precede the configure
command with a CFLAGS and/or LIBS variable as the netperf configure
script is not yet smart enough to set them itself.  Whenever possible,
these requirements will be found in `README.PLATFORM' files.  Expertise
and assistance in making that more automagic in the configure script
would be most welcome.
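
   By way of illustration only (the exact flags and libraries are
hypothetical and platform-dependent), such an invocation might
resemble:

     CFLAGS="-O2" LIBS="-lsocket -lnsl" ./configure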

   Other optional configure-time settings include
`--enable-intervals=yes' to give netperf the ability to "pace" its
_STREAM tests and `--enable-histogram=yes' to have netperf keep a
histogram of interesting times.  Each of these will have some effect on
the measured result.  If your system supports `gethrtime()' the effect
of the histogram measurement should be minimized but probably still
measurable.  For example, the histogram of a netperf TCP_RR test will
be of the individual transaction times:
     netperf -t TCP_RR -H lag -v 2
     TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET : histogram
     Local /Remote
     Socket Size   Request  Resp.   Elapsed  Trans.
     Send   Recv   Size     Size    Time     Rate
     bytes  Bytes  bytes    bytes   secs.    per sec

     16384  87380  1        1       10.00    3538.82
     32768  32768
     Alignment      Offset
     Local  Remote  Local  Remote
     Send   Recv    Send   Recv
         8      0       0      0
     Histogram of request/response times
     UNIT_USEC     :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
     TEN_USEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
     HUNDRED_USEC  :    0: 34480:  111:   13:   12:    6:    9:    3:    4:    7
     UNIT_MSEC     :    0:   60:   50:   51:   44:   44:   72:  119:  100:  101
     TEN_MSEC      :    0:  105:    0:    0:    0:    0:    0:    0:    0:    0
     HUNDRED_MSEC  :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
     UNIT_SEC      :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
     TEN_SEC       :    0:    0:    0:    0:    0:    0:    0:    0:    0:    0
     >100_SECS: 0
     HIST_TOTAL:      35391

   The histogram you see above is basically a base-10 log histogram
where we can see that most of the transaction times were on the order
of one hundred to one hundred ninety-nine microseconds, but they were
occasionally as long as ten to nineteen milliseconds.

   The `--enable-demo=yes' configure option will cause code to be
included to report interim results during a test run.  The rate at
which interim results are reported can then be controlled via the
global `-D' option.  Here is an example of `-D' output:

     $ src/netperf -D 1.35 -H tardy.hpl.hp.com -f M
     MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com (15.9.116.144) port 0 AF_INET : demo
     Interim result:    5.41 MBytes/s over 1.35 seconds ending at 1308789765.848
     Interim result:   11.07 MBytes/s over 1.36 seconds ending at 1308789767.206
     Interim result:   16.00 MBytes/s over 1.36 seconds ending at 1308789768.566
     Interim result:   20.66 MBytes/s over 1.36 seconds ending at 1308789769.922
     Interim result:   22.74 MBytes/s over 1.36 seconds ending at 1308789771.285
     Interim result:   23.07 MBytes/s over 1.36 seconds ending at 1308789772.647
     Interim result:   23.77 MBytes/s over 1.37 seconds ending at 1308789774.016
     Recv   Send    Send
     Socket Socket  Message  Elapsed
     Size   Size    Size     Time     Throughput
     bytes  bytes   bytes    secs.    MBytes/sec

      87380  16384  16384    10.06      17.81

   Notice how the units of the interim results track what was requested
by the `-f' option.  Also notice that sometimes the interval will be
longer than the value specified in the `-D' option.  This is normal and
stems from how demo mode is implemented: not by relying on interval
timers or frequent calls to get the current time, but by calculating
how many units of work must be performed to take at least the desired
interval.

   Those familiar with this option in earlier versions of netperf will
note the addition of the "ending at" text.  This is the time as
reported by a `gettimeofday()' call (or its emulation) with a `NULL'
timezone pointer.  This addition is intended to make it easier to
insert interim results into an rrdtool
(http://oss.oetiker.ch/rrdtool/doc/rrdtool.en.html) Round-Robin
Database (RRD).  A likely bug-riddled example of doing so can be found
in `doc/examples/netperf_interim_to_rrd.sh'.  The time is reported out
to milliseconds rather than microseconds because that is the finest
resolution rrdtool understood as of the time of this writing.
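
   As a rough, untested sketch of the idea (the awk field positions
assume the interim result format shown above, and `netperf.rrd' is a
hypothetical, previously-created RRD), one could emit one `rrdtool
update' command per interim result:

     $ src/netperf -D 1 -H tardy.hpl.hp.com -f M | \
         awk '/^Interim result:/ {print "rrdtool update netperf.rrd " $NF ":" $3}'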

   As of this writing, a `make install' will not actually update the
files `/etc/services' and/or `/etc/inetd.conf' or their
platform-specific equivalents.  It remains necessary to perform that
bit of installation magic by hand.  Patches to the makefile sources to
effect an automagic editing of the necessary files to have netperf
installed as a child of inetd would be most welcome.
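
   By way of illustration only (the service name, port and path shown
are typical, but check them against your platform's conventions), the
by-hand additions might look like:

     # addition to /etc/services
     netperf  12865/tcp

     # addition to /etc/inetd.conf
     netperf stream tcp nowait root /opt/netperf/netserver netserver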

   Starting the netserver as a standalone daemon should be as easy as:
     $ netserver
     Starting netserver at port 12865
     Starting netserver at hostname 0.0.0.0 port 12865 and family 0

   Over time the specifics of the messages netserver prints to the
screen may change but the gist will remain the same.
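
   If the default control port is already in use, netserver can be
told to listen on another port, and netperf pointed at it, via their
respective `-p' options (16000 here is an arbitrary example):

     $ netserver -p 16000
     $ netperf -H remotehost -p 16000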

   If the compilation of netperf or netserver happens to fail, feel free
to contact <netperf-feedback@netperf.org> or join and ask in
<netperf-talk@netperf.org>.  However, it is quite important that you
include the actual compilation errors and perhaps even the configure
log in your email.  Otherwise, it will be that much more difficult for
someone to assist you.

File: netperf.info,  Node: Verifying Installation,  Prev: Installing Netperf Bits,  Up: Installing Netperf

2.3 Verifying Installation
==========================

Basically, once netperf is installed and netserver is configured as a
child of inetd, or launched as a standalone daemon, simply typing:
     netperf
   should result in output similar to the following:
     $ netperf
     TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET
     Recv   Send    Send
     Socket Socket  Message  Elapsed
     Size   Size    Size     Time     Throughput
     bytes  bytes   bytes    secs.    10^6bits/sec

      87380  16384  16384    10.00    2997.84

File: netperf.info,  Node: The Design of Netperf,  Next: Global Command-line Options,  Prev: Installing Netperf,  Up: Top

3 The Design of Netperf
***********************

Netperf is designed around a basic client-server model.  There are two
executables - netperf and netserver.  Generally you will only execute
the netperf program, with the netserver program being invoked by the
remote system's inetd or having been previously started as its own
standalone daemon.

   When you execute netperf it will establish a "control connection" to
the remote system.  This connection will be used to pass test
configuration information and results to and from the remote system.
Regardless of the type of test to be run, the control connection will
be a TCP connection using BSD sockets.  The control connection can use
either IPv4 or IPv6.

   Once the control connection is up and the configuration information
has been passed, a separate "data" connection will be opened for the
measurement itself using the APIs and protocols appropriate for the
specified test.  When the test is completed, the data connection will
be torn down and results from the netserver will be passed back via the
control connection and combined with netperf's results for display to
the user.

   Netperf places no traffic on the control connection while a test is
in progress.  Certain TCP options, such as SO_KEEPALIVE, if set as your
systems' default, may put packets out on the control connection while a
test is in progress.  Generally speaking this will have no effect on
the results.

* Menu:

* CPU Utilization::

File: netperf.info,  Node: CPU Utilization,  Prev: The Design of Netperf,  Up: The Design of Netperf

3.1 CPU Utilization
===================

CPU utilization is an important, and alas all-too-infrequently
reported, component of networking performance.  Unfortunately, it can
be one of the most difficult metrics to measure accurately and
portably.  Netperf will do its level best to report accurate CPU
utilization figures, but some combinations of processor, OS and
configuration may make that difficult.

   CPU utilization in netperf is reported as a value between 0 and 100%
regardless of the number of CPUs involved.  In addition to CPU
utilization, netperf will report a metric called a "service demand".
The service demand is the normalization of CPU utilization and work
performed.  For a _STREAM test it is the microseconds of CPU time
consumed to transfer one KB (K == 1024) of data.  For a _RR test it is
the microseconds of CPU time consumed processing a single transaction.
For both CPU utilization and service demand, lower is better.
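
   As a worked example (the numbers are invented for illustration): if
a single-CPU system shows 20% CPU utilization while a _STREAM test
moves 800,000 KB in 10 seconds, then 2 CPU-seconds (2,000,000
microseconds of CPU time) were consumed to move 800,000 KB, for a
service demand of 2.5 microseconds per KB.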

   Service demand can be particularly useful when trying to gauge the
effect of a performance change.  It is essentially a measure of
efficiency, with smaller values being more efficient and thus "better."

   Netperf is coded to be able to use one of several, generally
platform-specific CPU utilization measurement mechanisms.  Single
letter codes will be included in the CPU portion of the test banner to
indicate which mechanism was used on each of the local (netperf) and
remote (netserver) system.

   As of this writing those codes are:

`U'
     The CPU utilization measurement mechanism was unknown to netperf or
     netperf/netserver was not compiled to include CPU utilization
     measurements.  The code for the null CPU utilization mechanism can
     be found in `src/netcpu_none.c'.

`I'
     An HP-UX-specific CPU utilization mechanism whereby the kernel
     incremented a per-CPU counter by one for each trip through the idle
     loop.  This mechanism was only available on specially-compiled
     HP-UX kernels prior to HP-UX 10 and is mentioned here only for the
     sake of historical completeness and perhaps as a suggestion to
     those who might be altering other operating systems.  While rather
     simple, perhaps even simplistic, this mechanism was quite robust
     and was not affected by the concerns of statistical methods, or
     methods attempting to track time in each of user, kernel,
     interrupt and idle modes which require quite careful accounting.
     It can be thought-of as the in-kernel version of the looper `L'
     mechanism without the context switch overhead.  This mechanism
     required calibration.

`P'
     An HP-UX-specific CPU utilization mechanism whereby the kernel
     keeps track of time (in the form of CPU cycles) spent in the kernel
     idle loop (HP-UX 10.0 to 11.31 inclusive), or where the kernel
     keeps track of time spent in idle, user, kernel and interrupt
     processing (HP-UX 11.23 and later).  The former requires
     calibration, the latter does not.  Values in either case are
     retrieved via one of the pstat(2) family of calls, hence the use
     of the letter `P'.  The code for these mechanisms is found in
     `src/netcpu_pstat.c' and `src/netcpu_pstatnew.c' respectively.

`K'
     A Solaris-specific CPU utilization mechanism whereby the kernel
     keeps track of ticks (eg HZ) spent in the idle loop.  This method
     is statistical and is known to be inaccurate when the interrupt
     rate is above epsilon as time spent processing interrupts is not
     subtracted from idle.  The value is retrieved via a kstat() call -
     hence the use of the letter `K'.  Since this mechanism uses units
     of ticks (HZ) the calibration value should invariably match HZ
     (eg 100).  The code for this mechanism is implemented in
     `src/netcpu_kstat.c'.

`M'
     A Solaris-specific mechanism available on Solaris 10 and later
     which uses the new microstate accounting mechanisms.  There are
     two, alas, overlapping, mechanisms.  The first tracks nanoseconds
     spent in user, kernel, and idle modes.  The second mechanism tracks
     nanoseconds spent in interrupt.  Since the mechanisms overlap,
     netperf goes through some hand-waving to try to "fix" the problem.
     Since the accuracy of the handwaving cannot be completely
     determined, one must presume that while better than the `K'
     mechanism, this mechanism too is not without issues.  The values
     are retrieved via kstat() calls, but the letter code is set to `M'
     to distinguish this mechanism from the even less accurate `K'
     mechanism.  The code for this mechanism is implemented in
     `src/netcpu_kstat10.c'.

`L'
     A mechanism based on "looper" or "soaker" processes which sit in
     tight loops counting as fast as they possibly can.  This mechanism
     starts a looper process for each known CPU on the system.  The
     effect of processor hyperthreading on the mechanism is not yet
     known.  This mechanism definitely requires calibration.  The code
     for the "looper" mechanism can be found in `src/netcpu_looper.c'.

`N'
     A Microsoft Windows-specific mechanism, the code for which can be
     found in `src/netcpu_ntperf.c'.  This mechanism too is based on
     what appears to be a form of micro-state accounting and requires no
     calibration.  On laptops, or other systems which may dynamically
     alter the CPU frequency to minimize power consumption, it has been
     suggested that this mechanism may become slightly confused, in
     which case using BIOS/uEFI settings to disable the power saving
     would be indicated.

`S'
     This mechanism uses `/proc/stat' on Linux to retrieve time (ticks)
     spent in idle mode.  It is thought but not known to be reasonably
     accurate.  The code for this mechanism can be found in
     `src/netcpu_procstat.c'.

`C'
     A mechanism somewhat similar to `S' but using the sysctl() call on
     BSD-like operating systems (*BSD and Mac OS X).  The code for this
     mechanism can be found in `src/netcpu_sysctl.c'.

`Others'
     Other mechanisms included in netperf in the past have included
     using the times() and getrusage() calls.  These calls are actually
     rather poorly suited to the task of measuring CPU overhead for
     networking as they tend to be process-specific and much
     network-related processing can happen outside the context of a
     process, in places where it is not a given that it will be charged
     to the correct process, or to any process at all.  They are
     mentioned here as a warning to anyone seeing those mechanisms used
     in other networking benchmarks.  These mechanisms are not
     available in netperf 2.4.0 and later.

   For many platforms, the configure script will choose the best
available CPU utilization mechanism.  However, some platforms have no
particularly good mechanisms.  On those platforms, it is probably best
to use the "LOOPER" mechanism which is basically some number of
processes (as many as there are processors) sitting in tight little
loops counting as fast as they can.  The rate at which the loopers
count when the system is believed to be idle is compared with the rate
when the system is running netperf and the ratio is used to compute CPU
utilization.

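   As a rough illustration of that arithmetic (numbers invented): if
the loopers count to 1,000,000 per second while the system is idle but
only to 600,000 per second while netperf runs, the system is deemed to
be 100 * (1 - 600,000/1,000,000) = 40% utilized.
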
   In the past, netperf included some mechanisms that only reported CPU
time charged to the calling process.  Those mechanisms have been
removed from netperf versions 2.4.0 and later because they are
hopelessly inaccurate.  Networking can and often does result in CPU
time being spent in places - such as interrupt contexts - that do not
get charged to any process, let alone the correct one.

   In fact, time spent in the processing of interrupts is a common issue
for many CPU utilization mechanisms.  In particular, the "PSTAT"
mechanism was eventually known to have problems accounting for certain
interrupt time prior to HP-UX 11.11 (11iv1).  HP-UX 11iv2 and later are
known/presumed to be good.  The "KSTAT" mechanism is known to have
problems on all versions of Solaris up to and including Solaris 10.
Even the microstate accounting available via kstat in Solaris 10 has
issues, though perhaps not as bad as those of prior versions.

   The /proc/stat mechanism under Linux is in what the author would
consider an "uncertain" category as it appears to be statistical, which
may also have issues with time spent processing interrupts.

   In summary, be sure to "sanity-check" the CPU utilization figures
with other mechanisms.  However, platform tools such as top, vmstat or
mpstat are often based on the same mechanisms used by netperf.

* Menu:

* CPU Utilization in a Virtual Guest::

File: netperf.info,  Node: CPU Utilization in a Virtual Guest,  Prev: CPU Utilization,  Up: CPU Utilization

3.1.1 CPU Utilization in a Virtual Guest
----------------------------------------

The CPU utilization mechanisms used by netperf are "inline" in that
they are run by the same netperf or netserver process as is running the
test itself.  This works just fine for "bare iron" tests but runs into
a problem when using virtual machines.

   The relationship between virtual guest and hypervisor can be thought
of as being similar to that between a process and kernel in a bare iron
system.  As such, (m)any CPU utilization mechanisms used in the virtual
guest are similar to "process-local" mechanisms in a bare iron
situation.  However, just as with bare iron and process-local
mechanisms, much networking processing happens outside the context of
the virtual guest.  It takes place in the hypervisor, and is not
visible to mechanisms running in the guest(s).  For this reason, one
should not really trust CPU utilization figures reported by netperf or
netserver when running in a virtual guest.

   If one is looking to measure the added overhead of a virtualization
mechanism, rather than rely on CPU utilization, one can rely instead on
netperf _RR tests - path-lengths and overheads can be a significant
fraction of the latency, so increases in overhead should appear as
decreases in transaction rate.  Whatever you do, DO NOT rely on the
throughput of a _STREAM test.  Achieving link-rate can be done via a
multitude of options that mask overhead rather than eliminate it.

File: netperf.info,  Node: Global Command-line Options,  Next: Using Netperf to Measure Bulk Data Transfer,  Prev: The Design of Netperf,  Up: Top

4 Global Command-line Options
*****************************

This section describes each of the global command-line options
available in the netperf and netserver binaries.  Essentially, it is an
expanded version of the usage information displayed by netperf or
netserver when invoked with the `-h' global command-line option.

* Menu:

* Command-line Options Syntax::
* Global Options::

File: netperf.info,  Node: Command-line Options Syntax,  Next: Global Options,  Prev: Global Command-line Options,  Up: Global Command-line Options

4.1 Command-line Options Syntax
===============================

Revision 1.8 of netperf introduced enough new functionality to overrun
the English alphabet for mnemonic command-line option names, and the
author was not and is not quite ready to switch to the contemporary
`--mumble' style of command-line options.  (Call him a Luddite if you
wish :).

   For this reason, the command-line options were split into two parts -
the first are the global command-line options.  They are options that
affect nearly any and every test type of netperf.  The second type are
the test-specific command-line options.  Both are entered on the same
command line, but they must be separated from one another by a `--' for
correct parsing.  Global command-line options come first, followed by
the `--' and then test-specific command-line options.  If there are no
test-specific options to be set, the `--' may be omitted.  If there are
no global command-line options to be set, test-specific options must
still be preceded by a `--'.  For example:
     netperf <global> -- <test-specific>
   sets both global and test-specific options:
     netperf <global>
   sets just global options and:
     netperf -- <test-specific>
   sets just test-specific options.

File: netperf.info,  Node: Global Options,  Prev: Command-line Options Syntax,  Up: Global Command-line Options

4.2 Global Options
==================

`-a <sizespec>'
     This option allows you to alter the alignment of the buffers used
     in the sending and receiving calls on the local system.  Changing
     the alignment of the buffers can force the system to use different
     copy schemes, which can have a measurable effect on performance.
     If the page size for the system were 4096 bytes, and you want to
     pass page-aligned buffers beginning on page boundaries, you could
     use `-a 4096'.  By default the units are bytes, but a suffix of
     "G," "M," or "K" will specify the units to be 2^30 (GB), 2^20 (MB)
     or 2^10 (KB) respectively.  A suffix of "g," "m" or "k" will
     specify units of 10^9, 10^6 or 10^3 bytes respectively.  [Default:
     8 bytes]

`-A <sizespec>'
     This option is identical to the `-a' option with the difference
     being it affects alignments for the remote system.

`-b <size>'
     This option is only present when netperf has been configured with
     `--enable-intervals=yes' prior to compilation.  It sets the size
     of the burst of send calls in a _STREAM test.  When used in
     conjunction with the `-w' option it can cause the rate at which
     data is sent to be "paced."

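     For example (a sketch, assuming netperf was configured with
     `--enable-intervals=yes'; the host name "remote" is merely
     illustrative), the following asks for a burst of one 1024-byte
     send roughly every 10 milliseconds, or about 819 kbit/s:

          netperf -H remote -b 1 -w 10 -- -m 1024
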
`-B <string>'
     This option will cause `<string>' to be appended to the brief (see
     -P) output of netperf.

`-c [rate]'
     This option will ask that CPU utilization and service demand be
     calculated for the local system.  For those CPU utilization
     mechanisms requiring calibration, the optional rate parameter may
     be specified to preclude running another calibration step, saving
     40 seconds of time.  For those CPU utilization mechanisms
     requiring no calibration, the optional rate parameter will be
     utterly and completely ignored.  [Default: no CPU measurements]

`-C [rate]'
     This option requests CPU utilization and service demand
     calculations for the remote system.  It is otherwise identical to
     the `-c' option.

`-d'
     Each instance of this option will increase the quantity of
     debugging output displayed during a test.  If the debugging output
     level is set high enough, it may have a measurable effect on
     performance.  Debugging information for the local system is
     printed to stdout.  Debugging information for the remote system is
     sent by default to the file `/tmp/netperf.debug'.  [Default: no
     debugging output]

`-D [interval,units]'
     This option is only available when netperf is configured with
     `--enable-demo=yes'.  When set, it will cause netperf to emit
     periodic reports of performance during the run.  [INTERVAL,UNITS]
     follow the semantics of an optionspec.  If specified, INTERVAL
     gives the minimum interval in real seconds; it does not have to be
     whole seconds.  The UNITS value can be used for the first guess as
     to how many units of work (bytes or transactions) must be done to
     take at least INTERVAL seconds.  If omitted, INTERVAL defaults to
     one second and UNITS to values specific to each test type.

`-f G|M|K|g|m|k|x'
     This option can be used to change the reporting units for _STREAM
     tests.  Arguments of "G," "M," or "K" will set the units to 2^30,
     2^20 or 2^10 bytes/s respectively (EG power of two GB, MB or KB).
     Arguments of "g," "m" or "k" will set the units to 10^9, 10^6 or
     10^3 bits/s respectively.  An argument of "x" requests the units
     be transactions per second and is only meaningful for a
     request-response test.  [Default: "m" or 10^6 bits/s]

`-F <fillfile>'
     This option specifies the file from which the send buffers will be
     pre-filled.  While the buffers will contain data from the
     specified file, the file is not fully transferred to the remote
     system as the receiving end of the test will not write the
     contents of what it receives to a file.  This can be used to
     pre-fill the send buffers with data having different
     compressibility and so is useful when measuring performance over
     mechanisms which perform compression.

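     As an illustrative sketch (the file name and host name are
     invented), one might create a deliberately incompressible fill
     file and point netperf at it:

          dd if=/dev/urandom of=/tmp/random.dat bs=1M count=1
          netperf -H remote -F /tmp/random.dat
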
     While previously required for a TCP_SENDFILE test, later versions
     of netperf removed that restriction, creating a temporary file as
     needed.  While the author cannot recall exactly when that took
     place, it is known to be unnecessary in version 2.5.0 and later.

`-h'
     This option causes netperf to display its "global" usage string and
     exit to the exclusion of all else.

`-H <optionspec>'
     This option will set the name of the remote system and/or the
     address family used for the control connection.  For example:
          -H linger,4
     will set the name of the remote system to "linger" and tells
     netperf to use IPv4 addressing only.
          -H ,6
     will leave the name of the remote system at its default, and
     request that only IPv6 addresses be used for the control
     connection.
          -H lag
     will set the name of the remote system to "lag" and leave the
     address family to AF_UNSPEC which means selection of IPv4 vs IPv6
     is left to the system's address resolution.

     A value of "inet" can be used in place of "4" to request IPv4 only
     addressing.  Similarly, a value of "inet6" can be used in place of
     "6" to request IPv6 only addressing.  A value of "0" can be used
     to request either IPv4 or IPv6 addressing as name resolution
     dictates.

     By default, the options set with the global `-H' option are
     inherited by the test for its data connection, unless a
     test-specific `-H' option is specified.

     If a `-H' option follows either the `-4' or `-6' options, the
     family setting specified with the `-H' option will override the
     `-4' or `-6' options for the remote address family.  If no address
     family is specified, settings from a previous `-4' or `-6' option
     will remain.  In a nutshell, the last explicit global command-line
     option wins.

     [Default:  "localhost" for the remote name/IP address and "0" (eg
     AF_UNSPEC) for the remote address family.]

`-I <optionspec>'
     This option enables the calculation of confidence intervals and
     sets the confidence and width parameters with the first half of the
     optionspec being either 99 or 95 for 99% or 95% confidence
     respectively.  The second value of the optionspec specifies the
     width of the desired confidence interval.  For example
          -I 99,5
     asks netperf to be 99% confident that the measured mean values for
     throughput and CPU utilization are within +/- 2.5% of the "real"
     mean values.  If the `-i' option is specified and the `-I' option
     is omitted, the confidence defaults to 99% and the width to 5%
     (giving +/- 2.5%).

     If a classic netperf test calculates that the desired confidence
     intervals have not been met, it emits a noticeable warning that
     cannot be suppressed with the `-P' or `-v' options:

          netperf -H tardy.cup -i 3 -I 99,5
          TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.cup.hp.com (15.244.44.58) port 0 AF_INET : +/-2.5%  99% conf.
          !!! WARNING
          !!! Desired confidence was not achieved within the specified iterations.
          !!! This implies that there was variability in the test environment that
          !!! must be investigated before going further.
          !!! Confidence intervals: Throughput      :  6.8%
          !!!                       Local CPU util  :  0.0%
          !!!                       Remote CPU util :  0.0%

          Recv   Send    Send
          Socket Socket  Message  Elapsed
          Size   Size    Size     Time     Throughput
          bytes  bytes   bytes    secs.    10^6bits/sec

           32768  16384  16384    10.01      40.23

     In the example above we see that netperf did not meet the desired
     confidence intervals.  Instead of being 99% confident it was within
     +/- 2.5% of the real mean value of throughput, it is only confident
     it was within +/- 3.4%.  In this example, increasing the `-i'
     option (described below) and/or increasing the iteration length
     with the `-l' option might resolve the situation.

     In an explicit "omni" test, failure to meet the confidence
     intervals will not result in netperf emitting a warning.  To
     verify the hitting, or not, of the confidence intervals one will
     need to include them as part of an *note output selection: Omni
     Output Selection. in the test-specific `-o', `-O' or `-k' output
     selection options.  The warning about not hitting the confidence
     intervals will remain in a "migrated" classic netperf test.

`-i <sizespec>'
     This option enables the calculation of confidence intervals and
     sets the minimum and maximum number of iterations to run in
     attempting to achieve the desired confidence interval.  The first
     value sets the maximum number of iterations to run, the second,
     the minimum.  The maximum number of iterations is silently capped
     at 30 and the minimum is silently floored at 3.  Netperf repeats
     the measurement the minimum number of iterations and continues
     until it reaches either the desired confidence interval, or the
     maximum number of iterations, whichever comes first.  A classic or
     migrated netperf test will not display the actual number of
     iterations run.  An *note omni test: The Omni Tests. will emit the
     number of iterations run if the `CONFIDENCE_ITERATION' output
     selector is included in the *note output selection: Omni Output
     Selection.

     If the `-I' option is specified and the `-i' option omitted the
     maximum number of iterations is set to 10 and the minimum to three.

     Output of a warning upon not hitting the desired confidence
     intervals follows the description provided for the `-I' option.

     The total test time will be somewhere between the minimum and
     maximum number of iterations multiplied by the test length
     supplied by the `-l' option.

`-j'
     This option instructs netperf to keep additional timing statistics
     when explicitly running an *note omni test: The Omni Tests.  These
     can be output when the test-specific `-o', `-O' or `-k' *note
     output selectors: Omni Output Selectors. include one or more of:

        * MIN_LATENCY

        * MAX_LATENCY

        * P50_LATENCY

        * P90_LATENCY

        * P99_LATENCY

        * MEAN_LATENCY

        * STDDEV_LATENCY

     These statistics will be based on an expanded (100 buckets per row
     rather than 10) histogram of times rather than a terribly long
     list of individual times.  As such, there will be some slight
     error thanks to the bucketing.  However, the reduction in storage
     and processing overheads is well worth it.  When running a
     request/response test, one might get some idea of the error by
     comparing the *note `MEAN_LATENCY': Omni Output Selectors.
     calculated from the histogram with the `RT_LATENCY' calculated
     from the number of request/response transactions and the test run
     time.

     In the case of a request/response test the latencies will be
     transaction latencies.  In the case of a receive-only test they
     will be time spent in the receive call.  In the case of a
     send-only test they will be time spent in the send call.  The units
     will be microseconds.  Added in netperf 2.5.0.

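     By way of example (a sketch; the host name is illustrative and
     the selectors are described with the Omni Output Selectors):

          netperf -t omni -j -H remote -- -o MIN_LATENCY,MEAN_LATENCY,P99_LATENCY
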
`-l testlen'
     This option controls the length of any one iteration of the
     requested test.  A positive value for TESTLEN will run each
     iteration of the test for at least TESTLEN seconds.  A negative
     value for TESTLEN will run each iteration for the absolute value of
     TESTLEN transactions for a _RR test or bytes for a _STREAM test.
     Certain tests, notably those using UDP, can only be timed; they
     cannot be limited by transaction or byte count.  This limitation
     may be relaxed in an *note omni: The Omni Tests. test.

     In some situations, individual iterations of a test may run for
     longer than the number of seconds specified by the `-l' option.
     In particular, this may occur for those tests where the socket
     buffer size(s) are significantly larger than the bandwidth-delay
     product of the link(s) over which the data connection passes, or
     those tests where there may be non-trivial numbers of
     retransmissions.

     If confidence intervals are enabled via either `-I' or `-i' the
     total length of the netperf test will be somewhere between the
     minimum and maximum iteration count multiplied by TESTLEN.

`-L <optionspec>'
     This option is identical to the `-H' option with the difference
     being it sets the _local_ hostname/IP and/or address family
     information.  This option is generally unnecessary, but can be
     useful when you wish to make sure that the netperf control and data
     connections go via different paths.  It can also come in handy if
     one is trying to run netperf through those evil, end-to-end
     breaking things known as firewalls.

     [Default: 0.0.0.0 (eg INADDR_ANY) for IPv4 and ::0 for IPv6 for the
     local name.  AF_UNSPEC for the local address family.]

`-n numcpus'
     This option tells netperf how many CPUs it should ass-u-me are
     active on the system running netperf.  In particular, this is used
     for the *note CPU utilization: CPU Utilization. and service demand
     calculations.  On certain systems, netperf is able to determine
     the number of CPUs automagically.  This option will override any
     number netperf might be able to determine on its own.

     Note that this option does _not_ set the number of CPUs on the
     system running netserver.  When netperf/netserver cannot
     automagically determine the number of CPUs, that can only be set
     for netserver via a netserver `-n' command-line option.

     As it is almost universally possible for netperf/netserver to
     determine the number of CPUs on the system automagically, 99 times
     out of 100 this option should not be necessary and may be removed
     in a future release of netperf.

`-N'
     This option tells netperf to forgo establishing a control
     connection.  This makes it possible to run some limited netperf
     tests without a corresponding netserver on the remote system.

     With this option set, the test to be run must get all the
     addressing information it needs to establish its data connection
     from the command line or internal defaults.  If not otherwise
     specified by test-specific command line options, the data
     connection for a "STREAM" or "SENDFILE" test will be to the
     "discard" port, an "RR" test will be to the "echo" port, and a
     "MAERTS" test will be to the "chargen" port.

     The response size of an "RR" test will be silently set to be the
     same as the request size.  Otherwise the test would hang if the
     response size was larger than the request size, or would report an
     incorrect, inflated transaction rate if the response size was less
     than the request size.

     Since there is no control connection when this option is
     specified, it is not possible to set "remote" properties such as
     socket buffer size and the like via the netperf command line.  Nor
     is it possible to retrieve such interesting remote information as
     CPU utilization.  These items will be displayed as values which
     should make it immediately obvious that was the case.

     The only way to change remote characteristics such as socket buffer
     size or to obtain information such as CPU utilization is to employ
     platform-specific methods on the remote system.  Frankly, if one
     has access to the remote system to employ those methods one ought
     to be able to run a netserver there.  However, that ability may
     not be present in certain "support" situations, hence the addition
     of this option.

     Added in netperf 2.4.3.

`-o <sizespec>'
     The value(s) passed-in with this option will be used as an offset
     added to the alignment specified with the `-a' option.  For
     example:
          -o 3 -a 4096
     will cause the buffers passed to the local (netperf) send and
     receive calls to begin three bytes past an address aligned to 4096
     bytes.  [Default: 0 bytes]

`-O <sizespec>'
     This option behaves just as the `-o' option but on the remote
     (netserver) system and in conjunction with the `-A' option.
     [Default: 0 bytes]

`-p <optionspec>'
     The first value of the optionspec passed-in with this option tells
     netperf the port number at which it should expect the remote
     netserver to be listening for control connections.  The second
     value of the optionspec will request netperf to bind to that local
     port number before establishing the control connection.  For
     example
          -p 12345
     tells netperf that the remote netserver is listening on port 12345
     and leaves selection of the local port number for the control
     connection up to the local TCP/IP stack whereas
          -p ,32109
     leaves the remote netserver port at the default value of 12865 and
     causes netperf to bind to the local port number 32109 before
     connecting to the remote netserver.

     In general, setting the local port number is only necessary when
     one is looking to run netperf through those evil, end-to-end
     breaking things known as firewalls.

`-P 0|1'
     A value of "1" for the `-P' option will enable display of the test
     banner.  A value of "0" will disable display of the test banner.
     One might want to disable display of the test banner when running
     the same basic test type (eg TCP_STREAM) multiple times in
     succession where the test banners would then simply be redundant
     and unnecessarily clutter the output.  [Default: 1 - display test
     banners]

`-s <seconds>'
     This option will cause netperf to sleep `<seconds>' before
     actually transferring data over the data connection.  This may be
     useful in situations where one wishes to start a great many netperf
     instances and does not want the earlier ones affecting the ability
     of the later ones to get established.

     Added somewhere between versions 2.4.3 and 2.5.0.

`-S'
     This option will cause an attempt to be made to set SO_KEEPALIVE on
     the data socket of a test using the BSD sockets interface.  The
     attempt will be made on the netperf side of all tests, and will be
     made on the netserver side of an *note omni: The Omni Tests. or
     *note migrated: Migrated Tests. test.  No indication of failure is
     given unless debug output is enabled with the global `-d' option.

     Added in version 2.5.0.

`-t testname'
     This option is used to tell netperf which test you wish to run.
     As of this writing, valid values for TESTNAME include:
        * *note TCP_STREAM::, *note TCP_MAERTS::, *note TCP_SENDFILE::,
          *note TCP_RR::, *note TCP_CRR::, *note TCP_CC::

        * *note UDP_STREAM::, *note UDP_RR::

        * *note XTI_TCP_STREAM::, *note XTI_TCP_RR::, *note
          XTI_TCP_CRR::, *note XTI_TCP_CC::

        * *note XTI_UDP_STREAM::, *note XTI_UDP_RR::

        * *note SCTP_STREAM::, *note SCTP_RR::

        * *note DLCO_STREAM::, *note DLCO_RR::, *note DLCL_STREAM::,
          *note DLCL_RR::

        * *note LOC_CPU: Other Netperf Tests, *note REM_CPU: Other
          Netperf Tests.

        * *note OMNI: The Omni Tests.
     Not all tests are always compiled into netperf.  In particular, the
     "XTI," "SCTP," "UNIXDOMAIN," and "DL*" tests are only included in
     netperf when configured with
     `--enable-[xti|sctp|unixdomain|dlpi]=yes'.

     Netperf only runs one type of test no matter how many `-t' options
     may be present on the command-line.  The last `-t' global
     command-line option will determine the test to be run.  [Default:
     TCP_STREAM]

`-T <optionspec>'
     This option controls the CPU, and probably by extension memory,
     affinity of netperf and/or netserver.
          netperf -T 1
     will bind both netperf and netserver to "CPU 1" on their respective
     systems.
          netperf -T 1,
     will bind just netperf to "CPU 1" and will leave netserver unbound.
          netperf -T ,2
     will leave netperf unbound and will bind netserver to "CPU 2."
          netperf -T 1,2
     will bind netperf to "CPU 1" and netserver to "CPU 2."

     This can be particularly useful when investigating performance
     issues involving where processes run relative to where NIC
     interrupts are processed or where NICs allocate their DMA buffers.

1151`-v verbosity'
1152     This option controls how verbose netperf will be in its output,
1153     and is often used in conjunction with the `-P' option. If the
1154     verbosity is set to a value of "0" then only the test's SFM (Single
1155     Figure of Merit) is displayed.  If local *note CPU utilization:
1156     CPU Utilization. is requested via the `-c' option then the SFM is
1157     the local service demand.  Othersise, if remote CPU utilization is
1158     requested via the `-C' option then the SFM is the remote service
1159     demand.  If neither local nor remote CPU utilization are requested
1160     the SFM will be the measured throughput or transaction rate as
1161     implied by the test specified with the `-t' option.
1162
1163     If the verbosity level is set to "1" then the "normal" netperf
1164     result output for each test is displayed.
1165
1166     If the verbosity level is set to "2" then "extra" information will
1167     be displayed.  This may include, but is not limited to the number
1168     of send or recv calls made and the average number of bytes per
1169     send or recv call, or a histogram of the time spent in each send()
1170     call or for each transaction if netperf was configured with
1171     `--enable-histogram=yes'. [Default: 1 - normal verbosity]
1172
1173     In an *note omni: The Omni Tests. test the verbosity setting is
1174     largely ignored, save for when asking for the time histogram to be
1175     displayed.  In version 2.5.0 and later there is no *note output
1176     selector: Omni Output Selectors. for the histogram and so it
1177     remains displayed only when the verbosity level is set to 2.
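
     For example, one might combine `-v 0' with `-P 0' to emit just the
     single figure of merit, which can be handy in scripts (remotehost
     is a placeholder):

          netperf -H remotehost -v 0 -P 0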
1178
1179`-V'
1180     This option displays the netperf version and then exits.
1181
1182     Added in netperf 2.4.4.
1183
1184`-w time'
1185     If netperf was configured with `--enable-intervals=yes' then this
     value will set the inter-burst time to `time' milliseconds, and the
1187     `-b' option will set the number of sends per burst.  The actual
1188     inter-burst time may vary depending on the system's timer
1189     resolution.
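
     For example, assuming netperf was configured with
     `--enable-intervals=yes', something along the lines of:

          netperf -t UDP_STREAM -H remotehost -b 8 -w 10

     would attempt to pace bursts of eight sends every 10 milliseconds
     onto the network.  Here remotehost is a placeholder.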
1190
1191`-W <sizespec>'
1192     This option controls the number of buffers in the send (first or
     only value) and/or receive (second or only value) buffer rings.
1194     Unlike some benchmarks, netperf does not continuously send or
1195     receive from a single buffer.  Instead it rotates through a ring of
1196     buffers. [Default: One more than the size of the send or receive
1197     socket buffer sizes (`-s' and/or `-S' options) divided by the send
1198     `-m' or receive `-M' buffer size respectively]
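
     As a worked example of that default: with a 131072 byte send
     socket buffer (`-s 128K') and a 16384 byte send size (`-m 16K')
     the send ring would default to (131072 / 16384) + 1 or 9 buffers.
     A sketch explicitly requesting four buffers in each ring
     (remotehost is a placeholder):

          netperf -W 4,4 -H remotehost -- -s 128K -m 16K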
1199
1200`-4'
1201     Specifying this option will set both the local and remote address
1202     families to AF_INET - that is use only IPv4 addresses on the
1203     control connection.  This can be overridden by a subsequent `-6',
1204     `-H' or `-L' option.  Basically, the last option explicitly
1205     specifying an address family wins.  Unless overridden by a
1206     test-specific option, this will be inherited for the data
1207     connection as well.
1208
1209`-6'
     Specifying this option will set both the local and remote address
1211     families to AF_INET6 - that is use only IPv6 addresses on the
1212     control connection.  This can be overridden by a subsequent `-4',
1213     `-H' or `-L' option.  Basically, the last address family
1214     explicitly specified wins.  Unless overridden by a test-specific
1215     option, this will be inherited for the data connection as well.
1216
1217
1218
1219File: netperf.info,  Node: Using Netperf to Measure Bulk Data Transfer,  Next: Using Netperf to Measure Request/Response,  Prev: Global Command-line Options,  Up: Top
1220
12215 Using Netperf to Measure Bulk Data Transfer
1222*********************************************
1223
1224The most commonly measured aspect of networked system performance is
1225that of bulk or unidirectional transfer performance.  Everyone wants to
1226know how many bits or bytes per second they can push across the
1227network. The classic netperf convention for a bulk data transfer test
1228name is to tack a "_STREAM" suffix to a test name.
1229
1230* Menu:
1231
1232* Issues in Bulk Transfer::
1233* Options common to TCP UDP and SCTP tests::
1234
1235
1236File: netperf.info,  Node: Issues in Bulk Transfer,  Next: Options common to TCP UDP and SCTP tests,  Prev: Using Netperf to Measure Bulk Data Transfer,  Up: Using Netperf to Measure Bulk Data Transfer
1237
12385.1 Issues in Bulk Transfer
1239===========================
1240
1241There are any number of things which can affect the performance of a
1242bulk transfer test.
1243
1244   Certainly, absent compression, bulk-transfer tests can be limited by
1245the speed of the slowest link in the path from the source to the
1246destination.  If testing over a gigabit link, you will not see more
1247than a gigabit :) Such situations can be described as being
1248"network-limited" or "NIC-limited".
1249
1250   CPU utilization can also affect the results of a bulk-transfer test.
1251If the networking stack requires a certain number of instructions or
1252CPU cycles per KB of data transferred, and the CPU is limited in the
1253number of instructions or cycles it can provide, then the transfer can
1254be described as being "CPU-bound".
1255
1256   A bulk-transfer test can be CPU bound even when netperf reports less
1257than 100% CPU utilization.  This can happen on an MP system where one
or more of the CPUs saturate at 100% but other CPUs remain idle.
1259Typically, a single flow of data, such as that from a single instance
of a netperf _STREAM test, cannot make use of much more than the power
1261of one CPU. Exceptions to this generally occur when netperf and/or
1262netserver run on CPU(s) other than the CPU(s) taking interrupts from
1263the NIC(s). In that case, one might see as much as two CPUs' worth of
1264processing being used to service the flow of data.
1265
1266   Distance and the speed-of-light can affect performance for a
1267bulk-transfer; often this can be mitigated by using larger windows.
1268One common limit to the performance of a transport using window-based
1269flow-control is:
1270     Throughput <= WindowSize/RoundTripTime
   This is because the sender can only have a window's-worth of data
outstanding on the network at any one time, and the soonest the sender
can receive a window update from the receiver is one RoundTripTime
(RTT) later.  TCP and SCTP are examples of such protocols.
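
   As a worked example, a 65535 byte window and a 100 millisecond round
trip time would limit throughput to no more than 65535 bytes every 0.1
seconds, or roughly 5.2 * 10^6 bits per second, no matter how fast the
underlying link may be.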
1275
1276   Packet losses and their effects can be particularly bad for
1277performance.  This is especially true if the packet losses result in
1278retransmission timeouts for the protocol(s) involved.  By the time a
1279retransmission timeout has happened, the flow or connection has sat
1280idle for a considerable length of time.
1281
1282   On many platforms, some variant on the `netstat' command can be used
1283to retrieve statistics about packet loss and retransmission. For
1284example:
1285     netstat -p tcp
1286   will retrieve TCP statistics on the HP-UX Operating System.  On other
1287platforms, it may not be possible to retrieve statistics for a specific
1288protocol and something like:
1289     netstat -s
1290   would be used instead.
1291
   Many times, such network statistics are kept from the time the stack
1293started, and we are only really interested in statistics from when
1294netperf was running.  In such situations something along the lines of:
1295     netstat -p tcp > before
1296     netperf -t TCP_mumble...
1297     netstat -p tcp > after
1298   is indicated.  The beforeafter
1299(ftp://ftp.cup.hp.com/dist/networking/tools/) utility can be used to
1300subtract the statistics in `before' from the statistics in `after':
1301     beforeafter before after > delta
1302   and then one can look at the statistics in `delta'.  Beforeafter is
1303distributed in source form so one can compile it on the platform(s) of
1304interest.
1305
1306   If running a version 2.5.0 or later "omni" test under Linux one can
1307include either or both of:
1308   * LOCAL_TRANSPORT_RETRANS
1309
1310   * REMOTE_TRANSPORT_RETRANS
1311
1312   in the values provided via a test-specific `-o', `-O', or `-k'
output selection option and netperf will report the retransmissions
1314experienced on the data connection, as reported via a
1315`getsockopt(TCP_INFO)' call.  If confidence intervals have been
1316requested via the global `-I' or `-i' options, the reported value(s)
1317will be for the last iteration.  If the test is over a protocol other
1318than TCP, or on a platform other than Linux, the results are undefined.
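
   A minimal sketch of such an invocation (remotehost is a
placeholder):

     netperf -t OMNI -H remotehost -- \
         -o THROUGHPUT,LOCAL_TRANSPORT_RETRANS,REMOTE_TRANSPORT_RETRANS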
1319
1320   While it was written with HP-UX's netstat in mind, the annotated
1321netstat
1322(ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_netstat.txt)
1323writeup may be helpful with other platforms as well.
1324
1325
1326File: netperf.info,  Node: Options common to TCP UDP and SCTP tests,  Prev: Issues in Bulk Transfer,  Up: Using Netperf to Measure Bulk Data Transfer
1327
13285.2 Options common to TCP UDP and SCTP tests
1329============================================
1330
1331Many "test-specific" options are actually common across the different
1332tests.  For those tests involving TCP, UDP and SCTP, whether using the
BSD Sockets or the XTI interface, those common options include:
1334
1335`-h'
1336     Display the test-suite-specific usage string and exit.  For a TCP_
1337     or UDP_ test this will be the usage string from the source file
     `nettest_bsd.c'.  For an XTI_ test, this will be the usage string
     from the source file `nettest_xti.c'.  For an SCTP test, this will
     be the usage string from the source file `nettest_sctp.c'.
1341
1342`-H <optionspec>'
1343     Normally, the remote hostname|IP and address family information is
1344     inherited from the settings for the control connection (eg global
1345     command-line `-H', `-4' and/or `-6' options).  The test-specific
1346     `-H' will override those settings for the data (aka test)
1347     connection only.  Settings for the control connection are left
1348     unchanged.
1349
1350`-L <optionspec>'
1351     The test-specific `-L' option is identical to the test-specific
1352     `-H' option except it affects the local hostname|IP and address
1353     family information.  As with its global command-line counterpart,
     this is generally only useful when measuring through those evil,
1355     end-to-end breaking things called firewalls.
1356
1357`-m bytes'
1358     Set the size of the buffer passed-in to the "send" calls of a
1359     _STREAM test.  Note that this may have only an indirect effect on
1360     the size of the packets sent over the network, and certain Layer 4
1361     protocols do _not_ preserve or enforce message boundaries, so
1362     setting `-m' for the send size does not necessarily mean the
1363     receiver will receive that many bytes at any one time. By default
     the units are bytes, but a suffix of "G," "M," or "K" will specify
1365     the units to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A
1366     suffix of "g," "m" or "k" will specify units of 10^9, 10^6 or 10^3
1367     bytes respectively. For example:
1368          `-m 32K'
1369     will set the size to 32KB or 32768 bytes. [Default: the local send
1370     socket buffer size for the connection - either the system's
1371     default or the value set via the `-s' option.]
1372
1373`-M bytes'
1374     Set the size of the buffer passed-in to the "recv" calls of a
1375     _STREAM test.  This will be an upper bound on the number of bytes
1376     received per receive call. By default the units are bytes, but
1377     suffix of "G," "M," or "K" will specify the units to be 2^30 (GB),
1378     2^20 (MB) or 2^10 (KB) respectively.  A suffix of "g," "m" or "k"
1379     will specify units of 10^9, 10^6 or 10^3 bytes respectively. For
1380     example:
1381          `-M 32K'
1382     will set the size to 32KB or 32768 bytes. [Default: the remote
1383     receive socket buffer size for the data connection - either the
1384     system's default or the value set via the `-S' option.]
1385
1386`-P <optionspec>'
1387     Set the local and/or remote port numbers for the data connection.
1388
1389`-s <sizespec>'
1390     This option sets the local (netperf) send and receive socket buffer
1391     sizes for the data connection to the value(s) specified.  Often,
1392     this will affect the advertised and/or effective TCP or other
1393     window, but on some platforms it may not. By default the units are
     bytes, but a suffix of "G," "M," or "K" will specify the units to be
1395     2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of "g,"
1396     "m" or "k" will specify units of 10^9, 10^6 or 10^3 bytes
1397     respectively. For example:
1398          `-s 128K'
1399     Will request the local send and receive socket buffer sizes to be
1400     128KB or 131072 bytes.
1401
1402     While the historic expectation is that setting the socket buffer
1403     size has a direct effect on say the TCP window, today that may not
1404     hold true for all stacks. Further, while the historic expectation
1405     is that the value specified in a `setsockopt()' call will be the
1406     value returned via a `getsockopt()' call, at least one stack is
1407     known to deliberately ignore history.  When running under Windows
1408     a value of 0 may be used which will be an indication to the stack
1409     the user wants to enable a form of copy avoidance. [Default: -1 -
1410     use the system's default socket buffer sizes]
1411
1412`-S <sizespec>'
1413     This option sets the remote (netserver) send and/or receive socket
1414     buffer sizes for the data connection to the value(s) specified.
1415     Often, this will affect the advertised and/or effective TCP or
1416     other window, but on some platforms it may not. By default the
     units are bytes, but a suffix of "G," "M," or "K" will specify the
1418     units to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A
1419     suffix of "g," "m" or "k" will specify units of 10^9, 10^6 or 10^3
1420     bytes respectively.  For example:
1421          `-S 128K'
1422     Will request the remote send and receive socket buffer sizes to be
1423     128KB or 131072 bytes.
1424
1425     While the historic expectation is that setting the socket buffer
1426     size has a direct effect on say the TCP window, today that may not
1427     hold true for all stacks.  Further, while the historic expectation
1428     is that the value specified in a `setsockopt()' call will be the
1429     value returned via a `getsockopt()' call, at least one stack is
1430     known to deliberately ignore history.  When running under Windows
1431     a value of 0 may be used which will be an indication to the stack
1432     the user wants to enable a form of copy avoidance. [Default: -1 -
1433     use the system's default socket buffer sizes]
1434
1435`-4'
1436     Set the local and remote address family for the data connection to
1437     AF_INET - ie use IPv4 addressing only.  Just as with their global
1438     command-line counterparts the last of the `-4', `-6', `-H' or `-L'
1439     option wins for their respective address families.
1440
1441`-6'
1442     This option is identical to its `-4' cousin, but requests IPv6
1443     addresses for the local and remote ends of the data connection.
1444
1445
1446* Menu:
1447
1448* TCP_STREAM::
1449* TCP_MAERTS::
1450* TCP_SENDFILE::
1451* UDP_STREAM::
1452* XTI_TCP_STREAM::
1453* XTI_UDP_STREAM::
1454* SCTP_STREAM::
1455* DLCO_STREAM::
1456* DLCL_STREAM::
1457* STREAM_STREAM::
1458* DG_STREAM::
1459
1460
1461File: netperf.info,  Node: TCP_STREAM,  Next: TCP_MAERTS,  Prev: Options common to TCP UDP and SCTP tests,  Up: Options common to TCP UDP and SCTP tests
1462
14635.2.1 TCP_STREAM
1464----------------
1465
1466The TCP_STREAM test is the default test in netperf.  It is quite
1467simple, transferring some quantity of data from the system running
1468netperf to the system running netserver.  While time spent establishing
1469the connection is not included in the throughput calculation, time
1470spent flushing the last of the data to the remote at the end of the
1471test is.  This is how netperf knows that all the data it sent was
1472received by the remote.  In addition to the *note options common to
1473STREAM tests: Options common to TCP UDP and SCTP tests, the following
1474test-specific options can be included to possibly alter the behavior of
1475the test:
1476
1477`-C'
1478     This option will set TCP_CORK mode on the data connection on those
1479     systems where TCP_CORK is defined (typically Linux).  A full
1480     description of TCP_CORK is beyond the scope of this manual, but in
1481     a nutshell it forces sub-MSS sends to be buffered so every segment
1482     sent is Maximum Segment Size (MSS) unless the application performs
1483     an explicit flush operation or the connection is closed.  At
1484     present netperf does not perform any explicit flush operations.
1485     Setting TCP_CORK may improve the bitrate of tests where the "send
1486     size" (`-m' option) is smaller than the MSS.  It should also
1487     improve (make smaller) the service demand.
1488
1489     The Linux tcp(7) manpage states that TCP_CORK cannot be used in
     conjunction with TCP_NODELAY (set via the `-D' option), however
1491     netperf does not validate command-line options to enforce that.
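
     A sketch of a test which might benefit from TCP_CORK, using a
     sub-MSS send size (lag is the same host as in the examples later
     in this section):

          netperf -H lag -t TCP_STREAM -- -C -m 1024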
1492
1493`-D'
1494     This option will set TCP_NODELAY on the data connection on those
1495     systems where TCP_NODELAY is defined.  This disables something
1496     known as the Nagle Algorithm, which is intended to make the
1497     segments TCP sends as large as reasonably possible.  Setting
1498     TCP_NODELAY for a TCP_STREAM test should either have no effect
1499     when the send size (`-m' option) is larger than the MSS or will
1500     decrease reported bitrate and increase service demand when the
1501     send size is smaller than the MSS.  This stems from TCP_NODELAY
1502     causing each sub-MSS send to be its own TCP segment rather than
1503     being aggregated with other small sends.  This means more trips up
1504     and down the protocol stack per KB of data transferred, which
1505     means greater CPU utilization.
1506
1507     If setting TCP_NODELAY with `-D' affects throughput and/or service
1508     demand for tests where the send size (`-m') is larger than the MSS
1509     it suggests the TCP/IP stack's implementation of the Nagle
1510     Algorithm _may_ be broken, perhaps interpreting the Nagle
1511     Algorithm on a segment by segment basis rather than the proper user
1512     send by user send basis.  However, a better test of this can be
1513     achieved with the *note TCP_RR:: test.
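
     A sketch of such a comparison, using a sub-MSS send size (lag is
     the same host as in the examples later in this section):

          netperf -H lag -- -m 1024
          netperf -H lag -- -m 1024 -D

     With a working Nagle implementation the second command should
     report the lower bitrate.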
1514
1515
1516   Here is an example of a basic TCP_STREAM test, in this case from a
1517Debian Linux (2.6 kernel) system to an HP-UX 11iv2 (HP-UX 11.23) system:
1518
1519     $ netperf -H lag
1520     TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
1521     Recv   Send    Send
1522     Socket Socket  Message  Elapsed
1523     Size   Size    Size     Time     Throughput
1524     bytes  bytes   bytes    secs.    10^6bits/sec
1525
1526      32768  16384  16384    10.00      80.42
1527
1528   We see that the default receive socket buffer size for the receiver
1529(lag - HP-UX 11.23) is 32768 bytes, and the default socket send buffer
size for the sender (Debian 2.6 kernel) is 16384 bytes.  However, Linux
1531does "auto tuning" of socket buffer and TCP window sizes, which means
1532the send socket buffer size may be different at the end of the test
1533than it was at the beginning.  This is addressed in the *note omni
1534tests: The Omni Tests. added in version 2.5.0 and *note output
1535selection: Omni Output Selection.  Throughput is expressed as 10^6 (aka
1536Mega) bits per second, and the test ran for 10 seconds.  IPv4 addresses
1537(AF_INET) were used.
1538
1539
1540File: netperf.info,  Node: TCP_MAERTS,  Next: TCP_SENDFILE,  Prev: TCP_STREAM,  Up: Options common to TCP UDP and SCTP tests
1541
15425.2.2 TCP_MAERTS
1543----------------
1544
1545A TCP_MAERTS (MAERTS is STREAM backwards) test is "just like" a *note
1546TCP_STREAM:: test except the data flows from the netserver to the
netperf. The global command-line `-F' option and the test-specific
`-C' option are both ignored for this test type.
1550
1551   Here is an example of a TCP_MAERTS test between the same two systems
1552as in the example for the *note TCP_STREAM:: test.  This time we request
1553larger socket buffers with `-s' and `-S' options:
1554
1555     $ netperf -H lag -t TCP_MAERTS -- -s 128K -S 128K
1556     TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
1557     Recv   Send    Send
1558     Socket Socket  Message  Elapsed
1559     Size   Size    Size     Time     Throughput
1560     bytes  bytes   bytes    secs.    10^6bits/sec
1561
1562     221184 131072 131072    10.03      81.14
1563
1564   Where we see that Linux, unlike HP-UX, may not return the same value
1565in a `getsockopt()' as was requested in the prior `setsockopt()'.
1566
1567   This test is included more for benchmarking convenience than anything
1568else.
1569
1570
1571File: netperf.info,  Node: TCP_SENDFILE,  Next: UDP_STREAM,  Prev: TCP_MAERTS,  Up: Options common to TCP UDP and SCTP tests
1572
15735.2.3 TCP_SENDFILE
1574------------------
1575
1576The TCP_SENDFILE test is "just like" a *note TCP_STREAM:: test except
netperf uses the platform's `sendfile()' call instead of calling `send()'.
1578Often this results in a "zero-copy" operation where data is sent
1579directly from the filesystem buffer cache.  This _should_ result in
1580lower CPU utilization and possibly higher throughput.  If it does not,
1581then you may want to contact your vendor(s) because they have a problem
1582on their hands.
1583
   Zero-copy mechanisms may also alter the characteristics of the
packets passed to the NIC (their size and the number of buffers per
packet).  In many stacks,
1586when a copy is performed, the stack can "reserve" space at the
1587beginning of the destination buffer for things like TCP, IP and Link
1588headers.  This then has the packet contained in a single buffer which
1589can be easier to DMA to the NIC.  When no copy is performed, there is
1590no opportunity to reserve space for headers and so a packet will be
1591contained in two or more buffers.
1592
1593   As of some time before version 2.5.0, the *note global `-F' option:
1594Global Options. is no longer required for this test.  If it is not
1595specified, netperf will create a temporary file, which it will delete
1596at the end of the test.  If the `-F' option is specified it must
1597reference a file of at least the size of the send ring (*Note the
1598global `-W' option: Global Options.) multiplied by the send size (*Note
1599the test-specific `-m' option: Options common to TCP UDP and SCTP
1600tests.).  All other TCP-specific options remain available and optional.
1601
1602   In this first example:
1603     $ netperf -H lag -F ../src/netperf -t TCP_SENDFILE -- -s 128K -S 128K
1604     TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
1605     alloc_sendfile_buf_ring: specified file too small.
1606     file must be larger than send_width * send_size
1607
1608   we see what happens when the file is too small.  Here:
1609
1610     $ netperf -H lag -F /boot/vmlinuz-2.6.8-1-686 -t TCP_SENDFILE -- -s 128K -S 128K
1611     TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
1612     Recv   Send    Send
1613     Socket Socket  Message  Elapsed
1614     Size   Size    Size     Time     Throughput
1615     bytes  bytes   bytes    secs.    10^6bits/sec
1616
1617     131072 221184 221184    10.02      81.83
1618
1619   we resolve that issue by selecting a larger file.
1620
1621
1622File: netperf.info,  Node: UDP_STREAM,  Next: XTI_TCP_STREAM,  Prev: TCP_SENDFILE,  Up: Options common to TCP UDP and SCTP tests
1623
16245.2.4 UDP_STREAM
1625----------------
1626
1627A UDP_STREAM test is similar to a *note TCP_STREAM:: test except UDP is
1628used as the transport rather than TCP.
1629
1630   A UDP_STREAM test has no end-to-end flow control - UDP provides none
1631and neither does netperf.  However, if you wish, you can configure
1632netperf with `--enable-intervals=yes' to enable the global command-line
1633`-b' and `-w' options to pace bursts of traffic onto the network.
1634
1635   This has a number of implications.
1636
   The biggest of these implications is that data which is sent might not
1638be received by the remote.  For this reason, the output of a UDP_STREAM
1639test shows both the sending and receiving throughput.  On some
1640platforms, it may be possible for the sending throughput to be reported
1641as a value greater than the maximum rate of the link.  This is common
1642when the CPU(s) are faster than the network and there is no
1643"intra-stack" flow-control.
1644
1645   Here is an example of a UDP_STREAM test between two systems connected
1646by a 10 Gigabit Ethernet link:
1647     $ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 32768
1648     UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
1649     Socket  Message  Elapsed      Messages
1650     Size    Size     Time         Okay Errors   Throughput
1651     bytes   bytes    secs            #      #   10^6bits/sec
1652
1653     124928   32768   10.00      105672      0    2770.20
1654     135168           10.00      104844           2748.50
1655
   The first line of numbers shows statistics from the sending (netperf)
side. The second line shows those from the receiving (netserver)
1658side.  In this case, 105672 - 104844 or 828 messages did not make it
1659all the way to the remote netserver process.
1660
1661   If the value of the `-m' option is larger than the local send socket
1662buffer size (`-s' option) netperf will likely abort with an error
1663message about how the send call failed:
1664
1665     netperf -t UDP_STREAM -H 192.168.2.125
1666     UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
1667     udp_send: data send error: Message too long
1668
1669   If the value of the `-m' option is larger than the remote socket
1670receive buffer, the reported receive throughput will likely be zero as
1671the remote UDP will discard the messages as being too large to fit into
1672the socket buffer.
1673
1674     $ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 65000 -S 32768
1675     UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
1676     Socket  Message  Elapsed      Messages
1677     Size    Size     Time         Okay Errors   Throughput
1678     bytes   bytes    secs            #      #   10^6bits/sec
1679
1680     124928   65000   10.00       53595      0    2786.99
1681      65536           10.00           0              0.00
1682
1683   The example above was between a pair of systems running a "Linux"
1684kernel. Notice that the remote Linux system returned a value larger
1685than that passed-in to the `-S' option.  In fact, this value was larger
1686than the message size set with the `-m' option.  That the remote socket
1687buffer size is reported as 65536 bytes would suggest to any sane person
1688that a message of 65000 bytes would fit, but the socket isn't _really_
168965536 bytes, even though Linux is telling us so.  Go figure.
1690
1691
1692File: netperf.info,  Node: XTI_TCP_STREAM,  Next: XTI_UDP_STREAM,  Prev: UDP_STREAM,  Up: Options common to TCP UDP and SCTP tests
1693
16945.2.5 XTI_TCP_STREAM
1695--------------------
1696
1697An XTI_TCP_STREAM test is simply a *note TCP_STREAM:: test using the XTI
1698rather than BSD Sockets interface.  The test-specific `-X <devspec>'
1699option can be used to specify the name of the local and/or remote XTI
1700device files, which is required by the `t_open()' call made by netperf
1701XTI tests.
1702
1703   The XTI_TCP_STREAM test is only present if netperf was configured
1704with `--enable-xti=yes'.  The remote netserver must have also been
1705configured with `--enable-xti=yes'.
1706
1707
1708File: netperf.info,  Node: XTI_UDP_STREAM,  Next: SCTP_STREAM,  Prev: XTI_TCP_STREAM,  Up: Options common to TCP UDP and SCTP tests
1709
17105.2.6 XTI_UDP_STREAM
1711--------------------
1712
1713An XTI_UDP_STREAM test is simply a *note UDP_STREAM:: test using the XTI
1714rather than BSD Sockets Interface.  The test-specific `-X <devspec>'
1715option can be used to specify the name of the local and/or remote XTI
1716device files, which is required by the `t_open()' call made by netperf
1717XTI tests.
1718
1719   The XTI_UDP_STREAM test is only present if netperf was configured
1720with `--enable-xti=yes'. The remote netserver must have also been
1721configured with `--enable-xti=yes'.
1722
1723
1724File: netperf.info,  Node: SCTP_STREAM,  Next: DLCO_STREAM,  Prev: XTI_UDP_STREAM,  Up: Options common to TCP UDP and SCTP tests
1725
17265.2.7 SCTP_STREAM
1727-----------------
1728
An SCTP_STREAM test is essentially a *note TCP_STREAM:: test using
1730SCTP rather than TCP.  The `-D' option will set SCTP_NODELAY, which is
1731much like the TCP_NODELAY option for TCP.  The `-C' option is not
1732applicable to an SCTP test as there is no corresponding SCTP_CORK
1733option.  The author is still figuring-out what the test-specific `-N'
1734option does :)
1735
1736   The SCTP_STREAM test is only present if netperf was configured with
1737`--enable-sctp=yes'. The remote netserver must have also been
1738configured with `--enable-sctp=yes'.
1739
1740
1741File: netperf.info,  Node: DLCO_STREAM,  Next: DLCL_STREAM,  Prev: SCTP_STREAM,  Up: Options common to TCP UDP and SCTP tests
1742
17435.2.8 DLCO_STREAM
1744-----------------
1745
1746A DLPI Connection Oriented Stream (DLCO_STREAM) test is very similar in
1747concept to a *note TCP_STREAM:: test.  Both use reliable,
1748connection-oriented protocols.  The DLPI test differs from the TCP test
1749in that its protocol operates only at the link-level and does not
1750include TCP-style segmentation and reassembly.  This last difference
1751means that the value  passed-in  with the `-m' option must be less than
1752the interface MTU.  Otherwise, the `-m' and `-M' options are just like
1753their TCP/UDP/SCTP counterparts.
1754
1755   Other DLPI-specific options include:
1756
1757`-D <devspec>'
1758     This option is used to provide the fully-qualified names for the
1759     local and/or remote DLPI device files.  The syntax is otherwise
1760     identical to that of a "sizespec".
1761
1762`-p <ppaspec>'
1763     This option is used to specify the local and/or remote DLPI PPA(s).
1764     The PPA is used to identify the interface over which traffic is to
1765     be sent/received. The syntax of a "ppaspec" is otherwise the same
1766     as a "sizespec".
1767
1768`-s sap'
1769     This option specifies the 802.2 SAP for the test.  A SAP is
1770     somewhat like either the port field of a TCP or UDP header or the
1771     protocol field of an IP header.  The specified SAP should not
     conflict with any other active SAPs on the specified PPAs (`-p'
1773     option).
1774
1775`-w <sizespec>'
1776     This option specifies the local send and receive window sizes in
1777     units of frames on those platforms which support setting such
1778     things.
1779
1780`-W <sizespec>'
1781     This option specifies the remote send and receive window sizes in
1782     units of frames on those platforms which support setting such
1783     things.
1784
1785   The DLCO_STREAM test is only present if netperf was configured with
1786`--enable-dlpi=yes'. The remote netserver must have also been
1787configured with `--enable-dlpi=yes'.
1788
1789
1790File: netperf.info,  Node: DLCL_STREAM,  Next: STREAM_STREAM,  Prev: DLCO_STREAM,  Up: Options common to TCP UDP and SCTP tests
1791
17925.2.9 DLCL_STREAM
1793-----------------
1794
1795A DLPI ConnectionLess Stream (DLCL_STREAM) test is analogous to a *note
1796UDP_STREAM:: test in that both make use of unreliable/best-effort,
1797connection-less transports.  The DLCL_STREAM test differs from the
1798*note UDP_STREAM:: test in that the message size (`-m' option) must
1799always be less than the link MTU as there is no IP-like fragmentation
1800and reassembly available and netperf does not presume to provide one.
1801
1802   The test-specific command-line options for a DLCL_STREAM test are the
1803same as those for a *note DLCO_STREAM:: test.
1804
1805   The DLCL_STREAM test is only present if netperf was configured with
1806`--enable-dlpi=yes'. The remote netserver must have also been
1807configured with `--enable-dlpi=yes'.
1808
1809
1810File: netperf.info,  Node: STREAM_STREAM,  Next: DG_STREAM,  Prev: DLCL_STREAM,  Up: Options common to TCP UDP and SCTP tests
1811
18125.2.10 STREAM_STREAM
1813--------------------
1814
1815A Unix Domain Stream Socket Stream test (STREAM_STREAM) is similar in
1816concept to a *note TCP_STREAM:: test, but using Unix Domain sockets.
1817It is, naturally, limited to intra-machine traffic.  A STREAM_STREAM
1818test shares the `-m', `-M', `-s' and `-S' options of the other _STREAM
1819tests.  In a STREAM_STREAM test the `-p' option sets the directory in
1820which the pipes will be created rather than setting a port number.  The
default is to create the pipes in the system default directory for the
1822`tempnam()' call.
1823
1824   The STREAM_STREAM test is only present if netperf was configured with
1825`--enable-unixdomain=yes'. The remote netserver must have also been
1826configured with `--enable-unixdomain=yes'.
1827
1828
1829File: netperf.info,  Node: DG_STREAM,  Prev: STREAM_STREAM,  Up: Options common to TCP UDP and SCTP tests
1830
18315.2.11 DG_STREAM
1832----------------
1833
A Unix Domain Datagram Socket Stream test (DG_STREAM) is very much like
1835a *note TCP_STREAM:: test except that message boundaries are preserved.
1836In this way, it may also be considered similar to certain flavors of
1837SCTP test which can also preserve message boundaries.
1838
1839   All the options of a *note STREAM_STREAM:: test are applicable to a
1840DG_STREAM test.
1841
1842   The DG_STREAM test is only present if netperf was configured with
1843`--enable-unixdomain=yes'. The remote netserver must have also been
1844configured with `--enable-unixdomain=yes'.
1845
1846
1847File: netperf.info,  Node: Using Netperf to Measure Request/Response,  Next: Using Netperf to Measure Aggregate Performance,  Prev: Using Netperf to Measure Bulk Data Transfer,  Up: Top
1848
18496 Using Netperf to Measure Request/Response
1850*******************************************
1851
1852Request/response performance is often overlooked, yet it is just as
1853important as bulk-transfer performance.  While things like larger
1854socket buffers and TCP windows, and stateless offloads like TSO and LRO
1855can cover a multitude of latency and even path-length sins, those sins
1856cannot easily hide from a request/response test.  The convention for a
1857request/response test is to have a _RR suffix.  There are however a few
1858"request/response" tests that have other suffixes.
1859
   A request/response test, particularly a synchronous, one transaction
at a time test such as those run by default in netperf, is especially
sensitive to the path-length of the networking stack.  An _RR test can
1863also uncover those platforms where the NICs are strapped by default
1864with overbearing interrupt avoidance settings in an attempt to increase
1865the bulk-transfer performance (or rather, decrease the CPU utilization
1866of a bulk-transfer test).  This sensitivity is most acute for small
1867request and response sizes, such as the single-byte default for a
1868netperf _RR test.
1869
1870   While a bulk-transfer test reports its results in units of bits or
1871bytes transferred per second, by default a mumble_RR test reports
1872transactions per second where a transaction is defined as the completed
1873exchange of a request and a response.  One can invert the transaction
1874rate to arrive at the average round-trip latency.  If one is confident
1875about the symmetry of the connection, the average one-way latency can
1876be taken as one-half the average round-trip latency. As of version
18772.5.0 (actually slightly before) netperf still does not do the latter,
1878but will do the former if one sets the verbosity to 2 for a classic
1879netperf test, or includes the appropriate *note output selector: Omni
1880Output Selectors. in an *note omni test: The Omni Tests.  It will also
1881allow the user to switch the throughput units from transactions per
1882second to bits or bytes per second with the global `-f' option.
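
   As a worked example, the single-stream *note TCP_RR:: result shown
later in this chapter reports 29150.15 transactions per second;
inverting that gives an average round-trip latency of 1/29150.15 or
roughly 34.3 microseconds.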
1883
1884* Menu:
1885
1886* Issues in Request/Response::
1887* Options Common to TCP UDP and SCTP _RR tests::
1888
1889
1890File: netperf.info,  Node: Issues in Request/Response,  Next: Options Common to TCP UDP and SCTP _RR tests,  Prev: Using Netperf to Measure Request/Response,  Up: Using Netperf to Measure Request/Response
1891
18926.1 Issues in Request/Response
1893==============================
1894
1895Most if not all the *note Issues in Bulk Transfer:: apply to
1896request/response.  The issue of round-trip latency is even more
1897important as netperf generally only has one transaction outstanding at
1898a time.
1899
1900   A single instance of a one transaction outstanding _RR test should
1901_never_ completely saturate the CPU of a system.  If testing between
1902otherwise evenly matched systems, the symmetric nature of a _RR test
1903with equal request and response sizes should result in equal CPU
1904loading on both systems. However, this may not hold true on MP systems,
particularly if one binds netperf and netserver to different CPUs via
1906the global `-T' option.
1907
   For smaller request and response sizes, packet loss is a bigger issue
1909as there is no opportunity for a "fast retransmit" or retransmission
1910prior to a retransmission timer expiring.
1911
1912   Virtualization may considerably increase the effective path length of
1913a networking stack.  While this may not preclude achieving link-rate on
1914a comparatively slow link (eg 1 Gigabit Ethernet) on a _STREAM test, it
1915can show-up as measurably fewer transactions per second on an _RR test.
1916However, this may still be masked by interrupt coalescing in the
1917NIC/driver.
1918
1919   Certain NICs have ways to minimize the number of interrupts sent to
1920the host.  If these are strapped badly they can significantly reduce
1921the performance of something like a single-byte request/response test.
1922Such setups are distinguished by seriously low reported CPU utilization
1923and what seems like a low (even if in the thousands) transaction per
1924second rate.  Also, if you run such an OS/driver combination on faster
1925or slower hardware and do not see a corresponding change in the
1926transaction rate, chances are good that the driver is strapping the NIC
1927with aggressive interrupt avoidance settings.  Good for bulk
1928throughput, but bad for latency.
1929
1930   Some drivers may try to automagically adjust the interrupt avoidance
1931settings.  If they are not terribly good at it, you will see
1932considerable run-to-run variation in reported transaction rates.
1933Particularly if you "mix-up" _STREAM and _RR tests.
1934
1935
1936File: netperf.info,  Node: Options Common to TCP UDP and SCTP _RR tests,  Prev: Issues in Request/Response,  Up: Using Netperf to Measure Request/Response
1937
19386.2 Options Common to TCP UDP and SCTP _RR tests
1939================================================
1940
1941Many "test-specific" options are actually common across the different
1942tests.  For those tests involving TCP, UDP and SCTP, whether using the
BSD Sockets or the XTI interface, those common options include:
1944
1945`-h'
1946     Display the test-suite-specific usage string and exit.  For a TCP_
1947     or UDP_ test this will be the usage string from the source file
1948     `nettest_bsd.c'.  For an XTI_ test, this will be the usage string
1949     from the source file `src/nettest_xti.c'.  For an SCTP test, this
1950     will be the usage string from the source file `src/nettest_sctp.c'.
1951
1952`-H <optionspec>'
1953     Normally, the remote hostname|IP and address family information is
1954     inherited from the settings for the control connection (eg global
     command-line `-H', `-4' and/or `-6' options).  The test-specific
1956     `-H' will override those settings for the data (aka test)
1957     connection only.  Settings for the control connection are left
1958     unchanged.  This might be used to cause the control and data
1959     connections to take different paths through the network.
1960
1961`-L <optionspec>'
1962     The test-specific `-L' option is identical to the test-specific
1963     `-H' option except it affects the local hostname|IP and address
1964     family information.  As with its global command-line counterpart,
     this is generally only useful when measuring through those evil,
1966     end-to-end breaking things called firewalls.
1967
1968`-P <optionspec>'
1969     Set the local and/or remote port numbers for the data connection.
1970
1971`-r <sizespec>'
1972     This option sets the request (first value) and/or response (second
1973     value) sizes for an _RR test. By default the units are bytes, but a
1974     suffix of "G," "M," or "K" will specify the units to be 2^30 (GB),
1975     2^20 (MB) or 2^10 (KB) respectively.  A suffix of "g," "m" or "k"
1976     will specify units of 10^9, 10^6 or 10^3 bytes respectively. For
1977     example:
1978          `-r 128,16K'
1979     Will set the request size to 128 bytes and the response size to 16
1980     KB or 16384 bytes. [Default: 1 - a single-byte request and
1981     response ]
1982
1983`-s <sizespec>'
1984     This option sets the local (netperf) send and receive socket buffer
1985     sizes for the data connection to the value(s) specified.  Often,
1986     this will affect the advertised and/or effective TCP or other
1987     window, but on some platforms it may not. By default the units are
1988     bytes, but a suffix of "G," "M," or "K" will specify the units to
1989     be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A suffix of
1990     "g," "m" or "k" will specify units of 10^9, 10^6 or 10^3 bytes
1991     respectively. For example:
1992          `-s 128K'
1993     Will request the local send (netperf) and receive socket buffer
1994     sizes to be 128KB or 131072 bytes.
1995
1996     While the historic expectation is that setting the socket buffer
1997     size has a direct effect on say the TCP window, today that may not
1998     hold true for all stacks.  When running under Windows a value of 0
1999     may be used which will be an indication to the stack the user
2000     wants to enable a form of copy avoidance. [Default: -1 - use the
2001     system's default socket buffer sizes]
2002
2003`-S <sizespec>'
2004     This option sets the remote (netserver) send and/or receive socket
2005     buffer sizes for the data connection to the value(s) specified.
2006     Often, this will affect the advertised and/or effective TCP or
2007     other window, but on some platforms it may not. By default the
2008     units are bytes, but a suffix of "G," "M," or "K" will specify the
2009     units to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively.  A
2010     suffix of "g," "m" or "k" will specify units of 10^9, 10^6 or 10^3
2011     bytes respectively.  For example:
2012          `-S 128K'
2013     Will request the remote (netserver) send and receive socket buffer
2014     sizes to be 128KB or 131072 bytes.
2015
2016     While the historic expectation is that setting the socket buffer
2017     size has a direct effect on say the TCP window, today that may not
2018     hold true for all stacks.  When running under Windows a value of 0
2019     may be used which will be an indication to the stack the user
2020     wants to enable a form of copy avoidance.  [Default: -1 - use the
2021     system's default socket buffer sizes]
2022
2023`-4'
2024     Set the local and remote address family for the data connection to
2025     AF_INET - ie use IPv4 addressing only.  Just as with their global
2026     command-line counterparts the last of the `-4', `-6', `-H' or `-L'
2027     option wins for their respective address families.
2028
2029`-6'
2030     This option is identical to its `-4' cousin, but requests IPv6
2031     addresses for the local and remote ends of the data connection.
2032
2033
2034* Menu:
2035
2036* TCP_RR::
2037* TCP_CC::
2038* TCP_CRR::
2039* UDP_RR::
2040* XTI_TCP_RR::
2041* XTI_TCP_CC::
2042* XTI_TCP_CRR::
2043* XTI_UDP_RR::
2044* DLCL_RR::
2045* DLCO_RR::
2046* SCTP_RR::
2047
2048
2049File: netperf.info,  Node: TCP_RR,  Next: TCP_CC,  Prev: Options Common to TCP UDP and SCTP _RR tests,  Up: Options Common to TCP UDP and SCTP _RR tests
2050
20516.2.1 TCP_RR
2052------------
2053
2054A TCP_RR (TCP Request/Response) test is requested by passing a value of
2055"TCP_RR" to the global `-t' command-line option.  A TCP_RR test can be
2056thought-of as a user-space to user-space `ping' with no think time - it
2057is by default a synchronous, one transaction at a time,
2058request/response test.
2059
2060   The transaction rate is the number of complete transactions exchanged
2061divided by the length of time it took to perform those transactions.
2062
2063   If the two Systems Under Test are otherwise identical, a TCP_RR test
2064with the same request and response size should be symmetric - it should
2065not matter which way the test is run, and the CPU utilization measured
2066should be virtually the same on each system.  If not, it suggests that
2067the CPU utilization mechanism being used may have some, well, issues
2068measuring CPU utilization completely and accurately.
2069
2070   Time to establish the TCP connection is not counted in the result.
2071If you want connection setup overheads included, you should consider the
*note TCP_CC: TCP_CC. or *note TCP_CRR: TCP_CRR. tests.
2073
2074   If specifying the `-D' option to set TCP_NODELAY and disable the
2075Nagle Algorithm increases the transaction rate reported by a TCP_RR
2076test, it implies the stack(s) over which the TCP_RR test is running
2077have a broken implementation of the Nagle Algorithm.  Likely as not
2078they are interpreting Nagle on a segment by segment basis rather than a
2079user send by user send basis.  You should contact your stack vendor(s)
2080to report the problem to them.
2081
2082   Here is an example of two systems running a basic TCP_RR test over a
208310 Gigabit Ethernet link:
2084
2085     netperf -t TCP_RR -H 192.168.2.125
2086     TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
2087     Local /Remote
2088     Socket Size   Request  Resp.   Elapsed  Trans.
2089     Send   Recv   Size     Size    Time     Rate
2090     bytes  Bytes  bytes    bytes   secs.    per sec
2091
2092     16384  87380  1        1       10.00    29150.15
2093     16384  87380
2094
2095   In this example the request and response sizes were one byte, the
2096socket buffers were left at their defaults, and the test ran for all of
209710 seconds.  The transaction per second rate was rather good for the
2098time :)
2099
2100
2101File: netperf.info,  Node: TCP_CC,  Next: TCP_CRR,  Prev: TCP_RR,  Up: Options Common to TCP UDP and SCTP _RR tests
2102
21036.2.2 TCP_CC
2104------------
2105
2106A TCP_CC (TCP Connect/Close) test is requested by passing a value of
2107"TCP_CC" to the global `-t' option.  A TCP_CC test simply measures how
2108fast the pair of systems can open and close connections between one
2109another in a synchronous (one at a time) manner.  While this is
2110considered an _RR test, no request or response is exchanged over the
2111connection.
2112
2113   The issue of TIME_WAIT reuse is an important one for a TCP_CC test.
2114Basically, TIME_WAIT reuse is when a pair of systems churn through
2115connections fast enough that they wrap the 16-bit port number space in
2116less time than the length of the TIME_WAIT state.  While it is indeed
2117theoretically possible to "reuse" a connection in TIME_WAIT, the
2118conditions under which such reuse is possible are rather rare.  An
2119attempt to reuse a connection in TIME_WAIT can result in a non-trivial
2120delay in connection establishment.
2121
2122   Basically, any time the connection churn rate approaches:
2123
2124   Sizeof(clientportspace) / Lengthof(TIME_WAIT)
2125
2126   there is the risk of TIME_WAIT reuse.  To minimize the chances of
2127this happening, netperf will by default select its own client port
2128numbers from the range of 5000 to 65535.  On systems with a 60 second
2129TIME_WAIT state, this should allow roughly 1000 transactions per
2130second.  The size of the client port space used by netperf can be
2131controlled via the test-specific `-p' option, which takes a "sizespec"
2132as a value setting the minimum (first value) and maximum (second value)
2133port numbers used by netperf at the client end.
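
   As a worked example of that default: (65535 - 5000) client ports
divided by a 60 second TIME_WAIT comes to roughly 1009 connections per
second before TIME_WAIT reuse becomes a risk.  A sketch constraining
netperf to a narrower client port range (remotehost is a placeholder):

     netperf -t TCP_CC -H remotehost -- -p 30000,40000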
2134
2135   Since no requests or responses are exchanged during a TCP_CC test,
2136only the `-H', `-L', `-4' and `-6' of the "common" test-specific
2137options are likely to have an effect, if any, on the results.  The `-s'
2138and `-S' options _may_ have some effect if they alter the number and/or
2139type of options carried in the TCP SYNchronize segments, such as Window
2140Scaling or Timestamps.  The `-P' and `-r' options are utterly ignored.
2141
2142   Since connection establishment and tear-down for TCP is not
2143symmetric, a TCP_CC test is not symmetric in its loading of the two
2144systems under test.
2145
2146
2147File: netperf.info,  Node: TCP_CRR,  Next: UDP_RR,  Prev: TCP_CC,  Up: Options Common to TCP UDP and SCTP _RR tests
2148
21496.2.3 TCP_CRR
2150-------------
2151
2152The TCP Connect/Request/Response (TCP_CRR) test is requested by passing
2153a value of "TCP_CRR" to the global `-t' command-line option.  A TCP_CRR
2154test is like a merger of a *note TCP_RR:: and *note TCP_CC:: test which
2155measures the performance of establishing a connection, exchanging a
2156single request/response transaction, and tearing-down that connection.
2157This is very much like what happens in an HTTP 1.0 or HTTP 1.1
2158connection when HTTP Keepalives are not used.  In fact, the TCP_CRR
2159test was added to netperf to simulate just that.
2160
2161   Since a request and response are exchanged the `-r', `-s' and `-S'
2162options can have an effect on the performance.
2163
2164   The issue of TIME_WAIT reuse exists for the TCP_CRR test just as it
2165does for the TCP_CC test.  Similarly, since connection establishment
2166and tear-down is not symmetric, a TCP_CRR test is not symmetric even
2167when the request and response sizes are the same.
2168
2169
2170File: netperf.info,  Node: UDP_RR,  Next: XTI_TCP_RR,  Prev: TCP_CRR,  Up: Options Common to TCP UDP and SCTP _RR tests
2171
21726.2.4 UDP_RR
2173------------
2174
2175A UDP Request/Response (UDP_RR) test is requested by passing a value of
2176"UDP_RR" to a global `-t' option.  It is very much the same as a TCP_RR
2177test except UDP is used rather than TCP.
2178
2179   UDP does not provide for retransmission of lost UDP datagrams, and
2180netperf does not add anything for that either.  This means that if
2181_any_ request or response is lost, the exchange of requests and
2182responses will stop from that point until the test timer expires.
2183Netperf will not really "know" this has happened - the only symptom
2184will be a low transaction per second rate.  If `--enable-burst' was
2185included in the `configure' command and a test-specific `-b' option
2186used, the UDP_RR test will "survive" the loss of requests and responses
2187until the sum is one more than the value passed via the `-b' option. It
2188will though almost certainly run more slowly.
2189
2190   The netperf side of a UDP_RR test will call `connect()' on its data
2191socket and thenceforth use the `send()' and `recv()' socket calls.  The
2192netserver side of a UDP_RR test will not call `connect()' and will use
2193`recvfrom()' and `sendto()' calls.  This means that even if the request
2194and response sizes are the same, a UDP_RR test is _not_ symmetric in
2195its loading of the two systems under test.
2196
2197   Here is an example of a UDP_RR test between two otherwise identical
2198two-CPU systems joined via a 1 Gigabit Ethernet network:
2199
2200     $ netperf -T 1 -H 192.168.1.213 -t UDP_RR -c -C
2201     UDP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.213 (192.168.1.213) port 0 AF_INET
2202     Local /Remote
2203     Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
2204     Send   Recv   Size    Size   Time    Rate     local  remote local   remote
2205     bytes  bytes  bytes   bytes  secs.   per sec  % I    % I    us/Tr   us/Tr
2206
2207     65535  65535  1       1      10.01   15262.48   13.90  16.11  18.221  21.116
2208     65535  65535
2209
2210   This example includes the `-c' and `-C' options to enable CPU
2211utilization reporting and shows the asymmetry in CPU loading.  The `-T'
2212option was used to make sure netperf and netserver ran on a given CPU
2213and did not move around during the test.
2214
2215
2216File: netperf.info,  Node: XTI_TCP_RR,  Next: XTI_TCP_CC,  Prev: UDP_RR,  Up: Options Common to TCP UDP and SCTP _RR tests
2217
22186.2.5 XTI_TCP_RR
2219----------------
2220
2221An XTI_TCP_RR test is essentially the same as a *note TCP_RR:: test only
2222using the XTI rather than BSD Sockets interface. It is requested by
2223passing a value of "XTI_TCP_RR" to the `-t' global command-line option.
2224
2225   The test-specific options for an XTI_TCP_RR test are the same as
2226those for a TCP_RR test with the addition of the `-X <devspec>' option
2227to specify the names of the local and/or remote XTI device file(s).
2228
2229
2230File: netperf.info,  Node: XTI_TCP_CC,  Next: XTI_TCP_CRR,  Prev: XTI_TCP_RR,  Up: Options Common to TCP UDP and SCTP _RR tests
2231
22326.2.6 XTI_TCP_CC
2233----------------
2234
2235An XTI_TCP_CC test is essentially the same as a *note TCP_CC: TCP_CC.
2236test, only using the XTI rather than BSD Sockets interface.
2237
2238   The test-specific options for an XTI_TCP_CC test are the same as
2239those for a TCP_CC test with the addition of the `-X <devspec>' option
2240to specify the names of the local and/or remote XTI device file(s).
2241
2242
2243File: netperf.info,  Node: XTI_TCP_CRR,  Next: XTI_UDP_RR,  Prev: XTI_TCP_CC,  Up: Options Common to TCP UDP and SCTP _RR tests
2244
22456.2.7 XTI_TCP_CRR
2246-----------------
2247
2248The XTI_TCP_CRR test is essentially the same as a *note TCP_CRR:
2249TCP_CRR. test, only using the XTI rather than BSD Sockets interface.
2250
2251   The test-specific options for an XTI_TCP_CRR test are the same as
2252those for a TCP_RR test with the addition of the `-X <devspec>' option
2253to specify the names of the local and/or remote XTI device file(s).
2254
2255
2256File: netperf.info,  Node: XTI_UDP_RR,  Next: DLCL_RR,  Prev: XTI_TCP_CRR,  Up: Options Common to TCP UDP and SCTP _RR tests
2257
22586.2.8 XTI_UDP_RR
2259----------------
2260
2261An XTI_UDP_RR test is essentially the same as a UDP_RR test only using
2262the XTI rather than BSD Sockets interface.  It is requested by passing
2263a value of "XTI_UDP_RR" to the `-t' global command-line option.
2264
2265   The test-specific options for an XTI_UDP_RR test are the same as
2266those for a UDP_RR test with the addition of the `-X <devspec>' option
2267to specify the name of the local and/or remote XTI device file(s).
2268
2269
2270File: netperf.info,  Node: DLCL_RR,  Next: DLCO_RR,  Prev: XTI_UDP_RR,  Up: Options Common to TCP UDP and SCTP _RR tests
2271
22726.2.9 DLCL_RR
2273-------------
2274
2275
2276File: netperf.info,  Node: DLCO_RR,  Next: SCTP_RR,  Prev: DLCL_RR,  Up: Options Common to TCP UDP and SCTP _RR tests
2277
22786.2.10 DLCO_RR
2279--------------
2280
2281
2282File: netperf.info,  Node: SCTP_RR,  Prev: DLCO_RR,  Up: Options Common to TCP UDP and SCTP _RR tests
2283
22846.2.11 SCTP_RR
2285--------------
2286
2287
2288File: netperf.info,  Node: Using Netperf to Measure Aggregate Performance,  Next: Using Netperf to Measure Bidirectional Transfer,  Prev: Using Netperf to Measure Request/Response,  Up: Top
2289
22907 Using Netperf to Measure Aggregate Performance
2291************************************************
2292
2293Ultimately, *note Netperf4: Netperf4. will be the preferred benchmark to
2294use when one wants to measure aggregate performance because netperf has
2295no support for explicit synchronization of concurrent tests. Until
2296netperf4 is ready for prime time, one can make use of the heuristics
2297and procedures mentioned here for the 85% solution.
2298
   There are a few ways to measure aggregate performance with netperf.
The first is to run multiple, concurrent netperf tests; this can be
applied to any of the netperf tests.  The second is to configure
netperf with `--enable-burst'; this is applicable to the TCP_RR test.
The third is a variation on the first.
2304
2305* Menu:
2306
2307* Running Concurrent Netperf Tests::
2308* Using --enable-burst::
2309* Using --enable-demo::
2310
2311
2312File: netperf.info,  Node: Running Concurrent Netperf Tests,  Next: Using --enable-burst,  Prev: Using Netperf to Measure Aggregate Performance,  Up: Using Netperf to Measure Aggregate Performance
2313
23147.1 Running Concurrent Netperf Tests
2315====================================
2316
2317*note Netperf4: Netperf4. is the preferred benchmark to use when one
2318wants to measure aggregate performance because netperf has no support
2319for explicit synchronization of concurrent tests.  This leaves netperf2
2320results vulnerable to "skew" errors.
2321
   However, since there are times when netperf4 is unavailable, it may be
2323necessary to run netperf. The skew error can be minimized by making use
2324of the confidence interval functionality.  Then one simply launches
2325multiple tests from the shell using a `for' loop or the like:
2326
2327     for i in 1 2 3 4
2328     do
2329     netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 &
2330     done
2331
2332   which will run four, concurrent *note TCP_STREAM: TCP_STREAM. tests
2333from the system on which it is executed to tardy.cup.hp.com.  Each
2334concurrent netperf will iterate 10 times thanks to the `-i' option and
2335will omit the test banners (option `-P') for brevity.  The output looks
2336something like this:
2337
2338      87380  16384  16384    10.03     235.15
2339      87380  16384  16384    10.03     235.09
2340      87380  16384  16384    10.03     235.38
2341      87380  16384  16384    10.03     233.96
2342
2343   We can take the sum of the results and be reasonably confident that
2344the aggregate performance was 940 Mbits/s.  This method does not need
2345to be limited to one system speaking to one other system.  It can be
2346extended to one system talking to N other systems.  It could be as
2347simple as:
     for host in foo bar baz bing
     do
     netperf -t TCP_STREAM -H $host -i 10 -P 0 &
     done
   A more complicated/sophisticated example can be found in
`doc/examples/runemomniagg2.sh'.
2354
2355   If you see warnings about netperf not achieving the confidence
2356intervals, the best thing to do is to increase the number of iterations
2357with `-i' and/or increase the run length of each iteration with `-l'.
2358
2359   You can also enable local (`-c') and/or remote (`-C') CPU
2360utilization:
2361
2362     for i in 1 2 3 4
2363     do
2364     netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 -c -C &
2365     done
2366
2367     87380  16384  16384    10.03       235.47   3.67     5.09     10.226  14.180
2368     87380  16384  16384    10.03       234.73   3.67     5.09     10.260  14.225
2369     87380  16384  16384    10.03       234.64   3.67     5.10     10.263  14.231
2370     87380  16384  16384    10.03       234.87   3.67     5.09     10.253  14.215
2371
2372   If the CPU utilizations reported for the same system are the same or
2373very very close you can be reasonably confident that skew error is
2374minimized.  Presumably one could then omit `-i' but that is not
2375advised, particularly when/if the CPU utilization approaches 100
2376percent.  In the example above we see that the CPU utilization on the
2377local system remains the same for all four tests, and is only off by
23780.01 out of 5.09 on the remote system.  As the number of CPUs in the
2379system increases, and so too the odds of saturating a single CPU, the
2380accuracy of similar CPU utilization implying little skew error is
2381diminished.  This is also the case for those increasingly rare single
2382CPU systems if the utilization is reported as 100% or very close to it.
2383
2384     NOTE: It is very important to remember that netperf is calculating
2385     system-wide CPU utilization.  When calculating the service demand
2386     (those last two columns in the output above) each netperf assumes
2387     it is the only thing running on the system.  This means that for
2388     concurrent tests the service demands reported by netperf will be
2389     wrong.  One has to compute service demands for concurrent tests by
2390     hand.
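
   As a first-order sketch of such a hand computation, assuming the
four concurrent streams above are essentially identical: each netperf
charged the full 3.67% system-wide CPU utilization against only its own
~235 Mbit/s of work and reported roughly 10.25 us/KB.  Since the
aggregate work was ~940 Mbit/s, a corrected local service demand would
be approximately 10.25 / 4, or about 2.56 us/KB.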
2391
2392   If you wish you can add a unique, global `-B' option to each command
2393line to append the given string to the output:
2394
2395     for i in 1 2 3 4
2396     do
2397     netperf -t TCP_STREAM -H tardy.cup.hp.com -B "this is test $i" -i 10 -P 0 &
2398     done
2399
2400     87380  16384  16384    10.03     234.90   this is test 4
2401     87380  16384  16384    10.03     234.41   this is test 2
2402     87380  16384  16384    10.03     235.26   this is test 1
2403     87380  16384  16384    10.03     235.09   this is test 3
2404
2405   You will notice that the tests completed in an order other than they
2406were started from the shell.  This underscores why there is a threat of
2407skew error and why netperf4 will eventually be the preferred tool for
2408aggregate tests.  Even if you see the Netperf Contributing Editor
2409acting to the contrary!-)
2410
2411* Menu:
2412
2413* Issues in Running Concurrent Tests::
2414
2415
2416File: netperf.info,  Node: Issues in Running Concurrent Tests,  Prev: Running Concurrent Netperf Tests,  Up: Running Concurrent Netperf Tests
2417
24187.1.1 Issues in Running Concurrent Tests
2419----------------------------------------
2420
2421In addition to the aforementioned issue of skew error, there can be
2422other issues to consider when running concurrent netperf tests.
2423
2424   For example, when running concurrent tests over multiple interfaces,
2425one is not always assured that the traffic one thinks went over a given
2426interface actually did so.  In particular, the Linux networking stack
2427takes a particularly strong stance on its following the so called `weak
2428end system model'.  As such, it is willing to answer ARP requests for
2429any of its local IP addresses on any of its interfaces.  If multiple
2430interfaces are connected to the same broadcast domain, then even if
2431they are configured into separate IP subnets there is no a priori way
2432of knowing which interface was actually used for which connection(s).
2433This can be addressed by setting the `arp_ignore' sysctl before
2434configuring interfaces.
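
   A minimal sketch of doing so on Linux follows; confirm the setting
and value against your kernel's documentation:

     sysctl -w net.ipv4.conf.all.arp_ignore=1

   With a value of 1, the kernel answers an ARP request only when the
target IP address is configured on the interface on which the request
arrived.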
2435
   As it is quite important, we will repeat: each concurrent netperf
instance is calculating system-wide CPU utilization.  When calculating
the service demand each netperf assumes it is the only thing running on
the system.  This means that for concurrent tests the service demands
reported by netperf will be wrong.  One has to compute service demands
for concurrent tests by hand.
2443
2444   Running concurrent tests can also become difficult when there is no
2445one "central" node.  Running tests between pairs of systems may be more
2446difficult, calling for remote shell commands in the for loop rather
2447than netperf commands.  This introduces more skew error, which the
2448confidence intervals may not be able to sufficiently mitigate.  One
2449possibility is to actually run three consecutive netperf tests on each
2450node - the first being a warm-up, the last being a cool-down.  The idea
2451then is to ensure that the time it takes to get all the netperfs
2452started is less than the length of the first netperf command in the
2453sequence of three.  Similarly, it assumes that all "middle" netperfs
2454will complete before the first of the "last" netperfs complete.
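
   A sketch of that idea for a single node follows; the hostname and
durations are arbitrary assumptions, and in practice each node's
sequence would be launched via a remote shell command:

     # warm-up, measured run, cool-down - keep only the middle result
     netperf -t TCP_STREAM -H peer -l 30 -P 0
     netperf -t TCP_STREAM -H peer -l 60 -P 0
     netperf -t TCP_STREAM -H peer -l 30 -P 0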
2455
2456
2457File: netperf.info,  Node: Using --enable-burst,  Next: Using --enable-demo,  Prev: Running Concurrent Netperf Tests,  Up: Using Netperf to Measure Aggregate Performance
2458
7.2 Using --enable-burst
========================
2461
2462Starting in version 2.5.0 `--enable-burst=yes' is the default, which
2463means one no longer must:
2464
2465     configure --enable-burst
2466
   to have burst-mode functionality present in netperf.  This enables a
2468test-specific `-b num' option in *note TCP_RR: TCP_RR, *note UDP_RR:
2469UDP_RR. and *note omni: The Omni Tests. tests.
2470
   Normally, netperf will attempt to ramp up the number of outstanding
2472requests to `num' plus one transactions in flight at one time.  The
2473ramp-up is to avoid transactions being smashed together into a smaller
2474number of segments when the transport's congestion window (if any) is
2475smaller at the time than what netperf wants to have outstanding at one
2476time. If, however, the user specifies a negative value for `num' this
2477ramp-up is bypassed and the burst of sends is made without
2478consideration of transport congestion window.
2479
2480   This burst-mode is used as an alternative to or even in conjunction
2481with multiple-concurrent _RR tests and as a way to implement a
2482single-connection, bidirectional bulk-transfer test.  When run with
2483just a single instance of netperf, increasing the burst size can
2484determine the maximum number of transactions per second which can be
2485serviced by a single process:
2486
2487     for b in 0 1 2 4 8 16 32
2488     do
2489      netperf -v 0 -t TCP_RR -B "-b $b" -H hpcpc108 -P 0 -- -b $b
2490     done
2491
2492     9457.59 -b 0
2493     9975.37 -b 1
2494     10000.61 -b 2
2495     20084.47 -b 4
2496     29965.31 -b 8
2497     71929.27 -b 16
2498     109718.17 -b 32
2499
   The global `-v' and `-P' options were used to minimize the output to
the single figure of merit, which in this case is the transaction rate.
The global `-B' option was used to more clearly label the output, and
the test-specific `-b' option enabled by `--enable-burst' increases the
number of transactions in flight at one time.
2505
2506   Now, since the test-specific `-D' option was not specified to set
2507TCP_NODELAY, the stack was free to "bundle" requests and/or responses
2508into TCP segments as it saw fit, and since the default request and
2509response size is one byte, there could have been some considerable
2510bundling even in the absence of transport congestion window issues.  If
2511one wants to try to achieve a closer to one-to-one correspondence
2512between a request and response and a TCP segment, add the test-specific
2513`-D' option:
2514
2515     for b in 0 1 2 4 8 16 32
2516     do
2517      netperf -v 0 -t TCP_RR -B "-b $b -D" -H hpcpc108 -P 0 -- -b $b -D
2518     done
2519
2520      8695.12 -b 0 -D
2521      19966.48 -b 1 -D
2522      20691.07 -b 2 -D
2523      49893.58 -b 4 -D
2524      62057.31 -b 8 -D
2525      108416.88 -b 16 -D
2526      114411.66 -b 32 -D
2527
2528   You can see that this has a rather large effect on the reported
2529transaction rate.  In this particular instance, the author believes it
2530relates to interactions between the test and interrupt coalescing
2531settings in the driver for the NICs used.
2532
2533     NOTE: Even if you set the `-D' option that is still not a
2534     guarantee that each transaction is in its own TCP segments.  You
2535     should get into the habit of verifying the relationship between the
2536     transaction rate and the packet rate via other means.
2537
2538   You can also combine `--enable-burst' functionality with concurrent
2539netperf tests.  This would then be an "aggregate of aggregates" if you
2540like:
2541
2542
2543     for i in 1 2 3 4
2544     do
2545      netperf -H hpcpc108 -v 0 -P 0 -i 10 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
2546     done
2547
2548      46668.38 aggregate 4 -b 8 -D
2549      44890.64 aggregate 2 -b 8 -D
2550      45702.04 aggregate 1 -b 8 -D
2551      46352.48 aggregate 3 -b 8 -D
2552
2553   Since each netperf did hit the confidence intervals, we can be
2554reasonably certain that the aggregate transaction per second rate was
2555the sum of all four concurrent tests, or something just shy of 184,000
2556transactions per second.  To get some idea if that was also the packet
2557per second rate, we could bracket that `for' loop with something to
2558gather statistics and run the results through beforeafter
2559(ftp://ftp.cup.hp.com/dist/networking/tools):
2560
2561     /usr/sbin/ethtool -S eth2 > before
2562     for i in 1 2 3 4
2563     do
2564      netperf -H 192.168.2.108 -l 60 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
2565     done
2566     wait
2567     /usr/sbin/ethtool -S eth2 > after
2568
2569      52312.62 aggregate 2 -b 8 -D
2570      50105.65 aggregate 4 -b 8 -D
2571      50890.82 aggregate 1 -b 8 -D
2572      50869.20 aggregate 3 -b 8 -D
2573
2574     beforeafter before after > delta
2575
2576     grep packets delta
2577          rx_packets: 12251544
2578          tx_packets: 12251550
2579
2580   This example uses `ethtool' because the system being used is running
2581Linux.  Other platforms have other tools - for example HP-UX has
2582lanadmin:
2583
2584     lanadmin -g mibstats <ppa>
2585
2586   and of course one could instead use `netstat'.
2587
2588   The `wait' is important because we are launching concurrent netperfs
2589in the background.  Without it, the second ethtool command would be run
2590before the tests finished and perhaps even before the last of them got
2591started!
2592
2593   The sum of the reported transaction rates is 204178 over 60 seconds,
2594which is a total of 12250680 transactions.  Each transaction is the
2595exchange of a request and a response, so we multiply that by 2 to
2596arrive at 24501360.
2597
2598   The sum of the ethtool stats is 24503094 packets which matches what
2599netperf was reporting very well.
2600
2601   Had the request or response size differed, we would need to know how
2602it compared with the "MSS" for the connection.
2603
2604   Just for grins, here is the exercise repeated, using `netstat'
2605instead of `ethtool'
2606
2607     netstat -s -t > before
2608     for i in 1 2 3 4
2609     do
      netperf -l 60 -H 192.168.2.108 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
     done
2611     wait
2612     netstat -s -t > after
2613
2614      51305.88 aggregate 4 -b 8 -D
2615      51847.73 aggregate 2 -b 8 -D
2616      50648.19 aggregate 3 -b 8 -D
2617      53605.86 aggregate 1 -b 8 -D
2618
2619     beforeafter before after > delta
2620
2621     grep segments delta
2622         12445708 segments received
2623         12445730 segments send out
2624         1 segments retransmited
2625         0 bad segments received.
2626
2627   The sums are left as an exercise to the reader :)
2628
   Things become considerably more complicated if there are non-trivial
packet losses and/or retransmissions.
2631
2632   Of course all this checking is unnecessary if the test is a UDP_RR
2633test because UDP "never" aggregates multiple sends into the same UDP
2634datagram, and there are no ACKnowledgements in UDP.  The loss of a
2635single request or response will not bring a "burst" UDP_RR test to a
2636screeching halt, but it will reduce the number of transactions
2637outstanding at any one time.  A "burst" UDP_RR test will come to a halt
2638if the sum of the lost requests and responses reaches the value
2639specified in the test-specific `-b' option.
2640
2641
2642File: netperf.info,  Node: Using --enable-demo,  Prev: Using --enable-burst,  Up: Using Netperf to Measure Aggregate Performance
2643
7.3 Using --enable-demo
=======================
2646
One can
     configure --enable-demo
   and compile netperf to enable it to emit "interim results" at
2650semi-regular intervals.  This enables a global `-D' option which takes
2651a reporting interval as an argument.  With that specified, the output
2652of netperf will then look something like
2653
2654     $ src/netperf -D 1.25
2655     MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain () port 0 AF_INET : demo
2656     Interim result: 25425.52 10^6bits/s over 1.25 seconds ending at 1327962078.405
2657     Interim result: 25486.82 10^6bits/s over 1.25 seconds ending at 1327962079.655
2658     Interim result: 25474.96 10^6bits/s over 1.25 seconds ending at 1327962080.905
2659     Interim result: 25523.49 10^6bits/s over 1.25 seconds ending at 1327962082.155
2660     Interim result: 25053.57 10^6bits/s over 1.27 seconds ending at 1327962083.429
2661     Interim result: 25349.64 10^6bits/s over 1.25 seconds ending at 1327962084.679
2662     Interim result: 25292.84 10^6bits/s over 1.25 seconds ending at 1327962085.932
2663     Recv   Send    Send
2664     Socket Socket  Message  Elapsed
2665     Size   Size    Size     Time     Throughput
2666     bytes  bytes   bytes    secs.    10^6bits/sec
2667
2668      87380  16384  16384    10.00    25375.66
2669   The units of the "Interim result" lines will follow the units
2670selected via the global `-f' option.  If the test-specific `-o' option
2671is specified on the command line, the format will be CSV:
2672     ...
2673     2978.81,MBytes/s,1.25,1327962298.035
2674     ...
2675   If the test-specific `-k' option is used the format will be keyval
2676with each keyval being given an index:
2677     ...
2678     NETPERF_INTERIM_RESULT[2]=25.00
2679     NETPERF_UNITS[2]=10^9bits/s
2680     NETPERF_INTERVAL[2]=1.25
2681     NETPERF_ENDING[2]=1327962357.249
2682     ...
2683   The expectation is it may be easier to utilize the keyvals if they
2684have indices.
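
   One consequence, though not a documented interface, is that when the
test banner is suppressed with the global `-P 0' option, the indexed
keyval lines happen to be valid bash array assignments, so saved output
can be post-processed by simply sourcing it:

     #!/bin/bash
     # results.kv is assumed to hold the keyval output of one netperf
     source ./results.kv
     echo "interval 2: ${NETPERF_INTERIM_RESULT[2]} ${NETPERF_UNITS[2]}"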
2685
2686   But how does this help with aggregate tests?  Well, what one can do
2687is start the netperfs via a script, giving each a Very Long (tm) run
2688time.  Direct the output to a file per instance.  Then, once all the
2689netperfs have been started, take a timestamp and wait for some desired
2690test interval.  Once that interval expires take another timestamp and
2691then start terminating the netperfs by sending them a SIGALRM signal
2692via the likes of the `kill' or `pkill' command.  The netperfs will
2693terminate and emit the rest of the "usual" output, and you can then
2694bring the files to a central location for post processing to find the
2695aggregate performance over the "test interval."
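
   A minimal sketch of that procedure follows; the hostname, durations
and file names are all assumptions:

     for i in 1 2 3 4
     do
      netperf -H remotehost -l 7200 -D 2 > netperf_$i.out 2>&1 &
     done
     date +%s > ts_begin
     sleep 120                     # the desired test interval
     date +%s > ts_end
     pkill -ALRM netperf           # netperfs emit final output and exit
     wait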
2696
2697   This method has the advantage that it does not require advance
2698knowledge of how long it takes to get netperf tests started and/or
2699stopped.  It does though require sufficiently synchronized clocks on
2700all the test systems.
2701
   While calls to get the current time can be inexpensive, that has not
been, nor is it, universally true.  For that reason netperf tries to
minimize the number of such "timestamping" calls (eg `gettimeofday') it
makes when in demo mode.  Rather than take a timestamp after
2706each `send' or `recv' call completes netperf tries to guess how many
2707units of work will be performed over the desired interval.  Only once
2708that many units of work have been completed will netperf check the
2709time.  If the reporting interval has passed, netperf will emit an
2710"interim result."  If the interval has not passed, netperf will update
2711its estimate for units and continue.
2712
   After a bit of thought one can see that if things "speed up" netperf
will still honor the interval.  However, if things "slow down" netperf
may be late with an "interim result."  Here is an example of both of
2716those happening during a test - with the interval being honored while
2717throughput increases, and then about half-way through when another
2718netperf (not shown) is started we see things slowing down and netperf
2719not hitting the interval as desired.
2720     $ src/netperf -D 2 -H tardy.hpl.hp.com -l 20
2721     MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com () port 0 AF_INET : demo
2722     Interim result:   36.46 10^6bits/s over 2.01 seconds ending at 1327963880.565
2723     Interim result:   59.19 10^6bits/s over 2.00 seconds ending at 1327963882.569
2724     Interim result:   73.39 10^6bits/s over 2.01 seconds ending at 1327963884.576
2725     Interim result:   84.01 10^6bits/s over 2.03 seconds ending at 1327963886.603
2726     Interim result:   75.63 10^6bits/s over 2.21 seconds ending at 1327963888.814
2727     Interim result:   55.52 10^6bits/s over 2.72 seconds ending at 1327963891.538
2728     Interim result:   70.94 10^6bits/s over 2.11 seconds ending at 1327963893.650
2729     Interim result:   80.66 10^6bits/s over 2.13 seconds ending at 1327963895.777
2730     Interim result:   86.42 10^6bits/s over 2.12 seconds ending at 1327963897.901
2731     Recv   Send    Send
2732     Socket Socket  Message  Elapsed
2733     Size   Size    Size     Time     Throughput
2734     bytes  bytes   bytes    secs.    10^6bits/sec
2735
2736      87380  16384  16384    20.34      68.87
2737   So long as your post-processing mechanism can account for that, there
should be no problem.  As time passes there may be changes to try to
improve netperf's honoring of the interval, but one should not ass-u-me
2740it will always do so.  One should not assume the precision will remain
2741fixed - future versions may change it - perhaps going beyond tenths of
2742seconds in reporting the interval length etc.
2743
2744
2745File: netperf.info,  Node: Using Netperf to Measure Bidirectional Transfer,  Next: The Omni Tests,  Prev: Using Netperf to Measure Aggregate Performance,  Up: Top
2746
27478 Using Netperf to Measure Bidirectional Transfer
2748*************************************************
2749
2750There are two ways to use netperf to measure the performance of
2751bidirectional transfer.  The first is to run concurrent netperf tests
2752from the command line.  The second is to configure netperf with
2753`--enable-burst' and use a single instance of the *note TCP_RR: TCP_RR.
2754test.
2755
   While neither method is more "correct" than the other, each measures
bidirectional transfer in a different way, and that has possible
implications.  For
2758instance, using the concurrent netperf test mechanism means that
2759multiple TCP connections and multiple processes are involved, whereas
2760using the single instance of TCP_RR there is only one TCP connection
2761and one process on each end.  They may behave differently, especially
2762on an MP system.
2763
2764* Menu:
2765
2766* Bidirectional Transfer with Concurrent Tests::
2767* Bidirectional Transfer with TCP_RR::
2768* Implications of Concurrent Tests vs Burst Request/Response::
2769
2770
2771File: netperf.info,  Node: Bidirectional Transfer with Concurrent Tests,  Next: Bidirectional Transfer with TCP_RR,  Prev: Using Netperf to Measure Bidirectional Transfer,  Up: Using Netperf to Measure Bidirectional Transfer
2772
27738.1 Bidirectional Transfer with Concurrent Tests
2774================================================
2775
2776If we had two hosts Fred and Ethel, we could simply run a netperf *note
2777TCP_STREAM: TCP_STREAM. test on Fred pointing at Ethel, and a
2778concurrent netperf TCP_STREAM test on Ethel pointing at Fred, but since
2779there are no mechanisms to synchronize netperf tests and we would be
2780starting tests from two different systems, there is a considerable risk
2781of skew error.
2782
2783   Far better would be to run simultaneous TCP_STREAM and *note
2784TCP_MAERTS: TCP_MAERTS. tests from just one system, using the concepts
2785and procedures outlined in *note Running Concurrent Netperf Tests:
2786Running Concurrent Netperf Tests. Here then is an example:
2787
2788     for i in 1
2789     do
2790      netperf -H 192.168.2.108 -t TCP_STREAM -B "outbound" -i 10 -P 0 -v 0 \
2791        -- -s 256K -S 256K &
2792      netperf -H 192.168.2.108 -t TCP_MAERTS -B "inbound"  -i 10 -P 0 -v 0 \
2793        -- -s 256K -S 256K &
2794     done
2795
2796      892.66 outbound
2797      891.34 inbound
2798
   We have used a `for' loop in the shell with just one iteration
because that makes it much easier to get both tests started at more or
less the same time than doing it by hand.  The global `-P' and `-v'
2802options are used because we aren't interested in anything other than
2803the throughput, and the global `-B' option is used to tag each output
2804so we know which was inbound and which outbound relative to the system
2805on which we were running netperf.  Of course that sense is switched on
2806the system running netserver :)  The use of the global `-i' option is
2807explained in *note Running Concurrent Netperf Tests: Running Concurrent
2808Netperf Tests.
2809
2810   Beginning with version 2.5.0 we can accomplish a similar result with
2811the *note the omni tests: The Omni Tests. and *note output selectors:
2812Omni Output Selectors.:
2813
2814     for i in 1
2815     do
2816       netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
2817         -d stream -s 256K -S 256K -o throughput,direction &
2818       netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
2819         -d maerts -s 256K -S 256K -o throughput,direction &
2820     done
2821
2822     805.26,Receive
2823     828.54,Send
2824
2825
2826File: netperf.info,  Node: Bidirectional Transfer with TCP_RR,  Next: Implications of Concurrent Tests vs Burst Request/Response,  Prev: Bidirectional Transfer with Concurrent Tests,  Up: Using Netperf to Measure Bidirectional Transfer
2827
28288.2 Bidirectional Transfer with TCP_RR
2829======================================
2830
2831Starting with version 2.5.0 the `--enable-burst' configure option
2832defaults to `yes', and starting some time before version 2.5.0 but
2833after 2.4.0 the global `-f' option would affect the "throughput"
2834reported by request/response tests.  If one uses the test-specific `-b'
2835option to have several "transactions" in flight at one time and the
2836test-specific `-r' option to increase their size, the test looks more
2837and more like a single-connection bidirectional transfer than a simple
2838request/response test.
2839
2840   So, putting it all together one can do something like:
2841
     netperf -f m -t TCP_RR -H 192.168.1.3 -v 2 -- -b 6 -r 32K -s 256K -S 256K
2843     MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.3 (192.168.1.3) port 0 AF_INET : interval : first burst 6
2844     Local /Remote
2845     Socket Size   Request  Resp.   Elapsed
2846     Send   Recv   Size     Size    Time     Throughput
2847     bytes  Bytes  bytes    bytes   secs.    10^6bits/sec
2848
2849     16384  87380  32768    32768   10.00    1821.30
2850     524288 524288
2851     Alignment      Offset         RoundTrip  Trans    Throughput
2852     Local  Remote  Local  Remote  Latency    Rate     10^6bits/s
2853     Send   Recv    Send   Recv    usec/Tran  per sec  Outbound   Inbound
2854         8      0       0      0   2015.402   3473.252 910.492    910.492
2855
   to get a bidirectional bulk-throughput result.  As one can see, the
`-v 2' output will include a number of interesting, related values.
2858
2859     NOTE: The logic behind `--enable-burst' is very simple, and there
2860     are no calls to `poll()' or `select()' which means we want to make
2861     sure that the `send()' calls will never block, or we run the risk
2862     of deadlock with each side stuck trying to call `send()' and
2863     neither calling `recv()'.
2864
2865   Fortunately, this is easily accomplished by setting a "large enough"
2866socket buffer size with the test-specific `-s' and `-S' options.
2867Presently this must be performed by the user.  Future versions of
2868netperf might attempt to do this automagically, but there are some
2869issues to be worked-out.
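
   As a rough sizing sketch - this is not something netperf computes or
enforces for you - with the test-specific `-b 6' and `-r 32K' options of
the example above there can be up to 6+1 = 7 transactions, or 7 * 32768
= 229376 bytes, outstanding in each direction, so the requested 256K
socket buffers leave some headroom.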
2870
2871
2872File: netperf.info,  Node: Implications of Concurrent Tests vs Burst Request/Response,  Prev: Bidirectional Transfer with TCP_RR,  Up: Using Netperf to Measure Bidirectional Transfer
2873
28748.3 Implications of Concurrent Tests vs Burst Request/Response
2875==============================================================
2876
There are perhaps subtle but important differences between using
concurrent unidirectional tests vs a burst-mode request/response test
to measure bidirectional performance.
2880
2881   Broadly speaking, a single "connection" or "flow" of traffic cannot
2882make use of the services of more than one or two CPUs at either end.
2883Whether one or two CPUs will be used processing a flow will depend on
2884the specifics of the stack(s) involved and whether or not the global
2885`-T' option has been used to bind netperf/netserver to specific CPUs.
2886
   When using concurrent tests there will be two concurrent connections
or flows, which means that upwards of four CPUs will be employed
processing the packets (if the global `-T' option is used; no more than
two if not).  However, with just a single, bidirectional
request/response test no more than two CPUs will be employed (only one
if the global `-T' is not used).
2893
2894   If there is a CPU bottleneck on either system this may result in
2895rather different results between the two methods.
2896
2897   Also, with a bidirectional request/response test there is something
2898of a natural balance or synchronization between inbound and outbound - a
2899response will not be sent until a request is received, and (once the
2900burst level is reached) a subsequent request will not be sent until a
2901response is received.  This may mask favoritism in the NIC between
2902inbound and outbound processing.
2903
2904   With two concurrent unidirectional tests there is no such
2905synchronization or balance and any favoritism in the NIC may be exposed.
2906
2907
2908File: netperf.info,  Node: The Omni Tests,  Next: Other Netperf Tests,  Prev: Using Netperf to Measure Bidirectional Transfer,  Up: Top
2909
29109 The Omni Tests
2911****************
2912
2913Beginning with version 2.5.0, netperf begins a migration to the `omni'
2914tests or "Two routines to measure them all."  The code for the omni
2915tests can be found in `src/nettest_omni.c' and the goal is to make it
2916easier for netperf to support multiple protocols and report a great
2917many additional things about the systems under test.  Additionally, a
2918flexible output selection mechanism is present which allows the user to
choose specifically what values she wishes to have reported and in what
2920format.
2921
2922   The omni tests are included by default in version 2.5.0.  To disable
2923them, one must:
2924     ./configure --enable-omni=no ...
2925
2926   and remake netperf.  Remaking netserver is optional because even in
29272.5.0 it has "unmigrated" netserver side routines for the classic (eg
2928`src/nettest_bsd.c') tests.
2929
2930* Menu:
2931
2932* Native Omni Tests::
2933* Migrated Tests::
2934* Omni Output Selection::
2935
2936
2937File: netperf.info,  Node: Native Omni Tests,  Next: Migrated Tests,  Prev: The Omni Tests,  Up: The Omni Tests
2938
29399.1 Native Omni Tests
2940=====================
2941
One accesses the omni tests "natively" by using a value of "OMNI" with
2943the global `-t' test-selection option.  This will then cause netperf to
2944use the code in `src/nettest_omni.c' and in particular the
2945test-specific options parser for the omni tests.  The test-specific
2946options for the omni tests are a superset of those for "classic" tests.
2947The options added by the omni tests are:
2948
2949`-c'
2950     This explicitly declares that the test is to include connection
2951     establishment and tear-down as in either a TCP_CRR or TCP_CC test.
2952
2953`-d <direction>'
2954     This option sets the direction of the test relative to the netperf
2955     process.  As of version 2.5.0 one can use the following in a
2956     case-insensitive manner:
2957
2958    `send, stream, transmit, xmit or 2'
2959          Any of which will cause netperf to send to the netserver.
2960
2961    `recv, receive, maerts or 4'
2962          Any of which will cause netserver to send to netperf.
2963
2964    `rr or 6'
2965          Either of which will cause a request/response test.
2966
     Additionally, one can specify two directions separated by a '|'
     character and they will be OR'ed together.  In this way one can use
     the "Send|Recv" that will be emitted by the *note DIRECTION: Omni
     Output Selectors. *note output selector: Omni Output Selection.
     when used with a request/response test.  An example appears after
     this list of options.
2972
2973`-k [*note output selector: Omni Output Selection.]'
2974     This option sets the style of output to "keyval" where each line of
2975     output has the form:
2976          key=value
2977     For example:
2978          $ netperf -t omni -- -d rr -k "THROUGHPUT,THROUGHPUT_UNITS"
2979          OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
2980          THROUGHPUT=59092.65
2981          THROUGHPUT_UNITS=Trans/s
2982
2983     Using the `-k' option will override any previous, test-specific
2984     `-o' or `-O' option.
2985
2986`-o [*note output selector: Omni Output Selection.]'
2987     This option sets the style of output to "CSV" where there will be
2988     one line of comma-separated values, preceded by one line of column
2989     names unless the global `-P' option is used with a value of 0:
2990          $ netperf -t omni -- -d rr -o "THROUGHPUT,THROUGHPUT_UNITS"
2991          OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
2992          Throughput,Throughput Units
2993          60999.07,Trans/s
2994
2995     Using the `-o' option will override any previous, test-specific
2996     `-k' or `-O' option.
2997
2998`-O [*note output selector: Omni Output Selection.]'
2999     This option sets the style of output to "human readable" which will
3000     look quite similar to classic netperf output:
3001          $ netperf -t omni -- -d rr -O "THROUGHPUT,THROUGHPUT_UNITS"
3002          OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
3003          Throughput Throughput
3004                     Units
3005
3006
3007          60492.57   Trans/s
3008
3009     Using the `-O' option will override any previous, test-specific
3010     `-k' or `-o' option.
3011
3012`-t'
3013     This option explicitly sets the socket type for the test's data
3014     connection. As of version 2.5.0 the known socket types include
3015     "stream" and "dgram" for SOCK_STREAM and SOCK_DGRAM respectively.
3016
3017`-T <protocol>'
3018     This option is used to explicitly set the protocol used for the
3019     test. It is case-insensitive. As of version 2.5.0 the protocols
3020     known to netperf include:
3021    `TCP'
3022          Select the Transmission Control Protocol
3023
3024    `UDP'
3025          Select the User Datagram Protocol
3026
3027    `SDP'
3028          Select the Sockets Direct Protocol
3029
3030    `DCCP'
3031          Select the Datagram Congestion Control Protocol
3032
3033    `SCTP'
3034          Select the Stream Control Transport Protocol
3035
3036    `udplite'
3037          Select UDP Lite
3038
3039     The default is implicit based on other settings.
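
   Here is an illustration combining several of these options; the
hostname is an assumption, and the '|' is quoted to keep it from being
interpreted by the shell:

     netperf -t omni -H remotehost -- -d 'send|recv' -T udp \
       -o throughput,direction,protocol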
3040
3041   The omni tests also extend the interpretation of some of the classic,
3042test-specific options for the BSD Sockets tests:
3043
3044`-m <optionspec>'
3045     This can set the send size for either or both of the netperf and
3046     netserver sides of the test:
3047          -m 32K
3048     sets only the netperf-side send size to 32768 bytes, and or's-in
3049     transmit for the direction. This is effectively the same behaviour
3050     as for the classic tests.
3051          -m ,32K
3052     sets only the netserver side send size to 32768 bytes and or's-in
3053     receive for the direction.
          -m 16K,32K
     sets the netperf side send size to 16384 bytes, the netserver side
     send size to 32768 bytes and the direction will be "Send|Recv."
3057
3058`-M <optionspec>'
3059     This can set the receive size for either or both of the netperf and
3060     netserver sides of the test:
3061          -M 32K
3062     sets only the netserver side receive size to 32768 bytes and
3063     or's-in send for the test direction.
3064          -M ,32K
3065     sets only the netperf side receive size to 32768 bytes and or's-in
3066     receive for the test direction.
3067          -M 16K,32K
3068     sets the netserver side receive size to 16384 bytes and the netperf
3069     side receive size to 32768 bytes and the direction will be
3070     "Send|Recv."
3071
3072
3073File: netperf.info,  Node: Migrated Tests,  Next: Omni Output Selection,  Prev: Native Omni Tests,  Up: The Omni Tests
3074
30759.2 Migrated Tests
3076==================
3077
3078As of version 2.5.0 several tests have been migrated to use the omni
3079code in `src/nettest_omni.c' for the core of their testing.  A migrated
3080test retains all its previous output code and so should still "look and
3081feel" just like a pre-2.5.0 test with one exception - the first line of
3082the test banners will include the word "MIGRATED" at the beginning as
3083in:
3084
3085     $ netperf
3086     MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
3087     Recv   Send    Send
3088     Socket Socket  Message  Elapsed
3089     Size   Size    Size     Time     Throughput
3090     bytes  bytes   bytes    secs.    10^6bits/sec
3091
3092      87380  16384  16384    10.00    27175.27
3093
3094   The tests migrated in version 2.5.0 are:
3095   * TCP_STREAM
3096
3097   * TCP_MAERTS
3098
3099   * TCP_RR
3100
3101   * TCP_CRR
3102
3103   * UDP_STREAM
3104
3105   * UDP_RR
3106
3107   It is expected that future releases will have additional tests
3108migrated to use the "omni" functionality.
3109
3110   If one uses "omni-specific" test-specific options in conjunction
3111with a migrated test, instead of using the classic output code, the new
omni output code will be used. For example if one uses the `-k'
test-specific option with a value of "THROUGHPUT,THROUGHPUT_UNITS" with
a migrated TCP_RR test one will see:
3115
3116     $ netperf -t tcp_rr -- -k THROUGHPUT,THROUGHPUT_UNITS
3117     MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
3118     THROUGHPUT=60074.74
3119     THROUGHPUT_UNITS=Trans/s
3120   rather than:
3121     $ netperf -t tcp_rr
3122     MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
3123     Local /Remote
3124     Socket Size   Request  Resp.   Elapsed  Trans.
3125     Send   Recv   Size     Size    Time     Rate
3126     bytes  Bytes  bytes    bytes   secs.    per sec
3127
3128     16384  87380  1        1       10.00    59421.52
3129     16384  87380
3130
3131
3132File: netperf.info,  Node: Omni Output Selection,  Prev: Migrated Tests,  Up: The Omni Tests
3133
31349.3 Omni Output Selection
3135=========================
3136
3137The omni test-specific `-k', `-o' and `-O' options take an optional
3138`output selector' by which the user can configure what values are
3139reported.  The output selector can take several forms:
3140
3141``filename''
3142     The output selections will be read from the named file. Within the
3143     file there can be up to four lines of comma-separated output
3144     selectors. This controls how many multi-line blocks of output are
3145     emitted when the `-O' option is used.  This output, while not
3146     identical to "classic" netperf output, is inspired by it.
3147     Multiple lines have no effect for `-k' and `-o' options.  Putting
3148     output selections in a file can be useful when the list of
3149     selections is long.
3150
3151`comma and/or semi-colon-separated list'
3152     The output selections will be parsed from a comma and/or
3153     semi-colon-separated list of output selectors. When the list is
3154     given to a `-O' option a semi-colon specifies a new output block
3155     should be started.  Semi-colons have the same meaning as commas
3156     when used with the `-k' or `-o' options.  Depending on the command
3157     interpreter being used, the semi-colon may have to be escaped
3158     somehow to keep it from being interpreted by the command
3159     interpreter.  This can often be done by enclosing the entire list
3160     in quotes.
3161
3162`all'
3163     If the keyword all is specified it means that all known output
3164     values should be displayed at the end of the test.  This can be a
3165     great deal of output.  As of version 2.5.0 there are 157 different
3166     output selectors.
3167
3168`?'
3169     If a "?" is given as the output selection, the list of all known
3170     output selectors will be displayed and no test actually run.  When
3171     passed to the `-O' option they will be listed one per line.
3172     Otherwise they will be listed as a comma-separated list.  It may
3173     be necessary to protect the "?" from the command interpreter by
3174     escaping it or enclosing it in quotes.
3175
3176`no selector'
3177     If nothing is given to the `-k', `-o' or `-O' option then the code
3178     selects a default set of output selectors inspired by classic
3179     netperf output. The format will be the `human readable' format
3180     emitted by the test-specific `-O' option.
3181
3182   The order of evaluation will first check for an output selection.  If
3183none is specified with the `-k', `-o' or `-O' option netperf will
3184select a default based on the characteristics of the test.  If there is
3185an output selection, the code will first check for `?', then check to
3186see if it is the magic `all' keyword.  After that it will check for
3187either `,' or `;' in the selection and take that to mean it is a comma
3188and/or semi-colon-separated list. If none of those checks match,
3189netperf will then assume the output specification is a filename and
3190attempt to open and parse the file.
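
   For example - the file name here is hypothetical - one might place
two lines of selectors in a file and pass its name to the test-specific
`-O' option:

     $ cat selectors.txt
     THROUGHPUT,THROUGHPUT_UNITS
     LOCAL_CPU_UTIL,REMOTE_CPU_UTIL
     $ netperf -t omni -- -O selectors.txt

   With `-O' each line of the file becomes its own multi-line block of
output; with `-k' or `-o' the additional lines would have no effect.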
3191
3192* Menu:
3193
3194* Omni Output Selectors::
3195
3196
3197File: netperf.info,  Node: Omni Output Selectors,  Prev: Omni Output Selection,  Up: Omni Output Selection
3198
31999.3.1 Omni Output Selectors
3200---------------------------
3201
3202As of version 2.5.0 the output selectors are:
3203
3204`OUTPUT_NONE'
3205     This is essentially a null output.  For `-k' output it will simply
3206     add a line that reads "OUTPUT_NONE=" to the output. For `-o' it
3207     will cause an empty "column" to be included. For `-O' output it
3208     will cause extra spaces to separate "real" output.
3209
3210`SOCKET_TYPE'
3211     This will cause the socket type (eg SOCK_STREAM, SOCK_DGRAM) for
3212     the data connection to be output.
3213
3214`PROTOCOL'
3215     This will cause the protocol used for the data connection to be
3216     displayed.
3217
3218`DIRECTION'
3219     This will display the data flow direction relative to the netperf
3220     process. Units: Send or Recv for a unidirectional bulk-transfer
3221     test, or Send|Recv for a request/response test.
3222
3223`ELAPSED_TIME'
3224     This will display the elapsed time in seconds for the test.
3225
3226`THROUGHPUT'
3227     This will display the throughput for the test. Units: As requested
3228     via the global `-f' option and displayed by the THROUGHPUT_UNITS
3229     output selector.
3230
3231`THROUGHPUT_UNITS'
3232     This will display the units for what is displayed by the
3233     `THROUGHPUT' output selector.
3234
3235`LSS_SIZE_REQ'
3236     This will display the local (netperf) send socket buffer size (aka
3237     SO_SNDBUF) requested via the command line. Units: Bytes.
3238
3239`LSS_SIZE'
3240     This will display the local (netperf) send socket buffer size
3241     (SO_SNDBUF) immediately after the data connection socket was
3242     created.  Peculiarities of different networking stacks may lead to
3243     this differing from the size requested via the command line.
3244     Units: Bytes.
3245
3246`LSS_SIZE_END'
3247     This will display the local (netperf) send socket buffer size
3248     (SO_SNDBUF) immediately before the data connection socket is
3249     closed.  Peculiarities of different networking stacks may lead
3250     this to differ from the size requested via the command line and/or
3251     the size immediately after the data connection socket was created.
3252     Units: Bytes.
3253
3254`LSR_SIZE_REQ'
3255     This will display the local (netperf) receive socket buffer size
3256     (aka SO_RCVBUF) requested via the command line. Units: Bytes.
3257
3258`LSR_SIZE'
3259     This will display the local (netperf) receive socket buffer size
3260     (SO_RCVBUF) immediately after the data connection socket was
3261     created.  Peculiarities of different networking stacks may lead to
3262     this differing from the size requested via the command line.
3263     Units: Bytes.
3264
3265`LSR_SIZE_END'
3266     This will display the local (netperf) receive socket buffer size
3267     (SO_RCVBUF) immediately before the data connection socket is
3268     closed.  Peculiarities of different networking stacks may lead
3269     this to differ from the size requested via the command line and/or
3270     the size immediately after the data connection socket was created.
3271     Units: Bytes.
3272
3273`RSS_SIZE_REQ'
3274     This will display the remote (netserver) send socket buffer size
3275     (aka SO_SNDBUF) requested via the command line. Units: Bytes.
3276
3277`RSS_SIZE'
3278     This will display the remote (netserver) send socket buffer size
3279     (SO_SNDBUF) immediately after the data connection socket was
3280     created.  Peculiarities of different networking stacks may lead to
3281     this differing from the size requested via the command line.
3282     Units: Bytes.
3283
3284`RSS_SIZE_END'
3285     This will display the remote (netserver) send socket buffer size
3286     (SO_SNDBUF) immediately before the data connection socket is
3287     closed.  Peculiarities of different networking stacks may lead
3288     this to differ from the size requested via the command line and/or
3289     the size immediately after the data connection socket was created.
3290     Units: Bytes.
3291
3292`RSR_SIZE_REQ'
3293     This will display the remote (netserver) receive socket buffer
3294     size (aka SO_RCVBUF) requested via the command line. Units: Bytes.
3295
3296`RSR_SIZE'
3297     This will display the remote (netserver) receive socket buffer size
3298     (SO_RCVBUF) immediately after the data connection socket was
3299     created.  Peculiarities of different networking stacks may lead to
3300     this differing from the size requested via the command line.
3301     Units: Bytes.
3302
3303`RSR_SIZE_END'
3304     This will display the remote (netserver) receive socket buffer size
3305     (SO_RCVBUF) immediately before the data connection socket is
3306     closed.  Peculiarities of different networking stacks may lead
3307     this to differ from the size requested via the command line and/or
3308     the size immediately after the data connection socket was created.
3309     Units: Bytes.
3310
3311`LOCAL_SEND_SIZE'
3312     This will display the size of the buffers netperf passed in any
3313     "send" calls it made on the data connection for a
3314     non-request/response test. Units: Bytes.
3315
3316`LOCAL_RECV_SIZE'
3317     This will display the size of the buffers netperf passed in any
3318     "receive" calls it made on the data connection for a
3319     non-request/response test. Units: Bytes.
3320
3321`REMOTE_SEND_SIZE'
3322     This will display the size of the buffers netserver passed in any
3323     "send" calls it made on the data connection for a
3324     non-request/response test. Units: Bytes.
3325
3326`REMOTE_RECV_SIZE'
3327     This will display the size of the buffers netserver passed in any
3328     "receive" calls it made on the data connection for a
3329     non-request/response test. Units: Bytes.
3330
3331`REQUEST_SIZE'
3332     This will display the size of the requests netperf sent in a
3333     request-response test. Units: Bytes.
3334
3335`RESPONSE_SIZE'
3336     This will display the size of the responses netserver sent in a
3337     request-response test. Units: Bytes.
3338
3339`LOCAL_CPU_UTIL'
3340     This will display the overall CPU utilization during the test as
3341     measured by netperf. Units: 0 to 100 percent.
3342
3343`LOCAL_CPU_PERCENT_USER'
3344     This will display the CPU fraction spent in user mode during the
3345     test as measured by netperf. Only supported by netcpu_procstat.
3346     Units: 0 to 100 percent.
3347
3348`LOCAL_CPU_PERCENT_SYSTEM'
3349     This will display the CPU fraction spent in system mode during the
3350     test as measured by netperf. Only supported by netcpu_procstat.
3351     Units: 0 to 100 percent.
3352
3353`LOCAL_CPU_PERCENT_IOWAIT'
3354     This will display the fraction of time waiting for I/O to complete
3355     during the test as measured by netperf. Only supported by
3356     netcpu_procstat. Units: 0 to 100 percent.
3357
3358`LOCAL_CPU_PERCENT_IRQ'
3359     This will display the fraction of time servicing interrupts during
3360     the test as measured by netperf. Only supported by
3361     netcpu_procstat. Units: 0 to 100 percent.
3362
3363`LOCAL_CPU_PERCENT_SWINTR'
3364     This will display the fraction of time servicing softirqs during
3365     the test as measured by netperf. Only supported by
3366     netcpu_procstat. Units: 0 to 100 percent.
3367
3368`LOCAL_CPU_METHOD'
3369     This will display the method used by netperf to measure CPU
3370     utilization. Units: single character denoting method.
3371
3372`LOCAL_SD'
3373     This will display the service demand, or units of CPU consumed per
3374     unit of work, as measured by netperf. Units: microseconds of CPU
3375     consumed per either KB (K==1024) of data transferred or
3376     request/response transaction.
3377
3378`REMOTE_CPU_UTIL'
3379     This will display the overall CPU utilization during the test as
     measured by netserver. Units: 0 to 100 percent.
3381
3382`REMOTE_CPU_PERCENT_USER'
3383     This will display the CPU fraction spent in user mode during the
3384     test as measured by netserver. Only supported by netcpu_procstat.
3385     Units: 0 to 100 percent.
3386
3387`REMOTE_CPU_PERCENT_SYSTEM'
3388     This will display the CPU fraction spent in system mode during the
3389     test as measured by netserver. Only supported by netcpu_procstat.
3390     Units: 0 to 100 percent.
3391
3392`REMOTE_CPU_PERCENT_IOWAIT'
3393     This will display the fraction of time waiting for I/O to complete
3394     during the test as measured by netserver. Only supported by
3395     netcpu_procstat. Units: 0 to 100 percent.
3396
3397`REMOTE_CPU_PERCENT_IRQ'
3398     This will display the fraction of time servicing interrupts during
3399     the test as measured by netserver. Only supported by
3400     netcpu_procstat. Units: 0 to 100 percent.
3401
3402`REMOTE_CPU_PERCENT_SWINTR'
3403     This will display the fraction of time servicing softirqs during
3404     the test as measured by netserver. Only supported by
3405     netcpu_procstat. Units: 0 to 100 percent.
3406
3407`REMOTE_CPU_METHOD'
3408     This will display the method used by netserver to measure CPU
3409     utilization. Units: single character denoting method.
3410
3411`REMOTE_SD'
3412     This will display the service demand, or units of CPU consumed per
3413     unit of work, as measured by netserver. Units: microseconds of CPU
3414     consumed per either KB (K==1024) of data transferred or
3415     request/response transaction.
3416
3417`SD_UNITS'
     This will display the units for LOCAL_SD and REMOTE_SD.
3419
3420`CONFIDENCE_LEVEL'
3421     This will display the confidence level requested by the user either
3422     explicitly via the global `-I' option, or implicitly via the
3423     global `-i' option.  The value will be either 95 or 99 if
3424     confidence intervals have been requested or 0 if they were not.
3425     Units: Percent
3426
3427`CONFIDENCE_INTERVAL'
3428     This will display the width of the confidence interval requested
3429     either explicitly via the global `-I' option or implicitly via the
3430     global `-i' option.  Units: Width in percent of mean value
3431     computed. A value of -1.0 means that confidence intervals were not
3432     requested.
3433
3434`CONFIDENCE_ITERATION'
3435     This will display the number of test iterations netperf undertook,
3436     perhaps while attempting to achieve the requested confidence
3437     interval and level. If confidence intervals were requested via the
3438     command line then the value will be between 3 and 30.  If
3439     confidence intervals were not requested the value will be 1.
3440     Units: Iterations
3441
3442`THROUGHPUT_CONFID'
3443     This will display the width of the confidence interval actually
3444     achieved for `THROUGHPUT' during the test.  Units: Width of
3445     interval as percentage of reported throughput value.
3446
3447`LOCAL_CPU_CONFID'
3448     This will display the width of the confidence interval actually
3449     achieved for overall CPU utilization on the system running netperf
3450     (`LOCAL_CPU_UTIL') during the test, if CPU utilization measurement
3451     was enabled.  Units: Width of interval as percentage of reported
3452     CPU utilization.
3453
3454`REMOTE_CPU_CONFID'
3455     This will display the width of the confidence interval actually
3456     achieved for overall CPU utilization on the system running
3457     netserver (`REMOTE_CPU_UTIL') during the test, if CPU utilization
3458     measurement was enabled. Units: Width of interval as percentage of
3459     reported CPU utilization.
3460
3461`TRANSACTION_RATE'
3462     This will display the transaction rate in transactions per second
3463     for a request/response test even if the user has requested a
3464     throughput in units of bits or bytes per second via the global `-f'
3465     option. It is undefined for a non-request/response test. Units:
3466     Transactions per second.
3467
3468`RT_LATENCY'
3469     This will display the average round-trip latency for a
3470     request/response test, accounting for number of transactions in
3471     flight at one time. It is undefined for a non-request/response
3472     test. Units: Microseconds per transaction
3473
3474`BURST_SIZE'
3475     This will display the "burst size" or added transactions in flight
3476     in a request/response test as requested via a test-specific `-b'
3477     option.  The number of transactions in flight at one time will be
3478     one greater than this value.  It is undefined for a
3479     non-request/response test. Units: added Transactions in flight.
3480
3481`LOCAL_TRANSPORT_RETRANS'
3482     This will display the number of retransmissions experienced on the
3483     data connection during the test as determined by netperf.  A value
3484     of -1 means the attempt to determine the number of retransmissions
3485     failed or the concept was not valid for the given protocol or the
3486     mechanism is not known for the platform. A value of -2 means it
     was not attempted. As of version 2.5.0 the meanings of the values are
3488     in flux and subject to change.  Units: number of retransmissions.
3489
3490`REMOTE_TRANSPORT_RETRANS'
3491     This will display the number of retransmissions experienced on the
3492     data connection during the test as determined by netserver.  A
3493     value of -1 means the attempt to determine the number of
3494     retransmissions failed or the concept was not valid for the given
3495     protocol or the mechanism is not known for the platform. A value
     of -2 means it was not attempted. As of version 2.5.0 the meanings
     of the values are in flux and subject to change.  Units: number of
3498     retransmissions.
3499
3500`TRANSPORT_MSS'
3501     This will display the Maximum Segment Size (aka MSS) or its
3502     equivalent for the protocol being used during the test.  A value
3503     of -1 means either the concept of an MSS did not apply to the
3504     protocol being used, or there was an error in retrieving it.
3505     Units: Bytes.
3506
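     A sketch of how one might display the MSS and the retransmission
     counts alongside throughput, again assuming a remote host named
     "remotehost":

          netperf -H remotehost -t omni -- \
              -O THROUGHPUT,TRANSPORT_MSS,LOCAL_TRANSPORT_RETRANS,REMOTE_TRANSPORT_RETRANS
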
`LOCAL_SEND_THROUGHPUT'
     The throughput as measured by netperf for the successful "send"
     calls it made on the data connection. Units: as requested via the
     global `-f' option and displayed via the `THROUGHPUT_UNITS' output
     selector.

`LOCAL_RECV_THROUGHPUT'
     The throughput as measured by netperf for the successful "receive"
     calls it made on the data connection. Units: as requested via the
     global `-f' option and displayed via the `THROUGHPUT_UNITS' output
     selector.

`REMOTE_SEND_THROUGHPUT'
     The throughput as measured by netserver for the successful "send"
     calls it made on the data connection. Units: as requested via the
     global `-f' option and displayed via the `THROUGHPUT_UNITS' output
     selector.

`REMOTE_RECV_THROUGHPUT'
     The throughput as measured by netserver for the successful
     "receive" calls it made on the data connection. Units: as
     requested via the global `-f' option and displayed via the
     `THROUGHPUT_UNITS' output selector.

`LOCAL_CPU_BIND'
     The CPU to which netperf was bound, if at all, during the test. A
     value of -1 means that netperf was not explicitly bound to a CPU
     during the test. Units: CPU ID

`LOCAL_CPU_COUNT'
     The number of CPUs (cores, threads) detected by netperf. Units:
     CPU count.

`LOCAL_CPU_PEAK_UTIL'
     The utilization of the CPU most heavily utilized during the test,
     as measured by netperf. This can be used to see if any one CPU of a
     multi-CPU system was saturated even though the overall CPU
     utilization as reported by `LOCAL_CPU_UTIL' was low. Units: 0 to
     100%

`LOCAL_CPU_PEAK_ID'
     The id of the CPU most heavily utilized during the test as
     determined by netperf. Units: CPU ID.

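     For example, to check whether a single CPU was saturated while the
     overall utilization remained low, one might request something like
     the following (with "remotehost" again a placeholder):

          netperf -H remotehost -t omni -c -- \
              -O THROUGHPUT,LOCAL_CPU_UTIL,LOCAL_CPU_PEAK_UTIL,LOCAL_CPU_PEAK_ID
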
`LOCAL_CPU_MODEL'
     Model information for the processor(s) present on the system
     running netperf. Assumes all processors in the system (as
     perceived by netperf) on which netperf is running are the same
     model. Units: Text

`LOCAL_CPU_FREQUENCY'
     The frequency of the processor(s) on the system running netperf, at
     the time netperf made the call.  Assumes that all processors
     present in the system running netperf are running at the same
     frequency. Units: MHz

`REMOTE_CPU_BIND'
     The CPU to which netserver was bound, if at all, during the test. A
     value of -1 means that netserver was not explicitly bound to a CPU
     during the test. Units: CPU ID

`REMOTE_CPU_COUNT'
     The number of CPUs (cores, threads) detected by netserver. Units:
     CPU count.

`REMOTE_CPU_PEAK_UTIL'
     The utilization of the CPU most heavily utilized during the test,
     as measured by netserver. This can be used to see if any one CPU
     of a multi-CPU system was saturated even though the overall CPU
     utilization as reported by `REMOTE_CPU_UTIL' was low. Units: 0 to
     100%

`REMOTE_CPU_PEAK_ID'
     The id of the CPU most heavily utilized during the test as
     determined by netserver. Units: CPU ID.

`REMOTE_CPU_MODEL'
     Model information for the processor(s) present on the system
     running netserver. Assumes all processors in the system (as
     perceived by netserver) on which netserver is running are the same
     model. Units: Text

`REMOTE_CPU_FREQUENCY'
     The frequency of the processor(s) on the system running netserver,
     at the time netserver made the call.  Assumes that all processors
     present in the system running netserver are running at the same
     frequency. Units: MHz

`SOURCE_PORT'
     The port ID/service name to which the data socket created by
     netperf was bound.  A value of 0 means the data socket was not
     explicitly bound to a port number. Units: ASCII text.

`SOURCE_ADDR'
     The name/address to which the data socket created by netperf was
     bound. A value of 0.0.0.0 means the data socket was not explicitly
     bound to an address. Units: ASCII text.

`SOURCE_FAMILY'
     The address family to which the data socket created by netperf was
     bound.  A value of 0 means the data socket was not explicitly
     bound to a given address family. Units: ASCII text.

`DEST_PORT'
     The port ID to which the data socket created by netserver was
     bound. A value of 0 means the data socket was not explicitly bound
     to a port number.  Units: ASCII text.

`DEST_ADDR'
     The name/address of the data socket created by netserver.  Units:
     ASCII text.

`DEST_FAMILY'
     The address family to which the data socket created by netserver
     was bound. A value of 0 means the data socket was not explicitly
     bound to a given address family. Units: ASCII text.

`LOCAL_SEND_CALLS'
     The number of successful "send" calls made by netperf against its
     data socket. Units: Calls.

`LOCAL_RECV_CALLS'
     The number of successful "receive" calls made by netperf against
     its data socket. Units: Calls.

`LOCAL_BYTES_PER_RECV'
     The average number of bytes per "receive" call made by netperf
     against its data socket. Units: Bytes.

`LOCAL_BYTES_PER_SEND'
     The average number of bytes per "send" call made by netperf against
     its data socket. Units: Bytes.

`LOCAL_BYTES_SENT'
     The number of bytes successfully sent by netperf through its data
     socket. Units: Bytes.

`LOCAL_BYTES_RECVD'
     The number of bytes successfully received by netperf through its
     data socket. Units: Bytes.

`LOCAL_BYTES_XFERD'
     The sum of bytes sent and received by netperf through its data
     socket. Units: Bytes.

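     The various "calls" and "bytes" counters above lend themselves to
     the machine-readable keyword=value output of the omni tests.  A
     sketch, once more assuming a remote host named "remotehost":

          netperf -H remotehost -t omni -- \
              -k LOCAL_SEND_CALLS,LOCAL_BYTES_SENT,LOCAL_BYTES_PER_SEND,LOCAL_BYTES_XFERD
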
`LOCAL_SEND_OFFSET'
     The offset from the alignment of the buffers passed by netperf in
     its "send" calls. Specified via the global `-o' option and
     defaults to 0. Units: Bytes.

`LOCAL_RECV_OFFSET'
     The offset from the alignment of the buffers passed by netperf in
     its "receive" calls. Specified via the global `-o' option and
     defaults to 0. Units: Bytes.

`LOCAL_SEND_ALIGN'
     The alignment of the buffers passed by netperf in its "send" calls
     as specified via the global `-a' option. Defaults to 8. Units:
     Bytes.

`LOCAL_RECV_ALIGN'
     The alignment of the buffers passed by netperf in its "receive"
     calls as specified via the global `-a' option. Defaults to 8.
     Units: Bytes.

`LOCAL_SEND_WIDTH'
     The "width" of the ring of buffers through which netperf cycles as
     it makes its "send" calls.  Defaults to one more than the local
     send socket buffer size divided by the send size as determined at
     the time the data socket is created. Can be used to make netperf
     more processor data cache unfriendly. Units: number of buffers.

`LOCAL_RECV_WIDTH'
     The "width" of the ring of buffers through which netperf cycles as
     it makes its "receive" calls.  Defaults to one more than the local
     receive socket buffer size divided by the receive size as
     determined at the time the data socket is created. Can be used to
     make netperf more processor data cache unfriendly. Units: number
     of buffers.

`LOCAL_SEND_DIRTY_COUNT'
     The number of bytes to "dirty" (write to) before netperf makes a
     "send" call. Specified via the global `-k' option, which requires
     that --enable-dirty=yes was specified with the configure command
     prior to building netperf. Units: Bytes.

`LOCAL_RECV_DIRTY_COUNT'
     The number of bytes to "dirty" (write to) before netperf makes a
     "recv" call. Specified via the global `-k' option which requires
     that --enable-dirty was specified with the configure command prior
     to building netperf. Units: Bytes.

`LOCAL_RECV_CLEAN_COUNT'
     The number of bytes netperf should read "cleanly" before making a
     "receive" call. Specified via the global `-k' option which
     requires that --enable-dirty was specified with the configure
     command prior to building netperf.  Clean reads start where dirty
     writes ended.  Units: Bytes.

`LOCAL_NODELAY'
     Indicates whether setting the test protocol-specific "no delay"
     option (e.g. TCP_NODELAY) on the data socket used by netperf was
     requested via the test-specific `-D' option and was successful.
     Units: 0 means no, 1 means yes.

`LOCAL_CORK'
     Indicates whether or not TCP_CORK was set on the data socket used
     by netperf as requested via the test-specific `-C' option. 1 means
     yes, 0 means no/not applicable.

`REMOTE_SEND_CALLS'

`REMOTE_RECV_CALLS'

`REMOTE_BYTES_PER_RECV'

`REMOTE_BYTES_PER_SEND'

`REMOTE_BYTES_SENT'

`REMOTE_BYTES_RECVD'

`REMOTE_BYTES_XFERD'

`REMOTE_SEND_OFFSET'

`REMOTE_RECV_OFFSET'

`REMOTE_SEND_ALIGN'

`REMOTE_RECV_ALIGN'

`REMOTE_SEND_WIDTH'

`REMOTE_RECV_WIDTH'

`REMOTE_SEND_DIRTY_COUNT'

`REMOTE_RECV_DIRTY_COUNT'

`REMOTE_RECV_CLEAN_COUNT'

`REMOTE_NODELAY'

`REMOTE_CORK'
     These are all like their "LOCAL_" counterparts, only for the
     netserver rather than netperf.

`LOCAL_SYSNAME'
     The name of the OS (e.g. "Linux") running on the system on which
     netperf was running. Units: ASCII Text

`LOCAL_SYSTEM_MODEL'
     The model name of the system on which netperf was running. Units:
     ASCII Text.

`LOCAL_RELEASE'
     The release name/number of the OS running on the system on which
     netperf was running. Units: ASCII Text

`LOCAL_VERSION'
     The version number of the OS running on the system on which netperf
     was running. Units: ASCII Text

`LOCAL_MACHINE'
     The machine architecture of the machine on which netperf was
     running. Units: ASCII Text.

`REMOTE_SYSNAME'

`REMOTE_SYSTEM_MODEL'

`REMOTE_RELEASE'

`REMOTE_VERSION'

`REMOTE_MACHINE'
     These are all like their "LOCAL_" counterparts, only for the
     netserver rather than netperf.

`LOCAL_INTERFACE_NAME'
     The name of the probable egress interface through which the data
     connection went on the system running netperf. Example: eth0.
     Units: ASCII Text.

`LOCAL_INTERFACE_VENDOR'
     The vendor ID of the probable egress interface through which
     traffic on the data connection went on the system running netperf.
     Units: Hexadecimal IDs as might be found in a `pci.ids' file or at
     the PCI ID Repository (http://pciids.sourceforge.net/).

`LOCAL_INTERFACE_DEVICE'
     The device ID of the probable egress interface through which
     traffic on the data connection went on the system running netperf.
     Units: Hexadecimal IDs as might be found in a `pci.ids' file or at
     the PCI ID Repository (http://pciids.sourceforge.net/).

`LOCAL_INTERFACE_SUBVENDOR'
     The sub-vendor ID of the probable egress interface through which
     traffic on the data connection went on the system running netperf.
     Units: Hexadecimal IDs as might be found in a `pci.ids' file or at
     the PCI ID Repository (http://pciids.sourceforge.net/).

`LOCAL_INTERFACE_SUBDEVICE'
     The sub-device ID of the probable egress interface through which
     traffic on the data connection went on the system running netperf.
     Units: Hexadecimal IDs as might be found in a `pci.ids' file or at
     the PCI ID Repository (http://pciids.sourceforge.net/).

`LOCAL_DRIVER_NAME'
     The name of the driver used for the probable egress interface
     through which traffic on the data connection went on the system
     running netperf. Units: ASCII Text.

`LOCAL_DRIVER_VERSION'
     The version string for the driver used for the probable egress
     interface through which traffic on the data connection went on the
     system running netperf. Units: ASCII Text.

`LOCAL_DRIVER_FIRMWARE'
     The firmware version for the driver used for the probable egress
     interface through which traffic on the data connection went on the
     system running netperf. Units: ASCII Text.

`LOCAL_DRIVER_BUS'
     The bus address of the probable egress interface through which
     traffic on the data connection went on the system running netperf.
     Units: ASCII Text.

`LOCAL_INTERFACE_SLOT'
     The slot ID of the probable egress interface through which traffic
     on the data connection went on the system running netperf. Units:
     ASCII Text.

`REMOTE_INTERFACE_NAME'

`REMOTE_INTERFACE_VENDOR'

`REMOTE_INTERFACE_DEVICE'

`REMOTE_INTERFACE_SUBVENDOR'

`REMOTE_INTERFACE_SUBDEVICE'

`REMOTE_DRIVER_NAME'

`REMOTE_DRIVER_VERSION'

`REMOTE_DRIVER_FIRMWARE'

`REMOTE_DRIVER_BUS'

`REMOTE_INTERFACE_SLOT'
     These are all like their "LOCAL_" counterparts, only for the
     netserver rather than netperf.

`LOCAL_INTERVAL_USECS'
     The interval at which bursts of operations (sends, receives,
     transactions) were attempted by netperf.  Specified by the global
     `-w' option which requires --enable-intervals to have been
     specified with the configure command prior to building netperf.
     Units: Microseconds (though specified by default in milliseconds
     on the command line)

`LOCAL_INTERVAL_BURST'
     The number of operations (sends, receives, transactions depending
     on the test) which were attempted by netperf each
     LOCAL_INTERVAL_USECS units of time. Specified by the global `-b'
     option which requires --enable-intervals to have been specified
     with the configure command prior to building netperf.  Units:
     number of operations per burst.

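     By way of illustration, assuming netperf was built from a
     configure with --enable-intervals, a burst of 16 operations
     attempted every 10 milliseconds might be requested with something
     like:

          netperf -H remotehost -b 16 -w 10 -t omni
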
`REMOTE_INTERVAL_USECS'
     The interval at which bursts of operations (sends, receives,
     transactions) were attempted by netserver.  Specified by the
     global `-w' option which requires --enable-intervals to have been
     specified with the configure command prior to building netperf.
     Units: Microseconds (though specified by default in milliseconds
     on the command line)

`REMOTE_INTERVAL_BURST'
     The number of operations (sends, receives, transactions depending
     on the test) which were attempted by netserver each
     REMOTE_INTERVAL_USECS units of time. Specified by the global `-b'
     option which requires --enable-intervals to have been specified
     with the configure command prior to building netperf.  Units:
     number of operations per burst.

`LOCAL_SECURITY_TYPE_ID'

`LOCAL_SECURITY_TYPE'

`LOCAL_SECURITY_ENABLED_NUM'

`LOCAL_SECURITY_ENABLED'

`LOCAL_SECURITY_SPECIFIC'

`REMOTE_SECURITY_TYPE_ID'

`REMOTE_SECURITY_TYPE'

`REMOTE_SECURITY_ENABLED_NUM'

`REMOTE_SECURITY_ENABLED'

`REMOTE_SECURITY_SPECIFIC'
     These report the security mechanisms (e.g. SELinux), if any, which
     were enabled on the local and remote systems during the test.

`RESULT_BRAND'
     The string specified by the user with the global `-B' option.
     Units: ASCII Text.

`UUID'
     The universally unique identifier associated with this test, either
     generated automagically by netperf, or passed to netperf via an
     omni test-specific `-u' option. Note: Future versions may make this
     a global command-line option. Units: ASCII Text.

`MIN_LATENCY'
     The minimum "latency" or operation time (send, receive or
     request/response exchange depending on the test) as measured on the
     netperf side when the global `-j' option was specified. Units:
     Microseconds.

`MAX_LATENCY'
     The maximum "latency" or operation time (send, receive or
     request/response exchange depending on the test) as measured on the
     netperf side when the global `-j' option was specified. Units:
     Microseconds.

`P50_LATENCY'
     The 50th percentile value of "latency" or operation time (send,
     receive or request/response exchange depending on the test) as
     measured on the netperf side when the global `-j' option was
     specified. Units: Microseconds.

`P90_LATENCY'
     The 90th percentile value of "latency" or operation time (send,
     receive or request/response exchange depending on the test) as
     measured on the netperf side when the global `-j' option was
     specified. Units: Microseconds.

`P99_LATENCY'
     The 99th percentile value of "latency" or operation time (send,
     receive or request/response exchange depending on the test) as
     measured on the netperf side when the global `-j' option was
     specified. Units: Microseconds.

`MEAN_LATENCY'
     The average "latency" or operation time (send, receive or
     request/response exchange depending on the test) as measured on the
     netperf side when the global `-j' option was specified. Units:
     Microseconds.

`STDDEV_LATENCY'
     The standard deviation of "latency" or operation time (send,
     receive or request/response exchange depending on the test) as
     measured on the netperf side when the global `-j' option was
     specified. Units: Microseconds.

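     A sketch showing how the latency statistics above might be
     requested together for a request/response-style omni test,
     assuming a remote host named "remotehost":

          netperf -H remotehost -t omni -j -- -d rr \
              -O MIN_LATENCY,MEAN_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MAX_LATENCY,STDDEV_LATENCY
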
`COMMAND_LINE'
     The full command line used when invoking netperf. Units: ASCII
     Text.

`OUTPUT_END'
     While emitted with the list of output selectors, it is ignored when
     specified as an output selector.


File: netperf.info,  Node: Other Netperf Tests,  Next: Address Resolution,  Prev: The Omni Tests,  Up: Top

10 Other Netperf Tests
**********************

Apart from the typical performance tests, netperf contains some tests
which can be used to streamline measurements and reporting.  These
include CPU rate calibration (present) and host identification (future
enhancement).

* Menu:

* CPU rate calibration::
* UUID Generation::


File: netperf.info,  Node: CPU rate calibration,  Next: UUID Generation,  Prev: Other Netperf Tests,  Up: Other Netperf Tests

10.1 CPU rate calibration
=========================

Some of the CPU utilization measurement mechanisms of netperf work by
comparing the rate at which some counter increments when the system is
idle with the rate at which that same counter increments when the
system is running a netperf test.  The ratio of those rates is used to
arrive at a CPU utilization percentage.

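   Conceptually, if the counter increments at some rate R(idle) on an
otherwise idle system, and at rate R(test) while the test is running,
the utilization netperf arrives at is, in essence:

     CPU% = 100 * (1 - R(test) / R(idle))

with the measured calibration value standing in for R(idle).
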
   This means that netperf must know the rate at which the counter
increments when the system is presumed to be "idle."  If it does not
know the rate, netperf will measure it before starting a data transfer
test.  This calibration step takes 40 seconds for each of the local or
remote systems, and if repeated for each netperf test would make taking
repeated measurements rather slow.

   Thus, the netperf CPU utilization options `-c' and `-C' can take
an optional calibration value.  This value is used as the "idle rate"
and the calibration step is not performed. To determine the idle rate,
netperf can be used to run special tests which only report the value of
the calibration - they are the LOC_CPU and REM_CPU tests.  These return
the calibration value for the local and remote system respectively.  A
common way to use these tests is to store their results into an
environment variable and use that in subsequent netperf commands:

     LOC_RATE=`netperf -t LOC_CPU`
     REM_RATE=`netperf -H <remote> -t REM_CPU`
     netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
     ...
     netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...

   If you are going to use netperf to measure aggregate results, it is
important to use the LOC_CPU and REM_CPU tests to get the calibration
values first, to avoid issues with some of the aggregate netperf tests
transferring data while others are "idle" and getting bogus calibration
values.  When running aggregate tests, it is very important to remember
that any one instance of netperf does not know about the other
instances of netperf.  It will report global CPU utilization and will
calculate service demand believing it was the only thing causing that
CPU utilization.  So, you can use the CPU utilization reported by
netperf in an aggregate test, but you have to calculate service demands
by hand.


File: netperf.info,  Node: UUID Generation,  Prev: CPU rate calibration,  Up: Other Netperf Tests

10.2 UUID Generation
====================

Beginning with version 2.5.0 netperf can generate Universally Unique
IDentifiers (UUIDs).  This can be done explicitly via the "UUID" test:
     $ netperf -t UUID
     2c8561ae-9ebd-11e0-a297-0f5bfa0349d0

   In and of itself, this is not terribly useful, but used in
conjunction with the test-specific `-u' option of an "omni" test to set
the UUID emitted by the *note UUID: Omni Output Selectors. output
selector, it can be used to tie together the separate instances of an
aggregate netperf test, say, for instance, if they were inserted into a
database of some sort.

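   A sketch of that usage, with "remotehost" as a placeholder host and
the remaining test parameters elided:

     UUID=`netperf -t UUID`
     netperf -H remotehost -t omni -B "instance1" -- -u $UUID ... &
     netperf -H remotehost -t omni -B "instance2" -- -u $UUID ... &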

File: netperf.info,  Node: Address Resolution,  Next: Enhancing Netperf,  Prev: Other Netperf Tests,  Up: Top

11 Address Resolution
*********************

Netperf versions 2.4.0 and later have merged IPv4 and IPv6 tests, so the
functionality of the tests in `src/nettest_ipv6.c' has been subsumed
into the tests in `src/nettest_bsd.c'.  This has been accomplished in
part by switching from `gethostbyname()' to `getaddrinfo()' exclusively.
While it was theoretically possible to get multiple results for a
hostname from `gethostbyname()' it was generally unlikely, and netperf's
ignoring of the second and later results was not much of an issue.

   Now with `getaddrinfo' and particularly with AF_UNSPEC it is
increasingly likely that a given hostname will have multiple associated
addresses.  The `establish_control()' routine of `src/netlib.c' will
indeed attempt to choose from among all the matching IP addresses when
establishing the control connection.  Netperf does not _really_ care if
the control connection is IPv4 or IPv6 or even mixed on either end.

   However, the individual tests still ass-u-me that the first result in
the address list is the one to be used.  Whether or not this will
turn out to be an issue has yet to be determined.

   If you do run into problems with this, the easiest workaround is to
specify IP addresses for the data connection explicitly in the
test-specific `-H' and `-L' options.  At some point, the netperf tests
_may_ try to be more sophisticated in their parsing of returns from
`getaddrinfo()' - straw-man patches to <netperf-feedback@netperf.org>
would of course be most welcome :)

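   For example, to pin the data connection of a TCP_RR test to specific
IPv4 addresses (the addresses here are, of course, placeholders):

     netperf -H remotehost -t TCP_RR -- -H 192.168.3.2 -L 192.168.3.1
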
   Netperf has leveraged code from other open-source projects with
amenable licensing to provide a replacement `getaddrinfo()' call on
those platforms where the `configure' script believes there is no
native getaddrinfo call.  As of this writing, the replacement
`getaddrinfo()' has been tested on HP-UX 11.0 and then presumed to run
elsewhere.


File: netperf.info,  Node: Enhancing Netperf,  Next: Netperf4,  Prev: Address Resolution,  Up: Top

12 Enhancing Netperf
********************

Netperf is constantly evolving.  If you find you want to make
enhancements to netperf, by all means do so.  If you wish to add a new
"suite" of tests to netperf the general idea is to:

  1. Add files `src/nettest_mumble.c' and `src/nettest_mumble.h' where
     mumble is replaced with something meaningful for the test-suite.

  2. Add support for an appropriate `--enable-mumble' option in
     `configure.ac'.

  3. Edit `src/netperf.c', `netsh.c', and `netserver.c' as required,
     using #ifdef WANT_MUMBLE.

  4. Compile and test (a sketch of the flow follows this list).

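   A minimal sketch of that flow, with "mumble" standing in for the
real suite name and MUMBLE_STREAM for a hypothetical test it would add:

     ./configure --enable-mumble
     make
     src/netperf -H remotehost -t MUMBLE_STREAM
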
   However, with the addition of the "omni" tests in version 2.5.0 it
is preferred that one attempt to make the necessary changes to
`src/nettest_omni.c' rather than adding new source files, unless this
would make the omni tests entirely too complicated.

   If you wish to submit your changes for possible inclusion into the
mainline sources, please try to base your changes on the latest
available sources (*note Getting Netperf Bits::), and then send email
describing the changes at a high level to
<netperf-feedback@netperf.org> or perhaps <netperf-talk@netperf.org>.
If the consensus is positive, then sending context `diff' results to
<netperf-feedback@netperf.org> is the next step.  From that point, it
is a matter of pestering the Netperf Contributing Editor until he gets
the changes incorporated :)


File: netperf.info,  Node: Netperf4,  Next: Concept Index,  Prev: Enhancing Netperf,  Up: Top

13 Netperf4
***********

Netperf4 is the shorthand name given to version 4.X.X of netperf.  This
is really a separate benchmark more than a newer version of netperf,
but it is a descendant of netperf so the netperf name is kept.  The
facetious way to describe netperf4 is to say it is the
egg-laying-woolly-milk-pig version of netperf :)  The more respectful
way to describe it is to say it is the version of netperf with support
for synchronized, multiple-thread, multiple-test, multiple-system,
network-oriented benchmarking.

   Netperf4 is still undergoing evolution. Those wishing to work with or
on netperf4 are encouraged to join the netperf-dev
(http://www.netperf.org/cgi-bin/mailman/listinfo/netperf-dev) mailing
list and/or peruse the current sources
(http://www.netperf.org/svn/netperf4/trunk).


File: netperf.info,  Node: Concept Index,  Next: Option Index,  Prev: Netperf4,  Up: Top

Concept Index
*************

* Menu:

* Aggregate Performance:                 Using Netperf to Measure Aggregate Performance.
                                                               (line  6)
* Bandwidth Limitation:                  Installing Netperf Bits.
                                                               (line 64)
* Connection Latency:                    TCP_CC.               (line  6)
* CPU Utilization:                       CPU Utilization.      (line  6)
* Design of Netperf:                     The Design of Netperf.
                                                               (line  6)
* Installation:                          Installing Netperf.   (line  6)
* Introduction:                          Introduction.         (line  6)
* Latency, Connection Establishment <1>: XTI_TCP_CRR.          (line  6)
* Latency, Connection Establishment <2>: XTI_TCP_CC.           (line  6)
* Latency, Connection Establishment <3>: TCP_CRR.              (line  6)
* Latency, Connection Establishment:     TCP_CC.               (line  6)
* Latency, Request-Response <1>:         SCTP_RR.              (line  6)
* Latency, Request-Response <2>:         DLCO_RR.              (line  6)
* Latency, Request-Response <3>:         DLCL_RR.              (line  6)
* Latency, Request-Response <4>:         XTI_UDP_RR.           (line  6)
* Latency, Request-Response <5>:         XTI_TCP_CRR.          (line  6)
* Latency, Request-Response <6>:         XTI_TCP_RR.           (line  6)
* Latency, Request-Response <7>:         UDP_RR.               (line  6)
* Latency, Request-Response <8>:         TCP_CRR.              (line  6)
* Latency, Request-Response:             TCP_RR.               (line  6)
* Limiting Bandwidth <1>:                UDP_STREAM.           (line  9)
* Limiting Bandwidth:                    Installing Netperf Bits.
                                                               (line 64)
* Measuring Latency:                     TCP_RR.               (line  6)
* Packet Loss:                           UDP_RR.               (line  6)
* Port Reuse:                            TCP_CC.               (line 13)
* TIME_WAIT:                             TCP_CC.               (line 13)

File: netperf.info,  Node: Option Index,  Prev: Concept Index,  Up: Top

Option Index
************

* Menu:

* --enable-burst, Configure:             Using Netperf to Measure Aggregate Performance.
                                                              (line   6)
* --enable-cpuutil, Configure:           Installing Netperf Bits.
                                                              (line  24)
* --enable-dlpi, Configure:              Installing Netperf Bits.
                                                              (line  30)
* --enable-histogram, Configure:         Installing Netperf Bits.
                                                              (line  64)
* --enable-intervals, Configure:         Installing Netperf Bits.
                                                              (line  64)
* --enable-omni, Configure:              Installing Netperf Bits.
                                                              (line  36)
* --enable-sctp, Configure:              Installing Netperf Bits.
                                                              (line  30)
* --enable-unixdomain, Configure:        Installing Netperf Bits.
                                                              (line  30)
* --enable-xti, Configure:               Installing Netperf Bits.
                                                              (line  30)
* -4, Global:                            Global Options.      (line 489)
* -4, Test-specific <1>:                 Options Common to TCP UDP and SCTP _RR tests.
                                                              (line  88)
* -4, Test-specific:                     Options common to TCP UDP and SCTP tests.
                                                              (line 110)
* -6 Test-specific:                      Options Common to TCP UDP and SCTP _RR tests.
                                                              (line  94)
* -6, Global:                            Global Options.      (line 498)
* -6, Test-specific:                     Options common to TCP UDP and SCTP tests.
                                                              (line 116)
* -A, Global:                            Global Options.      (line  18)
* -a, Global:                            Global Options.      (line   6)
* -B, Global:                            Global Options.      (line  29)
* -b, Global:                            Global Options.      (line  22)
* -C, Global:                            Global Options.      (line  42)
* -c, Global:                            Global Options.      (line  33)
* -c, Test-specific:                     Native Omni Tests.   (line  13)
* -D, Global:                            Global Options.      (line  56)
* -d, Global:                            Global Options.      (line  47)
* -d, Test-specific:                     Native Omni Tests.   (line  17)
* -F, Global:                            Global Options.      (line  76)
* -f, Global:                            Global Options.      (line  67)
* -H, Global:                            Global Options.      (line  95)
* -h, Global:                            Global Options.      (line  91)
* -H, Test-specific:                     Options Common to TCP UDP and SCTP _RR tests.
                                                              (line  17)
* -h, Test-specific <1>:                 Options Common to TCP UDP and SCTP _RR tests.
                                                              (line  10)
* -h, Test-specific:                     Options common to TCP UDP and SCTP tests.
                                                              (line  10)
* -i, Global:                            Global Options.      (line 179)
* -I, Global:                            Global Options.      (line 130)
* -j, Global:                            Global Options.      (line 205)
* -k, Test-specific:                     Native Omni Tests.   (line  37)
* -L, Global:                            Global Options.      (line 263)
* -l, Global:                            Global Options.      (line 242)
* -L, Test-specific <1>:                 Options Common to TCP UDP and SCTP _RR tests.
                                                              (line  26)
* -L, Test-specific:                     Options common to TCP UDP and SCTP tests.
                                                              (line  25)
* -M, Test-specific:                     Options common to TCP UDP and SCTP tests.
                                                              (line  48)
* -m, Test-specific:                     Options common to TCP UDP and SCTP tests.
                                                              (line  32)
* -N, Global:                            Global Options.      (line 293)
* -n, Global:                            Global Options.      (line 275)
* -O, Global:                            Global Options.      (line 338)
* -o, Global:                            Global Options.      (line 329)
* -O, Test-specific:                     Native Omni Tests.   (line  62)
* -o, Test-specific:                     Native Omni Tests.   (line  50)
* -P, Global:                            Global Options.      (line 363)
* -p, Global:                            Global Options.      (line 343)
* -P, Test-specific <1>:                 Options Common to TCP UDP and SCTP _RR tests.
                                                              (line  33)
* -P, Test-specific:                     Options common to TCP UDP and SCTP tests.
                                                              (line  61)
* -r, Test-specific:                     Options Common to TCP UDP and SCTP _RR tests.
                                                              (line  36)
* -S Test-specific:                      Options common to TCP UDP and SCTP tests.
                                                              (line  87)
* -S, Global:                            Global Options.      (line 381)
* -s, Global:                            Global Options.      (line 372)
* -S, Test-specific:                     Options Common to TCP UDP and SCTP _RR tests.
                                                              (line  68)
* -s, Test-specific <1>:                 Options Common to TCP UDP and SCTP _RR tests.
                                                              (line  48)
* -s, Test-specific:                     Options common to TCP UDP and SCTP tests.
                                                              (line  64)
* -T, Global:                            Global Options.      (line 423)
* -t, Global:                            Global Options.      (line 391)
* -T, Test-specific:                     Native Omni Tests.   (line  81)
* -t, Test-specific:                     Native Omni Tests.   (line  76)
* -V, Global:                            Global Options.      (line 468)
* -v, Global:                            Global Options.      (line 440)
* -W, Global:                            Global Options.      (line 480)
* -w, Global:                            Global Options.      (line 473)