.\" Copyright (c) 2014, Jan Chaloupka <jchaloup@redhat.com>
.\"
.\" %%%LICENSE_START(GPLv2+_DOC_FULL)
.\" This is free documentation; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public License as
.\" published by the Free Software Foundation; either version 2 of
.\" the License, or (at your option) any later version.
.\"
.\" The GNU General Public License's references to "object code"
.\" and "executables" are to be interpreted as the output of any
.\" document formatting or typesetting system, including
.\" intermediate and printed output.
.\"
.\" This manual is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public
.\" License along with this manual; if not, see
.\" <http://www.gnu.org/licenses/>.
.\" %%%LICENSE_END
.TH "IB_ATOMIC_BW" 1 2014 "Open Fabrics Enterprise Distribution"
.\" IB_ATOMIC_BW
.SH NAME
ib_atomic_bw, ib_atomic_lat, ib_read_bw, ib_read_lat, ib_send_bw,
ib_send_lat, ib_write_bw, ib_write_lat
\- Collection of tests written over uverbs intended for use as a
performance micro-benchmark
.SH SYNOPSIS
.sp
.B ib_atomic_bw [<host>] [options]
.sp
.B ib_atomic_lat [<host>] [options]
.sp
.B ib_read_bw [<host>] [options]
.sp
.B ib_read_lat [<host>] [options]
.sp
.B ib_send_bw [<host>] [options]
.sp
.B ib_send_lat [<host>] [options]
.sp
.B ib_write_bw [<host>] [options]
.sp
.B ib_write_lat [<host>] [options]
.SH DESCRIPTION
This is a collection of tests written over uverbs, intended for use as a
performance micro-benchmark. For example, the tests can be used for
HW or SW tuning and/or functional testing.
.PP
The collection contains a set of bandwidth (BW) and latency benchmarks:
.sp
* Read   - ib_read_bw and ib_read_lat.
.sp
* Write  - ib_write_bw and ib_write_lat.
.sp
* Send   - ib_send_bw and ib_send_lat.
.sp
* Atomic - ib_atomic_bw and ib_atomic_lat.
.sp
* Raw Ethernet (when working with MOFED2) - raw_ethernet_bw and raw_ethernet_lat.
.PP
The benchmarks use the CPU cycle counter to get time stamps without a context
switch.  Some CPU architectures (e.g., Intel's 80486 or older PPC) do NOT
have such capability.
.PP
The latency benchmarks measure round-trip time but report half of that as
one-way latency.
This means that the result may not be sufficiently accurate for asymmetrical
configurations.
.PP
In BW benchmarks, the BW is calculated on the send side only, because the
sender calculates the BW after collecting completions from the receive side.
When the bidirectional flag is used, BW is calculated on both sides.
In ib_send_bw, the server side also calculates the received throughput.
.PP
Min/Median/Max results are reported in latency tests.
The median (vs. average) is less sensitive to extreme scores.
Typically, the "Max" value is the first value measured.
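.PP
As a rough illustration of the two points above (halving the round trip and
preferring the median), the following Python sketch summarizes a set of
round-trip samples. It is not the code used by the tests; the function name
one_way_summary and the sample values are made up for the example.
.PP
.nf
```python
import statistics


def one_way_summary(round_trip_usec):
    """Return (min, median, max) of the one-way latency estimates."""
    # One-way latency is estimated as half of each round-trip time.
    one_way = [rtt / 2.0 for rtt in round_trip_usec]
    return min(one_way), statistics.median(one_way), max(one_way)


# Hypothetical round-trip samples in usec; the first one is a slow outlier.
samples = [9.0, 4.0, 4.2, 4.4, 4.0]
lowest, median, highest = one_way_summary(samples)
average = statistics.mean(rtt / 2.0 for rtt in samples)
```
.fi
.PP
With these made-up samples, the single slow first measurement pulls the
average above the median, while the median stays close to the typical value.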
.PP
Larger samples help only marginally. The default (1000) is pretty good.
Note that an array of cycles_t (typically unsigned long) is allocated
once to collect samples and again to store the difference between them.
Really big sample sizes (e.g., 1 million) might expose other problems
with the program. In this case you can use the -N flag (No Peak) to instruct
the test to sample only 2 times (beginning and end).
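.PP
To get a feel for the memory cost of large sample counts, here is a small
Python sketch. It assumes exactly two arrays with one element per iteration,
as described above, and it uses ctypes only to look up the size of
unsigned long on the local machine. The function name sample_buffers_bytes
is hypothetical.
.PP
.nf
```python
import ctypes


def sample_buffers_bytes(iters, elem_size=None):
    """Estimate the bytes used by the sample array and the delta array."""
    if elem_size is None:
        # cycles_t is typically unsigned long; its size is platform dependent.
        elem_size = ctypes.sizeof(ctypes.c_ulong)
    # One array holds the raw samples, the second holds their differences.
    return 2 * iters * elem_size


default_bytes = sample_buffers_bytes(1000, elem_size=8)
million_bytes = sample_buffers_bytes(1_000_000, elem_size=8)
```
.fi
.PP
With 8-byte elements this comes to 16000 bytes for the default of 1000
samples and 16000000 bytes (about 16 MB) for 1 million samples.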
.PP
All throughput tests also have a duration feature (-D <seconds to run>),
which instructs the test to run for <seconds to run>.
Another feature is --run_infinitely, which instructs the test to run
all the time and print throughput every 5 seconds.
.PP
The "-H" option (latency) will dump the histogram for additional statistical
analysis.
See xgraph, ygraph, r-base (http://www.r-project.org/), pspp, or other
statistical math programs.
.PP
Architectures tested: i686, x86_64, ia64
.SH OPTIONS
The SAME OPTIONS must be passed to both server and client.
.PP
If
.I <host>
is not given, the command starts a server and waits for a connection.
If it is given, the command connects to the server at
.IR <host> .
.sp
.B Common Options:
.RS 4
.TP
\fB\-h\fR, \fB\-\-help\fR
Display this help message.
.TP
\fB\-p\fR, \fB\-\-port\fR=\fI<port>\fR
Listen on/connect to port <port> (default: 18515) when exchanging data.
.TP
\fB\-R\fR, \fB\-\-rdma_cm\fR
Connect QPs with rdma_cm and run test on those QPs.
.TP
\fB\-z\fR, \fB\-\-com_rdma_cm\fR
Communicate with rdma_cm module to exchange data \- use regular QPs.
.TP
\fB\-m\fR, \fB\-\-mtu\fR=\fI<mtu>\fR
QP MTU size (default: active_mtu from ibv_devinfo).
.TP
\fB\-c\fR, \fB\-\-connection\fR=\fI<RC/UC/UD>\fR
Connection type RC/UC/UD (default: RC).
.TP
\fB\-d\fR, \fB\-\-ib\-dev\fR=\fI<dev>\fR
Use IB device <dev> (default: first device found).
.TP
\fB\-i\fR, \fB\-\-ib\-port\fR=\fI<port>\fR
Use port <port> of IB device (default: 1).
.TP
\fB\-s\fR, \fB\-\-size\fR=\fI<size>\fR
Size of message to exchange (default: 1).
.TP
\fB\-a\fR, \fB\-\-all\fR
Run sizes from 2 to 2^23.
.TP
\fB\-n\fR, \fB\-\-iters\fR=\fI<iters>\fR
Number of exchanges (at least 100, default: 1000).
.TP
\fB\-x\fR, \fB\-\-gid\-index\fR=\fI<index>\fR
Test uses the GID whose index is taken from the command line.
.TP
\fB\-V\fR, \fB\-\-version\fR
Display version number.
.TP
\fB\-e\fR, \fB\-\-events\fR
Sleep on CQ events (default: poll).
.TP
\fB\-F\fR, \fB\-\-CPU\-freq\fR
Do not fail even if the cpufreq_ondemand module is loaded.
.TP
\fB\-I\fR, \fB\-\-inline_size\fR=\fI<size>\fR
Max size of message to be sent in inline mode.
.TP
\fB\-u\fR, \fB\-\-qp\-timeout\fR=\fI<timeout>\fR
QP timeout; the timeout value is 4 usec * 2^<timeout> (default: 14).
.TP
\fB\-S\fR, \fB\-\-sl\fR=\fI<sl>\fR
SL \- Service Level (default: 0).
.TP
\fB\-r\fR, \fB\-\-rx\-depth\fR=\fI<dep>\fR
Make rx queue bigger than tx (default: 600).
.RE
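.PP
The arithmetic behind two of the options above can be sketched in Python.
The sketch takes the 4 usec base of the QP timeout from the \-u description,
and it assumes that \-a steps through powers of two, which the text does not
state explicitly. Both function names are made up for the example.
.PP
.nf
```python
def qp_timeout_usec(timeout, base_usec=4):
    """QP timeout in usec for a given exponent: base_usec * 2^timeout."""
    return base_usec * 2 ** timeout


def all_sizes(first_exp=1, last_exp=23):
    """Message sizes from 2 to 2^23, assuming a power-of-two sweep."""
    return [2 ** exp for exp in range(first_exp, last_exp + 1)]


default_timeout = qp_timeout_usec(14)  # 4 * 2^14 usec
sizes = all_sizes()
```
.fi
.PP
For the default timeout of 14 this gives 65536 usec, roughly 65.5 ms.
Under the power-of-two assumption the sweep has 23 sizes, ending at 8388608.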
.sp
.B Latency test options:
.RS 4
.TP
\fB\-C\fR, \fB\-\-report\-cycles\fR
Report times in CPU cycle units.
.TP
\fB\-H\fR, \fB\-\-report\-histogram\fR
Print out all results (default: summary only).
.TP
\fB\-U\fR, \fB\-\-report\-unsorted\fR
Print out unsorted results (default: sorted).
.RE
.sp
.B BW test options:
.RS 4
.TP
\fB\-b\fR, \fB\-\-bidirectional\fR
Measure bidirectional bandwidth (default: unidirectional).
.TP
\fB\-N\fR, \fB\-\-no peak\-bw\fR
Cancel peak\-bw calculation (default: with peak\-bw).
.TP
\fB\-Q\fR, \fB\-\-cq\-mod\fR
Generate a CQE only after <cq\-mod> completions.
.TP
\fB\-t\fR, \fB\-\-tx\-depth\fR=\fI<dep>\fR
Size of tx queue (default: 128).
.TP
\fB\-O\fR, \fB\-\-dualport\fR
Run test in dual\-port mode (2 QPs). Both ports must be active (default: OFF).
.TP
\fB\-D\fR, \fB\-\-duration\fR=\fI<sec>\fR
Run the test for <sec> seconds.
.TP
\fB\-f\fR, \fB\-\-margin\fR=\fI<sec>\fR
In duration mode, measure results within margins (default: 2).
.TP
\fB\-l\fR, \fB\-\-post_list\fR=\fI<list_size>\fR
Post a list of WQEs of size <list_size> (instead of a single post).
.TP
\fB\-q\fR, \fB\-\-qp\fR=\fI<num_of_qps>\fR
Number of QPs running in the process (default: 1).
.TP
\fB\-\-run_infinitely\fR
Run the test forever and print results every 5 seconds.
.RE
.sp
.B SEND test options:
.RS 4
.TP
\fB\-r\fR, \fB\-\-rx\-depth\fR=\fI<dep>\fR
Size of RX queue (default: 512 in BW test).
.TP
\fB\-g\fR, \fB\-\-mcg\fR=\fI<num_of_qps>\fR
Send messages to a multicast group with <num_of_qps> QPs attached to it.
.TP
\fB\-M\fR, \fB\-\-MGID\fR=\fI<multicast_gid>\fR
In multicast, use <multicast_gid> as the group MGID.
.RE
.sp
.B ATOMIC test options:
.RS 4
.TP
\fB\-A\fR, \fB\-\-atomic_type\fR=\fI<type>\fR
Type of atomic operation, one of {CMP_AND_SWAP, FETCH_AND_ADD}.
.TP
\fB\-o\fR, \fB\-\-outs\fR=\fI<num>\fR
Number of outstanding read/atomic requests \- also on READ tests.
.RE
.sp
.B Raw Ethernet BW test options:
.RS 4
.TP
\fB\-B\fR, \fB\-\-source_mac\fR
Source MAC address in the format XX:XX:XX:XX:XX:XX (default: take the MAC address from the GID).
.TP
\fB\-E\fR, \fB\-\-dest_mac\fR
Destination MAC address in the format XX:XX:XX:XX:XX:XX; **MUST** be entered.
.TP
\fB\-J\fR, \fB\-\-server_ip\fR
Server IP address in the format X.X.X.X (used to send packets with an IP header).
.TP
\fB\-j\fR, \fB\-\-client_ip\fR
Client IP address in the format X.X.X.X (used to send packets with an IP header).
.TP
\fB\-K\fR, \fB\-\-server_port\fR
Server UDP port number (used to send packets with a UDP header).
.TP
\fB\-k\fR, \fB\-\-client_port\fR
Client UDP port number (used to send packets with a UDP header).
.TP
\fB\-Z\fR, \fB\-\-server\fR
Choose server side for the current machine (\-\-server/\-\-client must be selected).
.TP
\fB\-P\fR, \fB\-\-client\fR
Choose client side for the current machine (\-\-server/\-\-client must be selected).
.RE
.SH ENVIRONMENT
.B Prerequisites:
.RS
kernel 2.6
.RE
.RS
(kernel module) matches libibverbs
.RE
.RS
(kernel module) matches librdmacm
.RE
.RS
(kernel module) matches libibumad
.RE
.RS
(kernel module) matches libmath (lm).
.RE
.SH NOTES
If you are in an IB fabric, a Subnet Manager must be running on the switch
or on one of the nodes in the fabric.
.SH BUGS
1. The multicast feature in ib_send_lat and in ib_send_bw still has many problems!
Support will be extended and bugs fixed this quarter, but for now the tests may get stuck
and could produce undefined behaviour.
.sp
2. The bidirectional feature in the ib_send_bw test, when running in UD or UC mode.
The algorithm we use for the bidirectional measurement is designed for the RC connection type.
When running in UC or UD connection types, there is a small probability the test will get stuck.
.sp
3. The RDMA_CM feature in read tests still doesn't work.
.sp
4. Dual-port support currently works only with ib_write_bw.
.sp
5. Compatibility issues may occur between different versions of perftest.
Please make sure you work with the same version on both sides to ensure
consistency of the test.
.SH AUTHORS
Please post results/observations to the openib-general mailing list.
See "Contact Us" at http://openib.org/mailman/listinfo/openib-general and
http://www.openib.org.