Tree - rpms/perftest - CentOS Git server

rpms / perftest

Blame SOURCES/ib_atomic_bw.1


		5172a4	`.\" Copyright (c) 2014, Jan Chaloupka <jchaloup@redhat.com>`
		5172a4	`.\"`
		5172a4	`.\" %%%LICENSE_START(GPLv2+_DOC_FULL)`
		5172a4	`.\" This is free documentation; you can redistribute it and/or`
		5172a4	`.\" modify it under the terms of the GNU General Public License as`
		5172a4	`.\" published by the Free Software Foundation; either version 2 of`
		5172a4	`.\" the License, or (at your option) any later version.`
		5172a4	`.\"`
		5172a4	`.\" The GNU General Public License's references to "object code"`
		5172a4	`.\" and "executables" are to be interpreted as the output of any`
		5172a4	`.\" document formatting or typesetting system, including`
		5172a4	`.\" intermediate and printed output.`
		5172a4	`.\"`
		5172a4	`.\" This manual is distributed in the hope that it will be useful,`
		5172a4	`.\" but WITHOUT ANY WARRANTY; without even the implied warranty of`
		5172a4	`.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the`
		5172a4	`.\" GNU General Public License for more details.`
		5172a4	`.\"`
		5172a4	`.\" You should have received a copy of the GNU General Public`
		5172a4	`.\" License along with this manual; if not, see`
		5172a4	`.\" <http://www.gnu.org/licenses/>.`
		5172a4	`.\" %%%LICENSE_END`
		5172a4	`.TH "IB_ATOMIC_BW" 1 2014 "Open Fabrics Enterprise Distribution"`
		5172a4	`.\" IB_ATOMIC_BW`
		5172a4	`.SH NAME`
		5172a4	`ib_atomic_bw, ib_atomic_lat, ib_read_bw, ib_read_lat, ib_send_bw,`
		5172a4	`ib_send_lat, ib_write_bw, ib_write_lat`
		5172a4	`\- Collection of tests written over uverbs intended for use as a`
		5172a4	`performance micro-benchmark`
		5172a4	`.SH SYNOPSIS`
		5172a4	`.sp`
		5172a4	`.B ib_atomic_bw [<host>] [options]`
		5172a4	`.sp`
		5172a4	`.B ib_atomic_lat [<host>] [options]`
		5172a4	`.sp`
		5172a4	`.B ib_read_bw [<host>] [options]`
		5172a4	`.sp`
		5172a4	`.B ib_read_lat [<host>] [options]`
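		5172a4	`.sp`
		5172a4	`.B ib_send_bw [<host>] [options]`
		5172a4	`.sp`
		5172a4	`.B ib_send_lat [<host>] [options]`
		5172a4	`.sp`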
		5172a4	`.B ib_write_bw [<host>] [options]`
		5172a4	`.sp`
		5172a4	`.B ib_write_lat [<host>] [options]`
		5172a4	`.SH DESCRIPTION`
		5172a4	`This is a collection of tests written over uverbs intended for use as a`
		5172a4	`performance micro-benchmark. As an example, the tests can be used for`
		5172a4	`HW or SW tuning and/or functional testing.`
		5172a4
		5172a4	`The collection contains a set of BW and latency benchmarks, such as:`
		5172a4	`.sp`
		5172a4	`* Read - ib_read_bw and ib_read_lat.`
		5172a4	`.sp`
		5172a4	`* Write - ib_write_bw and ib_write_lat.`
		5172a4	`.sp`
		5172a4	`* Send - ib_send_bw and ib_send_lat.`
		5172a4	`.sp`
		5172a4	`* Atomic - ib_atomic_bw and ib_atomic_lat`
		5172a4	`.sp`
		5172a4	`* Raw Ethernet (when working with MOFED2) - raw_ethernet_bw, raw_ethernet_lat`
		5172a4
		5172a4	`The benchmarks use the CPU cycle counter to get time stamps without a context`
		5172a4	`switch. Some CPU architectures (e.g., Intel's 80486 or older PPC) do NOT`
		5172a4	`have such a capability.`
		5172a4
		5172a4	`The latency benchmarks measure round-trip time but report half of that as`
		5172a4	`one-way latency.`
		5172a4	`This means that it may not be sufficiently accurate for asymmetrical`
		5172a4	`configurations.`
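
As a rough illustration of the two paragraphs above, the sketch below converts a cycle-counter difference to microseconds and halves the round trip to get the reported one-way latency. The function names and the CPU frequency are hypothetical and chosen only for this example; they are not taken from the perftest sources.

```python
# Minimal sketch (hypothetical helper names, not perftest code).
# Assumes the cycle counter ticks at a constant rate of cpu_mhz MHz.

def cycles_to_usec(cycles, cpu_mhz):
    """Convert a cycle-counter difference to microseconds."""
    return cycles / cpu_mhz


def one_way_latency_usec(round_trip_cycles, cpu_mhz):
    """Report half of the measured round-trip time as one-way latency."""
    return cycles_to_usec(round_trip_cycles, cpu_mhz) / 2.0


# Example: a 6000-cycle round trip on an assumed 3000 MHz CPU.
print(one_way_latency_usec(6000, 3000.0))
```

Because the result is simply half of the round trip, it says nothing about how the delay is split between the two directions, which is why asymmetrical configurations may be misreported.
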
		5172a4
		5172a4	`In BW benchmarks, the BW is calculated on the send side only, as the sender`
		5172a4	`calculates the BW after collecting completions from the receive side.`
		5172a4	`If the bidirectional flag is used, BW is calculated on both sides.`
		5172a4	`In ib_send_bw, the server side also calculates the received throughput.`
		5172a4
		5172a4	`Min/Median/Max results are reported in latency tests.`
		5172a4	`The median (vs average) is less sensitive to extreme scores.`
		5172a4	`Typically, the "Max" value is the first value measured.`
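
The following sketch shows why the median is preferred over the average when the first sample is an outlier. The sample values are invented for illustration and the `summarize` helper is hypothetical; it is not how perftest itself computes its report.

```python
# Minimal sketch with made-up latency samples (in usec); not perftest code.
import statistics


def summarize(samples_usec):
    """Return min, median, max and average of a list of latency samples."""
    return {
        "min": min(samples_usec),
        "median": statistics.median(samples_usec),
        "max": max(samples_usec),
        "average": statistics.mean(samples_usec),
    }


# The first measurement is assumed to be a warm-up outlier.
samples = [25.0, 2.1, 2.0, 2.2, 2.1]
summary = summarize(samples)
print(summary)
```

With these values the single slow first sample becomes the "Max" and pulls the average well above the median, while the median stays close to the typical measurement.
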
		5172a4
		5172a4	`Larger samples help only marginally. The default (1000) is usually sufficient.`
		5172a4	`Note that an array of cycles_t (typically unsigned long) is allocated`
		5172a4	`once to collect samples and again to store the difference between them.`
		5172a4	`Really big sample sizes (e.g., 1 million) might expose other problems`
		5172a4	`with the program. In this case you can use the -N flag (No Peak) to instruct`
		5172a4	`the test to sample only twice (beginning and end).`
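
To get a feel for the memory cost described above, the sketch below estimates the size of the two sample arrays. It assumes that cycles_t is 8 bytes (an unsigned long on a typical 64-bit Linux system); the helper name is hypothetical.

```python
# Minimal sketch; assumes sizeof(cycles_t) == 8 bytes, which is typical
# for unsigned long on 64-bit Linux but is not guaranteed everywhere.

def sample_buffers_bytes(iters, sizeof_cycles_t=8):
    """Two arrays are allocated: one for samples, one for their differences."""
    return 2 * iters * sizeof_cycles_t


for iters in (1000, 1000000):
    print(iters, sample_buffers_bytes(iters))
```

Under this assumption the default of 1000 iterations needs about 16 KB, while 1 million iterations needs about 16 MB.
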
		5172a4
		5172a4	`All throughput tests now have a duration feature as well (-D <seconds to run>)`
		5172a4	`to instruct the test to run for <seconds to run>.`
		5172a4	`Another added feature is --run_infinitely, which instructs the test to run`
		5172a4	`all the time and print throughput every 5 seconds.`
		5172a4
		5172a4	`The "-H" option (latency) will dump the histogram for additional statistical`
		5172a4	`analysis.`
		5172a4	`See xgraph, ygraph, r-base (http://www.r-project.org/), pspp, or other`
		5172a4	`statistical math programs.`
		5172a4
		5172a4
		5172a4	`Architectures tested: i686, x86_64, ia64`
		5172a4	`.SH OPTIONS`
		5172a4	`The SAME OPTIONS must be passed to both server and client.`
		5172a4
		5172a4	`If`
		5172a4	`.I <host>`
		5172a4	`is not given, the command starts a server and waits for a connection.`
		5172a4	`If it is, the command connects to the server at`
		5172a4	`.I <host>.`
		5172a4	`.sp`
		5172a4	`.B Common Options:`
		5172a4	`.RS 4`
		5172a4	`.TP`
		5172a4	`\fB\-h\fR, \fB\-\-help\fR`
		5172a4	`Display this help message screen.`
		5172a4	`.TP`
		5172a4	`\fB\-p\fR, \fB\-\-port\fR=\fI<port>\fR`
		5172a4	`Listen on/connect to port <port> (default: 18515) when exchanging data.`
		5172a4	`.TP`
		5172a4	`\fB\-R\fR, \fB\-\-rdma_cm\fR`
		5172a4	`Connect QPs with rdma_cm and run test on those QPs.`
		5172a4	`.TP`
		5172a4	`\fB\-z\fR, \fB\-\-com_rdma_cm\fR`
		5172a4	`Communicate with rdma_cm module to exchange data \- use regular QPs.`
		5172a4	`.TP`
		5172a4	`\fB\-m\fR, \fB\-\-mtu\fR=\fI<mtu>\fR`
		5172a4	`QP Mtu size (default: active_mtu from ibv_devinfo).`
		5172a4	`.TP`
		5172a4	`\fB\-c\fR, \fB\-\-connection\fR=\fI<RC/UC/UD>\fR`
		5172a4	`Connection type RC/UC/UD (default RC)`
		5172a4	`.TP`
		5172a4	`\fB\-d\fR, \fB\-\-ib\-dev\fR=\fI<dev>\fR`
		5172a4	`Use IB device <dev> (default: first device found).`
		5172a4	`.TP`
		5172a4	`\fB\-i\fR, \fB\-\-ib\-port\fR=\fI<port>\fR`
		5172a4	`Use port <port> of IB device (default: 1).`
		5172a4	`.TP`
		5172a4	`\fB\-s\fR, \fB\-\-size\fR=\fI<size>\fR`
		5172a4	`Size of message to exchange (default: 1).`
		5172a4	`.TP`
		5172a4	`\fB\-a\fR, \fB\-\-all\fR`
		5172a4	`Run sizes from 2 to 2^23.`
		5172a4	`.TP`
		5172a4	`\fB\-n\fR, \fB\-\-iters\fR=\fI<iters>\fR`
		5172a4	`Number of exchanges (at least 100, default: 1000).`
		5172a4	`.TP`
		5172a4	`\fB\-x\fR, \fB\-\-gid\-index\fR=\fI<index>\fR`
		5172a4	`Test uses GID with GID index taken from the command line.`
		5172a4	`.TP`
		5172a4	`\fB\-V\fR, \fB\-\-version\fR`
		5172a4	`Display version number.`
		5172a4	`.TP`
		5172a4	`\fB\-e\fR, \fB\-\-events\fR`
		5172a4	`Sleep on CQ events (default poll).`
		5172a4	`.TP`
		5172a4	`\fB\-F\fR, \fB\-\-CPU\-freq\fR`
		5172a4	`Do not fail even if the cpufreq_ondemand module is loaded.`
		5172a4	`.TP`
		5172a4	`\fB\-I\fR, \fB\-\-inline_size\fR=\fI<size>\fR`
		5172a4	`Max size of message to be sent in inline mode.`
		5172a4	`.TP`
		5172a4	`\fB\-u\fR, \fB\-\-qp\-timeout\fR=\fI<timeout>\fR`
		5172a4	`QP timeout; the timeout value is 4 usec * 2^timeout (default: 14).`
		5172a4	`.TP`
		5172a4	`\fB\-S\fR, \fB\-\-sl\fR=\fI<sl>\fR`
		5172a4	`SL \- Service Level (default 0)`
		5172a4	`.TP`
		5172a4	`\fB\-r\fR, \fB\-\-rx\-depth\fR=\fI<dep>\fR`
		5172a4	`Make rx queue bigger than tx (default 600).`
		5172a4	`.RE`
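
Two of the common options above involve simple arithmetic: the QP timeout (\-u) and the size sweep (\-a). The sketch below works both out. It takes the formula 4 usec * 2^timeout exactly as stated in this page, and it assumes that \-a steps through powers of two, which this page does not state explicitly. The function names are hypothetical.

```python
# Minimal sketch; helper names are hypothetical, not perftest code.

def qp_timeout_usec(timeout):
    """QP timeout in usec, using the formula given in this page."""
    return 4 * (2 ** timeout)


def all_sizes(max_exponent=23):
    """Message sizes from 2 to 2^23, assuming a power-of-two sweep."""
    return [2 ** e for e in range(1, max_exponent + 1)]


print(qp_timeout_usec(14))          # default timeout value
print(len(all_sizes()), all_sizes()[0], all_sizes()[-1])
```

With the default value of 14, the formula gives 65536 usec, roughly 65.5 ms. Under the power-of-two assumption, the sweep covers 23 sizes from 2 bytes up to 8388608 bytes.
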
		5172a4	`.sp`
		5172a4	`.B Latency tests options:`
		5172a4	`.RS 4`
		5172a4	`.TP`
		5172a4	`\fB\-C\fR, \fB\-\-report\-cycles\fR`
		5172a4	`Report times in cpu cycle units.`
		5172a4	`.TP`
		5172a4	`\fB\-H\fR, \fB\-\-report\-histogram\fR`
		5172a4	`Print out all results (Default: summary only).`
		5172a4	`.TP`
		5172a4	`\fB\-U\fR, \fB\-\-report\-unsorted\fR`
		5172a4	`Print out unsorted results (default sorted).`
		5172a4	`.RE`
		5172a4	`.sp`
		5172a4	`.B BW tests options:`
		5172a4	`.RS 4`
		5172a4	`.TP`
		5172a4	`\fB\-b\fR, \fB\-\-bidirectional\fR`
		5172a4	`Measure bidirectional bandwidth (default uni).`
		5172a4	`.TP`
		5172a4	`\fB\-N\fR, \fB\-\-no peak\-bw\fR`
		5172a4	`Cancel peak\-bw calculation (default with peak\-bw).`
		5172a4	`.TP`
		5172a4	`\fB\-Q\fR, \fB\-\-cq\-mod\fR`
		5172a4	`Generate CQE only after <cq\-mod> completions.`
		5172a4	`.TP`
		5172a4	`\fB\-t\fR, \fB\-\-tx\-depth=<dep>\fR`
		5172a4	`Size of tx queue (default: 128).`
		5172a4	`.TP`
		5172a4	`\fB\-O\fR, \fB\-\-dualport\fR`
		5172a4	`Run test in dual\-port mode (2 QPs). Both ports must be active (default OFF).`
		5172a4	`.TP`
		5172a4	`\fB\-D\fR, \fB\-\-duration=<sec>\fR`
		5172a4	`Run test for <sec> period of seconds.`
		5172a4	`.TP`
		5172a4	`\fB\-f\fR, \fB\-\-margin=<sec>\fR`
		5172a4	`When in duration mode, measure results within margins (default: 2 sec).`
		5172a4	`.TP`
		5172a4	`\fB\-l\fR, \fB\-\-post_list=<list_size>\fR`
		5172a4	`Post list of WQEs of <list_size> size (instead of single post).`
		5172a4	`.TP`
		5172a4	`\fB\-q\fR, \fB\-\-qp=<num_of_qps>\fR`
		5172a4	`Num of QPs running in the process (default: 1).`
		5172a4	`.TP`
		5172a4	`\fB\-\-run_infinitely\fR`
		5172a4	`Run test forever, print results every 5 seconds.`
		5172a4	`.RE`
		5172a4	`.sp`
		5172a4	`.B SEND tests options:`
		5172a4	`.RS 4`
		5172a4	`.TP`
		5172a4	`\fB\-r\fR, \fB\-\-rx\-depth=<dep>\fR`
		5172a4	`Size of RX queue (default: 512 in BW test).`
		5172a4	`.TP`
		5172a4	`\fB\-g\fR, \fB\-\-mcg=<num_of_qps>\fR`
		5172a4	`Send messages to multicast group with <num_of_qps> qps attached to it.`
		5172a4	`.TP`
		5172a4	`\fB\-M\fR, \fB\-\-MGID=<multicast_gid>\fR`
		5172a4	`In multicast, uses <multicast_gid> as the group MGID.`
		5172a4	`.RE`
		5172a4	`.sp`
		5172a4	`.B ATOMIC tests options:`
		5172a4	`.RS 4`
		5172a4	`.TP`
		5172a4	`\fB\-A\fR, \fB\-\-atomic_type=<type>\fR`
		5172a4	`Type of atomic operation from {CMP_AND_SWAP,FETCH_AND_ADD}.`
		5172a4	`.TP`
		5172a4	`\fB\-o\fR, \fB\-\-outs=<num>\fR`
		5172a4	`Number of outstanding read/atomic requests \- also on READ tests.`
		5172a4	`.RE`
		5172a4	`.sp`
		5172a4	`.B Raw Ethernet BW test options:`
		5172a4	`.RS 4`
		5172a4	`.TP`
		5172a4	`\fB\-B\fR, \fB\-\-source_mac\fR`
		5172a4	`Source MAC address in the format XX:XX:XX:XX:XX:XX (default: take the MAC address from GID).`
		5172a4	`.TP`
		5172a4	`\fB\-E\fR, \fB\-\-dest_mac\fR`
		5172a4	`Destination MAC address in the format XX:XX:XX:XX:XX:XX. MUST be entered.`
		5172a4	`.TP`
		5172a4	`\fB\-J\fR, \fB\-\-server_ip\fR`
		5172a4	`Server IP address in the format X.X.X.X (used to send packets with an IP header).`
		5172a4	`.TP`
		5172a4	`\fB\-j\fR, \fB\-\-client_ip\fR`
		5172a4	`Client IP address in the format X.X.X.X (used to send packets with an IP header).`
		5172a4	`.TP`
		5172a4	`\fB\-K\fR, \fB\-\-server_port\fR`
		5172a4	`Server UDP port number (used to send packets with a UDP header).`
		5172a4	`.TP`
		5172a4	`\fB\-k\fR, \fB\-\-client_port\fR`
		5172a4	`Client UDP port number (used to send packets with a UDP header).`
		5172a4	`.TP`
		5172a4	`\fB\-Z\fR, \fB\-\-server\fR`
		5172a4	`Choose server side for the current machine (\-\-server/\-\-client must be selected).`
		5172a4	`.TP`
		5172a4	`\fB\-P\fR, \fB\-\-client\fR`
		5172a4	`Choose client side for the current machine (\-\-server/\-\-client must be selected).`
		5172a4	`.RE`
		5172a4	`.SH ENVIRONMENT`
		5172a4	`.B Prerequisites:`
		5172a4	`.RS`
		5172a4	`kernel 2.6`
		5172a4	`.RE`
		5172a4	`.RS`
		5172a4	`(kernel module) matches libibverbs`
		5172a4	`.RE`
		5172a4	`.RS`
		5172a4	`(kernel module) matches librdmacm`
		5172a4	`.RE`
		5172a4	`.RS`
		5172a4	`(kernel module) matches libibumad`
		5172a4	`.RE`
		5172a4	`.RS`
		5172a4	`(kernel module) matches libmath (lm).`
		5172a4	`.RE`
		5172a4	`.SH NOTES`
		5172a4	`If you are using an IB fabric, a Subnet Manager must be running on the switch or on one of the nodes in your fabric.`
		5172a4	`.SH BUGS`
		5172a4	`1. The multicast feature in ib_send_lat and in ib_send_bw still has many problems!`
		5172a4	`Support and bug fixes will be improved in this Q, but for now the tests may get stuck`
		5172a4	`and could produce undefined behaviour.`
		5172a4	`.sp`
		5172a4	`2. The bidirectional feature in the ib_send_bw test has problems when running in UD or UC mode.`
		5172a4	`The algorithm we use for the bidirectional measurement is designed for the RC connection type.`
		5172a4	`When running in UC or UD connection types, there is a small probability the test will get stuck.`
		5172a4	`.sp`
		5172a4	`3. RDMA_CM feature in read tests still doesn't work.`
		5172a4	`.sp`
		5172a4	`4. Dual-port support currently works only with ib_write_bw.`
		5172a4	`.sp`
		5172a4	`5. Compatibility issues may occur between different versions of perftest.`
		5172a4	`Please make sure you work with the same version on both sides to ensure`
		5172a4	`consistency of the test.`
		5172a4	`.SH AUTHORS`
		5172a4	`Please post results/observations to the openib-general mailing list.`
		5172a4	`See "Contact Us" at http://openib.org/mailman/listinfo/openib-general and`
		5172a4	`http://www.openib.org.`
