[root@localhost ~]# rdma link
link rxe0/1 state ACTIVE physical_state LINK_UP netdev eth0
[root@localhost ~]# ibv_devinfo
hca_id: rxe0
transport: InfiniBand (0)
fw_ver: 0.0.0
node_guid: 5054:00ff:fe52:8b53
sys_image_guid: 5054:00ff:fe52:8b53
vendor_id: 0xffffff
vendor_part_id: 0
hw_ver: 0x0
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
Soft-RoCE Requires a Specific IPv6 Address
Problem
So you’ve successfully set up Soft-RoCE. rdma_rxe
is loaded
into your kernel. All the tools are reporting that you have an RDMA device:
And now you’re trying to verify RDMA is working or validate RXE is working. Except, it isn’t.
Does your ibv_rc_pingpong
(and friends) hit errors like:
[root@localhost ~]# ibv_rc_pingpong -g 0 -d rxe0 -i 1
local address: LID 0x0000, QPN 0x000028, PSN 0x0096f2, GID fe80::5054:ff:fe52:8b53
Failed to modify QP to RTR
Couldn't connect to remote QP
[root@localhost ~]# ibv_rc_pingpong -g 0 -d rxe0 -i 1 10.0.0.234
local address: LID 0x0000, QPN 0x000026, PSN 0x5137fe, GID fe80::5054:ff:fe7a:e08d
client read/write: No space left on device
Couldn't read/write remote address
Does your qperf
fail with errors like:
[root@localhost ~]# qperf 10.0.0.234 ud_bw ud_lat
ud_bw:
failed to create address handle: Invalid argument
or
[root@localhost ~]# qperf 10.0.0.234 rc_bw
rc_bw:
failed to modify QP to RTR: Invalid argument
server: failed to modify QP to RTR: Invalid argument
Does your rping
mysteriously work totally fine?
[root@localhost ~]# rping -s -a 10.0.0.234
server DISCONNECT EVENT...
wait for RDMA_READ_ADV state 10
[root@localhost ~]# rping -c -a 10.0.0.234 -C 4 -v
ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
Solution
It’s because rdma-core depends on having the eui64
encoded version of the MAC address registered as an IPv6 link local address on the same interface. See
rdma_rxe usage problem on the linux-rdma mailing list.
Rather than go through the manual steps outlined therein with ipv6calc
, the easy way to resolve this is just to get linux to do it for you:
ip link set dev eth0 addrgenmode eui64
ip link set dev eth0 down
ip link set dev eth0 up
To make this persist across reboots, use /etc/sysctl.d/
(or however your distro configures sysctls):
net.ipv6.conf.default.addr_gen_mode = 1
net.ipv6.conf.eth0.addr_gen_mode = 1