Update: I tested with Windows Server 2012 clients to verify, and I still get about 5.5 Gbit/s max.
Maybe someone here has other 40 Gbit adapters - what speeds are you getting?
Update 2: The mainboard slot had a physical x16 connector but only an x2 electrical connection. (Special thanks to Erez, support admin, for a quick and helpful answer.)
After moving the card to a PCIe 3.0 x8 slot I now get the following speed (it should still be about 3x faster than this):
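To double-check the negotiated PCIe link on Windows, the following PowerShell cmdlet (available on Windows 8 / Server 2012 and later; not Mellanox-specific) lists the PCIe link speed and width each adapter actually trained at:

Get-NetAdapterHardwareInfo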
Update 3: One support admin suggested using optical fibre instead of passive copper. After getting a 56 Gbit optical fibre IB cable I now get these results:
The story goes like this: 40 Gbit advertised, 32 Gbit theoretical, which is really only 25.6 Gbit according to Erez from Mellanox, and which in reality turns out to be half-duplex 16 Gbit!
Am I doing something wrong, or is this just the way it works for Mellanox customers? :/
If there is still something wrong, how do I fix it?
OLD PART DO NOT READ: (READ UPDATE 3 instead)
I have two Windows 10 machines, each with an MHQH19B-XTR 40 Gbit adapter, connected by a QSFP cable. The subnet manager is opensm.
The connection should deliver about 32 Gbit/s over the LAN. In reality I only get about 5 Gbit/s, so clearly something is very wrong.
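To separate the network path from SMB/file-copy overhead, it is worth first running a raw TCP test over the IPoIB interfaces; a minimal sketch with iperf3 (which has to be installed on both machines; the address below is a placeholder) would be:

iperf3 -s                              (on machine A)
iperf3 -c 192.168.10.1 -P 4 -t 30      (on machine B, four parallel streams for 30 seconds)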
Yes, the data rate is 40 Gb/s, but the link sends 8 bits of data in a 10-bit symbol (8b/10b encoding), giving 32 Gb/s maximum data throughput; on top of that the PCIe bus will limit you to about 25 Gb/s (a PCIe 2.0 x8 slot also uses 8b/10b encoding, so it tops out at 32 Gb/s raw, and protocol overhead brings it down to roughly 25-26 Gb/s).
Keep in mind that the performance for hardware to hardware is better than software to software. I've only used Mellanox cards with Linux and the performance for hardware to hardware hits 25Gb/s with ConnectX-2 cards.
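If you can boot both machines into Linux (even a live environment), the perftest tools give a hardware-to-hardware RDMA number that bypasses the file-sharing stack entirely; a minimal sketch, assuming the HCA shows up as mlx4_0 and you use port 1:

ib_send_bw -d mlx4_0 -i 1                   (first machine, acts as the server)
ib_send_bw -d mlx4_0 -i 1 192.168.10.1      (second machine; the address is a placeholder for the first machine's IPoIB address)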
The IB equipment you are using has 4 pairs of wire running at 10Gb/s each - hence 40Gb/s total.
Real-world file sharing, even with older 10Gb/s InfiniHost cards, is better than 10Gb/s ethernet. My maximum-performance tests (using the Linux fio program) are below. That said, we've avoided Windows file servers since at least Windows 2000 - the performance has been terrible compared to Linux, especially when you factor in the cost of the hardware required.
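For reference, a typical sequential-throughput fio run for this kind of test might look like the following (illustrative only - the mount point and file name are placeholders, and the file should be larger than RAM if you want disk rather than cache numbers):

fio --name=seqread --filename=/mnt/ibshare/testfile --rw=read --bs=1M --size=8G --direct=1 --ioengine=libaio --iodepth=16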
I would suggest testing the same servers over an ethernet link to see how it compares. In the end theoretical performance is nice, but what really matters is the actual software you are using. In my case, going to 10Gb ethernet or QDR IB took things like data replication (ZFS snapshots, rsync) from 90 minutes to under 3 minutes. It was often not the increased bandwidth but the lower latency (IOPS) that mattered. For user applications accessing the file server, compile times were only reduced by about 30% going to InfiniBand or 10Gb ethernet - but the ethernet is around 10x as expensive. I've not performance-tested our Oracle database, but it went onto 10Gb ethernet because my IB setup is for the students and I don't trust it yet on a "corporate" server.
In the case of file sharing, you'll want to check whether you're using the old ports 137-139 instead of 445, as that can impact performance.
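A quick way to check on Windows is to run netstat while a transfer is going and see which port the SMB session is bound to:

netstat -n | findstr ":445"
netstat -n | findstr ":139"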
Also - there is no way to exploit the exceptionally low latency of InfiniBand unless you've got SSDs or your data in RAM.
We are trying to attach a DDN GS7k to an existing IS5030 switch and we cannot get an IB link up. We suspect that either the switch does not support SR-IOV or the OpenSM running on the switch does not support it.
Can you tell me the following:
The minimum OpenSM version required on an IS5030 switch to support SR-IOV
How to tell which OpenSM version is currently running, via the CLI or GUI. We cannot currently find it in the GUI and cannot find the CLI command.
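As a hedged starting point (the exact syntax depends on the FabricIT/MLNX-OS release the switch is running): 'show version' on the switch CLI reports the software bundle, and from any Linux host on the fabric, sminfo (part of infiniband-diags) will at least tell you which node is currently acting as the master subnet manager.

show version     (on the switch CLI)
sminfo           (on an attached Linux host)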
I installed the MLNX_OFED drivers on CentOS 6.8. (I had originally configured the network and IPoIB interface using the RHEL manual, Part II, "InfiniBand and RDMA Networking", and was using NFS over IPoIB, but was getting a lot of page allocation failures.)
I used the mlnxofedinstall script, which completed successfully and updated the firmware, e.g.:
...
Device (84:00.0):
84:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
Please reboot your system for the changes to take effect.
To load the new driver, run:
/etc/init.d/openibd restart
#
I rebooted the system and then ran the self test:
# hca_self_test.ofed
---- Performing Adapter Device Self Test ----
Number of CAs Detected ................. 1
PCI Device Check ....................... PASS
Kernel Arch ............................ x86_64
Host Driver Version .................... MLNX_OFED_LINUX-3.4-1.0.0.0 (OFED-3.4-1.0.0): 2.6.32-642.el6.x86_64
Host Driver RPM Check .................. PASS
Firmware on CA #0 VPI .................. v2.36.5150
Host Driver Initialization ............. PASS
Number of CA Ports Active .............. 0
Port State of Port #1 on CA #0 (VPI)..... INIT (InfiniBand)
Port State of Port #2 on CA #0 (VPI)..... DOWN (InfiniBand)
Error Counter Check on CA #0 (VPI)...... FAIL
REASON: found errors in the following counters
Errors in /sys/class/infiniband/mlx4_0/ports/1/counters
port_rcv_errors: 93
Kernel Syslog Check .................... PASS
Node GUID on CA #0 (VPI) ............... e4:1d:2d:03:00:6f:89:f0
------------------ DONE ---------------------
#
As you can see, there is an error in the port_rcv_errors counter. Also, the port state for Port #1 remains at INIT until I start the subnet manager (/etc/init.d/opensmd start), since we have an unmanaged switch. That used to start automatically, so maybe the OFED installation wasn't completely successful?
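If the only problem is that opensmd is no longer enabled at boot, re-enabling it is straightforward on CentOS 6 (this addresses that symptom, not the port_rcv_errors counter):

chkconfig opensmd on
service opensmd start
ibstat mlx4_0 1          (then confirm the port state goes to Active)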
Additionally, I am unable to configure NFS for RDMA, e.g.:
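For reference, the generic upstream-kernel way to enable NFS over RDMA looks roughly like the following; whether the RDMA NFS modules are actually included in this MLNX_OFED build is worth verifying first. The export path and mount point are placeholders:

modprobe svcrdma                                        (server, with the NFS server already running)
echo rdma 20049 > /proc/fs/nfsd/portlist                (server: tell nfsd to listen on the RDMA port)
modprobe xprtrdma                                       (client)
mount -o rdma,port=20049 server:/export /mnt/export     (client)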