Channel: Mellanox Interconnect Community: Message List

"Priority trust-mode is not supported on your system"?


Hello, I ran into a problem when setting the trust mode on a ConnectX-3 Pro 40GbE NIC.
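
For context, this is the kind of command I am running when the message appears (a sketch; I am using the mlnx_qos tool from MLNX_OFED, and the interface name eth2 is just a placeholder):

# show the current QoS settings of the port (placeholder interface name)
mlnx_qos -i eth2
# set the priority trust mode to DSCP; this is the step that returns
# "Priority trust-mode is not supported on your system"
mlnx_qos -i eth2 --trust dscp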

The system information follows:

LSB Version: :core-4.1-amd64:core-4.1-noarch

Distributor ID: CentOS

Description: CentOS Linux release 7.3.1611 (Core)

Release: 7.3.1611

Codename: Core

The ConnectX-3 Pro NIC information follows:

hca_id: mlx4_1

transport: InfiniBand (0)

fw_ver: 2.40.7000

node_guid: f452:1403:0095:2280

sys_image_guid: f452:1403:0095:2280

vendor_id: 0x02c9

vendor_part_id: 4103

hw_ver: 0x0

board_id: MT_1090111023

phys_port_cnt: 2

Device ports:

port: 1

state: PORT_ACTIVE (4)

max_mtu: 4096 (5)

active_mtu: 1024 (3)

sm_lid: 0

port_lid: 0

port_lmc: 0x00

link_layer: Ethernet

port: 2

state: PORT_DOWN (1)

max_mtu: 4096 (5)

active_mtu: 1024 (3)

sm_lid: 0

port_lid: 0

port_lmc: 0x00

link_layer: Ethernet

This is the first time I have run into this message, so I am not sure what to do.

What does it mean? Is the operating system version the likely cause?

Any help would be appreciated.

Thanks.


Re: Can't get full FDR bandwidth with Connect-IB card in PCI 2.0 x16 slot


Server:
> ib_send_bw -a -F --report_gbits
Client:
> ib_send_bw -a -F --report_gbits <serverIP>

 

Please let me know your results and thank you...

~Steve

Re: Can't get full FDR bandwidth with Connect-IB card in PCI 2.0 x16 slot


Hi Steve,

 

Here you are.  Thanks for taking an interest.  Still getting ~45 Gb/sec on both client and server.  Here is the client output:

 

[root@vx01 ~]#  ib_send_bw -a -F --report_gbits vx02

---------------------------------------------------------------------------------------

                    Send BW Test

Dual-port       : OFF Device         : mlx5_0

Number of qps   : 1 Transport type : IB

Connection type : RC Using SRQ      : OFF

TX depth        : 128

CQ Moderation   : 100

Mtu             : 4096[B]

Link type       : IB

Max inline data : 0[B]

rdma_cm QPs : OFF

Data ex. method : Ethernet

---------------------------------------------------------------------------------------

local address: LID 0x3e4 QPN 0x005e PSN 0x239861

remote address: LID 0x3e6 QPN 0x004c PSN 0xed4513

---------------------------------------------------------------------------------------

#bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]

2          1000           0.098220            0.091887            5.742968

4          1000             0.20               0.19       6.037776

8          1000             0.40               0.39       6.071169

16         1000             0.78               0.67       5.220818

32         1000             1.53               1.43       5.576730

64         1000             3.16               3.10       6.053410

128        1000             6.20               6.16       6.012284

256        1000             12.35              12.28     5.997002

512        1000             22.67              22.47     5.486812

1024       1000             38.02              36.69     4.478158

2048       1000             42.26              42.04     2.565771

4096       1000             43.82              43.68     1.332978

8192       1000             44.63              44.63     0.681005

16384      1000             44.79              44.79     0.341728

32768      1000             45.21              45.21     0.172449

65536      1000             45.35              45.35     0.086506

131072     1000             45.45              45.45     0.043342

262144     1000             45.45              45.45     0.021670

524288     1000             45.47              45.47     0.010840

1048576    1000             45.47              45.47     0.005421

2097152    1000             45.48              45.48     0.002711

4194304    1000             45.48              45.48     0.001355

8388608    1000             45.48              45.48     0.000678

---------------------------------------------------------------------------------------

 

Here is the server output:

 

[root@vx02 ~]#  ib_send_bw -a -F --report_gbits

 

 

************************************

* Waiting for client to connect... *

************************************

---------------------------------------------------------------------------------------

                    Send BW Test

Dual-port       : OFF Device         : mlx5_0

Number of qps   : 1 Transport type : IB

Connection type : RC Using SRQ      : OFF

RX depth        : 512

CQ Moderation   : 100

Mtu             : 4096[B]

Link type       : IB

Max inline data : 0[B]

rdma_cm QPs : OFF

Data ex. method : Ethernet

---------------------------------------------------------------------------------------

local address: LID 0x3e6 QPN 0x004c PSN 0xed4513

remote address: LID 0x3e4 QPN 0x005e PSN 0x239861

---------------------------------------------------------------------------------------

#bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]

2          1000           0.000000            0.099141            6.196311

4          1000             0.00               0.20       6.229974

8          1000             0.00               0.40       6.265230

16         1000             0.00               0.69       5.362016

32         1000             0.00               1.47       5.727960

64         1000             0.00               3.22       6.283794

128        1000             0.00               6.34       6.191118

256        1000             0.00               12.64     6.169975

512        1000             0.00               23.08     5.634221

1024       1000             0.00               37.53     4.581582

2048       1000             0.00               42.63     2.602155

4096       1000             0.00               44.07     1.344970

8192       1000             0.00               45.04     0.687191

16384      1000             0.00               45.04     0.343602

32768      1000             0.00               45.35     0.172994

65536      1000             0.00               45.45     0.086690

131072     1000             0.00               45.52     0.043409

262144     1000             0.00               45.51     0.021699

524288     1000             0.00               45.52     0.010852

1048576    1000             0.00               45.52     0.005427

2097152    1000             0.00               45.53     0.002714

4194304    1000             0.00               45.53     0.001357

8388608    1000             0.00               45.53     0.000678

---------------------------------------------------------------------------------------

 

Please let me know if you want any other info and I will send it straight away.

 

Regards,

 

Eric

ASAP2 Live Migration & H/W LAG


Hi,

 

Does ASAP2 OVS Offload support OpenStack live migration? If not, which ASAP2 mode should I use: OVS Acceleration or Application Acceleration (DPDK Offload)? And how do I configure H/W LAG (with LACP) in each of those three modes?
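
For the LAG part, this is roughly the bond I would like the hardware to offload (a minimal iproute2 sketch; the uplink names enp1s0f0/enp1s0f1 are placeholders):

# create an LACP (802.3ad) bond over the two uplink ports
ip link add bond0 type bond mode 802.3ad
ip link set enp1s0f0 down
ip link set enp1s0f1 down
ip link set enp1s0f0 master bond0
ip link set enp1s0f1 master bond0
ip link set bond0 up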

 

Best regards,

The problem with RoCE connectivity between ConnectX-3 and ConnectX-4 Lx adapters


Hello.

I have a Microsoft Windows 2012 R2 cluster in which some nodes have ConnectX-3 adapters and others have ConnectX-4 Lx adapters.

There is RoCE connectivity between the nodes with ConnectX-4 Lx adapters, but there is no connectivity between nodes with different adapters.

I think it's because the ConnectX-3 adapters use RoCE 1.0 mode while the ConnectX-4 Lx adapters use RoCE 2.0 mode.

I tried to change the RoCE mode from 1.0 to 2.0 on the ConnectX-3 adapters with "Set-MlnxDriverCoreSetting -RoceMode 2", but got the warning "SingleFunc_2_0_0: RoCE v2.0 mode was requested, but it is not supported. The NIC starts in RoCE v1.5 mode", and RoCE connectivity still doesn't work.

What is the best way to fix this problem?

Do ConnectX-3 adapters not support RoCE 2.0 mode at all? Would newer firmware help? Today I have WinOF-5_35 with this FW:

 

Image type:      FS2

FW Version:      2.40.5032

FW Release Date: 16.1.2017

Product Version: 02.40.50.32

Rom Info:        type=PXE version=3.4.747 devid=4099

Device ID:       4099

Description:     Node             Port1            Port2            Sys image

GUIDs:           ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff

MACs: e41d2ddfa540     e41d2ddfa541

VSD:

PSID:            MT_1080120023

 

 

I also cannot find a way to switch the ConnectX-4 Lx adapters to RoCE 1.0 mode in a Microsoft Windows environment.
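
For reference, the only related knob I can see on the ConnectX-4 Lx side is the "NetworkDirect Technology" advanced adapter property exposed by the WinOF-2 driver; a hedged sketch of what I have been checking (the adapter name and the exact value strings are assumptions, so verify them with Get-NetAdapterAdvancedProperty first):

# list the advanced properties and their allowed values for the port
Get-NetAdapterAdvancedProperty -Name "Ethernet 3" | Format-Table DisplayName, DisplayValue, ValidDisplayValues
# if a RoCE v1 value is offered, select it (adapter name and value are placeholders)
Set-NetAdapterAdvancedProperty -Name "Ethernet 3" -DisplayName "NetworkDirect Technology" -DisplayValue "RoCE"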

Unknown symbol nvme_find_pdev_from_bdev


Hi all,

 

After installing MLNX_OFED_LINUX-4.4-1 on Ubuntu 18.04 (kernel 4.15.0-24) with "$ mlnxofedinstall --force --without-dkms --with-nvmf", I'm trying to use the RDMA tools, but:

 

- modprobe on nvme_rdma fails with "nvme_rdma: Unknown symbol nvme_delete_wq (err 0)"

- modprobe on nvmet_rdma fails with "nvmet: Unknown symbol nvme_find_pdev_from_bdev (err 0)"

 

What am I doing wrong, please?

 

I see two kernel modules loaded: nvme and nvme_core.
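
For completeness, here is the kind of check I have been running to see where those modules come from (a diagnostic sketch; the "Unknown symbol" errors usually indicate that the stock in-kernel nvme core is loaded instead of the one built by the --with-nvmf install):

# which nvme modules are currently loaded
lsmod | grep nvme
# check whether the modules resolve to the MLNX_OFED build (typically under
# /lib/modules/$(uname -r)/updates/...) or to the stock kernel tree
modinfo -F filename nvme_core
modinfo -F filename nvme_rdma
modinfo -F filename nvmet_rdma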

 

This is Mellanox MCX516A-CCAT ConnectX-5 EN Network Interface Card 100GbE Dual-Port QSFP28.

 

Any inputs will be greatly appreciated

 

Thank you

Dmitri Fedorov

Ciena Canada

ConnectX-3: Only one queue is receiving packet


Hi,

I have a ConnectX-3 card with two physical ports, and each port has 4 RX queues. I use another machine to generate sample traffic and send it to both ports. However, only one queue is receiving packets (on one IRQ channel), and this is the case for BOTH physical ports.

How can I distribute the traffic across all the queues (different IRQ channels)? At the very least, the two physical ports should be using different queues.

 

Here is the output from cat /proc/interrupts | grep mlx4:

 29:       2787       2570       2500       3093  IR-PCI-MSI 524288-edge      mlx4-async@pci:0000:01:00.0
 30:          0          0         66         11  IR-PCI-MSI 524289-edge      mlx4-1@0000:01:00.0
 31:          9          0         88          0  IR-PCI-MSI 524290-edge      mlx4-2@0000:01:00.0
 32:        124         13          0          0  IR-PCI-MSI 524291-edge      mlx4-3@0000:01:00.0
 33:     385333          0    2517273          0  IR-PCI-MSI 524292-edge      mlx4-4@0000:01:00.0
 34:          0          0          0          0  IR-PCI-MSI 524293-edge      mlx4-5@0000:01:00.0
 35:          0          0          0          0  IR-PCI-MSI 524294-edge      mlx4-6@0000:01:00.0
 36:          0          0          0          0  IR-PCI-MSI 524295-edge      mlx4-7@0000:01:00.0
 37:          0          0          0          0  IR-PCI-MSI 524296-edge      mlx4-8@0000:01:00.0

 

The packet counts from ifconfig show that both ports receive packets.

I use 64-bit Ubuntu 18.04 with kernel v4.15 and the upstream in-kernel driver (I need to run an XDP program on the NIC, although XDP was not running in the test above), and I have not installed Mellanox OFED yet. I believe the card should work fine with only the in-kernel driver(?).
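
For reference, these are the RSS knobs I plan to check with ethtool (a sketch; eth0 is a placeholder for the port's netdev, and note that RSS only spreads distinct flows, so a generator sending a single 5-tuple will always land in one queue):

# how many RX channels/queues the driver exposes
ethtool -l eth0
# current flow-hash settings and RSS indirection table
ethtool -n eth0 rx-flow-hash udp4
ethtool -x eth0
# spread flows over all 4 RX queues and hash UDP on src/dst IP and ports
ethtool -X eth0 equal 4
ethtool -N eth0 rx-flow-hash udp4 sdfn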

Re: Firmware for MHJH29 ?


> Service for this Hardware and FW has ended. So it will not be hosted on our site.

 

Thank you for your help. It seems this firmware is no newer than the one I already have, unfortunately.

Those are 10+ year old cards... no surprise they are difficult to put back into service.

 

Cordially & thanks again for the great support,


What are some good budget options for NICs and Switch?


I have a limited budget, and I want to buy high-speed network interface cards and a switch to connect multiple PCs (just a couple of meters away from one another) into a local network for HPC research.

Some of the key points for me are:

  • It should be compatible with Windows
  • Speed preferably 56 gigabit
  • Needs to be able to sustain a 100% load at all times
  • Non-managed Switch
  • No SFP(+) uplinks

 

Taking all of that into account, which models should I look at buying? Perhaps some older models from a few years ago?

What caveats should I consider when building a system like that?

What should I expect in terms of latency?

SN2100B v3.6.8004


Hi All,

 

I don't know if someone has already posted about this.

I just want to share that after we upgraded our SN2100 to v3.6.8004, the GUI no longer loaded. Luckily the switch was still accessible via SSH, so I changed the next-boot partition and it rebooted fine. I switched back to the partition with the previous working version, and now we can manage the switch via both the GUI and the CLI.
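
For anyone in the same situation, the partition switch was done from the CLI roughly like this (a sketch from memory; please check the exact syntax against the MLNX-OS command reference for your release):

# show the two installed images and which partition will boot next
show images
# boot from the other (previously working) partition on the next reboot
image boot next
reload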

 

Hope this helps someone who wants to upgrade to this version.

Re: MLNX+NVIDIA ASYNC GPUDirect - Segmentation fault: invalid permissions for mapped object running mpi with CUDA


I have run into this issue, too.

It was because UCX was not compiled with CUDA support (the MLNX_OFED installer installs the default UCX).

When I recompiled UCX with CUDA and reinstalled it, it worked.
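
A rough sketch of the rebuild, assuming the UCX sources from GitHub and CUDA installed under /usr/local/cuda (paths and versions are placeholders):

# rebuild UCX with CUDA support and reinstall it
git clone https://github.com/openucx/ucx.git && cd ucx
./autogen.sh
./configure --prefix=/usr --with-cuda=/usr/local/cuda
make -j$(nproc) && sudo make install
# confirm the cuda transports are now available
ucx_info -d | grep -i cuda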

Header Data Split


I've made a feeble attempt to utilise the Header Data Split (HDS) offload on ConnectX-5 adapters by creating the striding WQ context with a non-zero log2_hds_buf_size value. However, the hardware won't have it and reports back a bad_param error with syndrome 0x6aaebb.

 

According to an online error syndrome list, this translates to human readable as:

create_rq/rmp: log2_hds_buf_size not supported

 

Since the Public PRM does not describe the HDS offload, I'm curious whether certain preconditions need to be met for this offload to work, or whether this is a known restriction in current firmware. I'd also like to know if it's possible to configure the HDS "level", that is, where the split happens (between L3/L4, L4/L5, ...).

 

The way I'd envision this feature working is to zero-pad the end of the headers up to log2_hds_buf_size, placing the upper-layer payload at a fixed offset regardless of the variable header length.

Re: Can't get full FDR bandwidth with Connect-IB card in PCI 2.0 x16 slot


Hello Eric -

   I hope all is well...

You won't achieve the 56 Gb/s line rate, because the NIC is an MCB193A-FCAT (MT_1220110019) and your PCIe slot is Gen 2.0.

And the release notes for your FW state:

Connect-IB® Host Channel Adapter, single-port QSFP, FDR 56Gb/s,PCIe3.0 x16, tall bracket, RoHS R6

 

So getting ~45-48 Gb/s is good.
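
As a back-of-the-envelope check: PCIe 2.0 runs at 5 GT/s per lane with 8b/10b encoding, so a x16 slot gives 16 x 5 GT/s x 8/10 = 64 Gb/s of raw data bandwidth per direction, and after PCIe packet (TLP/DLLP) overhead the usable throughput is typically around 50 Gb/s. So ~45 Gb/s measured end-to-end is in the expected range for a Connect-IB card in a Gen2 x16 slot.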

 

Have a great day!

Steve

RoCE v2 configuration with Linux drivers and packages


Is it possible to configure RoCE v2 on a ConnectX-4 card without MLNX_OFED? Can someone please share any guide/doc available for configuring it with the inbox Linux drivers and packages?

I tried to do it with the inbox drivers and packages, but I was not able to succeed. When I used MLNX_OFED, RoCE was configured successfully.
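
For reference, this is roughly what I have been checking with the inbox drivers (a sketch; the device name mlx5_0, port 1, and the GID index are placeholders):

# with the upstream mlx5 driver plus rdma-core, RoCE GIDs are created automatically
# from the interface's IP addresses; check the GID table and the per-index type
cat /sys/class/infiniband/mlx5_0/ports/1/gids/3
cat /sys/class/infiniband/mlx5_0/ports/1/gid_attrs/types/3
# select the RoCE v2 GID index explicitly when running perftest
ib_send_bw -d mlx5_0 -i 1 -x 3 --report_gbits <server_ip>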

send_bw test between QSFP ports on Dual Port Adapter


Hello Mellanox,

 

I have a ConnectX-3 dual-port QSFP adapter (CX354A) in a PC running Windows 7 x64 Pro, and I would like to run a throughput test between the two QSFP ports. I connected a 1 m fiber cable between the ports and set the CX354A to Ethernet mode in Device Manager. I also manually set different IP addresses on the two Ethernet adapters, with the same subnet mask:

IP1: 192.168.0.1

mask: 255.255.255.0

 

IP2: 192.168.0.2

mask: 255.255.255.0

 

I tried a TCP iperf3 test, but I got only 12 Gbit/s instead of 40 Gbit/s. Your documentation recommends the ib_send_bw utility for performance testing.

 

How do I run an ib_send_bw test between two Ethernet interfaces on one PC?

 

For example, in iperf3 I can select the interface with the -B option: iperf3 -B 192.168.0.1.
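
In other words, what I have in mind for ib_send_bw is something like the following, assuming the WinOF perftest build accepts the same -d/-i flags as the Linux one (device name and port numbers are placeholders):

Server (bound to port 1 of the adapter):
> ib_send_bw -d mlx4_0 -i 1 -F --report_gbits
Client (bound to port 2, connecting to the IP address assigned to port 1):
> ib_send_bw -d mlx4_0 -i 2 -F --report_gbits 192.168.0.1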

 

Thank you.

 

--

With Best Wishes,

Dmitrii


RoCEv2 GID disappeared ?


Hi everybody !

 

How are the ConnectX-5 GIDs created/initialized? They disappeared after an OFED upgrade...

 

Neither ibv_devinfo -v nor show_gids displays any GID...

 

Any ideas?

regards, raph

 

after upgrade to MLNX_OFED_LINUX-4.4-1.0.0.0-debian9.1-x86_64

 

scisoft13:~ % sudo show_gids

DEV     PORT INDEX   GID                        IPv4 VER DEV

---     ---- -----   ---                        ------------ --- ---

n_gids_found=0

 

 

 

before upgrade

scisoft13:~ % sudo show_gids

DEV     PORT INDEX   GID                        IPv4 VER DEV

---     ---- -----   ---                        ------------ --- ---

mlx4_0  1 0       fe80:0000:0000:0000:526b:4bff:fe4f:be21                 v1 enp131s0

mlx4_0  2 0       fe80:0000:0000:0000:526b:4bff:fe4f:be22                 v1 enp131s0d1

mlx5_0  1 0       fe80:0000:0000:0000:526b:4bff:fed3:d164                 v1 enp130s0f0

mlx5_0  1 1       fe80:0000:0000:0000:526b:4bff:fed3:d164                 v2 enp130s0f0

mlx5_0  1 2       0000:0000:0000:0000:0000:ffff:c0a8:030d 192.168.3.13    v1 enp130s0f0

mlx5_0  1 3       0000:0000:0000:0000:0000:ffff:c0a8:030d 192.168.3.13    v2 enp130s0f0

mlx5_1  1 0       fe80:0000:0000:0000:526b:4bff:fed3:d165                 v1 enp130s0f1

mlx5_1  1 1       fe80:0000:0000:0000:526b:4bff:fed3:d165                 v2 enp130s0f1

n_gids_found=8

 

hca_id: mlx5_1

 

        transport:                      InfiniBand (0)

        fw_ver:                         16.23.1000

        node_guid:                      506b:4b03:00d3:d185

        sys_image_guid:                 506b:4b03:00d3:d184

        vendor_id:                      0x02c9

        vendor_part_id:                 4119

        hw_ver:                         0x0

        board_id:                       MT_0000000012

        phys_port_cnt:                  1

        max_mr_size:                    0xffffffffffffffff

        page_size_cap:                  0xfffffffffffff000

        max_qp:                         262144

        max_qp_wr:                      32768

        device_cap_flags:               0xe5721c36

                                        BAD_PKEY_CNTR

                                        BAD_QKEY_CNTR

                                        AUTO_PATH_MIG

                                        CHANGE_PHY_PORT

                                        PORT_ACTIVE_EVENT

                                        SYS_IMAGE_GUID

                                        RC_RNR_NAK_GEN

                                        XRC

                                        Unknown flags: 0xe5620000

        device_cap_exp_flags:           0x520DF8F100000000

                                        EXP_DC_TRANSPORT

                                        EXP_CROSS_CHANNEL

                                        EXP_MR_ALLOCATE

                                        EXT_ATOMICS

                                        EXT_SEND NOP

                                        EXP_UMR

                                        EXP_ODP

                                        EXP_RX_CSUM_TCP_UDP_PKT

                                        EXP_RX_CSUM_IP_PKT

                                        EXP_MASKED_ATOMICS

                                        EXP_RX_TCP_UDP_PKT_TYPE

                                        EXP_SCATTER_FCS

                                        EXP_WQ_DELAY_DROP

                                        EXP_PHYSICAL_RANGE_MR

                                        EXP_UMR_FIXED_SIZE

                                        Unknown flags: 0x200000000000

        max_sge:                        30

        max_sge_rd:                     30

        max_cq:                         16777216

        max_cqe:                        4194303

        max_mr:                         16777216

        max_pd:                         16777216

        max_qp_rd_atom:                 16

        max_ee_rd_atom:                 0

        max_res_rd_atom:                4194304

        max_qp_init_rd_atom:            16

        max_ee_init_rd_atom:            0

        atomic_cap:                     ATOMIC_HCA (1)

        log atomic arg sizes (mask)             0x8

        masked_log_atomic_arg_sizes (mask)      0x3c

        masked_log_atomic_arg_sizes_network_endianness (mask)   0x34

        max fetch and add bit boundary  64

        log max atomic inline           5

        max_ee:                         0

        max_rdd:                        0

        max_mw:                         16777216

        max_raw_ipv6_qp:                0

        max_raw_ethy_qp:                0

        max_mcast_grp:                  2097152

        max_mcast_qp_attach:            240

        max_total_mcast_qp_attach:      503316480

        max_ah:                         2147483647

        max_fmr:                        0

        max_srq:                        8388608

        max_srq_wr:                     32767

        max_srq_sge:                    31

        max_pkeys:                      128

        local_ca_ack_delay:             16

        hca_core_clock:                 78125

        max_klm_list_size:              65536

        max_send_wqe_inline_klms:       20

        max_umr_recursion_depth:        4

        max_umr_stride_dimension:       1

        general_odp_caps:

                                        ODP_SUPPORT

                                        ODP_SUPPORT_IMPLICIT

        max_size:                       0xFFFFFFFFFFFFFFFF

        rc_odp_caps:

                                        SUPPORT_SEND

                                        SUPPORT_RECV

                                        SUPPORT_WRITE

                                        SUPPORT_READ

        uc_odp_caps:

                                        NO SUPPORT

        ud_odp_caps:

                                        SUPPORT_SEND

        dc_odp_caps:

                                        SUPPORT_SEND

                                        SUPPORT_WRITE

                                        SUPPORT_READ

        xrc_odp_caps:

                                        NO SUPPORT

        raw_eth_odp_caps:

                                        NO SUPPORT

        max_dct:                        262144

        max_device_ctx:                 1020

        Multi-Packet RQ supported

                Supported for objects type:

                        IBV_EXP_MP_RQ_SUP_TYPE_SRQ_TM

                        IBV_EXP_MP_RQ_SUP_TYPE_WQ_RQ

                Supported payload shifts:

                        2 bytes

                Log number of strides for single WQE: 3 - 16

                Log number of bytes in single stride: 6 - 13

 

        VLAN offloads caps:

                                        C-VLAN stripping offload

                                        C-VLAN insertion offload

        rx_pad_end_addr_align:  64

        tso_caps:

        max_tso:                        262144

        supported_qp:

                                        SUPPORT_RAW_PACKET

        packet_pacing_caps:

        qp_rate_limit_min:              0kbps

        qp_rate_limit_max:              0kbps

        ooo_caps:

        ooo_rc_caps  = 0x1

        ooo_xrc_caps = 0x1

        ooo_dc_caps  = 0x1

        ooo_ud_caps  = 0x0

                                        SUPPORT_RC_RW_DATA_PLACEMENT

                                        SUPPORT_XRC_RW_DATA_PLACEMENT

                                        SUPPORT_DC_RW_DATA_PLACEMENT

        sw_parsing_caps:

                                        SW_PARSING

                                        SW_PARSING_CSUM

                                        SW_PARSING_LSO

        supported_qp:

                                        SUPPORT_RAW_PACKET

        tag matching not supported

        tunnel_offloads_caps:

                                        TUNNEL_OFFLOADS_VXLAN

                                        TUNNEL_OFFLOADS_GRE

                                        TUNNEL_OFFLOADS_GENEVE

        UMR fixed size:

                max entity size:        2147483648

        Device ports:

                port:   1

                        state:                  PORT_ACTIVE (4)

                        max_mtu:                4096 (5)

                        active_mtu:             1024 (3)

                        sm_lid:                 0

                        port_lid:               0

                        port_lmc:               0x00

                        link_layer:             Ethernet

                        max_msg_sz:             0x40000000

                        port_cap_flags:         0x04010000

                        max_vl_num:             invalid value (0)

                        bad_pkey_cntr:          0x0

                        qkey_viol_cntr:         0x0

                        sm_sl:                  0

                        pkey_tbl_len:           1

                        gid_tbl_len:            256

                        subnet_timeout:         0

                        init_type_reply:        0

                        active_width:           4X (2)

                        active_speed:           25.0 Gbps (32)

                        phys_state:             LINK_UP (5)

Re: RoCEv2 GID disappeared ?


Thank you for helping! It is strange because everything was OK before the upgrade.

 

- ping is OK

- /sys/class/infiniband/... exists and is populated, but the GIDs are missing...

 

after upgrade to MLNX_OFED_LINUX-4.4-1.0.0.0-debian9.1-x86_64

 

scisoft13:~ % sudo show_gids

 

DEV     PORT INDEX   GID                        IPv4 VER DEV

 

---     -


 

Re: RoCEv2 GID disappeared ?


scisoft13:~ % cat /sys/class/infiniband/mlx5_0/ports/1/gids/0

0000:0000:0000:0000:0000:0000:0000:0000

scisoft13:~ % cat /sys/class/infiniband/mlx5_0/ports/1/gids/1

0000:0000:0000:0000:0000:0000:0000:0000

 

scisoft13:~ % cat /sys/class/infiniband/mlx5_0/ports/1/gid_attrs/types/0

cat: /sys/class/infiniband/mlx5_0/ports/1/gid_attrs/types/0: Invalid argument

 

Same issue on 2 servers, with a ConnectX-5 EN 100Gb/s optical link and a ConnectX-3 40Gb/s copper link.

The OFED install completed without issues.
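
For the record, these are the next things I plan to verify (a sketch; the device and interface names are taken from the pre-upgrade output above):

# the GID table is populated from the netdevs' IP/link-local addresses, so check
# that each RDMA device is still bound to a netdev that is up and has an address
ibdev2netdev
ip addr show enp130s0f0
# which netdev backs GID index 0
cat /sys/class/infiniband/mlx5_0/ports/1/gid_attrs/ndevs/0
# restart the driver stack and re-check
sudo /etc/init.d/openibd restart
sudo show_gids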

Re: RoCEv2 GID disappeared ?


Hi Raphael,

 

Thank you for the information; it looks like unexpected behaviour related to the driver and this specific operating system.

For us to continue investigating, please send an email to support@mellanox.com and open a support ticket with all the details.

 

Thank you,

Karen.

Re: MLNX+NVIDIA ASYNC GPUDirect - Segmentation fault: invalid permissions for mapped object running mpi with CUDA


Thanks a lot for the reply. It solved the above issue, but after running mpirun I do not see any latency difference with and without GDR.

 

My Questions :

  1. Why do I not see any latency difference with and without GDR?
  2. Is the sequence of steps below correct? Does it matter for question 1?

 

Note: I have a single GPU on both the host and the peer. The IOMMU is disabled.

## nvidia-smi topo -m

           GPU0    mlx5_0  mlx5_1  CPU Affinity

GPU0     X      PHB     PHB     18-35

mlx5_0  PHB      X      PIX

mlx5_1  PHB     PIX      X

 

Steps followed are:

1. Install CUDA 9.2 and add the library and bin path in .bashrc

2. Install latest MLX OFED

3. Compile and Install nv_peer_mem driver

4. Get UCX from git, configure UCX with CUDA, and install UCX

5. Configure Openmpi-3.1.1 and install it.

./configure --prefix=/usr/local --with-wrapper-ldflags=-Wl,-rpath,/lib --enable-orterun-prefix-by-default --disable-io-romio --enable-picky --with-cuda=/usr/local/cuda-9.2

6. Configure OSU Benchmarks-5.4.2 with cuda and install it

./configure prefix=/root/osu_benchmarks CC=mpicc --enable-cuda --with-cuda=/usr/local/cuda-9.2

 

Then I run mpirun, but I do not see any latency difference with and without GDR.
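
For reference, this is roughly how I am comparing the two cases (a sketch; the host names and the benchmark path are placeholders):

# device-to-device latency over UCX with GPUDirect RDMA available (nv_peer_mem loaded)
mpirun -np 2 -host host1,host2 --mca pml ucx ./osu_latency D D
# for the "without GDR" baseline, unload the peer-memory module on both nodes
# and rerun the same command; small-message latency should then increase if
# GPUDirect RDMA was actually being used:
#   rmmod nv_peer_mem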

 

Thanks for your Help.
