Channel: Mellanox Interconnect Community: Message List

Re: Management command failed in KVM for SR-IOV


Hi,

 

 

Still no luck, but I hope this information is helpful.

 

 

I noticed that OpenSM must be started on the hypervisor host (S1 in my case); otherwise the virtual functions' ports show the link as up but stay in state DOWN.

When I start OpenSM (with the option PORTS="ALL"), all the ports become active (both are cable-connected).

 

 

I also noticed a few more things:

 

 

So far, only running ibnetdiscover in the virtual machine produces system messages on the hypervisor host:

 

 

mlx4_core 0000:04:00.0: slave 2 is trying to execute a Subnet MGMT MAD, class 0x1, method 0x81 for attr 0x11. Rejecting

mlx4_core 0000:04:00.0: vhcr command MAD_IFC (0x24) slave:2 in_param 0x26aaf000 in_mod=0xffff0001, op_mod=0xc failed with error:0, status -1
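To make the rejection message easier to read, here is a minimal sketch that decodes its class, method and attribute fields. It assumes the numbers follow the standard InfiniBand MAD encodings; the lookup tables are deliberately small subsets, and the function name decode_rejected_mad is made up for illustration, not part of any driver or library.

```python
import re

# Small, partial lookup tables based on the standard InfiniBand MAD encodings.
# Assumption: the mlx4_core message prints these fields using those encodings.
MGMT_CLASSES = {
    0x01: "Subn (LID routed)",
    0x81: "Subn (directed route)",
    0x03: "SubnAdm",
    0x04: "Perf",
}
METHODS = {0x01: "Get", 0x02: "Set", 0x81: "GetResp"}
SMP_ATTRIBUTES = {
    0x10: "NodeDescription",
    0x11: "NodeInfo",
    0x15: "PortInfo",
    0x20: "SMInfo",
}

PATTERN = re.compile(
    r"slave (?P<slave>\d+) is trying to execute a Subnet MGMT MAD, "
    r"class (?P<cls>0x[0-9a-fA-F]+), method (?P<method>0x[0-9a-fA-F]+) "
    r"for attr (?P<attr>0x[0-9a-fA-F]+)"
)


def decode_rejected_mad(line):
    """Return a dict describing the rejected MAD, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    cls = int(match.group("cls"), 16)
    method = int(match.group("method"), 16)
    attr = int(match.group("attr"), 16)
    return {
        "slave": int(match.group("slave")),
        # Fall back to the raw hex value for anything not in the tables.
        "class": MGMT_CLASSES.get(cls, hex(cls)),
        "method": METHODS.get(method, hex(method)),
        "attribute": SMP_ATTRIBUTES.get(attr, hex(attr)),
    }
```

Fed the first mlx4_core line above, this reports slave 2 with a LID-routed subnet management MAD, method GetResp, attribute NodeInfo.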

 

 

The sminfo command picks up the correct OpenSM LID, i.e. the LID of the master OpenSM, but the query itself fails:

# sminfo --debug -v

ibwarn: [2843] smp_query_status_via: attr 0x20 mod 0x0 route Lid 1

ibwarn: [2843] _do_madrpc: send failed; Function not implemented

ibwarn: [2843] mad_rpc: _do_madrpc failed; dport (Lid 1)

sminfo: iberror: [pid 2843] main: failed: query

 

 

In the virtual machine I can see this message in the log:

ibnetdiscover[2755]: segfault at e4 ip 00000031d420a8b6 sp 00007fffc2eee6b8 error 4 in libibmad.so.5.3.1[31d4200000+12000]

 

 

and on the hypervisor host:

<mlx4_ib> _mlx4_ib_mcg_port_cleanup: _mlx4_ib_mcg_port_cleanup-1102: ff12401bffff000000000000ffffffff (port 2): WARNING: group refcount 1!!! (pointer ffff88083f4fa000)

 

One more thing:

 

In the virtual machine I started OpenSM with the GUID pointing to the local VF port and got these messages:

 

Jul 24 14:27:09 830432 [FA2C0700] 0x80 -> Entering DISCOVERING state

Using default GUID 0x14050000000002

Jul 24 14:27:09 994036 [FA2C0700] 0x02 -> osm_vendor_bind: Mgmt class 0x81 binding to port GUID 0x14050000000002

Jul 24 14:27:10 398748 [FA2C0700] 0x02 -> osm_vendor_bind: Mgmt class 0x03 binding to port GUID 0x14050000000002

Jul 24 14:27:10 398958 [FA2C0700] 0x02 -> osm_vendor_bind: Mgmt class 0x04 binding to port GUID 0x14050000000002

Jul 24 14:27:10 399371 [FA2C0700] 0x02 -> osm_vendor_bind: Mgmt class 0x21 binding to port GUID 0x14050000000002

Jul 24 14:27:10 399960 [FA2C0700] 0x02 -> osm_opensm_bind: Setting IS_SM on port 0x0014050000000002

Jul 24 14:27:10 400439 [FA2C0700] 0x01 -> osm_vendor_set_sm: ERR 5431: setting IS_SM capmask: cannot open file '/dev/infiniband/issm0': Invalid argument

Jul 24 14:27:10 401700 [F66B8700] 0x01 -> osm_vendor_send: ERR 5430: Send p_madw = 0x7fcfe40008c0 of size 256 TID 0x1234 failed -5 (Invalid argument)

Jul 24 14:27:10 401700 [F66B8700] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_ERROR): SubnGet(NodeInfo), attr_mod 0x0, TID 0x1234

Jul 24 14:27:10 401700 [F66B8700] 0x01 -> vl15_send_mad: ERR 3E03: MAD send failed (IB_UNKNOWN_ERROR)

Jul 24 14:27:10 401983 [F5CB7700] 0x01 -> state_mgr_is_sm_port_down: ERR 3308: SM port GUID unknown
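For a longer OpenSM log it can help to pull out only the error entries. The following is a rough sketch that assumes every error line has the "-> function: ERR code: message" shape seen above; the helper name opensm_errors is hypothetical.

```python
import re

# Assumption: error entries look like "-> <function>: ERR <4 hex digits>: <message>".
ERR_RE = re.compile(r"-> (?P<func>\w+): ERR (?P<code>[0-9A-Fa-f]{4}): (?P<msg>.*)")


def opensm_errors(log_text):
    """Return (function, error code, message) tuples for each ERR entry in the log text."""
    return [
        (m.group("func"), m.group("code"), m.group("msg"))
        for m in ERR_RE.finditer(log_text)
    ]
```

On the log above, the first entry it returns is the ERR 5431 failure to open /dev/infiniband/issm0, which precedes all the MAD send errors.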

 

A plain cat of /dev/infiniband/issm0 works on the hypervisor (at least it blocks and waits), whereas in the VM I get exactly the same error as in the OpenSM log:

 

# cat /dev/infiniband/issm0

cat: /dev/infiniband/issm0: Invalid argument
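The same check can be done without cat hanging. As far as I understand the kernel's user MAD documentation, opening an issm file sets the IsSM capability bit on the port, the open blocks while the bit is already set (which would explain why cat waits on the host while OpenSM is running), and with O_NONBLOCK it returns EAGAIN instead. Below is a minimal sketch under that assumption; probe_issm is a made-up helper name.

```python
import errno
import os


def probe_issm(path="/dev/infiniband/issm0"):
    """Try to open the issm device without blocking and report the outcome.

    Returns "OK" if the open succeeded, otherwise the symbolic errno name
    (for example "EINVAL", "EAGAIN" or "ENOENT").
    Note: a successful open briefly sets IsSM on the port until the close below.
    """
    try:
        fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "OK"
```

Running it as root on both the host and the VM should show directly whether the two opens fail differently.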

 

The files on the host and in the VM have the same permissions and SELinux context:

VM:

# ls -aZ /dev/infiniband/issm0

crw-rw----. root root system_u:object_r:device_t:s0    /dev/infiniband/issm0

 

Host:

# ls -lZ /dev/infiniband/issm0

crw-rw----. root root system_u:object_r:device_t:s0    /dev/infiniband/issm0

