Viewing all articles
Browse latest Browse all 18256

Connectivity issues to hosts over MPLS

Hello,

 

We have been troubleshooting an issue that prevents our vCenter server from connecting to some of our remote hosts. It has affected two different vCenter servers, running 5.1 and 5.5 on Windows Server 2008 R2 and 2012 R2.

 

Process leading to the error

  • We are able to add hosts to a data center after a host reboot or a fresh vCenter install
  • If the MPLS connection at our primary data center goes down (maintenance or otherwise), we lose connectivity to all remote hosts
  • Only one site, our secondary data center, is able to reconnect without issue
  • No other remote sites are able to reconnect

 

Troubleshooting

  • Disabled IPv6 across the VMware infrastructure (Windows servers, ESXi hosts)
  • Increased handshakeTimeoutMs to 120000
  • Restarted the management network
  • Cleared the ARP table
  • Confirmed that lockdown mode is disabled
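
One more check that could be added to this list is a plain TCP connect from the vCenter server to port 443 on each affected host, since that is the port vpxd fails to reach in the logs further down. The following is only a minimal sketch, assuming Python is available on the vCenter server. `probe_tcp` is a hypothetical helper name, not part of any VMware tooling, and the 5 second timeout is an arbitrary choice.

```python
import socket
import time


def probe_tcp(host, port=443, timeout=5.0):
    """Attempt a plain TCP connect and return (ok, seconds, error).

    ok is True when the three-way handshake completes within `timeout`.
    error is None on success, otherwise a short description of the failure.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the name and connects in one call.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, None
    except OSError as exc:
        # Timeouts, refusals and name-resolution failures all derive from OSError.
        return False, time.monotonic() - start, f"{type(exc).__name__}: {exc}"


# Example use (host name taken from the post, replace with real hosts):
#   for port in (443, 22):
#       print(port, probe_tcp("xxxesxi01.xxx.com", port))
```

If port 22 answers from the vCenter server but port 443 times out while the host is disconnected, that would point more toward something in the network path than toward vCenter itself, though it would not be conclusive on its own.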

 

Notes

  • We have a single ESX 4.1 host that is able to reconnect without issue (it has experienced only one disconnect and came back without issue, unlike its 5.5 counterpart)
  • We're able to connect to the hosts via the vSphere console and SSH without issue
  • The network team is troubleshooting the issue as well, but we have not yet been able to rule out VMware as the culprit

 

Logs


vpxd

2014-09-24T14:00:14.785-05:00 [05920 warning 'Default'] Failed to connect socket; <io_obj p:0x000000000d10a128, h:3876, <TCP '0.0.0.0:0'>, <TCP '10.x.x.16:443'>>, e: system:10060(A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond)

2014-09-24T14:00:14.785-05:00 [05920 error 'HttpConnectionPool-000001'] [ConnectComplete] Connect failed to <cs p:000000000ee4c730, TCP:xxxesxi01.xxx.com:443>; cnx: (null), error: class Vmacore::SystemException(A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond)

2014-09-24T14:00:14.785-05:00 [05852 error 'httphttpUtil' opID=6159800D-000000AB-d6] [HttpUtil::ExecuteRequest] Error in sending request - A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

2014-09-24T14:00:14.785-05:00 [05852 error 'vpxdvpxdHostAccess' opID=6159800D-000000AB-d6] [VpxdHostAccess::Connect] Failed to discover version: vim.fault.HttpFault

2014-09-24T14:00:14.786-05:00 [05852 info 'commonvpxLro' opID=6159800D-000000AB-d6] [VpxLRO] -- FINISH task-internal-5070 -- datacenter-31 -- vim.Datacenter.queryConnectionInfo --

2014-09-24T14:00:14.786-05:00 [05852 info 'Default' opID=6159800D-000000AB-d6] [VpxLRO] -- ERROR task-internal-5070 -- datacenter-31 -- vim.Datacenter.queryConnectionInfo: vim.fault.NoHost:

--> Result:
--> (vim.fault.NoHost) {
-->    dynamicType = <unset>,
-->    faultCause = (vmodl.MethodFault) null,
-->    name = "xxxesxi01.xxx.com",
-->    msg = "",
--> }
--> Args:
-->
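
To follow a single failed operation through a larger vpxd log, it can help to group the warning and error lines by opID. The following is a rough Python sketch. The regular expression is inferred from the excerpt above rather than from any official description of the vpxd log format, and `parse_vpxd_line` and `problems_by_opid` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Pattern inferred from the excerpt in this post (an assumption, not a spec):
#   <timestamp> [<thread> <level> '<component>' opID=<id>] <message>
# The opID part is optional because some lines do not carry one.
LINE_RE = re.compile(
    r"^(?P<ts>\S+) \[(?P<thread>\d+) (?P<level>\w+) '(?P<component>[^']*)'"
    r"(?: opID=(?P<opid>[^\]\s]+))?\] (?P<msg>.*)$"
)


def parse_vpxd_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def problems_by_opid(lines):
    """Group warning and error entries by opID; continuation lines are skipped."""
    grouped = defaultdict(list)
    for line in lines:
        rec = parse_vpxd_line(line)
        if rec and rec["level"] in ("warning", "error"):
            grouped[rec["opid"] or "(no opID)"].append(rec)
    return dict(grouped)


# Two lines copied from the excerpt above.
SAMPLE = [
    "2014-09-24T14:00:14.785-05:00 [05852 error 'vpxdvpxdHostAccess' "
    "opID=6159800D-000000AB-d6] [VpxdHostAccess::Connect] "
    "Failed to discover version: vim.fault.HttpFault",
    "2014-09-24T14:00:14.786-05:00 [05852 info 'commonvpxLro' "
    "opID=6159800D-000000AB-d6] [VpxLRO] -- FINISH task-internal-5070 -- "
    "datacenter-31 -- vim.Datacenter.queryConnectionInfo --",
]

for opid, records in problems_by_opid(SAMPLE).items():
    for rec in records:
        print(opid, rec["ts"], rec["component"], rec["msg"])
```

Lines beginning with `-->` do not match the pattern and are ignored, so the fault details would need to be read from the original log once the relevant opID is known.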

 

Connection error

Call "Datacenter.QueryConnectionInfo" for object "XXX" on vCenter Server "VCENTER" failed.

 

Thanks

 

Removed network details. Message was edited by: OptimalZ06

