2 Replies Latest reply on Feb 16, 2012 5:25 PM by genman

    Problem when server connect address does not match server's own address

    genman

      I have a number of servers on different subnets, call them web and int. They reach the RHQ server via different hostnames, e.g.

       

      db (on subnet int) -> rhqhost (using rhqhost-int hostname)

      webserver (on subnet web) -> rhqhost (using rhqhost-web hostname)

       

      This is what I see on the agent side (this agent is connecting to 'rhqhost-int'):

       

      2012-02-16 19:54:01,444 WARN  [RHQ Agent Registration Thread] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.failover-list-unreachable-host}
      Failover list has an unreachable host [rhqhost] (tested ports [7080] and [7443]). Cause: java.net.ConnectException:Connection timed out
      2012-02-16 19:54:01,445 WARN  [RHQ Agent Registration Thread] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.failover-list-check-failed}!!!
      There are [1] servers that are potentially unreachable by this agent.
      Please double check all public endpoints of your servers and ensure
      they are all reachable by this agent. The failed server endpoints are:
      [rhqhost:7080/7443]
      See the Administration > High Availability > Servers in the server GUI
      

       

      So it seems I need some way to map the host name 'rhqhost' to 'rhqhost-int' on the agent side. Alternatively, and perhaps more simply, the server could tell the agent to try both rhqhost-int and rhqhost-web.

       

      Is there some way to manually add a list of Endpoint Addresses through the UI or database?

       

      Another workaround would be to change /etc/hosts on each agent so that 'rhqhost' resolves to the address reachable from that agent's subnet.
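
      To make the /etc/hosts idea concrete, here is a rough Python sketch that parses hosts-file text and shows which IP a name would map to. The helper name parse_hosts, the sample file contents, and the address 10.0.1.5 are all made up for illustration; only the hostnames rhqhost and rhqhost-int come from my setup.

```python
def parse_hosts(text):
    """Map each hostname or alias found in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Keep the first entry seen for a name (assumed lookup order)
            mapping.setdefault(name, ip)
    return mapping


# Hypothetical hosts file for an agent on the int subnet.
# 10.0.1.5 is an invented address standing in for the server's int-side IP.
INT_AGENT_HOSTS = """
127.0.0.1   localhost
10.0.1.5    rhqhost-int rhqhost   # alias rhqhost to the int-side address
"""

if __name__ == "__main__":
    print(parse_hosts(INT_AGENT_HOSTS).get("rhqhost"))
```

      An agent on the web subnet would get the same kind of entry, but with 'rhqhost' aliased to the web-side address instead.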

        • 1. Re: Problem when server connect address does not match server's own address
          mazz

          Unfortunately, having a list of IPs per RHQ Server is not supported. You'd have to go with your second suggestion: make sure each agent's /etc/hosts resolves the server's endpoint address correctly.

           

          This is addressed in the HA docs:

           

               "Thus, it is critical that every RHQ Agent be able to resolve the Endpoint Address set for every RHQ Server in the HA Server cloud. So, when defining the RHQ Server in the installer, it is important that the Endpoint Address be public to the degree that the RHQ Agent population can resolve the RHQ Server's address and be able to reach the RHQ Server via the defined address and port"

           

          and in the blue box just below that:

           

               "Each server has a public endpoint address associated with it (which can be either a hostname or IP address). Those server public endpoints are used as failover list entries. Therefore, it is very important that all servers are assigned public endpoint addresses that are resolvable by all agents and that all agents have connectivity to those addresses."
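
          If you want to verify this from an agent machine, a minimal Python sketch along these lines may help. It simply tries a TCP connection to each port and is not how the agent itself performs its check. The function name check_endpoint and the 5 second timeout are assumptions; the ports 7080 and 7443 are the ones shown in the log message above.

```python
import socket


def check_endpoint(host, ports=(7080, 7443), timeout=5.0):
    """Return {port: True/False} for whether a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            # Resolves the name and attempts a TCP connect in one step
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers resolution failures, refused connections, and timeouts
            results[port] = False
    return results


if __name__ == "__main__":
    # 'rhqhost' is the endpoint name the failover list hands to the agent
    print(check_endpoint("rhqhost"))
```

          Running it on each agent against the endpoint name from the failover list shows whether that name both resolves and is reachable from that agent's subnet.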

          • 2. Re: Problem when server connect address does not match server's own address
            genman

            I'm looking at updating /etc/hosts.

             

            Thanks for pointing me to the documentation. I looked over the install docs and the FAQ and didn't see this.