3 Replies Latest reply on Nov 4, 2015 4:10 AM by ymartin

    Automatic agent upgrade failure with secure communications

    ymartin

      Hello,

       

      My RHQ server installation (no HA or cluster) has been upgraded from 4.4 to 4.13.1, but the agents now fail to upgrade themselves automatically. I raised the log level in conf/log4j.xml to diagnose the problem and got:

       

      2015-11-03 15:56:36,040 DEBUG [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.args-processed}Agent container has processed its command line arguments: [--daemon]
      2015-11-03 15:56:36,124 INFO  [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.identify-version}Version=[RHQ 4.4.0], Build Number=[516c434], Build Date=[May 7, 2012 11:05 PM]
      ...
      2015-11-03 15:56:36,604 DEBUG [main] (enterprise.communications.command.client.ClientCommandSender)- {ClientCommandSender.added-state-listener}Added the command client sender state listener [org.rhq.enterprise.agent.AgentMain$2@3febb011]; sender is sending=[false]; notify listener immediately=[true]
      2015-11-03 15:56:36,890 FATAL [RHQ Server Polling Thread] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.agent-not-supported}This version of the agent is not supported by the server - an agent update must be applied
      2015-11-03 15:56:36,893 INFO  [RHQ Agent Update Thread] (org.rhq.enterprise.agent.AgentUpdateThread)- {AgentUpdateThread.started}The agent update thread has started - will begin the agent auto-update now!
      ...
      2015-11-03 15:56:37,918 DEBUG [RHQ Agent Registration Thread] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.agent-registration-attempt}Agent will now attempt to register with the server [AgentRegistrationRequest: [name=[dev-srv1]; address=[172.20.6.14]; port=[16163]; remote-endpoint=[sslsocket://172.20.6.14:16163/?rhq.communications.connector.rhqtype=agent&numAcceptThreads=1&maxPoolSize=303&clientMaxPoolSize=304&socketTimeout=60000&enableTcpNoDelay=true&backlog=200]; regenerate-token=[false]; original-token=[<was not null>]; agent-version=[4.4.0(516c434)]]
      2015-11-03 15:56:37,922 DEBUG [RHQ Agent Registration Thread] (org.rhq.enterprise.agent.SecurityTokenCommandPreprocessor)- {SecurityTokenCommandPreprocessor.no-security-token-yet}There is no security token yet - the server will not accept commands from this agent until the agent is registered.
      ...
      2015-11-03 15:56:41,954 INFO  [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.shut-down}Agent has been shut down
      2015-11-03 15:56:41,954 FATAL [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.start-failure}Failed to start the agent
      org.rhq.core.clientapi.server.core.AgentNotSupportedException
          at org.rhq.enterprise.agent.AgentMain.waitForServer(AgentMain.java:1611)
          at org.rhq.enterprise.agent.AgentMain.start(AgentMain.java:655)
          at org.rhq.enterprise.agent.AgentMain.main(AgentMain.java:428)
      2015-11-03 15:56:41,956 DEBUG [RHQ Agent Update Thread] (org.rhq.enterprise.agent.AgentUpdateVersion)- {AgentUpdateVersion.update-version-retrieval}Getting the agent update version via URL [https://rhqserver:7443/agentupdate/version]
      2015-11-03 15:56:41,990 FATAL [RHQ Agent Update Thread] (org.rhq.enterprise.agent.AgentUpdateThread)- {PromptCommand.update.download-failed}Failed to download the agent update binary. Cause: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
      2015-11-03 15:56:41,990 FATAL [RHQ Agent Update Thread] (org.rhq.enterprise.agent.AgentUpdateThread)- {AgentUpdateThread.exception}The agent update thread encountered an exception: javax.net.ssl.SSLHandshakeException:sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target -> javax.net.ssl.SSLHandshakeException:sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target -> sun.security.validator.ValidatorException:PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target -> sun.security.provider.certpath.SunCertPathBuilderException:unable to find valid certification path to requested target
      2015-11-03 15:56:41,990 FATAL [RHQ Agent Update Thread] (org.rhq.enterprise.agent.AgentUpdateThread)- {AgentUpdateThread.cannot-restart-retry}The agent cannot restart after the aborted update, will try to update again in [60,000]ms

       

      And here is the relevant content of command-trace.log:

       

      2015-11-03 15:56:38,139 TRACE {send.initiate}==>CoreServerService.connectAgent|?
      2015-11-03 15:56:38,162 TRACE {send.complete}=>>CoreServerService.connectAgent|?|failed:java.lang.reflect.InvocationTargetException:null -> org.rhq.core.clientapi.server.core.AgentNotSupportedException:Agent [dev-srv1] is an unsupported aent: 4.4.0(516c434)

       

      I have confirmed that HTTPS port 7443 is open and that the server presents its certificate there, using openssl s_client -connect rhqserver:7443

       

      I also believe that my keystore and truststore are configured properly in the agent 4.4 conf/ directory, since the agent is able to query the server for its version.

       

      My guess is that the self-signed server certificate is rejected when the agent downloads the update binary, meaning the agent does not use the configured truststore for that HTTPS request. How can I confirm this? Is there a workaround for upgrading when secure communications are already set up?
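
One way this guess could be tested outside the agent is a small standalone probe, sketched below under several assumptions: the class name TruststoreProbe is made up, the truststore is assumed to be in JKS format, and the truststore path and password passed on the command line are placeholders for whatever the agent configuration actually uses. Run with only the URL, it relies on the JVM default truststore; run with a truststore file and password, it trusts only the certificates in that file. If the first run fails with the same PKIX error as in the log and the second succeeds, that would support the idea that the download does not use the configured truststore.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.CertPathBuilderException;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical diagnostic helper; not part of RHQ.
final class TruststoreProbe {

    /** Builds an SSLContext that trusts only the certificates in the given JKS file. */
    static SSLContext contextFor(String truststorePath, char[] password)
            throws IOException, GeneralSecurityException {
        KeyStore trustStore = KeyStore.getInstance("JKS"); // assumption: JKS format
        try (InputStream in = Files.newInputStream(Paths.get(truststorePath))) {
            trustStore.load(in, password);
        }
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, tmf.getTrustManagers(), null);
        return ctx;
    }

    /** Returns true if the cause chain contains a certification path building failure. */
    static boolean isUntrustedCert(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof CertPathBuilderException) {
                return true;
            }
        }
        return false;
    }

    /** Performs the HTTPS request; a null context means the JVM default truststore. */
    static int fetchStatus(URL url, SSLContext ctx) throws IOException {
        HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
        if (ctx != null) {
            conn.setSSLSocketFactory(ctx.getSocketFactory());
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Usage: java TruststoreProbe.java <https-url> [<truststore.jks> <password>]
    public static void main(String[] args) throws Exception {
        URL url = new URL(args[0]);
        SSLContext ctx = args.length >= 3 ? contextFor(args[1], args[2].toCharArray()) : null;
        try {
            System.out.println("HTTP " + fetchStatus(url, ctx) + ": certificate accepted");
        } catch (IOException e) {
            if (!isUntrustedCert(e)) {
                throw e; // some other failure, e.g. connection refused or hostname mismatch
            }
            System.out.println("Certificate NOT trusted: " + e);
        }
    }
}
```

For example, the URL from the log (https://rhqserver:7443/agentupdate/version) could be passed as the first argument. Note that the probe keeps the default hostname verification, so a certificate whose name does not match the host would surface as a different error than the PKIX one.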

       

      Thank you in advance for your help

      Regards

      Yves