ServerManager-Server(-ProcessManager) communication
kabirkhan Aug 26, 2010 3:33 PM
A change in how messages are passed between the ServerManager (SM) and Server instances was mentioned on IRC. Currently the SM and Server processes connect to the ProcessManager (PM) which then routes the messages to the appropriate process. The new idea is that Servers connect directly to SM, leaving the PM solely in charge of starting, stopping and reconnecting processes.
I think the flow for how that would work is something like the following. Stuff I'm unsure about is in bold.
1. PM starts up and listens on a socket on a port (PPM) for connections from the processes it manages.
2. PM starts SM, passing in PPM, PM’s host address and ‘ServerManager’ as the name.
  2.1. SM records PPM, PM’s host address and ‘ServerManager’ in a file.
  2.2. SM opens a socket on a different port (PSM), which listens for connections from the Server processes and from the Domain Controller (DC).
  2.3. SM initiates communication with PM by connecting to port PPM. The first command it sends is 'STARTED ServerManager <PSM> <SM_ADDRESS>', which helps PM associate the socket with the correct ManagedProcess.
  2.4. For each Server configured in SM:
    2.4.1. SM tells PM to add the Server.
    2.4.2. SM tells PM to start the Server process.
      2.4.2.1. PM launches the Server process, passing in PPM, PM_ADDRESS, PSM, SM_ADDRESS and the SERVER_NAME.
      2.4.2.2. Server initiates communication with PM by connecting to port PPM. The first command it sends is 'STARTED <SERVER_NAME>', which helps PM associate the socket with the correct ManagedProcess.
      2.4.2.3. Server starts listening for commands on the PM socket.
      2.4.2.4. Server initiates communication with SM by connecting to port PSM. The first command it sends is 'AVAILABLE <SERVER_NAME>', which helps SM associate the socket with the correct Server proxy.
      2.4.2.5. Server starts listening for commands on the SM socket.
    2.4.3. SM sends the ‘START serverConfig’ message to the Server via the Server’s socket.
    2.4.4. Server parses the serverConfig, starts up and sends to SM either:
      2.4.4.1. ‘STARTED’ if successful.
      2.4.4.2. ‘START_FAILED’ if failed.
        2.4.4.2.1. SM tells PM to stop process???
        2.4.4.2.2. SM tells PM to remove process???
3. While a ManagedProcess is registered as started in PM:
  3.1. PM regularly pings the process on the process's socket (Or should SM instead perhaps pick up on when the Server socket is closed?)
  3.2. Server or SM process sends a ping back.
  3.3. If a reply is not received from the process, or the process's socket is closed:
    3.3.1. For Server processes, PM stops the ManagedProcess and sends ‘DOWN <SERVER_NAME>’ to SM on the PM-SM socket.
      3.3.1.1. SM does 2.4.2 and 2.4.3 for the Server according to its respawn policy, if it has not initiated shutdown of that server.
      3.3.1.2. After more retries than the respawn policy max:
        3.3.1.2.1. SM tells PM to stop process.
        3.3.1.2.2. SM tells PM to remove process.
    3.3.2. For ServerManager???
4. To shut down a server:
  4.1. SM sends ‘SHUTDOWN’ to the Server.
  4.2. Server closes down.
  4.3. Server sends ‘STOPPED’ to SM.
  4.4. SM tells PM to stop the Server process.
  4.5. SM tells PM to remove the Server process.
5. Closing down everything:
  5.1. Shutdown hook in PM sends 'SHUTDOWN' message to SM.
  5.2. For each server:
    5.2.1. Do step 4.
  5.3. SM sends 'STOPPED' command to PM.
  5.4. PM stops and removes the SM process.
6. Restarting SM:
  6.1. SM process is stopped by:
    6.1.1. Message from PM?
    6.1.2. Message from DC?
    6.1.3. Process is killed
  6.2. SM is down...
  6.3. SM process is started:
    6.3.1. SM reads PPM, PM address and process name from file (or should it be restarted via PM? In which case this could be passed in as in 2.)
    6.3.2. See 2.2
    6.3.3. See 2.3
    6.3.4. Some differentiator is needed to not do 2.4. SM sends “RESTARTED” command to PM.
    6.3.5. For each Server process, PM sends ‘SM_RESTARTED <PSM> <SM_ADDRESS>’.
      6.3.5.1. Server reconnects to SM as in 2.4.2.4.
      6.3.5.2. SM sends STATUS to Server.
      6.3.5.3. Server responds with STARTED, START_FAILED etc.
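
To make the "first command" handshake in the flow above a bit more concrete, here is a rough sketch of how the receiving side (PM or SM) might parse that first line and associate the socket with the right process. It is in Python purely for brevity. The command names and arguments come from the steps above; the space-separated single-line wire format, the `Hello` and `ConnectionRegistry` names and the rejection of unknown process names are all assumptions of mine, not anything that has been decided.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Hello:
    """First command received on a freshly accepted socket (hypothetical type)."""
    command: str                       # 'STARTED' or 'AVAILABLE'
    process_name: str                  # 'ServerManager' or a <SERVER_NAME>
    sm_port: Optional[int] = None      # <PSM>, only sent by the SM
    sm_address: Optional[str] = None   # <SM_ADDRESS>, only sent by the SM


def parse_hello(line: str) -> Hello:
    """Parse the first command, assuming space-separated tokens on one line."""
    parts = line.split()
    # 'STARTED ServerManager <PSM> <SM_ADDRESS>' (SM -> PM)
    if len(parts) == 4 and parts[:2] == ["STARTED", "ServerManager"]:
        return Hello(parts[0], parts[1], int(parts[2]), parts[3])
    # 'STARTED <SERVER_NAME>' (Server -> PM) or 'AVAILABLE <SERVER_NAME>' (Server -> SM)
    if (len(parts) == 2 and parts[0] in ("STARTED", "AVAILABLE")
            and parts[1] != "ServerManager"):
        return Hello(parts[0], parts[1])
    raise ValueError(f"unrecognised first command: {line!r}")


class ConnectionRegistry:
    """Maps known process names to their sockets (any object stands in for one)."""

    def __init__(self, known_processes: Iterable[str]) -> None:
        self._connections: Dict[str, Optional[object]] = {
            name: None for name in known_processes
        }

    def register(self, first_line: str, connection: object) -> Hello:
        hello = parse_hello(first_line)
        # Assumption: a process that was never added is rejected outright.
        if hello.process_name not in self._connections:
            raise KeyError(hello.process_name)
        self._connections[hello.process_name] = connection
        return hello

    def connection_for(self, process_name: str) -> Optional[object]:
        return self._connections[process_name]
```

The same registry shape would serve on both sides: PM keyed by ManagedProcess name, SM keyed by Server proxy name. On an SM restart (step 6), Servers re-sending 'AVAILABLE <SERVER_NAME>' would simply overwrite the stale entry.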
I'm not really clear on what initiates 6 and what the steps should be there.
I think SM should be responsible for respawning servers, rather than PM, which is what does that at the moment.
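
As a rough illustration of what SM-side respawning could look like, here is a small sketch of the decision SM would make on receiving ‘DOWN <SERVER_NAME>’ from PM. It follows the flow above: retry the start unless SM itself initiated the shutdown, and after more retries than the respawn policy max, tell PM to stop and remove the process. The `RespawnTracker` name, the action labels (`pm_start`, `send_start`, `pm_stop`, `pm_remove`) and the reset-on-success behaviour are made up for the example; the actual SM-to-PM commands have not been named anywhere yet.

```python
from typing import Dict, List, Set, Tuple

Action = Tuple[str, str]  # (hypothetical action label, server name)


class RespawnTracker:
    """Decides what SM does when PM reports 'DOWN <SERVER_NAME>'."""

    def __init__(self, max_retries: int) -> None:
        self.max_retries = max_retries
        self._attempts: Dict[str, int] = {}
        self._shutting_down: Set[str] = set()

    def mark_shutdown(self, server: str) -> None:
        """Record that SM itself sent 'SHUTDOWN' to this server."""
        self._shutting_down.add(server)

    def reset(self, server: str) -> None:
        """Assumption: a successful 'STARTED' clears the retry count."""
        self._attempts.pop(server, None)

    def on_down(self, server: str) -> List[Action]:
        # SM initiated the shutdown, so a dead process is expected: no respawn.
        if server in self._shutting_down:
            return []
        attempts = self._attempts.get(server, 0) + 1
        self._attempts[server] = attempts
        # After more retries than the policy max: give up, stop and remove.
        if attempts > self.max_retries:
            return [("pm_stop", server), ("pm_remove", server)]
        # Otherwise ask PM to start the process again, then send 'START serverConfig'.
        return [("pm_start", server), ("send_start", server)]
```

With `max_retries=2`, the first two DOWN messages for a server lead to a restart and the third leads to stop and remove. Keeping this in SM means PM never needs to know about respawn policies at all; it only executes start, stop and remove requests.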