UBC Theses and Dissertations

Keeping TCP connections intact across server failures

Li, Rui

Abstract

Replication is widely employed to achieve fault tolerance and high availability. There are two common approaches: active replication and primary-backup replication. In the primary-backup approach, the service state is replicated on the backup server. When the primary server fails, the backup server takes over and continues the service. In most present implementations based on TCP/IP communication, the TCP connections between clients and servers break if the primary server crashes. The goal of this thesis is to keep these connections intact across primary server failures: the backup server takes over all the TCP connections automatically so that, from the client's point of view, no service is affected by the failure. To achieve this goal, we implemented replication in the TCP layer. The information and data associated with the sockets are replicated on the backup server, so when the primary server fails, the backup server can reconstruct all the sockets. By using an ARP message to claim the IP address of the failed primary server, the backup server refreshes the routing tables in other nodes so that packets addressed to the failed primary server are redirected to the backup server from then on. In this way, the backup server takes over seamlessly, without breaking the existing TCP connections.
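
The abstract does not say what the ARP message looks like. As a rough illustration only, the sketch below builds a gratuitous ARP request, one common way for a host to announce that it now owns an IP address. The function name `build_gratuitous_arp`, the MAC address, and the IP address are hypothetical and are not taken from the thesis; the frame layout follows the standard Ethernet and ARP formats.

```python
import socket
import struct


def build_gratuitous_arp(mac: bytes, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request.

    Hypothetical helper for illustration; not the thesis implementation.
    mac: the backup server's 6-byte hardware address.
    ip:  the IPv4 address being claimed (the failed primary's address).
    """
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    ip_bytes = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP).
    eth_header = b"\xff" * 6 + mac + struct.pack("!H", 0x0806)

    # ARP payload: in a gratuitous ARP the sender and target protocol
    # addresses are both the claimed IP, so receivers that update their
    # caches associate that IP with the sender's MAC.
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # operation: request
        mac,          # sender hardware address
        ip_bytes,     # sender protocol address (claimed IP)
        b"\x00" * 6,  # target hardware address (unused in a request)
        ip_bytes,     # target protocol address (same claimed IP)
    )
    return eth_header + arp_payload


# Example values are assumptions: a locally administered MAC and a
# documentation-range IPv4 address.
frame = build_gratuitous_arp(bytes.fromhex("02005e000001"), "192.0.2.10")
```

The sketch only constructs the frame. Sending it would need a raw link-layer socket and suitable privileges, and how the thesis actually issues the message is not stated in the abstract.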

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.