
We have three web servers for the Earthquake Response site, http://response.scec.org. One server is at USC, one at Caltech, and one at Stanford. We want the three servers to be mirrors of each other, with one of them (presumably the one at USC) acting as primary. If the primary server fails, we want all visitors to be directed to one of the other two mirrors.

We can easily set up round-robin DNS by having Caltech's hostmaster add each server's IP address as an A record of response.scec.org. However, the DNS server does not understand which IP address is primary. This means that users could be directed to any of the three servers and make changes there. Since we've only set up rsync from the primary server at USC to the mirrors, changes made on a mirror will not reach the primary server or the other mirror, and since the DNS server gives out addresses sequentially (i.e. user 1 gets the first one, user 2 gets the second one), the three servers can easily get out of sync.
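To make the drift concrete, here is a minimal Python sketch of the simplified model described above, where each new user is handed the next address in turn. The server labels and edit names are made up for illustration, and real resolvers cache answers, so actual traffic would not rotate this cleanly.

```python
from itertools import cycle

# Hypothetical labels for the three servers; the primary is assumed to be USC.
SERVERS = ["usc", "caltech", "stanford"]
PRIMARY = "usc"

def assign_round_robin(edits, servers=SERVERS):
    """Pair each edit with the server a strictly sequential round-robin would pick."""
    rotation = cycle(servers)
    return [(edit, next(rotation)) for edit in edits]

def edits_missing_from_primary(assignments, primary=PRIMARY):
    """Edits that landed on a mirror; a one-way rsync from the primary never carries them back."""
    return [edit for edit, server in assignments if server != primary]

assignments = assign_round_robin(["edit-1", "edit-2", "edit-3", "edit-4"])
lost = edits_missing_from_primary(assignments)
print(assignments)
print(lost)
```

In this model, two of every three edits land on a mirror and never reach the primary.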

  1. One solution is pointing response.scec.org at a load balancer, which would then check if the primary server is online and return an appropriate IP address, only directing visitors to the mirrors if the primary is offline.  Since Caltech IMSS cannot host a load balancer for us, we will have to set up and maintain a separate machine that becomes another single point of failure.
  2. Another solution is to simply set response.scec.org to point to the server at USC, only add a mirror to DNS if the primary server goes down, and sync the servers again after the primary server comes back online. However, updating DNS requires us to email the hostmaster and the hostmaster to actually make the change. Since the hostmaster is a group of people working normal business hours and not an automated script, this is a slow process.
  3. Yet another solution is to use round-robin DNS but set Drupal to use one primary database. Right now each server uses its own local database, and the database on the primary server is rsync'ed with the mirrors. Most changes to a Drupal website are stored in the database, except for configuration changes, which require modifying files on the web server. If we choose this option, users will be directed to any of the three servers, but changes they make will be stored in only one database which is then rsync'ed with the other databases. We will need to modify Drupal to use one database and to use another database if that primary database is offline. We also need to ensure that the databases allow remote access from all three mirrors.
  4. Another option, similar to the previous, is to set up MySQL replication instead of using rsync to keep the databases in sync and modifying Drupal to fail over to a different database.
  5. Finally, another option is to use round-robin DNS but keep the mirrors offline until they are needed. This requires manual intervention to get the mirror servers online. It also does not work for every mirror: the Caltech server, at least, is not solely dedicated to the earthquake response site and needs to stay online for other purposes.
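
Options 1 and 3 both come down to the same rule: use the primary while it is reachable, and fall back to a mirror only when it is not. A rough Python sketch of that rule follows. The function name `pick_server`, the host labels, and the stubbed health check are all assumptions for illustration; this is not how Drupal or any particular load balancer implements failover.

```python
from typing import Callable, Optional, Sequence

def pick_server(primary: str,
                mirrors: Sequence[str],
                is_online: Callable[[str], bool]) -> Optional[str]:
    """Return the primary if it is up, else the first mirror that is up, else None."""
    if is_online(primary):
        return primary
    for mirror in mirrors:
        if is_online(mirror):
            return mirror
    return None  # nothing reachable; the caller has to decide what to show

# Stubbed health check: pretend the primary is down and both mirrors are up.
# A real check might open a TCP connection or request a status page.
down = {"usc"}
chosen = pick_server("usc", ["caltech", "stanford"], lambda host: host not in down)
print(chosen)  # caltech
```

The same function would apply whether the names refer to web servers (option 1) or database hosts (option 3). Note that whatever machine runs this check becomes its own point of failure, as mentioned in option 1.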

Currently, response.scec.org is pointing only at the USC server until the beta site is set up on the mirrors. John Yu is looking into option 3. Does anyone have any other ideas for keeping the Eq. Response site servers redundant and in sync?