In-Depth - Database Services - High-Availability Solutions

We chose to use MySQL as our database engine and management system due it being extremely popular on the UNIX platform. UNIX and MySQL simply go hand-in-hand. The load balancing part of this section of the project was quite easy, but the replication on the other hand proved to be tricky which we will discuss in a later section. The nature of databases is actually very delicate as databases deal with dynamic content – the type that changes very quickly, especially in a network with many users or websites.

MySQL DBMS was load balanced through our Heartbeat package. We have added a special

configuration to our /etc/ha.d/ldirectord.cf file, which also allows many other things for us to load balance, such as LDAP, POP3/SMTP, DNS, DHCP, and such.

Table 9: MySQL Specific Configuration in /etc/ha.d/ldirectord.cf

virtual=192.168.2.235:3306

Whichever content management system we used had to go through the virtual IP that would direct the queries to whichever database system was available. Since the virtual IP caused the user to connect to only one database on that one system, whatever query was run would be recorded in that specific database on that specific system. This means that the data would not show up on the other database on the other system. As part of our goal of this project, we decided to implement a

44 master-to-master database replication scheme to overcome this issue. Whatever was recorded on one node was done instantaneously on the other one as well which kept both of them in perfect sync. Also, if one node were to fail, once it got back up, the databases would sync to the one with the latest information. Since this was a simple two-node configuration, we decided to use the regular MySQL package. If we were to use more than 2 nodes, then we would have to deploy the MySQL Cluster package (free Community download available). MySQL Cluster is a package that encompasses the MySQL base server package and provides multi-node clustering capabilities. It is more of a dedicated project aimed at providing true enterprise clustering services with maximum availability.

The replication configuration was configured in the /etc/mysql/my.cnf file since everything was performed inside of MySQL:

Table 10: /etc/mysql/my.cnf Output

# The MySQL database server configuration file.

# You can copy this to one of:

# - "/etc/mysql/my.cnf" to set global options,

# - "~/.my.cnf" to set user-specific options.

# One can use all long options that the program supports.

# Run program with --help to get a list of available options and with

# --print-defaults to see which it would actually understand and use.

# For explanations see

# http://dev.mysql.com/doc/mysql/en/server-system-variables.html

# This will be passed to all mysql clients

# It has been reported that passwords should be enclosed with ticks/quotes

# escpecially if they contain "#" chars...

# Remember to edit /etc/mysql/debian.cnf when changing the socket location.

[client]

port = 3306

socket = /var/run/mysqld/mysqld.sock

# Here is entries for some specific programs

# The following values assume you have at least 32M ram

# This was formally known as [safe_mysqld]. Both versions are currently parsed.

[mysqld_safe]

# If you make changes to these settings and your system uses apparmor, you may

# also need to also adjust /etc/apparmor.d/usr.sbin.mysqld.

# Instead of skip-networking the default is now to listen only on

# localhost which is more compatible and is not less secure.

#bind-address = 0.0.0.0

# Both location gets rotated by the cronjob.

# Be aware that this log type is a performance killer.

#log = /var/log/mysql/mysql.log

# Error logging goes to syslog. This is a Debian improvement :)

# Here you can see queries with especially long duration

#log_slow_queries = /var/log/mysql/mysql-slow.log

#long_query_time = 2

#log-queries-not-using-indexes

# The following can be used as easy to replay backup logs or for replication.

# note: if you are setting up a replication slave, see README.Debian about

# other settings you may need to change.

# Using BerkeleyDB is now discouraged as its support will cease in 5.1.12.

skip-bdb

# * InnoDB

# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.

# Read the manual for more InnoDB related options. There are many!

# You might want to disable InnoDB to shrink the mysqld process by circa 100MB.

#skip-innodb

# * Security Features

# Read the manual, too, if you want chroot!

# chroot = /var/lib/mysql/

# For generating SSL certificates I recommend the OpenSSL GUI "tinyca".

#no-auto-rehash # faster start of mysql but no tab completition

[isamchk]

key_buffer = 16M

# * NDB Cluster

# See /usr/share/doc/mysql-server-*/README.Debian for more information.

# The following configuration is read by the NDB Data Nodes (ndbd processes)

# not from the NDB Management Nodes (ndb_mgmd processes).

# [MYSQL_CLUSTER]

# ndb-connectstring=127.0.0.1

# * IMPORTANT: Additional settings that can override those from this file!

# The files must end with '.cnf', otherwise they'll be ignored.

!includedir /etc/mysql/conf.d/

The above is an entire MySQL configuration file. It seems big and scary at first, but if one really takes the time to read it through, it appears a lot simpler that one thinks. Plus it’s far shorter than the Apache web server’s configuration file. What we are really interested in though, is the following set of lines:

Table 11: Simple MySQL Replication Configuration

server-id = 2

Since both server nodes were “master” servers to each other, they both had almost the same

configs. The only difference laid in the server-id and the master-* set, which are always the opposite of the server with whom we were performing replication. Now, let’s break down this config.

The master-* settings set the master server from which the holder of this configuration set will replicate the database data from. Since both servers are using master-to-master replication, they both will have the same configurations and will contain information of the opposite server. The log_bin part is important in a way that it specifies the Binary Log file of the database which provides the information concerning replication and where it was left off. Technically, the file has to be the same on both nodes in order for both servers to have the same data within their databases. On another hand, binlog_do_db and replicate_do_db attributes specify the databases to be replicated (of course, the settings in this section have to be the same on both servers) through the Binary Log whereas the binlog_ignore_db specifies which databases to ignore. Binary Log contains every statement concerning the update data since all the statements and queries are store as events within the log. This log gets sent from the master server to the slave servers so the slave server can

49 replicate the master database. This file is also useful for database recovery as it contains everything that has happened in the database. The file is written immediately after the SQL query completes.

The file can be also further inspected using the mysqlbinlog utility which comes standard with every base installation.

We also have miniscule options such as the Binary Log file size and how often it should be deleted, but they do not need to be discussed at this moment. There is a lot more options that be used to enhance MySQL replication, but the above is the actually juice that fuels it and gets the job done. If one were to read the MySQL Reference Manual, they would also find all sorts of options that could be used with the replication.

In document High-Availability Solutions (Page 43-50)