MySQL Cluster: HA als StorageEngine
Erkan Yanar
linsenraum.de
What About
Hochverf¨
ugbarkeit mit MySQL
Topic
Wir wollen Hochverf¨ugbarkeit Daten
Service
L¨
osungsm¨
oglichkeiten:
HA auf Basis von Replikation
Replikation
Datenredundanz
Slave read only (Scaleout) HA
Datenredundanz/Verf¨ugbarkeit Replication
L¨
osungsm¨
oglichkeiten:
HA auf Basis von Replikation
Replikation
Datenredundanz Slave read only (Scaleout)
HA
L¨
osungsm¨
oglichkeiten:
HA auf Basis von Replikation
Replikation
Datenredundanz Slave read only (Scaleout)
HA
Datenredundanz/Verf¨ugbarkeit Replication
L¨
osungsm¨
oglichkeiten:
HA auf Basis von Replikation
Replikation
Datenredundanz Slave read only (Scaleout)
HA
Sicherheit semisyncvs. Performance async
Failover Daten
Datenredundanz/Verf¨ugbarkeit Blocklevel
Blocklevel
2 Nodes Limit Nur ein Rechner Failover
Daten
Recoverydowntime Keine Skalierung
Blocklevel
2 Nodes Limit Nur ein Rechner Failover
Daten
Recoverydowntime Keine Skalierung
Datenredundanz/Verf¨ugbarkeit StorageEngines Modulares MySQL Storage Engine Authentication INFORMATION SCHEMA General Replication Audit . . .
Modulares MySQL Storage Engine Authentication INFORMATION SCHEMA General Replication Audit . . .
Datenredundanz/Verf¨ugbarkeit StorageEngines
Local Storage Engine
Storage Engines
CREATE TABLE tablename (...) ENGINE=...
Local Storage INNODB/XtraDB MYISAM Aria PBXT OQGraph . . .
Datenredundanz/Verf¨ugbarkeit StorageEngines
Weite Welt
Network Storage FedratedX SpiderSE NDB (Network Database) CREATE TABLE tbl a( col a int, col b int, PRIMARY KEY(col a) ) ENGINE = FEDERATED CONNECTION ’mysql://ich:auch@10.10.10.24/db/tbl a’;
Datenredundanz/Verf¨ugbarkeit StorageEngines
Network Storage
FedratedX
SpiderSE
NDB (Network Database)
CREATE TABLE tbl a( col a int, col b int, PRIMARY KEY(col a)) ENGINE = SPIDER
CONNECTION ’ table tbl a, user ich, password auch ’ PARTITION BY RANGE( col a ) (
PARTITION p1 VALUES LESS THAN (1000) COMMENT ’host 192.168.178.25’, PARTITION p2 VALUES LESS THAN (2000) COMMENT ’host 192.168.178.26’,
Network Storage
FedratedX SpiderSE
NDB (Network Database)
Network Storage Speicher Engine Shared Nothing HA: 99,999% ACID READ COMMITTED Row Locking Two Phase Commit Keine FK
Key/Value Applikation! ALTER TABLE . . . ENGINE=NDB Hash-Partitionierung/Sharding
MySQL Cluster Komponenten
Komponenten
ndb(mt)d mgmd MySQL/API
Komponenten
ndb(mt)d mgmd MySQL/API Datenknoten Partitionierung/Redundanz 2-Phase-Commit Failover max. 48 NodesMySQL Cluster Komponenten
Komponenten
ndb(mt)d mgmd MySQL/API Managementdeamon/knoten: Konfiguration Logging Arbitrator ManagementinterfaceKomponenten
ndb(mt)d mgmd
MySQL/API
API
Kommunikation via ndb-api ENGINE=NDB
MySQL Cluster NoOfReplicas
Implementing HA
ndb(mt)d NoOfReplicas Primary lead FailoverImpelenting HA: 2PC
Datenkonsitenz a la 2PC TransactionCoordinator Prepare Commit LatenzMySQL Cluster 2PC
Impelenting HA: 2PC
100.000 Inserts:
InnoDB 0m7.911s
INSERT/UPDATE
Changing Data Parallel Inserts Parallel Updates Multi Data-Nodes Multi Api-Nodes!MySQL Cluster SELECTs
SELECT
NDB SELECTs Key Lookups Nested Loop Join Push-Down-Join (SPJ) Parallel Scans
MySQL Cluster SELECTs
a SELECT id2 FROM a WHERE id=@a; 100.000 x uniform
c SELECT COUNT(*) FROM a JOIN b using (id) where a.id=100; 100.000x
d SELECT COUNT(id2) FROM a JOIN b using (id2) where a.id=1000;
Query InnoDB 5.5.20 ndb 7.1.0 ndb 7.2.5
a 8.526s 29.290s 29.290s
b 0.130s 16.290s 0.750s
c 8.131s 40.285s 44.043s
MySQL Cluster SELECTs
a SELECT id2 FROM a WHERE id=@a; 100.000 x uniform
b SELECT COUNT(*) FROM a JOIN b using (id);
c SELECT COUNT(*) FROM a JOIN b using (id) where a.id=100; 100.000x
d SELECT COUNT(id2) FROM a JOIN b using (id2) where a.id=1000;
Query InnoDB 5.5.20 ndb 7.1.0 ndb 7.2.5
a 8.526s 29.290s 29.290s
b 0.130s 16.290s SPJ 0.750s
c 8.131s 40.285s 44.043s
MySQL Cluster SELECTs
a SELECT id2 FROM a WHERE id=@a; 100.000 x uniform
b SELECT COUNT(*) FROM a JOIN b using (id);
c SELECT COUNT(*) FROM a JOIN b using (id) where a.id=100; 100.000x
Query InnoDB 5.5.20 ndb 7.1.0 ndb 7.2.5
a 8.526s 29.290s 29.290s
b 0.130s 16.290s 0.750s
c 8.131s 40.285s 44.043s
MySQL Cluster SELECTs
a SELECT id2 FROM a WHERE id=@a; 100.000 x uniform
b SELECT COUNT(*) FROM a JOIN b using (id);
c SELECT COUNT(*) FROM a JOIN b using (id) where a.id=100; 100.000x
d SELECT COUNT(id2) FROM a JOIN b using (id2) where a.id=1000;
Query InnoDB 5.5.20 ndb 7.1.0 ndb 7.2.5
a 8.526s 29.290s 29.290s
b 0.130s 16.290s 0.750s
c 8.131s 40.285s 44.043s
a SELECT id2 FROM a WHERE id=@a; 100.000 x uniform
b SELECT COUNT(*) FROM a JOIN b using (id);
c SELECT COUNT(*) FROM a JOIN b using (id) where a.id=100; 100.000x
d SELECT COUNT(id2) FROM a JOIN b using (id2) where a.id=1000;
Query InnoDB 5.5.20 ndb 7.1.0 ndb 7.2.5
a 8.526s 29.290s 29.290s
b 0.130s 16.290s 0.750s
c 8.131s 40.285s 44.043s
MySQL Cluster Making the whole Cluster HA
Begrenztes HA
HA der MySQLd? Loadbalancer Connectoren PacemakerUseCases
Viele kleine Trans. Key/Value Lookups HA Hoher Durchsatz Telco Gaming . . .
Memory
Memory Only NDB
Speicheraufteilung IndexMemory DataMemory Hard Limit UNIQUE/PK IndicesIndices und Data Pages stick to table Kein Auslagern
Memory Only NDB
Speicheraufteilung IndexMemory DataMemory Hard Limit UNIQUE/PK IndicesIndices und Data Pages stick to table
Memory
Memory Only NDB
Speicheraufteilung IndexMemory DataMemory Hard Limit UNIQUE/PK Indices Indices und Data Pages stick to tableFr¨
ohliches Konfigurieren
IndexMemory PK, UNIQUE
DataMemory Data, (Ordered) Indices,
StringMemory Out of string memory, ...Permanent error: Schema error
MaxNoOfOrderedIndexes Maximale Zahl der Ordered Indices
MaxNoOfAttributes Max. Tabellen, Indices, Columns etc.
MaxNoOfUniqueHashIndexes Max. Unique Indices die nicht PK sind.
MaxNoOfConcurrentOperations Operation Records in TC1KB
Memory Memory hitting Disk
Und was ist mit Disk?
Disk IO!
MySQL Cluster schreibt in
DiskCheckpointSpeed DataMemory(als LCP) mitTimeBetweenLocalCheckpoints Abstand auf die Disk.
Das Redo Log (NoOfFragmentLoges * FragmentLogFileSize) muss groß genug sein, um die ¨Anderungen w¨ahrend 2x LCP zu speichern.
MySQL Cluster: Konfiguration
config.ini
Verwaltet vom mgmd Definiert Mitglieder Konfiguriert Teilnehmer
Konfiguration
StartUp
host1# ndb_mgmd [--initial] --config-file=/tmp/config.ini host2# ndbd [--initial] --ndb-connectstring=host1 host3# ndbd [--initial] --ndb-connectstring=host1 host4# mysqld --ndb-connectstring=host1
config.ini
[NDB_MGMD] HostName = 10.10.10.23 NodeID = 1 [NDBD DEFAULT] NoOfReplicas = 2 DataDir = /data/cluster IndexMemory = 20M DataMemory = 100M [NDBD] NodeID = 21 HostName = 10.10.10.21 [NDBD] NodeID = 22 HostName = 10.10.10.22 [mysqld] NodeID = 60 HostName = 10.10.10.20Konfiguration
config.ini
[NDB_MGMD] HostName = 10.10.10.23 NodeID = 1 [NDBD DEFAULT] NoOfReplicas = 2 DataDir = /data/cluster IndexMemory = 20M DataMemory = 100M NoOfFragmentLogFiles = 32 # 32 * (16M*4) TimeBetweenLocalCheckpoints = 20 # 20=4MB,21=8MB,22=16MB TimeBetweenGlobalCheckpoints = 2000 #ndbmtd: MaxNoOfExecutionThreads # cores value # 2 2 -> 1LQH # 4 4 -> 2LQH # 8 8 -> 4LQH # Partitionen= 8*LQH # Partition = 46.137.488 Rows [NDBD] NodeID = 21 HostName = 10.10.10.21 [NDBD] NodeID = 22 HostName = 10.10.10.22 [mysqld]Prozesskonfig in my.cnf
[mysqld] ndbcluster ndb-cluster-connection-pool = 1 [mysql_cluster] ndb-connectstring = 10.10.10.23 [ndb_mgmd] configdir = /data/clusterMaintenance Rolling Restart
Rolling Restarts/Upgrades
Cluster muss oft restartet werden :( Service soll HA sein.
auch bei Upgrades!
Solution: Rolling Rolling Rolling . . .
Rolling Restarts/Upgrades
Cluster muss oft restartet werden :( Service soll HA sein.
auch bei Upgrades!
Solution: Rolling Rolling Rolling . . .
Online Operations
ROLLING RESTART ADD INDEX DROP INDEX ADD COLUMN BACKUP ADD/CREATE NODEGROUP REORGANIZE PARTITIONIgnored Lost TableSpace (WAN) Replication memcached-api mod ndb . . .
Maintenance
Ende Gel¨
ande
Morgen Galera Cluster: (fast) alles besser!
Erkan Yanar
erkan.yanar@linsenraum.de . . .
linsenraum.de/erkules