MySQL Cluster: HA als StorageEngine

(1)

MySQL Cluster: HA als StorageEngine

Erkan Yanar

linsenraum.de

(2)

What About

Hochverf¨

ugbarkeit mit MySQL

Topic

Wir wollen Hochverf¨ugbarkeit Daten

Service

(3)

L¨

osungsm¨

oglichkeiten:

HA auf Basis von Replikation

Replikation

Datenredundanz

Slave read only (Scaleout) HA

(4)

Datenredundanz/Verf¨ugbarkeit Replication

L¨

osungsm¨

oglichkeiten:

Replikation

Datenredundanz Slave read only (Scaleout)

HA

(5)

L¨

osungsm¨

oglichkeiten:

Replikation

HA

(6)

Datenredundanz/Verf¨ugbarkeit Replication

L¨

osungsm¨

oglichkeiten:

Replikation

HA

(7)

Sicherheit semisyncvs. Performance async

Failover Daten

(8)

Datenredundanz/Verf¨ugbarkeit Blocklevel

Blocklevel

2 Nodes Limit Nur ein Rechner Failover

Daten

Recoverydowntime Keine Skalierung

(9)

Blocklevel

2 Nodes Limit Nur ein Rechner Failover

Daten

Recoverydowntime Keine Skalierung

(10)

Datenredundanz/Verf¨ugbarkeit StorageEngines Modulares MySQL Storage Engine Authentication INFORMATION SCHEMA General Replication Audit . . .

(11)

Modulares MySQL Storage Engine Authentication INFORMATION SCHEMA General Replication Audit . . .

(12)

Datenredundanz/Verf¨ugbarkeit StorageEngines

Local Storage Engine

(13)

Storage Engines

CREATE TABLE tablename (...) ENGINE=...

Local Storage INNODB/XtraDB MYISAM Aria PBXT OQGraph . . .

(14)

Weite Welt

(15)

Network Storage FedratedX SpiderSE NDB (Network Database) CREATE TABLE tbl a( col a int, col b int, PRIMARY KEY(col a) ) ENGINE = FEDERATED CONNECTION ’mysql://ich:[email protected]/db/tbl a’;

(16)

Network Storage

FedratedX

SpiderSE

NDB (Network Database)

CREATE TABLE tbl a( col a int, col b int, PRIMARY KEY(col a)) ENGINE = SPIDER

CONNECTION ’ table tbl a, user ich, password auch ’ PARTITION BY RANGE( col a ) (

PARTITION p1 VALUES LESS THAN (1000) COMMENT ’host 192.168.178.25’, PARTITION p2 VALUES LESS THAN (2000) COMMENT ’host 192.168.178.26’,

(17)

Network Storage

FedratedX SpiderSE

NDB (Network Database)

(18)

(19)

Network Storage Speicher Engine Shared Nothing HA: 99,999% ACID READ COMMITTED Row Locking Two Phase Commit Keine FK

Key/Value Applikation! ALTER TABLE . . . ENGINE=NDB Hash-Partitionierung/Sharding

(20)

MySQL Cluster Komponenten

Komponenten

ndb(mt)d mgmd MySQL/API

(21)

Komponenten

ndb(mt)d mgmd MySQL/API Datenknoten Partitionierung/Redundanz 2-Phase-Commit Failover max. 48 Nodes

(22)

MySQL Cluster Komponenten

Komponenten

ndb(mt)d mgmd MySQL/API Managementdeamon/knoten: Konfiguration Logging Arbitrator Managementinterface

(23)

Komponenten

ndb(mt)d mgmd

MySQL/API

API

Kommunikation via ndb-api ENGINE=NDB

(24)

MySQL Cluster NoOfReplicas

Implementing HA

ndb(mt)d NoOfReplicas Primary lead Failover

(25)

Impelenting HA: 2PC

Datenkonsitenz a la 2PC TransactionCoordinator Prepare Commit Latenz

(26)

MySQL Cluster 2PC

Impelenting HA: 2PC

100.000 Inserts:

InnoDB 0m7.911s

(27)

INSERT/UPDATE

Changing Data Parallel Inserts Parallel Updates Multi Data-Nodes Multi Api-Nodes!

(28)

MySQL Cluster SELECTs

SELECT

NDB SELECTs Key Lookups Nested Loop Join Push-Down-Join (SPJ) Parallel Scans

(29)

a SELECT id2 FROM a WHERE id=@a; 100.000 x uniform

c SELECT COUNT(*) FROM a JOIN b using (id) where a.id=100; 100.000x

d SELECT COUNT(id2) FROM a JOIN b using (id2) where a.id=1000;

Query InnoDB 5.5.20 ndb 7.1.0 ndb 7.2.5

a 8.526s 29.290s 29.290s

b 0.130s 16.290s 0.750s

c 8.131s 40.285s 44.043s

(30)

b SELECT COUNT(*) FROM a JOIN b using (id);

a 8.526s 29.290s 29.290s

b 0.130s 16.290s SPJ 0.750s

c 8.131s 40.285s 44.043s

(31)

a 8.526s 29.290s 29.290s

b 0.130s 16.290s 0.750s

c 8.131s 40.285s 44.043s

(32)

a 8.526s 29.290s 29.290s

b 0.130s 16.290s 0.750s

c 8.131s 40.285s 44.043s

(33)

a 8.526s 29.290s 29.290s

b 0.130s 16.290s 0.750s

c 8.131s 40.285s 44.043s

(34)

MySQL Cluster Making the whole Cluster HA

Begrenztes HA

HA der MySQLd? Loadbalancer Connectoren Pacemaker

(35)

UseCases

Viele kleine Trans. Key/Value Lookups HA Hoher Durchsatz Telco Gaming . . .

(36)

Memory

Memory Only NDB

Speicheraufteilung IndexMemory DataMemory Hard Limit UNIQUE/PK Indices

Indices und Data Pages stick to table Kein Auslagern

(37)

Memory Only NDB

Speicheraufteilung IndexMemory DataMemory Hard Limit UNIQUE/PK Indices

Indices und Data Pages stick to table

(38)

Memory

Memory Only NDB

Speicheraufteilung IndexMemory DataMemory Hard Limit UNIQUE/PK Indices Indices und Data Pages stick to table

(39)

Fr¨

ohliches Konfigurieren

IndexMemory PK, UNIQUE

DataMemory Data, (Ordered) Indices,

StringMemory Out of string memory, ...Permanent error: Schema error

MaxNoOfOrderedIndexes Maximale Zahl der Ordered Indices

MaxNoOfAttributes Max. Tabellen, Indices, Columns etc.

MaxNoOfUniqueHashIndexes Max. Unique Indices die nicht PK sind.

MaxNoOfConcurrentOperations Operation Records in TC1KB

(40)

Memory Memory hitting Disk

Und was ist mit Disk?

Disk IO!

MySQL Cluster schreibt in

DiskCheckpointSpeed DataMemory(als LCP) mitTimeBetweenLocalCheckpoints Abstand auf die Disk.

Das Redo Log (NoOfFragmentLoges * FragmentLogFileSize) muss groß genug sein, um die ¨Anderungen w¨ahrend 2x LCP zu speichern.

(41)

MySQL Cluster: Konfiguration

config.ini

Verwaltet vom mgmd Definiert Mitglieder Konfiguriert Teilnehmer

(42)

Konfiguration

StartUp

host1# ndb_mgmd [--initial] --config-file=/tmp/config.ini host2# ndbd [--initial] --ndb-connectstring=host1 host3# ndbd [--initial] --ndb-connectstring=host1 host4# mysqld --ndb-connectstring=host1

(43)

config.ini

[NDB_MGMD] HostName = 10.10.10.23 NodeID = 1 [NDBD DEFAULT] NoOfReplicas = 2 DataDir = /data/cluster IndexMemory = 20M DataMemory = 100M [NDBD] NodeID = 21 HostName = 10.10.10.21 [NDBD] NodeID = 22 HostName = 10.10.10.22 [mysqld] NodeID = 60 HostName = 10.10.10.20

(44)

Konfiguration

config.ini

[NDB_MGMD] HostName = 10.10.10.23 NodeID = 1 [NDBD DEFAULT] NoOfReplicas = 2 DataDir = /data/cluster IndexMemory = 20M DataMemory = 100M NoOfFragmentLogFiles = 32 # 32 * (16M*4) TimeBetweenLocalCheckpoints = 20 # 20=4MB,21=8MB,22=16MB TimeBetweenGlobalCheckpoints = 2000 #ndbmtd: MaxNoOfExecutionThreads # cores value # 2 2 -> 1LQH # 4 4 -> 2LQH # 8 8 -> 4LQH # Partitionen= 8*LQH # Partition = 46.137.488 Rows [NDBD] NodeID = 21 HostName = 10.10.10.21 [NDBD] NodeID = 22 HostName = 10.10.10.22 [mysqld]

(45)

Prozesskonfig in my.cnf

[mysqld] ndbcluster ndb-cluster-connection-pool = 1 [mysql_cluster] ndb-connectstring = 10.10.10.23 [ndb_mgmd] configdir = /data/cluster

(46)

Maintenance Rolling Restart

Rolling Restarts/Upgrades

Cluster muss oft restartet werden :( Service soll HA sein.

auch bei Upgrades!

Solution: Rolling Rolling Rolling . . .

(47)

Rolling Restarts/Upgrades

Cluster muss oft restartet werden :( Service soll HA sein.

auch bei Upgrades!

Solution: Rolling Rolling Rolling . . .

(48)

(49)

Online Operations

ROLLING RESTART ADD INDEX DROP INDEX ADD COLUMN BACKUP ADD/CREATE NODEGROUP REORGANIZE PARTITION

(50)

Ignored Lost TableSpace (WAN) Replication memcached-api mod ndb . . .

(51)

(52)

Maintenance

Ende Gel¨

ande

Morgen Galera Cluster: (fast) alles besser!

Erkan Yanar

[email protected] . . .

linsenraum.de/erkules