Monday, September 26, 2016

PerconaLive Amsterdam 2016 - Talks and shows

With Oracle Open World behind us, we are now getting ready for the next big event, i.e. the European edition of PerconaLive. I am going to be a presenter three times:

  • MySQL operations in Docker is a three-hour tutorial, and it will be an expansion of the talk by the same title presented at OOW. Attendees who want to play along can do it, by coming prepared with Docker 1.11 or later and the following images already pulled (images with [+] are mandatory, while [-] are optional):

    • mysql/mysql-server [+]
    • mysql/mysql-gr [+]
    • mysql [-]
    • percona/percona-server [-]
    • mariadb [-]
    • mysql/shell [-]
    • datacharmer/mysql-minimal-5.5 [-]
    • datacharmer/mysql-minimal-5.6 [-]
    • datacharmer/mysql-minimal-5.7 [-]
    • datacharmer/mysql-minimal-8.0 [-]
    • datacharmer/my-ubuntu [-]
    • datacharmer/my-debian [-]
    • datacharmer/my-centos [-]
  • The fastest customized MySQL deployment ever is a presentation where I show two ways of deploying MySQL on a custom server, with MySQL Sandbox and Docker.

  • The lightning talks will be held during the Community dinner at Booking.com, diamond sponsor of the conference. If you want to attend, you need to register, and if you want a free ticker for that dinner, there is still ONE SLOT for the lightning talks. Contact me directly with a title and an abstract if you want to be considered for that slot (finding my email is part of the challenge, but it should not be that difficult).

UPDATE: here's the lightning talks program. Percona should eventually add it to the schedule.

  • 19:00: arrival at Booking.com by boat
  • 19:15: welcoming speech and beginning of the LT
  • 20:00 food buffet opens, LT are not finished yet
  • 20:30: LT are done, buffet still open
  • 21:15: buffet closes

The talks accepted are the following:

  • Jaime Crespo
    dbhell: a tiny Python framework for the administration and monitoring of farms of MySQL servers
  • Federico Razzoli
    How to write slow and buggy stored procedures
  • Art Van Scheppingen
    Simple household rules that keep MySQL running
  • Pavel Tru
    What internal statistics every self-respecting dbms should have!
  • Giuseppe Maxia
    Fastest, smallest, newest: download and deploy MySQL at high speed
  • Ronald Bradford
    An Awe-Inspiring Journey Through the World of Numbers

Sunday, September 25, 2016

Showing the hidden tables in MySQL 8 data dictionary

The freshly released MySQL 8.0 includes a data dictionary, which makes MySQL much more reliable. Thanks to this features, we don't have any '.frm' files, and querying the information_schema is 30x to 100x faster than previous versions.

One drawback of the implementation is that the data dictionary tables are hidden by design.

While the reason is fully understandable (they don't want to commit on an interface that may change in the future) many curious users are disappointed, because openness is the basis of good understanding and feedback.

The problem to access the dictionary tables can be split in three parts:

  • Finding the list of tables;
  • Finding the description of the tables;
  • Getting the contents of each table.

The first part is easily solved. We know that the data dictionary tables are accessed from some information_schema views (the views are defined during the initialization, at which point the DD tables are readable.)

For example:

 show create view information_schema.tables\G
*************************** 1. row ***************************
                View: TABLES
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost`
SQL SECURITY DEFINER VIEW `TABLES` AS select `cat`.`name` AS
`TABLE_CATALOG`,`sch`.`name` AS `TABLE_SCHEMA`,`tbl`.`name` AS
`TABLE_NAME`,`tbl`.`type` AS `TABLE_TYPE`,if((`tbl`.`type` = 'BASE
TABLE'),`tbl`.`engine`,NULL) AS `ENGINE`,if((`tbl`.`type` = 'VIEW'),NULL,10)
AS `VERSION`,`tbl`.`row_format` AS `ROW_FORMAT`,`stat`.`table_rows` AS
`TABLE_ROWS`,`stat`.`avg_row_length` AS `AVG_ROW_LENGTH`,`stat`.`data_length`
AS `DATA_LENGTH`,`stat`.`max_data_length` AS
`MAX_DATA_LENGTH`,`stat`.`index_length` AS `INDEX_LENGTH`,`stat`.`data_free`
AS `DATA_FREE`,`stat`.`auto_increment` AS `AUTO_INCREMENT`,`tbl`.`created` AS
`CREATE_TIME`,`stat`.`update_time` AS `UPDATE_TIME`,`stat`.`check_time` AS
`CHECK_TIME`,`col`.`name` AS `TABLE_COLLATION`,`stat`.`checksum` AS
`CHECKSUM`,if((`tbl`.`type` =
'VIEW'),NULL,get_dd_create_options(`tbl`.`options`,if((ifnull(`tbl`.`partition_expression`,'NOT_PART_TBL')
= 'NOT_PART_TBL'),0,1))) AS
`CREATE_OPTIONS`,internal_get_comment_or_error(`sch`.`name`,`tbl`.`name`,`tbl`.`type`,`tbl`.`options`,`tbl`.`comment`)
AS `TABLE_COMMENT` from ((((`mysql`.`tables` `tbl` join `mysql`.`schemata`
`sch` on((`tbl`.`schema_id` = `sch`.`id`))) join `mysql`.`catalogs` `cat`
on((`cat`.`id` = `sch`.`catalog_id`))) left join `mysql`.`collations` `col`
on((`tbl`.`collation_id` = `col`.`id`))) left join `mysql`.`table_stats`
`stat` on(((`tbl`.`name` = `stat`.`table_name`) and (`sch`.`name` =
`stat`.`schema_name`)))) where (can_access_table(`sch`.`name`,`tbl`.`name`)
and (not(`tbl`.`hidden`)))
character_set_client: latin1
collation_connection: latin1_swedish_ci
1 row in set (0.00 sec)

Here we see many tables (such as mysql.schemata or mysql.collations) that don't show up when we run SHOW TABLES in the mysql database.

We can use a script like this to get the list of all tables:

MYSQL=~/sandboxes/msb_full_8_0_0/use
TABLES=$($MYSQL  information_schema -BN -e 'show tables')

function show_tables
{
    for T in $TABLES
    do
        is_view=$($MYSQL information_schema -BN -e "show create table $T\G" | grep ALGORITHM)
        if [ -n "$is_view" ]
        then
            $MYSQL information_schema -e "show create table $T\G" \
               | perl -lne 'print $1 while /mysql.\..(\w+)/g'
        fi
    done
}
show_tables | sort | uniq

This script searches all information_schema tables, skips the ones that are not views, and then searches in the view definition every table from the mysql database. The result is this:

catalogs
character_sets
collations
columns
foreign_key_column_usage
foreign_keys
index_column_usage
indexes
index_stats
schemata
tables
table_stats

Good. Now we have the list of tables that we can't see. The second operation is getting the description.

So, I looked at the source code, and I found out where the prohibition originated. From there, I saw that the table is accessible when the variable skip_dd_table_access_check is set. Looking at the variables inside the server, I did not find any skip_dd_table_access_check, as I was expecting, since it would not make sense to provide this information in the open after going through the pains of making all DD tables unreachable.

I searched the code for the string skip_dd_table_access_check and I found out how it is used in the test suite. The key is using the debug build of the MySQL server.

Using MySQL Sandbox, with a sandbox made from the full tarball of MySQL 8.0, I run:

 ~/sandboxes/msb_full_8_0_0/restart --mysqld=mysqld-debug

Now I have loaded the debug-enabled server. Let's try:

$ ~/sandboxes/msb_full_8_0_0/use mysql
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 91
Server version: 8.0.0-dmr-debug MySQL Community Server - Debug (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql [localhost] {msandbox} (mysql) > show create table schemata\G
ERROR 3554 (HY000): Access to system table 'mysql.schemata' is rejected.

Having the debug build is not enough. We need to use the magic spell.

mysql [localhost] {msandbox} (mysql) > SET SESSION debug='+d,skip_dd_table_access_check';
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (mysql) > show create table schemata\G
*************************** 1. row ***************************
       Table: schemata
Create Table: CREATE TABLE `schemata` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `catalog_id` bigint(20) unsigned NOT NULL,
  `name` varchar(64) COLLATE utf8_bin NOT NULL,
  `default_collation_id` bigint(20) unsigned NOT NULL,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `last_altered` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `catalog_id` (`catalog_id`,`name`),
  KEY `default_collation_id` (`default_collation_id`),
  CONSTRAINT `schemata_ibfk_1` FOREIGN KEY (`catalog_id`) REFERENCES `catalogs` (`id`),
  CONSTRAINT `schemata_ibfk_2` FOREIGN KEY (`default_collation_id`) REFERENCES  `collations` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
1 row in set (0.00 sec)

Yay! The table is now visible! We can modify the above script as follows:

mysql_tables=$(show_tables | sort | uniq)
for T in $mysql_tables
do
    echo "-- $T "
    $MYSQL -e "SET SESSION debug= '+d,skip_dd_table_access_check'; show create table mysql.$T\G"
done

And we get the description of every table in the data dictionary. Here it goes:

-- catalogs
*************************** 1. row ***************************
       Table: catalogs
Create Table: CREATE TABLE `catalogs` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(64) COLLATE utf8_bin NOT NULL,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `last_altered` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
-- character_sets
*************************** 1. row ***************************
       Table: character_sets
Create Table: CREATE TABLE `character_sets` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(64) CHARACTER SET utf8 NOT NULL,
  `default_collation_id` bigint(20) unsigned NOT NULL,
  `comment` varchar(2048) COLLATE utf8_bin NOT NULL,
  `mb_max_length` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `name` (`name`),
  KEY `default_collation_id` (`default_collation_id`),
  CONSTRAINT `character_sets_ibfk_1` FOREIGN KEY (`default_collation_id`) REFERENCES `collations` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=249 DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
-- collations
*************************** 1. row ***************************
       Table: collations
Create Table: CREATE TABLE `collations` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(64) CHARACTER SET utf8 NOT NULL,
  `character_set_id` bigint(20) unsigned NOT NULL,
  `is_compiled` tinyint(1) NOT NULL,
  `sort_length` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `name` (`name`),
  KEY `character_set_id` (`character_set_id`),
  CONSTRAINT `collations_ibfk_1` FOREIGN KEY (`character_set_id`) REFERENCES `character_sets` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=278 DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
-- columns
*************************** 1. row ***************************
       Table: columns
Create Table: CREATE TABLE `columns` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `table_id` bigint(20) unsigned NOT NULL,
  `name` varchar(64) CHARACTER SET utf8 COLLATE utf8_tolower_ci NOT NULL,
  `ordinal_position` int(10) unsigned NOT NULL,
  `type` enum(/* removed */) COLLATE utf8_bin NOT NULL,
  `is_nullable` tinyint(1) NOT NULL,
  `is_zerofill` tinyint(1) DEFAULT NULL,
  `is_unsigned` tinyint(1) DEFAULT NULL,
  `char_length` int(10) unsigned DEFAULT NULL,
  `numeric_precision` int(10) unsigned DEFAULT NULL,
  `numeric_scale` int(10) unsigned DEFAULT NULL,
  `datetime_precision` int(10) unsigned DEFAULT NULL,
  `collation_id` bigint(20) unsigned DEFAULT NULL,
  `has_no_default` tinyint(1) DEFAULT NULL,
  `default_value` blob,
  `default_value_utf8` text COLLATE utf8_bin,
  `default_option` blob,
  `update_option` varchar(32) COLLATE utf8_bin DEFAULT NULL,
  `is_auto_increment` tinyint(1) DEFAULT NULL,
  `is_virtual` tinyint(1) DEFAULT NULL,
  `generation_expression` longblob,
  `generation_expression_utf8` longtext COLLATE utf8_bin,
  `comment` varchar(2048) COLLATE utf8_bin NOT NULL,
  `hidden` tinyint(1) NOT NULL,
  `options` mediumtext COLLATE utf8_bin,
  `se_private_data` mediumtext COLLATE utf8_bin,
  `column_key` enum('','PRI','UNI','MUL') COLLATE utf8_bin NOT NULL,
  `column_type_utf8` mediumtext COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `table_id` (`table_id`,`name`),
  UNIQUE KEY `table_id_2` (`table_id`,`ordinal_position`),
  KEY `collation_id` (`collation_id`),
  CONSTRAINT `columns_ibfk_1` FOREIGN KEY (`table_id`) REFERENCES `tables` (`id`),
  CONSTRAINT `columns_ibfk_2` FOREIGN KEY (`collation_id`) REFERENCES `collations` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3450 DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
-- foreign_key_column_usage
*************************** 1. row ***************************
       Table: foreign_key_column_usage
Create Table: CREATE TABLE `foreign_key_column_usage` (
  `foreign_key_id` bigint(20) unsigned NOT NULL,
  `ordinal_position` int(10) unsigned NOT NULL,
  `column_id` bigint(20) unsigned NOT NULL,
  `referenced_column_name` varchar(64) CHARACTER SET utf8 COLLATE utf8_tolower_ci NOT NULL,
  PRIMARY KEY (`foreign_key_id`,`ordinal_position`),
  UNIQUE KEY `foreign_key_id` (`foreign_key_id`,`column_id`,`referenced_column_name`),
  KEY `column_id` (`column_id`),
  CONSTRAINT `foreign_key_column_usage_ibfk_1` FOREIGN KEY (`foreign_key_id`) REFERENCES `foreign_keys` (`id`),
  CONSTRAINT `foreign_key_column_usage_ibfk_2` FOREIGN KEY (`column_id`) REFERENCES `columns` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
-- foreign_keys
*************************** 1. row ***************************
       Table: foreign_keys
Create Table: CREATE TABLE `foreign_keys` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `schema_id` bigint(20) unsigned NOT NULL,
  `table_id` bigint(20) unsigned NOT NULL,
  `name` varchar(64) CHARACTER SET utf8 NOT NULL,
  `unique_constraint_id` bigint(20) unsigned NOT NULL,
  `match_option` enum('NONE','PARTIAL','FULL') COLLATE utf8_bin NOT NULL,
  `update_rule` enum('NO ACTION','RESTRICT','CASCADE','SET NULL','SET DEFAULT') COLLATE utf8_bin NOT NULL,
  `delete_rule` enum('NO ACTION','RESTRICT','CASCADE','SET NULL','SET DEFAULT') COLLATE utf8_bin NOT NULL,
  `referenced_table_catalog` varchar(64) COLLATE utf8_bin NOT NULL,
  `referenced_table_schema` varchar(64) COLLATE utf8_bin NOT NULL,
  `referenced_table_name` varchar(64) COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `schema_id` (`schema_id`,`name`),
  UNIQUE KEY `table_id` (`table_id`,`name`),
  KEY `unique_constraint_id` (`unique_constraint_id`),
  CONSTRAINT `foreign_keys_ibfk_1` FOREIGN KEY (`schema_id`) REFERENCES `schemata` (`id`),
  CONSTRAINT `foreign_keys_ibfk_2` FOREIGN KEY (`unique_constraint_id`) REFERENCES `indexes` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=46 DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
-- index_column_usage
*************************** 1. row ***************************
       Table: index_column_usage
Create Table: CREATE TABLE `index_column_usage` (
  `index_id` bigint(20) unsigned NOT NULL,
  `ordinal_position` int(10) unsigned NOT NULL,
  `column_id` bigint(20) unsigned NOT NULL,
  `length` int(10) unsigned DEFAULT NULL,
  `order` enum('UNDEF','ASC','DESC') COLLATE utf8_bin NOT NULL,
  `hidden` tinyint(1) NOT NULL,
  UNIQUE KEY `index_id` (`index_id`,`ordinal_position`),
  UNIQUE KEY `index_id_2` (`index_id`,`column_id`,`hidden`),
  KEY `f2` (`column_id`),
  CONSTRAINT `index_column_usage_ibfk_1` FOREIGN KEY (`index_id`) REFERENCES `indexes` (`id`),
  CONSTRAINT `index_column_usage_ibfk_2` FOREIGN KEY (`column_id`) REFERENCES `columns` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
-- indexes
*************************** 1. row ***************************
       Table: indexes
Create Table: CREATE TABLE `indexes` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `table_id` bigint(20) unsigned NOT NULL,
  `name` varchar(64) CHARACTER SET utf8 COLLATE utf8_tolower_ci NOT NULL,
  `type` enum('PRIMARY','UNIQUE','MULTIPLE','FULLTEXT','SPATIAL') COLLATE utf8_bin NOT NULL,
  `algorithm` enum('SE_SPECIFIC','BTREE','RTREE','HASH','FULLTEXT') COLLATE utf8_bin NOT NULL,
  `is_algorithm_explicit` tinyint(1) NOT NULL,
  `is_visible` tinyint(1) NOT NULL,
  `is_generated` tinyint(1) NOT NULL,
  `hidden` tinyint(1) NOT NULL,
  `ordinal_position` int(10) unsigned NOT NULL,
  `comment` varchar(2048) COLLATE utf8_bin NOT NULL,
  `options` mediumtext COLLATE utf8_bin,
  `se_private_data` mediumtext COLLATE utf8_bin,
  `tablespace_id` bigint(20) unsigned DEFAULT NULL,
  `engine` varchar(64) COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `table_id` (`table_id`,`name`),
  KEY `tablespace_id` (`tablespace_id`),
  CONSTRAINT `indexes_ibfk_1` FOREIGN KEY (`table_id`) REFERENCES `tables` (`id`),
  CONSTRAINT `indexes_ibfk_2` FOREIGN KEY (`tablespace_id`) REFERENCES `tablespaces` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=235 DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
-- index_stats
*************************** 1. row ***************************
       Table: index_stats
Create Table: CREATE TABLE `index_stats` (
  `schema_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `table_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `index_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `column_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `cardinality` bigint(20) unsigned DEFAULT NULL,
  UNIQUE KEY `schema_name` (`schema_name`,`table_name`,`index_name`,`column_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
-- schemata
*************************** 1. row ***************************
       Table: schemata
Create Table: CREATE TABLE `schemata` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `catalog_id` bigint(20) unsigned NOT NULL,
  `name` varchar(64) COLLATE utf8_bin NOT NULL,
  `default_collation_id` bigint(20) unsigned NOT NULL,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `last_altered` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `catalog_id` (`catalog_id`,`name`),
  KEY `default_collation_id` (`default_collation_id`),
  CONSTRAINT `schemata_ibfk_1` FOREIGN KEY (`catalog_id`) REFERENCES `catalogs` (`id`),
  CONSTRAINT `schemata_ibfk_2` FOREIGN KEY (`default_collation_id`) REFERENCES `collations` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
-- tables
*************************** 1. row ***************************
       Table: tables
Create Table: CREATE TABLE `tables` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `schema_id` bigint(20) unsigned NOT NULL,
  `name` varchar(64) COLLATE utf8_bin NOT NULL,
  `type` enum('BASE TABLE','VIEW','SYSTEM VIEW') COLLATE utf8_bin NOT NULL,
  `engine` varchar(64) CHARACTER SET utf8 NOT NULL,
  `mysql_version_id` int(10) unsigned NOT NULL,
  `row_format` enum('Fixed','Dynamic','Compressed','Redundant','Compact','Paged') COLLATE utf8_bin DEFAULT NULL,
  `collation_id` bigint(20) unsigned DEFAULT NULL,
  `comment` varchar(2048) COLLATE utf8_bin NOT NULL,
  `hidden` tinyint(1) NOT NULL,
  `options` mediumblob,
  `se_private_data` mediumtext COLLATE utf8_bin,
  `se_private_id` bigint(20) unsigned DEFAULT NULL,
  `tablespace_id` bigint(20) unsigned DEFAULT NULL,
  `partition_type` enum(/*removed*/) COLLATE utf8_bin DEFAULT NULL,
  `partition_expression` varchar(2048) COLLATE utf8_bin DEFAULT NULL,
  `default_partitioning` enum('NO','YES','NUMBER') COLLATE utf8_bin DEFAULT NULL,
  `subpartition_type` enum('HASH','KEY_51','KEY_55','LINEAR_HASH','LINEAR_KEY_51','LINEAR_KEY_55') COLLATE utf8_bin DEFAULT NULL,
  `subpartition_expression` varchar(2048) COLLATE utf8_bin DEFAULT NULL,
  `default_subpartitioning` enum('NO','YES','NUMBER') COLLATE utf8_bin DEFAULT NULL,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `last_altered` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `view_definition` longblob,
  `view_definition_utf8` longtext COLLATE utf8_bin,
  `view_check_option` enum('NONE','LOCAL','CASCADED') COLLATE utf8_bin DEFAULT NULL,
  `view_is_updatable` enum('NO','YES') COLLATE utf8_bin DEFAULT NULL,
  `view_algorithm` enum('UNDEFINED','TEMPTABLE','MERGE') COLLATE utf8_bin DEFAULT NULL,
  `view_security_type` enum('DEFAULT','INVOKER','DEFINER') COLLATE utf8_bin DEFAULT NULL,
  `view_definer` varchar(93) COLLATE utf8_bin DEFAULT NULL,
  `view_client_collation_id` bigint(20) unsigned DEFAULT NULL,
  `view_connection_collation_id` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `schema_id` (`schema_id`,`name`),
  UNIQUE KEY `engine` (`engine`,`se_private_id`),
  KEY `engine_2` (`engine`),
  KEY `collation_id` (`collation_id`),
  KEY `tablespace_id` (`tablespace_id`),
  CONSTRAINT `tables_ibfk_1` FOREIGN KEY (`schema_id`) REFERENCES `schemata` (`id`),
  CONSTRAINT `tables_ibfk_2` FOREIGN KEY (`collation_id`) REFERENCES `collations` (`id`),
  CONSTRAINT `tables_ibfk_3` FOREIGN KEY (`tablespace_id`) REFERENCES `tablespaces` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=322 DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
-- table_stats
*************************** 1. row ***************************
       Table: table_stats
Create Table: CREATE TABLE `table_stats` (
  `schema_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `table_name` varchar(64) COLLATE utf8_bin NOT NULL,
  `table_rows` bigint(20) unsigned DEFAULT NULL,
  `avg_row_length` bigint(20) unsigned DEFAULT NULL,
  `data_length` bigint(20) unsigned DEFAULT NULL,
  `max_data_length` bigint(20) unsigned DEFAULT NULL,
  `index_length` bigint(20) unsigned DEFAULT NULL,
  `data_free` bigint(20) unsigned DEFAULT NULL,
  `auto_increment` bigint(20) unsigned DEFAULT NULL,
  `checksum` bigint(20) unsigned DEFAULT NULL,
  `update_time` timestamp NULL DEFAULT NULL,
  `check_time` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`schema_name`,`table_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC

In addition to the tables referred in views, there are three that are mentioned in the documentation but not implemented as a view in information_schema: triggers, events, and routines.


show create table triggers\G
*************************** 1. row ***************************
       Table: triggers
Create Table: CREATE TABLE `triggers` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `schema_id` bigint(20) unsigned NOT NULL,
  `name` varchar(64) CHARACTER SET utf8 NOT NULL,
  `event_type` enum('INSERT','UPDATE','DELETE') COLLATE utf8_bin NOT NULL,
  `table_id` bigint(20) unsigned NOT NULL,
  `action_timing` enum('BEFORE','AFTER') COLLATE utf8_bin NOT NULL,
  `action_order` int(10) unsigned NOT NULL,
  `action_statement` longblob NOT NULL,
  `action_statement_utf8` longtext COLLATE utf8_bin NOT NULL,
  `created` timestamp(2) NOT NULL DEFAULT CURRENT_TIMESTAMP(2) ON UPDATE CURRENT_TIMESTAMP(2),
  `last_altered` timestamp(2) NOT NULL DEFAULT CURRENT_TIMESTAMP(2),
  `sql_mode` set(/*removed*/) COLLATE utf8_bin NOT NULL,
  `definer` varchar(93) COLLATE utf8_bin NOT NULL,
  `client_collation_id` bigint(20) unsigned NOT NULL,
  `connection_collation_id` bigint(20) unsigned NOT NULL,
  `schema_collation_id` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `schema_id` (`schema_id`,`name`),
  UNIQUE KEY `table_id` (`table_id`,`event_type`,`action_timing`,`action_order`),
  KEY `client_collation_id` (`client_collation_id`),
  KEY `connection_collation_id` (`connection_collation_id`),
  KEY `schema_collation_id` (`schema_collation_id`),
  CONSTRAINT `triggers_ibfk_1` FOREIGN KEY (`schema_id`) REFERENCES `schemata` (`id`),
  CONSTRAINT `triggers_ibfk_2` FOREIGN KEY (`table_id`) REFERENCES `tables` (`id`),
  CONSTRAINT `triggers_ibfk_3` FOREIGN KEY (`client_collation_id`) REFERENCES `collations` (`id`),
  CONSTRAINT `triggers_ibfk_4` FOREIGN KEY (`connection_collation_id`) REFERENCES `collations` (`id`),
  CONSTRAINT `triggers_ibfk_5` FOREIGN KEY (`schema_collation_id`) REFERENCES `collations` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC
show create table events\G
*************************** 1. row ***************************
       Table: events
Create Table: CREATE TABLE `events` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `schema_id` bigint(20) unsigned NOT NULL,
  `name` varchar(64) CHARACTER SET utf8 NOT NULL,
  `definer` varchar(93) COLLATE utf8_bin NOT NULL,
  `time_zone` varchar(64) COLLATE utf8_bin NOT NULL,
  `definition` longblob NOT NULL,
  `definition_utf8` longtext COLLATE utf8_bin NOT NULL,
  `execute_at` datetime DEFAULT NULL,
  `interval_value` int(11) DEFAULT NULL,
  `interval_field` enum('YEAR','QUARTER','MONTH','DAY','HOUR','MINUTE','WEEK','SECOND','MICROSECOND','YEAR_MONTH','DAY_HOUR','DAY_MINUTE','DAY_SECOND','HOUR_MINUTE','HOUR_SECOND','MINUTE_SECOND','DAY_MICROSECOND','HOUR_MICROSECOND','MINUTE_MICROSECOND','SECOND_MICROSECOND') COLLATE utf8_bin DEFAULT NULL,
  `sql_mode` set('REAL_AS_FLOAT','PIPES_AS_CONCAT','ANSI_QUOTES','IGNORE_SPACE','NOT_USED','ONLY_FULL_GROUP_BY','NO_UNSIGNED_SUBTRACTION','NO_DIR_IN_CREATE','POSTGRESQL','ORACLE','MSSQL','DB2','MAXDB','NO_KEY_OPTIONS','NO_TABLE_OPTIONS','NO_FIELD_OPTIONS','MYSQL323','MYSQL40','ANSI','NO_AUTO_VALUE_ON_ZERO','NO_BACKSLASH_ESCAPES','STRICT_TRANS_TABLES','STRICT_ALL_TABLES','NO_ZERO_IN_DATE','NO_ZERO_DATE','INVALID_DATES','ERROR_FOR_DIVISION_BY_ZERO','TRADITIONAL','NO_AUTO_CREATE_USER','HIGH_NOT_PRECEDENCE','NO_ENGINE_SUBSTITUTION','PAD_CHAR_TO_FULL_LENGTH') COLLATE utf8_bin NOT NULL,
  `starts` datetime DEFAULT NULL,
  `ends` datetime DEFAULT NULL,
  `status` enum('ENABLED','DISABLED','SLAVESIDE_DISABLED') COLLATE utf8_bin NOT NULL,
  `on_completion` enum('DROP','PRESERVE') COLLATE utf8_bin NOT NULL,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `last_altered` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `last_executed` datetime DEFAULT NULL,
  `comment` varchar(2048) COLLATE utf8_bin NOT NULL,
  `originator` int(10) unsigned NOT NULL,
  `client_collation_id` bigint(20) unsigned NOT NULL,
  `connection_collation_id` bigint(20) unsigned NOT NULL,
  `schema_collation_id` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `schema_id` (`schema_id`,`name`),
  KEY `client_collation_id` (`client_collation_id`),
  KEY `connection_collation_id` (`connection_collation_id`),
  KEY `schema_collation_id` (`schema_collation_id`),
  CONSTRAINT `events_ibfk_1` FOREIGN KEY (`schema_id`) REFERENCES `schemata` (`id`),
  CONSTRAINT `events_ibfk_2` FOREIGN KEY (`client_collation_id`) REFERENCES `collations` (`id`),
  CONSTRAINT `events_ibfk_3` FOREIGN KEY (`connection_collation_id`) REFERENCES `collations` (`id`),
  CONSTRAINT `events_ibfk_4` FOREIGN KEY (`schema_collation_id`) REFERENCES `collations` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC

show create table routines\G
*************************** 1. row ***************************
       Table: ROUTINES
Create Table: CREATE TEMPORARY TABLE `ROUTINES` (
  `SPECIFIC_NAME` varchar(64) NOT NULL DEFAULT '',
  `ROUTINE_CATALOG` varchar(512) NOT NULL DEFAULT '',
  `ROUTINE_SCHEMA` varchar(64) NOT NULL DEFAULT '',
  `ROUTINE_NAME` varchar(64) NOT NULL DEFAULT '',
  `ROUTINE_TYPE` varchar(9) NOT NULL DEFAULT '',
  `DATA_TYPE` varchar(64) NOT NULL DEFAULT '',
  `CHARACTER_MAXIMUM_LENGTH` int(21) DEFAULT NULL,
  `CHARACTER_OCTET_LENGTH` int(21) DEFAULT NULL,
  `NUMERIC_PRECISION` bigint(21) unsigned DEFAULT NULL,
  `NUMERIC_SCALE` int(21) DEFAULT NULL,
  `DATETIME_PRECISION` bigint(21) unsigned DEFAULT NULL,
  `CHARACTER_SET_NAME` varchar(64) DEFAULT NULL,
  `COLLATION_NAME` varchar(64) DEFAULT NULL,
  `DTD_IDENTIFIER` longtext,
  `ROUTINE_BODY` varchar(8) NOT NULL DEFAULT '',
  `ROUTINE_DEFINITION` longtext,
  `EXTERNAL_NAME` varchar(64) DEFAULT NULL,
  `EXTERNAL_LANGUAGE` varchar(64) DEFAULT NULL,
  `PARAMETER_STYLE` varchar(8) NOT NULL DEFAULT '',
  `IS_DETERMINISTIC` varchar(3) NOT NULL DEFAULT '',
  `SQL_DATA_ACCESS` varchar(64) NOT NULL DEFAULT '',
  `SQL_PATH` varchar(64) DEFAULT NULL,
  `SECURITY_TYPE` varchar(7) NOT NULL DEFAULT '',
  `CREATED` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `LAST_ALTERED` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `SQL_MODE` varchar(8192) NOT NULL DEFAULT '',
  `ROUTINE_COMMENT` longtext NOT NULL,
  `DEFINER` varchar(93) NOT NULL DEFAULT '',
  `CHARACTER_SET_CLIENT` varchar(32) NOT NULL DEFAULT '',
  `COLLATION_CONNECTION` varchar(32) NOT NULL DEFAULT '',
  `DATABASE_COLLATION` varchar(32) NOT NULL DEFAULT ''
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Looking at the code again, I see that there are also tables tablespaces and version:

show create table tablespaces\G
*************************** 1. row ***************************
       Table: tablespaces
Create Table: CREATE TABLE `tablespaces` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(255) COLLATE utf8_bin NOT NULL,
  `options` mediumtext COLLATE utf8_bin,
  `se_private_data` mediumtext COLLATE utf8_bin,
  `comment` varchar(2048) COLLATE utf8_bin NOT NULL,
  `engine` varchar(64) COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC

show create table version\G
*************************** 1. row ***************************
       Table: version
Create Table: CREATE TABLE `version` (
  `version` int(10) unsigned NOT NULL,
  PRIMARY KEY (`version`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin STATS_PERSISTENT=0 ROW_FORMAT=DYNAMIC

Now we can try the last part of our task, i.e., querying the data directory for some specific info.

mysql [localhost] {msandbox} (mysql) > SET SESSION debug='+d,skip_dd_table_access_check';
mysql [localhost] {msandbox} (mysql) > select * from version;
+---------+
| version |
+---------+
|       1 |
+---------+

mysql [localhost] {msandbox} (mysql) > select id, name from schemata;
+----+--------------------+
| id | name               |
+----+--------------------+
|  2 | information_schema |
|  1 | mysql              |
|  3 | performance_schema |
|  4 | sys                |
|  5 | test               |
+----+--------------------+


mysql [localhost] {msandbox} (mysql) > select id, name, type, engine, mysql_version_id, comment from tables where name = 'user' and schema_id=1;
+----+------+------------+--------+------------------+-----------------------------+
| id | name | type       | engine | mysql_version_id | comment                     |
+----+------+------------+--------+------------------+-----------------------------+
| 84 | user | BASE TABLE | InnoDB |            80000 | Users and global privileges |
+----+------+------------+--------+------------------+-----------------------------+

Now the data dictionary is much more readable!

DISCLAIMER: there may be a simpler or more elegant solution to this problem. The method shown here is what I got by researching. But in fact, if there is a better method, short of recompiling the server, I'd like to know.

WARNING: Don't do what I do in the following paragraphs!

To complete the experiment, I am going to do what the MySQL team does not want me to do at all.

First, I create a directory inside the data directory. As shown in data dictionary limitations, this is not supported. But since we can access the data dictionary ...


$ mkdir ~/sandboxes/msb_full_8_0_0/data/db1

Now for the felony part:

mysql [localhost] {msandbox} (mysql) > SET SESSION debug= '+d,skip_dd_table_access_check';
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (mysql) > select * from schemata;
+----+------------+--------------------+----------------------+---------------------+---------------------+
| id | catalog_id | name               | default_collation_id | created             | last_altered        |
+----+------------+--------------------+----------------------+---------------------+---------------------+
|  1 |          1 | mysql              |                    8 | 0000-00-00 00:00:00 | 0000-00-00 00:00:00 |
|  2 |          1 | information_schema |                   33 | 2016-09-25 18:06:00 | 2016-09-25 18:06:00 |
|  3 |          1 | performance_schema |                   33 | 0000-00-00 00:00:00 | 0000-00-00 00:00:00 |
|  4 |          1 | sys                |                   33 | 0000-00-00 00:00:00 | 0000-00-00 00:00:00 |
|  5 |          1 | test               |                    8 | 0000-00-00 00:00:00 | 0000-00-00 00:00:00 |
+----+------------+--------------------+----------------------+---------------------+---------------------+
5 rows in set (0.00 sec)

mysql [localhost] {msandbox} (mysql) > insert into schemata values (6, 1, 'db1', 8, now(), now());
Query OK, 1 row affected (0.01 sec)

mysql [localhost] {msandbox} (mysql) > show schemas;
+--------------------+
| Database           |
+--------------------+
| db1                | ## TA-DA!
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+

Now, pay attention! This why the MySQL team don't want anyone to mess up with the data dictionary tables.

DOUBLE WARNING! Don't do the following!


mysql [localhost] {msandbox} (mysql) > insert into schemata values (7, 1, 'db2', 8, now(), now());
Query OK, 1 row affected (0.01 sec)

mysql [localhost] {msandbox} (mysql) > show schemas;
+--------------------+
| Database           |
+--------------------+
| db1                |
| db2                |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
7 rows in set (0.00 sec)

mysql [localhost] {msandbox} (mysql) > use db2
ERROR 1049 (42000): Unknown database 'db2'

There! I broke the system. Lesson learned: read, don't write data dictionary tables.

MySQL at Oracle Open World 2016

MySQL is a growing presence at Oracle Open World. While most of the headlines belong to the main products, where Oracle services are aiming at world domination, MySQL shared the spotlight, as it was announced to be part of Oracle database cloud. It seems a logical move for Oracle: after all the effort to make MySQL 5.7 the biggest release ever, it stands to reason that it is offered as a competitive feature in its own database as a service.

With this offer, Oracle is applying enterprise pricing and methodologies to a target of emerging companies. MySQL in the Oracle cloud differs from the competition by a few key points:

  • It's only MySQL 5.7. While this is the most advanced MySQL server available, users with existing deployments may exercise caution before adopting this cloud service. Companies that are starting now, instead, may take advantage of using the latest and greatest.
  • It's MySQL Enterprise edition, with a wealth of additional monitoring and management tools that are missing in other cloud offerings.
  • Unlike some popular competitors, it includes access to the server host, allowing DBAs to enable plugins, fine tune the deployment, and explore the operating system in addition to the database.
  • It includes Oracle support.

Time will tell if this offering will be successful. It may not appeal to all the users, but there is surely a category of power users who can take advantage of these features.

The surprises of the MySQL keynote did not stop at the cloud. We had already seen, one week before the conference, that MySQL 8.0 was released, with many juicy features. What came unexpected are two announcements:

  • MySQL group replication, a product that has been living in MySQL Labs for quite a while, was declared "release candidate" despite lacking documentation and being released in a format that discourages adoption, except from diehard hackers.
  • Another product is been released, again in the labs, with an ambitious mission. The MySQL InnoDB cluster is based on group replication and wants to be a 15-minute deployment of high-availability and scalable system, thanks to an enhanced version of MySQL Shell (the same used for the document store. Its feature set are exciting, but what we have seen in the demos suggests that the product is still in the early stages of development.

With these moves, Oracle is showing two paths of MySQL development:

  • in the main path, which has produced the GA of MySQL 5.5, 5.6, and 5.7, the MySQL team is showing the positive influence of Oracle engineering, with focus on security, performance, and stability.
  • in a parallel course, which started last April with the announcement of MySQL document store and its related shell, the team wants to introduce new features to a GA release as plugins, with the reasoning that the main release will not be touched (thus avoiding the taboo of altering a stable product) but users are free to enable plugins and unleash new functionalities.

The mix of traditional and agile releases are provoking exciting thoughts, albeit moderated by the fear of using together experimental code in a GA deployment.

The methodology of these releases is also baffling. It is unclear how mature is the document store. The plugin comes with the server, and it is accompanied by an huge set of documentation, which implies that it has been designed extensively and tested internally for a while, but the main tool for the feature, mysql shell is labeled as development preview: not very encouraging. On the other hand, the latest plugin addition, the MySQL group replication, which has been declared of release candidate quality, is still in the labs (no plugin in the server release), and without documentation.

All considered, while it is clear that Oracle is putting an enormous engineering effort into growing MySQL, I have the feeling that the replication features have been neglected and the announcement of group replication mixed fortunes confirms me in this belief.

The conference was useful to me. I had the chance of meeting many Oracle engineers and users, and discuss technical and community matters at length. My own presentation, a two-hour tutorial on MySQL operations in Docker was highly satisfactory, as it proved to be an interesting topic that was actively discussed by the audience.

Wednesday, September 21, 2016

MySQL team: make it easy to give you feedback!

There was a bold announcement during the MySQL Keynote at Oracle Open World. A new product that will mix up with the existing GA server, called MySQL InnoDB Cluster. This is an evolution of MySQL group replication, which has been in the labs for long time, and the MySQL shell, which was introduced as a side feature last April. The boldness I mentioned before is on account of wanting to add to a GA server something that was defined as release candidate despite never having been out of the labs. The product is interesting as it promises to be a quick and painless cluster deployment, with built-in high availability and scalability.

What surprised me most was a heartfelt and urgent request to test this new product and provide feedback, hinting that it would be GA soon.

Here are some thoughts on this matter:

  • A product in the labs is perceived as pre-release, i.e. less than beta quality. This is what happened with previous releases on labs: GTID, multi-source replication, and data dictionary were all released in labs before eventually being integrated in the main project.
  • Putting a product in labs again and declaring it release candidate feels odd.
  • The problem with labs is that the previews are distributed with a limited set of packages, and without documentation. The brave souls that test these packages need to find information about the new software in blog posts or dig in the source code, without any assurance that this package would ever become officially supported.

There is some confusion about which package is of which quality. From the keynote it looked like MySQL InnoDB Cluster (MIC) was the one being RC, but when I asked for clarifications it seems that group replication is RC (from its niche in the labs) while MIC is still of unknown quality. From what I saw in the demos it seems quite alpha to me.

Back to the main topic. MySQL want feedback, but provides software in the labs, in a way that is not easy to use. Specifically:

  • There is an OSX package that contains .dmg files, implying that I should install those in my main computer. Given that the perceived quality is low, I'd say "No, thanks," as I don't want to risk my laptop with alpha quality installed as root. Besides, this is cluster software, so I would need at least three nodes to make it work. There is a "sandbox mode" that allows you to simulate three nodes on a single server, but this still requires a main installation, with all the risks involved. No, thanks, again.
  • There are only .rpm files for Linux, which means that I need to have either servers or VMs where to install software as root. I have the same concerns as I have for the Mac: while VMs can be thrown away and remade, it is still a big investment in time and resources to test something new.
  • Missing are generic .tar.gz binaries, which would allow users to install in user space, without affecting the operating system or other MySQL servers.
  • Missing are also Docker packages, which would allow users to test quickly and painlessly without any risk.
  • Finally, and probably most importantly, there is no documentation. If this is RC software, there should be at least a couple of workloads that could be included in the labs packages for reference.

Summing up, I have a message for the MySQL team product managers and developers: if the software is meant to be usable, i.e. more than a proof of concept as other things in the labs, move it to the downloads section, same as it happened with the MySQL Shell and the document store early this year. Also, provide Docker images early on, so that people can test without many risks. This exercise alone would discover bugs just while you are doing it. And please add documentation for the feature you want feedback for. If the manual is not ready, don't limit the docs to a skinny blog post, but add the specifications used to create the feature (workloads) or even an alpha version of the docs. In short, if the software is worth giving feedback, it should be treated with more respect than it is shown right now. And the same respect goes for the users whom you are asking feedback from.

Improving the design of MySQL replication

Now that MySQL 8.0 has been revealed, it's time to take a deep look at replication features in the latest releases, and review its overall design.

Server UUID vs Server-ID

At the beginning of replication, there was the server_id variable that identified uniquely a node in a replication system. The variable is still here, but in MySQL 5.6 it was joined by another value, which is created during the server initialisation, regardless of its involvement in a replication system. The server_uuid is a string of hexadecimal characters that is the basis for global transaction identifiers:

select @@server_id, @@server_uuid;
+-------------+--------------------------------------+
| @@server_id | @@server_uuid                        |
+-------------+--------------------------------------+
|         101 | 4c0a9670-7f9a-11e6-9f8b-188f83e4eefc |
+-------------+--------------------------------------+

If this server is a master, its slave will report:

        Master_Server_ID: 101 
             Master_UUID: 4c0a9670-7f9a-11e6-9f8b-188f83e4eefc
       [...]
       Retrieved_Gtid_Set: 4c0a9670-7f9a-11e6-9f8b-188f83e4eefc:1-42
        Executed_Gtid_Set: 4c0a9670-7f9a-11e6-9f8b-188f83e4eefc:1-42

The serverid is still indispensable. We can't use the same value for two nodes in the replication system, or we get errors. But instead of using the serverid as the identifier for global transaction identifiers (GTID) we use the serverUUID, thus being subjected to the inhuman treatment of dealing with unreadable values to track our valuable transactions. Yet, server UUIDs have not replaced everything: if we want to exclude one or more servers from being applied, we still refer to them by serverid, as in the IGNORE_SERVER_IDS clause of CHANGE MASTER TO.

How should it be instead

While UUIDs guarantee that identifiers are unique in the the whole planet, the number of nodes in a replication system are finite, and for practical a single system may not go beyond the thousands. I don't care if my serverid is the same as a serverid in another continent: all I want is to be unique within my system. And since we have to use unique values for every node server_id, it is unnecessarily cruel to force users to deal with long strings instead of plain numbers.

VARIABLES vs STATUS

In general, MySQL behavior gets changed by setting variables to a given value either in the options file or using SET GLOBAL dynamically. To see the setting of a given value, we use SHOW VARIABLES:

show variables like 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 15    |
+-----------------+-------+

Here max_connections was changed from its default value to a much lower one. To see how many connections we have consumed, we use SHOW STATUS:

show status like '%connections%';
+-----------------------------------+---------------------+
| Variable_name                     | Value               |
+-----------------------------------+---------------------+
| Connection_errors_max_connections | 0                   |
| Connections                       | 7                   |
| Max_used_connections              | 1                   |
| Max_used_connections_time         | 2016-09-20 18:01:40 |
+-----------------------------------+---------------------+

That's the general theory. Sometimes, in replication, MySQL follows the same path. For example, in semi-synchronous replication, we have the settings of the functionality using variables:

 show variables like 'rpl%';
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 10000      |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_stop_slave_timeout                    | 31536000   |
+-------------------------------------------+------------+

and the running results of how the feature is performing using status:

set status like 'rpl%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 2     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 4     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 431   |
| Rpl_semi_sync_master_tx_wait_time          | 863   |
| Rpl_semi_sync_master_tx_waits              | 2     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 2     |
+--------------------------------------------+-------+

However, when it comes to GTID, the paradigm is broken:

slave1 [localhost] {msandbox} ((none)) > show global VARIABLES like '%gtid%';
+----------------------------------+-------------------------------------------+
| Variable_name                    | Value                                     |
+----------------------------------+-------------------------------------------+
| binlog_gtid_simple_recovery      | ON                                        |
| enforce_gtid_consistency         | ON                                        |
| gtid_executed                    | 00011808-1111-1111-1111-111111111111:1-42 |
| gtid_executed_compression_period | 1000                                      |
| gtid_mode                        | ON                                        |
| gtid_owned                       |                                           |
| gtid_purged                      | 00011808-1111-1111-1111-111111111111:1-42 |
| session_track_gtids              | OFF                                       |
+----------------------------------+-------------------------------------------+
8 rows in set (0.00 sec)

slave1 [localhost] {msandbox} ((none)) > show global STATUS like '%gtid%';
Empty set (0.01 sec)

Here we have variables like gtid_mode and enforce_gtid_consistency that are used to enable the behavior, but the monitoring of the result (gtid_executed and gtid_purged) should have been in STATUS, not in VARIABLES. The same usage of VARIABLES can be seen in MySQL group replication.

How should it be instead

The values that monitor the functionality should be in STATUS rather than in VARIABLES. However, in this case, it would be better if the values were tracked in performance_schema tables, as noted below.

GTID executed

When GTIDs are enabled, we can see its value in SHOW MASTER STATUS

SHOW MASTER STATUS\G
*************************** 1. row ***************************
             File: mysql-bin.000002
         Position: 6325
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: 00007801-1111-1111-1111-111111111111:1-22
1 row in set (0.00 sec)

The information here has a precise meaning: the transaction set 1-22 comes from a master identified by UUID 00007801-1111-1111-1111-111111111111. The latest transaction corresponds to binary log mysql-bin.000002 at position 6325.

This is the identification of a source, and as such is the basis for monitoring. When we want to check if the slave service is running fine, we need to compare the slave progress against the information reported by the master. So far, so good.

However, when we use multiple masters, the matter becomes less clear. A slave that was replicating from this master and from two more, will show the following:

SHOW MASTER STATUS\G
*************************** 1. row ***************************
             File: mysql-bin.000002
         Position: 154
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: 00007801-1111-1111-1111-111111111111:1-22,
00007802-2222-2222-2222-222222222222:1-21,
00007803-3333-3333-3333-333333333333:1-4

This is completely wrong. What we see here has nothing to do with being a master, as the data shown in the above statement was originated in different servers. The data is also wrong because it is thrown together in the same field. Using the same server, we can see the same information in different ways.

select @@global.gtid_executed\G
*************************** 1. row ***************************
@@global.gtid_executed: 00007801-1111-1111-1111-111111111111:1-22,
00007802-2222-2222-2222-222222222222:1-21,
00007803-3333-3333-3333-333333333333:1-4

This is the same info seen before. In this context, it may seem less wrong, as it is only data coming from other servers.

Here again the information will become much more confused when the slave is also a master and produces its own data sets. If we create something in this server, we end up with the following:

select @@global.gtid_executed\G
*************************** 1. row ***************************
@@global.gtid_executed: 00007801-1111-1111-1111-111111111111:1-22,
00007802-2222-2222-2222-222222222222:1-21,
00007803-3333-3333-3333-333333333333:1-4,
00007804-4444-4444-4444-444444444444:1-2

Now we have in the same field the data that was produced in this server and the data that came through replication. The matter looks even more embarrassing if we look at SHOW SLAVE STATUS, which reports distinct data for every channel, but the executed_gtid_set is reported all mixed up, in every channel, unchanged and confused, master data and slave data alike.

SHOW SLAVE STATUS\G
*************************** 1. row ***************************
[...]
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 6325
               Relay_Log_File: mysql-relay-node1.000002
                Relay_Log_Pos: 6538
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
[...]
          Exec_Master_Log_Pos: 6325
              Relay_Log_Space: 6747
[...]
             Master_Server_Id: 101
                  Master_UUID: 00007801-1111-1111-1111-111111111111
[...]
           Retrieved_Gtid_Set: 00007801-1111-1111-1111-111111111111:1-22
            Executed_Gtid_Set: 00007801-1111-1111-1111-111111111111:1-22,
00007802-2222-2222-2222-222222222222:1-21,
00007803-3333-3333-3333-333333333333:1-4,
00007804-4444-4444-4444-444444444444:1-2
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: node1
           Master_TLS_Version:
*************************** 2. row ***************************
[...]
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 6039
               Relay_Log_File: mysql-relay-node2.000002
                Relay_Log_Pos: 6252
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
[...]
          Exec_Master_Log_Pos: 6039
              Relay_Log_Space: 6461
[...]
             Master_Server_Id: 102
                  Master_UUID: 00007802-2222-2222-2222-222222222222
[...]
           Retrieved_Gtid_Set: 00007802-2222-2222-2222-222222222222:1-21
            Executed_Gtid_Set: 00007801-1111-1111-1111-111111111111:1-22,
00007802-2222-2222-2222-222222222222:1-21,
00007803-3333-3333-3333-333333333333:1-4,
00007804-4444-4444-4444-444444444444:1-2
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: node2
           Master_TLS_Version:
*************************** 3. row ***************************
[...]
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 1177
               Relay_Log_File: mysql-relay-node3.000002
                Relay_Log_Pos: 1390
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
[...]
          Exec_Master_Log_Pos: 1177
              Relay_Log_Space: 1599
[...]
             Master_Server_Id: 103
                  Master_UUID: 00007803-3333-3333-3333-333333333333
[...]
           Retrieved_Gtid_Set: 00007803-3333-3333-3333-333333333333:1-4
            Executed_Gtid_Set: 00007801-1111-1111-1111-111111111111:1-22,
00007802-2222-2222-2222-222222222222:1-21,
00007803-3333-3333-3333-333333333333:1-4,
00007804-4444-4444-4444-444444444444:1-2
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: node3
           Master_TLS_Version:

How should it be instead

First off, the data should not be in a single variable, because it is composed of multiple values. But even if it were to be kept in the environment (as STATUS items, not variables, as we have seen above,) we should have a distinction:

SHOW STATUS LIKE 'gtid_applied'
*************************** 1. row ***************************
@@global.gtid_executed: 00007801-1111-1111-1111-111111111111:1-22,
00007802-2222-2222-2222-222222222222:1-21,
00007803-3333-3333-3333-333333333333:1-4

SHOW STATUS LIKE 'gtid_generated'
*************************** 1. row ***************************
00007804-4444-4444-4444-444444444444:1-2

The SHOW SLAVE STATUS should report the right line for each channel. For example:

*************************** 1. row ***************************
[...]
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 6325
               Relay_Log_File: mysql-relay-node1.000002
                Relay_Log_Pos: 6538
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
[...]
          Exec_Master_Log_Pos: 6325
              Relay_Log_Space: 6747
[...]
             Master_Server_Id: 101
                  Master_UUID: 00007801-1111-1111-1111-111111111111
[...]
           Retrieved_Gtid_Set: 00007801-1111-1111-1111-111111111111:1-22
            Executed_Gtid_Set: 00007801-1111-1111-1111-111111111111:1-22
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: node1
           Master_TLS_Version:
*************************** 2. row ***************************
[...]
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 6039
               Relay_Log_File: mysql-relay-node2.000002
                Relay_Log_Pos: 6252
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
[...]
          Exec_Master_Log_Pos: 6039
              Relay_Log_Space: 6461
[...]
             Master_Server_Id: 102
                  Master_UUID: 00007802-2222-2222-2222-222222222222
[...]
           Retrieved_Gtid_Set: 00007802-2222-2222-2222-222222222222:1-21
            Executed_Gtid_Set: 00007802-2222-2222-2222-222222222222:1-21
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: node2
           Master_TLS_Version:
*************************** 3. row ***************************
[...]
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 1177
               Relay_Log_File: mysql-relay-node3.000002
                Relay_Log_Pos: 1390
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
[...]
          Exec_Master_Log_Pos: 1177
              Relay_Log_Space: 1599
[...]
             Master_Server_Id: 103
                  Master_UUID: 00007803-3333-3333-3333-333333333333
[...]
           Retrieved_Gtid_Set: 00007803-3333-3333-3333-333333333333:1-4
            Executed_Gtid_Set: 00007803-3333-3333-3333-333333333333:1-4
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: node3
           Master_TLS_Version:

Missing information in performance_schema tables

The idea of having replication information in performance schema is to replace SHOW SLAVE STATUS with a set of tables that report the same info. The theory is good, but the implementation is lacking.

First of all, the tables are misnamed. replication_applier_configuration has only settings about the applier delay, while replication_applier_status again reports only info about the remaining delay. The information worth looking at for monitoring is in a table named replication_connection_status, while it should refer to applier status or applier progress.

select * from replication_connection_status\G
*************************** 1. row ***************************
             CHANNEL_NAME: node1
               GROUP_NAME:
              SOURCE_UUID: 00007801-1111-1111-1111-111111111111
                THREAD_ID: 35
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 362
 LAST_HEARTBEAT_TIMESTAMP: 2016-09-20 23:05:41
 RECEIVED_TRANSACTION_SET: 00007801-1111-1111-1111-111111111111:1-22
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
             CHANNEL_NAME: node2
               GROUP_NAME:
              SOURCE_UUID: 00007802-2222-2222-2222-222222222222
                THREAD_ID: 39
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 362
 LAST_HEARTBEAT_TIMESTAMP: 2016-09-20 23:05:41
 RECEIVED_TRANSACTION_SET: 00007802-2222-2222-2222-222222222222:1-21
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 3. row ***************************
             CHANNEL_NAME: node3
               GROUP_NAME:
              SOURCE_UUID: 00007803-3333-3333-3333-333333333333
                THREAD_ID: 43
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 362
 LAST_HEARTBEAT_TIMESTAMP: 2016-09-20 23:05:42
 RECEIVED_TRANSACTION_SET: 00007803-3333-3333-3333-333333333333:1-4
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00

What is missing here is the applied data, i.e. what appears in @@global.gtid_executed, which, as said above, should be split by server. Moreover, we miss the corresponding binary logs and positions, which we find in some neglected tables in mysql databases and that should be removed, with their contents integrated here. Another thing that's missing are the schema and the timestamps of the latest transaction, both the creation and the replication times. All this information is valuable to the DBA when troubleshooting. Rather than opening binary logs to find out the details, they should be all collected in this table for ease of use.

How should it be instead

The monitoring tables should include information about:

  • transaction timestamp;
  • binary log and position (or what a future implementation would use to transport data);
  • default schema.

Here's an example:

select * from replication_applier_progress\G
*************************** 1. row ***************************
             CHANNEL_NAME: node1
               GROUP_NAME:
              SOURCE_UUID: 00007801-1111-1111-1111-111111111111
                THREAD_ID: 35
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 362
 LAST_HEARTBEAT_TIMESTAMP: 2016-09-20 23:05:41
    TRANSACTION_TIMESTAMP: 2016-09-20 22:24:13
          APPLY_TIMESTAMP: 2016-09-20 22:24:15
 RECEIVED_TRANSACTION_SET: 00007801-1111-1111-1111-111111111111:1-22
  APPLIED_TRANSACTION_SET: 00007801-1111-1111-1111-111111111111:1-22
          Master_Log_File: mysql-bin.000002
      Read_Master_Log_Pos: 6325
           Relay_Log_File: mysql-relay-node1.000002
            Relay_Log_Pos: 6538
    Relay_Master_Log_File: mysql-bin.000002
       applying_on_schema: db1
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
             CHANNEL_NAME: node2
               GROUP_NAME:
              SOURCE_UUID: 00007802-2222-2222-2222-222222222222
                THREAD_ID: 39
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 362
 LAST_HEARTBEAT_TIMESTAMP: 2016-09-20 23:05:41
    TRANSACTION_TIMESTAMP: 2016-09-20 22:25:15
          APPLY_TIMESTAMP: 2016-09-20 22:25:18
 RECEIVED_TRANSACTION_SET: 00007802-2222-2222-2222-222222222222:1-21
  APPLIED_TRANSACTION_SET: 00007802-2222-2222-2222-222222222222:1-21
          Master_Log_File: mysql-bin.000002
      Read_Master_Log_Pos: 6039
           Relay_Log_File: mysql-relay-node2.000002
            Relay_Log_Pos: 6252
    Relay_Master_Log_File: mysql-bin.000002
       applying_on_schema: db2
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 3. row ***************************
             CHANNEL_NAME: node3
               GROUP_NAME:
              SOURCE_UUID: 00007803-3333-3333-3333-333333333333
                THREAD_ID: 43
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 362
 LAST_HEARTBEAT_TIMESTAMP: 2016-09-20 23:05:42
    TRANSACTION_TIMESTAMP: 2016-09-20 22:28:16
          APPLY_TIMESTAMP: 2016-09-20 22:28:19
 RECEIVED_TRANSACTION_SET: 00007803-3333-3333-3333-333333333333:1-4
  APPLIED_TRANSACTION_SET: 00007803-3333-3333-3333-333333333333:1-4
          Master_Log_File: mysql-bin.000002
      Read_Master_Log_Pos: 1177
           Relay_Log_File: mysql-relay-node3.000002
            Relay_Log_Pos: 1390
    Relay_Master_Log_File: mysql-bin.000002
       applying_on_schema: db3
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE:
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00

The same level of detail should be used for replication_applier_status_by_worker, which, by the way, has a more appropriate name than replication_connection_status.

Monitoring tables are in different places

MySQL 5.6 introduced replication tables in the mysql database. Unfortunately, this inclusion happened before GTID was implemented. As a result, we have tables in mysql database reporting log names and positions and tables in performance_schema reporting GTIDs, but without correlations.

How should it be instead

The replication tables in mysql should be removed, and integrated in the performance_schema as shown above.

There is no table for SHOW MASTER STATUS

The information in SHOW SLAVE STATUS has been translated almost entirely in performance_schema tables. No such thing happened to SHOW MASTER STATUS. To make monitoring more efficient, all the data used for monitoring should be in tables, and the information should not contain more than one piece of information per field, as we have seen in previous cases.

How should it be instead

There should be a table containing the information from SHOW MASTER STATUS, except the filter info, which should be in a different table:

select * from replication_master_status\G
SHOW MASTER STATUS\G
*************************** 1. row ***************************
                 File: mysql-bin.000002
             Position: 154
    Executed_Gtid_Set: 00007801-1111-1111-1111-111111111111:1-22
Transaction_timestamp: 2016-09-20 22:24:13

select * from replication_filters\G
*************************** 1. row ***************************
     Filter_Name: binlog_do_db
     Filter_Yype: binlog
    Filter_Value: personnel.*
   Filter_Status: ACTIVE

Call for action

Dear MySQL product managers: please take action to make replication more consistent and usable. I may have forgotten something, but I think the requests in this article are a good start to improve the design of MySQL replication.

Monday, September 12, 2016

MySQL 8.0 first impressions

Mysql8

MySQL 8.0.0 was released today. It has been some time in the making, shrouded in a veil of secrecy for over one year. We knew, from listening to the gossip and looking at the few available previews, some of what was going to bring. So, for the observant users, its main features may not come as a surprise. For the rest of you, here's a quick roundup:

Notable features

  • No MyISAM tables anymore! The grant tables are now InnoDB, meaning that grant operations are now atomic.
  • A real data dictionary. This change is less visible than the previous one. The data dictionary tables are hidden and only a subset of the data is available through information_schema views. The reason for the hidden tables is to allow a stable interface through several versions. I am not happy about this choice, but I understand the reason.
  • The .frm files are gone. If you create a table using an engine other than InnoDB, the system creates a JSON file (.SDI) containing serialised data dictionary info.
  • Roles! You can now define a set of privileges as a role, and then assign the role to a user. This feature greatly simplifies user management.
  • You can set persistent variables from SQL (set persist variable_name=something) and they will survive a restart (except when they don't.
  • There is a new plugin infrastructure of which I haven't seen any example yet.

Gotchas

  • Some of the tables in the mysql database are gone. If you have tests that look for them (e.g. proc, event) they will fail, as these tables are now hidden in the data dictionary.
  • Only InnoDB tables support partitioning! If you create MyISAM or Archive tables and try to partition them, you will get an error.
  • The installation requires a new library (libnuma1 in Debian/Ubuntu, numactl-libs in CentOS). You need to install it before starting the MySQL server.

Unmatched expectations

  • The installation still issues the same warnings reported for MySQL 5.7. I was thinking that by now the developers know how to set the initialisation without triggering a warning about TIMESTAMPS. I was mistaken.

[Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
++ 2016-09-11T17:42:06.471261Z 1 [Warning] InnoDB: New log files created, LSN=49311
++ 2016-09-11T17:42:06.548747Z 1 [Warning] InnoDB: Creating foreign key constraint system tables.
++ 2016-09-11T17:42:07.593428Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 0f4a8ceb-7847-11e6-bd60-0242ac110002.
++ 2016-09-11T17:42:07.597075Z 0 [Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
  • The monitoring features for replication are unchanged. There is nothing new about reporting GTID execution in performance_schema tables, except some minor additions for RBR operations.

Taking MySQL 8.0 for a spin

MySQL-Sandbox 3.2.01 can install the new version from a tarball.

If you are on Linux, you can even try the newest MySQL-Sandbox feature, which gets you reduced binaries for a given MySQL version with a single command. Just install MySQL::Sandbox 3.2.02 and run:


make_sandbox GET:8.0 -- --no_show
2016-09-12 17:38:40 URL:https://raw.githubusercontent.com/datacharmer/mysql-docker-minimal/master/dbdata/available.txt [98/98] -> "/tmp/available.txt" [1]
wget -nv -O 8.0.0.tar.gz 'https://github.com/datacharmer/mysql-docker-minimal/blob/master/dbdata/8.0.0.tar.gz?raw=true'
    The MySQL Sandbox,  version 3.2.02
    (C) 2006-2016 Giuseppe Maxia
# Starting server
... sandbox server started
# Loading grants
Your sandbox server was installed in $HOME/sandboxes/msb_8_0_0

$  ~/sandboxes/msb_8_0_0/use
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 8.0.0-dmr MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql [localhost] {msandbox} ((none)) >  show schemas;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.00 sec)

At first sight, it looks like the usual MySQL. Same schemas, and almost the same tables. You will see a few tables related to roles in mysql, and a dozen new tables between information_schema and performance_schema. However, where the database has a different aspect is on the outside. Let's start with the data directory:


total 188468
-rw-r----- 1 root root       56 Sep 11 18:14 auto.cnf
-rw-r----- 1 root root     1813 Sep 11 18:14 ib_buffer_pool
-rw-r----- 1 root root 50331648 Sep 11 18:15 ib_logfile0
-rw-r----- 1 root root 50331648 Sep 11 18:15 ib_logfile1
-rw-r----- 1 root root 79691776 Sep 11 18:15 ibdata1
-rw-r----- 1 root root 12582912 Sep 11 18:15 ibtmp1
-rw-r----- 1 root root     4396 Sep 11 18:39 msandbox.err
drwxr-x--- 2 root root     4096 Sep 11 18:14 mysql
-rw-r----- 1 root root        4 Sep 11 18:15 mysql_sandbox8000.pid
-rw-rw---- 1 root root        3 Sep 11 18:15 mysqld_safe.pid
-rw-r----- 1 root root      225 Sep 11 18:14 performance_sche_3.SDI
drwxr-x--- 2 root root     4096 Sep 11 18:14 performance_schema
drwxr-x--- 2 root root     4096 Sep 11 18:14 sys
-rw-r----- 1 root root      210 Sep 11 18:14 sys_4.SDI
drwxr-x--- 2 root root     4096 Sep 11 18:15 test
-rw-r----- 1 root root      210 Sep 11 18:15 test_5.SDI

Notice that, for each schema except mysql, we have a .SDI file. For example:


cat performance_sche_3.SDI
{
    "sdi_version": 1,
    "dd_version": 1,
    "dd_object_type": "Schema",
    "dd_object": {
        "name": "performance_schema",
        "default_collation_id": 33,
        "created": 0,
        "last_altered": 0
    }
}

For tables other than innodb, we get a similar file, only more complex:


mysql [localhost] {msandbox} (test) > create table t1 (i int not null auto_increment primary key, c char(10) ) engine=myisam;
Query OK, 0 rows affected (0.01 sec)

####

ls -l ./test
total 8
-rw-r----- 1 root root    0 Sep 11 18:49 t1.MYD
-rw-r----- 1 root root 1024 Sep 11 18:49 t1.MYI
-rw-r----- 1 root root 2706 Sep 11 18:49 t1_326.SDI

cat test/t1_326.SDI
{
    "sdi_version": 1,
    "dd_version": 1,
    "dd_object_type": "Table",
    "dd_object": {
        "name": "t1",
        "mysql_version_id": 80000,
        "created": 20160911185434,
        "last_altered": 20160911185434,
        "options": "avg_row_length=0;key_block_size=0;keys_disabled=0;pack_record=0;stats_auto_recalc=0;stats_sample_pages=0;",
        "columns": [
            {
                "name": "i",
                "type": 4,
                "is_nullable": false,
                "is_zerofill": false,
                "is_unsigned": false,
                "is_auto_increment": true,
                "is_virtual": false,
                "hidden": false,
                "ordinal_position": 1,
                "char_length": 11,
                "numeric_precision": 10,
                "numeric_scale": 0,
                "datetime_precision": 0,
                "has_no_default": false,
                "default_value_null": false,
                "default_value": "AAAAAA==",
                "default_option": "",
                "update_option": "",
                "comment": "",
                "generation_expression": "",
                "generation_expression_utf8": "",
                "options": "interval_count=0;",
                "se_private_data": "",
                "column_key": 2,
                "column_type_utf8": "int(11)",
                "elements": [],
                "collation_id": 8
            },
            {
                "name": "c",
                "type": 29,
                "is_nullable": true,
                "is_zerofill": false,
                "is_unsigned": false,
                "is_auto_increment": false,
                "is_virtual": false,
                "hidden": false,
                "ordinal_position": 2,
                "char_length": 10,
                "numeric_precision": 0,
                "numeric_scale": 0,
                "datetime_precision": 0,
                "has_no_default": false,
                "default_value_null": true,
                "default_value": "",
                "default_option": "",
                "update_option": "",
                "comment": "",
                "generation_expression": "",
                "generation_expression_utf8": "",
                "options": "interval_count=0;",
                "se_private_data": "",
                "column_key": 1,
                "column_type_utf8": "char(10)",
                "elements": [],
                "collation_id": 8
            }
        ],
        "schema_ref": "test",
        "hidden": false,
        "se_private_id": 18446744073709551615,
        "engine": "MyISAM",
        "comment": "",
        "se_private_data": "",
        "row_format": 1,
        "partition_type": 0,
        "partition_expression": "",
        "default_partitioning": 0,
        "subpartition_type": 0,
        "subpartition_expression": "",
        "default_subpartitioning": 0,
        "indexes": [
            {
                "name": "PRIMARY",
                "hidden": false,
                "is_generated": false,
                "ordinal_position": 1,
                "comment": "",
                "options": "flags=0;",
                "se_private_data": "",
                "type": 1,
                "algorithm": 2,
                "is_algorithm_explicit": false,
                "is_visible": true,
                "engine": "MyISAM",
                "elements": [
                    {
                        "ordinal_position": 1,
                        "length": 4,
                        "order": 2,
                        "column_opx": 0
                    }
                ]
            }
        ],
        "foreign_keys": [],
        "partitions": [],
        "collation_id": 8
    }
}

If the table has many columns, the structure becomes quite large.

Let's try some new features:


mysql [localhost] {root} ((none)) > create role role1;
Query OK, 0 rows affected (0.01 sec)

mysql [localhost] {root} ((none)) > grant all on test.* to role1;
Query OK, 0 rows affected (0.01 sec)

mysql [localhost] {root} ((none)) > create role role2;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {root} ((none)) > grant select on *.* to role2;
Query OK, 0 rows affected (0.01 sec)

mysql [localhost] {root} ((none)) > select host, user from mysql.user;
+-----------+-------------+
| host      | user        |
+-----------+-------------+
| %         | role1       |
| %         | role2       |
| 127.%     | msandbox    |
| 127.%     | msandbox_ro |
| 127.%     | msandbox_rw |
| 127.%     | rsandbox    |
| localhost | msandbox    |
| localhost | msandbox_ro |
| localhost | msandbox_rw |
| localhost | mysql.sys   |
| localhost | root        |
+-----------+-------------+
11 rows in set (0.00 sec)

From this, we see that roles are users. In fact, you can assign a user to another user, as if it were a role.


mysql [localhost] {root} ((none)) > show grants for role1;
+-------------------------------------------------+
| Grants for role1@%                              |
+-------------------------------------------------+
| GRANT USAGE ON *.* TO `role1`@`%`               |
| GRANT ALL PRIVILEGES ON `test`.* TO `role1`@`%` |
+-------------------------------------------------+
2 rows in set (0.00 sec)

mysql [localhost] {root} ((none)) > create user user1 identified by 'msandbox';
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {root} ((none)) > grant role1 to user1;
Query OK, 0 rows affected (0.00 sec)
mysql [localhost] {root} ((none)) > alter user user1 default role   role1;
Query OK, 0 rows affected (0.01 sec)

mysql [localhost] {root} ((none)) > show grants for user1;
+-----------------------------------+
| Grants for user1@%                |
+-----------------------------------+
| GRANT USAGE ON *.* TO `user1`@`%` |
| GRANT `role1`@`%` TO `user1`@`%`  |
+-----------------------------------+
2 rows in set (0.00 sec)

mysql [localhost] {root} ((none)) > show grants for user1 using role1;
+-------------------------------------------------+
| Grants for user1@%                              |
+-------------------------------------------------+
| GRANT USAGE ON *.* TO `user1`@`%`               |
| GRANT ALL PRIVILEGES ON `test`.* TO `user1`@`%` |
| GRANT `role1`@`%` TO `user1`@`%`                |
+-------------------------------------------------+
3 rows in set (0.00 sec)

 create user user2 identified by 'msandbox';
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {root} ((none)) > grant role2 to user2;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {root} ((none)) > show grants for user2;
+-----------------------------------+
| Grants for user2@%                |
+-----------------------------------+
| GRANT USAGE ON *.* TO `user2`@`%` |
| GRANT `role2`@`%` TO `user2`@`%`  |
+-----------------------------------+
2 rows in set (0.00 sec)

mysql [localhost] {root} ((none)) > show grants for user2 using role2;
+------------------------------------+
| Grants for user2@%                 |
+------------------------------------+
| GRANT SELECT ON *.* TO `user2`@`%` |
| GRANT `role2`@`%` TO `user2`@`%`   |
+------------------------------------+
2 rows in set (0.01 sec)

Now we connect with user1


mysql [localhost] {user1} ((none)) > use mysql
ERROR 1044 (42000): Access denied for user 'user1'@'%' to database 'mysql'
mysql [localhost] {user1} ((none)) > use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql [localhost] {user1} (test) > show tables;
+----------------+
| Tables_in_test |
+----------------+
| t1             |
+----------------+
1 row in set (0.00 sec)

That seems to be correct, as user1 can only access the test database. Let's try user2:


mysql [localhost] {user2} ((none)) > show tables from test;
ERROR 1044 (42000): Access denied for user 'user2'@'%' to database 'test'
mysql [localhost] {user2} ((none)) > show grants;
+-----------------------------------+
| Grants for user2@%                |
+-----------------------------------+
| GRANT USAGE ON *.* TO `user2`@`%` |
| GRANT `role2`@`%` TO `user2`@`%`  |
+-----------------------------------+
2 rows in set (0.00 sec)

mysql [localhost] {user2} ((none)) > set role role2;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {user2} ((none)) > show tables from test;
+----------------+
| Tables_in_test |
+----------------+
| t1             |
+----------------+
1 row in set (0.01 sec)

Here we have something that, unless we read the documentation carefully, we may miss. A role is ineffective if is not assigned. For user1, we did an ALTER USER and set the default role. Since we didn't do it for user2, we need to set the role explicitly before using its powers.

We can also assign to a user several roles, and activate them all:


mysql [localhost] {root} ((none)) > create user user3 identified by 'msandbox';
Query OK, 0 rows affected (0.01 sec)

mysql [localhost] {root} ((none)) > grant role1 to user3;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {root} ((none)) > grant role2 to user3;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {root} ((none)) > alter user user3 default role   role1,role2;
Query OK, 0 rows affected (0.00 sec)

Now, when we connect with user3 we can see all tables (SELECT grant from role2) and create or modify tables in test (all privileges on test database from role1.)

One final observation, just because it became crucial when I was starting to test these new binaries. Look at the size of the latest major versions of MySQL:


$ ls -lh mysql-5.5.52-linux2.6-x86_64.tar.gz
-rw-rw-r-- 1 root root 178M Aug 26 15:38 mysql-5.5.52-linux2.6-x86_64.tar.gz

$ du -sh mysql-5.5.52-linux2.6-x86_64/
690M     mysql-5.5.52-linux2.6-x86_64/

$ ls -lh mysql-5.6.33-linux-glibc2.5-x86_64.tar.gz
-rw-rw-r-- 1 root root 300M Aug 26 12:52 mysql-5.6.33-linux-glibc2.5-x86_64.tar.gz

$ du -sh mysql-5.6.33-linux-glibc2.5-x86_64/
1.1G     mysql-5.6.33-linux-glibc2.5-x86_64/

$ ls -lh mysql-5.7.15-linux-glibc2.5-x86_64.tar.gz
-rw-rw-r-- 1 root root 612M Aug 25 13:08 mysql-5.7.15-linux-glibc2.5-x86_64.tar.gz

$ du -sh mysql-5.7.15-linux-glibc2.5-x86_64/
2.5G     mysql-5.7.15-linux-glibc2.5-x86_64/

$ ls -lh mysql-8.0.0-dmr-linux-glibc2.12-x86_64.tar.gz
-rwxr-xr-x 1 root root 961M Aug 26 17:37 mysql-8.0.0-dmr-linux-glibc2.12-x86_64.tar.gz

$ du -sh mysql-8.0.0-dmr-linux-glibc2.12-x86_64/
3.6G     mysql-8.0.0-dmr-linux-glibc2.12-x86_64/