Changes in MySQL NDB Cluster 7.5.2 (5.7.12-ndb-7.5.2) (2016-06-01, Development Milestone)

MySQL NDB Cluster 7.5.2 is a new release of MySQL NDB Cluster 7.5, based on MySQL Server 5.7 and including features in version 7.5 of the NDB storage engine, as well as fixing recently discovered bugs in previous NDB Cluster releases.

Obtaining MySQL NDB Cluster 7.5. MySQL NDB Cluster 7.5 source code and binaries can be obtained from https://dev.mysql.com/downloads/cluster/.

For an overview of changes made in MySQL NDB Cluster 7.5, see What is New in NDB Cluster 7.5.

This release also incorporates all bug fixes and changes made in previous NDB Cluster releases, as well as all bug fixes and feature changes which were added in mainline MySQL 5.7 through MySQL 5.7.12 (see Changes in MySQL 5.7.12 (2016-04-11, General Availability)).

Functionality Added or Changed

Important Change; NDB Replication: The ndb_binlog_index table now uses the InnoDB storage engine. When upgrading from a previous version to this release or a later one, use mysql_upgrade with --force --upgrade-system-tables to have it perform ALTER TABLE ... ENGINE=INNODB on this table. Use of the MyISAM storage engine for this table continues to be supported for backward compatibility.
One benefit of this change is that it is now possible to depend on transactional behavior and lock-free reads for this table, which should help alleviate concurrency issues during purge operations and log rotation, and improve the availability of this table.
Due to differences in storage requirements for MyISAM and InnoDB tables, users should be aware that ndb_binlog_index may now take up more disk space than was previously required.
For more information, see NDB Cluster Replication Schema and Tables. (WL #7162)
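The upgrade step described above can be sketched as follows. This is a minimal illustration only: the helper name and the optional user argument are hypothetical, the command is built but not executed, and the query is just one possible way to confirm the table's storage engine afterwards.

```python
# Sketch: assemble the mysql_upgrade invocation named in this item.
# Only --force and --upgrade-system-tables come from the release note;
# the function name and the optional user argument are illustrative.

def mysql_upgrade_command(user=None):
    """Return an argv list suitable for subprocess.run()."""
    cmd = ["mysql_upgrade", "--force", "--upgrade-system-tables"]
    if user:
        cmd.append(f"--user={user}")
    return cmd

# One way to check afterwards which engine the table uses.
ENGINE_CHECK_SQL = (
    "SELECT ENGINE FROM INFORMATION_SCHEMA.TABLES "
    "WHERE TABLE_SCHEMA = 'mysql' AND TABLE_NAME = 'ndb_binlog_index'"
)

if __name__ == "__main__":
    print(" ".join(mysql_upgrade_command()))
    print(ENGINE_CHECK_SQL)
```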
Performance: Event buffer memory allocation was found to be inefficient and could lead to undesirable results: additional memory could be allocated from the operating system to buffer received event data even when memory had already been allocated but remained unused. This is fixed by allocating the event buffer memory directly from the page allocation memory manager (mmap()), where such functionality is offered by the operating system, allowing for direct control over the memory such that it is in fact returned to the system when released.
This reimplementation avoids the tendencies of the existing one toward worst-case memory usage, maintenance of data structures for a worst-case event buffer event count, and useless caching of free memory in unusable positions. This work should also help minimize the runtime costs of buffering events, minimize heap fragmentation, and avoid OS-specific problems due to excessive numbers of distinct memory mappings.
In addition, the relationship between epochs and internal EventData objects is now preserved throughout the event lifecycle, from reception to consumption, removing the need to iterate over, and keep in sync, two different lists representing the epochs and their EventData objects.
As part of this work, better reporting on the relevant event buffer metrics is now provided in the cluster logs. (WL #7677, WL #9207)
References: See also: Bug #21651536, Bug #21660947, Bug #21661297, Bug #21673318, Bug #21689380, Bug #21809959.
NDB Cluster APIs: Added the Ndb::setEventBufferQueueEmptyEpoch() method, which makes it possible to enable queuing of empty events (event type TE_EMPTY). (Bug #22157845)
NDB Cluster APIs: Ndb_free_list_t is a template used in the implementation of the NDB API to create a list keeping released API objects such as NdbTransaction, NdbOperation, and NdbRecAttr. One drawback to this template is that released API objects are kept in the list for the lifetime of the owning Ndb object, such that a transient peak in the demand for any object causes an effective leak in memory that persists until this Ndb object itself has been released.
This work adds statistics to each Ndb_free_list instance, sampling the usage of objects maintained by the list; now, when objects are released, they can be either returned to the free list or deallocated, based on the collected usage statistics. (WL #8351)
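The idea of a free list that consults usage statistics before retaining a released object can be illustrated with a small sketch. This is not the NDB API implementation: the class name, the exponential smoothing, and the retention rule are all assumptions chosen to show the principle that a transient peak no longer pins memory for the lifetime of the owner.

```python
class SampledFreeList:
    """Toy free list that keeps released objects only while the pool
    (objects in use plus objects cached) is below a smoothed usage estimate.
    Illustrative only; not the Ndb_free_list_t code."""

    def __init__(self, factory, alpha=0.25):
        self._factory = factory   # creates a new object on demand
        self._alpha = alpha       # smoothing weight for usage samples
        self.free = []            # cached, reusable objects
        self.in_use = 0           # objects currently handed out
        self.estimate = 0.0       # smoothed sample of in_use

    def seize(self):
        """Hand out a cached object if one exists, else create one."""
        obj = self.free.pop() if self.free else self._factory()
        self.in_use += 1
        return obj

    def release(self, obj):
        """Sample current usage, then cache the object or drop it.
        Returns True if the object was kept in the free list."""
        self.estimate += self._alpha * (self.in_use - self.estimate)
        self.in_use -= 1
        if self.in_use + len(self.free) < self.estimate:
            self.free.append(obj)
            return True
        return False  # not cached; the garbage collector reclaims it
```

With these assumed settings, seizing and then releasing a burst of 100 objects leaves only a handful cached, rather than all 100 as an unconditional free list would.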
JSON: The NDB storage engine supports the MySQL JSON data type and MySQL JSON functions implemented in MySQL 5.7.8 and later. This support is subject to the limitation that a single NDB table can have at most 3 JSON columns. (WL #9007)
MySQL NDB ClusterJ: To make it easier for ClusterJ to handle fatal errors that require the SessionFactory to be closed, a new public method in the SessionFactory interface, getConnectionPoolSessionCounts(), has been created. When it returns zeros for all pooled connections, it means all sessions have been closed, at which point the SessionFactory can be closed and reopened. See Error Handling and Reconnection for more detail. (Bug #22353594)
Made the following enhancements and additions to the ThreadConfig multithreaded data node (ndbmtd) configuration parameter:
- Added support for non-exclusive CPU locking on FreeBSD and Windows using cpubind and cpuset.
- Added support for exclusive CPU locking on Solaris, using cpubind_exclusive and cpuset_exclusive, which are added in this release.
- Added thread prioritization using thread_prio, which is added in this release. thread_prio is supported on Linux, FreeBSD, Windows, and Solaris, but the exact effects of this setting are platform-specific; see the documentation for details.
- Added support for setting realtime on Windows platforms.
For more information, see the description of the ThreadConfig parameter in the online documentation. (Bug #25830247, WL #9096)
ndb_restore now performs output logging for specific stages of its operation. (Bug #21097957)
An improvement in the hash index implementation used by MySQL NDB Cluster data nodes means that partitions may now contain more than 16 GB of data for fixed columns, and the maximum partition size for fixed column data is now increased to 128 TB. The previous limitation originated with the DBACC block in the NDB kernel using only 32-bit references to the fixed-size part of a row handled in the DBTUP block, even though 45-bit references were already in use elsewhere in the kernel outside the DBACC block; all such references in DBACC now use 45-bit pointers instead.
As part of this work, error messages returned by the DBACC kernel block that were overly generic have now been improved by making them more specific. (Bug #13844581, Bug #17465232, WL #8570, WL #8817, WL #8962)
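The relationship between reference width and the size limits quoted above can be checked with simple arithmetic. The sketch below assumes that each reference addresses a 4-byte unit; the release note does not state the addressing unit, and this value is chosen only because it reproduces both figures.

```python
# Assumption: a reference addresses a 4-byte unit (not stated in the note).
UNIT_BYTES = 4

GIB = 1 << 30
TIB = 1 << 40

def addressable_bytes(ref_bits, unit_bytes=UNIT_BYTES):
    """Bytes reachable with a reference of the given width."""
    return (1 << ref_bits) * unit_bytes

old_limit = addressable_bytes(32)   # 32-bit references
new_limit = addressable_bytes(45)   # 45-bit references

if __name__ == "__main__":
    print(old_limit // GIB, "GB with 32-bit references")
    print(new_limit // TIB, "TB with 45-bit references")
```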
A number of changes and improvements were made to the handling of send threads by NDB. The changes are summarized with brief descriptions in the following list:
- Decreased resource requirements for send threads, making sure that a given configuration using send threads outperforms the same configuration without send threads.
- Made use of otherwise idle threads (other than receiver threads) as send threads without incurring extra CPU resources in low-load situations.
- Improved response time for write transactions.
- Provided for handling of bad configuration data in a more graceful manner.
- Made it possible to measure CPU usage with improved real-time reporting from data nodes on their resource usage.
As part of implementing the last item of those just listed, a number of new tables providing information about CPU and thread activity by node, thread ID, and thread type have been added to the ndbinfo information database. These tables are listed here:
- cpustat: Provides per-second, per-thread CPU statistics
- cpustat_50ms: Shows raw per-thread CPU statistics data, gathered every 50ms
- cpustat_1sec: Provides raw per-thread CPU statistics data, gathered each second
- cpustat_20sec: Displays raw per-thread CPU statistics data, gathered every 20 seconds
- threads: Shows names and descriptions of thread types
For more information, see ndbinfo: The NDB Cluster Information Database. (WL #8968)
Rewrote the implementation of the NDB storage engine's global schema lock functionality to make use of the metadata lock hooks implemented in the MySQL Server in MySQL 5.7.11. (WL #8331)
A number of improvements provide additional read scalability for NDB by making it possible to read tables locally. It is now possible to enable reads from any fragment replica, rather than from the primary fragment replica only. This is disabled by default to remain compatible with previous behavior, but can be enabled for a given SQL node using the ndb_read_backup system variable added in this release.
It also becomes possible to be more flexible about the assignment of partitions by setting a fragment count type. Possible count types are one per node, one per node group, one per Local Data Manager (LDM) per node (this is the previous assignment scheme and still the default), and one per LDM per node group. This setting can be controlled for individual tables by means of a FRAGMENT_COUNT_TYPE option embedded in an NDB_TABLE comment in CREATE TABLE or ALTER TABLE.
This also means that, when restoring table schemas, ndb_restore --restore-meta now uses the default partitioning for the target cluster, rather than duplicating the partitioning of the cluster from which the backup was taken. This is useful when restoring to a cluster having more data nodes than the original. See Restoring to More Nodes Than the Original, for more information.
Tables using one of the two per-node-group settings for the fragment count type can also be fully replicated. This requires that the table's fragment count type is ONE_PER_NODE_GROUP or ONE_PER_LDM_PER_NODE_GROUP, and can be enabled using the option FULLY_REPLICATED=1 within an NDB_TABLE comment. The option can be enabled by default for all new NDB tables using the ndb_fully_replicated system variable added in this release.
Settings for table-level READ_BACKUP are also supported using the COMMENT="NDB_TABLE=..." syntax. It is also possible (and often preferable) to set multiple options in one comment within a single CREATE TABLE or ALTER TABLE statement. For more information and examples, see Setting NDB Comment Options.
This release also introduces the ndb_data_node_neighbour system variable, which is intended to be employed with transaction hinting (and fully-replicated tables), and to provide the current SQL node with the ID of a “nearby” data node to use. See the description of this variable in the documentation for more information. (WL #9018, WL #9019, WL #9020)
References: See also: Bug #18435416, Bug #11762155, Bug #54717.
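The NDB_TABLE comment options described in this item can be combined in a single comment. The sketch below builds such a comment and enforces the stated rule that FULLY_REPLICATED=1 requires one of the two per-node-group fragment count types. The helper names, the table definition, the comma separator between options, and the READ_BACKUP=1 value format are assumptions for illustration; see Setting NDB Comment Options for the authoritative syntax.

```python
# Fragment count types that the release note says permit full replication.
PER_NODE_GROUP_TYPES = {"ONE_PER_NODE_GROUP", "ONE_PER_LDM_PER_NODE_GROUP"}

def ndb_table_comment(**options):
    """Build a COMMENT clause carrying NDB_TABLE options (hypothetical helper)."""
    if str(options.get("FULLY_REPLICATED", 0)) == "1":
        if options.get("FRAGMENT_COUNT_TYPE") not in PER_NODE_GROUP_TYPES:
            raise ValueError(
                "FULLY_REPLICATED=1 requires a per-node-group FRAGMENT_COUNT_TYPE"
            )
    # Assumption: multiple options are separated by commas.
    body = ",".join(f"{name}={value}" for name, value in options.items())
    return f'COMMENT="NDB_TABLE={body}"'

def create_table_sql(table, columns_sql, **options):
    """Compose a CREATE TABLE statement for an NDB table (illustrative)."""
    return (
        f"CREATE TABLE {table} ({columns_sql}) "
        f"ENGINE=NDB {ndb_table_comment(**options)}"
    )

if __name__ == "__main__":
    print(create_table_sql(
        "t1", "id INT PRIMARY KEY",
        READ_BACKUP=1,
        FRAGMENT_COUNT_TYPE="ONE_PER_NODE_GROUP",
        FULLY_REPLICATED=1,
    ))
```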

Bugs Fixed

Incompatible Change: When the data nodes are only partially connected to the API nodes, a node used for a pushdown join may get its request from a transaction coordinator on a different node, without (yet) being connected to the API node itself. In such cases, the NodeInfo object for the requesting API node contained no valid information about the software version of the API node, which caused the DBSPJ block to assume (incorrectly) when aborting that the API node used NDB version 7.2.4 or earlier. This required a backward compatibility mode to be used during query abort, which sent a node failure error instead of the real error causing the abort.
Now, whenever this situation occurs, it is assumed that, if the NDB software version is not yet available, the API node version is greater than 7.2.4. (Bug #23049170)
Important Change: When started with the --initialize option, mysqld no longer enables the NDBCLUSTER storage engine plugin. This change was needed to prevent attempted initialization of system databases as distributed (rather than as specific to individual SQL nodes), which could result in a metadata lock deadlock. This fix also brings the behavior of --initialize in this regard into line with that of the discontinued --bootstrap option, which started a minimal mysqld instance without enabling NDB. (Bug #22758238)
Performance: A performance problem was found in an internal polling method do_poll() where the polling client did not check whether it had itself been woken up before completing the poll. Subsequent analysis showed that it is sufficient that only some clients in the polling queue receive data. do_poll() can then signal these clients and give up its polling rights, even if the maximum specified wait time (10 ms) has not expired.
This change allows do_poll() to continue polling until either the maximum specified wait time has expired, or the polling client itself has been woken up (by receiving what it was waiting for). This avoids unnecessary thread switches between client threads and thus reduces the associated overhead by as much as 10% in the API client, resulting in a significant performance improvement when client threads perform their own polling. (Bug #81229, Bug #23202735)
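The stopping rule described for do_poll() can be illustrated with a toy model. This is not the NDB API code: the function signature, the representation of each poll cycle as a set of clients that received data, and the 1 ms cycle length are all assumptions. Only the two exit conditions (the polling client itself is woken, or the maximum wait time of 10 ms expires) come from the text above.

```python
def do_poll(poller, deliveries, max_wait_ms=10, cycle_ms=1):
    """Toy model of the revised polling loop.

    poller      -- id of the client currently doing the polling
    deliveries  -- iterable; each item is the set of client ids that
                   received data during one poll cycle
    Returns (woken_clients, elapsed_ms).
    """
    woken = set()
    elapsed = 0
    for ready in deliveries:
        if elapsed >= max_wait_ms:
            break                 # maximum wait time has expired
        elapsed += cycle_ms
        woken |= ready            # other clients are signalled as data arrives
        if poller in woken:
            break                 # the poller got what it was waiting for
    return woken, elapsed

if __name__ == "__main__":
    # Client "A" keeps polling while "B" and "C" receive data, and stops
    # once its own data arrives in the third cycle.
    print(do_poll("A", [{"B"}, {"C"}, {"A", "D"}, {"E"}]))
```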
macOS: On OS X, ndb_config failed when an empty string was used for the --host option. (Bug #80689, Bug #22908696)
Microsoft Windows: ndb_mgmd failed to start on 32-bit Windows platforms, due to an issue with calling dynamically loaded functions; such issues were also likely to occur with other NDB programs using ndb_init(). It was found that all of the functions used are already supported in targeted versions of Windows, so this problem is fixed by removing the dynamic loading of these functions and using the versions provided by the Windows header files instead. (Bug #80876, Bug #23014820)
Microsoft Windows: When building MySQL NDB Cluster on Windows using more than one parallel build job it was sometimes possible for the build to fail because host_info.exe could not be installed. To fix this problem, the install_mcc target is now always built prior to the host_info target. (Bug #80051, Bug #22566368)
Microsoft Windows: Performing ANALYZE TABLE on a table having one or more indexes caused ndbmtd to fail with an InvalidAttrInfo error due to signal corruption. This issue occurred consistently on Windows, but could also be encountered on other platforms. (Bug #77716, Bug #21441297)
Solaris: The ndb_print_file utility failed consistently on Solaris 9 for SPARC. (Bug #80096, Bug #22579581)
NDB Disk Data: The following improvements were made to logging during restarts by data nodes using MySQL NDB Cluster Disk Data:
- The total amount of undo log to be applied by the data node is now provided as the total number of pages present in the log. This is a worst case estimate.
- Progress information is now provided at regular intervals (once for each 30000 records) as the undo log is applied. This information is supplied as the number of records and number of undo log pages applied so far during the current restart.
(Bug #22513381)
NDB Cluster APIs: Deletion of Ndb objects used a disproportionately high amount of CPU. (Bug #22986823)
NDB Cluster APIs: Executing a transaction with an NdbIndexOperation based on an obsolete unique index caused the data node process to fail. Now the index is checked in such cases, and if it cannot be used the transaction fails with an appropriate error. (Bug #79494, Bug #22299443)
The reserved send buffer for the loopback transporter, introduced in MySQL NDB Cluster 7.4.8 and used by API and management nodes for administrative signals, was calculated incorrectly. (Bug #23093656, Bug #22016081)
References: This issue is a regression of: Bug #21664515.
During a node restart, re-creation of internal triggers used for verifying the referential integrity of foreign keys was not reliable, because it was possible that not all distributed TC and LDM instances agreed on all trigger identities. To fix this problem, an extra step is added to the node restart sequence, during which the trigger identities are determined by querying the current master node. (Bug #23068914)
References: See also: Bug #23221573.
Following the forced shutdown of one of the 2 data nodes in a cluster where NoOfReplicas=2, the other data node shut down as well, due to arbitration failure. (Bug #23006431)
Aborting a CREATE LOGFILE GROUP statement which had failed due to lack of shared global memory was not performed correctly, causing node failure. In addition, the transaction in which this occurred was not rolled back correctly, also causing any subsequent CREATE LOGFILE GROUP to fail. (Bug #22982618)
The ndbinfo.tc_time_track_stats table uses histogram buckets to give a sense of the distribution of latencies. The sizes of these buckets were also reported as HISTOGRAM BOUNDARY INFO messages during data node startup; this printout was redundant and so has been removed. (Bug #22819868)
Online upgrades from previous versions of MySQL NDB Cluster to MySQL NDB Cluster 7.5 were not possible due to missing entries in the matrix used to test upgrade compatibility between versions. (Bug #22024947)
A failure occurred in DBTUP in debug builds when variable-sized pages for a fragment totalled more than 4 GB. (Bug #21313546)
Restoration of metadata with ndb_restore -m occasionally failed with the error message Failed to create index... when creating a unique index. While diagnosing this problem, it was found that the internal error PREPARE_SEIZE_ERROR (a temporary error) was reported as an unknown error. Now in such cases, ndb_restore retries the creation of the unique index, and PREPARE_SEIZE_ERROR is reported as NDB Error 748 Busy during read of event table. (Bug #21178339)
References: See also: Bug #22989944.
mysqld did not shut down cleanly when executing ndb_index_stat. (Bug #21098142)
References: See also: Bug #23343739.
The following improvements were made to the data node error logging mechanism:
- Increased the message slot size from 499 bytes to 999 bytes to prevent log messages from overwriting one another or from being truncated.
- Added a Trace file name field to the output. This field contains the trace file name (without any path) and trace file number for the thread causing the trace.
- ndbmtd trace files are also now shown in the error log.
(Bug #21082710)
DBDICT and GETTABINFOREQ queue debugging were enhanced as follows:
- Monitoring by a data node of the progress of GETTABINFOREQ signals can be enabled by setting DictTrace >= 2.
- Added the ApiVerbose configuration parameter, which enables NDB API debug logging for an API node where it is set greater than or equal to 2.
- Added DUMP code 1229 which shows the current state of the GETTABINFOREQ queue. (See DUMP 1229.)
See also The DBDICT Block. (Bug #20368450)
References: See also: Bug #20368354.
mysql_upgrade failed to upgrade the sys schema if a sys database directory existed but was empty. (Bug #81352, Bug #23249846, Bug #22875519)
When a write to the ndb_binlog_index table failed during a MySQL Server shutdown, mysqld killed the NDB binary logging thread. (Bug #81166, Bug #23142945)
Memory associated with table descriptions was not freed by the internal table information method NdbDictInterface::parseTableInfo(). (Bug #81141, Bug #23130084)
Improved memory usage by the internal TransporterFacade constructor when performing mutex array initialization. (Bug #81134, Bug #23127824)
Fixed a memory leak that occurred when an error was raised in ha_ndbcluster::get_metadata() or one of the functions which this method calls. (Bug #81045, Bug #23089566)
An internal function used to validate connections failed to update the connection count when creating a new Ndb object. This had the potential to create a new Ndb object for every operation validating the connection, which could have an impact on performance, particularly when performing schema operations. (Bug #80750, Bug #22932982)
A table scan on an NDB table using neither an ordered index nor any Disk Data columns normally uses an ACC scan. If this happened while scanning a unique but unordered index which shrank (due to rows being deleted) after the scan started and then grew again (rows inserted), a single row that had been neither deleted nor inserted could be scanned twice. (Bug #80733, Bug #22926938)
Starting a backup in the ndb_mgm client after creating a large number of tables caused a forced shutdown of the cluster. (Bug #80640, Bug #228849958)
When an SQL node was started, and joined the schema distribution protocol, another SQL node, already waiting for a schema change to be distributed, timed out during that wait. This was because the code incorrectly assumed that the new SQL node would also acknowledge the schema distribution even though the new node joined too late to be a participant in it.
As part of this fix, printouts of schema distribution progress now always print the more significant part of a bitmask before the less significant; formatting of bitmasks in such printouts has also been improved. (Bug #80554, Bug #22842538)
The MySQL NDB Cluster Auto-Installer failed to work in various ways on different platforms. (Bug #79853, Bug #22502247)
The internal function ndbrequire(), which, like assert(), evaluates a given expression and terminates the process if the expression does not evaluate as true, now includes the failed expression in its output to the error logs. (Bug #77021, Bug #21128318)
Trying to drop a table during an ongoing backup failed with the error message Unknown table; now, it fails with Unable to alter table as backup is in progress. (Bug #47301, Bug #11755512)
References: See also: Bug #44695, Bug #11753280.