MySQL NDB Cluster 7.6 Release Notes
NDB Disk Data: A new file format is introduced in this release for NDB Disk Data tables. The new format provides a mechanism whereby each Disk Data table can be uniquely identified without reusing table IDs. This is intended to help resolve issues with page and extent handling that were visible to the user as problems with rapid creating and dropping of Disk Data tables, and for which the old format did not provide a ready means to fix.
The new format is now used whenever new undo log file groups and tablespace data files are created. Files relating to existing Disk Data tables continue to use the old format until their tablespaces and undo log file groups are re-created. Important: The old and new formats are not compatible and so cannot be employed for different data files or undo log files that are used by the same Disk Data table or tablespace.
To avoid problems relating to the old format, you should re-create any existing tablespaces and undo log file groups when upgrading. You can do this by performing an initial restart of each data node (that is, using the --initial option) as part of the upgrade process. Since the current release is a pre-GA Developer release, this initial node restart is optional for now, but you should expect it (and prepare for it now) to be mandatory in GA versions of NDB 7.6.
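A rolling initial restart such as the one just described could be scripted along the following lines. This is only a minimal sketch: the management server address and data node IDs are assumptions for illustration, ndb_mgm must be on the path, and in practice you should wait for each node to rejoin the cluster (for example, by polling ndb_mgm -e SHOW or using ndb_waiter) before restarting the next one.

    # Minimal sketch of a rolling initial restart; the node IDs and the
    # management server address below are assumptions for illustration only.
    import subprocess

    MGM_HOST = "mgmhost:1186"   # hypothetical management server connect string
    DATA_NODES = [1, 2]         # hypothetical data node IDs

    for node_id in DATA_NODES:
        # Ask the management server to restart this node with --initial,
        # so that its Disk Data files are re-created (see above).
        subprocess.run(
            ["ndb_mgm", "-c", MGM_HOST, "-e", f"{node_id} RESTART -i"],
            check=True,
        )
        # In practice, wait here until the node has rejoined the cluster
        # (for example, poll "ndb_mgm -e SHOW" or use ndb_waiter) before
        # continuing with the next node.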
If you are using Disk Data tables, a downgrade from any NDB 7.6 release to any NDB 7.5 or earlier release requires restarting data nodes with --initial as part of the downgrade process, due to the fact that NDB 7.5 and earlier releases cannot read the new Disk Data file format.
For more information, see Upgrading and Downgrading NDB Cluster. (WL #9778)
Packaging: NDB Cluster Auto-Installer RPM packages for SLES 12 failed due to a dependency on python2-crypto instead of python-pycrypto. (Bug #25399608)
NDB Disk Data: Stale data from NDB Disk Data tables that had been dropped could potentially be included in backups, because disk scans were still enabled for such tables. To prevent this possibility, disk scans (like other types of scans) are now disabled when taking a backup. (Bug #84422, Bug #25353234)
NDB Cluster APIs: When signals were sent while the client process was receiving signals such as SUB_GCP_COMPLETE_ACK and TC_COMMIT_ACK, these signals were temporarily buffered in the send buffers of the clients which sent them. If not explicitly flushed, the signals remained in these buffers until the client woke up again and flushed its buffers. Because no attempt was made to enforce an upper limit on how long a signal could remain unsent in the local client buffers, this could lead to timeouts and other misbehavior in the components waiting for these signals.
In addition, the fix for a previous, related issue likely made this situation worse by removing client wakeups during which the client send buffers could have been flushed.
The current fix moves responsibility for flushing the messages sent by receivers to the receiver (the poll_owner client). This means that it is no longer necessary to wake up all clients merely to have them flush their buffers. Instead, the poll_owner client (which is already running) flushes the send buffers for whatever was sent while delivering signals to the recipients.
(Bug #22705935)
References: See also: Bug #18753341, Bug #23202735.
NDB Cluster APIs: The adaptive send algorithm was not used as expected, resulting in every execution request being sent to the NDB kernel immediately, instead of first trying to collect multiple requests into larger blocks before sending them. This incurred a performance penalty on the order of 10%. The issue was due to the transporter layer always handling the forceSend argument used in several API methods (including nextResult() and close()) as true.
(Bug #82738, Bug #24526123)
The ndb_print_backup_file utility failed when attempting to read from a backup file when the backup included a table having more than 500 columns. (Bug #25302901)
References: See also: Bug #25182956.
ndb_restore did not restore tables having more than 341 columns correctly. This was due to the fact that the buffer used to hold table metadata read from .ctl files was of insufficient size, so that only part of the table descriptor could be read from it in such cases. This issue is fixed by increasing the size of the buffer used by ndb_restore for file reads.
(Bug #25182956)
References: See also: Bug #25302901.
No traces were written when ndbmtd received a signal in any thread other than the main thread, due to the fact that all signals were blocked for other threads. This issue is fixed by the removal of SIGBUS, SIGFPE, SIGILL, and SIGSEGV signals from the list of signals being blocked.
(Bug #25103068)
The ndb_show_tables utility did not display type information for hash maps or fully replicated triggers. (Bug #24383742)
The NDB Cluster Auto-Installer did not show the user how to force an exit from the application (CTRL+C). (Bug #84235, Bug #25268310)
The NDB Cluster Auto-Installer failed to exit when it was unable to start the associated service. (Bug #84234, Bug #25268278)
The NDB Cluster Auto-Installer failed when the port specified by the --port option (or the default port 8081) was already in use. Now in such cases, when the required port is not available, the next 20 ports are tested in sequence, with the first one available being used; only if all of these are in use does the Auto-Installer fail.
(Bug #84233, Bug #25268221)
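The fallback behavior can be pictured with the following sketch. This is not the Auto-Installer's actual code; it simply shows one way of probing a base port and the 20 ports that follow it, which is the approach described above.

    # Illustrative only: probe the requested port and the next 20 ports,
    # returning the first one that can be bound.
    import socket

    def find_free_port(base_port=8081, attempts=21):
        for port in range(base_port, base_port + attempts):
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                try:
                    s.bind(("", port))
                except OSError:
                    continue        # port in use; try the next one
                return port         # first available port
        raise RuntimeError("no free port found in range")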
Multiple instances of the NDB Cluster Auto-Installer were not detected. This could lead to inadvertent multiple deployments on the same hosts, stray processes, and similar issues. This issue is fixed by having the Auto-Installer create a PID file (mcc.pid), which is removed upon a successful exit.
(Bug #84232, Bug #25268121)
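The general PID-file technique works roughly as in the following sketch. Only the file name mcc.pid comes from this entry; the location, checks, and error handling here are assumptions for illustration.

    # Rough sketch of PID-file handling; only the name mcc.pid is taken
    # from the release note, the rest is illustrative.
    import os
    import sys

    PID_FILE = "mcc.pid"

    def acquire_pid_file():
        if os.path.exists(PID_FILE):
            # Another instance appears to be running (or exited uncleanly).
            sys.exit("another instance seems to be running; remove %s if stale"
                     % PID_FILE)
        with open(PID_FILE, "w") as f:
            f.write(str(os.getpid()))

    def release_pid_file():
        # Called on successful exit, mirroring the behavior of the fix.
        if os.path.exists(PID_FILE):
            os.remove(PID_FILE)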
When a data node running with StopOnError set to 0 underwent an unplanned shutdown, the automatic restart performed the same type of start as the previous one. In the case where the data node had previously been started with the --initial option, this meant that an initial start was performed, which in cases of multiple data node failures could lead to loss of data. This issue also occurred whenever a data node shutdown led to generation of a core dump. A check is now performed to catch all such cases, and to perform a normal restart instead.
In addition, in cases where a failed data node was unable to send start phase information to the angel process before shutting down, the shutdown was always treated as a startup failure, also leading to an initial restart. This issue is fixed by adding a check to execute startup failure handling only if a valid start phase was received from the client. (Bug #83510, Bug #24945638)
Data nodes that were shut down when the redo log was exhausted did not automatically trigger a local checkpoint when restarted, and required the use of DUMP 7099 to start one manually.
(Bug #82469, Bug #24412033)
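Before the fix, the manual workaround mentioned above could be applied from a script along the following lines; the management server connect string is an assumption, and ndb_mgm must be available on the path.

    # Illustrative only: issue DUMP 7099 (start a local checkpoint manually)
    # through the ndb_mgm client; the connect string is an assumption.
    import subprocess

    subprocess.run(
        ["ndb_mgm", "-c", "mgmhost:1186", "-e", "ALL DUMP 7099"],
        check=True,
    )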
When a data node was restarted, the node was first stopped, and then, after a fixed wait, the management server assumed that the node had entered the NOT_STARTED state, at which point the node was sent a start signal. If the node was not ready because it had not yet completed stopping (and was therefore not actually in NOT_STARTED), the signal was silently ignored.
To fix this issue, the management server now checks to see whether the data node has in fact reached the NOT_STARTED state before sending the start signal. The wait for the node to reach this state is split into two separate checks:
Wait for data nodes to start shutting down (maximum 12 seconds)
Wait for data nodes to complete shutting down and reach NOT_STARTED state (maximum 120 seconds)
If either of these cases times out, the restart is considered failed, and an appropriate error is returned. (Bug #49464, Bug #11757421)
References: See also: Bug #28728485.
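The two-stage wait described in this entry can be pictured roughly as follows. Here get_node_status() and send_start_signal() are hypothetical stand-ins for the management server's internal checks; only the 12- and 120-second limits come from the entry above.

    # Rough sketch of the two-stage wait; get_node_status() and
    # send_start_signal() are hypothetical stand-ins, not real APIs.
    import time

    def wait_for(predicate, timeout):
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if predicate():
                return True
            time.sleep(0.5)
        return False

    def restart_node(node_id, get_node_status, send_start_signal):
        # Stage 1: wait up to 12 seconds for the node to begin shutting down.
        if not wait_for(lambda: get_node_status(node_id)
                        in ("STOPPING", "NOT_STARTED"), 12):
            raise RuntimeError("restart failed: node did not begin shutting down")
        # Stage 2: wait up to 120 seconds for it to reach NOT_STARTED.
        if not wait_for(lambda: get_node_status(node_id) == "NOT_STARTED", 120):
            raise RuntimeError("restart failed: node did not reach NOT_STARTED")
        # Only now is it safe to send the start signal.
        send_start_signal(node_id)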