MySQL NDB Cluster 8.4 Release Notes
MySQL NDB Cluster 8.4.4 is a new LTS release of NDB 8.4, based on MySQL Server 8.4 and including features in version 8.4 of the NDB storage engine, as well as fixing recently discovered bugs in previous NDB Cluster releases.
Obtaining MySQL NDB Cluster 8.4. NDB Cluster 8.4 source code and binaries can be obtained from https://dev.mysql.com/downloads/cluster/.
For an overview of major changes made in NDB Cluster 8.4, see What is New in MySQL NDB Cluster 8.4.
This release also incorporates all bug fixes and changes made in previous NDB Cluster releases, as well as all bug fixes and feature changes which were added in mainline MySQL 8.4 through MySQL 8.4.4 (see Changes in MySQL 8.4.4 (2025-01-21, LTS Release)).
macOS: A uint64_t value used with the %zu format specifier caused a [-Wformat] compiler warning on macOS. (Bug #37174692)
Removed a warning in storage/ndb/src/common/util/cstrbuf.cpp. (Bug #37049014)
Microsoft Windows: Successive iterations of the sequence ndb_sign_keys --create-key followed by ndb_sign_keys --promote were unsuccessful on Windows. (Bug #36951132)
NDB Disk Data: mysqld did not use a disk scan for NDB tables with 256 or more disk columns. (Bug #37201922)
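For context, a disk column is a nonindexed column of an NDB table stored on disk in a Disk Data tablespace rather than in memory. The following is a minimal sketch of such a table, in which the log file group, tablespace, table, and file names are all placeholders; the bug affected tables having 256 or more columns declared in this way:

-- Disk Data storage requires a log file group for undo logging
CREATE LOGFILE GROUP lg_1
    ADD UNDOFILE 'undo_1.log'
    ENGINE=NDBCLUSTER;

-- ...and a tablespace holding the on-disk table data
CREATE TABLESPACE ts_1
    ADD DATAFILE 'data_1.dat'
    USE LOGFILE GROUP lg_1
    ENGINE=NDBCLUSTER;

-- Columns declared with STORAGE DISK are disk columns
CREATE TABLE t_disk (
    pk INT PRIMARY KEY,
    c1 VARCHAR(255) STORAGE DISK,
    c2 VARCHAR(255) STORAGE DISK
) TABLESPACE ts_1 ENGINE=NDBCLUSTER;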
NDB Replication: The replication applier normally retries temporary errors occurring while applying transactions. Such retry logic is not performed for transactions containing row events in which the STMT_END_F flag is missing; instead, the statement is committed in an additional step while applying the subsequent COMMIT query event, while tables are still locked. Problems arose when committing this statement because temporary errors were not handled properly. Replica skip-error functionality was also affected, in that it attempted to skip only the error that occurred when a transaction was committed a second time.
For example: the binary log contains an epoch transaction with writes from multiple server IDs on the source, and the replica uses IGNORE_SERVER_IDS (<last_server_id_in_binlog>) to cause the STMT_END_F flag to be filtered away, so that the statement is committed from the COMMIT query log event on the applier. A lock held on one of the rows to be updated by the applier then triggered error handling, which caused replication to stop with an error, with no retries being performed.
We now handle such errors, logging all messages in diagnostics areas (as is already done for row log events) and then retrying the transaction. (Bug #37331118)
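For reference, filtering events by originating server ID, as in the scenario just described, is configured on the replica along these lines (the server ID value shown is a placeholder):

-- Replication must be stopped while the source configuration is changed
STOP REPLICA;
CHANGE REPLICATION SOURCE TO IGNORE_SERVER_IDS = (3);
START REPLICA;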
NDB Replication: When a MySQL server performing binary logging connects to an NDB Cluster, it checks for existing binary logs; if it finds any, it writes an Incident event to a log file of its own so that any downstream replicas can detect the potential for lost events. Problems arose under some circumstances because it was possible for the timestamps of events logged in this file to be out of order; the Incident event was written following other events but had a smaller timestamp than these preceding events. We fix this issue by ensuring that a fresh timestamp is used prior to writing an incident to the binary log on startup, rather than one which may have been obtained and held for some time previously. (Bug #37228735)
NDB Cluster APIs: The Ndb_cluster_connection destructor calls g_eventLogger::stopAsync() in order to release the buffers used by the asynchronous logging mechanism, as well as to stop the threads responsible for this logging. When the g_eventLogger object was deleted before the Ndb_cluster_connection destructor was called, the application terminated after trying to use a method on a null object. This could happen in either of two ways:

- An API program deleted the logger object before deleting the Ndb_cluster_connection.

- ndb_end() was called before the Ndb_cluster_connection was deleted.

We solve this issue by skipping the call to stopAsync() in the Ndb_cluster_connection destructor when g_eventLogger is NULL. This fix also adds a warning to inform API users that deleting g_eventLogger before calling the Ndb_cluster_connection destructor is incorrect usage.
For more information, see API Initialization and Cleanup. (Bug #37300558)
NDB Cluster APIs: Removed known causes of API node versus data node state misalignments, and improved the handling of such misalignments when detected. In one such case, separate handling of scan errors in the NDB kernel and those originating in API programs led to cleanup not being performed after some scans. This set of fixes improves the handling of DBTC and API state alignment errors, as well as scan protocol timeout handling in DBSPJ; now, when such misalignments in state are detected, the API nodes involved are disconnected, rather than the data node detecting the problem being forced to shut down.
(Bug #20430083, Bug #22782511, Bug #23528433, Bug #28505289, Bug #36273474, Bug #36395384, Bug #36838756, Bug #37022773, Bug #37022901, Bug #37023549)
References: See also: Bug #22782511, Bug #23528433, Bug #36273474, Bug #36395384, Bug #36838756.
ndbinfo Information Database: At table create and drop time, access of ndbinfo tables such as operations_per_fragment and memory_per_fragment sometimes examined data which was not valid. To fix this, scans of these ndbinfo tables now ignore any fragments belonging to tables that are in a transient state because they are being created or dropped. (Bug #37140331)
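As an illustration, per-fragment memory usage can be inspected with a query along the following lines (a sketch only; the column selection here is abbreviated):

-- Show the ten fragments using the most fixed-element memory
SELECT fq_name, node_id, fragment_num, fixed_elem_alloc_bytes
FROM ndbinfo.memory_per_fragment
ORDER BY fixed_elem_alloc_bytes DESC
LIMIT 10;

With this fix, such a scan skips fragments of any table that is concurrently being created or dropped.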
Work done previously to support opening NDB tables with missing indexes was intended to allow the features of the MySQL server to be used to solve problems in cases where indexes cannot be rebuilt due to unmet constraints. With missing indexes, some of the SQL handler functionality is unavailable, such as using indexes to select rows efficiently for modification, to identify duplicates when processing modifications, or to push down joins relying on indexes. This could lead to the unplanned shutdown of an NDB Cluster SQL node.
In such cases, the server now simply returns an error. (Bug #37299071)
Recent refactoring of the transporter layer added the reporting of the presence of socket shutdown errors, but not their nature. This led to confusion in the common case where a socket shutdown is requested, but the socket is already closed by the peer. To avoid such confusion, this logging has been removed. (Bug #37243135)
References: This issue is a regression of: Bug #35750771.
It was not possible to create an NDB table with 256 or more BLOB columns when also specifying a reduced inline size, as in the following SQL statement:
CREATE TABLE t1 (
    pk INT PRIMARY KEY,
    b1 BLOB COMMENT 'NDB_COLUMN=BLOB_INLINE_SIZE=100',
    b2 BLOB COMMENT 'NDB_COLUMN=BLOB_INLINE_SIZE=100',
    ...,
    b256 BLOB COMMENT 'NDB_COLUMN=BLOB_INLINE_SIZE=100'
) ENGINE=NDBCLUSTER;
(Bug #37201818)
In some cases, the occurrence of node failures during shutdown led to the cluster becoming unrecoverable without manual intervention.
We fix this by modifying global checkpoint ID (GCI) information propagation (the CopyGCI mechanism) to reject propagation of any set of GCI information which does not describe the ability to recover the cluster automatically as part of a system restart. (Bug #37163647)
References: See also: Bug #37162636.
In some cases, node failures during an otherwise graceful shutdown could lead to a cluster becoming unrecoverable without manual intervention. This fix modifies the generic GCI information propagation mechanism (CopyGCI) to reject propagating any set of GCI information which does not describe the ability to recover a cluster automatically. (Bug #37162636)
Improved the variable names used in start_resend(), and enhanced related debug messages with additional information for users and developers. (Bug #37157987)
In certain cases, a COPY_FRAGREQ signal did not honor a fragment scan lock. (Bug #37125935)
In cases where NDB experienced an API protocol timeout when attempting to close a scan operation, it considered the DBTC ApiConnectRecord involved to be lost for further use, at least until the API disconnected and API failure handling within DBTC reclaimed the record. This has been improved by having the API send a TCRELEASEREQ signal to DBTC in such cases, performing API failure handling for a single ApiConnectRecord within DBTC. (Bug #37023661)
References: See also: Bug #36273474, Bug #36395384, Bug #37022773, Bug #37022901, Bug #37023549.
For tables using the NDB storage engine, the column comment option BLOB_INLINE_SIZE was silently ignored for TINYBLOB columns, whose inline size (equally silently) defaulted to the hard-coded value of 256 bytes regardless of the size provided; this was misleading to users. To fix this problem, we now disallow BLOB_INLINE_SIZE on TINYBLOB columns altogether, and NDB now prints a warning saying that the column size is defaulting to 256 bytes. (Bug #36725332)
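As an illustration, in a statement such as the following (a hypothetical table), the option is honored for the BLOB column, while for the TINYBLOB column it now draws the warning just described, with the inline size remaining at the fixed 256 bytes:

CREATE TABLE t2 (
    pk INT PRIMARY KEY,
    -- honored: BLOB columns accept an explicit inline size
    b1 BLOB COMMENT 'NDB_COLUMN=BLOB_INLINE_SIZE=1024',
    -- disallowed: TINYBLOB inline size is fixed at 256 bytes
    tb TINYBLOB COMMENT 'NDB_COLUMN=BLOB_INLINE_SIZE=100'
) ENGINE=NDBCLUSTER;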
Testing revealed that a fix for a previous issue, which added a check of the ApiConnectRecord failure number against the system's current failure number, did not initialize the ApiConnectRecord failure number in all cases. (Bug #36155195)
References: This issue is a regression of: Bug #36028828.
ndb_config did not always handle very long file paths correctly.
Our thanks to Dirkjan Bussink for the contribution. (Bug #116748, Bug #37310680)
Errors of unknown provenance were logged while assigning node IDs during cluster synchronization, leading to user doubt and concern. Logging by the data node QMGR block and by the ndb_mgmd process relating to node ID allocation issues has therefore been improved, to supply more and better information about what is being reported in such cases. (Bug #116351, Bug #37189356)
A multi-range scan sometimes lost its fragment lock for the second and subsequent ranges of the scan. (Bug #111932, Bug #35660890)