MySQL NDB Cluster 7.5 Release Notes
Important Change:
Previously, the ndbinfo information database included lookup tables that used the MyISAM storage engine. This dependency on MyISAM has now been removed.
(Bug #20075747, WL #7575)
Important Change:
Previously, the NDB scheduler always balanced speed against throughput in a predetermined, hard-coded manner. This balance can now be set using the SchedulerResponsiveness data node configuration parameter, which accepts an integer in the range 0 to 10 inclusive, with 5 as the default. Higher values provide better response times relative to throughput; lower values provide increased throughput, but impose longer response times.
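For example, to favor response time over throughput on all data nodes, this parameter might be set in the [ndbd default] section of config.ini as shown here (the value is illustrative only):

[ndbd default]
# Higher than the default of 5: favor latency over throughput
SchedulerResponsiveness=8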
(Bug #78531, Bug #21889312)
Important Change:
A number of MySQL NDB Cluster data node configuration parameters were deprecated in earlier versions of MySQL NDB Cluster, and have been removed with this release. These parameters include Id, NoOfDiskPagesToDiskDuringRestartTUP, NoOfDiskPagesToDiskDuringRestartACC, NoOfDiskPagesToDiskAfterRestartACC, NoOfDiskPagesToDiskAfterRestartTUP, ReservedSendBufferMemory, MaxNoOfIndexes, and Discless (use Diskless instead), as well as DiskCheckpointSpeed and DiskCheckpointSpeedInRestart. The archaic and unused ByteOrder computer configuration parameter has also been removed, as has the unused MaxNoOfSavedEvents management node configuration parameter. These parameters are no longer supported; most of them already did not have, or no longer had, any effect. Trying to use any of these parameters in a MySQL NDB Cluster configuration file now results in an error.
For more information, see What is New in NDB Cluster 7.5. (Bug #77404, Bug #21280428)
Important Change:
The ndbinfo database can now provide default and current information about MySQL NDB Cluster node configuration parameters as a result of the following changes:

The config_params table has been enhanced with additional columns providing information about each configuration parameter, including its type, default, and maximum and minimum values (where applicable).

A new config_values table has been added. A row in this table shows the current value of a parameter on a given node.

You can obtain values of MySQL NDB Cluster configuration parameters by name using a join on these two tables, such as the one shown here:

SELECT  p.param_name AS Name,
        v.node_id AS Node,
        p.param_type AS Type,
        p.param_default AS 'Default',
        v.config_value AS Current
FROM    config_params p
JOIN    config_values v
  ON    p.param_number = v.config_param
WHERE   p.param_name IN ('NodeId', 'HostName', 'DataMemory', 'IndexMemory');
(Bug #71587, Bug #18183958, WL #8703)
Important Change:
The ExecuteOnComputer configuration parameter for management, data, and API nodes is now deprecated, and is subject to removal in a future MySQL NDB Cluster version. For all types of MySQL NDB Cluster nodes, you should now use the HostName parameter exclusively for identifying hosts in the cluster configuration file.
This information is also now displayed in the output of ndb_config --configinfo --xml.
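For example, rather than pointing a data node at a computer entry using ExecuteOnComputer, you should identify the host directly, as in this config.ini fragment (the node ID and host name are illustrative only):

[ndbd]
NodeId=3
HostName=datahost1.example.com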
(Bug #53052, Bug #11760628)
Deprecated MySQL NDB Cluster node configuration parameters are now indicated as such by ndb_config --configinfo --xml. For each parameter currently deprecated, the corresponding <param/> tag in the XML output now includes the attribute deprecated="true".
(Bug #21127135)
Added the --ndb-cluster-connection-pool-nodeids option for mysqld, which can be used to specify a list of nodes by node ID for connection pooling. The number of node IDs in the list must equal the value set for --ndb-cluster-connection-pool.
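A minimal sketch of how these options might be used together in my.cnf (the node IDs shown are illustrative only):

[mysqld]
# Two connections in the pool, bound to node IDs 4 and 5
ndb-cluster-connection-pool=2
ndb-cluster-connection-pool-nodeids=4,5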
(Bug #19521789)
Added the PROMPT command in the ndb_mgm client. This command has the syntax PROMPT string, which sets the client's prompt to string. Issuing the command without an argument causes the prompt to be reset to the default (ndb_mgm>). See Commands in the NDB Cluster Management Client, for more information.
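A possible session is shown here (the prompt string used is arbitrary):

ndb_mgm> PROMPT mgm#1:
mgm#1: PROMPT
ndb_mgm>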
(Bug #18421338)
When the --database option has not been specified for ndb_show_tables, and no tables are found in the TEST_DB database, an appropriate warning message is now issued.
(Bug #50633, Bug #11758430)
The NDB storage engine now uses the improved records-per-key interface for index statistics introduced for the optimizer in MySQL 5.7. Some improvements due to this change are listed here:

The optimizer can now choose better execution plans for queries on NDB tables in many cases where a less optimal join index or table join order would previously have been chosen.

EXPLAIN now provides more accurate row estimates than previously.

Improved cardinality estimates can be obtained from SHOW INDEX.
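For example (the table, column, and index names here are hypothetical), the improved estimates are reflected in the output of statements such as these:

EXPLAIN SELECT * FROM t1 JOIN t2 ON t2.a = t1.a WHERE t1.b < 10;
SHOW INDEX FROM t1;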
(WL #8165)
Incompatible Change; NDB Cluster APIs:
The pollEvents2() method now returns -1, indicating an error, whenever a negative value is used for the time argument.
(Bug #20762291)
Important Change; NDB Cluster APIs:
Ndb::pollEvents() is now compatible with the TE_EMPTY, TE_INCONSISTENT, and TE_OUT_OF_MEMORY event types introduced in MySQL NDB Cluster 7.4.3. For detailed information about this change, see the description of this method in the MySQL NDB Cluster API Developer Guide.
(Bug #20646496)
Important Change; NDB Cluster APIs:
Added the method Ndb::isExpectingHigherQueuedEpochs() to the NDB API to detect when additional, newer event epochs were detected by pollEvents2().
The behavior of Ndb::pollEvents() has also been modified such that it now returns NDB_FAILURE_GCI (equal to ~(Uint64) 0) when a cluster failure has been detected.
(Bug #18753887)
Important Change; NDB Cluster APIs:
To release the memory used for dropped event operations, the event API formerly depended on pollEvents() and nextEvent() to consume all events possibly referring to the dropped events. This dependency between dropEventOperation() and the first two methods required the entire event buffer to be read before attempting to release event operation memory (that is, until successive calls to pollEvents() and nextEvent() returned no more events).
A related cleanup issue arose following the reset of the event buffer (when all event operations had previously been dropped), and the event buffer was truncated by the first createEventOperation() call subsequent to the reset.
To fix these problems, the event buffer is now cleared when the last event operation is dropped, rather than waiting for a subsequent create operation which might or might not occur. Memory taken up by dropped event operations is also now released when the event queue has been cleared, which removes the hidden requirement for consuming all events to free up memory. In addition, event operation memory is now released as soon as all events referring to the operation have been consumed, rather than waiting for the entire event buffer to be consumed. (Bug #78145, Bug #21661297)
Important Change; NDB Cluster APIs:
The MGM API error-handling functions ndb_mgm_get_latest_error(), ndb_mgm_get_latest_error_msg(), and ndb_mgm_get_latest_error_desc() each failed when used with a NULL handle. You should note that, although these functions are now null-safe, values returned in this case are arbitrary and not meaningful.
(Bug #78130, Bug #21651706)
Important Change; NDB Cluster APIs: The following NDB API methods were not actually implemented and have been removed from the sources:

Datafile methods: getNode(), setNode(), and getFileNo()

Undofile methods: getNode(), setNode(), and getFileNo()

Table methods: getObjectType() and setObjectType()
Important Change:
The options controlling the behavior of NDB programs with regard to the number and timing of successive attempts to connect to a management server have changed as listed here:

The minimum value for the --connect-retry-delay option common to all NDB programs has been changed from 0 to 1; this means that all NDB programs now wait at least 1 second between successive connection attempts, and it is no longer possible to set a waiting time equal to 0.

The semantics for the --connect-retries option have changed slightly, such that the value of this option now sets the number of times an NDB program tries to connect to a management server. Setting this option to 0 now causes the program to attempt the connection indefinitely, until it either succeeds or is terminated by other means (such as kill).

In addition, the default for the --connect-retries option for the ndb_mgm client has been changed from 3 to 12, so that the minimum, maximum, and default values for this option when used with ndb_mgm are now exactly the same as for all other NDB programs.

The ndb_mgm --try-reconnect option, although deprecated in MySQL NDB Cluster 7.4, continues to be supported as a synonym for ndb_mgm --connect-retries to provide backwards compatibility. The default value for --try-reconnect has also been changed from 3 to 12, so that this option continues to behave in exactly the same way as --connect-retries.
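The new semantics can be illustrated with these invocations (a sketch; the values shown are illustrative only):

# Try 12 times (the new default), waiting 5 seconds between attempts
ndb_mgm --connect-retries=12 --connect-retry-delay=5

# Keep trying indefinitely until connected or killed
ndb_mgm --connect-retries=0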
(Bug #22116937)
Important Change:
In previous versions of MySQL NDB Cluster, other DDL operations could not be part of ALTER ONLINE TABLE ... RENAME .... (This was disallowed by the fix for BUG#16021021.) MySQL NDB Cluster 7.5 makes the following changes:

Support for the ONLINE and OFFLINE keywords, which was deprecated in MySQL NDB Cluster 7.3, is now removed, and use of these now causes a syntax error; the NDB storage engine now accepts only ALGORITHM = DEFAULT, ALGORITHM = COPY, and ALGORITHM = INPLACE to specify whether the ALTER operation is copying or in-place, just as in the standard MySQL Server.

NDB now allows ALTER TABLE ... ALGORITHM=COPY, RENAME ..., as shown in the sketch following this list.
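The following statements sketch the supported syntax (the table and column names are hypothetical):

-- In-place ALTER, as in the standard MySQL Server
ALTER TABLE t1 ALGORITHM=INPLACE, ADD COLUMN c2 INT COLUMN_FORMAT DYNAMIC;

-- Copying ALTER combined with a rename, now allowed by NDB
ALTER TABLE t1 ALGORITHM=COPY, RENAME TO t2;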
(Bug #20804269, Bug #76543, Bug #20479917, Bug #75797)
References: See also: Bug #16021021.
NDB Disk Data:
A unique index on a column of an NDB table is implemented with an associated internal ordered index, used for scanning. While dropping an index, this ordered index was dropped first, followed by the drop of the unique index itself. This meant that, when the drop was rejected due to (for example) a constraint violation, the statement was rejected but the associated ordered index remained deleted, so that any subsequent operation using a scan on this table failed. We fix this problem by removing the unique index first; removal of the related ordered index is no longer performed when removal of the unique index fails.
(Bug #78306, Bug #21777589)
NDB Cluster APIs:
The binary log injector did not work correctly with TE_INCONSISTENT event type handling by Ndb::nextEvent().
(Bug #22135541)
References: See also: Bug #20646496.
NDB Cluster APIs:
While executing dropEvent(), if the coordinator DBDICT failed after the subscription manager (SUMA block) had removed all subscriptions but before the coordinator had deleted the event from the system table, the dropped event remained in the table, causing any subsequent drop or create event with the same name to fail with NDB error 1419 Subscription already dropped or error 746 Event name already exists. This occurred even when calling dropEvent() with a nonzero force argument.
Now in such cases, error 1419 is ignored, and DBDICT deletes the event from the table.
(Bug #21554676)
NDB Cluster APIs:
Creation and destruction of Ndb_cluster_connection objects by multiple threads could make use of the same application lock, which in some cases led to failures in the global dictionary cache. To alleviate this problem, the creation and destruction of several internal NDB API objects have been serialized.
(Bug #20636124)
NDB Cluster APIs:
When an Ndb object created prior to a failure of the cluster was reused, the event queue of this object could still contain data node events originating from before the failure. These events could reference “old” epochs (from before the failure occurred), which in turn could violate the assumption made by the nextEvent() method that epoch numbers always increase. This issue is addressed by explicitly clearing the event queue in such cases.
(Bug #18411034)
References: See also: Bug #20888668.
NDB Cluster APIs:
Ndb::pollEvents() and pollEvents2() were slow to receive events, being dependent on other client threads or blocks to perform polling of transporters on their behalf. This fix allows a client thread to perform its own transporter polling when it has to wait in either of these methods.
Introduction of transporter polling also revealed a problem with missing mutex protection in the ndbcluster_binlog handler, which has been added as part of this fix.
(Bug #79311, Bug #20957068, Bug #22224571, WL #8627)
NDB Cluster APIs:
After the initial restart of a node following a cluster failure, the cluster failure event added as part of the restart process was deleted when an event that existed prior to the restart was later deleted. This meant that, in such cases, an Event API client had no way of knowing that failure handling was needed. In addition, the GCI used for the final cleanup of deleted event operations, performed by pollEvents() and nextEvent() when these methods have consumed all available events, was lost.
(Bug #78143, Bug #21660947)
A serious regression was inadvertently introduced in MySQL NDB Cluster 7.4.8 whereby local checkpoints, and thus restarts, often took much longer than expected. This occurred because the setting for MaxDiskWriteSpeedOwnRestart was ignored during restarts and the value of MaxDiskWriteSpeedOtherNodeRestart, whose default is much lower than that of MaxDiskWriteSpeedOwnRestart, was used instead. This issue affected restart times and performance only and did not have any impact on normal operations.
(Bug #22582233)
The epoch for the latest restorable checkpoint provided in the cluster log as part of its reporting for EventBufferStatus events (see NDB Cluster: Messages in the Cluster Log) was not well defined and thus unreliable; depending on various factors, the reported epoch could be the one currently being consumed, the one most recently consumed, or the next one queued for consumption.
This fix ensures that the latest restorable global checkpoint is always regarded as the one that was most recently completely consumed by the user, and thus that it was the latest restorable global checkpoint that existed at the time the report was generated. (Bug #22378288)
Added the --ndb-allow-copying-alter-table option for mysqld. Setting this option (or the equivalent system variable ndb_allow_copying_alter_table) to OFF keeps ALTER TABLE statements from performing copying operations. The default value is ON.
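A minimal sketch of its use (the table and column names are hypothetical; this assumes the session scope of the variable):

SET SESSION ndb_allow_copying_alter_table = OFF;
-- An operation requiring a copy is now rejected:
ALTER TABLE t1 ALGORITHM=COPY, ADD COLUMN c2 INT;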
(Bug #22187649)
References: See also: Bug #17400320.
Attempting to create an NDB table having greater than the maximum supported combined width for all BIT columns (4096) caused data node failure when these columns were defined with COLUMN_FORMAT DYNAMIC.
(Bug #21889267)
Creating a table with the maximum supported number of columns (512), all of them using COLUMN_FORMAT DYNAMIC, led to data node failures.
(Bug #21863798)
In a MySQL NDB Cluster with multiple LDM instances, all instances wrote to the node log, even inactive instances on other nodes. During restarts, this caused the log to be filled with messages from other nodes, such as the messages shown here:
2015-06-24 00:20:16 [ndbd] INFO -- We are adjusting Max Disk Write Speed, a restart is ongoing now
...
2015-06-24 01:08:02 [ndbd] INFO -- We are adjusting Max Disk Write Speed, no restarts ongoing anymore
Now this logging is performed only by the active LDM instance. (Bug #21362380)
Backup block states were reported incorrectly during backups. (Bug #21360188)
References: See also: Bug #20204854, Bug #21372136.
For a timeout in GET_TABINFOREQ while executing a CREATE INDEX statement, mysqld returned Error 4243 (Index not found) instead of the expected Error 4008 (Receive from NDB failed).
The fix for this bug also fixes similar timeout issues for a number of other signals that are sent to the DBDICT kernel block as part of DDL operations, including ALTER_TAB_REQ, CREATE_INDX_REQ, DROP_FK_REQ, DROP_INDX_REQ, INDEX_STAT_REQ, DROP_FILE_REQ, CREATE_FILEGROUP_REQ, DROP_FILEGROUP_REQ, CREATE_EVENT, WAIT_GCP_REQ, DROP_TAB_REQ, and LIST_TABLES_REQ, as well as several internal functions used in handling NDB schema operations.
(Bug #21277472)
References: See also: Bug #20617891, Bug #20368354, Bug #19821115.
Previously, multiple send threads could be invoked for handling sends to the same node; these threads then competed for the same send lock. While the send lock blocked the additional send threads, work threads could be passed to other nodes.
This issue is fixed by ensuring that new send threads are not activated while there is already an active send thread assigned to the same node. In addition, a node already having an active send thread assigned to it is no longer visible to other, already active, send threads; that is, such a node is no longer added to the node list when a send thread is currently assigned to it. (Bug #20954804, Bug #76821)
Queueing of pending operations when the redo log was overloaded (DefaultOperationRedoProblemAction API node configuration parameter) could lead to timeouts when data nodes ran out of redo log space (P_TAIL_PROBLEM errors). Now when the redo log is full, the node aborts requests instead of queuing them.
(Bug #20782580)
References: See also: Bug #20481140.
An NDB event buffer can be used with an Ndb object to subscribe to table-level row change event streams. Users subscribe to an existing event; this causes the data nodes to start sending event data signals (SUB_TABLE_DATA) and epoch completion signals (SUB_GCP_COMPLETE) to the Ndb object. SUB_GCP_COMPLETE_REP signals can arrive for execution in the concurrent receiver thread before completion of the internal method call used to start a subscription.
Execution of SUB_GCP_COMPLETE_REP signals depends on the total number of SUMA buckets (sub data streams), but this may not yet have been set, leading to the present issue, in which the counter used for tracking the SUB_GCP_COMPLETE_REP signals (TOTAL_BUCKETS_INIT) was found to be set to erroneous values. Now TOTAL_BUCKETS_INIT is tested to be sure it has been set correctly before it is used.
(Bug #20575424, Bug #76255)
References: See also: Bug #20561446, Bug #21616263.
NDB statistics queries could be delayed by the error delay set for ndb_index_stat_option (default 60 seconds) when the index that was queried had been marked with an internal error. The same underlying issue could also cause ANALYZE TABLE to hang when executed against an NDB table having multiple indexes where an internal error occurred on one or more, but not all, indexes.
Now in such cases, any existing statistics are returned immediately, without waiting for any additional statistics to be discovered. (Bug #20553313, Bug #20707694, Bug #76325)
Memory allocated when obtaining a list of tables or databases was not freed afterward. (Bug #20234681, Bug #74510)
References: See also: Bug #18592390, Bug #72322.
Added the BackupDiskWriteSpeedPct data node parameter. Setting this parameter causes the data node to reserve a percentage of its maximum write speed (as determined by the value of MaxDiskWriteSpeed) for use in local checkpoints while performing a backup. BackupDiskWriteSpeedPct is interpreted as a percentage which can be set between 0 and 90 inclusive, with a default value of 50.
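For example, in the [ndbd default] section of config.ini (the values shown are illustrative only):

[ndbd default]
MaxDiskWriteSpeed=20M
# Reserve 60% of the maximum write speed for LCPs during backup
BackupDiskWriteSpeedPct=60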
(Bug #20204854)
References: See also: Bug #21372136.
After restoring the database schema from backup using ndb_restore, auto-discovery of restored tables in transactions having multiple statements did not work correctly, resulting in Deadlock found when trying to get lock; try restarting transaction errors.
This issue was encountered both in the mysql client, as well as when such transactions were executed by application programs using Connector/J and possibly other MySQL APIs.
Prior to upgrading, this issue can be worked around by executing the following statement on all SQL nodes following the restore operation, before executing any other statements:

SELECT TABLE_NAME, TABLE_SCHEMA
FROM INFORMATION_SCHEMA.TABLES
WHERE ENGINE = 'NDBCLUSTER';
(Bug #18075170)
When using ndb_mgm STOP -f to force a node shutdown, even when it triggered a complete shutdown of the cluster, it was possible to lose data when a sufficient number of nodes were shut down, triggering a cluster shutdown, and the timing was such that SUMA handovers had been made to nodes already in the process of shutting down.
(Bug #17772138)
When using a sufficiently large value for TransactionDeadlockDetectionTimeout and the default value for sort_buffer_size, executing SELECT * FROM ndbinfo.cluster_operations ORDER BY transid with multiple concurrent conflicting or deadlocked transactions, each transaction having several pending operations, caused the SQL node where the query was run to fail.
(Bug #16731538, Bug #67596)
The ndbinfo.config_params table is now read-only.
(Bug #11762750, Bug #55383)
NDB failed during a node restart due to the status of the current local checkpoint being set but not as active, even though it could have other states under such conditions.
(Bug #78780, Bug #21973758)
ndbmtd checked for signals being sent only after a full cycle in run_job_buffers, which is performed for all job buffer inputs. Now this is done as part of run_job_buffers itself, which avoids executing for extended periods of time without sending to other nodes or flushing signals to other threads.
(Bug #78530, Bug #21889088)
When attempting to enable index statistics, creation of the required system tables, events and event subscriptions often fails when multiple mysqld processes using index statistics are started concurrently in conjunction with starting, restarting, or stopping the cluster, or with node failure handling. This is normally recoverable, since the affected mysqld process or processes can (and do) retry these operations shortly thereafter. For this reason, such failures are no longer logged as warnings, but merely as informational events. (Bug #77760, Bug #21462846)
It was possible to end up with a lock on the send buffer mutex when send buffers became a limiting resource, due either to insufficient send buffer resource configuration, problems with slow or failing communications such that all send buffers became exhausted, or slow receivers failing to consume what was sent. In this situation worker threads failed to allocate send buffer memory for signals, and attempted to force a send in order to free up space, while at the same time the send thread was busy trying to send to the same node or nodes. All of these threads competed for taking the send buffer mutex, which resulted in the lock already described, reported by the watchdog as Stuck in Send. This fix is made in two parts, listed here:
The send thread no longer holds the global send thread mutex while getting the send buffer mutex; it now releases the global mutex prior to locking the send buffer mutex. This keeps worker threads from getting stuck in send in such cases.
Locking of the send buffer mutex done by the send threads now uses a try-lock. If the try-lock fails, the node to make the send to is reinserted at the end of the list of send nodes in order to be retried later. This removes the Stuck in Send condition for the send threads.
(Bug #77081, Bug #21109605)