MySQL 9.3 Reference Manual Including MySQL NDB Cluster 9.3
Once your data reaches a stable size, or a growing table has
increased by tens or some hundreds of megabytes, consider
using the OPTIMIZE TABLE
statement to
reorganize the table and compact any wasted space. The
reorganized tables require less disk I/O to perform full
table scans. This is a straightforward technique that can
improve performance when other techniques such as improving
index usage or tuning application code are not practical.
OPTIMIZE TABLE
copies the data part of
the table and rebuilds the indexes. The benefits come from
improved packing of data within indexes, and reduced
fragmentation within the tablespaces and on disk. The
benefits vary depending on the data in each table. You may
find that there are significant gains for some and not for
others, or that the gains decrease over time until you next
optimize the table. This operation can be slow if the table
is large or if the indexes being rebuilt do not fit into the
buffer pool. The first run after adding a lot of data to a
table is often much slower than later runs.
In InnoDB
, having a long PRIMARY
KEY
(either a single column with a lengthy value,
or several columns that form a long composite value) wastes
a lot of disk space. The primary key value for a row is
duplicated in all the secondary index records that point to
the same row. (See Section 17.6.2.1, “Clustered and Secondary Indexes”.)
Create an AUTO_INCREMENT
column as the
primary key if your primary key is long, or index a prefix
of a long VARCHAR
column instead of the
entire column.
Use the VARCHAR
data type
instead of CHAR
to store
variable-length strings or for columns with many
NULL
values. A
CHAR(
column always takes N
)N
characters
to store data, even if the string is shorter or its value is
NULL
. Smaller tables fit better in the
buffer pool and reduce disk I/O.
When using COMPACT
row format (the
default InnoDB
format) and
variable-length character sets, such as
utf8mb4
or sjis
,
CHAR(
columns occupy a variable amount of space, but still at
least N
)N
bytes.
For tables that are big, or contain lots of repetitive text
or numeric data, consider using
COMPRESSED
row format. Less disk I/O is
required to bring data into the buffer pool, or to perform
full table scans. Before making a permanent decision,
measure the amount of compression you can achieve by using
COMPRESSED
versus
COMPACT
row format.