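MySQL supports SQL data types in several categories: numeric types, date and time types, string (character and byte) types, spatial types, and the JSON data type. This chapter provides an overview of these data types, a more detailed description of the properties of the types in each category, and a summary of the data type storage requirements. The initial overview is intentionally brief. For additional information about particular data types, such as the permissible formats in which you can specify values, consult the more detailed descriptions later in the chapter.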
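Data type descriptions use these conventions:
A summary of the numeric data types follows. For additional information about properties and storage requirements of the numeric types, see Section 11.2, “Numeric Types”, and Section 11.8, “Data Type Storage Requirements”.
For integer types, M indicates the maximum display width. The maximum display width is 255. Display width is unrelated to the range of values a type can contain, as described in Section 11.2, “Numeric Types”.
For floating-point and fixed-point types, M is the total number of digits that can be stored.
As of MySQL 8.0.17, the display width attribute is deprecated for integer data types and will be removed in a future MySQL version.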
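If you specify ZEROFILL for a numeric column, MySQL automatically adds the UNSIGNED attribute to the column.
As of MySQL 8.0.17, the ZEROFILL attribute is deprecated for numeric data types and will be removed in a future MySQL version. Consider using an alternative means of producing the effect of this attribute. For example, applications can use the LPAD() function to zero-pad numbers up to the desired width, or they can store the formatted numbers in CHAR columns.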
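As an illustration of the LPAD() alternative, the following sketch zero-pads an integer to four digits at query time instead of relying on ZEROFILL. The table name invoices and the column invoice_no are hypothetical.

```sql
-- Hypothetical table: a plain integer column without ZEROFILL.
CREATE TABLE invoices (invoice_no INT UNSIGNED NOT NULL);
INSERT INTO invoices VALUES (5), (42), (12345);

-- LPAD() pads on the left to a length of 4. It also shortens longer
-- values to 4 characters (12345 becomes '1234'), so choose a width
-- that covers the largest expected value.
SELECT invoice_no, LPAD(invoice_no, 4, '0') AS padded FROM invoices;
```

Unlike ZEROFILL, this keeps the formatting decision in the query, so the stored column remains an ordinary integer.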
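Numeric data types that permit the UNSIGNED attribute also permit SIGNED. However, these data types are signed by default, so the SIGNED attribute has no effect.
As of MySQL 8.0.17, the UNSIGNED attribute is deprecated for columns of type FLOAT, DOUBLE, and DECIMAL (and any synonyms) and will be removed in a future MySQL version. Consider using a simple CHECK constraint instead for such columns.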
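A minimal sketch of the CHECK alternative follows, assuming a server where CHECK constraints are enforced (MySQL 8.0.16 and later). The table, column, and constraint names are hypothetical.

```sql
-- Hypothetical table: a plain DECIMAL column with a CHECK constraint
-- takes the place of DECIMAL(10,2) UNSIGNED.
CREATE TABLE products (
    price DECIMAL(10,2) NOT NULL,
    CONSTRAINT price_nonnegative CHECK (price >= 0)
);

INSERT INTO products VALUES (19.99);  -- accepted
INSERT INTO products VALUES (-1.00);  -- rejected: the CHECK constraint is violated
```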
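SERIAL is an alias for BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE.
SERIAL DEFAULT VALUE in the definition of an integer column is an alias for NOT NULL AUTO_INCREMENT UNIQUE.
When you use subtraction between integer values where one is of type UNSIGNED, the result is unsigned unless the NO_UNSIGNED_SUBTRACTION SQL mode is enabled. See Section 12.10, “Cast Functions and Operators”.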
BIT[(M)]
A bit-value type. M indicates the number of bits per value, from 1 to 64. The default is 1 if M is omitted.
TINYINT[(M)] [UNSIGNED] [ZEROFILL]
A very small integer. The signed range is -128 to 127. The unsigned range is 0 to 255.
BOOL, BOOLEAN
These types are synonyms for TINYINT(1). A value of zero is considered false. Nonzero values are considered true:
mysql> SELECT IF(0, 'true', 'false');
+------------------------+
| IF(0, 'true', 'false') |
+------------------------+
| false                  |
+------------------------+
mysql> SELECT IF(1, 'true', 'false');
+------------------------+
| IF(1, 'true', 'false') |
+------------------------+
| true                   |
+------------------------+
mysql> SELECT IF(2, 'true', 'false');
+------------------------+
| IF(2, 'true', 'false') |
+------------------------+
| true                   |
+------------------------+
However, the values TRUE and FALSE are merely aliases for 1 and 0, respectively, as shown here:
mysql> SELECT IF(0 = FALSE, 'true', 'false');
+--------------------------------+
| IF(0 = FALSE, 'true', 'false') |
+--------------------------------+
| true                           |
+--------------------------------+
mysql> SELECT IF(1 = TRUE, 'true', 'false');
+-------------------------------+
| IF(1 = TRUE, 'true', 'false') |
+-------------------------------+
| true                          |
+-------------------------------+
mysql> SELECT IF(2 = TRUE, 'true', 'false');
+-------------------------------+
| IF(2 = TRUE, 'true', 'false') |
+-------------------------------+
| false                         |
+-------------------------------+
mysql> SELECT IF(2 = FALSE, 'true', 'false');
+--------------------------------+
| IF(2 = FALSE, 'true', 'false') |
+--------------------------------+
| false                          |
+--------------------------------+
The last two statements display the results shown because 2 is equal to neither 1 nor 0.
SMALLINT[(M)] [UNSIGNED] [ZEROFILL]
A small integer. The signed range is -32768 to 32767. The unsigned range is 0 to 65535.
MEDIUMINT[(M)] [UNSIGNED] [ZEROFILL]
A medium-sized integer. The signed range is -8388608 to 8388607. The unsigned range is 0 to 16777215.
INT[(M)] [UNSIGNED] [ZEROFILL]
A normal-size integer. The signed range is -2147483648 to 2147483647. The unsigned range is 0 to 4294967295.
INTEGER[(M)] [UNSIGNED] [ZEROFILL]
This type is a synonym for INT.
BIGINT[(M)] [UNSIGNED] [ZEROFILL]
A large integer. The signed range is -9223372036854775808 to 9223372036854775807. The unsigned range is 0 to 18446744073709551615.
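SERIAL is an alias for BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE.
Some things you should be aware of with respect to BIGINT columns:
All arithmetic is done using signed BIGINT or DOUBLE values, so you should not use unsigned big integers larger than 9223372036854775807 (63 bits) except with bit functions. If you do that, some of the last digits in the result may be wrong because of rounding errors when converting a BIGINT value to a DOUBLE.
MySQL can handle BIGINT in the following cases:
When using integers to store large unsigned values in a BIGINT column.
In MIN(col_name) or MAX(col_name), where col_name refers to a BIGINT column.
You can always store an exact integer value in a BIGINT column by storing it using a string. In this case, MySQL performs a string-to-number conversion that involves no intermediate double-precision representation.
The -, +, and * operators use BIGINT arithmetic when both operands are integer values. This means that if you multiply two big integers (or results from functions that return integers), you may get unexpected results when the result is larger than 9223372036854775807.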
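The string-based storage described above can be sketched as follows. The table name counters is hypothetical; the value used is the unsigned BIGINT maximum given earlier in this section.

```sql
-- Hypothetical table with an unsigned BIGINT column.
CREATE TABLE counters (n BIGINT UNSIGNED);

-- The quoted value is converted from string to integer directly,
-- without passing through a double-precision intermediate.
INSERT INTO counters VALUES ('18446744073709551615');

SELECT n FROM counters;
```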
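DECIMAL[(M[,D])] [UNSIGNED] [ZEROFILL]
A packed “exact” fixed-point number. M is the total number of digits (the precision) and D is the number of digits after the decimal point (the scale). The decimal point and (for negative numbers) the - sign are not counted in M. If D is 0, values have no decimal point or fractional part. The maximum number of digits (M) for DECIMAL is 65. The maximum number of supported decimals (D) is 30. If D is omitted, the default is 0. If M is omitted, the default is 10.
UNSIGNED, if specified, disallows negative values. As of MySQL 8.0.17, the UNSIGNED attribute is deprecated for columns of type DECIMAL (and any synonyms) and will be removed in a future MySQL version. Consider using a simple CHECK constraint instead for such columns.
All basic calculations (+, -, *, /) with DECIMAL columns are done with a precision of 65 digits.
DEC[(M[,D])] [UNSIGNED] [ZEROFILL], NUMERIC[(M[,D])] [UNSIGNED] [ZEROFILL], FIXED[(M[,D])] [UNSIGNED] [ZEROFILL]
These types are synonyms for DECIMAL.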
FLOAT[(M,D)] [UNSIGNED] [ZEROFILL]
A small (single-precision) floating-point number. Permissible values are -3.402823466E+38 to -1.175494351E-38, 0, and 1.175494351E-38 to 3.402823466E+38. These are the theoretical limits, based on the IEEE standard. The actual range might be slightly smaller depending on your hardware or operating system.
M is the total number of digits and D is the number of digits following the decimal point. If M and D are omitted, values are stored to the limits permitted by the hardware. A single-precision floating-point number is accurate to approximately 7 decimal places.
FLOAT(M,D) is a nonstandard MySQL extension. As of MySQL 8.0.17, this syntax is deprecated and support for it will be removed in a future MySQL version.
UNSIGNED, if specified, disallows negative values. As of MySQL 8.0.17, the UNSIGNED attribute is deprecated for columns of type FLOAT (and any synonyms) and will be removed in a future MySQL version. Consider using a simple CHECK constraint instead for such columns.
Using FLOAT might give you some unexpected problems because all calculations in MySQL are done with double precision. See Section B.4.4.7, “Solving Problems with No Matching Rows”.
FLOAT(p) [UNSIGNED] [ZEROFILL]
A floating-point number. p represents the precision in bits, but MySQL uses this value only to determine whether to use FLOAT or DOUBLE for the resulting data type. If p is from 0 to 24, the data type becomes FLOAT with no M or D values. If p is from 25 to 53, the data type becomes DOUBLE with no M or D values. The range of the resulting column is the same as for the single-precision FLOAT or double-precision DOUBLE data types described earlier in this section.
UNSIGNED, if specified, disallows negative values. As of MySQL 8.0.17, the UNSIGNED attribute is deprecated for columns of type FLOAT (and any synonyms) and will be removed in a future MySQL version. Consider using a simple CHECK constraint instead for such columns.
FLOAT(p) syntax is provided for ODBC compatibility.
DOUBLE[(M,D)] [UNSIGNED] [ZEROFILL]
A normal-size (double-precision) floating-point number. Permissible values are -1.7976931348623157E+308 to -2.2250738585072014E-308, 0, and 2.2250738585072014E-308 to 1.7976931348623157E+308. These are the theoretical limits, based on the IEEE standard. The actual range might be slightly smaller depending on your hardware or operating system.
M is the total number of digits and D is the number of digits following the decimal point. If M and D are omitted, values are stored to the limits permitted by the hardware. A double-precision floating-point number is accurate to approximately 15 decimal places.
DOUBLE(M,D) is a nonstandard MySQL extension. As of MySQL 8.0.17, this syntax is deprecated and support for it will be removed in a future MySQL version.
UNSIGNED, if specified, disallows negative values. As of MySQL 8.0.17, the UNSIGNED attribute is deprecated for columns of type DOUBLE (and any synonyms) and will be removed in a future MySQL version. Consider using a simple CHECK constraint instead for such columns.
DOUBLE PRECISION[(M,D)] [UNSIGNED] [ZEROFILL], REAL[(M,D)] [UNSIGNED] [ZEROFILL]
These types are synonyms for DOUBLE. Exception: If the REAL_AS_FLOAT SQL mode is enabled, REAL is a synonym for FLOAT rather than DOUBLE.
A summary of the temporal data types follows. For additional information about properties and storage requirements of the temporal types, see Section 11.3, “Date and Time Types”, and Section 11.8, “Data Type Storage Requirements”. For descriptions of functions that operate on temporal values, see Section 12.7, “Date and Time Functions”.
For the DATE and DATETIME range descriptions, “supported” means that although earlier values might work, there is no guarantee.
MySQL permits fractional seconds for TIME, DATETIME, and TIMESTAMP values, with up to microseconds (6 digits) precision. To define a column that includes a fractional seconds part, use the syntax type_name(fsp), where type_name is TIME, DATETIME, or TIMESTAMP, and fsp is the fractional seconds precision. For example:
CREATE TABLE t1 (t TIME(3), dt DATETIME(6));
The fsp value, if given, must be in the range 0 to 6. A value of 0 signifies that there is no fractional part. If omitted, the default precision is 0. (This differs from the standard SQL default of 6, for compatibility with previous MySQL versions.)
Any TIMESTAMP or DATETIME column in a table can have automatic initialization and updating properties.
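DATE
A date. The supported range is '1000-01-01' to '9999-12-31'. MySQL displays DATE values in 'YYYY-MM-DD' format, but permits assignment of values to DATE columns using either strings or numbers.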
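DATETIME[(fsp)]
A date and time combination. The supported range is '1000-01-01 00:00:00.000000' to '9999-12-31 23:59:59.999999'. MySQL displays DATETIME values in 'YYYY-MM-DD hh:mm:ss[.fraction]' format, but permits assignment of values to DATETIME columns using either strings or numbers.
An optional fsp value in the range from 0 to 6 may be given to specify fractional seconds precision. A value of 0 signifies that there is no fractional part. If omitted, the default precision is 0.
Automatic initialization and updating to the current date and time for DATETIME columns can be specified using DEFAULT and ON UPDATE column definition clauses, as described in Section 11.3.4, “Automatic Initialization and Updating for TIMESTAMP and DATETIME”.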
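TIMESTAMP[(fsp)]
A timestamp. The range is '1970-01-01 00:00:01.000000' UTC to '2038-01-19 03:14:07.999999' UTC. TIMESTAMP values are stored as the number of seconds since the epoch ('1970-01-01 00:00:00' UTC). A TIMESTAMP cannot represent the value '1970-01-01 00:00:00' because that is equivalent to 0 seconds from the epoch and the value 0 is reserved for representing '0000-00-00 00:00:00', the “zero” TIMESTAMP value.
An optional fsp value in the range from 0 to 6 may be given to specify fractional seconds precision. A value of 0 signifies that there is no fractional part. If omitted, the default precision is 0.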
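The way the server handles TIMESTAMP definitions depends on the value of the explicit_defaults_for_timestamp system variable (see Section 5.1.8, “Server System Variables”).
If explicit_defaults_for_timestamp is enabled, there is no automatic assignment of the DEFAULT CURRENT_TIMESTAMP or ON UPDATE CURRENT_TIMESTAMP attributes to any TIMESTAMP column. They must be included explicitly in the column definition. Also, any TIMESTAMP not explicitly declared as NOT NULL permits NULL values.
If explicit_defaults_for_timestamp is disabled, the server handles TIMESTAMP as follows:
Unless specified otherwise, the first TIMESTAMP column in a table is defined to be automatically set to the date and time of the most recent modification if not explicitly assigned a value. This makes TIMESTAMP useful for recording the timestamp of an INSERT or UPDATE operation. You can also set any TIMESTAMP column to the current date and time by assigning it a NULL value, unless it has been defined with the NULL attribute to permit NULL values.
Automatic initialization and updating to the current date and time can be specified using DEFAULT CURRENT_TIMESTAMP and ON UPDATE CURRENT_TIMESTAMP column definition clauses. By default, the first TIMESTAMP column has these properties, as previously noted. However, any TIMESTAMP column in a table can be defined to have these properties.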
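The explicit clauses can be sketched as follows. This definition does not depend on the explicit_defaults_for_timestamp setting because every attribute is spelled out. The table and column names are hypothetical.

```sql
-- Hypothetical audit table: both columns state their automatic
-- behavior explicitly instead of relying on server defaults.
CREATE TABLE audit_log (
    id      INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    note    VARCHAR(100),
    created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated DATETIME  NOT NULL DEFAULT CURRENT_TIMESTAMP
                      ON UPDATE CURRENT_TIMESTAMP
);

INSERT INTO audit_log (note) VALUES ('first');      -- both columns initialized
UPDATE audit_log SET note = 'edited' WHERE id = 1;  -- only updated is refreshed
SELECT created, updated FROM audit_log;
```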
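TIME[(fsp)]
A time. The range is '-838:59:59.000000' to '838:59:59.000000'. MySQL displays TIME values in 'hh:mm:ss[.fraction]' format, but permits assignment of values to TIME columns using either strings or numbers.
An optional fsp value in the range from 0 to 6 may be given to specify fractional seconds precision. A value of 0 signifies that there is no fractional part. If omitted, the default precision is 0.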
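YEAR[(4)]
A year in four-digit format. MySQL displays YEAR values in YYYY format, but permits assignment of values to YEAR columns using either strings or numbers. Values display as 1901 to 2155, and 0000.
For additional information about YEAR display format and interpretation of input values, see Section 11.3.3, “The YEAR Type”.
MySQL 8.0 does not support the YEAR(2) data type permitted in older versions of MySQL. For instructions on converting to YEAR(4), see YEAR(2) Limitations and Migrating to YEAR(4) in the MySQL 5.7 Reference Manual.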
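The SUM() and AVG() aggregate functions do not work with temporal values. (They convert the values to numbers, losing everything after the first nonnumeric character.) To work around this problem, convert to numeric units, perform the aggregate operation, and convert back to a temporal value. Examples:
SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(time_col))) FROM tbl_name;
SELECT FROM_DAYS(SUM(TO_DAYS(date_col))) FROM tbl_name;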
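A summary of the string data types follows. For additional information about properties and storage requirements of the string types, see Section 11.4, “String Types”, and Section 11.8, “Data Type Storage Requirements”.
In some cases, MySQL may change a string column to a type different from that given in a CREATE TABLE or ALTER TABLE statement. See Section 13.1.20.8, “Silent Column Specification Changes”.
MySQL interprets length specifications in character column definitions in character units. This applies to CHAR, VARCHAR, and the TEXT types.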
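Column definitions for character string data types (CHAR, VARCHAR, the TEXT types, ENUM, SET, and any synonyms) can specify the column character set and collation:
CHARACTER SET specifies the character set. If desired, a collation for the character set can be specified with the COLLATE attribute, along with any other attributes. For example:
CREATE TABLE t
(
    c1 VARCHAR(20) CHARACTER SET utf8,
    c2 TEXT CHARACTER SET latin1 COLLATE latin1_general_cs
);
This table definition creates a column named c1 that has a character set of utf8 with the default collation for that character set, and a column named c2 that has a character set of latin1 and a case-sensitive collation.
The rules for assigning the character set and collation when either or both of the CHARACTER SET and COLLATE attributes are missing are described in Section 10.3.5, “Column Character Set and Collation”.
CHARSET is a synonym for CHARACTER SET.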
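Specifying the CHARACTER SET binary attribute for a character string data type causes the column to be created as the corresponding binary string data type: CHAR becomes BINARY, VARCHAR becomes VARBINARY, and TEXT becomes BLOB. For the ENUM and SET data types, this does not occur; they are created as declared. Suppose that you specify a table using this definition:
CREATE TABLE t
(
    c1 VARCHAR(10) CHARACTER SET binary,
    c2 TEXT CHARACTER SET binary,
    c3 ENUM('a','b','c') CHARACTER SET binary
);
The resulting table has this definition:
CREATE TABLE t
(
    c1 VARBINARY(10),
    c2 BLOB,
    c3 ENUM('a','b','c') CHARACTER SET binary
);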
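The BINARY attribute is a nonstandard MySQL extension that is shorthand for specifying the binary (_bin) collation of the column character set (or of the table default character set if no column character set is specified). In this case, comparison and sorting are based on numeric character code values. Suppose that you specify a table using this definition:
CREATE TABLE t
(
    c1 VARCHAR(10) CHARACTER SET latin1 BINARY,
    c2 TEXT BINARY
) CHARACTER SET utf8mb4;
The resulting table has this definition:
CREATE TABLE t
(
    c1 VARCHAR(10) CHARACTER SET latin1 COLLATE latin1_bin,
    c2 TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
) CHARACTER SET utf8mb4;
In MySQL 8.0, this nonstandard use of the BINARY attribute is ambiguous because the utf8mb4 character set has multiple _bin collations. As of MySQL 8.0.17, the BINARY attribute is deprecated and support for it will be removed in a future MySQL version. Applications should be adjusted to use an explicit _bin collation instead.
The use of BINARY to specify a data type or character set remains unchanged.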
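The ASCII attribute is shorthand for CHARACTER SET latin1.
The UNICODE attribute is shorthand for CHARACTER SET ucs2.
Character column comparison and sorting are based on the collation assigned to the column. For the CHAR, VARCHAR, TEXT, ENUM, and SET data types, you can declare a column with a binary (_bin) collation or the BINARY attribute to cause comparison and sorting to use the underlying character code values rather than a lexical ordering.
For additional information about use of character sets in MySQL, see Chapter 10, Character Sets, Collations, Unicode.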
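[NATIONAL] CHAR[(M)] [CHARACTER SET charset_name] [COLLATE collation_name]
A fixed-length string that is always right-padded with spaces to the specified length when stored. M represents the column length in characters. The range of M is 0 to 255. If M is omitted, the length is 1.
Trailing spaces are removed when CHAR values are retrieved unless the PAD_CHAR_TO_FULL_LENGTH SQL mode is enabled.
CHAR is shorthand for CHARACTER. NATIONAL CHAR (or its equivalent short form, NCHAR) is the standard SQL way to define that a CHAR column should use some predefined character set. MySQL uses utf8 as this predefined character set. See Section 10.3.7, “The National Character Set”.
The CHAR BYTE data type is an alias for the BINARY data type. This is a compatibility feature.
MySQL permits you to create a column of type CHAR(0). This is useful primarily when you must be compliant with old applications that depend on the existence of a column but that do not actually use its value. CHAR(0) is also useful when you need a column that can take only two values: a column defined as CHAR(0) NULL occupies only one bit and can take only the values NULL and '' (the empty string).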
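[NATIONAL] VARCHAR(M) [CHARACTER SET charset_name] [COLLATE collation_name]
A variable-length string. M represents the maximum column length in characters. The range of M is 0 to 65,535. The effective maximum length of a VARCHAR is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used. For example, utf8 characters can require up to three bytes per character, so a VARCHAR column that uses the utf8 character set can be declared to be a maximum of 21,844 characters. See Section C.10.4, “Limits on Table Column Count and Row Size”.
MySQL stores VARCHAR values as a 1-byte or 2-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A VARCHAR column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.
MySQL follows the standard SQL specification, and does not remove trailing spaces from VARCHAR values.
VARCHAR is shorthand for CHARACTER VARYING. NATIONAL VARCHAR is the standard SQL way to define that a VARCHAR column should use some predefined character set. MySQL uses utf8 as this predefined character set. See Section 10.3.7, “The National Character Set”. NVARCHAR is shorthand for NATIONAL VARCHAR.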
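The 21,844 figure can be checked with a short calculation. This is a sketch of the arithmetic only, assuming the column is the only one in the row, that it needs the 2-byte length prefix used for values longer than 255 bytes, and that each utf8 character takes the 3-byte maximum.

```sql
-- 65,535-byte row limit, minus 2 length bytes, divided by
-- 3 bytes per utf8 character (integer division).
SELECT (65535 - 2) DIV 3 AS max_utf8_varchar_chars;  -- 21844
```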
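BINARY[(M)]
The BINARY type is similar to the CHAR type, but stores binary byte strings rather than nonbinary character strings. An optional length M represents the column length in bytes. If omitted, M defaults to 1.
VARBINARY(M)
The VARBINARY type is similar to the VARCHAR type, but stores binary byte strings rather than nonbinary character strings. M represents the maximum column length in bytes.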
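TINYBLOB
A BLOB column with a maximum length of 255 (2^8 − 1) bytes. Each TINYBLOB value is stored using a 1-byte length prefix that indicates the number of bytes in the value.
TINYTEXT [CHARACTER SET charset_name] [COLLATE collation_name]
A TEXT column with a maximum length of 255 (2^8 − 1) characters. The effective maximum length is less if the value contains multibyte characters. Each TINYTEXT value is stored using a 1-byte length prefix that indicates the number of bytes in the value.
BLOB[(M)]
A BLOB column with a maximum length of 65,535 (2^16 − 1) bytes. Each BLOB value is stored using a 2-byte length prefix that indicates the number of bytes in the value.
An optional length M can be given for this type. If this is done, MySQL creates the column as the smallest BLOB type large enough to hold values M bytes long.
TEXT[(M)] [CHARACTER SET charset_name] [COLLATE collation_name]
A TEXT column with a maximum length of 65,535 (2^16 − 1) characters. The effective maximum length is less if the value contains multibyte characters. Each TEXT value is stored using a 2-byte length prefix that indicates the number of bytes in the value.
An optional length M can be given for this type. If this is done, MySQL creates the column as the smallest TEXT type large enough to hold values M characters long.
MEDIUMBLOB
A BLOB column with a maximum length of 16,777,215 (2^24 − 1) bytes. Each MEDIUMBLOB value is stored using a 3-byte length prefix that indicates the number of bytes in the value.
MEDIUMTEXT [CHARACTER SET charset_name] [COLLATE collation_name]
A TEXT column with a maximum length of 16,777,215 (2^24 − 1) characters. The effective maximum length is less if the value contains multibyte characters. Each MEDIUMTEXT value is stored using a 3-byte length prefix that indicates the number of bytes in the value.
LONGBLOB
A BLOB column with a maximum length of 4,294,967,295 or 4GB (2^32 − 1) bytes. The effective maximum length of LONGBLOB columns depends on the configured maximum packet size in the client/server protocol and available memory. Each LONGBLOB value is stored using a 4-byte length prefix that indicates the number of bytes in the value.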
LONGTEXT [CHARACTER SET charset_name] [COLLATE collation_name]
A TEXT column with a maximum length of 4,294,967,295 or 4GB (2^32 − 1) characters. The effective maximum length is less if the value contains multibyte characters. The effective maximum length of LONGTEXT columns also depends on the configured maximum packet size in the client/server protocol and available memory. Each LONGTEXT value is stored using a 4-byte length prefix that indicates the number of bytes in the value.
ENUM('value1','value2',...) [CHARACTER SET charset_name] [COLLATE collation_name]
An enumeration. A string object that can have only one value, chosen from the list of values 'value1', 'value2', ..., NULL or the special '' error value. ENUM values are represented internally as integers.
An ENUM column can have a maximum of 65,535 distinct elements.
The maximum supported length of an individual ENUM element is M <= 255 and (M x w) <= 1020, where M is the element literal length and w is the number of bytes required for the maximum-length character in the character set.
SET('value1','value2',...) [CHARACTER SET charset_name] [COLLATE collation_name]
A set. A string object that can have zero or more values, each of which must be chosen from the list of values 'value1', 'value2', ... SET values are represented internally as integers.
A SET column can have a maximum of 64 distinct members.
The maximum supported length of an individual SET element is M <= 255 and (M x w) <= 1020, where M is the element literal length and w is the number of bytes required for the maximum-length character in the character set.
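MySQL supports all standard SQL numeric data types. These types include the exact numeric data types (INTEGER, SMALLINT, DECIMAL, and NUMERIC), as well as the approximate numeric data types (FLOAT, REAL, and DOUBLE PRECISION). The keyword INT is a synonym for INTEGER, and the keywords DEC and FIXED are synonyms for DECIMAL. MySQL treats DOUBLE as a synonym for DOUBLE PRECISION (a nonstandard extension). MySQL also treats REAL as a synonym for DOUBLE PRECISION (a nonstandard variation), unless the REAL_AS_FLOAT SQL mode is enabled.
The BIT data type stores bit values and is supported for MyISAM, MEMORY, InnoDB, and NDB tables.
For information about how MySQL handles assignment of out-of-range values to columns and overflow during expression evaluation, see Section 11.2.6, “Out-of-Range and Overflow Handling”.
For information about numeric type storage requirements, see Section 11.8, “Data Type Storage Requirements”.
The data type used for the result of a calculation on numeric operands depends on the types of the operands and the operations performed on them. For more information, see Section 12.6.1, “Arithmetic Operators”.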
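MySQL supports the SQL standard integer types INTEGER (or INT) and SMALLINT. As an extension to the standard, MySQL also supports the integer types TINYINT, MEDIUMINT, and BIGINT. The following table shows the required storage and range for each integer type.
Table 11.1 Required Storage and Range for Integer Types Supported by MySQL
Type | Storage (Bytes) | Minimum Value Signed | Minimum Value Unsigned | Maximum Value Signed | Maximum Value Unsigned |
---|---|---|---|---|---|
TINYINT | 1 | -128 | 0 | 127 | 255 |
SMALLINT | 2 | -32768 | 0 | 32767 | 65535 |
MEDIUMINT | 3 | -8388608 | 0 | 8388607 | 16777215 |
INT | 4 | -2147483648 | 0 | 2147483647 | 4294967295 |
BIGINT | 8 | -2^63 | 0 | 2^63-1 | 2^64-1 |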
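The maximum values in the table follow from the storage size: an n-byte signed type tops out at 2^(8n-1) - 1 and an unsigned one at 2^(8n) - 1. As a rough check, the expressions below reproduce several table entries using MySQL bit operators, which work on 64-bit unsigned integers. The column aliases are illustrative only.

```sql
SELECT (1 << 7)  - 1 AS tinyint_signed_max,    -- 127
       (1 << 8)  - 1 AS tinyint_unsigned_max,  -- 255
       (1 << 15) - 1 AS smallint_signed_max,   -- 32767
       (1 << 31) - 1 AS int_signed_max,        -- 2147483647
       ~0 >> 1       AS bigint_signed_max,     -- 9223372036854775807
       ~0            AS bigint_unsigned_max;   -- 18446744073709551615
```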
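The DECIMAL and NUMERIC types store exact numeric data values. These types are used when it is important to preserve exact precision, for example with monetary data. In MySQL, NUMERIC is implemented as DECIMAL, so the following remarks about DECIMAL apply equally to NUMERIC.
MySQL stores DECIMAL values in binary format. See Section 12.25, “Precision Math”.
In a DECIMAL column declaration, the precision and scale can be (and usually is) specified. For example:
salary DECIMAL(5,2)
In this example, 5 is the precision and 2 is the scale. The precision represents the number of significant digits that are stored for values, and the scale represents the number of digits that can be stored following the decimal point.
Standard SQL requires that DECIMAL(5,2) be able to store any value with five digits and two decimals, so values that can be stored in the salary column range from -999.99 to 999.99.
In standard SQL, the syntax DECIMAL(M) is equivalent to DECIMAL(M,0). Similarly, the syntax DECIMAL is equivalent to DECIMAL(M,0), where the implementation is permitted to decide the value of M. MySQL supports both of these variant forms of DECIMAL syntax. The default value of M is 10.
If the scale is 0, DECIMAL values contain no decimal point or fractional part.
The maximum number of digits for DECIMAL is 65, but the actual range for a given DECIMAL column can be constrained by the precision or scale for a given column. When such a column is assigned a value with more digits following the decimal point than are permitted by the specified scale, the value is converted to that scale. (The precise behavior is operating system-specific, but generally the effect is truncation to the permissible number of digits.)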
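The FLOAT and DOUBLE types represent approximate numeric data values. MySQL uses four bytes for single-precision values and eight bytes for double-precision values.
For FLOAT, the SQL standard permits an optional specification of the precision (but not the range of the exponent) in bits following the keyword FLOAT in parentheses; that is, FLOAT(p). MySQL also supports this optional precision specification, but the precision value in FLOAT(p) is used only to determine storage size. A precision from 0 to 23 results in a 4-byte single-precision FLOAT column. A precision from 24 to 53 results in an 8-byte double-precision DOUBLE column.
MySQL permits a nonstandard syntax: FLOAT(M,D) or REAL(M,D) or DOUBLE PRECISION(M,D). Here, (M,D) means that values can be stored with up to M digits in total, of which D digits may be after the decimal point. For example, a column defined as FLOAT(7,4) looks like -999.9999 when displayed. MySQL performs rounding when storing values, so if you insert 999.00009 into a FLOAT(7,4) column, the approximate result is 999.0001.
As of MySQL 8.0.17, the nonstandard FLOAT(M,D) and DOUBLE(M,D) syntax is deprecated and support for it will be removed in a future MySQL version.
Because floating-point values are approximate and not stored as exact values, attempts to treat them as exact in comparisons may lead to problems. They are also subject to platform or implementation dependencies. For more information, see Section B.4.4.8, “Problems with Floating-Point Values”.
For maximum portability, code requiring storage of approximate numeric data values should use FLOAT or DOUBLE PRECISION with no specification of precision or number of digits.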
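The BIT data type is used to store bit values. A type of BIT(M) enables storage of M-bit values. M can range from 1 to 64.
To specify bit values, b'value' notation can be used. value is a binary value written using zeros and ones. For example, b'111' and b'10000000' represent 7 and 128, respectively. See Section 9.1.5, “Bit-Value Literals”.
If you assign a value to a BIT(M) column that is less than M bits long, the value is padded on the left with zeros. For example, assigning a value of b'101' to a BIT(6) column is, in effect, the same as assigning b'000101'.
NDB Cluster. The maximum combined size of all BIT columns used in a given NDB table must not exceed 4096 bits.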
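The bit-value notation and the left-padding rule can be tried out with a small sketch. The table name flags is hypothetical, and adding 0 is used here to view a bit value as a number.

```sql
-- Bit-value literals viewed as integers.
SELECT b'111' + 0 AS seven, b'10000000' + 0 AS one_twenty_eight;

-- Hypothetical table with a 6-bit column.
CREATE TABLE flags (f BIT(6));
INSERT INTO flags VALUES (b'101');

-- BIN() drops leading zeros, so LPAD() is applied only to display
-- all six stored bits.
SELECT LPAD(BIN(f + 0), 6, '0') AS stored_bits FROM flags;
```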
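MySQL supports an extension for optionally specifying the display width of integer data types in parentheses following the base keyword for the type. For example, INT(4) specifies an INT with a display width of four digits. This optional display width may be used by applications to display integer values having a width less than the width specified for the column by left-padding them with spaces. (That is, this width is present in the metadata returned with result sets. Whether it is used is up to the application.)
The display width does not constrain the range of values that can be stored in the column. Nor does it prevent values wider than the column display width from being displayed correctly. For example, a column specified as SMALLINT(3) has the usual SMALLINT range of -32768 to 32767, and values outside the range permitted by three digits are displayed in full using more than three digits.
When used in conjunction with the optional (nonstandard) ZEROFILL attribute, the default padding of spaces is replaced with zeros. For example, for a column declared as INT(4) ZEROFILL, a value of 5 is retrieved as 0005.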
The ZEROFILL attribute is ignored for columns involved in expressions or UNION queries.
If you store values larger than the display width in an integer column that has the ZEROFILL attribute, you may experience problems when MySQL generates temporary tables for some complicated joins. In these cases, MySQL assumes that the data values fit within the column display width.
As of MySQL 8.0.17, the ZEROFILL attribute is deprecated for numeric data types, as is the display width attribute for integer data types. Support for ZEROFILL and display widths for integer data types will be removed in a future MySQL version. Consider using an alternative means of producing the effect of these attributes. For example, applications could use the LPAD() function to zero-pad numbers up to the desired width, or they could store the formatted numbers in CHAR columns.
All integer types can have an optional (nonstandard) UNSIGNED attribute. An unsigned type can be used to permit only nonnegative numbers in a column or when you need a larger upper numeric range for the column. For example, if an INT column is UNSIGNED, the size of the column's range is the same but its endpoints shift up, from -2147483648 and 2147483647 to 0 and 4294967295.
Floating-point and fixed-point types also can be UNSIGNED. As with integer types, this attribute prevents negative values from being stored in the column. Unlike the integer types, the upper range of column values remains the same. As of MySQL 8.0.17, the UNSIGNED attribute is deprecated for columns of type FLOAT, DOUBLE, and DECIMAL (and any synonyms) and will be removed in a future MySQL version. Consider using a simple CHECK constraint instead for such columns.
If you specify ZEROFILL for a numeric column, MySQL automatically adds the UNSIGNED attribute.
Integer or floating-point data types can have the AUTO_INCREMENT attribute. When you insert a value of NULL into an indexed AUTO_INCREMENT column, the column is set to the next sequence value. Typically this is value+1, where value is the largest value for the column currently in the table. (AUTO_INCREMENT sequences begin with 1.)
Storing 0 into an AUTO_INCREMENT column has the same effect as storing NULL, unless the NO_AUTO_VALUE_ON_ZERO SQL mode is enabled.
Inserting NULL to generate AUTO_INCREMENT values requires that the column be declared NOT NULL. If the column is declared NULL, inserting NULL stores a NULL. When you insert any other value into an AUTO_INCREMENT column, the column is set to that value and the sequence is reset so that the next automatically generated value follows sequentially from the inserted value.
Negative values for AUTO_INCREMENT columns are not supported.
CHECK constraints cannot refer to columns that have the AUTO_INCREMENT attribute, nor can the AUTO_INCREMENT attribute be added to existing columns that are used in CHECK constraints.
As of MySQL 8.0.17, AUTO_INCREMENT support is deprecated for FLOAT and DOUBLE columns and will be removed in a future MySQL version. Consider removing the AUTO_INCREMENT attribute from such columns, or convert them to an integer type.
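The sequence rules above can be followed step by step in a small sketch. It assumes a new, empty table, the NO_AUTO_VALUE_ON_ZERO SQL mode disabled, and the default increment of 1; the table name animals is hypothetical. Each row is inserted with its own statement so that every generated value depends only on the rows already stored.

```sql
CREATE TABLE animals (
    id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(20)
);

INSERT INTO animals (id, name) VALUES (NULL, 'dog');   -- NULL: next sequence value, 1
INSERT INTO animals (id, name) VALUES (0, 'cat');      -- 0 acts like NULL: 2
INSERT INTO animals (id, name) VALUES (10, 'bird');    -- explicit value resets the sequence
INSERT INTO animals (id, name) VALUES (NULL, 'fish');  -- continues after 10: 11

SELECT id, name FROM animals ORDER BY id;
```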
When MySQL stores a value in a numeric column that is outside the permissible range of the column data type, the result depends on the SQL mode in effect at the time:
If strict SQL mode is enabled, MySQL rejects the out-of-range value with an error, and the insert fails, in accordance with the SQL standard.
If no restrictive modes are enabled, MySQL clips the value to the appropriate endpoint of the column data type range and stores the resulting value instead.
When an out-of-range value is assigned to an integer column, MySQL stores the value representing the corresponding endpoint of the column data type range.
When a floating-point or fixed-point column is assigned a value that exceeds the range implied by the specified (or default) precision and scale, MySQL stores the value representing the corresponding endpoint of that range.
Suppose that a table t1
has this definition:
CREATE TABLE t1 (i1 TINYINT, i2 TINYINT UNSIGNED);
With strict SQL mode enabled, an out of range error occurs:
mysql> SET sql_mode = 'TRADITIONAL';
mysql> INSERT INTO t1 (i1, i2) VALUES(256, 256);
ERROR 1264 (22003): Out of range value for column 'i1' at row 1
mysql> SELECT * FROM t1;
Empty set (0.00 sec)
With strict SQL mode not enabled, clipping with warnings occurs:
mysql> SET sql_mode = '';
mysql> INSERT INTO t1 (i1, i2) VALUES(256, 256);
mysql> SHOW WARNINGS;
+---------+------+---------------------------------------------+
| Level   | Code | Message                                     |
+---------+------+---------------------------------------------+
| Warning | 1264 | Out of range value for column 'i1' at row 1 |
| Warning | 1264 | Out of range value for column 'i2' at row 1 |
+---------+------+---------------------------------------------+
mysql> SELECT * FROM t1;
+------+------+
| i1   | i2   |
+------+------+
|  127 |  255 |
+------+------+
When strict SQL mode is not enabled, column-assignment conversions that occur due to clipping are reported as warnings for ALTER TABLE, LOAD DATA, UPDATE, and multiple-row INSERT statements. In strict mode, these statements fail, and some or all the values are not inserted or changed, depending on whether the table is a transactional table and other factors. For details, see Section 5.1.11, “Server SQL Modes”.
Overflow during numeric expression evaluation results in an
error. For example, the largest signed
BIGINT
value is
9223372036854775807, so the following expression produces an
error:
mysql> SELECT 9223372036854775807 + 1;
ERROR 1690 (22003): BIGINT value is out of range in '(9223372036854775807 + 1)'
To enable the operation to succeed in this case, convert the value to unsigned:
mysql> SELECT CAST(9223372036854775807 AS UNSIGNED) + 1;
+-------------------------------------------+
| CAST(9223372036854775807 AS UNSIGNED) + 1 |
+-------------------------------------------+
| 9223372036854775808 |
+-------------------------------------------+
Whether overflow occurs depends on the range of the operands, so
another way to handle the preceding expression is to use
exact-value arithmetic because
DECIMAL
values have a larger
range than integers:
mysql> SELECT 9223372036854775807.0 + 1;
+---------------------------+
| 9223372036854775807.0 + 1 |
+---------------------------+
| 9223372036854775808.0 |
+---------------------------+
Subtraction between integer values, where one is of type UNSIGNED, produces an unsigned result by default. If the result would otherwise have been negative, an error results:
mysql> SET sql_mode = '';
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT CAST(0 AS UNSIGNED) - 1;
ERROR 1690 (22003): BIGINT UNSIGNED value is out of range in '(cast(0 as unsigned) - 1)'
If the NO_UNSIGNED_SUBTRACTION SQL mode is enabled, the result is negative:
mysql> SET sql_mode = 'NO_UNSIGNED_SUBTRACTION';
mysql> SELECT CAST(0 AS UNSIGNED) - 1;
+-------------------------+
| CAST(0 AS UNSIGNED) - 1 |
+-------------------------+
|                      -1 |
+-------------------------+
If the result of such an operation is used to update an
UNSIGNED
integer column, the result is
clipped to the maximum value for the column type, or clipped to
0 if NO_UNSIGNED_SUBTRACTION
is enabled. If strict SQL mode is enabled, an error occurs and
the column remains unchanged.
The date and time types for representing temporal values are DATE, TIME, DATETIME, TIMESTAMP, and YEAR. Each temporal type has a range of valid values, as well as a “zero” value that may be used when you specify an invalid value that MySQL cannot represent. The TIMESTAMP type has special automatic updating behavior, described later. For temporal type storage requirements, see Section 11.8, “Data Type Storage Requirements”.
Keep in mind these general considerations when working with date and time types:
MySQL retrieves values for a given date or time type in a standard output format, but it attempts to interpret a variety of formats for input values that you supply (for example, when you specify a value to be assigned to or compared to a date or time type). For a description of the permitted formats for date and time types, see Section 9.1.3, “Date and Time Literals”. It is expected that you supply valid values. Unpredictable results may occur if you use values in other formats.
Although MySQL tries to interpret values in several formats, date parts must always be given in year-month-day order (for example, '98-09-04'), rather than in the month-day-year or day-month-year orders commonly used elsewhere (for example, '09-04-98', '04-09-98').
Dates containing two-digit year values are ambiguous because the century is unknown. MySQL interprets two-digit year values using these rules:
Year values in the range 70-99 are converted to 1970-1999.
Year values in the range 00-69 are converted to 2000-2069.
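The two-digit year rules can be observed at the boundary between the two ranges. This is a small illustration using date strings in the year-month-day order required above; the column aliases are arbitrary.

```sql
SELECT YEAR('69-12-31') AS y69,  -- 2069: 00-69 maps to 2000-2069
       YEAR('70-01-01') AS y70;  -- 1970: 70-99 maps to 1970-1999
```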
Conversion of values from one temporal type to another occurs according to the rules in Section 11.3.6, “Conversion Between Date and Time Types”.
MySQL automatically converts a date or time value to a number if the value is used in a numeric context and vice versa.
By default, when MySQL encounters a value for a date or time type that is out of range or otherwise invalid for the type, it converts the value to the “zero” value for that type. The exception is that out-of-range TIME values are clipped to the appropriate endpoint of the TIME range.
By setting the SQL mode to the appropriate value, you can specify more exactly what kind of dates you want MySQL to support. (See Section 5.1.11, “Server SQL Modes”.) You can get MySQL to accept certain dates, such as '2009-11-31', by enabling the ALLOW_INVALID_DATES SQL mode. This is useful when you want to store a “possibly wrong” value which the user has specified (for example, in a web form) in the database for future processing. Under this mode, MySQL verifies only that the month is in the range from 1 to 12 and that the day is in the range from 1 to 31.
MySQL permits you to store dates where the day or month and day are zero in a DATE or DATETIME column. This is useful for applications that need to store birthdates for which you may not know the exact date. In this case, you simply store the date as '2009-00-00' or '2009-01-00'. If you store dates such as these, you should not expect to get correct results for functions such as DATE_SUB() or DATE_ADD() that require complete dates. To disallow zero month or day parts in dates, enable the NO_ZERO_IN_DATE mode.
MySQL permits you to store a “zero” value of '0000-00-00' as a “dummy date.” This is in some cases more convenient than using NULL values, and uses less data and index space. To disallow '0000-00-00', enable the NO_ZERO_DATE mode.
“Zero” date or time values used through Connector/ODBC are converted automatically to NULL because ODBC cannot handle such values.
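A minimal sketch of the ALLOW_INVALID_DATES behavior follows, assuming the session SQL mode is set to that mode alone. The table name form_input is hypothetical.

```sql
SET SESSION sql_mode = 'ALLOW_INVALID_DATES';

-- Hypothetical table holding a date typed into a web form.
CREATE TABLE form_input (entered DATE);

-- November has only 30 days, but the month (1 to 12) and day (1 to 31)
-- range checks both pass, so the value is stored as given.
INSERT INTO form_input VALUES ('2009-11-31');
SELECT entered FROM form_input;
```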
The following table shows the format of the “zero” value for each type. The “zero” values are special, but you can store or refer to them explicitly using the values shown in the table. You can also do this using the values '0' or 0, which are easier to write. For temporal types that include a date part (DATE, DATETIME, and TIMESTAMP), use of these values produces warnings if the NO_ZERO_DATE SQL mode is enabled.
Data Type | “Zero” Value |
---|---|
DATE | '0000-00-00' |
TIME | '00:00:00' |
DATETIME | '0000-00-00 00:00:00' |
TIMESTAMP | '0000-00-00 00:00:00' |
YEAR | 0000 |
The DATE
, DATETIME
, and
TIMESTAMP
types are related. This section
describes their characteristics, how they are similar, and how
they differ. MySQL recognizes DATE
,
DATETIME
, and TIMESTAMP
values in several formats, described in
Section 9.1.3, “Date and Time Literals”. For the
DATE
and DATETIME
range
descriptions, “supported” means that although
earlier values might work, there is no guarantee.
The DATE
type is used for values with a date
part but no time part. MySQL retrieves and displays
DATE
values in
'YYYY-MM-DD'
format. The supported range is
'1000-01-01'
to
'9999-12-31'
.
The DATETIME
type is used for values that
contain both date and time parts. MySQL retrieves and displays
DATETIME
values in 'YYYY-MM-DD
hh:mm:ss'
format. The supported range is
'1000-01-01 00:00:00'
to '9999-12-31
23:59:59'
.
The TIMESTAMP
data type is used for values
that contain both date and time parts.
TIMESTAMP
has a range of '1970-01-01
00:00:01'
UTC to '2038-01-19
03:14:07'
UTC.
A DATETIME
or TIMESTAMP
value can include a trailing fractional seconds part in up to
microseconds (6 digits) precision. In particular, any fractional
part in a value inserted into a DATETIME
or
TIMESTAMP
column is stored rather than
discarded. With the fractional part included, the format for
these values is '
,
the range for YYYY-MM-DD
hh:mm:ss
[.fraction
]'DATETIME
values is
'1000-01-01 00:00:00.000000'
to
'9999-12-31 23:59:59.999999'
, and the range
for TIMESTAMP
values is '1970-01-01
00:00:01.000000'
to '2038-01-19
03:14:07.999999'
. The fractional part should always be
separated from the rest of the time by a decimal point; no other
fractional seconds delimiter is recognized. For information
about fractional seconds support in MySQL, see
Section 11.3.5, “Fractional Seconds in Time Values”.
The TIMESTAMP
and DATETIME
data types offer automatic initialization and updating to the
current date and time. For more information, see
Section 11.3.4, “Automatic Initialization and Updating for TIMESTAMP and DATETIME”.
MySQL converts TIMESTAMP
values from the
current time zone to UTC for storage, and back from UTC to the
current time zone for retrieval. (This does not occur for other
types such as DATETIME
.) By default, the
current time zone for each connection is the server's time zone. The
time zone can be set on a per-connection basis. As long as the
time zone setting remains constant, you get back the same value
you store. If you store a TIMESTAMP
value,
and then change the time zone and retrieve the value, the
retrieved value is different from the value you stored. This
occurs because the same time zone was not used for conversion in
both directions. The current time zone is available as the value
of the time_zone
system
variable. For more information, see
Section 5.1.13, “MySQL Server Time Zone Support”.
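The round trip described above can be sketched as follows. This is a minimal illustration, assuming a scratch table named tz_demo (a hypothetical name) and a session that sets time_zone to numeric offsets. It stores the same literal in a TIMESTAMP column and a DATETIME column, then reads both back under a different session time zone.

```sql
-- tz_demo is a hypothetical scratch table used only for illustration.
CREATE TABLE tz_demo (ts TIMESTAMP, dt DATETIME);

SET time_zone = '+00:00';
INSERT INTO tz_demo VALUES ('2020-01-01 12:00:00', '2020-01-01 12:00:00');

-- Same session time zone as at insert time: both columns read back as stored.
SELECT ts, dt FROM tz_demo;

-- Change the session time zone and read the same row again.
SET time_zone = '+02:00';
SELECT ts, dt FROM tz_demo;
-- ts is converted from UTC to the new zone and should display as 14:00:00;
-- dt is not converted and should still display as 12:00:00.
```

Only the TIMESTAMP column shifts, which is why a DATETIME column may be the simpler choice when a value must read back identically regardless of session settings.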
Invalid DATE
, DATETIME
, or
TIMESTAMP
values are converted to the
“zero” value of the appropriate type
('0000-00-00'
or '0000-00-00
00:00:00'
).
Be aware of certain properties of date value interpretation in MySQL:
MySQL permits a “relaxed” format for values
specified as strings, in which any punctuation character may
be used as the delimiter between date parts or time parts.
In some cases, this syntax can be deceiving. For example, a
value such as '10:11:12'
might look like
a time value because of the :
, but is
interpreted as the date '2010-11-12'
if
used in a date context. The value
'10:45:15'
is converted to
'0000-00-00'
because
'45'
is not a valid month.
The only delimiter recognized between a date and time part and a fractional seconds part is the decimal point.
The server requires that month and day values be valid, and
not merely in the range 1 to 12 and 1 to 31, respectively.
With strict mode disabled, invalid dates such as
'2004-04-31'
are converted to
'0000-00-00'
and a warning is generated.
With strict mode enabled, invalid dates generate an error.
To permit such dates, enable
ALLOW_INVALID_DATES
. See
Section 5.1.11, “Server SQL Modes”, for more information.
MySQL does not accept TIMESTAMP
values
that include a zero in the day or month column or values
that are not a valid date. The sole exception to this rule
is the special “zero” value
'0000-00-00 00:00:00'
.
Dates containing two-digit year values are ambiguous because the century is unknown. MySQL interprets two-digit year values using these rules:
Year values in the range 00-69
are
converted to 2000-2069
.
Year values in the range 70-99
are
converted to 1970-1999
.
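The handling of an invalid date under different SQL modes can be sketched as below. This is an illustration only: date_demo is a hypothetical table name, and the comments restate the behavior described above rather than captured server output.

```sql
-- date_demo is a hypothetical scratch table.
CREATE TABLE date_demo (d DATE);

-- Strict mode disabled: the invalid date is converted with a warning.
SET SESSION sql_mode = '';
INSERT INTO date_demo VALUES ('2004-04-31');   -- stored as '0000-00-00'
SHOW WARNINGS;

-- ALLOW_INVALID_DATES: only the month (1-12) and day (1-31) ranges are checked.
SET SESSION sql_mode = 'ALLOW_INVALID_DATES';
INSERT INTO date_demo VALUES ('2004-04-31');   -- stored as given

-- Strict mode enabled: the statement is expected to fail with an error.
SET SESSION sql_mode = 'STRICT_TRANS_TABLES';
INSERT INTO date_demo VALUES ('2004-04-31');

SELECT d FROM date_demo;
```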
MySQL retrieves and displays TIME
values in
'hh:mm:ss'
format (or
'hhh:mm:ss'
format for large hours
values). TIME
values may range from
'-838:59:59'
to
'838:59:59'
. The hours part may be so large
because the TIME
type can be used not only to
represent a time of day (which must be less than 24 hours), but
also elapsed time or a time interval between two events (which
may be much greater than 24 hours, or even negative).
MySQL recognizes TIME
values in several
formats, some of which can include a trailing fractional seconds
part in up to microseconds (6 digits) precision. See
Section 9.1.3, “Date and Time Literals”. For information about
fractional seconds support in MySQL, see
Section 11.3.5, “Fractional Seconds in Time Values”. In particular, any
fractional part in a value inserted into a
TIME
column is stored rather than discarded.
With the fractional part included, the range for
TIME
values is
'-838:59:59.000000'
to
'838:59:59.000000'
.
Be careful about assigning abbreviated values to a
TIME
column. MySQL interprets abbreviated
TIME
values with colons as time of the day.
That is, '11:12'
means
'11:12:00'
, not
'00:11:12'
. MySQL interprets abbreviated
values without colons using the assumption that the two
rightmost digits represent seconds (that is, as elapsed time
rather than as time of day). For example, you might think of
'1112'
and 1112
as meaning
'11:12:00'
(12 minutes after 11 o'clock), but
MySQL interprets them as '00:11:12'
(11
minutes, 12 seconds). Similarly, '12'
and
12
are interpreted as
'00:00:12'
.
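The interpretation rules for abbreviated values can be checked directly with CAST(). The column aliases are illustrative, and the commented results are the ones stated in the paragraph above.

```sql
SELECT CAST('11:12' AS TIME) AS with_colon,       -- 11:12:00 (time of day)
       CAST('1112'  AS TIME) AS string_no_colon,  -- 00:11:12 (elapsed time)
       CAST(1112    AS TIME) AS number_no_colon,  -- 00:11:12 (elapsed time)
       CAST('12'    AS TIME) AS two_digits;       -- 00:00:12
```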
The only delimiter recognized between a time part and a fractional seconds part is the decimal point.
By default, values that lie outside the TIME
range but are otherwise valid are clipped to the closest
endpoint of the range. For example,
'-850:00:00'
and
'850:00:00'
are converted to
'-838:59:59'
and
'838:59:59'
. Invalid TIME
values are converted to '00:00:00'
. Note that
because '00:00:00'
is itself a valid
TIME
value, there is no way to tell, from a
value of '00:00:00'
stored in a table,
whether the original value was specified as
'00:00:00'
or whether it was invalid.
For more restrictive treatment of invalid
TIME
values, enable strict SQL mode to cause
errors to occur. See Section 5.1.11, “Server SQL Modes”.
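A short sketch of the clipping behavior, assuming a hypothetical scratch table time_demo. With strict mode disabled the out-of-range values are clipped with a warning. With strict mode enabled the same insert is expected to be rejected.

```sql
-- time_demo is a hypothetical scratch table.
CREATE TABLE time_demo (t TIME);

SET SESSION sql_mode = '';
INSERT INTO time_demo VALUES ('-850:00:00'), ('850:00:00');
SHOW WARNINGS;
SELECT t FROM time_demo;   -- clipped to -838:59:59 and 838:59:59

SET SESSION sql_mode = 'STRICT_TRANS_TABLES';
INSERT INTO time_demo VALUES ('850:00:00');   -- expected to raise an error
```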
The YEAR
type is a 1-byte type used to
represent year values. It can be declared as
YEAR
or YEAR(4)
and has a
display width of four characters.
MySQL 8.0 does not support the
YEAR(2)
data type permitted in
older versions of MySQL. For instructions on converting to
YEAR(4)
, see
YEAR(2) Limitations and Migrating to YEAR(4) in
MySQL 5.7 Reference Manual.
MySQL displays YEAR
values in
YYYY
format, with a range of
1901
to 2155
, or
0000
.
You can specify input YEAR
values in a
variety of formats:
As a 4-digit number in the range 1901
to
2155
.
As a 4-digit string in the range '1901'
to '2155'
.
As a 1- or 2-digit number in the range 1
to 99
. MySQL converts values in the
ranges 1
to 69
and
70
to 99
to
YEAR
values in the ranges
2001
to 2069
and
1970
to 1999
.
As a 1- or 2-digit string in the range
'0'
to '99'
. MySQL
converts values in the ranges '0'
to
'69'
and '70'
to
'99'
to YEAR
values in
the ranges 2000
to
2069
and 1970
to
1999
.
The result of inserting a numeric 0
has a
display value of 0000
and an internal
value of 0000
. To insert zero and have it
be interpreted as 2000
, specify it as a
string '0'
or '00'
.
As the result of a function that returns a value that is
acceptable in a YEAR
context, such as
NOW()
.
MySQL converts invalid YEAR
values to
0000
.
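The input formats listed above can be compared side by side. In this sketch year_demo and its src column are hypothetical names, and the commented results follow the conversion rules given in the list.

```sql
-- year_demo is a hypothetical scratch table; src labels how each value was given.
CREATE TABLE year_demo (src VARCHAR(20), y YEAR);

INSERT INTO year_demo VALUES
  ('number 2021', 2021),     -- 2021
  ('string 2021', '2021'),   -- 2021
  ('number 5',    5),        -- 2005
  ('number 75',   75),       -- 1975
  ('string 0',    '0'),      -- 2000
  ('number 0',    0);        -- 0000

SELECT src, y FROM year_demo;
```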
TIMESTAMP
and
DATETIME
columns can be
automatically initialized and updated to the current date and
time (that is, the current timestamp).
For any TIMESTAMP
or
DATETIME
column in a table, you
can assign the current timestamp as the default value, the
auto-update value, or both:
An auto-initialized column is set to the current timestamp for inserted rows that specify no value for the column.
An auto-updated column is automatically updated to the
current timestamp when the value of any other column in the
row is changed from its current value. An auto-updated
column remains unchanged if all other columns are set to
their current values. To prevent an auto-updated column from
updating when other columns change, explicitly set it to its
current value. To update an auto-updated column even when
other columns do not change, explicitly set it to the value
it should have (for example, set it to
CURRENT_TIMESTAMP
).
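The auto-update rules can be exercised with a few UPDATE statements. This is a sketch only: auto_demo and its columns are hypothetical names, and the comments describe the behavior stated above.

```sql
-- auto_demo is a hypothetical scratch table.
CREATE TABLE auto_demo (
  id      INT PRIMARY KEY,
  val     INT,
  updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);

-- No value given for "updated": it is auto-initialized.
INSERT INTO auto_demo (id, val) VALUES (1, 10);

-- val is set to its current value, so "updated" remains unchanged.
UPDATE auto_demo SET val = 10 WHERE id = 1;

-- val changes, so "updated" is set to the current timestamp.
UPDATE auto_demo SET val = 11 WHERE id = 1;

-- Prevent the auto-update by explicitly assigning the column its current value.
UPDATE auto_demo SET val = 12, updated = updated WHERE id = 1;

-- Force an update even though no other column changes.
UPDATE auto_demo SET updated = CURRENT_TIMESTAMP WHERE id = 1;
```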
In addition, if the
explicit_defaults_for_timestamp
system variable is disabled, you can initialize or update any
TIMESTAMP
(but not
DATETIME
) column to the current date and time
by assigning it a NULL
value, unless it has
been defined with the NULL
attribute to
permit NULL
values.
To specify automatic properties, use the DEFAULT
CURRENT_TIMESTAMP
and ON UPDATE
CURRENT_TIMESTAMP
clauses in column definitions. The
order of the clauses does not matter. If both are present in a
column definition, either can occur first. Any of the synonyms
for CURRENT_TIMESTAMP
have the
same meaning as
CURRENT_TIMESTAMP
. These are
CURRENT_TIMESTAMP()
,
NOW()
,
LOCALTIME
,
LOCALTIME()
,
LOCALTIMESTAMP
, and
LOCALTIMESTAMP()
.
Use of DEFAULT CURRENT_TIMESTAMP
and
ON UPDATE CURRENT_TIMESTAMP
is specific to
TIMESTAMP
and
DATETIME
. The
DEFAULT
clause also can be used to specify a
constant (nonautomatic) default value (for example,
DEFAULT 0
or DEFAULT '2000-01-01
00:00:00'
).
The following examples use DEFAULT 0
, a
default that can produce warnings or errors depending on
whether strict SQL mode or the
NO_ZERO_DATE
SQL mode is
enabled. Be aware that the
TRADITIONAL
SQL mode
includes strict mode and
NO_ZERO_DATE
. See
Section 5.1.11, “Server SQL Modes”.
TIMESTAMP
or
DATETIME
column definitions can
specify the current timestamp for both the default and
auto-update values, for one but not the other, or for neither.
Different columns can have different combinations of automatic
properties. The following rules describe the possibilities:
With both DEFAULT CURRENT_TIMESTAMP
and
ON UPDATE CURRENT_TIMESTAMP
, the column
has the current timestamp for its default value and is
automatically updated to the current timestamp.
CREATE TABLE t1 (
  ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  dt DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
With a DEFAULT
clause but no ON
UPDATE CURRENT_TIMESTAMP
clause, the column has
the given default value and is not automatically updated to
the current timestamp.
The default depends on whether the
DEFAULT
clause specifies
CURRENT_TIMESTAMP
or a constant value.
With CURRENT_TIMESTAMP
, the default is
the current timestamp.
CREATE TABLE t1 (
  ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  dt DATETIME DEFAULT CURRENT_TIMESTAMP
);
With a constant, the default is the given value. In this case, the column has no automatic properties at all.
CREATE TABLE t1 (
  ts TIMESTAMP DEFAULT 0,
  dt DATETIME DEFAULT 0
);
With an ON UPDATE CURRENT_TIMESTAMP
clause and a constant DEFAULT
clause, the
column is automatically updated to the current timestamp and
has the given constant default value.
CREATE TABLE t1 (
  ts TIMESTAMP DEFAULT 0 ON UPDATE CURRENT_TIMESTAMP,
  dt DATETIME DEFAULT 0 ON UPDATE CURRENT_TIMESTAMP
);
With an ON UPDATE CURRENT_TIMESTAMP
clause but no DEFAULT
clause, the column
is automatically updated to the current timestamp but does
not have the current timestamp for its default value.
The default in this case is type dependent.
TIMESTAMP
has a default of 0
unless defined with the NULL
attribute,
in which case the default is NULL
.
CREATE TABLE t1 (
  ts1 TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,      -- default 0
  ts2 TIMESTAMP NULL ON UPDATE CURRENT_TIMESTAMP  -- default NULL
);
DATETIME
has a default of
NULL
unless defined with the NOT
NULL
attribute, in which case the default is 0.
CREATE TABLE t1 (
  dt1 DATETIME ON UPDATE CURRENT_TIMESTAMP,          -- default NULL
  dt2 DATETIME NOT NULL ON UPDATE CURRENT_TIMESTAMP  -- default 0
);
TIMESTAMP
and
DATETIME
columns have no
automatic properties unless they are specified explicitly, with
this exception: If the
explicit_defaults_for_timestamp
system variable is disabled, the first
TIMESTAMP
column has both
DEFAULT CURRENT_TIMESTAMP
and ON
UPDATE CURRENT_TIMESTAMP
if neither is specified
explicitly. To suppress automatic properties for the first
TIMESTAMP
column, use one of
these strategies:
Enable the
explicit_defaults_for_timestamp
system variable. In this case, the DEFAULT
CURRENT_TIMESTAMP
and ON UPDATE
CURRENT_TIMESTAMP
clauses that specify automatic
initialization and updating are available, but are not
assigned to any TIMESTAMP
column unless explicitly included in the column definition.
Alternatively, if
explicit_defaults_for_timestamp
is disabled, do either of the following:
Define the column with a DEFAULT
clause that specifies a constant default value.
Specify the NULL
attribute. This also
causes the column to permit NULL
values, which means that you cannot assign the current
timestamp by setting the column to
NULL
. Assigning
NULL
sets the column to
NULL
, not the current timestamp. To
assign the current timestamp, set the column to
CURRENT_TIMESTAMP
or a
synonym such as NOW()
.
Consider these table definitions:
CREATE TABLE t1 (
  ts1 TIMESTAMP DEFAULT 0,
  ts2 TIMESTAMP DEFAULT CURRENT_TIMESTAMP
                ON UPDATE CURRENT_TIMESTAMP);
CREATE TABLE t2 (
  ts1 TIMESTAMP NULL,
  ts2 TIMESTAMP DEFAULT CURRENT_TIMESTAMP
                ON UPDATE CURRENT_TIMESTAMP);
CREATE TABLE t3 (
  ts1 TIMESTAMP NULL DEFAULT 0,
  ts2 TIMESTAMP DEFAULT CURRENT_TIMESTAMP
                ON UPDATE CURRENT_TIMESTAMP);
The tables have these properties:
In each table definition, the first
TIMESTAMP
column has no
automatic initialization or updating.
The tables differ in how the ts1
column
handles NULL
values. For
t1
, ts1
is
NOT NULL
and assigning it a value of
NULL
sets it to the current timestamp.
For t2
and t3
,
ts1
permits NULL
and
assigning it a value of NULL
sets it to
NULL
.
t2
and t3
differ in
the default value for ts1
. For
t2
, ts1
is defined to
permit NULL
, so the default is also
NULL
in the absence of an explicit
DEFAULT
clause. For
t3
, ts1
permits
NULL
but has an explicit default of 0.
If a TIMESTAMP
or
DATETIME
column definition
includes an explicit fractional seconds precision value
anywhere, the same value must be used throughout the column
definition. This is permitted:
CREATE TABLE t1 ( ts TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6) );
This is not permitted:
CREATE TABLE t1 ( ts TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP(3) );
If the
explicit_defaults_for_timestamp
system variable is disabled,
TIMESTAMP
columns by default are
NOT NULL
, cannot contain
NULL
values, and assigning
NULL
assigns the current timestamp. To permit
a TIMESTAMP
column to contain
NULL
, explicitly declare it with the
NULL
attribute. In this case, the default
value also becomes NULL
unless overridden
with a DEFAULT
clause that specifies a
different default value. DEFAULT NULL
can be
used to explicitly specify NULL
as the
default value. (For a TIMESTAMP
column not declared with the NULL
attribute,
DEFAULT NULL
is invalid.) If a
TIMESTAMP
column permits
NULL
values, assigning
NULL
sets it to NULL
, not
to the current timestamp.
The following table contains several
TIMESTAMP
columns that permit
NULL
values:
CREATE TABLE t (
  ts1 TIMESTAMP NULL DEFAULT NULL,
  ts2 TIMESTAMP NULL DEFAULT 0,
  ts3 TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP
);
A TIMESTAMP
column that permits
NULL
values does not
take on the current timestamp at insert time except under one of
the following conditions:
Its default value is defined as
CURRENT_TIMESTAMP
and no
value is specified for the column
CURRENT_TIMESTAMP
or any of
its synonyms such as NOW()
is
explicitly inserted into the column
In other words, a TIMESTAMP
column defined to permit NULL
values
auto-initializes only if its definition includes
DEFAULT CURRENT_TIMESTAMP
:
CREATE TABLE t (ts TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP);
If the TIMESTAMP
column permits
NULL
values but its definition does not
include DEFAULT CURRENT_TIMESTAMP
, you must
explicitly insert a value corresponding to the current date and
time. Suppose that tables t1
and
t2
have these definitions:
CREATE TABLE t1 (ts TIMESTAMP NULL DEFAULT '0000-00-00 00:00:00');
CREATE TABLE t2 (ts TIMESTAMP NULL DEFAULT NULL);
To set the TIMESTAMP
column in
either table to the current timestamp at insert time, explicitly
assign it that value. For example:
INSERT INTO t2 VALUES (CURRENT_TIMESTAMP);
INSERT INTO t1 VALUES (NOW());
If the
explicit_defaults_for_timestamp
system variable is enabled,
TIMESTAMP
columns permit
NULL
values only if declared with the
NULL
attribute. Also,
TIMESTAMP
columns do not permit
assigning NULL
to assign the current
timestamp, whether declared with the NULL
or
NOT NULL
attribute. To assign the current
timestamp, set the column to
CURRENT_TIMESTAMP
or a synonym
such as NOW()
.
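To see which of the two behaviors a given server applies, the variable can be inspected and the resulting column definition examined. This is a sketch, assuming a hypothetical scratch table ts_check. The comments summarize the rules described in this section and are not captured output.

```sql
-- 1 means enabled, 0 means disabled.
SELECT @@explicit_defaults_for_timestamp;

-- ts_check is a hypothetical scratch table with no explicit attributes.
CREATE TABLE ts_check (ts TIMESTAMP);

-- With the variable enabled, ts is expected to permit NULL and to have no
-- automatic properties. With it disabled, ts (the first TIMESTAMP column) is
-- expected to be NOT NULL with DEFAULT CURRENT_TIMESTAMP and
-- ON UPDATE CURRENT_TIMESTAMP.
SHOW CREATE TABLE ts_check;
```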
MySQL 8.0 has fractional seconds support for
TIME
,
DATETIME
, and
TIMESTAMP
values, with up to
microseconds (6 digits) precision:
To define a column that includes a fractional seconds part,
use the syntax type_name(fsp),
where type_name
is
TIME
,
DATETIME
, or
TIMESTAMP
, and
fsp
is the fractional seconds
precision. For example:
CREATE TABLE t1 (t TIME(3), dt DATETIME(6));
The fsp
value, if given, must be
in the range 0 to 6. A value of 0 signifies that there is no
fractional part. If omitted, the default precision is 0.
(This differs from the standard SQL default of 6, for
compatibility with previous MySQL versions.)
Inserting a TIME
,
DATETIME
, or
TIMESTAMP
value with a
fractional seconds part into a column of the same type but
having fewer fractional digits results in rounding. Consider
a table created and populated as follows:
CREATE TABLE fractest( c1 TIME(2), c2 DATETIME(2), c3 TIMESTAMP(2) );
INSERT INTO fractest VALUES
('17:51:04.777', '2018-09-08 17:51:04.777', '2018-09-08 17:51:04.777');
The temporal values are inserted into the table with rounding:
mysql> SELECT * FROM fractest;
+-------------+------------------------+------------------------+
| c1 | c2 | c3 |
+-------------+------------------------+------------------------+
| 17:51:04.78 | 2018-09-08 17:51:04.78 | 2018-09-08 17:51:04.78 |
+-------------+------------------------+------------------------+
No warning or error is given when such rounding occurs. This behavior follows the SQL standard.
To insert the values with truncation instead, enable the
TIME_TRUNCATE_FRACTIONAL
SQL mode:
SET @@sql_mode = sys.list_add(@@sql_mode, 'TIME_TRUNCATE_FRACTIONAL');
With that SQL mode enabled, the temporal values are inserted with truncation:
mysql> SELECT * FROM fractest;
+-------------+------------------------+------------------------+
| c1 | c2 | c3 |
+-------------+------------------------+------------------------+
| 17:51:04.77 | 2018-09-08 17:51:04.77 | 2018-09-08 17:51:04.77 |
+-------------+------------------------+------------------------+
Functions that take temporal arguments accept values with
fractional seconds. Return values from temporal functions
include fractional seconds as appropriate. For example,
NOW()
with no argument
returns the current date and time with no fractional part,
but takes an optional argument from 0 to 6 to specify that
the return value includes a fractional seconds part of that
many digits.
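For instance, the optional precision argument can be compared across calls. The column aliases are illustrative, and the actual values depend on when the statement runs.

```sql
SELECT NOW()      AS no_fraction,    -- no fractional part
       NOW(3)     AS milliseconds,   -- three fractional digits
       NOW(6)     AS microseconds,   -- six fractional digits
       CURTIME(3) AS time_with_ms;   -- time only, three fractional digits
```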
Syntax for temporal literals produces temporal values:
DATE 'str', TIME 'str', and TIMESTAMP 'str', and the
ODBC-syntax equivalents. The resulting value includes a
trailing fractional seconds part if specified. Previously,
the temporal type keyword was ignored and these constructs
produced the string value. See
Standard SQL and ODBC Date and Time Literals.
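A brief illustration of the standard SQL literal forms and their ODBC-syntax equivalents. The literal values are arbitrary examples.

```sql
-- Standard SQL temporal literals.
SELECT DATE '2018-09-08'                   AS d,
       TIME '17:51:04.777'                 AS t,
       TIMESTAMP '2018-09-08 17:51:04.777' AS ts;

-- ODBC-syntax equivalents.
SELECT { d '2018-09-08' }                AS d,
       { t '17:51:04.777' }              AS t,
       { ts '2018-09-08 17:51:04.777' }  AS ts;
```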
To some extent, you can convert a value from one temporal type
to another. However, there may be some alteration of the value
or loss of information. In all cases, conversion between
temporal types is subject to the range of valid values for the
resulting type. For example, although
DATE
,
DATETIME
, and
TIMESTAMP
values all can be
specified using the same set of formats, the types do not all
have the same range of values.
TIMESTAMP
values cannot be
earlier than 1970
UTC or later than
'2038-01-19 03:14:07'
UTC. This means that a
date such as '1968-01-01'
, while valid as a
DATE
or
DATETIME
value, is not valid as a
TIMESTAMP
value and is converted
to 0
.
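Conversion of DATE
values:
Conversion to a DATETIME or TIMESTAMP value adds a time part of '00:00:00' because the DATE value contains no time information.
Conversion to a TIME value is not useful; the result is '00:00:00'.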
Conversion of DATETIME
and
TIMESTAMP
values:
Conversion to a DATE
value
takes fractional seconds into account and rounds the time
part. For example, '1999-12-31
23:59:59.499'
becomes
'1999-12-31'
, whereas
'1999-12-31 23:59:59.500'
becomes
'2000-01-01'
.
Conversion to a TIME
value
discards the date part because the
TIME
type contains no date
information.
For conversion of TIME
values to
other temporal types, the value of
CURRENT_DATE()
is used for the
date part. The TIME
is
interpreted as elapsed time (not time of day) and added to the
date. This means that the date part of the result differs from
the current date if the time value is outside the range from
'00:00:00'
to '23:59:59'
.
Suppose that the current date is
'2012-01-01'
.
TIME
values of
'12:00:00'
, '24:00:00'
,
and '-12:00:00'
, when converted to
DATETIME
or
TIMESTAMP
values, result in
'2012-01-01 12:00:00'
, '2012-01-02
00:00:00'
, and '2011-12-31
12:00:00'
, respectively.
Conversion of TIME
to
DATE
is similar but discards the
time part from the result: '2012-01-01'
,
'2012-01-02'
, and
'2011-12-31'
, respectively.
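The elapsed-time interpretation can be observed relative to the current date. The aliases are illustrative, and the results depend on the day the statement runs.

```sql
SELECT CURRENT_DATE()                                    AS today,
       CAST(CAST('12:00:00'  AS TIME) AS DATETIME)       AS same_day_noon,
       CAST(CAST('24:00:00'  AS TIME) AS DATETIME)       AS next_day_midnight,
       CAST(CAST('-12:00:00' AS TIME) AS DATETIME)       AS previous_day_noon,
       CAST(CAST('24:00:00'  AS TIME) AS DATE)           AS next_day;
```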
Explicit conversion can be used to override implicit conversion.
For example, in comparison of
DATE
and
DATETIME
values, the
DATE
value is coerced to the
DATETIME
type by adding a time
part of '00:00:00'
. To perform the comparison
by ignoring the time part of the
DATETIME
value instead, use the
CAST()
function in the following
way:
date_col = CAST(datetime_col AS DATE)
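The difference between the implicit and explicit comparison can be shown with literals in place of columns. The values are arbitrary examples.

```sql
SELECT
  -- Implicit: the DATE is coerced to '2012-01-01 00:00:00', so the values differ.
  DATE '2012-01-01' = TIMESTAMP '2012-01-01 10:00:00'               AS implicit_cmp,  -- 0
  -- Explicit: the time part is dropped before the comparison.
  DATE '2012-01-01' = CAST(TIMESTAMP '2012-01-01 10:00:00' AS DATE) AS explicit_cmp;  -- 1
```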
Conversion of TIME
and
DATETIME
values to numeric form
(for example, by adding +0
) depends on
whether the value contains a fractional seconds part.
TIME(N) or DATETIME(N) is converted to integer when N is 0
(or omitted) and to a DECIMAL value with N decimal digits when
N is greater than 0:
mysql> SELECT CURTIME(), CURTIME()+0, CURTIME(3)+0;
+-----------+-------------+--------------+
| CURTIME() | CURTIME()+0 | CURTIME(3)+0 |
+-----------+-------------+--------------+
| 09:28:00  |       92800 |    92800.887 |
+-----------+-------------+--------------+
mysql> SELECT NOW(), NOW()+0, NOW(3)+0;
+---------------------+----------------+--------------------+
| NOW()               | NOW()+0        | NOW(3)+0           |
+---------------------+----------------+--------------------+
| 2012-08-15 09:28:00 | 20120815092800 | 20120815092800.889 |
+---------------------+----------------+--------------------+
Date values with two-digit years are ambiguous because the century is unknown. Such values must be interpreted into four-digit form because MySQL stores years internally using four digits.
For DATETIME
,
DATE
, and
TIMESTAMP
types, MySQL interprets
dates specified with ambiguous year values using these rules:
Year values in the range 00-69
are
converted to 2000-2069
.
Year values in the range 70-99
are
converted to 1970-1999
.
For YEAR
, the rules are the same, with this
exception: A numeric 00
inserted into
YEAR(4)
results in 0000
rather than 2000
. To specify zero for
YEAR(4)
and have it be interpreted as
2000
, specify it as a string
'0'
or '00'
.
Remember that these rules are only heuristics that provide reasonable guesses as to what your data values mean. If the rules used by MySQL do not produce the values you require, you must provide unambiguous input containing four-digit year values.
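The boundary between the two ranges can be checked directly. The commented results follow the rules listed above.

```sql
SELECT CAST('69-01-01'   AS DATE) AS yy_69,        -- 2069-01-01
       CAST('70-01-01'   AS DATE) AS yy_70,        -- 1970-01-01
       CAST('1969-01-01' AS DATE) AS unambiguous;  -- 1969-01-01, four-digit year given
```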
ORDER BY
properly sorts
YEAR
values that have two-digit
years.
Some functions like MIN()
and
MAX()
convert a
YEAR
to a number. This means that
a value with a two-digit year does not work properly with these
functions. The fix in this case is to convert the
YEAR
to four-digit year format.
The string types are CHAR
,
VARCHAR
,
BINARY
,
VARBINARY
,
BLOB
,
TEXT
,
ENUM
, and
SET
. This section describes how
these types work and how to use them in your queries. For string
type storage requirements, see
Section 11.8, “Data Type Storage Requirements”.
The CHAR
and VARCHAR
types
are similar, but differ in the way they are stored and
retrieved. They also differ in maximum length and in whether
trailing spaces are retained.
The CHAR
and VARCHAR
types
are declared with a length that indicates the maximum number of
characters you want to store. For example,
CHAR(30)
can hold up to 30 characters.
The length of a CHAR
column is fixed to the
length that you declare when you create the table. The length
can be any value from 0 to 255. When CHAR
values are stored, they are right-padded with spaces to the
specified length. When CHAR
values are
retrieved, trailing spaces are removed unless the
PAD_CHAR_TO_FULL_LENGTH
SQL
mode is enabled.
Values in VARCHAR
columns are variable-length
strings. The length can be specified as a value from 0 to
65,535. The effective maximum length of a
VARCHAR
is subject to the maximum row size
(65,535 bytes, which is shared among all columns) and the
character set used. See Section C.10.4, “Limits on Table Column Count and Row Size”.
In contrast to CHAR
,
VARCHAR
values are stored as a 1-byte or
2-byte length prefix plus data. The length prefix indicates the
number of bytes in the value. A column uses one length byte if
values require no more than 255 bytes, two length bytes if
values may require more than 255 bytes.
If strict SQL mode is not enabled and you assign a value to a
CHAR
or VARCHAR
column
that exceeds the column's maximum length, the value is truncated
to fit and a warning is generated. For truncation of nonspace
characters, you can cause an error to occur (rather than a
warning) and suppress insertion of the value by using strict SQL
mode. See Section 5.1.11, “Server SQL Modes”.
For VARCHAR
columns, trailing spaces in
excess of the column length are truncated prior to insertion and
a warning is generated, regardless of the SQL mode in use. For
CHAR
columns, truncation of excess trailing
spaces from inserted values is performed silently regardless of
the SQL mode.
VARCHAR
values are not padded when they are
stored. Trailing spaces are retained when values are stored and
retrieved, in conformance with standard SQL.
The following table illustrates the differences between
CHAR
and VARCHAR
by
showing the result of storing various string values into
CHAR(4)
and VARCHAR(4)
columns (assuming that the column uses a single-byte character
set such as latin1
).
| Value | CHAR(4) | Storage Required | VARCHAR(4) | Storage Required |
|---|---|---|---|---|
| '' | '    ' | 4 bytes | '' | 1 byte |
| 'ab' | 'ab  ' | 4 bytes | 'ab' | 3 bytes |
| 'abcd' | 'abcd' | 4 bytes | 'abcd' | 5 bytes |
| 'abcdefgh' | 'abcd' | 4 bytes | 'abcd' | 5 bytes |
The values shown as stored in the last row of the table apply only when not using strict mode; if MySQL is running in strict mode, values that exceed the column length are not stored, and an error results.
InnoDB
encodes fixed-length fields greater
than or equal to 768 bytes in length as variable-length fields,
which can be stored off-page. For example, a
CHAR(255)
column can exceed 768 bytes if the
maximum byte length of the character set is greater than 3, as
it is with utf8mb4
.
If a given value is stored into the CHAR(4)
and VARCHAR(4)
columns, the values retrieved
from the columns are not always the same because trailing spaces
are removed from CHAR
columns upon retrieval.
The following example illustrates this difference:
mysql> CREATE TABLE vc (v VARCHAR(4), c CHAR(4));
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO vc VALUES ('ab ', 'ab ');
Query OK, 1 row affected (0.00 sec)

mysql> SELECT CONCAT('(', v, ')'), CONCAT('(', c, ')') FROM vc;
+---------------------+---------------------+
| CONCAT('(', v, ')') | CONCAT('(', c, ')') |
+---------------------+---------------------+
| (ab )               | (ab)                |
+---------------------+---------------------+
1 row in set (0.06 sec)
Values in CHAR
and VARCHAR
columns are sorted and compared according to the character set
collation assigned to the column.
Most MySQL collations have a pad attribute of PAD SPACE. The exceptions are Unicode collations based on UCA 9.0.0 and higher, which have a pad attribute of NO PAD (see Section 10.10.1, “Unicode Character Sets”).
To determine the pad attribute for a collation, use the
INFORMATION_SCHEMA
COLLATIONS
table, which has a
PAD_ATTRIBUTE
column.
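A query along these lines reports the pad attribute. The collations named in the IN list are just examples chosen for comparison.

```sql
SELECT COLLATION_NAME, PAD_ATTRIBUTE
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE COLLATION_NAME IN ('utf8mb4_general_ci', 'utf8mb4_0900_ai_ci', 'latin1_swedish_ci')
ORDER BY COLLATION_NAME;
-- utf8mb4_0900_ai_ci is based on UCA 9.0.0 and should report NO PAD;
-- the other two should report PAD SPACE.
```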
The pad attribute determines how trailing spaces are treated for
comparison of nonbinary strings (CHAR
,
VARCHAR
, and TEXT
values).
NO PAD collations treat spaces at the end of strings like any
other character. For PAD SPACE collations, trailing spaces are
insignificant in comparisons; strings are compared without
regard to any trailing spaces. “Comparison” in this
context does not include the LIKE
pattern-matching operator, for which trailing spaces are
significant. For example:
mysql> CREATE TABLE names (myname CHAR(10));
Query OK, 0 rows affected (0.03 sec)

mysql> INSERT INTO names VALUES ('Monty');
Query OK, 1 row affected (0.00 sec)

mysql> SELECT myname = 'Monty', myname = 'Monty  ' FROM names;
+------------------+--------------------+
| myname = 'Monty' | myname = 'Monty  ' |
+------------------+--------------------+
|                1 |                  1 |
+------------------+--------------------+
1 row in set (0.00 sec)

mysql> SELECT myname LIKE 'Monty', myname LIKE 'Monty  ' FROM names;
+---------------------+-----------------------+
| myname LIKE 'Monty' | myname LIKE 'Monty  ' |
+---------------------+-----------------------+
|                   1 |                     0 |
+---------------------+-----------------------+
1 row in set (0.00 sec)
This is true for all MySQL versions, and is not affected by the server SQL mode.
For more information about MySQL character sets and collations, see Chapter 10, Character Sets, Collations, Unicode. For additional information about storage requirements, see Section 11.8, “Data Type Storage Requirements”.
For those cases where trailing pad characters are stripped or
comparisons ignore them, if a column has an index that requires
unique values, inserting into the column values that differ only
in number of trailing pad characters will result in a
duplicate-key error. For example, if a table contains
'a'
, an attempt to store
'a '
causes a duplicate-key error.
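The duplicate-key case can be reproduced with a PAD SPACE collation. In this sketch pad_demo is a hypothetical table name, and utf8mb4_general_ci is chosen only because its pad attribute is PAD SPACE.

```sql
-- pad_demo is a hypothetical scratch table using a PAD SPACE collation.
CREATE TABLE pad_demo (
  v VARCHAR(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  UNIQUE KEY (v)
);

INSERT INTO pad_demo VALUES ('a');

-- Differs from 'a' only by a trailing space, which the comparison ignores,
-- so this statement is expected to fail with a duplicate-key error.
INSERT INTO pad_demo VALUES ('a ');
```

With a NO PAD collation such as utf8mb4_0900_ai_ci, the trailing space is significant, so the second insert would not collide.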
The BINARY
and VARBINARY
types are similar to CHAR
and
VARCHAR
, except that they contain
binary strings rather than nonbinary strings. That is, they
contain byte strings rather than character strings. This means
they have the binary
character set and
collation, and comparison and sorting are based on the numeric
values of the bytes in the values.
The permissible maximum length is the same for
BINARY
and VARBINARY
as it
is for CHAR
and
VARCHAR
, except that the length
for BINARY
and VARBINARY
is a length in bytes rather than in characters.
The BINARY
and VARBINARY
data types are distinct from the CHAR BINARY
and VARCHAR BINARY
data types. For the latter
types, the BINARY
attribute does not cause
the column to be treated as a binary string column. Instead, it
causes the binary (_bin
) collation for the
column character set to be used, and the column itself contains
nonbinary character strings rather than binary byte strings. For
example, CHAR(5) BINARY
is treated as
CHAR(5) CHARACTER SET utf8mb4 COLLATE
utf8mb4_bin
, assuming that the default character set
is utf8mb4
. This differs from
BINARY(5)
, which stores 5-byte binary
strings that have the binary
character set
and collation. For information about differences between binary
strings and binary collations for nonbinary strings, see
Section 10.8.5, “The binary Collation Compared to _bin Collations”.
If strict SQL mode is not enabled and you assign a value to a
BINARY
or VARBINARY
column
that exceeds the column's maximum length, the value is truncated
to fit and a warning is generated. For cases of truncation, you
can cause an error to occur (rather than a warning) and suppress
insertion of the value by using strict SQL mode. See
Section 5.1.11, “Server SQL Modes”.
When BINARY
values are stored, they are
right-padded with the pad value to the specified length. The pad
value is 0x00
(the zero byte). Values are
right-padded with 0x00
on insert, and no
trailing bytes are removed on select. All bytes are significant
in comparisons, including ORDER BY
and
DISTINCT
operations. 0x00
bytes and spaces are different in comparisons, with
0x00
< space.
Example: For a BINARY(3)
column,
'a '
becomes
'a \0'
when inserted.
'a\0'
becomes 'a\0\0'
when
inserted. Both inserted values remain unchanged when selected.
For VARBINARY
, there is no padding on insert
and no bytes are stripped on select. All bytes are significant
in comparisons, including ORDER BY
and
DISTINCT
operations. 0x00
bytes and spaces are different in comparisons, with
0x00
< space.
For those cases where trailing pad bytes are stripped or
comparisons ignore them, if a column has an index that requires
unique values, inserting into the column values that differ only
in number of trailing pad bytes will result in a duplicate-key
error. For example, if a table contains 'a'
,
an attempt to store 'a\0'
causes a
duplicate-key error.
You should consider the preceding padding and stripping
characteristics carefully if you plan to use the
BINARY
data type for storing binary data and
you require that the value retrieved be exactly the same as the
value stored. The following example illustrates how
0x00
-padding of BINARY
values affects column value comparisons:
mysql> CREATE TABLE t (c BINARY(3));
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO t SET c = 'a';
Query OK, 1 row affected (0.01 sec)
mysql> SELECT HEX(c), c = 'a', c = 'a\0\0' from t;
+--------+---------+-------------+
| HEX(c) | c = 'a' | c = 'a\0\0' |
+--------+---------+-------------+
| 610000 |       0 |           1 |
+--------+---------+-------------+
1 row in set (0.09 sec)
If the value retrieved must be the same as the value specified
for storage with no padding, it might be preferable to use
VARBINARY
or one of the
BLOB
data types instead.
A BLOB is a binary large object that can hold a variable amount of data. The four BLOB types are TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB. These differ only in the maximum length of the values they can hold.
The four TEXT types are TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT. These correspond to the four BLOB types and have the same maximum lengths and storage requirements. See Section 11.8, “Data Type Storage Requirements”.
BLOB
values are treated as binary strings
(byte strings). They have the binary
character set and collation, and comparison and sorting are
based on the numeric values of the bytes in column values.
TEXT
values are treated as nonbinary strings
(character strings). They have a character set other than
binary
, and values are sorted and compared
based on the collation of the character set.
If strict SQL mode is not enabled and you assign a value to a
BLOB
or TEXT
column that
exceeds the column's maximum length, the value is truncated to
fit and a warning is generated. For truncation of nonspace
characters, you can cause an error to occur (rather than a
warning) and suppress insertion of the value by using strict SQL
mode. See Section 5.1.11, “Server SQL Modes”.
Truncation of excess trailing spaces from values to be inserted
into TEXT
columns always
generates a warning, regardless of the SQL mode.
For TEXT
and BLOB
columns,
there is no padding on insert and no bytes are stripped on
select.
If a TEXT
column is indexed, index entry
comparisons are space-padded at the end. This means that, if the
index requires unique values, duplicate-key errors will occur
for values that differ only in the number of trailing spaces.
For example, if a table contains 'a'
, an
attempt to store 'a '
causes a
duplicate-key error. This is not true for
BLOB
columns.
In most respects, you can regard a BLOB
column as a VARBINARY
column that
can be as large as you like. Similarly, you can regard a
TEXT
column as a
VARCHAR
column.
BLOB
and TEXT
differ from
VARBINARY
and
VARCHAR
in the following ways:
For indexes on BLOB
and
TEXT
columns, you must specify an index
prefix length. For CHAR
and
VARCHAR
, a prefix length is
optional. See Section 8.3.5, “Column Indexes”.
If you use the BINARY
attribute with a
TEXT
data type, the column is assigned the
binary (_bin
) collation of the column
character set.
LONG
and LONG VARCHAR
map
to the MEDIUMTEXT
data type. This is a
compatibility feature.
MySQL Connector/ODBC defines BLOB
values as
LONGVARBINARY
and TEXT
values as LONGVARCHAR
.
Because BLOB
and TEXT
values can be extremely long, you might encounter some
constraints in using them:
Only the first
max_sort_length
bytes of
the column are used when sorting. The default value of
max_sort_length
is 1024.
You can make more bytes significant in sorting or grouping
by increasing the value of
max_sort_length
at server
startup or runtime. Any client can change the value of its
session max_sort_length
variable:
mysql> SET max_sort_length = 2000;
mysql> SELECT id, comment FROM t
    -> ORDER BY comment;
When a query that is processed using a temporary table has BLOB or TEXT columns in its result, the server uses a table on disk rather than in memory because the MEMORY storage engine does not support those data types (see Section 8.4.4, “Internal Temporary Table Use in MySQL”). Use of disk incurs a performance penalty, so include BLOB or TEXT columns in the query result only if they are really needed. For example, avoid using SELECT *, which selects all columns.
The maximum size of a BLOB
or
TEXT
object is determined by its type,
but the largest value you actually can transmit between the
client and server is determined by the amount of available
memory and the size of the communications buffers. You can
change the message buffer size by changing the value of the
max_allowed_packet
variable, but you must do so for both the server and your
client program. For example, both mysql
and mysqldump enable you to change the
client-side
max_allowed_packet
value.
See Section 5.1.1, “Configuring the Server”,
Section 4.5.1, “mysql — The MySQL Command-Line Client”, and Section 4.5.4, “mysqldump — A Database Backup Program”.
You may also want to compare the packet sizes and the size of the data objects you are storing with the storage requirements; see Section 11.8, “Data Type Storage Requirements”.
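The max_sort_length constraint described in the list above can be pictured with a short Python sketch. It is a model under stated assumptions, not the server's sorting code: `sort_with_prefix` is a hypothetical helper, and the order of rows that tie on the compared prefix simply follows Python's stable sort.

```python
def sort_with_prefix(values: list[bytes], max_sort_length: int) -> list[bytes]:
    """Sort values while comparing only their first max_sort_length bytes."""
    # Bytes beyond the prefix play no part in the comparison.
    return sorted(values, key=lambda v: v[:max_sort_length])


rows = [b"abcZ", b"abcA", b"abb"]
print(sort_with_prefix(rows, 3))  # 'abcZ' and 'abcA' tie on the 3-byte prefix
print(sort_with_prefix(rows, 4))  # the fourth byte now decides the order
```

Raising the limit makes more bytes significant, which is what increasing max_sort_length does for ORDER BY on long values.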
Each BLOB
or TEXT
value is
represented internally by a separately allocated object. This is
in contrast to all other data types, for which storage is
allocated once per column when the table is opened.
In some cases, it may be desirable to store binary data such as
media files in BLOB
or
TEXT
columns. You may find MySQL's string
handling functions useful for working with such data. See
Section 12.5, “String Functions”. For security and other
reasons, it is usually preferable to do so using application
code rather than giving application users the
FILE
privilege. You can discuss
specifics for various languages and platforms in the MySQL
Forums (http://forums.mysql.com/).
An ENUM
is a string object with a value
chosen from a list of permitted values that are enumerated
explicitly in the column specification at table creation time.
It has these advantages:
Compact data storage in situations where a column has a
limited set of possible values. The strings you specify as
input values are automatically encoded as numbers. See
Section 11.8, “Data Type Storage Requirements” for the storage
requirements for ENUM
types.
Readable queries and output. The numbers are translated back to the corresponding strings in query results.
and these potential issues to consider:
If you make enumeration values that look like numbers, it is easy to mix up the literal values with their internal index numbers, as explained in Enumeration Limitations.
Using ENUM
columns in ORDER
BY
clauses requires extra care, as explained in
Enumeration Sorting.
An enumeration value must be a quoted string literal. For
example, you can create a table with an
ENUM
column like this:
CREATE TABLE shirts (
    name VARCHAR(40),
    size ENUM('x-small', 'small', 'medium', 'large', 'x-large')
);
INSERT INTO shirts (name, size) VALUES ('dress shirt','large'), ('t-shirt','medium'),
  ('polo shirt','small');
SELECT name, size FROM shirts WHERE size = 'medium';
+---------+--------+
| name    | size   |
+---------+--------+
| t-shirt | medium |
+---------+--------+
UPDATE shirts SET size = 'small' WHERE size = 'large';
COMMIT;
Inserting 1 million rows into this table with a value of
'medium'
would require 1 million bytes of
storage, as opposed to 6 million bytes if you stored the
actual string 'medium'
in a
VARCHAR
column.
Each enumeration value has an index:
The elements listed in the column specification are assigned index numbers, beginning with 1.
The index value of the empty string error value is 0. This
means that you can use the following
SELECT
statement to find
rows into which invalid ENUM
values
were assigned:
mysql> SELECT * FROM tbl_name WHERE enum_col=0;
The index of the NULL
value is
NULL
.
The term “index” here refers to a position within the list of enumeration values. It has nothing to do with table indexes.
For example, a column specified as ENUM('Mercury',
'Venus', 'Earth')
can have any of the values shown
here. The index of each value is also shown.
Value | Index |
---|---|
NULL | NULL |
'' | 0 |
'Mercury' | 1 |
'Venus' | 2 |
'Earth' | 3 |
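The index rules in the table can be restated as a small Python model. `enum_index` is a hypothetical helper, and the sketch assumes the empty string is not itself a declared member.

```python
from typing import Optional, Sequence


def enum_index(members: Sequence[str], value: Optional[str]) -> Optional[int]:
    """Return the index of value for a column declared with the given members."""
    if value is None:
        return None  # the index of NULL is NULL
    if value == "":
        return 0     # the empty-string error value has index 0
    # Declared members are numbered from 1; a non-member raises ValueError here.
    return members.index(value) + 1


planets = ("Mercury", "Venus", "Earth")  # ENUM('Mercury', 'Venus', 'Earth')
for v in (None, "", "Mercury", "Venus", "Earth"):
    print(repr(v), enum_index(planets, v))
```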
An ENUM
column can have a
maximum of 65,535 distinct elements.
If you retrieve an ENUM
value in a numeric
context, the column value's index is returned. For example,
you can retrieve numeric values from an
ENUM
column like this:
mysql> SELECT enum_col+0 FROM tbl_name;
Functions such as SUM()
or
AVG()
that expect a numeric
argument cast the argument to a number if necessary. For
ENUM
values, the index number is used in
the calculation.
Trailing spaces are automatically deleted from
ENUM
member values in the table definition
when a table is created.
When retrieved, values stored into an ENUM
column are displayed using the lettercase that was used in the
column definition. Note that ENUM
columns
can be assigned a character set and collation. For binary or
case-sensitive collations, lettercase is taken into account
when assigning values to the column.
If you store a number into an ENUM
column,
the number is treated as the index into the possible values,
and the value stored is the enumeration member with that
index. (However, this does not work with
LOAD DATA
, which treats all
input as strings.) If the numeric value is quoted, it is still
interpreted as an index if there is no matching string in the
list of enumeration values. For these reasons, it is not
advisable to define an ENUM
column with
enumeration values that look like numbers, because this can
easily become confusing. For example, the following column has
enumeration members with string values of
'0'
, '1'
, and
'2'
, but numeric index values of
1
, 2
, and
3
:
numbers ENUM('0','1','2')
If you store 2
, it is interpreted as an
index value, and becomes '1'
(the value
with index 2). If you store '2'
, it matches
an enumeration value, so it is stored as
'2'
. If you store '3'
,
it does not match any enumeration value, so it is treated as
an index and becomes '2'
(the value with
index 3).
mysql> INSERT INTO t (numbers) VALUES(2),('2'),('3');
mysql> SELECT * FROM t;
+---------+
| numbers |
+---------+
| 1       |
| 2       |
| 2       |
+---------+
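The three cases in this example follow one decision order, sketched here in Python. `resolve_enum` is a hypothetical helper that models non-strict SQL mode; returning the empty string for out-of-range or unmatched input is an assumption based on the error-value behavior described in this section.

```python
def resolve_enum(members, assigned):
    """Return the member stored for an assigned value, or '' for invalid input."""
    if isinstance(assigned, int):
        index = assigned             # unquoted number: always an index
    elif assigned in members:
        return assigned              # quoted value that matches a member
    elif assigned.isdigit():
        index = int(assigned)        # quoted number with no match: an index
    else:
        return ""                    # anything else: the error value
    return members[index - 1] if 1 <= index <= len(members) else ""


numbers = ("0", "1", "2")            # numbers ENUM('0','1','2')
print([resolve_enum(numbers, v) for v in (2, "2", "3")])
```

The printed list matches the three rows returned by the SELECT statement above.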
To determine all possible values for an ENUM column, use SHOW COLUMNS FROM tbl_name LIKE 'enum_col' and parse the ENUM definition in the Type column of the output.
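Parsing the Type text on the client side might look like the following Python sketch. It assumes the text has the form enum('a','b',...) and that a single quote inside a member is written as two quotes; both are assumptions about the output format, and `parse_member_list` is a hypothetical helper, not part of any MySQL connector.

```python
import re


def parse_member_list(type_text: str) -> list[str]:
    """Extract member strings from text such as "enum('small','medium')"."""
    match = re.fullmatch(r"(?:enum|set)\((.*)\)", type_text.strip(),
                         re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("not an ENUM or SET definition")
    # Each member is a single-quoted literal; '' inside a literal is one quote.
    literals = re.findall(r"'((?:[^']|'')*)'", match.group(1))
    return [literal.replace("''", "'") for literal in literals]


print(parse_member_list("enum('x-small','small','medium','large','x-large')"))
```

The same helper accepts a set(...) definition, since the member list is written the same way.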
In the C API, ENUM
values are returned as
strings. For information about using result set metadata to
distinguish them from other strings, see
Section 28.7.5, “C API Data Structures”.
An enumeration value can also be the empty string
(''
) or NULL
under
certain circumstances:
If you insert an invalid value into an
ENUM
(that is, a string not present in
the list of permitted values), the empty string is
inserted instead as a special error value. This string can
be distinguished from a “normal” empty string
by the fact that this string has the numeric value 0. See
Index Values for Enumeration Literals for details about the
numeric indexes for the enumeration values.
If strict SQL mode is enabled, attempts to insert invalid
ENUM
values result in an error.
If an ENUM
column is declared to permit
NULL
, the NULL
value
is a valid value for the column, and the default value is
NULL
. If an ENUM
column is declared NOT NULL
, its
default value is the first element of the list of
permitted values.
ENUM
values are sorted based on their index
numbers, which depend on the order in which the enumeration
members were listed in the column specification. For example,
'b'
sorts before 'a'
for
ENUM('b', 'a')
. The empty string sorts
before nonempty strings, and NULL
values
sort before all other enumeration values.
To prevent unexpected results when using the ORDER
BY
clause on an ENUM
column, use
one of these techniques:
Specify the ENUM
list in alphabetic
order.
Make sure that the column is sorted lexically rather than by index number by coding ORDER BY CAST(col AS CHAR) or ORDER BY CONCAT(col).
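The difference between the two orderings can be seen in a short Python model. The key functions are illustrative only; using -1 as the sort key for NULL is a modeling device, not a value MySQL assigns.

```python
members = ("b", "a")            # column declared as ENUM('b', 'a')
rows = ["a", None, "b", ""]     # None stands for NULL


def index_key(value):
    """Sort key that mimics ordering by enumeration index."""
    if value is None:
        return -1               # NULL sorts before everything else
    if value == "":
        return 0                # empty string sorts before real members
    return members.index(value) + 1


by_index = sorted(rows, key=index_key)
# Lexical order, as obtained by sorting on the string form of the column.
lexical = sorted(rows, key=lambda v: (v is not None, v or ""))

print(by_index)
print(lexical)
```

With the members declared as ('b', 'a'), only the lexical ordering places 'a' before 'b'.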
An enumeration value cannot be an expression, even one that evaluates to a string value.
For example, this CREATE TABLE
statement does not work because the
CONCAT
function cannot be used to construct
an enumeration value:
CREATE TABLE sizes (
    size ENUM('small', CONCAT('med','ium'), 'large')
);
You also cannot employ a user variable as an enumeration value. This pair of statements does not work:
SET @mysize = 'medium';
CREATE TABLE sizes (
    size ENUM('small', @mysize, 'large')
);
We strongly recommend that you do not use
numbers as enumeration values, because it does not save on
storage over the appropriate
TINYINT
or
SMALLINT
type, and it is easy
to mix up the strings and the underlying number values (which
might not be the same) if you quote the
ENUM
values incorrectly. If you do use a
number as an enumeration value, always enclose it in quotation
marks. If the quotation marks are omitted, the number is
regarded as an index. See Handling of Enumeration Literals to
see how even a quoted number could be mistakenly used as a
numeric index value.
Duplicate values in the definition cause a warning, or an error if strict SQL mode is enabled.
A SET
is a string object that can have zero
or more values, each of which must be chosen from a list of
permitted values specified when the table is created.
SET
column values that consist of multiple
set members are specified with members separated by commas
(,
). A consequence of this is that
SET
member values should not themselves
contain commas.
For example, a column specified as SET('one', 'two')
NOT NULL
can have any of these values:
''
'one'
'two'
'one,two'
A SET
column can have a maximum
of 64 distinct members.
Duplicate values in the definition cause a warning, or an error if strict SQL mode is enabled.
Trailing spaces are automatically deleted from
SET
member values in the table definition
when a table is created.
When retrieved, values stored in a SET
column
are displayed using the lettercase that was used in the column
definition. Note that SET
columns can be
assigned a character set and collation. For binary or
case-sensitive collations, lettercase is taken into account when
assigning values to the column.
MySQL stores SET
values numerically, with the
low-order bit of the stored value corresponding to the first set
member. If you retrieve a SET
value in a
numeric context, the value retrieved has bits set corresponding
to the set members that make up the column value. For example,
you can retrieve numeric values from a SET
column like this:
mysql> SELECT set_col+0 FROM tbl_name;
If a number is stored into a SET
column, the
bits that are set in the binary representation of the number
determine the set members in the column value. For a column
specified as SET('a','b','c','d')
, the
members have the following decimal and binary values.
SET Member | Decimal Value | Binary Value |
---|---|---|
'a' | 1 | 0001 |
'b' | 2 | 0010 |
'c' | 4 | 0100 |
'd' | 8 | 1000 |
If you assign a value of 9
to this column,
that is 1001
in binary, so the first and
fourth SET
value members
'a'
and 'd'
are selected
and the resulting value is 'a,d'
.
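The bit-to-member mapping can be checked with a few lines of Python. `set_from_number` is a hypothetical helper that models the rule described above; it is not how the server stores or decodes values.

```python
def set_from_number(members, number: int) -> str:
    """Return the SET value selected by the bits of number."""
    # Bit 0 (the low-order bit) corresponds to the first declared member.
    chosen = [m for bit, m in enumerate(members) if number & (1 << bit)]
    return ",".join(chosen)


col = ("a", "b", "c", "d")       # SET('a','b','c','d')
print(set_from_number(col, 9))   # 9 is 1001 in binary
```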
For a value containing more than one SET
element, it does not matter what order the elements are listed
in when you insert the value. It also does not matter how many
times a given element is listed in the value. When the value is
retrieved later, each element in the value appears once, with
elements listed according to the order in which they were
specified at table creation time. For example, suppose that a
column is specified as SET('a','b','c','d')
:
mysql> CREATE TABLE myset (col SET('a', 'b', 'c', 'd'));
If you insert the values 'a,d', 'd,a', 'a,d,d', 'a,d,a', and 'd,a,d':
mysql> INSERT INTO myset (col) VALUES
    -> ('a,d'), ('d,a'), ('a,d,a'), ('a,d,d'), ('d,a,d');
Query OK, 5 rows affected (0.01 sec)
Records: 5  Duplicates: 0  Warnings: 0
Then all these values appear as 'a,d' when retrieved:
mysql> SELECT col FROM myset;
+------+
| col  |
+------+
| a,d  |
| a,d  |
| a,d  |
| a,d  |
| a,d  |
+------+
5 rows in set (0.04 sec)
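The normalization shown by this query can be modeled in Python as follows. `normalize_set` is a hypothetical helper; silently dropping elements that are not declared members is an assumption that corresponds to non-strict SQL mode.

```python
def normalize_set(members, text: str) -> str:
    """Return text with duplicates removed and elements in declaration order."""
    wanted = set(text.split(",")) if text else set()
    # Walk the declared members so the output order never depends on the input.
    return ",".join(m for m in members if m in wanted)


col = ("a", "b", "c", "d")       # SET('a','b','c','d')
for value in ("a,d", "d,a", "a,d,a", "a,d,d", "d,a,d"):
    print(normalize_set(col, value))
```

Every input line prints the same normalized value, matching the five identical rows in the result.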
If you set a SET column to an unsupported value, the value is ignored and a warning is issued:
mysql> INSERT INTO myset (col) VALUES ('a,d,d,s');
Query OK, 1 row affected, 1 warning (0.03 sec)
mysql> SHOW WARNINGS;
+---------+------+------------------------------------------+
| Level   | Code | Message                                  |
+---------+------+------------------------------------------+
| Warning | 1265 | Data truncated for column 'col' at row 1 |
+---------+------+------------------------------------------+
1 row in set (0.04 sec)
mysql> SELECT col FROM myset;
+------+
| col  |
+------+
| a,d  |
| a,d  |
| a,d  |
| a,d  |
| a,d  |
| a,d  |
+------+
6 rows in set (0.01 sec)
If strict SQL mode is enabled, attempts to insert invalid SET values result in an error.
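SET values are sorted numerically. NULL values sort before non-NULL SET values.
Functions such as SUM() or AVG() that expect a numeric argument cast the argument to a number if necessary. For SET values, the cast operation causes the numeric value to be used.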
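Normally, you search for SET values using the FIND_IN_SET() function or the LIKE operator:
mysql> SELECT * FROM tbl_name WHERE FIND_IN_SET('value',set_col)>0;
mysql> SELECT * FROM tbl_name WHERE set_col LIKE '%value%';
The first statement finds rows where set_col contains the value set member. The second is similar, but not the same: It finds rows where set_col contains value anywhere, even as a substring of another set member.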
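The distinction between member matching and substring matching can be modeled in Python. Both functions are hypothetical stand-ins: `find_in_set` follows the documented idea of returning a 1-based position or 0, and `like_contains` models LIKE '%value%' for a value without wildcard characters, ignoring collation and case rules.

```python
def find_in_set(needle: str, set_value: str) -> int:
    """Return the 1-based position of needle among the members, or 0."""
    if set_value == "":
        return 0
    parts = set_value.split(",")
    return parts.index(needle) + 1 if needle in parts else 0


def like_contains(needle: str, set_value: str) -> bool:
    """Return True if needle occurs anywhere in the text of set_value."""
    return needle in set_value


value = "one,two"
print(find_in_set("one", value), like_contains("one", value))  # both match
print(find_in_set("on", value), like_contains("on", value))    # only the substring test matches
```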
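The following statements also are permitted:
mysql> SELECT * FROM tbl_name WHERE set_col & 1;
mysql> SELECT * FROM tbl_name WHERE set_col = 'val1,val2';
The first of these statements looks for values containing the first set member. The second looks for an exact match. Be careful with comparisons of the second type. Comparing set values to 'val1,val2' returns different results than comparing values to 'val2,val1'. You should specify the values in the same order they are listed in the column definition.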
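To determine all possible values for a SET column, use SHOW COLUMNS FROM tbl_name LIKE set_col and parse the SET definition in the Type column of the output.
In the C API, SET values are returned as strings. For information about using result set metadata to distinguish them from other strings, see Section 28.7.5, “C API Data Structures”.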
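The Open Geospatial Consortium (OGC) is an international consortium of more than 250 companies, agencies, and universities participating in the development of publicly available conceptual solutions that can be useful with all kinds of applications that manage spatial data.
The Open Geospatial Consortium publishes the OpenGIS® Implementation Standard for Geographic information - Simple feature access - Part 2: SQL option, a document that proposes several conceptual ways for extending an SQL RDBMS to support spatial data. This specification is available from the OGC website at http://www.opengeospatial.org/standards/sfs.
Following the OGC specification, MySQL implements spatial extensions as a subset of the SQL with Geometry Types environment. This term refers to an SQL environment that has been extended with a set of geometry types. A geometry-valued SQL column is implemented as a column that has a geometry type. The specification describes a set of SQL geometry types, as well as functions on those types to create and analyze geometry values.
MySQL spatial extensions enable the generation, storage, and analysis of geographic features:
Data types for representing spatial values
Functions for manipulating spatial values
Spatial indexing for improved access times to spatial columns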
The spatial data types and functions are available for
MyISAM
,
InnoDB
,
NDB
, and
ARCHIVE
tables. For indexing spatial
columns, MyISAM
and InnoDB
support both SPATIAL
and
non-SPATIAL
indexes. The other storage engines
support non-SPATIAL
indexes, as described in
Section 13.1.15, “CREATE INDEX Syntax”.
A geographic feature is anything in the world that has a location. A feature can be:
An entity. For example, a mountain, a pond, a city.
A space. For example, town district, the tropics.
A definable location. For example, a crossroad, as a particular place where two streets intersect.
Some documents use the term geospatial feature to refer to geographic features.
Geometry is another word that denotes a geographic feature. Originally the word geometry meant measurement of the earth. Another meaning comes from cartography, referring to the geometric features that cartographers use to map the world.
The discussion here considers these terms synonymous: geographic feature, geospatial feature, feature, or geometry. The term most commonly used is geometry, defined as a point or an aggregate of points representing anything in the world that has a location.
The following material covers these topics:
The spatial data types implemented in MySQL
The basis of the spatial extensions in the OpenGIS geometry model
Data formats for representing spatial data
How to use spatial data in MySQL
Use of indexing for spatial data
MySQL differences from the OpenGIS specification
For information about functions that operate on spatial data, see Section 12.16, “Spatial Analysis Functions”.
These standards are important for the MySQL implementation of spatial operations:
SQL/MM Part 3: Spatial.
The Open Geospatial Consortium publishes the OpenGIS® Implementation Standard for Geographic information, a document that proposes several conceptual ways for extending an SQL RDBMS to support spatial data. See in particular Simple Feature Access - Part 1: Common Architecture, and Simple Feature Access - Part 2: SQL Option. The Open Geospatial Consortium (OGC) maintains a website at http://www.opengeospatial.org/. The specification is available there at http://www.opengeospatial.org/standards/sfs. It contains additional information relevant to the material here.
The grammar for spatial reference system (SRS) definitions is based on the grammar defined in OpenGIS Implementation Specification: Coordinate Transformation Services, Revision 1.00, OGC 01-009, January 12, 2001, Section 7.2. This specification is available at http://www.opengeospatial.org/standards/ct. For differences from that specification in SRS definitions as implemented in MySQL, see Section 13.1.19, “CREATE SPATIAL REFERENCE SYSTEM Syntax”.
If you have questions or concerns about the use of the spatial extensions to MySQL, you can discuss them in the GIS forum: https://forums.mysql.com/list.php?23.
MySQL has spatial data types that correspond to OpenGIS classes. The basis for these types is described in Section 11.5.2, “The OpenGIS Geometry Model”.
Some spatial data types hold single geometry values:
GEOMETRY
POINT
LINESTRING
POLYGON
GEOMETRY
can store geometry values of any
type. The other single-value types (POINT
,
LINESTRING
, and POLYGON
)
restrict their values to a particular geometry type.
The other spatial data types hold collections of values:
MULTIPOINT
MULTILINESTRING
MULTIPOLYGON
GEOMETRYCOLLECTION
GEOMETRYCOLLECTION
can store a collection of
objects of any type. The other collection types
(MULTIPOINT
,
MULTILINESTRING
, and
MULTIPOLYGON
) restrict collection members to
those having a particular geometry type.
Example: To create a table named geom
that
has a column named g
that can store values of
any geometry type, use this statement:
CREATE TABLE geom (g GEOMETRY);
Columns with a spatial data type can have an
SRID
attribute, to explicitly indicate the
spatial reference system (SRS) for values stored in the column.
For example:
CREATE TABLE geom (
    p POINT SRID 0,
    g GEOMETRY NOT NULL SRID 4326
);
SPATIAL
indexes can be created on spatial
columns if they are NOT NULL
and have a
specific SRID, so if you plan to index the column, declare it
with the NOT NULL
and SRID
attributes:
CREATE TABLE geom (g GEOMETRY NOT NULL SRID 4326);
InnoDB
tables permit SRID
values for Cartesian and geographic SRSs.
MyISAM
tables permit SRID
values for Cartesian SRSs.
The SRID
attribute makes a spatial column
SRID-restricted, which has these implications:
The column can contain only values with the given SRID. Attempts to insert values with a different SRID produce an error.
The optimizer can use SPATIAL
indexes on
the column. See
Section 8.3.3, “SPATIAL Index Optimization”.
Spatial columns with no SRID
attribute are
not SRID-restricted and accept values with any SRID. However,
the optimizer cannot use SPATIAL
indexes on
them until the column definition is modified to include an
SRID
attribute, which may require that the
column contents first be modified so that all values have the
same SRID.
For other examples showing how to use spatial data types in MySQL, see Section 11.5.6, “Creating Spatial Columns”. For information about spatial reference systems, see Section 11.5.5, “Spatial Reference System Support”.
The set of geometry types proposed by OGC's SQL with Geometry Types environment is based on the OpenGIS Geometry Model. In this model, each geometric object has the following general properties:
It is associated with a spatial reference system, which describes the coordinate space in which the object is defined.
It belongs to some geometry class.
The geometry classes define a hierarchy as follows:
Geometry (noninstantiable)
    Point (instantiable)
    Curve (noninstantiable)
        LineString (instantiable)
            Line
            LinearRing
    Surface (noninstantiable)
        Polygon (instantiable)
    GeometryCollection (instantiable)
        MultiPoint (instantiable)
        MultiCurve (noninstantiable)
            MultiLineString (instantiable)
        MultiSurface (noninstantiable)
            MultiPolygon (instantiable)
It is not possible to create objects in noninstantiable classes. It is possible to create objects in instantiable classes. All classes have properties, and instantiable classes may also have assertions (rules that define valid class instances).
Geometry
is the base class. It is an
abstract class. The instantiable subclasses of
Geometry
are restricted to zero-, one-, and
two-dimensional geometric objects that exist in
two-dimensional coordinate space. All instantiable geometry
classes are defined so that valid instances of a geometry
class are topologically closed (that is, all defined
geometries include their boundary).
The base Geometry
class has subclasses for
Point
, Curve
,
Surface
, and
GeometryCollection
:
Point
represents zero-dimensional
objects.
Curve
represents one-dimensional
objects, and has subclass LineString
,
with sub-subclasses Line
and
LinearRing
.
Surface
is designed for two-dimensional
objects and has subclass Polygon
.
GeometryCollection
has specialized
zero-, one-, and two-dimensional collection classes named
MultiPoint
,
MultiLineString
, and
MultiPolygon
for modeling geometries
corresponding to collections of Points
,
LineStrings
, and
Polygons
, respectively.
MultiCurve
and
MultiSurface
are introduced as abstract
superclasses that generalize the collection interfaces to
handle Curves
and
Surfaces
.
Geometry
, Curve
,
Surface
, MultiCurve
, and
MultiSurface
are defined as noninstantiable
classes. They define a common set of methods for their
subclasses and are included for extensibility.
Point
, LineString
,
Polygon
,
GeometryCollection
,
MultiPoint
,
MultiLineString
, and
MultiPolygon
are instantiable classes.
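The hierarchy described above can be written down as a simple lookup table, for example when validating type names in application code. The Python sketch below is one possible encoding; the names `PARENT`, `NONINSTANTIABLE`, and `ancestors` are illustrative and do not come from MySQL or the OpenGIS specification.

```python
# Each class mapped to its direct superclass; Geometry is the root.
PARENT = {
    "Point": "Geometry",
    "Curve": "Geometry",
    "LineString": "Curve",
    "Line": "LineString",
    "LinearRing": "LineString",
    "Surface": "Geometry",
    "Polygon": "Surface",
    "GeometryCollection": "Geometry",
    "MultiPoint": "GeometryCollection",
    "MultiCurve": "GeometryCollection",
    "MultiLineString": "MultiCurve",
    "MultiSurface": "GeometryCollection",
    "MultiPolygon": "MultiSurface",
}

# Classes the text names as noninstantiable.
NONINSTANTIABLE = {"Geometry", "Curve", "Surface", "MultiCurve", "MultiSurface"}


def ancestors(name: str) -> list[str]:
    """Return the superclasses of name, nearest first, ending at Geometry."""
    chain = []
    while name in PARENT:
        name = PARENT[name]
        chain.append(name)
    return chain


print(ancestors("MultiPolygon"))
print("Curve" in NONINSTANTIABLE, "Polygon" in NONINSTANTIABLE)
```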
Geometry
is the root class of the
hierarchy. It is a noninstantiable class but has a number of
properties, described in the following list, that are common
to all geometry values created from any of the
Geometry
subclasses. Particular subclasses
have their own specific properties, described later.
Geometry Properties
A geometry value has the following properties:
Its type. Each geometry belongs to one of the instantiable classes in the hierarchy.
Its SRID, or spatial reference identifier. This value identifies the geometry's associated spatial reference system that describes the coordinate space in which the geometry object is defined.
In MySQL, the SRID value is an integer associated with the geometry value. The maximum usable SRID value is 2^32−1. If a larger value is given, only the lower 32 bits are used.
SRID 0 represents an infinite flat Cartesian plane with no units assigned to its axes. To ensure SRID 0 behavior, create geometry values using SRID 0. SRID 0 is the default for new geometry values if no SRID is specified.
For computations on multiple geometry values, all values must have the same SRID or an error occurs.
Its coordinates in its spatial reference system, represented as double-precision (8-byte) numbers. All nonempty geometries include at least one pair of (X,Y) coordinates. Empty geometries contain no coordinates.
Coordinates are related to the SRID. For example, in different coordinate systems, the distance between two objects may differ even when objects have the same coordinates, because the distance on the planar coordinate system and the distance on the geodetic system (coordinates on the Earth's surface) are different things.
Its interior, boundary, and exterior.
Every geometry occupies some position in space. The exterior of a geometry is all space not occupied by the geometry. The interior is the space occupied by the geometry. The boundary is the interface between the geometry's interior and exterior.
Its MBR (minimum bounding rectangle), or envelope. This is the bounding geometry, formed by the minimum and maximum (X,Y) coordinates:
((MINX MINY, MAXX MINY, MAXX MAXY, MINX MAXY, MINX MINY))
Whether the value is
simple or
nonsimple. Geometry
values of types (LineString
,
MultiPoint
,
MultiLineString
) are either simple or
nonsimple. Each type determines its own assertions for
being simple or nonsimple.
Whether the value is closed or not closed. Geometry values of types (LineString, MultiLineString) are either closed or not closed. Each type determines its own assertions for being closed or not closed.
Whether the value is empty or nonempty. A geometry is empty if it does not have any points. Exterior, interior, and boundary of an empty geometry are not defined (that is, they are represented by a NULL value). An empty geometry is defined to be always simple and has an area of 0.
Its dimension. A geometry can have a dimension of −1, 0, 1, or 2:
−1 for an empty geometry.
0 for a geometry with no length and no area.
1 for a geometry with nonzero length and zero area.
2 for a geometry with nonzero area.
Point
objects have a dimension of zero.
LineString
objects have a dimension of
1. Polygon
objects have a dimension of
2. The dimensions of MultiPoint
,
MultiLineString
, and
MultiPolygon
objects are the same as
the dimensions of the elements they consist of.
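Three of the properties in the preceding list are simple enough to model directly: the 32-bit limit on SRID values, the MBR built from minimum and maximum coordinates, and the dimension rules. The Python sketch below is illustrative only; the helper names are hypothetical and the code does not reflect how MySQL computes these values internally.

```python
def effective_srid(srid: int) -> int:
    """Keep only the lower 32 bits of an SRID value."""
    return srid & 0xFFFFFFFF


def mbr(points):
    """Return the closed ring of MBR corners for a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
    # Same corner order as the text: MINX MINY, MAXX MINY, MAXX MAXY, MINX MAXY, MINX MINY.
    return [(min_x, min_y), (max_x, min_y), (max_x, max_y),
            (min_x, max_y), (min_x, min_y)]


DIMENSION = {"Point": 0, "LineString": 1, "Polygon": 2,
             "MultiPoint": 0, "MultiLineString": 1, "MultiPolygon": 2}


def dimension(geometry_type: str, is_empty: bool = False) -> int:
    """Return -1 for an empty geometry, otherwise the dimension of its type."""
    return -1 if is_empty else DIMENSION[geometry_type]


print(effective_srid(2**32 + 5))
print(mbr([(1, 4), (3, 2), (2, 6)]))
print(dimension("MultiLineString"), dimension("Polygon", is_empty=True))
```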
A Point
is a geometry that represents a
single location in coordinate space.
Point
Examples
Imagine a large-scale map of the world with many cities. A
Point
object could represent each city.
On a city map, a Point
object could
represent a bus stop.
Point
Properties
X-coordinate value.
Y-coordinate value.
Point
is defined as a zero-dimensional
geometry.
The boundary of a Point
is the empty
set.
A Curve
is a one-dimensional geometry,
usually represented by a sequence of points. Particular
subclasses of Curve
define the type of
interpolation between points. Curve
is a
noninstantiable class.
Curve
Properties
A Curve
has the coordinates of its
points.
A Curve
is defined as a one-dimensional
geometry.
A Curve
is simple if it does not pass
through the same point twice, with the exception that a
curve can still be simple if the start and end points are
the same.
A Curve
is closed if its start point is
equal to its endpoint.
The boundary of a closed Curve
is
empty.
The boundary of a nonclosed Curve
consists of its two endpoints.
A Curve
that is simple and closed is a
LinearRing
.
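The closed and boundary properties listed above translate directly into code when a curve is given as a sequence of points. The Python sketch below covers only those two properties; testing whether a curve is simple would also need segment-intersection checks, which are deliberately left out. The function names are illustrative.

```python
def is_closed(points) -> bool:
    """A curve is closed if its start point equals its endpoint."""
    return len(points) > 0 and points[0] == points[-1]


def boundary(points):
    """Return the boundary points: none if closed, else the two endpoints."""
    if not points or is_closed(points):
        return []
    return [points[0], points[-1]]


ring = [(0, 0), (1, 0), (1, 1), (0, 0)]
path = [(0, 0), (1, 0), (1, 1)]
print(is_closed(ring), boundary(ring))
print(is_closed(path), boundary(path))
```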
A LineString
is a Curve
with linear interpolation between points.
LineString
Examples
On a world map, LineString
objects
could represent rivers.
In a city map, LineString
objects could
represent streets.
LineString
Properties
A LineString
has coordinates of
segments, defined by each consecutive pair of points.
A LineString
is a
Line
if it consists of exactly two
points.
A LineString
is a
LinearRing
if it is both closed and
simple.
A Surface
is a two-dimensional geometry. It
is a noninstantiable class. Its only instantiable subclass is
Polygon
.
Surface
Properties
A Surface
is defined as a
two-dimensional geometry.
The OpenGIS specification defines a simple
Surface
as a geometry that consists of
a single “patch” that is associated with a
single exterior boundary and zero or more interior
boundaries.
The boundary of a simple Surface
is the
set of closed curves corresponding to its exterior and
interior boundaries.
A Polygon
is a planar
Surface
representing a multisided geometry.
It is defined by a single exterior boundary and zero or more
interior boundaries, where each interior boundary defines a
hole in the Polygon
.
Polygon
Examples
On a region map, Polygon
objects could
represent forests, districts, and so on.
Polygon
Assertions
The boundary of a Polygon
consists of a
set of LinearRing
objects (that is,
LineString
objects that are both simple
and closed) that make up its exterior and interior
boundaries.
A Polygon
has no rings that cross. The
rings in the boundary of a Polygon
may
intersect at a Point
, but only as a
tangent.
A Polygon
has no lines, spikes, or
punctures.
A Polygon
has an interior that is a
connected point set.
A Polygon
may have holes. The exterior
of a Polygon
with holes is not
connected. Each hole defines a connected component of the
exterior.
The preceding assertions make a Polygon
a
simple geometry.
A GeomCollection
is a geometry that is a
collection of zero or more geometries of any class.
GeomCollection
and
GeometryCollection
are synonymous, with
GeomCollection
the preferred type name.
All the elements in a geometry collection must be in the same
spatial reference system (that is, in the same coordinate
system). There are no other constraints on the elements of a
geometry collection, although the subclasses of
GeomCollection
described in the following
sections may restrict membership. Restrictions may be based
on:
Element type (for example, a MultiPoint
may contain only Point
elements)
Dimension
Constraints on the degree of spatial overlap between elements
A MultiPoint is a geometry collection composed of Point elements. The points are not connected or ordered in any way.
MultiPoint Examples
On a world map, a MultiPoint could represent a chain of small islands.
On a city map, a MultiPoint could represent the outlets for a ticket office.
MultiPoint Properties
A MultiPoint is a zero-dimensional geometry.
A MultiPoint is simple if no two of its Point values are equal (have identical coordinate values).
The boundary of a MultiPoint is the empty set.
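The simplicity rule for a MultiPoint can be illustrated outside the database with a few lines of Python. This is only a sketch: the function name `is_simple_multipoint` and the representation of points as `(x, y)` tuples are assumptions made for the example, not part of MySQL.

```python
def is_simple_multipoint(points):
    """Return True if no two points have identical coordinate values."""
    # A set drops exact duplicates, so equal sizes mean every point is distinct.
    return len(set(points)) == len(points)


islands = [(0.0, 0.0), (20.0, 20.0), (60.0, 60.0)]
print(is_simple_multipoint(islands))                 # True
print(is_simple_multipoint(islands + [(0.0, 0.0)]))  # False
```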
A MultiCurve is a geometry collection composed of Curve elements. MultiCurve is a noninstantiable class.
MultiCurve Properties
A MultiCurve is a one-dimensional geometry.
A MultiCurve is simple if and only if all of its elements are simple; the only intersections between any two elements occur at points that are on the boundaries of both elements.
A MultiCurve boundary is obtained by applying the “mod 2 union rule” (also known as the “odd-even rule”): A point is in the boundary of a MultiCurve if it is in the boundaries of an odd number of Curve elements.
A MultiCurve is closed if all of its elements are closed.
The boundary of a closed MultiCurve is always empty.
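The mod 2 union rule can be sketched in Python as a counting exercise. The sketch assumes that each curve is a list of `(x, y)` tuples, that the boundary of an open curve is its two endpoints, and that a closed curve has an empty boundary; the function name `multicurve_boundary` is hypothetical.

```python
from collections import Counter


def multicurve_boundary(curves):
    """Apply the mod 2 union rule to a list of curves.

    Each curve is a list of (x, y) tuples. An open curve contributes its two
    endpoints; a closed curve (first point equal to last) contributes nothing.
    """
    counts = Counter()
    for curve in curves:
        start, end = curve[0], curve[-1]
        if start != end:
            counts[start] += 1
            counts[end] += 1
    # Keep only the points that lie in an odd number of curve boundaries.
    return {point for point, n in counts.items() if n % 2 == 1}


a = [(0, 0), (1, 1)]
b = [(1, 1), (2, 0)]
print(sorted(multicurve_boundary([a, b])))  # [(0, 0), (2, 0)]
```

The shared endpoint `(1, 1)` belongs to two curve boundaries, an even count, so it drops out of the result.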
A MultiLineString is a MultiCurve geometry collection composed of LineString elements.
MultiLineString Examples
On a region map, a MultiLineString could represent a river system or a highway system.
A MultiSurface is a geometry collection composed of surface elements. MultiSurface is a noninstantiable class. Its only instantiable subclass is MultiPolygon.
MultiSurface Assertions
Surfaces within a MultiSurface have no interiors that intersect.
Surfaces within a MultiSurface have boundaries that intersect at most at a finite number of points.
A MultiPolygon is a MultiSurface object composed of Polygon elements.
MultiPolygon Examples
On a region map, a MultiPolygon could represent a system of lakes.
MultiPolygon Assertions
A MultiPolygon has no two Polygon elements with interiors that intersect.
A MultiPolygon has no two Polygon elements that cross (crossing is also forbidden by the previous assertion), or that touch at an infinite number of points.
A MultiPolygon may not have cut lines, spikes, or punctures. A MultiPolygon is a regular, closed point set.
A MultiPolygon that has more than one Polygon has an interior that is not connected. The number of connected components of the interior of a MultiPolygon is equal to the number of Polygon values in the MultiPolygon.
MultiPolygon Properties
A MultiPolygon is a two-dimensional geometry.
A MultiPolygon boundary is a set of closed curves (LineString values) corresponding to the boundaries of its Polygon elements.
Each Curve in the boundary of the MultiPolygon is in the boundary of exactly one Polygon element.
Every Curve in the boundary of a Polygon element is in the boundary of the MultiPolygon.
Two standard spatial data formats are used to represent geometry objects in queries:
Well-Known Text (WKT) format
Well-Known Binary (WKB) format
Internally, MySQL stores geometry values in a format that is not identical to either WKT or WKB format. (Internal format is like WKB but with an initial 4 bytes to indicate the SRID.)
There are functions available to convert between different data formats; see Section 12.16.6, “Geometry Format Conversion Functions”.
The following sections describe the spatial data formats MySQL uses:
The Well-Known Text (WKT) representation of geometry values is designed for exchanging geometry data in ASCII form. The OpenGIS specification provides a Backus-Naur grammar that specifies the formal production rules for writing WKT values (see Section 11.5, “Spatial Data Types”).
Examples of WKT representations of geometry objects:
A Point:
POINT(15 20)
The point coordinates are specified with no separating comma. This differs from the syntax for the SQL Point() function, which requires a comma between the coordinates. Take care to use the syntax appropriate to the context of a given spatial operation. For example, the following statements both use ST_X() to extract the X-coordinate from a Point object. The first produces the object directly using the Point() function. The second uses a WKT representation converted to a Point with ST_GeomFromText().
mysql> SELECT ST_X(Point(15, 20));
+---------------------+
| ST_X(POINT(15, 20)) |
+---------------------+
|                  15 |
+---------------------+

mysql> SELECT ST_X(ST_GeomFromText('POINT(15 20)'));
+---------------------------------------+
| ST_X(ST_GeomFromText('POINT(15 20)')) |
+---------------------------------------+
|                                    15 |
+---------------------------------------+
A LineString with four points:
LINESTRING(0 0, 10 10, 20 25, 50 60)
The point coordinate pairs are separated by commas.
A Polygon with one exterior ring and one interior ring:
POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7, 5 5))
A MultiPoint with three Point values:
MULTIPOINT(0 0, 20 20, 60 60)
Spatial functions such as ST_MPointFromText() and ST_GeomFromText() that accept WKT-format representations of MultiPoint values permit individual points within values to be surrounded by parentheses. For example, both of the following function calls are valid:
ST_MPointFromText('MULTIPOINT (1 1, 2 2, 3 3)')
ST_MPointFromText('MULTIPOINT ((1 1), (2 2), (3 3))')
A MultiLineString with two LineString values:
MULTILINESTRING((10 10, 20 20), (15 15, 30 15))
A MultiPolygon with two Polygon values:
MULTIPOLYGON(((0 0,10 0,10 10,0 10,0 0)),((5 5,7 5,7 7,5 7, 5 5)))
A GeometryCollection consisting of two Point values and one LineString:
GEOMETRYCOLLECTION(POINT(10 10), POINT(30 30), LINESTRING(15 15, 20 20))
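An application that assembles WKT strings from coordinate data has to follow the same separator rules: a space between the coordinates of a point, a comma between points, and one pair of parentheses per ring. The following Python sketch shows one way to do that. The helper names (`point_wkt`, `linestring_wkt`, `polygon_wkt`) are hypothetical and do not correspond to any MySQL or connector API.

```python
def _coords(points):
    # Coordinates within a point are space-separated; points are comma-separated.
    return ", ".join(f"{x} {y}" for x, y in points)


def point_wkt(x, y):
    return f"POINT({x} {y})"


def linestring_wkt(points):
    return f"LINESTRING({_coords(points)})"


def polygon_wkt(rings):
    # Each ring gets its own parentheses; the exterior ring comes first.
    return "POLYGON(" + ",".join(f"({_coords(ring)})" for ring in rings) + ")"


print(point_wkt(15, 20))
# POINT(15 20)
print(linestring_wkt([(0, 0), (10, 10), (20, 25), (50, 60)]))
# LINESTRING(0 0, 10 10, 20 25, 50 60)
```

The resulting strings are plain text and would still need to be passed through a function such as ST_GeomFromText() to become geometry values.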
The Well-Known Binary (WKB) representation of geometric values is used for exchanging geometry data as binary streams represented by BLOB values containing geometric WKB information. This format is defined by the OpenGIS specification (see Section 11.5, “Spatial Data Types”). It is also defined in the ISO SQL/MM Part 3: Spatial standard.
WKB uses 1-byte unsigned integers, 4-byte unsigned integers, and 8-byte double-precision numbers (IEEE 754 format). A byte is eight bits.
For example, a WKB value that corresponds to POINT(1 -1) consists of this sequence of 21 bytes, each represented by two hexadecimal digits:
0101000000000000000000F03F000000000000F0BF
The sequence consists of the components shown in the following table.
Table 11.2 WKB Components Example
Component | Size | Value |
---|---|---|
Byte order | 1 byte | 01 |
WKB type | 4 bytes | 01000000 |
X coordinate | 8 bytes | 000000000000F03F |
Y coordinate | 8 bytes | 000000000000F0BF |
Component representation is as follows:
The byte order indicator is either 1 or 0 to signify little-endian or big-endian storage. The little-endian and big-endian byte orders are also known as Network Data Representation (NDR) and External Data Representation (XDR), respectively.
The WKB type is a code that indicates the geometry type. MySQL uses values from 1 through 7 to indicate Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection.
A Point value has X and Y coordinates, each represented as a double-precision value.
WKB values for more complex geometry values have more complex data structures, as detailed in the OpenGIS specification.
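The component layout in the table can be checked by decoding the example bytes with Python's standard `struct` module. This is a minimal sketch that handles only Point values; the function name `decode_wkb_point` is an assumption made for the example.

```python
import struct

WKB_POINT = 1  # WKB type code for Point


def decode_wkb_point(wkb):
    """Decode a WKB Point value into an (x, y) tuple."""
    byte_order = wkb[0]
    prefix = "<" if byte_order == 1 else ">"  # 1 = little-endian, 0 = big-endian
    (wkb_type,) = struct.unpack_from(prefix + "I", wkb, 1)
    if wkb_type != WKB_POINT:
        raise ValueError(f"not a Point: WKB type {wkb_type}")
    # Two 8-byte IEEE 754 doubles follow the 5-byte header.
    x, y = struct.unpack_from(prefix + "dd", wkb, 5)
    return x, y


wkb = bytes.fromhex("0101000000000000000000F03F000000000000F0BF")
print(len(wkb))               # 21
print(decode_wkb_point(wkb))  # (1.0, -1.0)
```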
MySQL stores geometry values using 4 bytes to indicate the SRID followed by the WKB representation of the value. For a description of WKB format, see Well-Known Binary (WKB) Format.
For the WKB part, these MySQL-specific considerations apply:
The byte-order indicator byte is 1 because MySQL stores geometries as little-endian values.
MySQL supports geometry types of Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection. Other geometry types are not supported.
Only GeometryCollection can be empty. Such a value is stored with 0 elements.
Polygon rings can be specified both clockwise and counterclockwise. MySQL flips the rings automatically when reading data.
Cartesian coordinates are stored in the length unit of the spatial reference system, with X values in the X coordinates and Y values in the Y coordinates. Axis directions are those specified by the spatial reference system.
Geographic coordinates are stored in the angle unit of the spatial reference system, with longitudes in the X coordinates and latitudes in the Y coordinates. Axis directions and the meridian are those specified by the spatial reference system.
The LENGTH() function returns the space in bytes required for value storage. Example:
mysql> SET @g = ST_GeomFromText('POINT(1 -1)');
mysql> SELECT LENGTH(@g);
+------------+
| LENGTH(@g) |
+------------+
|         25 |
+------------+

mysql> SELECT HEX(@g);
+----------------------------------------------------+
| HEX(@g)                                            |
+----------------------------------------------------+
| 000000000101000000000000000000F03F000000000000F0BF |
+----------------------------------------------------+
The value length is 25 bytes, made up of these components (as can be seen from the hexadecimal value):
4 bytes for integer SRID (0)
1 byte for integer byte order (1 = little-endian)
4 bytes for integer type information (1 = Point)
8 bytes for double-precision X coordinate (1)
8 bytes for double-precision Y coordinate (−1)
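The 25-byte layout can be reproduced in Python by packing the five components in order. The sketch assumes that the SRID prefix is stored little-endian like the WKB part; the SRID 0 example shown here cannot confirm that either way. The function name `internal_point` is hypothetical.

```python
import struct


def internal_point(x, y, srid=0):
    """Pack SRID, byte order, type code, X, and Y in the layout listed above."""
    # "<" = little-endian, no padding; I = 4-byte unsigned, B = 1 byte, d = 8-byte double.
    return struct.pack("<IBIdd", srid, 1, 1, x, y)


value = internal_point(1.0, -1.0)
print(len(value))           # 25
print(value.hex().upper())
# 000000000101000000000000000000F03F000000000000F0BF
```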
For geometry values, MySQL distinguishes between the concepts of syntactically well-formed and geometrically valid.
A geometry is syntactically well-formed if it satisfies conditions such as those in this (nonexhaustive) list:
Linestrings have at least two points
Polygons have at least one ring
Polygon rings are closed (first and last points the same)
Polygon rings have at least 4 points (minimum polygon is a triangle with first and last points the same)
Collections are not empty (except GeometryCollection)
A geometry is geometrically valid if it is syntactically well-formed and satisfies conditions such as those in this (nonexhaustive) list:
Polygons are not self-intersecting
Polygon interior rings are inside the exterior ring
Multipolygons do not have overlapping polygons
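The syntactic conditions are cheap to test, which is one reason they can be enforced on every import. A rough Python sketch of three of them follows; it assumes rings and linestrings are lists of `(x, y)` tuples, uses hypothetical function names, and deliberately does not attempt any of the geometric validity checks.

```python
def linestring_is_well_formed(points):
    """A linestring needs at least two points."""
    return len(points) >= 2


def ring_is_well_formed(ring):
    """A polygon ring must be closed and have at least 4 points."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def polygon_is_well_formed(rings):
    """A polygon needs at least one ring, and every ring must be well-formed."""
    return len(rings) >= 1 and all(ring_is_well_formed(r) for r in rings)


triangle = [(0, 0), (10, 0), (0, 10), (0, 0)]
print(polygon_is_well_formed([triangle]))       # True
print(polygon_is_well_formed([triangle[:-1]]))  # False: ring is not closed
```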
Spatial functions fail if a geometry is not syntactically well-formed. Spatial import functions that parse WKT or WKB values raise an error for attempts to create a geometry that is not syntactically well-formed. Syntactic well-formedness is also checked for attempts to store geometries into tables.
It is permitted to insert, select, and update geometrically invalid geometries, but they must be syntactically well-formed. Due to the computational expense, MySQL does not check explicitly for geometric validity. Spatial computations may detect some cases of invalid geometries and raise an error, but they may also return an undefined result without detecting the invalidity. Applications that require geometrically valid geometries should check them using the ST_IsValid() function.
A spatial reference system (SRS) for spatial data is a coordinate-based system for geographic locations.
There are different types of spatial reference systems:
A projected SRS is a projection of a globe onto a flat surface; that is, a flat map. For example, a light bulb inside a globe that shines on a paper cylinder surrounding the globe projects a map onto the paper. The result is georeferenced: Each point maps to a place on the globe. The coordinate system on that plane is Cartesian using a length unit (meters, feet, and so forth), rather than degrees of longitude and latitude.
The globes in this case are ellipsoids; that is, flattened spheres. Earth is a bit shorter in its North-South axis than its East-West axis, so a slightly flattened sphere is more correct, but perfect spheres permit faster calculations.
A geographic SRS is a nonprojected SRS representing longitude-latitude (or latitude-longitude) coordinates on an ellipsoid, in any angular unit.
The SRS denoted in MySQL by SRID 0 represents an infinite flat Cartesian plane with no units assigned to its axes. Unlike projected SRSs, it is not georeferenced and it does not necessarily represent Earth. It is an abstract plane that can be used for anything. SRID 0 is the default SRID for spatial data in MySQL.
MySQL maintains information about available spatial reference systems for spatial data in the data dictionary mysql.st_spatial_reference_systems table, which can store entries for projected and geographic SRSs. This data dictionary table is invisible, but SRS entry contents are available through the INFORMATION_SCHEMA ST_SPATIAL_REFERENCE_SYSTEMS table, implemented as a view on mysql.st_spatial_reference_systems (see Section 25.28, “The INFORMATION_SCHEMA ST_SPATIAL_REFERENCE_SYSTEMS Table”).
The following example shows what an SRS entry looks like:
mysql> SELECT *
       FROM INFORMATION_SCHEMA.ST_SPATIAL_REFERENCE_SYSTEMS
       WHERE SRS_ID = 4326\G
*************************** 1. row ***************************
                SRS_NAME: WGS 84
                  SRS_ID: 4326
            ORGANIZATION: EPSG
ORGANIZATION_COORDSYS_ID: 4326
              DEFINITION: GEOGCS["WGS 84",DATUM["World Geodetic System 1984",
                          SPHEROID["WGS 84",6378137,298.257223563,
                          AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],
                          PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],
                          UNIT["degree",0.017453292519943278,
                          AUTHORITY["EPSG","9122"]],
                          AXIS["Lat",NORTH],AXIS["Long",EAST],
                          AUTHORITY["EPSG","4326"]]
             DESCRIPTION:
This entry describes the SRS used for GPS systems. It has a name (SRS_NAME) of WGS 84 and an ID (SRS_ID) of 4326, which is the ID used by the European Petroleum Survey Group (EPSG).
SRS definitions in the DEFINITION column are WKT values, represented as specified in the Open Geospatial Consortium document OGC 12-063r5.
SRS_ID values represent the same kind of values passed as the SRID argument to spatial functions. SRID 0 (the unitless Cartesian plane) is special. It is always a legal spatial reference system ID and can be used in any computations on spatial data that depend on SRID values.
For computations on multiple geometry values, all values must have the same SRID or an error occurs.
SRS definition parsing occurs on demand when definitions are needed by GIS functions. Parsed definitions are cached in the data dictionary cache so that parsing overhead is not incurred for every statement that needs SRS information.
To enable manipulation of SRS entries stored in the data dictionary, MySQL provides these SQL statements:
CREATE SPATIAL REFERENCE SYSTEM: See Section 13.1.19, “CREATE SPATIAL REFERENCE SYSTEM Syntax”. The description for this statement includes additional information about SRS components.
DROP SPATIAL REFERENCE SYSTEM: See Section 13.1.31, “DROP SPATIAL REFERENCE SYSTEM Syntax”.
MySQL provides a standard way of creating spatial columns for geometry types, for example, with CREATE TABLE or ALTER TABLE. Spatial columns are supported for MyISAM, InnoDB, NDB, and ARCHIVE tables. See also the notes about spatial indexes under Section 11.5.10, “Creating Spatial Indexes”.
Columns with a spatial data type can have an SRID attribute, to explicitly indicate the spatial reference system (SRS) for values stored in the column. For implications of an SRID-restricted column, see Section 11.5.1, “Spatial Data Types”.
Use the CREATE TABLE statement to create a table with a spatial column:
CREATE TABLE geom (g GEOMETRY);
Use the ALTER TABLE statement to add or drop a spatial column to or from an existing table:
ALTER TABLE geom ADD pt POINT;
ALTER TABLE geom DROP pt;
After you have created spatial columns, you can populate them with spatial data.
Values should be stored in internal geometry format, but you can convert them to that format from either Well-Known Text (WKT) or Well-Known Binary (WKB) format. The following examples demonstrate how to insert geometry values into a table by converting WKT values to internal geometry format:
The following examples insert more complex geometries into the table:
SET @g = 'LINESTRING(0 0,1 1,2 2)';
INSERT INTO geom VALUES (ST_GeomFromText(@g));

SET @g = 'POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7,5 5))';
INSERT INTO geom VALUES (ST_GeomFromText(@g));

SET @g = 'GEOMETRYCOLLECTION(POINT(1 1),LINESTRING(0 0,1 1,2 2,3 3,4 4))';
INSERT INTO geom VALUES (ST_GeomFromText(@g));
The preceding examples use ST_GeomFromText() to create geometry values. You can also use type-specific functions:
SET @g = 'POINT(1 1)';
INSERT INTO geom VALUES (ST_PointFromText(@g));

SET @g = 'LINESTRING(0 0,1 1,2 2)';
INSERT INTO geom VALUES (ST_LineStringFromText(@g));

SET @g = 'POLYGON((0 0,10 0,10 10,0 10,0 0),(5 5,7 5,7 7,5 7,5 5))';
INSERT INTO geom VALUES (ST_PolygonFromText(@g));

SET @g = 'GEOMETRYCOLLECTION(POINT(1 1),LINESTRING(0 0,1 1,2 2,3 3,4 4))';
INSERT INTO geom VALUES (ST_GeomCollFromText(@g));
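A client application program that wants to use WKB representations of geometry values is responsible for sending correctly formed WKB in queries to the server. There are several ways to satisfy this requirement. For example:
Inserting a POINT(1 1) value with hex literal syntax:
INSERT INTO geom VALUES (ST_GeomFromWKB(X'0101000000000000000000F03F000000000000F03F'));
An ODBC application can send a WKB representation, binding it to a placeholder using an argument of BLOB type:
INSERT INTO geom VALUES (ST_GeomFromWKB(?))
Other programming interfaces may support a similar placeholder mechanism.
In a C program, you can escape a binary value using mysql_real_escape_string_quote() and include the result in a query string that is sent to the server. See Section 28.7.7.56, “mysql_real_escape_string_quote()”.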
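Geometry values stored in a table can be fetched in internal format. You can also convert them to WKT or WKB format.
Fetching spatial data in internal format:
Fetching geometry values using internal format can be useful in table-to-table transfers:
CREATE TABLE geom2 (g GEOMETRY) SELECT g FROM geom;
Fetching spatial data in WKT format:
The ST_AsText() function converts a geometry from internal format to a WKT string.
SELECT ST_AsText(g) FROM geom;
Fetching spatial data in WKB format:
The ST_AsBinary() function converts a geometry from internal format to a BLOB containing the WKB value.
SELECT ST_AsBinary(g) FROM geom;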
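For MyISAM and InnoDB tables, search operations in columns containing spatial data can be optimized using SPATIAL indexes. The most typical operations are:
Point queries that search for all objects that contain a given point
Region queries that search for all objects that overlap a given region
MySQL uses R-Trees with quadratic splitting for SPATIAL indexes on spatial columns. A SPATIAL index is built using the minimum bounding rectangle (MBR) of a geometry. For most geometries, the MBR is a minimum rectangle that surrounds the geometries. For a horizontal or a vertical linestring, the MBR is a rectangle degenerated into the linestring. For a point, the MBR is a rectangle degenerated into the point.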
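The MBR of a geometry is simply the extent of its coordinates along each axis, which the following Python sketch makes concrete. The function names `mbr` and `mbr_contains` and the `(min_x, min_y, max_x, max_y)` tuple layout are assumptions made for the example, and the containment test treats shared edges as contained, which is a simplification that may not match every detail of the MySQL MBR functions.

```python
def mbr(points):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def mbr_contains(outer, inner):
    """True if rectangle `outer` encloses rectangle `inner` (edges included)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


print(mbr([(0, 0), (10, 10), (20, 25), (50, 60)]))  # (0, 0, 50, 60)
print(mbr([(0, 5), (10, 5)]))  # (0, 5, 10, 5): horizontal line, zero height
print(mbr([(5, 5)]))           # (5, 5, 5, 5): a single point
```

Comparing four numbers per geometry is what makes an MBR-based index lookup much cheaper than an exact geometric test.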
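It is also possible to create normal indexes on spatial columns. In a non-SPATIAL index, you must declare a prefix for any spatial column except for POINT columns.
MyISAM and InnoDB support both SPATIAL and non-SPATIAL indexes. Other storage engines support non-SPATIAL indexes, as described in Section 13.1.15, “CREATE INDEX Syntax”.
For InnoDB and MyISAM tables, MySQL can create spatial indexes using syntax similar to that for creating regular indexes, but using the SPATIAL keyword. Columns in spatial indexes must be declared NOT NULL. The following examples demonstrate how to create spatial indexes:
CREATE TABLE geom (g GEOMETRY NOT NULL SRID 4326, SPATIAL INDEX(g));
CREATE TABLE geom (g GEOMETRY NOT NULL SRID 4326);
ALTER TABLE geom ADD SPATIAL INDEX(g);
CREATE TABLE geom (g GEOMETRY NOT NULL SRID 4326);
CREATE SPATIAL INDEX g ON geom (g);
SPATIAL INDEX creates an R-tree index. For storage engines that support nonspatial indexing of spatial columns, the engine creates a B-tree index. A B-tree index on spatial values is useful for exact-value lookups, but not for range scans.
The optimizer can use spatial indexes defined on columns that are SRID-restricted. For more information, see Section 11.5.1, “Spatial Data Types”, and Section 8.3.3, “SPATIAL Index Optimization”.
For more information on indexing spatial columns, see Section 13.1.15, “CREATE INDEX Syntax”.
To drop spatial indexes, use ALTER TABLE or DROP INDEX:
ALTER TABLE geom DROP INDEX g;
DROP INDEX g ON geom;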
Example: Suppose that a table geom contains more than 32,000 geometries, which are stored in the column g of type GEOMETRY. The table also has an AUTO_INCREMENT column fid for storing object ID values.
mysql> DESCRIBE geom;
+-------+----------+------+-----+---------+----------------+
| Field | Type     | Null | Key | Default | Extra          |
+-------+----------+------+-----+---------+----------------+
| fid   | int(11)  |      | PRI | NULL    | auto_increment |
| g     | geometry |      |     |         |                |
+-------+----------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

mysql> SELECT COUNT(*) FROM geom;
+----------+
| count(*) |
+----------+
|    32376 |
+----------+
1 row in set (0.00 sec)
To add a spatial index on the column g, use this statement:
mysql> ALTER TABLE geom ADD SPATIAL INDEX(g);
Query OK, 32376 rows affected (4.05 sec)
Records: 32376  Duplicates: 0  Warnings: 0
The optimizer investigates whether available spatial indexes can be involved in the search for queries that use a function such as MBRContains() or MBRWithin() in the WHERE clause. The following query finds all objects that are in the given rectangle:
mysql> SET @poly =
    -> 'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql> SELECT fid,ST_AsText(g) FROM geom WHERE
    -> MBRContains(ST_GeomFromText(@poly),g);
+-----+---------------------------------------------------------------+
| fid | ST_AsText(g)                                                  |
+-----+---------------------------------------------------------------+
|  21 | LINESTRING(30350.4 15828.8,30350.6 15845,30333.8 15845,30 ... |
|  22 | LINESTRING(30350.6 15871.4,30350.6 15887.8,30334 15887.8, ... |
|  23 | LINESTRING(30350.6 15914.2,30350.6 15930.4,30334 15930.4, ... |
|  24 | LINESTRING(30290.2 15823,30290.2 15839.4,30273.4 15839.4, ... |
|  25 | LINESTRING(30291.4 15866.2,30291.6 15882.4,30274.8 15882. ... |
|  26 | LINESTRING(30291.6 15918.2,30291.6 15934.4,30275 15934.4, ... |
| 249 | LINESTRING(30337.8 15938.6,30337.8 15946.8,30320.4 15946. ... |
|   1 | LINESTRING(30250.4 15129.2,30248.8 15138.4,30238.2 15136. ... |
|   2 | LINESTRING(30220.2 15122.8,30217.2 15137.8,30207.6 15136, ... |
|   3 | LINESTRING(30179 15114.4,30176.6 15129.4,30167 15128,3016 ... |
|   4 | LINESTRING(30155.2 15121.4,30140.4 15118.6,30142 15109,30 ... |
|   5 | LINESTRING(30192.4 15085,30177.6 15082.2,30179.2 15072.4, ... |
|   6 | LINESTRING(30244 15087,30229 15086.2,30229.4 15076.4,3024 ... |
|   7 | LINESTRING(30200.6 15059.4,30185.6 15058.6,30186 15048.8, ... |
|  10 | LINESTRING(30179.6 15017.8,30181 15002.8,30190.8 15003.6, ... |
|  11 | LINESTRING(30154.2 15000.4,30168.6 15004.8,30166 15014.2, ... |
|  13 | LINESTRING(30105 15065.8,30108.4 15050.8,30118 15053,3011 ... |
| 154 | LINESTRING(30276.2 15143.8,30261.4 15141,30263 15131.4,30 ... |
| 155 | LINESTRING(30269.8 15084,30269.4 15093.4,30258.6 15093,30 ... |
| 157 | LINESTRING(30128.2 15011,30113.2 15010.2,30113.6 15000.4, ... |
+-----+---------------------------------------------------------------+
20 rows in set (0.00 sec)
Use EXPLAIN to check the way this query is executed:
mysql> SET @poly =
    -> 'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql> EXPLAIN SELECT fid,ST_AsText(g) FROM geom WHERE
    -> MBRContains(ST_GeomFromText(@poly),g)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: geom
         type: range
possible_keys: g
          key: g
      key_len: 32
          ref: NULL
         rows: 50
        Extra: Using where
1 row in set (0.00 sec)
Check what would happen without a spatial index:
mysql> SET @poly =
    -> 'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql> EXPLAIN SELECT fid,ST_AsText(g) FROM geom IGNORE INDEX (g) WHERE
    -> MBRContains(ST_GeomFromText(@poly),g)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: geom
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 32376
        Extra: Using where
1 row in set (0.00 sec)
Executing the SELECT statement without the spatial index yields the same result but causes the execution time to rise from 0.00 seconds to 0.46 seconds:
mysql> SET @poly =
    -> 'Polygon((30000 15000, 31000 15000, 31000 16000, 30000 16000, 30000 15000))';
mysql> SELECT fid,ST_AsText(g) FROM geom IGNORE INDEX (g) WHERE
    -> MBRContains(ST_GeomFromText(@poly),g);
+-----+---------------------------------------------------------------+
| fid | ST_AsText(g)                                                  |
+-----+---------------------------------------------------------------+
|   1 | LINESTRING(30250.4 15129.2,30248.8 15138.4,30238.2 15136. ... |
|   2 | LINESTRING(30220.2 15122.8,30217.2 15137.8,30207.6 15136, ... |
|   3 | LINESTRING(30179 15114.4,30176.6 15129.4,30167 15128,3016 ... |
|   4 | LINESTRING(30155.2 15121.4,30140.4 15118.6,30142 15109,30 ... |
|   5 | LINESTRING(30192.4 15085,30177.6 15082.2,30179.2 15072.4, ... |
|   6 | LINESTRING(30244 15087,30229 15086.2,30229.4 15076.4,3024 ... |
|   7 | LINESTRING(30200.6 15059.4,30185.6 15058.6,30186 15048.8, ... |
|  10 | LINESTRING(30179.6 15017.8,30181 15002.8,30190.8 15003.6, ... |
|  11 | LINESTRING(30154.2 15000.4,30168.6 15004.8,30166 15014.2, ... |
|  13 | LINESTRING(30105 15065.8,30108.4 15050.8,30118 15053,3011 ... |
|  21 | LINESTRING(30350.4 15828.8,30350.6 15845,30333.8 15845,30 ... |
|  22 | LINESTRING(30350.6 15871.4,30350.6 15887.8,30334 15887.8, ... |
|  23 | LINESTRING(30350.6 15914.2,30350.6 15930.4,30334 15930.4, ... |
|  24 | LINESTRING(30290.2 15823,30290.2 15839.4,30273.4 15839.4, ... |
|  25 | LINESTRING(30291.4 15866.2,30291.6 15882.4,30274.8 15882. ... |
|  26 | LINESTRING(30291.6 15918.2,30291.6 15934.4,30275 15934.4, ... |
| 154 | LINESTRING(30276.2 15143.8,30261.4 15141,30263 15131.4,30 ... |
| 155 | LINESTRING(30269.8 15084,30269.4 15093.4,30258.6 15093,30 ... |
| 157 | LINESTRING(30128.2 15011,30113.2 15010.2,30113.6 15000.4, ... |
| 249 | LINESTRING(30337.8 15938.6,30337.8 15946.8,30320.4 15946. ... |
+-----+---------------------------------------------------------------+
20 rows in set (0.46 sec)
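MySQL supports a native JSON data type defined by RFC 7159 that enables efficient access to data in JSON (JavaScript Object Notation) documents. The JSON data type provides these advantages over storing JSON-format strings in a string column:
Automatic validation of JSON documents stored in JSON columns. Invalid documents produce an error.
Optimized storage format. JSON documents stored in JSON columns are converted to an internal format that permits quick read access to document elements. When the server later must read a JSON value stored in this binary format, the value need not be parsed from a text representation. The binary format is structured to enable the server to look up subobjects or nested values directly by key or array index without reading all values before or after them in the document.
MySQL 8.0 also supports the JSON Merge Patch format defined in RFC 7396, using the JSON_MERGE_PATCH() function. See the description of this function, as well as Normalization, Merging, and Autowrapping of JSON Values, for examples and further information.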
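This discussion uses JSON in monotype to indicate specifically the JSON data type and “JSON” in regular font to indicate JSON data in general.
The space required to store a JSON document is roughly the same as for LONGBLOB or LONGTEXT; see Section 11.8, “Data Type Storage Requirements”, for more information. It is important to keep in mind that the size of any JSON document stored in a JSON column is limited to the value of the max_allowed_packet system variable. (When the server is manipulating a JSON value internally in memory, it can be larger than this; the limit applies when the server stores it.) You can obtain the amount of space required to store a JSON document using the JSON_STORAGE_SIZE() function. Note that for a JSON column, the storage size, and thus the value returned by this function, is that used by the column prior to any partial updates that may have been performed on it (see the discussion of the JSON partial update optimization later in this section).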
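Prior to MySQL 8.0.13, a JSON column cannot have a non-NULL default value.
Along with the JSON data type, a set of SQL functions is available to enable operations on JSON values, such as creation, manipulation, and searching. The following discussion shows examples of these operations. For details about individual functions, see Section 12.17, “JSON Functions”.
A set of spatial functions for operating on GeoJSON values is also available. See Section 12.16.11, “Spatial GeoJSON Functions”.
JSON columns, like columns of other binary types, are not indexed directly; instead, you can create an index on a generated column that extracts a scalar value from the JSON column. See Indexing a Generated Column to Provide a JSON Column Index, for a detailed example.
The MySQL optimizer also looks for compatible indexes on virtual columns that match JSON expressions.
MySQL NDB Cluster 8.0 supports JSON columns and MySQL JSON functions, including creation of an index on a column generated from a JSON column as a workaround for being unable to index a JSON column. A maximum of 3 JSON columns per NDB table is supported.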
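In MySQL 8.0, the optimizer can perform a partial, in-place update of a JSON column instead of removing the old document and writing the new document in its entirety to the column. This optimization can be performed for an update that meets the following conditions:
The column being updated was declared as JSON.
The UPDATE statement uses any of the three functions JSON_SET(), JSON_REPLACE(), or JSON_REMOVE() to update the column. A direct assignment of the column value (for example, UPDATE mytable SET jcol = '{"a": 10, "b": 25}') cannot be performed as a partial update.
Updates of multiple JSON columns in a single UPDATE statement can be optimized in this fashion; MySQL can perform partial updates of only those columns whose values are updated using the three functions just listed.
The input column and the target column must be the same column; a statement such as UPDATE mytable SET jcol1 = JSON_SET(jcol2, '$.a', 100) cannot be performed as a partial update.
The update can use nested calls to any of the functions listed in the previous item, in any combination, as long as the input and target columns are the same.
All changes replace existing array or object values with new ones, and do not add any new elements to the parent object or array.
The value being replaced must be at least as large as the replacement value. In other words, the new value cannot be any larger than the old one.
A possible exception to this requirement occurs when a previous partial update has left sufficient space for the larger value. You can use the function JSON_STORAGE_FREE() to see how much space has been freed by any partial updates of a JSON column.
Such partial updates can be written to the binary log using a compact format that saves space; this can be enabled by setting the binlog_row_value_options system variable to PARTIAL_JSON. See the description of this variable for more information.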
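The next few sections provide basic information regarding the creation and manipulation of JSON values.
A JSON array contains a list of values separated by commas and enclosed within [ and ] characters:
["abc", 10, null, true, false]
A JSON object contains a set of key-value pairs separated by commas and enclosed within { and } characters:
{"k1": "value", "k2": 10}
As the examples illustrate, JSON arrays and objects can contain scalar values that are strings or numbers, the JSON null literal, or the JSON boolean true or false literals. Keys in JSON objects must be strings. Temporal (date, time, or datetime) scalar values are also permitted:
["12:18:29.000000", "2015-07-29", "2015-07-29 12:18:29.000000"]
Nesting is permitted within JSON array elements and JSON object key values:
[99, {"id": "HK500", "cost": 75.99}, ["hot", "cold"]]
{"k1": "value", "k2": [10, 20]}
You can also obtain JSON values from a number of functions supplied by MySQL for this purpose (see Section 12.17.2, “Functions That Create JSON Values”) as well as by casting values of other types to the JSON type using CAST(value AS JSON) (see Converting between JSON and non-JSON values). The next several paragraphs describe how MySQL handles JSON values provided as input.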
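In MySQL, JSON values are written as strings. MySQL parses any string used in a context that requires a JSON value, and produces an error if it is not valid as JSON. These contexts include inserting a value into a column that has the JSON data type and passing an argument to a function that expects a JSON value (usually shown as json_doc or json_val in the documentation for MySQL JSON functions), as the following examples demonstrate:
Attempting to insert a value into a JSON column succeeds if the value is a valid JSON value, but fails if it is not:
mysql> CREATE TABLE t1 (jdoc JSON);
Query OK, 0 rows affected (0.20 sec)

mysql> INSERT INTO t1 VALUES('{"key1": "value1", "key2": "value2"}');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO t1 VALUES('[1, 2,');
ERROR 3140 (22032) at line 2: Invalid JSON text:
"Invalid value." at position 6 in value (or column) '[1, 2,'.
Positions for “at position N” in such error messages are 0-based, but should be considered rough indications of where the problem in a value actually occurs.
The JSON_TYPE() function expects a JSON argument and attempts to parse it into a JSON value. It returns the value's JSON type if it is valid and produces an error otherwise:
mysql> SELECT JSON_TYPE('["a", "b", 1]');
+----------------------------+
| JSON_TYPE('["a", "b", 1]') |
+----------------------------+
| ARRAY                      |
+----------------------------+

mysql> SELECT JSON_TYPE('"hello"');
+----------------------+
| JSON_TYPE('"hello"') |
+----------------------+
| STRING               |
+----------------------+

mysql> SELECT JSON_TYPE('hello');
ERROR 3146 (22032): Invalid data type for JSON data in argument 1
to function json_type; a JSON string or JSON type is required.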
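The parse-then-classify behavior can be mimicked on the client side with Python's standard `json` module, for instance to reject bad input before it reaches the server. This is only an approximation: the function name `json_type` is reused here for readability, the type names cover just the common cases, and Python's parser may disagree with MySQL on edge cases. The `parse_constant` hook is used to reject `NaN` and `Infinity`, which Python would otherwise accept even though they are not valid JSON.

```python
import json


def _reject_constant(name):
    raise ValueError(f"invalid JSON constant: {name}")


def json_type(text):
    """Parse JSON text and return a coarse type name, or raise ValueError."""
    # json.JSONDecodeError is a subclass of ValueError.
    value = json.loads(text, parse_constant=_reject_constant)
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # test bool before int: bool is an int subclass
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "STRING"
    if isinstance(value, list):
        return "ARRAY"
    return "OBJECT"


print(json_type('["a", "b", 1]'))  # ARRAY
print(json_type('"hello"'))        # STRING
try:
    json_type('hello')
except ValueError as exc:
    print("invalid:", exc)
```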
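MySQL handles strings used in JSON context using the utf8mb4 character set and utf8mb4_bin collation. Strings in other character sets are converted to utf8mb4 as necessary. (For strings in the ascii or utf8 character sets, no conversion is needed because ascii and utf8 are subsets of utf8mb4.)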
As an alternative to writing JSON values using literal strings, functions exist for composing JSON values from component elements. JSON_ARRAY() takes a (possibly empty) list of values and returns a JSON array containing those values:
mysql> SELECT JSON_ARRAY('a', 1, NOW());
+----------------------------------------+
| JSON_ARRAY('a', 1, NOW())              |
+----------------------------------------+
| ["a", 1, "2015-07-27 09:43:47.000000"] |
+----------------------------------------+
JSON_OBJECT() takes a (possibly empty) list of key-value pairs and returns a JSON object containing those pairs:
mysql> SELECT JSON_OBJECT('key1', 1, 'key2', 'abc');
+---------------------------------------+
| JSON_OBJECT('key1', 1, 'key2', 'abc') |
+---------------------------------------+
| {"key1": 1, "key2": "abc"}            |
+---------------------------------------+
JSON_MERGE_PRESERVE() takes two or more JSON documents and returns the combined result:
mysql> SELECT JSON_MERGE_PRESERVE('["a", 1]', '{"key": "value"}');
+-----------------------------------------------------+
| JSON_MERGE_PRESERVE('["a", 1]', '{"key": "value"}') |
+-----------------------------------------------------+
| ["a", 1, {"key": "value"}]                          |
+-----------------------------------------------------+
1 row in set (0.00 sec)
For information about the merging rules, see Normalization, Merging, and Autowrapping of JSON Values.
(MySQL 8.0.3 and later also support JSON_MERGE_PATCH(), which has somewhat different behavior. See JSON_MERGE_PATCH() compared with JSON_MERGE_PRESERVE(), for information about the differences between these two functions.)
JSON values can be assigned to user-defined variables:
mysql> SET @j = JSON_OBJECT('key', 'value');
mysql> SELECT @j;
+------------------+
| @j               |
+------------------+
| {"key": "value"} |
+------------------+
However, user-defined variables cannot be of JSON data type, so although @j in the preceding example looks like a JSON value and has the same character set and collation as a JSON value, it does not have the JSON data type. Instead, the result from JSON_OBJECT() is converted to a string when assigned to the variable.
Strings produced by converting JSON values have a character set of utf8mb4 and a collation of utf8mb4_bin:
mysql> SELECT CHARSET(@j), COLLATION(@j);
+-------------+---------------+
| CHARSET(@j) | COLLATION(@j) |
+-------------+---------------+
| utf8mb4     | utf8mb4_bin   |
+-------------+---------------+
Because utf8mb4_bin is a binary collation, comparison of JSON values is case-sensitive.
mysql> SELECT JSON_ARRAY('x') = JSON_ARRAY('X');
+-----------------------------------+
| JSON_ARRAY('x') = JSON_ARRAY('X') |
+-----------------------------------+
|                                 0 |
+-----------------------------------+
Case sensitivity also applies to the JSON null, true, and false literals, which always must be written in lowercase:
mysql> SELECT JSON_VALID('null'), JSON_VALID('Null'), JSON_VALID('NULL');
+--------------------+--------------------+--------------------+
| JSON_VALID('null') | JSON_VALID('Null') | JSON_VALID('NULL') |
+--------------------+--------------------+--------------------+
|                  1 |                  0 |                  0 |
+--------------------+--------------------+--------------------+

mysql> SELECT CAST('null' AS JSON);
+----------------------+
| CAST('null' AS JSON) |
+----------------------+
| null                 |
+----------------------+
1 row in set (0.00 sec)

mysql> SELECT CAST('NULL' AS JSON);
ERROR 3141 (22032): Invalid JSON text in argument 1 to function cast_as_json:
"Invalid value." at position 0 in 'NULL'.
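The lowercase requirement comes from the JSON grammar itself rather than from MySQL, so other JSON parsers behave the same way. As a small illustration, Python's standard `json` module accepts only the lowercase spelling; the helper name `is_valid_json` is hypothetical.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print([is_valid_json(s) for s in ("null", "Null", "NULL")])
# [True, False, False]
```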
Case sensitivity of the JSON literals differs from that of the SQL NULL, TRUE, and FALSE literals, which can be written in any lettercase:
mysql> SELECT ISNULL(null), ISNULL(Null), ISNULL(NULL);
+--------------+--------------+--------------+
| ISNULL(null) | ISNULL(Null) | ISNULL(NULL) |
+--------------+--------------+--------------+
|            1 |            1 |            1 |
+--------------+--------------+--------------+
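Sometimes it may be necessary or desirable to insert quote characters (" or ') into a JSON document. Assume for this example that you want to insert some JSON objects containing strings representing sentences that state some facts about MySQL, each paired with an appropriate keyword, into a table created using the SQL statement shown here:
mysql> CREATE TABLE facts (sentence JSON);
Among these keyword-sentence pairs is this one:
mascot: The MySQL mascot is a dolphin named "Sakila".
One way to insert this as a JSON object into the facts table is to use the MySQL JSON_OBJECT() function. In this case, you must escape each quote character using a backslash, as shown here:
mysql> INSERT INTO facts VALUES
     >   (JSON_OBJECT("mascot", "Our mascot is a dolphin named \"Sakila\"."));
This does not work in the same way if you insert the value as a JSON object literal, in which case, you must use the double backslash escape sequence, like this:
mysql> INSERT INTO facts VALUES
     >   ('{"mascot": "Our mascot is a dolphin named \\"Sakila\\"."}');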
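Using the double backslash keeps MySQL from performing escape sequence processing, and instead causes it to pass the string literal to the storage engine for processing. After inserting the JSON object in either of the ways just shown, you can see that the backslashes are present in the JSON column value by doing a simple SELECT, like this:
mysql> SELECT sentence FROM facts;
+---------------------------------------------------------+
| sentence                                                |
+---------------------------------------------------------+
| {"mascot": "Our mascot is a dolphin named \"Sakila\"."} |
+---------------------------------------------------------+
To look up this particular sentence employing mascot as the key, you can use the column-path operator ->, as shown here:
mysql> SELECT sentence->"$.mascot" FROM facts;
+---------------------------------------------+
| sentence->"$.mascot"                        |
+---------------------------------------------+
| "Our mascot is a dolphin named \"Sakila\"." |
+---------------------------------------------+
1 row in set (0.00 sec)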
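This leaves the backslashes intact, along with the surrounding quote marks. To display the desired value using mascot as the key, but without including the surrounding quote marks or any escapes, use the inline path operator ->>, like this:
mysql> SELECT sentence->>"$.mascot" FROM facts;
+-----------------------------------------+
| sentence->>"$.mascot"                   |
+-----------------------------------------+
| Our mascot is a dolphin named "Sakila". |
+-----------------------------------------+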
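The difference between the stored JSON text (with escapes) and the unquoted string value can also be seen with any JSON library, independent of MySQL. The following Python sketch uses the standard `json` module; the variable names are illustrative, and the comparison to the two path operators is an analogy rather than a description of how MySQL implements them.

```python
import json

sentence = 'Our mascot is a dolphin named "Sakila".'

# Serializing escapes the embedded double quotes with backslashes.
doc = json.dumps({"mascot": sentence})
print(doc)
# {"mascot": "Our mascot is a dolphin named \"Sakila\"."}

value = json.loads(doc)["mascot"]
quoted = json.dumps(value)  # JSON text: quotes and escapes kept, similar to ->
unquoted = value            # plain string: no quotes or escapes, similar to ->>
print(quoted)
# "Our mascot is a dolphin named \"Sakila\"."
print(unquoted)
# Our mascot is a dolphin named "Sakila".
```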
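The previous example does not work as shown if the NO_BACKSLASH_ESCAPES server SQL mode is enabled. If this mode is set, a single backslash instead of double backslashes can be used to insert the JSON object literal, and the backslashes are preserved. If you use the JSON_OBJECT() function when performing the insert and this mode is set, you must alternate single and double quotes, like this:
mysql> INSERT INTO facts VALUES
     > (JSON_OBJECT('mascot', 'Our mascot is a dolphin named "Sakila".'));
See the description of the JSON_UNQUOTE() function for more information about the effects of this mode on escaped characters in JSON values.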
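When a string is parsed and found to be a valid JSON document, it is also normalized. This means that members with keys that duplicate a key found later in the document, reading from left to right, are discarded. The object value produced by the following JSON_OBJECT() call includes only the second key1 element because that key name occurs earlier in the value, as shown here:
mysql> SELECT JSON_OBJECT('key1', 1, 'key2', 'abc', 'key1', 'def');
+------------------------------------------------------+
| JSON_OBJECT('key1', 1, 'key2', 'abc', 'key1', 'def') |
+------------------------------------------------------+
| {"key1": "def", "key2": "abc"}                       |
+------------------------------------------------------+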
Normalization is also performed when values are inserted into JSON columns, as shown here:
mysql> CREATE TABLE t1 (c1 JSON);
mysql> INSERT INTO t1 VALUES
    ->   ('{"x": 17, "x": "red"}'),
    ->   ('{"x": 17, "x": "red", "x": [3, 5, 7]}');
mysql> SELECT c1 FROM t1;
+------------------+
| c1               |
+------------------+
| {"x": "red"}     |
| {"x": [3, 5, 7]} |
+------------------+
This “last duplicate key wins” behavior is suggested by RFC 7159 and is implemented by most JavaScript parsers. (Bug #86866, Bug #26369555)
In versions of MySQL prior to 8.0.3, members with keys that duplicated a key found earlier in the document were discarded. The object value produced by the following JSON_OBJECT() call does not include the second key1 element because that key name occurs earlier in the value:
mysql> SELECT JSON_OBJECT('key1', 1, 'key2', 'abc', 'key1', 'def');
+------------------------------------------------------+
| JSON_OBJECT('key1', 1, 'key2', 'abc', 'key1', 'def') |
+------------------------------------------------------+
| {"key1": 1, "key2": "abc"}                           |
+------------------------------------------------------+
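Prior to MySQL 8.0.3, this “first duplicate key wins” normalization was also performed when inserting values into JSON columns.
mysql> CREATE TABLE t1 (c1 JSON);
mysql> INSERT INTO t1 VALUES
    ->   ('{"x": 17, "x": "red"}'),
    ->   ('{"x": 17, "x": "red", "x": [3, 5, 7]}');
mysql> SELECT c1 FROM t1;
+-----------+
| c1        |
+-----------+
| {"x": 17} |
| {"x": 17} |
+-----------+
MySQL also discards extra whitespace between keys, values, or elements in the original JSON document. To make lookups more efficient, it also sorts the keys of a JSON object. You should be aware that the result of this ordering is subject to change and not guaranteed to be consistent across releases.
MySQL functions that produce JSON values (see Section 12.17.2, “Functions That Create JSON Values”) always return normalized values.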
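The normalization rules above can be imitated with Python's standard json module, which also keeps the last value for a repeated key. This is a rough model rather than MySQL's implementation: normalize() and normalize_first_wins() are hypothetical names, and sort_keys=True only illustrates the idea of a canonical key order, which in MySQL may differ and is not guaranteed across releases.

```python
import json

def normalize(text):
    """Last duplicate key wins; extra whitespace is dropped on re-serialization."""
    return json.dumps(json.loads(text), sort_keys=True)

def normalize_first_wins(text):
    """Model of the pre-8.0.3 behavior: the first value for a key is kept."""
    def keep_first(pairs):
        result = {}
        for key, value in pairs:
            result.setdefault(key, value)  # ignore later duplicates
        return result
    return json.dumps(json.loads(text, object_pairs_hook=keep_first), sort_keys=True)

doc = '{"x": 17,   "x": "red", "x": [3, 5, 7]}'
print(normalize(doc))             # last value for "x" is kept
print(normalize_first_wins(doc))  # first value for "x" is kept
```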
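MySQL 8.0.3 (and later) supports two merging algorithms, implemented by the functions JSON_MERGE_PRESERVE() and JSON_MERGE_PATCH(). These differ in how they handle duplicate keys: JSON_MERGE_PRESERVE() retains values for duplicate keys, while JSON_MERGE_PATCH() discards all but the last value. The next few paragraphs explain how each of these two functions handles the merging of different combinations of JSON documents (that is, of objects and arrays).
JSON_MERGE_PRESERVE() is the same as the JSON_MERGE() function found in previous versions of MySQL (renamed in MySQL 8.0.3). JSON_MERGE() is still supported as an alias for JSON_MERGE_PRESERVE() in MySQL 8.0, but is deprecated and subject to removal in a future release.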
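Merging arrays. In contexts that combine multiple arrays, the arrays are merged into a single array. JSON_MERGE_PRESERVE() does this by concatenating arrays named later to the end of the first array. JSON_MERGE_PATCH() considers each argument as an array consisting of a single element (thus having 0 as its index) and then applies “last duplicate key wins” logic to select only the last argument. You can compare the results shown by this query:
mysql> SELECT
    ->   JSON_MERGE_PRESERVE('[1, 2]', '["a", "b", "c"]', '[true, false]') AS Preserve,
    ->   JSON_MERGE_PATCH('[1, 2]', '["a", "b", "c"]', '[true, false]') AS Patch\G
*************************** 1. row ***************************
Preserve: [1, 2, "a", "b", "c", true, false]
   Patch: [true, false]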
Multiple objects when merged produce a single object. JSON_MERGE_PRESERVE() handles multiple objects having the same key by combining all unique values for that key in an array; this array is then used as the value for that key in the result. JSON_MERGE_PATCH() discards values for which duplicate keys are found, working from left to right, so that the result contains only the last value for that key. The following query illustrates the difference in the results for the duplicate key a:
mysql> SELECT
    ->   JSON_MERGE_PRESERVE('{"a": 1, "b": 2}', '{"c": 3, "a": 4}', '{"c": 5, "d": 3}') AS Preserve,
    ->   JSON_MERGE_PATCH('{"a": 3, "b": 2}', '{"c": 3, "a": 4}', '{"c": 5, "d": 3}') AS Patch\G
*************************** 1. row ***************************
Preserve: {"a": [1, 4], "b": 2, "c": [3, 5], "d": 3}
   Patch: {"a": 4, "b": 2, "c": 5, "d": 3}
Nonarray values used in a context that requires an array value are autowrapped: The value is surrounded by [ and ] characters to convert it to an array. In the following statement, each argument is autowrapped as an array ([1], [2]). These are then merged to produce a single result array; as in the previous two cases, JSON_MERGE_PRESERVE() combines values having the same key while JSON_MERGE_PATCH() discards values for all duplicate keys except the last, as shown here:
mysql> SELECT
    ->   JSON_MERGE_PRESERVE('1', '2') AS Preserve,
    ->   JSON_MERGE_PATCH('1', '2') AS Patch\G
*************************** 1. row ***************************
Preserve: [1, 2]
   Patch: 2
Array and object values are merged by autowrapping the object as an array and merging the arrays by combining values or by “last duplicate key wins” according to the choice of merging function (JSON_MERGE_PRESERVE() or JSON_MERGE_PATCH(), respectively), as can be seen in this example:
mysql> SELECT
    ->   JSON_MERGE_PRESERVE('[10, 20]', '{"a": "x", "b": "y"}') AS Preserve,
    ->   JSON_MERGE_PATCH('[10, 20]', '{"a": "x", "b": "y"}') AS Patch\G
*************************** 1. row ***************************
Preserve: [10, 20, {"a": "x", "b": "y"}]
   Patch: {"a": "x", "b": "y"}
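The merging rules for arrays, objects, scalars, and mixed arguments can be summarized as two small recursive functions. The Python sketch below is a model under stated assumptions, not the server's code: merge_preserve() and merge_patch() are hypothetical names, the patch variant follows the RFC 7396 merge-patch algorithm (in which a null member value removes the key), and the preserve variant concatenates every value for a repeated key.

```python
import json
from functools import reduce

def _preserve_pair(a, b):
    # Two objects: merge member by member, combining values of shared keys.
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, value in b.items():
            merged[key] = _preserve_pair(merged[key], value) if key in merged else value
        return merged
    # Anything else: autowrap non-arrays, then concatenate the arrays.
    wrap = lambda v: v if isinstance(v, list) else [v]
    return wrap(a) + wrap(b)

def _patch_pair(a, b):
    # A non-object patch simply replaces the target.
    if not isinstance(b, dict):
        return b
    merged = dict(a) if isinstance(a, dict) else {}
    for key, value in b.items():
        if value is None:
            merged.pop(key, None)          # null removes the member
        else:
            merged[key] = _patch_pair(merged.get(key), value)
    return merged

def merge_preserve(*texts):
    return reduce(_preserve_pair, (json.loads(t) for t in texts))

def merge_patch(*texts):
    return reduce(_patch_pair, (json.loads(t) for t in texts))

print(json.dumps(merge_preserve('[10, 20]', '{"a": "x", "b": "y"}')))
print(json.dumps(merge_patch('[10, 20]', '{"a": "x", "b": "y"}')))
```

Folding the arguments pairwise from left to right is a design choice of this sketch; it reproduces the results of the four queries shown above.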
A JSON path expression selects a value within a JSON document. Path expressions are useful with functions that extract parts of or modify a JSON document, to specify where within that document to operate. For example, the following query extracts from a JSON document the value of the member with the name key:
mysql> SELECT JSON_EXTRACT('{"id": 14, "name": "Aztalan"}', '$.name');
+---------------------------------------------------------+
| JSON_EXTRACT('{"id": 14, "name": "Aztalan"}', '$.name') |
+---------------------------------------------------------+
| "Aztalan" |
+---------------------------------------------------------+
Path syntax uses a leading $ character to represent the JSON document under consideration, optionally followed by selectors that indicate successively more specific parts of the document:
A period followed by a key name names the member in an object with the given key. The key name must be specified within double quotation marks if the name without quotes is not legal within path expressions (for example, if it contains a space).
[N] appended to a path that selects an array names the value at position N within the array. Array positions are integers beginning with zero. If path does not select an array value, path[0] evaluates to the same value as path:
mysql> SELECT JSON_SET('"x"', '$[0]', 'a');
+------------------------------+
| JSON_SET('"x"', '$[0]', 'a') |
+------------------------------+
| "a" |
+------------------------------+
1 row in set (0.00 sec)
[M to N] specifies a subset or range of array values starting with the value at position M, and ending with the value at position N.
last is supported as a synonym for the index of the rightmost array element. Relative addressing of array elements is also supported. If path does not select an array value, path[last] evaluates to the same value as path, as shown later in this section (see Rightmost array element).
Paths can contain * or ** wildcards:
.* evaluates to the values of all members in a JSON object.
[*] evaluates to the values of all elements in a JSON array.
prefix**suffix evaluates to all paths that begin with the named prefix and end with the named suffix.
A path that does not exist in the document (evaluates to nonexistent data) evaluates to NULL.
Let $ refer to this JSON array with three elements:
[3, {"a": [5, 6], "b": 10}, [99, 100]]
Then:
$[0] evaluates to 3.
$[1] evaluates to {"a": [5, 6], "b": 10}.
$[2] evaluates to [99, 100].
$[3] evaluates to NULL (it refers to the fourth array element, which does not exist).
Because $[1] and $[2] evaluate to nonscalar values, they can be used as the basis for more-specific path expressions that select nested values. Examples:
$[1].a evaluates to [5, 6].
$[1].a[1] evaluates to 6.
$[1].b evaluates to 10.
$[2][0] evaluates to 99.
As mentioned previously, path components that name keys must be quoted if the unquoted key name is not legal in path expressions. Let $ refer to this value:
{"a fish": "shark", "a bird": "sparrow"}
The keys both contain a space and must be quoted:
$."a fish" evaluates to shark.
$."a bird" evaluates to sparrow.
Paths that use wildcards evaluate to an array that can contain multiple values:
mysql> SELECT JSON_EXTRACT('{"a": 1, "b": 2, "c": [3, 4, 5]}', '$.*');
+---------------------------------------------------------+
| JSON_EXTRACT('{"a": 1, "b": 2, "c": [3, 4, 5]}', '$.*') |
+---------------------------------------------------------+
| [1, 2, [3, 4, 5]]                                       |
+---------------------------------------------------------+
mysql> SELECT JSON_EXTRACT('{"a": 1, "b": 2, "c": [3, 4, 5]}', '$.c[*]');
+------------------------------------------------------------+
| JSON_EXTRACT('{"a": 1, "b": 2, "c": [3, 4, 5]}', '$.c[*]') |
+------------------------------------------------------------+
| [3, 4, 5]                                                  |
+------------------------------------------------------------+
In the following example, the path $**.b evaluates to multiple paths ($.a.b and $.c.b) and produces an array of the matching path values:
mysql> SELECT JSON_EXTRACT('{"a": {"b": 1}, "c": {"b": 2}}', '$**.b');
+---------------------------------------------------------+
| JSON_EXTRACT('{"a": {"b": 1}, "c": {"b": 2}}', '$**.b') |
+---------------------------------------------------------+
| [1, 2] |
+---------------------------------------------------------+
Ranges from JSON arrays. You can use ranges with the to keyword to specify subsets of JSON arrays. For example, $[1 to 3] includes the second, third, and fourth elements of an array, as shown here:
mysql> SELECT JSON_EXTRACT('[1, 2, 3, 4, 5]', '$[1 to 3]');
+----------------------------------------------+
| JSON_EXTRACT('[1, 2, 3, 4, 5]', '$[1 to 3]') |
+----------------------------------------------+
| [2, 3, 4] |
+----------------------------------------------+
1 row in set (0.00 sec)
The syntax is M to N, where M and N are, respectively, the first and last indexes of a range of elements from a JSON array. N must be greater than M; M must be greater than or equal to 0. Array elements are indexed beginning with 0.
You can use ranges in contexts where wildcards are supported.
Rightmost array element. The last keyword is supported as a synonym for the index of the last element in an array. Expressions of the form last - N can be used for relative addressing, and within range definitions, like this:
mysql> SELECT JSON_EXTRACT('[1, 2, 3, 4, 5]', '$[last-3 to last-1]');
+--------------------------------------------------------+
| JSON_EXTRACT('[1, 2, 3, 4, 5]', '$[last-3 to last-1]') |
+--------------------------------------------------------+
| [2, 3, 4] |
+--------------------------------------------------------+
1 row in set (0.01 sec)
If the path is evaluated against a value that is not an array, the result of the evaluation is the same as if the value had been wrapped in a single-element array:
mysql> SELECT JSON_REPLACE('"Sakila"', '$[last]', 10);
+-----------------------------------------+
| JSON_REPLACE('"Sakila"', '$[last]', 10) |
+-----------------------------------------+
| 10                                      |
+-----------------------------------------+
1 row in set (0.00 sec)
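To make the selector rules concrete (member access, array cells, to ranges, last, and autowrapping of non-arrays), here is a small Python evaluator. It is a sketch under stated assumptions and not how the server evaluates paths: extract() is a hypothetical function, wildcards are not handled, a range is only meaningful as the final leg, and Python's None stands in for SQL NULL.

```python
import re

# One path leg: .key, ."quoted key", or [ ... ]
_LEG = re.compile(r'\.(?:"([^"]*)"|([A-Za-z_$][\w$]*))|\[([^\]]+)\]')

def _index(expr, length):
    """Resolve N, last, or last-N to a zero-based position."""
    expr = expr.replace(" ", "")
    if expr == "last":
        return length - 1
    if expr.startswith("last-"):
        return length - 1 - int(expr[5:])
    return int(expr)

def extract(doc, path):
    if not path.startswith("$"):
        raise ValueError("a path must start with $")
    value, pos = doc, 1
    while pos < len(path):
        match = _LEG.match(path, pos)
        if match is None:
            raise ValueError("unsupported path syntax at position %d" % pos)
        quoted, ident, bracket = match.groups()
        if bracket is None:
            key = quoted if quoted is not None else ident
            if not isinstance(value, dict) or key not in value:
                return None                      # nonexistent path
            value = value[key]
        else:
            # A non-array value behaves like a single-element array.
            cells = value if isinstance(value, list) else [value]
            bounds = re.split(r"\s+to\s+", bracket.strip())
            if len(bounds) == 2:                 # [M to N] range
                first = max(_index(bounds[0], len(cells)), 0)
                last = _index(bounds[1], len(cells))
                value = cells[first:last + 1]
                if not value:
                    return None
            else:                                # single cell [N]
                i = _index(bounds[0], len(cells))
                if not 0 <= i < len(cells):
                    return None
                value = cells[i]
        pos = match.end()
    return value

doc = [3, {"a": [5, 6], "b": 10}, [99, 100]]
print(extract(doc, "$[1].a[1]"))
print(extract([1, 2, 3, 4, 5], "$[last-3 to last-1]"))
```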
You can use column->path with a JSON column identifier and JSON path expression as a synonym for JSON_EXTRACT(column, path). See Section 12.17.3, “Functions That Search JSON Values”, for more information. See also Indexing a Generated Column to Provide a JSON Column Index.
Some functions take an existing JSON document, modify it in some way, and return the resulting modified document. Path expressions indicate where in the document to make changes. For example, the JSON_SET(), JSON_INSERT(), and JSON_REPLACE() functions each take a JSON document, plus one or more path-value pairs that describe where to modify the document and the values to use. The functions differ in how they handle existing and nonexisting values within the document.
Consider this document:
mysql> SET @j = '["a", {"b": [true, false]}, [10, 20]]';
JSON_SET() replaces values for paths that exist and adds values for paths that do not exist:
mysql> SELECT JSON_SET(@j, '$[1].b[0]', 1, '$[2][2]', 2);
+--------------------------------------------+
| JSON_SET(@j, '$[1].b[0]', 1, '$[2][2]', 2) |
+--------------------------------------------+
| ["a", {"b": [1, false]}, [10, 20, 2]] |
+--------------------------------------------+
In this case, the path $[1].b[0] selects an existing value (true), which is replaced with the value following the path argument (1). The path $[2][2] does not exist, so the corresponding value (2) is added to the value selected by $[2].
JSON_INSERT() adds new values but does not replace existing values:
mysql> SELECT JSON_INSERT(@j, '$[1].b[0]', 1, '$[2][2]', 2);
+-----------------------------------------------+
| JSON_INSERT(@j, '$[1].b[0]', 1, '$[2][2]', 2) |
+-----------------------------------------------+
| ["a", {"b": [true, false]}, [10, 20, 2]] |
+-----------------------------------------------+
JSON_REPLACE() replaces existing values and ignores new values:
mysql> SELECT JSON_REPLACE(@j, '$[1].b[0]', 1, '$[2][2]', 2);
+------------------------------------------------+
| JSON_REPLACE(@j, '$[1].b[0]', 1, '$[2][2]', 2) |
+------------------------------------------------+
| ["a", {"b": [1, false]}, [10, 20]] |
+------------------------------------------------+
The path-value pairs are evaluated left to right. The document produced by evaluating one pair becomes the new value against which the next pair is evaluated.
JSON_REMOVE() takes a JSON document and one or more paths that specify values to be removed from the document. The return value is the original document minus the values selected by paths that exist within the document:
mysql> SELECT JSON_REMOVE(@j, '$[2]', '$[1].b[1]', '$[1].b[1]');
+---------------------------------------------------+
| JSON_REMOVE(@j, '$[2]', '$[1].b[1]', '$[1].b[1]') |
+---------------------------------------------------+
| ["a", {"b": [true]}] |
+---------------------------------------------------+
The paths have these effects:
$[2] matches [10, 20] and removes it.
The first instance of $[1].b[1] matches false in the b element and removes it.
The second instance of $[1].b[1] matches nothing: That element has already been removed, so the path no longer exists and has no effect.
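The difference between the four modification functions comes down to what happens when a path does or does not exist. The following Python sketch models that decision. It is an illustration only: the function names mirror the SQL functions but are hypothetical Python helpers, and paths are given as already-parsed tuples of keys and indexes (for example, (1, "b", 0) for $[1].b[0]) rather than as path strings.

```python
import copy
import json

def _walk(doc, legs):
    """Follow legs (str keys or int indexes); return (found, value)."""
    value = doc
    for leg in legs:
        if isinstance(value, dict) and isinstance(leg, str) and leg in value:
            value = value[leg]
        elif isinstance(value, list) and isinstance(leg, int) and 0 <= leg < len(value):
            value = value[leg]
        else:
            return False, None
    return True, value

def _modify(doc, pairs, replace_existing, add_missing):
    doc = copy.deepcopy(doc)
    for path, new_value in pairs:              # pairs are applied left to right
        found, parent = _walk(doc, path[:-1])
        if not found:
            continue
        last = path[-1]
        if isinstance(parent, dict) and isinstance(last, str):
            exists = last in parent
            if (exists and replace_existing) or (not exists and add_missing):
                parent[last] = new_value
        elif isinstance(parent, list) and isinstance(last, int):
            if last < len(parent):
                if replace_existing:
                    parent[last] = new_value
            elif add_missing:
                parent.append(new_value)       # index past the end appends
    return doc

def json_set(doc, *pairs):
    return _modify(doc, pairs, replace_existing=True, add_missing=True)

def json_insert(doc, *pairs):
    return _modify(doc, pairs, replace_existing=False, add_missing=True)

def json_replace(doc, *pairs):
    return _modify(doc, pairs, replace_existing=True, add_missing=False)

def json_remove(doc, *paths):
    doc = copy.deepcopy(doc)
    for path in paths:                         # each removal sees earlier ones
        found, parent = _walk(doc, path[:-1])
        if not found:
            continue
        last = path[-1]
        if isinstance(parent, dict) and last in parent:
            del parent[last]
        elif isinstance(parent, list) and isinstance(last, int) and 0 <= last < len(parent):
            del parent[last]
    return doc

j = ["a", {"b": [True, False]}, [10, 20]]
pairs = [((1, "b", 0), 1), ((2, 2), 2)]
print(json.dumps(json_set(j, *pairs)))
print(json.dumps(json_insert(j, *pairs)))
print(json.dumps(json_replace(j, *pairs)))
print(json.dumps(json_remove(j, (2,), (1, "b", 1), (1, "b", 1))))
```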
Many of the JSON functions supported by MySQL and described elsewhere in this Manual (see Section 12.17, “JSON Functions”) require a path expression in order to identify a specific element in a JSON document. A path consists of the path's scope followed by one or more path legs. For paths used in MySQL JSON functions, the scope is always the document being searched or otherwise operated on, represented by a leading $ character. Path legs are separated by period characters (.). Cells in arrays are represented by [N], where N is a non-negative integer. Names of keys must be double-quoted strings or valid ECMAScript identifiers (see http://www.ecma-international.org/ecma-262/5.1/#sec-7.6). Path expressions, like JSON text, should be encoded using the ascii, utf8, or utf8mb4 character set. Other character encodings are implicitly coerced to utf8mb4. The complete syntax is shown here:
pathExpression:
    scope[(pathLeg)*]

pathLeg:
    member | arrayLocation | doubleAsterisk

member:
    period ( keyName | asterisk )

arrayLocation:
    leftBracket ( nonNegativeInteger | asterisk ) rightBracket

keyName:
    ESIdentifier | doubleQuotedString

doubleAsterisk:
    '**'

period:
    '.'

asterisk:
    '*'

leftBracket:
    '['

rightBracket:
    ']'
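As noted previously, in MySQL, the scope of the path is always the document being operated on, represented as $. You can use '$' as a synonym for the document in JSON path expressions.
Some implementations support column references for scopes of JSON paths; currently, MySQL does not support these.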
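The wildcard * and ** tokens are used as follows:
.* represents the values of all members in the object.
[*] represents the values of all cells in the array.
[prefix]**suffix represents all paths beginning with prefix and ending with suffix. prefix is optional, while suffix is required; in other words, a path may not end in **.
In addition, a path may not contain the sequence ***.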
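For path syntax examples, see the descriptions of the various JSON functions that take paths as arguments, such as JSON_CONTAINS_PATH(), JSON_SET(), and JSON_REPLACE(). For examples that include the use of the * and ** wildcards, see the description of the JSON_SEARCH() function.
MySQL 8.0.2 and later also supports range notation for subsets of JSON arrays using the to keyword (such as $[2 to 10]), as well as the last keyword as a synonym for the rightmost element of an array. See Searching and Modifying JSON Values, for more information and examples.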
JSON values can be compared using the =, <, <=, >, >=, <>, !=, and <=> operators.
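The following comparison operators and functions are not yet supported with JSON values:
BETWEEN
IN()
GREATEST()
LEAST()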
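A workaround for the comparison operators and functions just listed is to cast JSON values to a native MySQL numeric or string data type so they have a consistent non-JSON scalar type.
Comparison of JSON values takes place at two levels. The first level of comparison is based on the JSON types of the compared values. If the types differ, the comparison result is determined solely by which type has higher precedence. If the two values have the same JSON type, a second level of comparison occurs using type-specific rules.
The following list shows the precedences of JSON types, from highest precedence to the lowest. (The type names are those returned by the JSON_TYPE() function.) Types shown together on a line have the same precedence. Any value having a JSON type listed earlier in the list compares greater than any value having a JSON type listed later in the list.
BLOB
BIT
OPAQUE
DATETIME
TIME
DATE
BOOLEAN
ARRAY
OBJECT
STRING
INTEGER, DOUBLE
NULL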
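The first level of comparison, by type precedence alone, can be sketched in Python as follows. This is a partial model: only the types that plain Python JSON values can represent are covered (BLOB, BIT, OPAQUE, DATETIME, TIME, and DATE are omitted), "NUMBER" stands for the shared INTEGER, DOUBLE line, and json_type() and compare_by_type() are hypothetical helper names.

```python
# Highest precedence first, following the list above (representable subset).
PRECEDENCE = ["BOOLEAN", "ARRAY", "OBJECT", "STRING", "NUMBER", "NULL"]

def json_type(value):
    if isinstance(value, bool):        # must be tested before int
        return "BOOLEAN"
    if isinstance(value, list):
        return "ARRAY"
    if isinstance(value, dict):
        return "OBJECT"
    if isinstance(value, str):
        return "STRING"
    if isinstance(value, (int, float)):
        return "NUMBER"
    if value is None:
        return "NULL"
    raise TypeError("not a JSON value: %r" % (value,))

def compare_by_type(a, b):
    """Return 1 or -1 when the types alone decide the result, or 0 when
    both values share a precedence and type-specific rules are needed."""
    rank_a = PRECEDENCE.index(json_type(a))
    rank_b = PRECEDENCE.index(json_type(b))
    if rank_a == rank_b:
        return 0
    return 1 if rank_a < rank_b else -1   # earlier in the list is greater

print(compare_by_type(True, [1, 2]))   # BOOLEAN outranks ARRAY
print(compare_by_type(None, 0))        # NULL is lowest
```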
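For JSON values of the same precedence, the comparison rules are type specific:
BLOB
The first N bytes of the two values are compared, where N is the number of bytes in the shorter value. If the first N bytes of the two values are identical, the shorter value is ordered before the longer value.
BIT
Same rules as for BLOB.
OPAQUE
Same rules as for BLOB. OPAQUE values are values that are not classified as one of the other types.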
DATETIME
A value that represents an earlier point in time is ordered before a value that represents a later point in time. If two values originally come from the MySQL DATETIME and TIMESTAMP types, respectively, they are equal if they represent the same point in time.
TIME
The smaller of two time values is ordered before the larger one.
DATE
The earlier date is ordered before the more recent date.
ARRAY
Two JSON arrays are equal if they have the same length and values in corresponding positions in the arrays are equal.
If the arrays are not equal, their order is determined by the elements in the first position where there is a difference. The array with the smaller value in that position is ordered first. If all values of the shorter array are equal to the corresponding values in the longer array, the shorter array is ordered first.
Example:
[] < ["a"] < ["ab"] < ["ab", "cd", "ef"] < ["ab", "ef"]
BOOLEAN
The JSON false literal is less than the JSON true literal.
OBJECT
Two JSON objects are equal if they have the same set of keys, and each key has the same value in both objects.
Example:
{"a": 1, "b": 2} = {"b": 2, "a": 1}
The order of two objects that are not equal is unspecified but deterministic.
STRING
Strings are ordered lexically on the first N bytes of the utf8mb4 representation of the two strings being compared, where N is the length of the shorter string. If the first N bytes of the two strings are identical, the shorter string is considered smaller than the longer string.
Example:
"a" < "ab" < "b" < "bc"
This ordering is equivalent to the ordering of SQL strings with collation utf8mb4_bin. Because utf8mb4_bin is a binary collation, comparison of JSON values is case-sensitive:
"A" < "a"
INTEGER, DOUBLE
JSON values can contain exact-value numbers and approximate-value numbers. For a general discussion of these types of numbers, see Section 9.1.2, “Numeric Literals”.
The rules for comparing native MySQL numeric types are discussed in Section 12.2, “Type Conversion in Expression Evaluation”, but the rules for comparing numbers within JSON values differ somewhat:
In a comparison between two columns that use the native MySQL INT and DOUBLE numeric types, respectively, it is known that all comparisons involve an integer and a double, so the integer is converted to double for all rows. That is, exact-value numbers are converted to approximate-value numbers.
On the other hand, if the query compares two JSON columns containing numbers, it cannot be known in advance whether numbers will be integer or double. To provide the most consistent behavior across all rows, MySQL converts approximate-value numbers to exact-value numbers. The resulting ordering is consistent and does not lose precision for the exact-value numbers. For example, given the scalars 9223372036854775805, 9223372036854775806, 9223372036854775807 and 9.223372036854776e18, the order is such as this:
9223372036854775805 < 9223372036854775806 < 9223372036854775807 < 9.223372036854776e18 = 9223372036854776000 < 9223372036854776001
Were JSON comparisons to use the non-JSON numeric comparison rules, inconsistent ordering could occur. The usual MySQL comparison rules for numbers yield these orderings:
Integer comparison:
9223372036854775805 < 9223372036854775806 < 9223372036854775807
(not defined for 9.223372036854776e18)
Double comparison:
9223372036854775805 = 9223372036854775806 = 9223372036854775807 = 9.223372036854776e18
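The loss of precision in the double comparison is easy to reproduce. The Python sketch below is an analogy and not MySQL code: Python happens to compare an int with a float by exact mathematical value, which plays the role of the exact-value comparison, while an explicit float() conversion plays the role of converting every integer to a double first.

```python
ints = [9223372036854775805, 9223372036854775806, 9223372036854775807]
approx = 9.223372036854776e18

# Exact-style comparison: each integer keeps all of its digits, so the three
# integers stay distinct and all sort below the approximate value.
exact_ordering_holds = ints[0] < ints[1] < ints[2] < approx

# Double-style comparison: converting to a double first rounds every integer
# to the nearest representable double, which is the same for all three.
as_doubles = [float(i) for i in ints]
doubles_all_equal = all(d == approx for d in as_doubles)

print(exact_ordering_holds)
print(doubles_all_equal)
```

Both flags come out true, which is the inconsistency the JSON comparison rules are designed to avoid.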
For comparison of any JSON value to SQL NULL, the result is UNKNOWN.
For comparison of JSON and non-JSON values, the non-JSON value is converted to JSON according to the rules in the following table, then the values compared as described previously.
The following table provides a summary of the rules that MySQL follows when casting between JSON values and values of other types:
Table 11.3 JSON Conversion Rules
other type | CAST(other type AS JSON) | CAST(JSON AS other type) |
---|---|---|
JSON | No change | No change |
utf8 character type (utf8mb4, utf8, ascii) | The string is parsed into a JSON value. | The JSON value is serialized into a utf8mb4 string. |
Other character types | Other character encodings are implicitly converted to utf8mb4 and treated as described for utf8 character type. | The JSON value is serialized into a utf8mb4 string, then cast to the other character encoding. The result may not be meaningful. |
NULL | Results in a NULL value of type JSON. | Not applicable. |
Geometry types | The geometry value is converted into a JSON document by calling ST_AsGeoJSON(). | Illegal operation. Workaround: Pass the result of CAST(json_val AS CHAR) to ST_GeomFromGeoJSON(). |
All other types | Results in a JSON document consisting of a single scalar value. | Succeeds if the JSON document consists of a single scalar value of the target type and that scalar value can be cast to the target type. Otherwise, returns NULL and produces a warning. |
ORDER BY and GROUP BY for JSON values works according to these principles:
Ordering of scalar JSON values uses the same rules as in the preceding discussion.
For ascending sorts, SQL NULL orders before all JSON values, including the JSON null literal; for descending sorts, SQL NULL orders after all JSON values, including the JSON null literal.
Sort keys for JSON values are bound by the value of the max_sort_length system variable, so keys that differ only after the first max_sort_length bytes compare as equal.
Sorting of nonscalar values is not currently supported and a warning occurs.
For sorting, it can be beneficial to cast a JSON scalar to some other native MySQL type. For example, if a column named jdoc contains JSON objects having a member consisting of an id key and a nonnegative value, use this expression to sort by id values:
ORDER BY CAST(JSON_EXTRACT(jdoc, '$.id') AS UNSIGNED)
If there happens to be a generated column defined to use the same expression as in the ORDER BY, the MySQL optimizer recognizes that and considers using the index for the query execution plan. See Section 8.3.11, “Optimizer Use of Generated Column Indexes”.
For aggregation of JSON values, SQL NULL values are ignored as for other data types. Non-NULL values are converted to a numeric type and aggregated, except for MIN(), MAX(), and GROUP_CONCAT(). The conversion to number should produce a meaningful result for JSON values that are numeric scalars, although (depending on the values) truncation and loss of precision may occur. Conversion to number of other JSON values may not produce a meaningful result.
Data type specifications can have explicit or implicit default values.
A DEFAULT value clause in a data type specification explicitly indicates a default value for a column. Examples:
CREATE TABLE t1 (
  i     INT DEFAULT -1,
  c     VARCHAR(10) DEFAULT '',
  price DOUBLE(16,2) DEFAULT 0.00
);
SERIAL DEFAULT VALUE is a special case. In the definition of an integer column, it is an alias for NOT NULL AUTO_INCREMENT UNIQUE.
Some aspects of explicit DEFAULT clause handling are version dependent, as described following.
As of MySQL 8.0.13, the default value specified in a DEFAULT clause can be a literal constant or an expression. With one exception, enclose expression default values within parentheses to distinguish them from literal constant default values. Examples:
CREATE TABLE t1 (
  -- literal defaults
  i INT         DEFAULT 0,
  c VARCHAR(10) DEFAULT '',
  -- expression defaults
  f FLOAT       DEFAULT (RAND() * RAND()),
  b BINARY(16)  DEFAULT (UUID_TO_BIN(UUID())),
  d DATE        DEFAULT (CURRENT_DATE + INTERVAL 1 YEAR),
  p POINT       DEFAULT (Point(0,0)),
  j JSON        DEFAULT (JSON_ARRAY())
);
The exception is that, for TIMESTAMP and DATETIME columns, you can specify the CURRENT_TIMESTAMP function as the default, without enclosing parentheses. See Section 11.3.4, “Automatic Initialization and Updating for TIMESTAMP and DATETIME”.
The BLOB, TEXT, GEOMETRY, and JSON data types can be assigned a default value only if the value is written as an expression, even if the expression value is a literal:
This is permitted (literal default specified as expression):
CREATE TABLE t2 (b BLOB DEFAULT ('abc'));
This produces an error (literal default not specified as expression):
CREATE TABLE t2 (b BLOB DEFAULT 'abc');
Expression default values must adhere to the following rules. An error occurs if an expression contains disallowed constructs.
Literals, built-in functions (both deterministic and nondeterministic), and operators are permitted.
Subqueries, parameters, variables, stored functions, and user-defined functions are not permitted.
An expression default value cannot depend on a column that has the AUTO_INCREMENT attribute.
An expression default value for one column can refer to other table columns, with the exception that references to generated columns or columns with expression default values must be to columns that occur earlier in the table definition. That is, expression default values cannot contain forward references to generated columns or columns with expression default values.
The ordering constraint also applies to the use of ALTER TABLE to reorder table columns. If the resulting table would have an expression default value that contains a forward reference to a generated column or column with an expression default value, the statement fails.
If any component of an expression default value depends on the SQL mode, different results may occur for different uses of the table unless the SQL mode is the same during all uses.
For CREATE TABLE ... LIKE and CREATE TABLE ... SELECT, the destination table preserves expression default values from the original table.
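The forward-reference rule in the list above lends itself to a simple check over the column order. The Python sketch below is purely illustrative: the table description format (a list of dictionaries with name, kind, and refs fields) and the function forward_reference_errors() are assumptions made for this example, and MySQL performs its own validation when the statement is executed.

```python
def forward_reference_errors(columns):
    """Return (column, referenced_column) pairs that break the ordering rule.

    columns is a list of dicts in table-definition order, each with:
      'name': column name
      'kind': 'plain', 'expr_default', or 'generated'
      'refs': names of columns used by an expression default (optional)
    """
    position = {col["name"]: i for i, col in enumerate(columns)}
    kind = {col["name"]: col["kind"] for col in columns}
    errors = []
    for i, col in enumerate(columns):
        if col["kind"] != "expr_default":
            continue
        for ref in col.get("refs", ()):
            # References to generated columns or to columns with expression
            # defaults must point to columns defined earlier in the table.
            if kind[ref] in ("expr_default", "generated") and position[ref] >= i:
                errors.append((col["name"], ref))
    return errors

table = [
    {"name": "a", "kind": "plain"},
    {"name": "b", "kind": "expr_default", "refs": ["a", "c"]},  # c comes later
    {"name": "c", "kind": "expr_default", "refs": ["a"]},
]
print(forward_reference_errors(table))
```

Moving column c ahead of column b in this hypothetical table makes the check pass, which mirrors why an ALTER TABLE that reorders columns can fail.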
If an expression default value refers to a nondeterministic function, any statement that causes the expression to be evaluated is unsafe for statement-based replication. This includes statements such as INSERT, UPDATE, and ALTER TABLE. In this situation, if binary logging is disabled, the statement is executed as normal. If binary logging is enabled and binlog_format is set to STATEMENT, the statement is logged and executed but a warning message is written to the error log, because replication slaves might diverge. When binlog_format is set to MIXED or ROW, the statement is not executed and an error message is written to the error log.
When inserting a new row, the default value for a column with an expression default can be inserted either by omitting the column name or by specifying the column as DEFAULT (just as for columns with literal defaults):
mysql> CREATE TABLE t4 (uid BINARY(16) DEFAULT (UUID_TO_BIN(UUID())));
mysql> INSERT INTO t4 () VALUES();
mysql> INSERT INTO t4 () VALUES(DEFAULT);
mysql> SELECT BIN_TO_UUID(uid) AS uid FROM t4;
+--------------------------------------+
| uid                                  |
+--------------------------------------+
| f1109174-94c9-11e8-971d-3bf1095aa633 |
| f110cf9a-94c9-11e8-971d-3bf1095aa633 |
+--------------------------------------+
However, the use of DEFAULT(col_name) to specify the default value for a named column is permitted only for columns that have a literal default value, not for columns that have an expression default value.
Not all storage engines permit expression default values. For those that do not, an ER_UNSUPPORTED_ACTION_ON_DEFAULT_VAL_GENERATED error occurs.
If a default value evaluates to a data type that differs from the declared column type, implicit coercion to the declared type occurs according to the usual MySQL type-conversion rules. See Section 12.2, “Type Conversion in Expression Evaluation”.
Prior to MySQL 8.0.13, with one exception, the default value specified in a DEFAULT clause must be a literal constant; it cannot be a function or an expression. This means, for example, that you cannot set the default for a date column to be the value of a function such as NOW() or CURRENT_DATE. The exception is that, for TIMESTAMP and DATETIME columns, you can specify CURRENT_TIMESTAMP as the default. See Section 11.3.4, “Automatic Initialization and Updating for TIMESTAMP and DATETIME”.
The BLOB, TEXT, GEOMETRY, and JSON data types cannot be assigned a default value.
If a default value evaluates to a data type that differs from the declared column type, implicit coercion to the declared type occurs according to the usual MySQL type-conversion rules. See Section 12.2, “Type Conversion in Expression Evaluation”.
If a data type specification includes no explicit DEFAULT value, MySQL determines the default value as follows:
If the column can take NULL as a value, the column is defined with an explicit DEFAULT NULL clause.
If the column cannot take NULL as a value, MySQL defines the column with no explicit DEFAULT clause. Exception: If the column is defined as part of a PRIMARY KEY but not explicitly as NOT NULL, MySQL creates it as a NOT NULL column (because PRIMARY KEY columns must be NOT NULL).
For data entry into a NOT NULL column that has no explicit DEFAULT clause, if an INSERT or REPLACE statement includes no value for the column, or an UPDATE statement sets the column to NULL, MySQL handles the column according to the SQL mode in effect at the time:
If strict SQL mode is enabled, an error occurs for transactional tables and the statement is rolled back. For nontransactional tables, an error occurs, but if this happens for the second or subsequent row of a multiple-row statement, the preceding rows will have been inserted.
If strict mode is not enabled, MySQL sets the column to the implicit default value for the column data type.
Suppose that a table t is defined as follows:
CREATE TABLE t (i INT NOT NULL);
In this case, i has no explicit default, so in strict mode each of the following statements produces an error and no row is inserted. When not using strict mode, only the third statement produces an error; the implicit default is inserted for the first two statements, but the third fails because DEFAULT(i) cannot produce a value:
INSERT INTO t VALUES();
INSERT INTO t VALUES(DEFAULT);
INSERT INTO t VALUES(DEFAULT(i));
See Section 5.1.11, “Server SQL Modes”.
For a given table, the SHOW CREATE TABLE statement displays which columns have an explicit DEFAULT clause.
Implicit defaults are defined as follows:
For numeric types, the default is 0, with the exception that for integer or floating-point types declared with the AUTO_INCREMENT attribute, the default is the next value in the sequence.
For date and time types other than TIMESTAMP, the default is the appropriate “zero” value for the type. This is also true for TIMESTAMP if the explicit_defaults_for_timestamp system variable is enabled (see Section 5.1.8, “Server System Variables”). Otherwise, for the first TIMESTAMP column in a table, the default value is the current date and time. See Section 11.3, “Date and Time Types”.
For string types other than ENUM, the default value is the empty string. For ENUM, the default is the first enumeration value.
The storage requirements for table data on disk depend on several factors. Different storage engines represent data types and store raw data differently. Table data might be compressed, either for a column or an entire row, complicating the calculation of storage requirements for a table or column.
Despite differences in storage layout on disk, the internal MySQL APIs that communicate and exchange information about table rows use a consistent data structure that applies across all storage engines.
This section includes guidelines and information for the storage requirements for each data type supported by MySQL, including the internal format and size for storage engines that use a fixed-size representation for data types. Information is listed by category or storage engine.
The internal representation of a table has a maximum row size of 65,535 bytes, even if the storage engine is capable of supporting larger rows. This figure excludes BLOB or TEXT columns, which contribute only 9 to 12 bytes toward this size. For BLOB and TEXT data, the information is stored internally in a different area of memory than the row buffer. Different storage engines handle the allocation and storage of this data in different ways, according to the method they use for handling the corresponding types. For more information, see Chapter 16, Alternative Storage Engines, and Section C.10.4, “Limits on Table Column Count and Row Size”.
See Section 15.10, “InnoDB Row Formats” for information about storage requirements for InnoDB tables.
NDB tables use 4-byte alignment; all NDB data storage is done in multiples of 4 bytes. Thus, a column value that would typically take 15 bytes requires 16 bytes in an NDB table. For example, in NDB tables, the TINYINT, SMALLINT, MEDIUMINT, and INTEGER (INT) column types each require 4 bytes storage per record due to the alignment factor.
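The alignment rule above can be sketched as a small rounding helper. This is only an illustration of the arithmetic, not NDB code; the function name `ndb_aligned_size` and the `nominal` dictionary are hypothetical.

```python
def ndb_aligned_size(nbytes: int, alignment: int = 4) -> int:
    """Round nbytes up to the next multiple of `alignment` (4 bytes for NDB)."""
    if nbytes < 0:
        raise ValueError("size must be non-negative")
    # Ceiling division without floating point, then scale back up.
    return -(-nbytes // alignment) * alignment


# Nominal sizes of the integer types in engines without alignment padding.
nominal = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4}

for name, size in nominal.items():
    print(name, ndb_aligned_size(size))  # each prints 4

print(ndb_aligned_size(15))  # prints 16
```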
Each BIT(M) column takes M bits of storage space. Although an individual BIT column is not 4-byte aligned, NDB reserves 4 bytes (32 bits) per row for the first 1-32 bits needed for BIT columns, then another 4 bytes for bits 33-64, and so on.
While a NULL itself does not require any storage space, NDB reserves 4 bytes per row if the table definition contains any columns allowing NULL, up to 32 NULL columns. (If an NDB Cluster table is defined with more than 32 NULL columns up to 64 NULL columns, then 8 bytes per row are reserved.)
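As a rough sketch of the two per-row reservations just described, the helpers below compute the bytes NDB sets aside for BIT columns and for nullable columns. Both function names are hypothetical, and the NULL helper deliberately refuses counts above 64 because the text does not state what happens beyond that.

```python
def ndb_bit_reserved_bytes(total_bits: int) -> int:
    """4 bytes for each started group of 32 bits across all BIT columns in a row."""
    if total_bits < 0:
        raise ValueError("bit count must be non-negative")
    return 4 * -(-total_bits // 32)


def ndb_null_reserved_bytes(nullable_columns: int) -> int:
    """Bytes reserved per row for NULL tracking, per the rule in the text."""
    if nullable_columns < 0:
        raise ValueError("column count must be non-negative")
    if nullable_columns == 0:
        return 0
    if nullable_columns <= 32:
        return 4
    if nullable_columns <= 64:
        return 8
    # The text only covers up to 64 nullable columns; do not extrapolate.
    raise ValueError("more than 64 nullable columns is not covered here")


print(ndb_bit_reserved_bytes(33))   # two 32-bit groups
print(ndb_null_reserved_bytes(40))  # falls in the 33-64 column bracket
```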
Every table using the NDB storage engine requires a primary key; if you do not define a primary key, a “hidden” primary key is created by NDB. This hidden primary key consumes 31-35 bytes per table record.
You can use the ndb_size.pl Perl script to estimate NDB storage requirements. It connects to a current MySQL (not NDB Cluster) database and creates a report on how much space that database would require if it used the NDB storage engine.
See Section 22.4.28, “ndb_size.pl — NDBCLUSTER Size Requirement Estimator” for
more information.
Data Type | Storage Required |
---|---|
TINYINT | 1 byte |
SMALLINT | 2 bytes |
MEDIUMINT | 3 bytes |
INT, INTEGER | 4 bytes |
BIGINT | 8 bytes |
FLOAT(p) | 4 bytes if 0 <= p <= 24, 8 bytes if 25 <= p <= 53 |
FLOAT | 4 bytes |
DOUBLE [PRECISION], REAL | 8 bytes |
DECIMAL(M,D), NUMERIC(M,D) | Varies; see following discussion |
BIT(M) | approximately (M+7)/8 bytes |
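The two parameterized rows of the table can be written out as a short sketch. The function names are hypothetical, and the BIT helper assumes integer division for the "approximately (M+7)/8" figure.

```python
def float_storage_bytes(p: int) -> int:
    """Bytes used by FLOAT(p): 4 for p in 0..24, 8 for p in 25..53."""
    if 0 <= p <= 24:
        return 4
    if 25 <= p <= 53:
        return 8
    raise ValueError("precision p must be between 0 and 53")


def bit_storage_bytes(m: int) -> int:
    """Approximate bytes used by BIT(M), with M from 1 to 64."""
    if not 1 <= m <= 64:
        raise ValueError("M must be between 1 and 64")
    return (m + 7) // 8  # assumes the division in the table is integer division


print(float_storage_bytes(24), float_storage_bytes(25))
print(bit_storage_bytes(1), bit_storage_bytes(9), bit_storage_bytes(64))
```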
Values for DECIMAL (and NUMERIC) columns are represented using a binary format that packs nine decimal (base 10) digits into four bytes. Storage for the integer and fractional parts of each value is determined separately. Each multiple of nine digits requires four bytes, and the “leftover” digits require some fraction of four bytes. The storage required for excess digits is given by the following table.
Leftover Digits | Number of Bytes |
---|---|
0 | 0 |
1 | 1 |
2 | 1 |
3 | 2 |
4 | 2 |
5 | 3 |
6 | 3 |
7 | 4 |
8 | 4 |
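The rule and the leftover-digit table can be combined into a minimal sketch that computes the storage for a DECIMAL(M,D) column. The function names are hypothetical, and the argument check only enforces 0 <= D <= M, which is a simplification of the server's actual limits.

```python
# Bytes needed for 0..8 leftover digits, copied from the table above.
LEFTOVER_BYTES = [0, 1, 1, 2, 2, 3, 3, 4, 4]


def digits_storage_bytes(digits: int) -> int:
    """Bytes for one side (integer or fractional) of a DECIMAL value."""
    full_groups, leftover = divmod(digits, 9)
    return 4 * full_groups + LEFTOVER_BYTES[leftover]


def decimal_storage_bytes(m: int, d: int) -> int:
    """Bytes for DECIMAL(M,D): integer and fractional parts are sized separately."""
    if m < 1 or not 0 <= d <= m:
        raise ValueError("expected M >= 1 and 0 <= D <= M")
    return digits_storage_bytes(m - d) + digits_storage_bytes(d)


print(decimal_storage_bytes(18, 9))  # 9 + 9 digits: 4 + 4 bytes
print(decimal_storage_bytes(20, 6))  # 14 + 6 digits: (4 + 3) + 3 bytes
```

For DECIMAL(20,6), the 14 integer digits split into one group of nine (four bytes) plus five leftover digits (three bytes), and the six fractional digits need three bytes, for ten bytes in total.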
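For TIME, DATETIME, and TIMESTAMP columns, the storage required for tables created before MySQL 5.6.4 differs from that for tables created from 5.6.4 on. This is due to a change in 5.6.4 that permits these types to have a fractional part, which requires from 0 to 3 bytes.
Data Type | Storage Required Before MySQL 5.6.4 | Storage Required as of MySQL 5.6.4 |
---|---|---|
YEAR | 1 byte | 1 byte |
DATE | 3 bytes | 3 bytes |
TIME | 3 bytes | 3 bytes + fractional seconds storage |
DATETIME | 8 bytes | 5 bytes + fractional seconds storage |
TIMESTAMP | 4 bytes | 4 bytes + fractional seconds storage |
As of MySQL 5.6.4, storage for YEAR and DATE remains unchanged. However, TIME, DATETIME, and TIMESTAMP are represented differently. DATETIME is packed more efficiently, requiring 5 rather than 8 bytes for the nonfractional part, and all three types have a fractional part that requires from 0 to 3 bytes, depending on the fractional seconds precision of stored values.
Fractional Seconds Precision | Storage Required |
---|---|
0 | 0 bytes |
1, 2 | 1 byte |
3, 4 | 2 bytes |
5, 6 | 3 bytes |
For example, TIME(0), TIME(2), TIME(4), and TIME(6) use 3, 4, 5, and 6 bytes, respectively. TIME and TIME(0) are equivalent and require the same storage.
For details about the internal representation of temporal values, see MySQL Internals: Important Algorithms and Structures.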
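The two tables above can be combined into a short sketch that returns the storage for a temporal column as of MySQL 5.6.4. The names `BASE_BYTES`, `fractional_bytes`, and `temporal_storage_bytes` are hypothetical.

```python
# Nonfractional storage as of MySQL 5.6.4, from the first table.
BASE_BYTES = {"YEAR": 1, "DATE": 3, "TIME": 3, "DATETIME": 5, "TIMESTAMP": 4}


def fractional_bytes(fsp: int) -> int:
    """Bytes for fractional seconds precision 0..6 (0, 1, 1, 2, 2, 3, 3)."""
    if not 0 <= fsp <= 6:
        raise ValueError("fractional seconds precision must be 0 to 6")
    return (fsp + 1) // 2


def temporal_storage_bytes(type_name: str, fsp: int = 0) -> int:
    """Total bytes for a temporal column; YEAR and DATE take no fractional part."""
    name = type_name.upper()
    base = BASE_BYTES[name]
    if name in ("YEAR", "DATE"):
        if fsp != 0:
            raise ValueError(f"{name} has no fractional seconds part")
        return base
    return base + fractional_bytes(fsp)


print([temporal_storage_bytes("TIME", fsp) for fsp in (0, 2, 4, 6)])  # [3, 4, 5, 6]
```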
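In the following table, M represents the declared column length in characters for nonbinary string types and bytes for binary string types. L represents the actual length in bytes of a given string value.
Data Type | Storage Required |
---|---|
CHAR(M) | The compact family of InnoDB row formats optimizes storage for variable-length character sets. See COMPACT Row Format Storage Characteristics. Otherwise, M × w bytes, 0 <= M <= 255, where w is the number of bytes required for the maximum-length character in the character set. |
BINARY(M) | M bytes, 0 <= M <= 255 |
VARCHAR(M), VARBINARY(M) | L + 1 bytes if column values require 0 to 255 bytes, L + 2 bytes if values may require more than 255 bytes |
TINYBLOB, TINYTEXT | L + 1 bytes, where L < 2^8 |
BLOB, TEXT | L + 2 bytes, where L < 2^16 |
MEDIUMBLOB, MEDIUMTEXT | L + 3 bytes, where L < 2^24 |
LONGBLOB, LONGTEXT | L + 4 bytes, where L < 2^32 |
ENUM('value1','value2',...) | 1 or 2 bytes, depending on the number of enumeration values (65,535 values maximum) |
SET('value1','value2',...) | 1, 2, 3, 4, or 8 bytes, depending on the number of set members (64 members maximum) |
Variable-length string types are stored using a length prefix plus data. The length prefix requires from one to four bytes depending on the data type, and the value of the prefix is L (the byte length of the string). For example, storage for a MEDIUMTEXT value requires L bytes to store the value plus three bytes to store the length of the value.
To calculate the number of bytes used to store a particular CHAR, VARCHAR, or TEXT column value, you must take into account the character set used for that column and whether the value contains multibyte characters. In particular, when using a utf8 Unicode character set, you must keep in mind that not all characters use the same number of bytes. The utf8mb3 and utf8mb4 character sets can require up to three and four bytes per character, respectively. For a breakdown of the storage used for different categories of utf8mb3 or utf8mb4 characters, see Section 10.9, “Unicode Support”.
The VARCHAR, VARBINARY, and the BLOB and TEXT types are variable-length types. For each, the storage requirements depend on these factors:
The actual length of the column value
The column's maximum possible length
The character set used for the column, because some character sets contain multibyte characters
For example, a VARCHAR(255) column can hold a string with a maximum length of 255 characters. Assuming that the column uses the latin1 character set (one byte per character), the actual storage required is the length of the string (L), plus one byte to record the length of the string. For the string 'abcd', L is 4 and the storage requirement is five bytes. If the same column is instead declared to use the ucs2 double-byte character set, the storage requirement is 10 bytes: the length of 'abcd' is eight bytes, and the column requires two bytes to store lengths because the maximum length is greater than 255 (up to 510 bytes).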
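The 'abcd' example can be reproduced with a small sketch. The helper `varchar_storage_bytes` is hypothetical, and it relies on two stand-ins: Python's `latin-1` codec for MySQL's latin1, and `utf-16-be` for ucs2 (Python has no ucs2 codec; the two agree for characters in the Basic Multilingual Plane, which covers 'abcd').

```python
def varchar_storage_bytes(value: str, declared_chars: int,
                          encoding: str, max_bytes_per_char: int) -> int:
    """Bytes for one VARCHAR value: data length L plus a 1- or 2-byte prefix."""
    if len(value) > declared_chars:
        raise ValueError("value is longer than the declared column length")
    data_length = len(value.encode(encoding))  # this is L, in bytes
    # The prefix size depends on the column's maximum possible byte length,
    # not on the length of this particular value.
    prefix = 1 if declared_chars * max_bytes_per_char <= 255 else 2
    return data_length + prefix


# VARCHAR(255) with a single-byte character set: 4 data bytes + 1 prefix byte.
print(varchar_storage_bytes("abcd", 255, "latin-1", 1))
# Same column with a double-byte character set: 8 data bytes + 2 prefix bytes.
print(varchar_storage_bytes("abcd", 255, "utf-16-be", 2))
```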
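The effective maximum number of bytes that can be stored in a VARCHAR or VARBINARY column is subject to the maximum row size of 65,535 bytes, which is shared among all columns. For a VARCHAR column that stores multibyte characters, the effective maximum number of characters is less. For example, utf8mb4 characters can require up to four bytes per character, so a VARCHAR column that uses the utf8mb4 character set can be declared to be a maximum of 16,383 characters. See Section C.10.4, “Limits on Table Column Count and Row Size”.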
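The 16,383 figure follows from simple division. The sketch below assumes the VARCHAR column is the only column in the row and accounts only for its two-byte length prefix; any other per-row overhead is ignored, and the function name is hypothetical.

```python
MAX_ROW_BYTES = 65535  # maximum row size shared by all columns


def max_varchar_chars(max_bytes_per_char: int, length_prefix_bytes: int = 2) -> int:
    """Largest declarable VARCHAR length for a row holding only that column."""
    if max_bytes_per_char < 1:
        raise ValueError("a character needs at least one byte")
    return (MAX_ROW_BYTES - length_prefix_bytes) // max_bytes_per_char


print(max_varchar_chars(4))  # utf8mb4, up to four bytes per character
```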
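InnoDB encodes fixed-length fields greater than or equal to 768 bytes in length as variable-length fields, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.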
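A quick check of the 768-byte threshold, as a sketch only: the function name is hypothetical and the test simply multiplies the declared length by the character set's maximum bytes per character.

```python
OFF_PAGE_THRESHOLD = 768  # bytes, from the InnoDB rule above


def innodb_treats_as_variable_length(m: int, max_bytes_per_char: int) -> bool:
    """True if a CHAR(m) column can reach the 768-byte threshold."""
    return m * max_bytes_per_char >= OFF_PAGE_THRESHOLD


print(innodb_treats_as_variable_length(255, 4))  # utf8mb4: 1020 bytes
print(innodb_treats_as_variable_length(255, 3))  # utf8mb3: 765 bytes
```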
The NDB storage engine supports variable-width columns. This means that a VARCHAR column in an NDB Cluster table requires the same amount of storage as it would in any other storage engine, with the exception that such values are 4-byte aligned. Thus, the string 'abcd' stored in a VARCHAR(50) column using the latin1 character set requires 8 bytes (rather than 5 bytes for the same column value in a MyISAM table).
TEXT and BLOB columns are implemented differently in NDB; each row in a TEXT column is made up of two separate parts. One of these is of fixed size (256 bytes), and is actually stored in the original table. The other consists of any data in excess of 256 bytes, which is stored in a hidden table. The rows in this second table are always 2000 bytes long. This means that the size of a TEXT column is 256 if size <= 256 (where size represents the size of the row); otherwise, the size is 256 + size + (2000 × (size - 256) % 2000).
The size of an ENUM object is determined by the number of different enumeration values. One byte is used for enumerations with up to 255 possible values. Two bytes are used for enumerations having between 256 and 65,535 possible values. See Section 11.4.4, “The ENUM Type”.
The size of a SET object is determined by the number of different set members. If the set size is N, the object occupies (N+7)/8 bytes, rounded up to 1, 2, 3, 4, or 8 bytes. A SET can have a maximum of 64 members. See Section 11.4.5, “The SET Type”.
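The ENUM and SET sizing rules can be sketched as follows. The function names are hypothetical, and (N+7)/8 is taken to be integer division.

```python
def enum_storage_bytes(n_values: int) -> int:
    """1 byte for up to 255 enumeration values, 2 bytes for 256 to 65,535."""
    if not 1 <= n_values <= 65535:
        raise ValueError("ENUM supports 1 to 65,535 values")
    return 1 if n_values <= 255 else 2


def set_storage_bytes(n_members: int) -> int:
    """(N+7)/8 bytes, rounded up to the nearest of 1, 2, 3, 4, or 8."""
    if not 1 <= n_members <= 64:
        raise ValueError("SET supports 1 to 64 members")
    raw = (n_members + 7) // 8
    # raw is at most 8 here, so one of the allowed sizes always matches.
    return next(size for size in (1, 2, 3, 4, 8) if raw <= size)


print(enum_storage_bytes(255), enum_storage_bytes(256))
print(set_storage_bytes(8), set_storage_bytes(33), set_storage_bytes(64))
```

A set with 33 members needs five bytes of bits, which is rounded up to the next allowed size of eight bytes.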
MySQL stores geometry values using 4 bytes to indicate the SRID followed by the WKB representation of the value. The LENGTH() function returns the space in bytes required for value storage.
For descriptions of WKB and internal storage formats for spatial values, see Section 11.5.3, “Supported Spatial Data Formats”.
In general, the storage requirement for a JSON column is approximately the same as for a LONGBLOB or LONGTEXT column; that is, the space consumed by a JSON document is roughly the same as it would be for the document's string representation stored in a column of one of these types. However, there is an overhead imposed by the binary encoding, including metadata and dictionaries needed for lookup, of the individual values stored in the JSON document. For example, a string stored in a JSON document requires 4 to 10 bytes additional storage, depending on the length of the string and the size of the object or array in which it is stored.
In addition, MySQL imposes a limit on the size of any JSON document stored in a JSON column such that it cannot be any larger than the value of max_allowed_packet.
For optimum storage, you should try to use the most precise type in all cases. For example, if an integer column is used for values in the range from 1 to 99999, MEDIUMINT UNSIGNED is the best type. Of the types that represent all the required values, this type uses the least amount of storage.
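One way to pick the smallest integer type for a known value range is sketched below. The helper is hypothetical; it walks the integer types from smallest to largest and, as an assumption that matches the example above, prefers the UNSIGNED variant whenever the range is non-negative.

```python
# (type name, storage in bytes), smallest first.
INTEGER_TYPES = [("TINYINT", 1), ("SMALLINT", 2), ("MEDIUMINT", 3),
                 ("INT", 4), ("BIGINT", 8)]


def smallest_integer_type(lo: int, hi: int) -> str:
    """Return the narrowest integer type that can hold every value in [lo, hi]."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    for name, nbytes in INTEGER_TYPES:
        bits = 8 * nbytes
        if lo >= 0 and hi <= 2**bits - 1:
            return f"{name} UNSIGNED"
        if -(2**(bits - 1)) <= lo and hi <= 2**(bits - 1) - 1:
            return name
    raise ValueError("range does not fit in any integer type")


print(smallest_integer_type(1, 99999))  # MEDIUMINT UNSIGNED
```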
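All basic calculations (+, -, *, and /) with DECIMAL columns are done with precision of 65 decimal (base 10) digits. See Section 11.1.1, “Numeric Type Overview”.
If accuracy is not too important or if speed is the highest priority, the DOUBLE type may be good enough. For high precision, you can always convert to a fixed-point type stored in a BIGINT. This enables you to do all calculations with 64-bit integers and then convert results back to floating-point values as necessary.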
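The fixed-point-in-BIGINT idea can be illustrated on the application side. This is a minimal sketch under stated assumptions: the scale of four decimal places (`SCALE`) and the helper names are hypothetical choices, and the range check mirrors the signed 64-bit range of BIGINT.

```python
from decimal import Decimal

SCALE = 10_000  # hypothetical choice: keep four decimal places
BIGINT_MIN, BIGINT_MAX = -(2**63), 2**63 - 1


def to_fixed(text: str) -> int:
    """Convert a decimal string to a scaled integer suitable for a BIGINT column."""
    scaled = int((Decimal(text) * SCALE).to_integral_value())
    if not BIGINT_MIN <= scaled <= BIGINT_MAX:
        raise OverflowError("value does not fit in a signed 64-bit integer")
    return scaled


def from_fixed(value: int) -> float:
    """Convert a scaled integer back to a floating-point value."""
    return value / SCALE


# Integer arithmetic on scaled values is exact ...
print(to_fixed("0.1") + to_fixed("0.2") == to_fixed("0.3"))  # True
# ... whereas the same sum in binary floating point is not.
print(0.1 + 0.2 == 0.3)  # False
```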
To facilitate the use of code written for SQL implementations from other vendors, MySQL maps data types as shown in the following table. These mappings make it easier to import table definitions from other database systems into MySQL.
Other Vendor Type | MySQL Type |
---|---|
BOOL | TINYINT |
BOOLEAN | TINYINT |
CHARACTER VARYING(M) | VARCHAR(M) |
FIXED | DECIMAL |
FLOAT4 | FLOAT |
FLOAT8 | DOUBLE |
INT1 | TINYINT |
INT2 | SMALLINT |
INT3 | MEDIUMINT |
INT4 | INT |
INT8 | BIGINT |
LONG VARBINARY | MEDIUMBLOB |
LONG VARCHAR | MEDIUMTEXT |
LONG | MEDIUMTEXT |
MIDDLEINT | MEDIUMINT |
NUMERIC | DECIMAL |
Data type mapping occurs at table creation time, after which the original type specifications are discarded. If you create a table with types used by other vendors and then issue a DESCRIBE tbl_name statement, MySQL reports the table structure using the equivalent MySQL types. For example:
mysql> CREATE TABLE t (a BOOL, b FLOAT8, c LONG VARCHAR, d NUMERIC);
Query OK, 0 rows affected (0.00 sec)

mysql> DESCRIBE t;
+-------+---------------+------+-----+---------+-------+
| Field | Type          | Null | Key | Default | Extra |
+-------+---------------+------+-----+---------+-------+
| a     | tinyint(1)    | YES  |     | NULL    |       |
| b     | double        | YES  |     | NULL    |       |
| c     | mediumtext    | YES  |     | NULL    |       |
| d     | decimal(10,0) | YES  |     | NULL    |       |
+-------+---------------+------+-----+---------+-------+
4 rows in set (0.01 sec)