モチベーション

MySQLでのGeometry型を扱う知見がないので調査する。また空間インデックスによる検索のパフォーマンスを確認する。

準備

MySQL Serverを準備する。

> docker version
Client:
 Version:      17.03.0-ce
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   60ccb22
 Built:        Thu Feb 23 10:40:59 2017
 OS/Arch:      darwin/amd64

Server:
 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   3a232c8
 Built:        Tue Feb 28 07:52:04 2017
 OS/Arch:      linux/amd64
 Experimental: true

> docker pull mysql/mysql-server
> docker run --name db -e MYSQL_ROOT_PASSWORD=pass -d mysql/mysql-server:5.7
> docker container ls
CONTAINER ID        IMAGE                    COMMAND                  CREATED             STATUS              PORTS                 NAMES
d3bcb42ada91        mysql/mysql-server:5.7   "/entrypoint.sh my..."    hours ago         Up 8 hours          3306/tcp, 33060/tcp   db

> docker exec -it db  mysql -uroot -ppass
> show global variables like '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 5.7.17                       |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| tls_version             | TLSv1,TLSv1.1,TLSv1.2        |
| version                 | 5.7.17                       |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Linux                        |
+-------------------------+------------------------------+

以降、本記事でコマンドが記載されている場合はここで用意したMySQL Server内で実行したコマンドとする。bashで実行するコマンドはbash >のようになっている。

テーブルの作成とGeometry型データの格納

geo_testというデータベースを作る。

> create database geo_test;
> use geo_test;

エンジンをMyISAMとしたgeo_myisamテーブルを作る。geoカラムを今回検証したいGeometory型とする。またgeoカラムを使い空間インデックスを作成する。[1]

> CREATE TABLE geo_myisam (
 id INT NOT NULL,
 name VARCHAR(255) NOT NULL,
 geo GEOMETRY NOT NULL,
 PRIMARY KEY (id),
 SPATIAL KEY (geo)
)ENGINE=MyISAM;

GeometryオブジェクトのWKT表現を任意の型のGeometryデータに変換するにはGeomFromTextを使う。ここでは、Point型のデータをgeoカラムに格納する。[2]

> INSERT INTO geo_myisam (id, name,geo) VALUES \
(1, 'point0_0', GeomFromText('POINT(0 0)')), \
(2, 'point0_0.5', GeomFromText('POINT(0 0.5)')), \
(3, 'point0.5_0', GeomFromText('POINT(0.5 0)')), \
(4, 'point0_-0.5', GeomFromText('POINT(0 -0.5)')), \
(5, 'point-0.5_0', GeomFromText('POINT(-0.5 0)'));

geoカラムは普通にSELECTしても読めない。

mysql> select * from geo_myisam;
+----+-------------+---------------------------+
| id | name        | geo                       |
+----+-------------+---------------------------+
|  1 | point0_0    |                           |
|  2 | point0_0.5  |                           |
|  3 | point0.5_0  |                           |
|  4 | point0_-0.5 |                           |
|  5 | point-0.5_0 |                           |
+----+-------------+---------------------------+

マニュアルによると、

Internally, MySQL stores geometry values in a format that is not identical to either WKT or WKB format.

らしくWKTでもWKBでもないフォーマットのGeometryデータが格納されているとのこと。geoカラムの中身をWKTで確認する場合はST_AsTextを使う。[3]

mysql> select ST_AsText(geo)  from geo_myisam;
+---------------+
| ST_AsText(geo)|
+---------------+
| POINT(0 0)    |
| POINT(0 0.5)  |
| POINT(0.5 0)  |
| POINT(0 -0.5) |
| POINT(-0.5 0) |
+---------------+

Geometryデータの検索

MBRCONTAINSによりLineStringで指定した2点の最小外接矩形に含まれる全ての点を検索できる。

mysql> SELECT id, name, ST_AsText(geo) AS latlon from geo_myisam WHERE MBRCONTAINS(GeomFromText('LineString(-1 -1, 1 1)'), geo);
+----+-------------+---------------+
| id | name        | latlon        |
+----+-------------+---------------+
|  1 | point0_0    | POINT(0 0)    |
|  2 | point0_0.5  | POINT(0 0.5)  |
|  3 | point0.5_0  | POINT(0.5 0)  |
|  4 | point0_-0.5 | POINT(0 -0.5) |
|  5 | point-0.5_0 | POINT(-0.5 0) |
+----+-------------+---------------+

MBRsはMinimum Bounding Rectanglesの略である。関連する関数は“MySQL-Specific Spatial Relation Functions That Use Minimum Bounding Rectangles (MBRs)”を参照のこと。

LineStringで表現される2点から成る線上に存在する点を検索する場合はST_CONTAINSを使う。先に格納されたデータにて線上に含まれるのはPOINT(0 0)のみなので探索結果もそのようになる。

mysql> SELECT id, name, ST_AsText(geo) AS latlon from geo_myisam WHERE ST_CONTAINS(GeomFromText('LineString(-1 -1, 1 1)'), geo);
+----+----------+------------+
| id | name     | latlon     |
+----+----------+------------+
|  1 | point0_0 | POINT(0 0) |
+----+----------+------------+

試しにLineString(-1 -1, 1 1)で表現される線上に乗るGeometryデータを追加する。

mysql> INSERT INTO geo_myisam (id, name,geo) VALUES \
(6, 'point0.1_0.1', GeomFromText('POINT(0.1 0.1)')), \
(7, 'point-0.1_-0.1', GeomFromText('POINT(-0.1 -0.1)'));

再度検索してみると追加したデータが検索されることが確認できる。

mysql> SELECT id, name, ST_AsText(geo) AS latlon from geo_myisam WHERE ST_CONTAINS(GeomFromText('LineString(-1 -1, 1 1)'), geo);
+----+----------------+------------------+
| id | name           | latlon           |
+----+----------------+------------------+
|  1 | point0_0       | POINT(0 0)       |
|  6 | point0.1_0.1   | POINT(0.1 0.1)   |
|  7 | point-0.1_-0.1 | POINT(-0.1 -0.1) |
+----+----------------+------------------+

また、MBRCONTAINSによりLineStringで指定した2点の最小外接矩形に含まれる全ての点を検索すると追加したデータがヒットする。

mysql> SELECT id, name, ST_AsText(geo) AS latlon from geo_myisam WHERE MBRCONTAINS(GeomFromText('POLYGON((-1 -1,-1 1,1 1, -1 1,-1 -1))'), geo);
+----+----------------+------------------+
| id | name           | latlon           |
+----+----------------+------------------+
|  1 | point0_0       | POINT(0 0)       |
|  2 | point0_0.5     | POINT(0 0.5)     |
|  3 | point0.5_0     | POINT(0.5 0)     |
|  4 | point0_-0.5    | POINT(0 -0.5)    |
|  5 | point-0.5_0    | POINT(-0.5 0)    |
|  6 | point0.1_0.1   | POINT(0.1 0.1)   |
|  7 | point-0.1_-0.1 | POINT(-0.1 -0.1) |
+----+----------------+------------------+

空間インデックスの効果を確認する

EXPAINでインデックスが使われているか確認にする。

mysql> explain SELECT id, name, ST_AsText(geo) AS latlon from geo_myisam WHERE ST_CONTAINS(GeomFromText('LineString(-1 -1, 1 1)'), geo) \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: geo_myisam
   partitions: NULL
         type: ALL
possible_keys: geo
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 7
     filtered: 100.00
        Extra: Using where
1 row in set, 3 warnings (0.00 sec)

データが少なすぎたのか実行した検索クエリでインデックスが使われていない。インデックスが使われる程度にデータを追加する。

bash > for id in {10..10000}; do mysql -u root -ppass -e "INSERT INTO geo_test.geo_myisam (id, name, geo) VALUES (${id},CONCAT('point_',${id}), GeomFromText(CONCAT('POINT(',RAND() * 1 - 0.5,' ',RAND() * 1 - 0.5,')')))"; done;

データが追加されていることを確認する。

mysql> select id,name, ST_AsText(geo) AS latlon from geo_myisam LIMIT 20;
+----+----------------+--------------------------------------------------+
| id | name           | latlon                                           |
+----+----------------+--------------------------------------------------+
|  1 | point0_0       | POINT(0 0)                                       |
|  2 | point0_0.5     | POINT(0 0.5)                                     |
|  3 | point0.5_0     | POINT(0.5 0)                                     |
|  4 | point0_-0.5    | POINT(0 -0.5)                                    |
|  5 | point-0.5_0    | POINT(-0.5 0)                                    |
|  6 | point0.1_0.1   | POINT(0.1 0.1)                                   |
|  7 | point-0.1_-0.1 | POINT(-0.1 -0.1)                                 |
| 10 | point_10       | POINT(-0.18279335432014743 0.4261625166294747)   |
| 11 | point_11       | POINT(0.44168784929596616 -0.1697923659065667)   |
| 12 | point_12       | POINT(-0.4310579401730168 -0.37896196905426865)  |
| 13 | point_13       | POINT(0.3317695174662112 0.3630547280079264)     |
| 14 | point_14       | POINT(-0.23585583990053816 -0.29935303870528285) |
| 15 | point_15       | POINT(-0.36246499499535656 0.16255805423721492)  |
| 16 | point_16       | POINT(-0.2926347384160708 0.459336646794655)     |
| 17 | point_17       | POINT(0.4363440255972967 -0.44250360405305733)   |
| 18 | point_18       | POINT(-0.1282529082412393 0.1579594292286406)    |
| 19 | point_19       | POINT(-0.13817387971856993 -0.13420469745453883) |
| 20 | point_20       | POINT(-0.49398796446061505 -0.39641455551275473) |
| 21 | point_21       | POINT(-0.24329542437875218 0.16902874193063822)  |
| 22 | point_22       | POINT(0.07760948275924617 -0.21712540063739327)  |
+----+----------------+--------------------------------------------------+

空間インデックスの使用の有無での検索結果を再度比較する。

mysql> SELECT SQL_NO_CACHE id, ST_AsText(geo) AS latlon from geo_myisam  WHERE MBRCONTAINS(GeomFromText('LineString(-0.01 -0.01, 0.01 0.01)'),geo);
+-------+----------------------------------------------------+
| id    | latlon                                             |
+-------+----------------------------------------------------+
|     1 | POINT(0 0)                                         |
| 36313 | POINT(0.006687398074834983 -0.0001260358841401743) |
| 42705 | POINT(0.006265008827918195 -0.007921339951382333)  |
|    79 | POINT(0.009382635829395247 -0.007089380647139043)  |
| 12576 | POINT(-0.009696562783509999 0.006208053795814572)  |
+-------+----------------------------------------------------+
5 rows in set, 1 warning (0.00 sec)

mysql> EXPLAIN SELECT SQL_NO_CACHE id, ST_AsText(geo) AS latlon from geo_myisam  WHERE MBRCONTAINS(GeomFromText('LineString(-0.01 -0.01, 0.01 0.01)'),geo);
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | geo_myisam | NULL       | range | geo           | geo  | 34      | NULL |    5 |   100.00 | Using where |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+

mysql> SELECT SQL_NO_CACHE id, ST_AsText(geo) AS latlon from geo_myisam  IGNORE INDEX (geo) WHERE MBRCONTAINS(GeomFromText('LineString(-0.01 -0.01, 0.01 0.01)'),geo);
+-------+----------------------------------------------------+
| id    | latlon                                             |
+-------+----------------------------------------------------+
|     1 | POINT(0 0)                                         |
|    79 | POINT(0.009382635829395247 -0.007089380647139043)  |
| 12576 | POINT(-0.009696562783509999 0.006208053795814572)  |
| 36313 | POINT(0.006687398074834983 -0.0001260358841401743) |
| 42705 | POINT(0.006265008827918195 -0.007921339951382333)  |
+-------+----------------------------------------------------+
5 rows in set, 1 warning (0.08 sec)

mysql> EXPLAIN SELECT SQL_NO_CACHE id, ST_AsText(geo) AS latlon from geo_myisam  IGNORE INDEX (geo) WHERE MBRCONTAINS(GeomFromText('LineString(-0.01 -0.01, 0.01 0.01)'),geo);
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | geo_myisam | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9998 |   100.00 | Using where |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------------+

EXPLAINの結果よりインデックスの有無で検索する行数が大きく変わることがわかる。また、SELECTの実行時間もインデックスが無い場合はデータが少なかったときと比較して増加している。

さらにデータを増やしてみる

データ量によるパフォーマンスの変化を確認するためさらにデータを追加してSELECTの実行時間を確認する。

bash > docker exec -it db /bin/bash
bash > for id in {10001..5000000}; do mysql -u root -ppass -e "INSERT INTO geo_test.geo_myisam (id, name, geo) VALUES (${id},CONCAT('point_',${id}), GeomFromText(CONCAT('POINT(',RAND() * 4 - 1 * RAND(),' ',RAND() * 4 - 1 * RAND(),')')))"; done
mysql> SELECT id, ST_AsText(geo) from geo_myisam ORDER BY id DESC LIMIT 10;
+---------+------------------------------------------------+
| id      | ST_AsText(geo)                                 |
+---------+------------------------------------------------+
| 5000000 | POINT(0.47017629022726437 2.147326412747918)   |
| 4999999 | POINT(-0.5538703264238968 -0.5056929183245571) |
| 4999998 | POINT(0.2547422817486732 1.4077781423998794)   |
| 4999997 | POINT(-0.0562211629526887 2.4425885942229897)  |
| 4999996 | POINT(0.4893022221413461 0.5680967472196525)   |
| 4999995 | POINT(0.16882901188808397 1.3936923569009567)  |
| 4999994 | POINT(2.1195325303073345 0.4331634430523622)   |
| 4999993 | POINT(1.8108628008615812 1.5184356304988595)   |
| 4999992 | POINT(3.312845408276511 0.6860466745552111)    |
| 4999991 | POINT(-0.0809603334227208 1.8983265924251886)  |
+---------+------------------------------------------------+

mysql> SELECT SQL_NO_CACHE id, ST_AsText(geo) AS latlon from geo_myisam  WHERE MBRCONTAINS(GeomFromText('LineString(-0.01 -0.01, 0.01 0.01)'),geo);
+-------+----------------------------------------------------+
| id    | latlon                                             |
+-------+----------------------------------------------------+
|     1 | POINT(0 0)                                         |
| 36313 | POINT(0.006687398074834983 -0.0001260358841401743) |
| 42705 | POINT(0.006265008827918195 -0.007921339951382333)  |
|    79 | POINT(0.009382635829395247 -0.007089380647139043)  |
| 12576 | POINT(-0.009696562783509999 0.006208053795814572)  |
+-------+----------------------------------------------------+
5 rows in set, 1 warning (0.00 sec)

mysql> EXPLAIN SELECT SQL_NO_CACHE id, ST_AsText(geo) AS latlon from geo_myisam  WHERE MBRCONTAINS(GeomFromText('LineString(-0.01 -0.01, 0.01 0.01)'),geo);
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | geo_myisam | NULL       | range | geo           | geo  | 34      | NULL |   47 |   100.00 | Using where |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-------------+


mysql> SELECT SQL_NO_CACHE id, ST_AsText(geo) AS latlon from geo_myisam  IGNORE INDEX (geo) WHERE MBRCONTAINS(GeomFromText('LineString(-0.01 -0.01, 0.01 0.01)'),geo);
+-------+----------------------------------------------------+
| id    | latlon                                             |
+-------+----------------------------------------------------+
|     1 | POINT(0 0)                                         |
|    79 | POINT(0.009382635829395247 -0.007089380647139043)  |
| 12576 | POINT(-0.009696562783509999 0.006208053795814572)  |
| 36313 | POINT(0.006687398074834983 -0.0001260358841401743) |
| 42705 | POINT(0.006265008827918195 -0.007921339951382333)  |
+-------+----------------------------------------------------+
5 rows in set, 1 warning (0.29 sec)

mysql> EXPLAIN SELECT SQL_NO_CACHE id, ST_AsText(geo) AS latlon from geo_myisam  IGNORE INDEX (geo) WHERE MBRCONTAINS(GeomFromText('LineString(-0.01 -0.01, 0.01 0.01)'),geo);
+----+-------------+------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | geo_myisam | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 1332446 |   100.00 | Using where |
+----+-------------+------------+------------+------+---------------+------+---------+------+---------+----------+-------------+

ORDER BYを入れてみる。

mysql> SELECT SQL_NO_CACHE id, name, ST_AsText(geo) AS latlon from geo_myisam WHERE MBRContains(GeomFromText('LineString(-0.1 -0.1,0.1 0.1)'), geo) ORDER BY geo DESC, id LIMIT 20;
+---------+---------------+---------------------------------------------------+
| id      | name          | latlon                                            |
+---------+---------------+---------------------------------------------------+
| 4172904 | point_4172904 | POINT(-0.032585136622735426 -0.07208580530442832) |
| 4856199 | point_4856199 | POINT(0.05438994062541979 -0.005999922757968235)  |
| 4304428 | point_4304428 | POINT(-0.032905573055991504 -0.07362254529597478) |
|   36488 | point_36488   | POINT(0.056770320103289855 -0.08607608553588028)  |
| 4960723 | point_4960723 | POINT(0.053667048973652584 -0.017522974887418497) |
| 4955535 | point_4955535 | POINT(0.052627551418382254 -0.04060934767183788)  |
| 4458984 | point_4458984 | POINT(-0.03358457892535681 -0.08208306979582003)  |
| 4977055 | point_4977055 | POINT(0.054517726464679214 0.0018763961287926456) |
| 4983445 | point_4983445 | POINT(0.05492399731178209 0.01108215657163611)    |
| 4938992 | point_4938992 | POINT(-0.0342750381997554 -0.07715336519959692)   |
| 4594336 | point_4594336 | POINT(0.05474851658078704 -0.009088192143559604)  |
| 4300896 | point_4300896 | POINT(-0.03313062343106663 -0.07872167795777485)  |
| 4597770 | point_4597770 | POINT(0.05350491968309965 -0.03630340661509279)   |
| 4858507 | point_4858507 | POINT(0.05527111799947091 0.013482706633846014)   |
| 4799538 | point_4799538 | POINT(-0.03375623378321178 -0.07158413163533828)  |
| 1226113 | point_1226113 | POINT(-0.03225133198523087 -0.0720902383998876)   |
| 1045225 | point_1045225 | POINT(-0.031585659861178755 -0.06502638018226847) |
| 4785297 | point_4785297 | POINT(0.05354496189723254 -0.027560937244036188)  |
| 4784713 | point_4784713 | POINT(0.05591007979205817 0.024447181284844112)   |
| 4992666 | point_4992666 | POINT(0.05489537218110205 0.01083885134275897)    |
+---------+---------------+---------------------------------------------------+
20 rows in set, 1 warning (0.01 sec)

mysql> EXPLAIN SELECT SQL_NO_CACHE id, name, ST_AsText(geo) AS latlon from geo_myisam WHERE MBRContains(GeomFromText('LineString(-0.1 -0.1,0.1 0.1)'), geo) ORDER BY geo DESC, id LIMIT 20;
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table      | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | geo_myisam | NULL       | range | geo           | geo  | 34      | NULL | 5131 |   100.00 | Using where; Using filesort |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------+

mysql> SELECT SQL_NO_CACHE id, name, ST_AsText(geo) AS latlon from geo_myisam IGNORE INDEX (geo) WHERE MBRContains(GeomFromText('LineString(-0.1 -0.1,0.1 0.1)'), geo) ORDER BY geo DESC, id LIMIT 20;
+---------+---------------+---------------------------------------------------+
| id      | name          | latlon                                            |
+---------+---------------+---------------------------------------------------+
| 4172904 | point_4172904 | POINT(-0.032585136622735426 -0.07208580530442832) |
| 4856199 | point_4856199 | POINT(0.05438994062541979 -0.005999922757968235)  |
| 4304428 | point_4304428 | POINT(-0.032905573055991504 -0.07362254529597478) |
|   36488 | point_36488   | POINT(0.056770320103289855 -0.08607608553588028)  |
| 4960723 | point_4960723 | POINT(0.053667048973652584 -0.017522974887418497) |
| 4955535 | point_4955535 | POINT(0.052627551418382254 -0.04060934767183788)  |
| 4458984 | point_4458984 | POINT(-0.03358457892535681 -0.08208306979582003)  |
| 4977055 | point_4977055 | POINT(0.054517726464679214 0.0018763961287926456) |
| 4983445 | point_4983445 | POINT(0.05492399731178209 0.01108215657163611)    |
| 4938992 | point_4938992 | POINT(-0.0342750381997554 -0.07715336519959692)   |
| 4594336 | point_4594336 | POINT(0.05474851658078704 -0.009088192143559604)  |
| 4300896 | point_4300896 | POINT(-0.03313062343106663 -0.07872167795777485)  |
| 4597770 | point_4597770 | POINT(0.05350491968309965 -0.03630340661509279)   |
| 4858507 | point_4858507 | POINT(0.05527111799947091 0.013482706633846014)   |
| 4799538 | point_4799538 | POINT(-0.03375623378321178 -0.07158413163533828)  |
| 1226113 | point_1226113 | POINT(-0.03225133198523087 -0.0720902383998876)   |
| 1045225 | point_1045225 | POINT(-0.031585659861178755 -0.06502638018226847) |
| 4785297 | point_4785297 | POINT(0.05354496189723254 -0.027560937244036188)  |
| 4784713 | point_4784713 | POINT(0.05591007979205817 0.024447181284844112)   |
| 4992666 | point_4992666 | POINT(0.05489537218110205 0.01083885134275897)    |
+---------+---------------+---------------------------------------------------+
20 rows in set, 1 warning (0.29 sec)

mysql> EXPLAIN SELECT SQL_NO_CACHE id, name, ST_AsText(geo) AS latlon from geo_myisam IGNORE INDEX (geo) WHERE MBRContains(GeomFromText('LineString(-0.1 -0.1,0.1 0.1)'), geo) ORDER BY geo DESC, id LIMIT 20;
+----+-------------+------------+------------+------+---------------+------+---------+------+---------+----------+-----------------------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra                       |
+----+-------------+------------+------------+------+---------------+------+---------+------+---------+----------+-----------------------------+
|  1 | SIMPLE      | geo_myisam | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 1332446 |   100.00 | Using where; Using filesort |
+----+-------------+------------+------------+------+---------------+------+---------+------+---------+----------+-----------------------------+

空間インデックスが使える前者の検索は探索行数が少なく、すぐに検索が終了することがわかる。(ただし、後者はほかのインデックスも効いていないので若干不利な条件だろうか。)

番外: ST_Containsを使う

ST_Containsの使い方が若干理解しづらかったのでメモとして残す。

mysql> SELECT SQL_NO_CACHE id, name, ST_AsText(geo) AS latlon from geo_myisam WHERE ST_Contains(ST_MakeEnvelope(POINT(-0.1,-0.1),POINT(0.1,0.1)), geo) ORDER BY geo DESC, id LIMIT 20;
+---------+---------------+---------------------------------------------------+
| id      | name          | latlon                                            |
+---------+---------------+---------------------------------------------------+
| 4172904 | point_4172904 | POINT(-0.032585136622735426 -0.07208580530442832) |
| 4856199 | point_4856199 | POINT(0.05438994062541979 -0.005999922757968235)  |
| 4304428 | point_4304428 | POINT(-0.032905573055991504 -0.07362254529597478) |
|   36488 | point_36488   | POINT(0.056770320103289855 -0.08607608553588028)  |
| 4960723 | point_4960723 | POINT(0.053667048973652584 -0.017522974887418497) |
| 4955535 | point_4955535 | POINT(0.052627551418382254 -0.04060934767183788)  |
| 4458984 | point_4458984 | POINT(-0.03358457892535681 -0.08208306979582003)  |
| 4977055 | point_4977055 | POINT(0.054517726464679214 0.0018763961287926456) |
| 4983445 | point_4983445 | POINT(0.05492399731178209 0.01108215657163611)    |
| 4938992 | point_4938992 | POINT(-0.0342750381997554 -0.07715336519959692)   |
| 4594336 | point_4594336 | POINT(0.05474851658078704 -0.009088192143559604)  |
| 4300896 | point_4300896 | POINT(-0.03313062343106663 -0.07872167795777485)  |
| 4597770 | point_4597770 | POINT(0.05350491968309965 -0.03630340661509279)   |
| 4858507 | point_4858507 | POINT(0.05527111799947091 0.013482706633846014)   |
| 4799538 | point_4799538 | POINT(-0.03375623378321178 -0.07158413163533828)  |
| 1226113 | point_1226113 | POINT(-0.03225133198523087 -0.0720902383998876)   |
| 1045225 | point_1045225 | POINT(-0.031585659861178755 -0.06502638018226847) |
| 4785297 | point_4785297 | POINT(0.05354496189723254 -0.027560937244036188)  |
| 4784713 | point_4784713 | POINT(0.05591007979205817 0.024447181284844112)   |
| 4992666 | point_4992666 | POINT(0.05489537218110205 0.01083885134275897)    |
+---------+---------------+---------------------------------------------------+
20 rows in set (0.01 sec)

mysql> SELECT SQL_NO_CACHE id, name, ST_AsText(geo) AS latlon from geo_myisam IGNORE INDEX (geo) WHERE ST_Contains(ST_MakeEnvelope(POINT(-0.1,-0.1),POINT(0.1,0.1)), geo) ORDER BY geo DESC, id LIMIT 20;
+---------+---------------+---------------------------------------------------+
| id      | name          | latlon                                            |
+---------+---------------+---------------------------------------------------+
| 4172904 | point_4172904 | POINT(-0.032585136622735426 -0.07208580530442832) |
| 4856199 | point_4856199 | POINT(0.05438994062541979 -0.005999922757968235)  |
| 4304428 | point_4304428 | POINT(-0.032905573055991504 -0.07362254529597478) |
|   36488 | point_36488   | POINT(0.056770320103289855 -0.08607608553588028)  |
| 4960723 | point_4960723 | POINT(0.053667048973652584 -0.017522974887418497) |
| 4955535 | point_4955535 | POINT(0.052627551418382254 -0.04060934767183788)  |
| 4458984 | point_4458984 | POINT(-0.03358457892535681 -0.08208306979582003)  |
| 4977055 | point_4977055 | POINT(0.054517726464679214 0.0018763961287926456) |
| 4983445 | point_4983445 | POINT(0.05492399731178209 0.01108215657163611)    |
| 4938992 | point_4938992 | POINT(-0.0342750381997554 -0.07715336519959692)   |
| 4594336 | point_4594336 | POINT(0.05474851658078704 -0.009088192143559604)  |
| 4300896 | point_4300896 | POINT(-0.03313062343106663 -0.07872167795777485)  |
| 4597770 | point_4597770 | POINT(0.05350491968309965 -0.03630340661509279)   |
| 4858507 | point_4858507 | POINT(0.05527111799947091 0.013482706633846014)   |
| 4799538 | point_4799538 | POINT(-0.03375623378321178 -0.07158413163533828)  |
| 1226113 | point_1226113 | POINT(-0.03225133198523087 -0.0720902383998876)   |
| 1045225 | point_1045225 | POINT(-0.031585659861178755 -0.06502638018226847) |
| 4785297 | point_4785297 | POINT(0.05354496189723254 -0.027560937244036188)  |
| 4784713 | point_4784713 | POINT(0.05591007979205817 0.024447181284844112)   |
| 4992666 | point_4992666 | POINT(0.05489537218110205 0.01083885134275897)    |
+---------+---------------+---------------------------------------------------+
20 rows in set (2.33 sec)

まとめ 

MySQL5.7にてGeometry型のデータを扱うテーブルと空間インデックスを作成した。また、テーブルに500万件のデータを格納した際の空間インデックスの有無による検索パフォーマンスを確認した。

今回はEngineをMyISAMにして検証したので次回はInnoDBの場合も検証しようと思う。

参考

[1]: https://dev.mysql.com/doc/refman/5.7/en/creating-spatial-indexes.html

[2] https://dev.mysql.com/doc/refman/5.6/ja/gis-wkt-functions.html#function_geomfromtext

[3]: https://dev.mysql.com/doc/refman/5.7/en/gis-format-conversion-functions.html#function_st-astext