This article demonstrates PostgreSQL's performance for nearest-neighbor (KNN) queries on location data.
The test uses the point data type with a GiST index. (PostGIS also supports KNN queries, with performance similar to the results shown here.)
The test data set contains more than 10 billion rows.
For the test environment and tuning, see:
Create the test table
postgres=# create table tbl_point(id serial8, poi point);
CREATE TABLE
postgres=# \d tbl_point
                         Table "benchmarksql.tbl_point"
 Column |  Type  |                        Modifiers
--------+--------+---------------------------------------------------------
 id     | bigint | not null default nextval('tbl_point_id_seq'::regclass)
 poi    | point  |
postgres=# alter sequence tbl_point_id_seq cache 10000;
ALTER SEQUENCE
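Raising the sequence cache to 10,000 lets each backend grab id values in batches, which reduces contention on the sequence during the highly concurrent inserts below. As a minimal sketch of how to verify the setting (assuming PostgreSQL 9.x, where a sequence can be queried directly; on 10+ the pg_sequences view would be used instead):

select cache_value from tbl_point_id_seq;   -- expected to report 10000 after the ALTER SEQUENCE above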
Generate the test data with the following script:
Both the x and y values of each point fall in the range -50000 to 50000, so roughly 10 billion distinct points can be generated, which matches the target data volume.
vi test.sql

insert into tbl_point(poi) select point(trunc(100000*(0.5-random())), trunc(100000*(0.5-random()))) from generate_series(1,10000);
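A quick sanity check of the point-count arithmetic above (roughly 100,000 distinct integer values per axis):

select trunc(100000*(0.5-random()));   -- yields integers roughly in the -50000..50000 range
select 100000::numeric * 100000;       -- about 10,000,000,000 possible (x,y) combinations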
With pgbench, roughly 2.33 million points are inserted per second.
pgbench -M prepared -n -r -f ./test.sql -P 1 -c 96 -j 96 -T 1100

tps = 233.018365 (including connections establishing)
tps = 233.150940 (excluding connections establishing)
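Each transaction inserts 10,000 rows (the generate_series(1,10000) in test.sql), so the row throughput is the reported tps multiplied by 10,000:

select 233.15 * 10000;   -- about 2.33 million rows inserted per second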
Row count so far:
postgres=# select count(*) from tbl_point;
   count
------------
 2532820000
(1 row)
Current table size:
postgres=# \dt+
 benchmarksql | tbl_point | table | postgres | 123 GB |
Create a GiST index on the point column
postgres=# create index idx_tbl_point on tbl_point using gist(poi) with (buffering=on);
postgres=# \d+ tbl_point
                                        Table "benchmarksql.tbl_point"
 Column |  Type  |                        Modifiers                        | Storage | Stats target | Description
--------+--------+---------------------------------------------------------+---------+--------------+-------------
 id     | bigint | not null default nextval('tbl_point_id_seq'::regclass) | plain   |              |
 poi    | point  |                                                         | plain   |              |
Indexes:
    "idx_tbl_point" gist (poi) WITH (buffering='on')
Index size:
postgres=# \di+
 benchmarksql | idx_tbl_point | index | postgres | tbl_point | 170 GB |
With the index in place, insert performance drops to roughly 550,000 points per second.
pgbench -M prepared -n -r -f ./test.sql -P 1 -c 96 -j 96 -T 100

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 96
number of threads: 96
duration: 100 s
number of transactions actually processed: 5587
latency average: 1726.947 ms
latency stddev: 118.223 ms
tps = 55.390665 (including connections establishing)
tps = 55.419003 (excluding connections establishing)
statement latencies in milliseconds:
        1726.946947  insert into tbl_point(poi) select point(trunc(100000*(0.5-random())), trunc(100000*(0.5-random()))) from generate_series(1,10000);
At this rate, the load was left running for 13,600 seconds and stopped once the table exceeded 10 billion rows.
pgbench -M prepared -n -r -f ./test.sql -P 1 -c 64 -j 64 -T 13600

Final sizes after the load: table 500 GB, index 720 GB.
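A rough check of that duration, using the post-index insert rate of about 55 tps, 10,000 rows per transaction, and the roughly 2.5 billion rows already loaded:

select 2532820000 + 55.4 * 10000 * 13600;   -- about 10.07 billion rows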
KNN query example:
postgres=# select *,poi <-> point(1000,1000) dist from tbl_point where poi <-> point(1000,1000) < 100 order by poi <-> point(1000,1000) limit 10;
     id     |     poi     |       dist
------------+-------------+------------------
  399588621 | (1000,999)  |                1
 1030719903 | (1001,999)  |  1.4142135623731
 2698052191 | (1001,1001) |  1.4142135623731
 3291219762 | (999,999)   |  1.4142135623731
 2757190006 | (1002,1000) |                2
 2862610530 | (998,1001)  | 2.23606797749979
 3450459141 | (998,1001)  | 2.23606797749979
 3124756442 | (1002,1001) | 2.23606797749979
 3105439886 | (1001,998)  | 2.23606797749979
  473144305 | (998,1002)  | 2.82842712474619
(10 rows)
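The <-> operator returns the Euclidean distance between two points, which is what the dist column above shows. For example:

select point(1001,999) <-> point(1000,1000);   -- sqrt(2), about 1.4142135623731
select point(998,1001) <-> point(1000,1000);   -- sqrt(5), about 2.23606797749979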
In the execution plan, both the ordering and the search use the GiST index.
For example, the following query touched 16 data blocks: 8 were hits in shared buffers and 8 were reads (served from the OS cache or read directly from the block device).
postgres=# explain (analyze,verbose,buffers,timing,costs) select *,poi <-> point(10090,10090) dist from tbl_point where poi <-> point(10090,10090) < 100 order by poi <-> point(10090,10090) limit 10;
                                                                           QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..13.15 rows=10 width=24) (actual time=0.469..1.309 rows=10 loops=1)
   Output: id, poi, ((poi <-> '(10090,10090)'::point))
   Buffers: shared hit=8 read=8 dirtied=1
   ->  Index Scan using idx_tbl_point on benchmarksql.tbl_point  (cost=0.56..1510464450.86 rows=1199422376 width=24) (actual time=0.468..1.306 rows=10 loops=1)
         Output: id, poi, (poi <-> '(10090,10090)'::point)
         Order By: (tbl_point.poi <-> '(10090,10090)'::point)
         Filter: ((tbl_point.poi <-> '(10090,10090)'::point) < '100'::double precision)
         Buffers: shared hit=8 read=8 dirtied=1
 Planning time: 0.084 ms
 Execution time: 1.347 ms
(10 rows)
The following stress test measures PostgreSQL's KNN query performance against the 10 billion location records.
The test script generates a random point, finds the points within a distance of 100 of it, orders them by distance, and returns one row.
vi test.sql

\setrandom x -50000 50000
\setrandom y -50000 50000
select * from tbl_point where poi <-> point(:x,:y) <100 order by poi <-> point(:x,:y) limit 1;
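Side note: \setrandom is the pgbench syntax of the release used for this test; it was removed in PostgreSQL 9.6, where the equivalent script would be written with \set and the random() function, roughly as follows:

\set x random(-50000, 50000)
\set y random(-50000, 50000)
select * from tbl_point where poi <-> point(:x,:y) < 100 order by poi <-> point(:x,:y) limit 1;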
Test results:
The average response time per request is 0.858 milliseconds.
pgbench -M prepared -n -r -f ./test.sql -P 1 -c 64 -j 64 -T 100

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 100 s
number of transactions actually processed: 7418337
latency average: 0.858 ms
latency stddev: 0.564 ms
tps = 74151.604194 (including connections establishing)
tps = 74184.255934 (excluding connections establishing)
statement latencies in milliseconds:
        0.007518  \setrandom x -50000 50000
        0.002193  \setrandom y -50000 50000
        0.847847  select * from tbl_point where poi <-> point(:x,:y) <100 order by poi <-> point(:x,:y) limit 1;
The data set is large: table plus index exceeds 1 TB, far more than fits in memory. Achieving a 0.858 ms response time is therefore owed to the AliFlash PCI-E SSD cards, which complete a single I/O request in about 0.01 ms on average, with about 0.19 ms of queue wait.
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          69.54    0.00   24.11    5.87    0.00    0.47

Device:  rrqm/s  wrqm/s       r/s      w/s      rsec/s    wsec/s avgrq-sz avgqu-sz  await  svctm  %util
dfa        0.00    0.00  26100.00  2096.00   417600.00  33536.00    16.00     5.05   0.18   0.03  98.00
dfb        0.00    0.00  26150.00  2038.00   418400.00  32600.00    16.00     5.01   0.18   0.03  98.40
dfc        0.00    0.00  25931.00  2026.00   414896.00  32384.00    16.00     6.15   0.22   0.04  99.70
dm-0       0.00    0.00  78178.00  6160.00  1250848.00  98520.00    16.00    16.73   0.19   0.01 101.00
[Other optimizations]
When no data lies within the requested distance, the number of GiST index pages scanned blows up. The optimization is to ORDER BY ... LIMIT first, and only then filter on the distance condition.
Example
The following query may run for a very long time before returning, and the result may well be 0 rows.
explain (analyze,verbose,buffers,timing,costs) select *,poi <-> point(10090,10090000) dist from tbl_point where poi <-> point(10090,10090000) < 100 order by poi <-> point(10090,10090000) limit 10;
The optimization: ORDER BY ... LIMIT first, then filter the rows that satisfy the distance condition.
postgres=# explain (analyze,verbose,buffers,timing,costs) select * from (select *,poi <-> point(10090,10090000) dist from tbl_point order by poi <-> point(10090,10090000) limit 1000 ) t where poi <-> point(10090,10090000) < 100 limit 10;
                                                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..13.51 rows=10 width=32) (actual time=3.769..3.769 rows=0 loops=1)
   Output: t.id, t.poi, t.dist
   Buffers: shared hit=1174
   ->  Subquery Scan on t  (cost=0.56..432.00 rows=333 width=32) (actual time=3.769..3.769 rows=0 loops=1)
         Output: t.id, t.poi, t.dist
         Filter: ((t.poi <-> '(10090,10090000)'::point) < '100'::double precision)
         Rows Removed by Filter: 1000
         Buffers: shared hit=1174
         ->  Limit  (cost=0.56..417.00 rows=1000 width=24) (actual time=0.106..3.596 rows=1000 loops=1)
               Output: tbl_point.id, tbl_point.poi, ((tbl_point.poi <-> '(10090,10090000)'::point))
               Buffers: shared hit=1174
               ->  Index Scan using idx_tbl_point on benchmarksql.tbl_point  (cost=0.56..1498470227.10 rows=3598267127 width=24) (actual time=0.105..3.505 rows=1000 loops=1)
                     Output: tbl_point.id, tbl_point.poi, (tbl_point.poi <-> '(10090,10090000)'::point)
                     Order By: (tbl_point.poi <-> '(10090,10090000)'::point)
                     Buffers: shared hit=1174
 Planning time: 0.069 ms
 Execution time: 3.793 ms
(17 rows)
PostGIS can use the same trick:
digoal=# select * from (select *,ST_Distance(jwd, ST_Transform(ST_GeomFromText('POINT(120.19 30.26)', 4326), 2163)) AS dist from cust_jw order by jwd <-> ST_Transform(ST_GeomFromText('POINT(120.19 30.26)', 4326), 2163) limit 1000) t where dist<15000;
  dz  |                        jwd                         |       dist
------+----------------------------------------------------+------------------
 杭州 | 0101000020730800004C94087D5D4F54C173AA7759E8FB5D41 |                0
 余杭 | 0101000020730800000E6E5A20494854C121FC688DA9EF5D41 | 14483.9823187612
(2 rows)

Time: 0.634 ms
An even more frugal approach, which squeezes resource usage to the minimum, is to use a cursor to solve the same problem; at most one extra index page is scanned.

digoal=# do language plpgsql $$
declare
  v_rec record;
  v_limit int := 1000;
begin
  set enable_seqscan=off;  -- force the index scan; we exit as soon as enough rows have been scanned
  for v_rec in select *,ST_Distance(jwd, ST_Transform(ST_GeomFromText('POINT(120.19 30.26)', 4326), 2163)) AS dist from cust_jw order by jwd <-> ST_Transform(ST_GeomFromText('POINT(120.19 30.26)', 4326), 2163) loop
    if v_limit <=0 then
      raise notice 'enough rows fetched';
      return;
    end if;
    if v_rec.dist > 20000 then
      raise notice 'all points within range have been output';
      return;
    else
      raise notice 'do something, v_rec:%', v_rec;
    end if;
    v_limit := v_limit -1;
  end loop;
end;
$$;
NOTICE:  do something, v_rec:(杭州,0101000020730800004C94087D5D4F54C173AA7759E8FB5D41,0)
NOTICE:  do something, v_rec:(余杭,0101000020730800000E6E5A20494854C121FC688DA9EF5D41,14483.9823187612)
NOTICE:  all points within range have been output
DO
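For reference, the same cursor idea can be adapted to the tbl_point table from this test. The following is a minimal sketch; the origin point, the radius of 100, and the limit of 10 are illustrative assumptions, not values from the benchmark:

do language plpgsql $$
declare
  v_rec record;
  v_limit int := 10;                     -- assumed: stop after 10 rows
  v_origin point := point(1000,1000);    -- assumed query origin
  v_radius float8 := 100;                -- assumed distance cutoff
begin
  set enable_seqscan=off;                -- force the KNN index scan, as in the example above
  for v_rec in
    select *, poi <-> v_origin as dist
      from tbl_point
     order by poi <-> v_origin
  loop
    if v_limit <= 0 then
      raise notice 'enough rows fetched';
      return;
    end if;
    if v_rec.dist > v_radius then
      raise notice 'no more points within the radius';
      return;
    end if;
    raise notice 'do something, v_rec:%', v_rec;
    v_limit := v_limit - 1;
  end loop;
end;
$$;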
[References]