向量化快速上手
1 在开始之前
在开始使用向量化执行引擎之前,你需要率先准备好正确的软硬件环境:
软硬件环境 | 准备 |
---|---|
CPU | Intel CPU,Haswell 架构或以上 |
YMatrix | 部署好的集群环境 |
2 启用向量化
YMatrix 默认开启向量化。
3 向量化使用示例
首先,我们需要一个面向列存的数据表。下面创建一个 MARS2 表。
CREATE TABLE test(
tag int,
i4 int4,
i8 int8,
f4 float4,
f8 float8
) USING mars2;
CREATE INDEX ON test USING mars2_btree(tag);
3.1 表达式计算
向量化执行引擎支持常见表达式,例如:
EXPLAIN (costs off, verbose) SELECT tag + 10, f4 * -1, f8 / 10 + i8 * i4 FROM test;
得到如下输出:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Gather Motion 1:1 (slice1; segments: 1)
Output: ((tag + 10)), ((f4 * '-1'::double precision)), (((f8 / '10'::double precision) + ((i8 * i4))::double precision))
-> Custom Scan (MxVScan) on public.test
Output: (tag + 10), (f4 * '-1'::double precision), ((f8 / '10'::double precision) + ((i8 * i4))::double precision)
Optimizer: Postgres query optimizer
(5 rows)
其中 MxVScan 是包含了核心计算模块,利用 SIMD 指令和 CPU 缓存友好的算法,能够高效处理各种表达式。
3.2 WHERE 过滤优化
针对列存表的 WHERE 条件过滤查询,向量化执行引擎也有数量级的优化提升。 如下计划所示,除了计算外,MxVScan 也是具备向量化过滤的功能:
EXPLAIN (costs off, verbose) SELECT * FROM test where i4 > 10 and i8 < 8;
得到如下输出:
QUERY PLAN
----------------------------------------------------
Gather Motion 1:1 (slice1; segments: 1)
Output: tag, i4, i8, f4, f8
-> Custom Scan (MxVScan) on public.test
Output: tag, i4, i8, f4, f8
Filter: ((test.i4 > 10) AND (test.i8 < 8))
Optimizer: Postgres query optimizer
(6 rows)
3.3 实现排序查询优化
向量化执行引擎实现了面向列存的排序算法。同时,针对 Limit 算子也有特殊优化。Limit 算子用于限制数据输出的行数。启用向量化后,算子占用内存更少,速度更快。示例如下:
EXPLAIN (costs off, verbose) SELECT * FROM test ORDER BY i8;
得到如下输出:
QUERY PLAN
--------------------------------------------------
Gather Motion 1:1 (slice1; segments: 1)
Output: tag, i4, i8, f4, f8
Merge Key: i8
-> Custom Scan (MxVSort) on public.test
Output: tag, i4, i8, f4, f8
Sort Key: i8
-> Custom Scan (MxVScan) on public.test
Output: tag, i4, i8, f4, f8
Optimizer: Postgres query optimizer
(9 rows)
EXPLAIN (costs off, verbose) SELECT * FROM test ORDER BY i4, i8 DESC LIMIT 10;
得到如下输出:
QUERY PLAN
--------------------------------------------------------------
Limit
Output: tag, i4, i8, f4, f8
-> Gather Motion 1:1 (slice1; segments: 1)
Output: tag, i4, i8, f4, f8
Merge Key: i4, i8
-> Limit
Output: tag, i4, i8, f4, f8
-> Custom Scan (MxVSort) on public.test
Output: tag, i4, i8, f4, f8
Sort Key: test.i4, test.i8
-> Custom Scan (MxVScan) on public.test
Output: tag, i4, i8, f4, f8
Optimizer: Postgres query optimizer
(13 rows)
3.4 实现聚集查询优化
向量化执行引擎针对常见数据类型(int2, int4, int8, float4, float8 等)实现了各类聚集算子,如 sum, min, max, count 等。
EXPLAIN (costs off, verbose) SELECT min(tag + 2), max(i4 + i8), count(f8) FROM test;
得到如下输出:
QUERY PLAN
-----------------------------------------------------------------------------------------------
Finalize Aggregate
Output: min((tag + 2)), max((i4 + i8)), count(f8)
-> Gather Motion 1:1 (slice1; segments: 1)
Output: (PARTIAL min((tag + 2))), (PARTIAL max((i4 + i8))), (PARTIAL count(f8))
-> Custom Scan (MxVAgg)
Output: (PARTIAL min((tag + 2))), (PARTIAL max((i4 + i8))), (PARTIAL count(f8))
-> Custom Scan (MxVScan) on public.test
Output: tag, i4, i8, f8
Optimizer: Postgres query optimizer
(9 rows)
EXPLAIN (costs off, verbose) SELECT count(tag), sum(f4) FROM test GROUP BY i4;
得到如下输出:
QUERY PLAN
-------------------------------------------------------------------
Finalize GroupAggregate
Output: count(tag), sum(f4), i4
Group Key: test.i4
-> Gather Motion 1:1 (slice1; segments: 1)
Output: i4, (PARTIAL count(tag)), (PARTIAL sum(f4))
Merge Key: i4
-> Custom Scan (MxVAgg)
Output: i4, (PARTIAL count(tag)), (PARTIAL sum(f4))
-> Custom Scan (MxVSort) on public.test
Output: i4, tag, f4
Sort Key: i4
-> Custom Scan (MxVScan) on public.test
Output: i4, tag, f4
Optimizer: Postgres query optimizer
(14 rows)
3.5 其他复杂查询优化
针对更为复杂的查询,向量化执行引擎也有显著的提升作用。 例如:
select sum(lo_extendedprice * lo_discount) as revenue
from :tname
where lo_orderdate >= '1993-01-01'
and lo_orderdate < '1994-01-01'
and lo_discount between 1 and 3
and lo_quantity < 25
;