400-800-0824
info@ymatrix.cn
400-800-0824
info@ymatrix.cn
400-800-0824
info@ymatrix.cn
400-800-0824
info@ymatrix.cn
400-800-0824
info@ymatrix.cn
YMatrix 文档
关于 YMatrix
标准集群部署
数据写入
数据迁移
数据查询
运维监控
参考指南
工具指南
数据类型
存储引擎
执行引擎
系统配置参数
SQL 参考
常见问题(FAQ)
新架构 FAQ
集群部署 FAQ
SQL 查询 FAQ
MatrixGate FAQ
运维 FAQ
监控告警 FAQ
PXF FAQ
PLPython FAQ
性能 FAQ
在开始使用向量化执行引擎之前,你需要率先准备好正确的软硬件环境:
软硬件环境 | 准备 |
---|---|
CPU | Intel CPU,Haswell 架构或以上 |
YMatrix | 部署好的集群环境 |
YMatrix 默认开启向量化。
首先,我们需要一个面向列存的数据表。下面创建一个 MARS3 表。
=# CREATE TABLE test(
tag int,
i4 int4,
i8 int8,
f4 float4,
f8 float8
)
USING MARS3
ORDER BY (tag);
向量化执行引擎支持常见表达式,例如:
=# EXPLAIN (costs off, verbose) SELECT tag + 10, f4 * -1, f8 / 10 + i8 * i4 FROM test;
得到如下输出:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Custom Scan (MxVMotion) Gather Motion 4:1 (slice1; segments: 4)
Output: ((tag + 10)), ((f4 * '-1'::double precision)), (((f8 / '10'::double precision) + ((i8 * i4))::double precision))
-> Custom Scan (MxVScan) on public.test
Output: (tag + 10), (f4 * '-1'::double precision), ((f8 / '10'::double precision) + ((i8 * i4))::double precision)
Optimizer: Postgres query optimizer
(5 rows)
其中 MxVScan 是包含了核心计算模块,利用 SIMD 指令和 CPU 缓存友好的算法,能够高效处理各种表达式。
针对列存表的 WHERE 条件过滤查询,向量化执行引擎也有数量级的优化提升。 如下计划所示,除了计算外,MxVScan 也是具备向量化过滤的功能:
=# EXPLAIN (costs off, verbose) SELECT * FROM test where i4 > 10 and i8 < 8;
得到如下输出:
QUERY PLAN
------------------------------------------------------------------
Custom Scan (MxVMotion) Gather Motion 4:1 (slice1; segments: 4)
Output: tag, i4, i8, f4, f8
-> Custom Scan (MxVScan) on public.test
Output: tag, i4, i8, f4, f8
Filter: ((test.i4 > 10) AND (test.i8 < 8))
Optimizer: Postgres query optimizer
(6 rows)
向量化执行引擎实现了面向列存的排序算法。同时,针对 Limit 算子也有特殊优化。Limit 算子用于限制数据输出的行数。启用向量化后,算子占用内存更少,速度更快。示例如下:
=# EXPLAIN (costs off, verbose) SELECT * FROM test ORDER BY i8;
得到如下输出:
QUERY PLAN
------------------------------------------------------------------
Custom Scan (MxVMotion) Gather Motion 4:1 (slice1; segments: 4)
Output: tag, i4, i8, f4, f8
Merge Key: i8
-> Custom Scan (MxVSort)
Output: tag, i4, i8, f4, f8
Sort Key: test.i8
-> Custom Scan (MxVScan) on public.test
Output: tag, i4, i8, f4, f8
Optimizer: Postgres query optimizer
(9 rows)
=# EXPLAIN (costs off, verbose) SELECT * FROM test ORDER BY i4, i8 DESC LIMIT 10;
得到如下输出:
QUERY PLAN
------------------------------------------------------------------------
Custom Scan (MxVLimit)
Output: tag, i4, i8, f4, f8
-> Custom Scan (MxVMotion) Gather Motion 4:1 (slice1; segments: 4)
Output: tag, i4, i8, f4, f8
Merge Key: i4, i8
-> Custom Scan (MxVLimit)
Output: tag, i4, i8, f4, f8
-> Custom Scan (MxVSort)
Output: tag, i4, i8, f4, f8
Sort Key: test.i4, test.i8
-> Custom Scan (MxVScan) on public.test
Output: tag, i4, i8, f4, f8
Optimizer: Postgres query optimizer
(13 rows)
向量化执行引擎针对常见数据类型(int2, int4, int8, float4, float8 等)实现了各类聚集算子,如 sum, min, max, count 等。
=# EXPLAIN (costs off, verbose) SELECT min(tag + 2), max(i4 + i8), count(f8) FROM test;
得到如下输出:
QUERY PLAN
-----------------------------------------------------------------------------------------------
Custom Scan (MxVAgg) Finalize Aggregate
Output: (min((tag + 2))), (max((i4 + i8))), (count(f8))
-> Custom Scan (MxVMotion) Gather Motion 4:1 (slice1; segments: 4)
Output: (PARTIAL min((tag + 2))), (PARTIAL max((i4 + i8))), (PARTIAL count(f8))
-> Custom Scan (MxVAgg) Partial Aggregate
Output: (PARTIAL min((tag + 2))), (PARTIAL max((i4 + i8))), (PARTIAL count(f8))
-> Custom Scan (MxVScan) on public.test
Output: tag, i4, i8, f8
Optimizer: Postgres query optimizer
(9 rows)
=# EXPLAIN (costs off, verbose) SELECT count(tag), sum(f4) FROM test GROUP BY i4;
得到如下输出:
QUERY PLAN
------------------------------------------------------------------------------------
Custom Scan (MxVMotion) Gather Motion 4:1 (slice1; segments: 4)
Output: (count(tag)), (sum(f4)), i4
-> Custom Scan (MxVHashAgg) Finalize HashAggregate
Output: (count(tag)), (sum(f4)), i4
Group Key: test.i4
-> Custom Scan (MxVMotion) Redistribute Motion 4:4 (slice2; segments: 4)
Output: i4, (PARTIAL count(tag)), (PARTIAL sum(f4))
Hash Key: i4
-> Custom Scan (MxVHashAgg) Partial HashAggregate
Output: i4, (PARTIAL count(tag)), (PARTIAL sum(f4))
Group Key: test.i4
-> Custom Scan (MxVScan) on public.test
Output: i4, tag, f4
Optimizer: Postgres query optimizer
(14 rows)
针对更为复杂的查询,向量化执行引擎也有显著的提升作用。 例如:
=# SELECT SUM(lo_extendedprice * lo_discount) AS revenue
FROM tname
WHERE lo_orderdate >= '1993-01-01'
AND lo_orderdate < '1994-01-01'
AND lo_discount between 1 and 3
AND lo_quantity < 25
;