向量化快速上手

1 在开始之前

在开始使用向量化执行引擎之前,你需要率先准备好正确的软硬件环境:

软硬件环境 准备
CPU Intel CPU,Haswell 架构或以上
YMatrix 部署好的集群环境

2 启用向量化

YMatrix 默认开启向量化。

3 向量化使用示例

首先,我们需要一个面向列存的数据表。下面创建一个 MARS2 表。

CREATE TABLE test(
  tag int, 
  i4 int4, 
  i8 int8, 
  f4 float4, 
  f8 float8
  ) USING mars2;
CREATE INDEX ON test USING mars2_btree(tag);

3.1 表达式计算

向量化执行引擎支持常见表达式,例如:

EXPLAIN (costs off, verbose) SELECT tag + 10, f4 * -1, f8 / 10 + i8 * i4 FROM test;

得到如下输出:


                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Gather Motion 1:1  (slice1; segments: 1)
   Output: ((tag + 10)), ((f4 * '-1'::double precision)), (((f8 / '10'::double precision) + ((i8 * i4))::double precision))
   ->  Custom Scan (MxVScan) on public.test
         Output: (tag + 10), (f4 * '-1'::double precision), ((f8 / '10'::double precision) + ((i8 * i4))::double precision)
 Optimizer: Postgres query optimizer
(5 rows)

其中 MxVScan 是包含了核心计算模块,利用 SIMD 指令和 CPU 缓存友好的算法,能够高效处理各种表达式。

3.2 WHERE 过滤优化

针对列存表的 WHERE 条件过滤查询,向量化执行引擎也有数量级的优化提升。 如下计划所示,除了计算外,MxVScan 也是具备向量化过滤的功能:

EXPLAIN (costs off, verbose) SELECT * FROM test where i4 > 10 and i8 < 8;

得到如下输出:

                     QUERY PLAN                     
----------------------------------------------------
 Gather Motion 1:1  (slice1; segments: 1)
   Output: tag, i4, i8, f4, f8
   ->  Custom Scan (MxVScan) on public.test
         Output: tag, i4, i8, f4, f8
         Filter: ((test.i4 > 10) AND (test.i8 < 8))
 Optimizer: Postgres query optimizer
(6 rows)

3.3 实现排序查询优化

向量化执行引擎实现了面向列存的排序算法。同时,针对 Limit 算子也有特殊优化。Limit 算子用于限制数据输出的行数。启用向量化后,算子占用内存更少,速度更快。示例如下:

EXPLAIN (costs off, verbose) SELECT * FROM test ORDER BY i8;

得到如下输出:

                    QUERY PLAN                    
--------------------------------------------------
 Gather Motion 1:1  (slice1; segments: 1)
   Output: tag, i4, i8, f4, f8
   Merge Key: i8
   ->  Custom Scan (MxVSort) on public.test
         Output: tag, i4, i8, f4, f8
         Sort Key: i8
         ->  Custom Scan (MxVScan) on public.test
               Output: tag, i4, i8, f4, f8
 Optimizer: Postgres query optimizer
(9 rows)
EXPLAIN (costs off, verbose) SELECT * FROM test ORDER BY i4, i8 DESC LIMIT 10;

得到如下输出:

                          QUERY PLAN                          
--------------------------------------------------------------
 Limit
   Output: tag, i4, i8, f4, f8
   ->  Gather Motion 1:1  (slice1; segments: 1)
         Output: tag, i4, i8, f4, f8
         Merge Key: i4, i8
         ->  Limit
               Output: tag, i4, i8, f4, f8
               ->  Custom Scan (MxVSort) on public.test
                     Output: tag, i4, i8, f4, f8
                     Sort Key: test.i4, test.i8
                     ->  Custom Scan (MxVScan) on public.test
                           Output: tag, i4, i8, f4, f8
 Optimizer: Postgres query optimizer
(13 rows)

3.4 实现聚集查询优化

向量化执行引擎针对常见数据类型(int2, int4, int8, float4, float8 等)实现了各类聚集算子,如 sum, min, max, count 等。

EXPLAIN (costs off, verbose) SELECT min(tag + 2), max(i4 + i8), count(f8) FROM test;

得到如下输出:

                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Finalize Aggregate
   Output: min((tag + 2)), max((i4 + i8)), count(f8)
   ->  Gather Motion 1:1  (slice1; segments: 1)
         Output: (PARTIAL min((tag + 2))), (PARTIAL max((i4 + i8))), (PARTIAL count(f8))
         ->  Custom Scan (MxVAgg)
               Output: (PARTIAL min((tag + 2))), (PARTIAL max((i4 + i8))), (PARTIAL count(f8))
               ->  Custom Scan (MxVScan) on public.test
                     Output: tag, i4, i8, f8
 Optimizer: Postgres query optimizer
(9 rows)
EXPLAIN (costs off, verbose) SELECT count(tag), sum(f4) FROM test GROUP BY i4;

得到如下输出:

                            QUERY PLAN                             
-------------------------------------------------------------------
 Finalize GroupAggregate
   Output: count(tag), sum(f4), i4
   Group Key: test.i4
   ->  Gather Motion 1:1  (slice1; segments: 1)
         Output: i4, (PARTIAL count(tag)), (PARTIAL sum(f4))
         Merge Key: i4
         ->  Custom Scan (MxVAgg)
               Output: i4, (PARTIAL count(tag)), (PARTIAL sum(f4))
               ->  Custom Scan (MxVSort) on public.test
                     Output: i4, tag, f4
                     Sort Key: i4
                     ->  Custom Scan (MxVScan) on public.test
                           Output: i4, tag, f4
 Optimizer: Postgres query optimizer
(14 rows)

3.5 其他复杂查询优化

针对更为复杂的查询,向量化执行引擎也有显著的提升作用。 例如:

select sum(lo_extendedprice * lo_discount) as revenue
  from :tname
 where lo_orderdate >= '1993-01-01'
   and lo_orderdate <  '1994-01-01'
   and lo_discount between 1 and 3
   and lo_quantity < 25
;