mxshift
本文档介绍了数据迁移工具 mxshift。
1 描述
mxshift 工具目前支持以下功能:
- 全量迁移:以数据库为单位,实现从 Greenplum 4.3.X/5/6 / YMatrix 集群并行迁移数据表数据定义语言(DDL)及数据到新的 YMatrix 集群。
- 增量迁移:以表为单位,实现从 Greenplum 5/6 / YMatrix 集群并行迁移目标数据库中新增数据表的数据定义语言(DDL)及数据到新的 YMatrix 集群。
- 条件迁移:通过
WHERE
条件实现数据的条件迁移。 - 支持的数据分布策略:哈希分布、随机分布、复制表(REPLICATED)、Master-only。
2 快速上手
完整的模版文件命名为 config.toml
。此模版可通过执行 mxshift --config
获得。
我们在下文给出了针对不同使用场景的示例模版,请以此为参考,并根据具体情况使用 vim config.toml
命令修改文件完成配置。
2.1 全量迁移
[database]
[database.source]
## Name of database
db-database= "testdb"
## Hostname of database master
db-host="sdw3"
## password of database
db-password="xxxx"
## Port of database master
db-port=54322
## user name of database
db-user="gpadmin"
## Version of database(Please use the result of 'SELECT version();' as value). Required only when
## 1. Source database is un-reachable, and 'ddl.only-ddl' is enabled and 'ddl.mode' is 'input'
## 2. Target database is un-reachable, and 'ddl.mode' is 'output'
# db-version="PostgreSQL 12 (MatrixDB 5.2.0-enterprise) (Greenplum Database 7.0.0+dev.17410.gedbdb5ef84 build dev) on arm-apple-darwin21.5.0, compiled by Apple clang version 13.0.0 (clang-1300.0.27.3), 64-bit compiled on Jun 5 2023 15:45:24"
## The installation directory of database
install-dir="/usr/local/greenplum-db-6.7.1"
[[database.source.hostname-to-ip]]
## The content within <> should be replaced with actual information and <> should be removed
node-hostname="<mdw>"
node-ip="<127.0.0.1>"
[[database.source.hostname-to-ip]]
node-hostname="<sdw1>"
node-ip="<127.0.0.2>"
[[database.source.hostname-to-ip]]
node-hostname="<sdw2>"
node-ip="<127.0.0.3>"
[database.target]
## Name of database
db-database="destdb"
## Hostname of database master
db-host="172.16.100.32"
## password of database
db-password="yyyy"
## Port of database master
db-port=5432
## user name of database
db-user="mxadmin"
## Version of database(Please use the result of 'SELECT version();' as value). Required only when
## 1. Source database is un-reachable, and 'ddl.only-ddl' is enabled and 'ddl.mode' is 'input'
## 2. Target database is un-reachable, and 'ddl.mode' is 'output' */
# db-version="PostgreSQL 12 (MatrixDB 5.2.0-enterprise) (Greenplum Database 7.0.0+dev.17410.gedbdb5ef84 build dev) on arm-apple-darwin21.5.0, compiled by Apple clang version 13.0.0 (clang-1300.0.27.3), 64-bit compiled on Jun 5 2023 15:45:24"
[scope]
## Whether to disable data incremental migration, by default, it is true.
# disable-data-increment=true
disable-connector=false
## The compress method for transferring data, methods restricted to 0/gzip/lz4/zstd
compress-method="lz4"
## mode for transferring data from source to target database, and value restricted to normal/dryrun/fetch/motion.
##dryrun for only executing ddl, not transferring data
##fetch for fetching data from source and abandon
##motion for fetching data from source, redistributing and finally abandon
mode="normal"
## Sql for select segment information from source database
# select-source-segment-sql="SELECT dbid, content, port, hostname FROM gp_segment_configuration WHERE status = 'u' AND role = 'p' ORDER BY CONTENT;"
## Sql for select segment information from target database
# select-target-segment-sql="SELECT dbid, content, port, hostname FROM gp_segment_configuration WHERE status = 'u' AND role = 'p' ORDER BY CONTENT;"
[[scope.table-list]]
schema="test_schema_1"
name="table_001"
[[scope.table-list]]
schema="test_schema_2"
name="table_002"
[[scope.exclude-table-list]]
schema="test_schema_3"
name="table_003"
schema-list=["test_schema_1", "test_schema_2"]
exclude-schema-list=["test_schema_5", "test_schema_8"]
[log]
## The log level, value restricted to: debug/verbose/info.
log-level="info"
## Print log without color.
# no-color=false
[controller]
## By default, transfer work will start from the largest table. If set 'bothway' it will start from both the largest and the smallest table
both-way=true
## The number of table transferred at the same time
concurrency=3
[transfer]
## Verity the number of record of every table
verify=true
with-index=true
## Whether to disable transfer Data, by default, it is false.
# disabled-data-transfer=false
## Table with size under this threshold are categorized as "small",
## transfer between masters instead of segments to avoid creating external table.
# small-table-mb=32
[ddl]
enabled=true
# file-path="/tmp/mxshift.sql"
# mode="output"
only-ddl=false
## During the DDL transfer, whether to skip the transfer of resource queue or group, by default, it is true.
# skip-resource-queue-and-group=true
## During the DDL transfer, whether to skip the transfer of tablespace, by default, it is true.
# skip-table-space=true
[[ddl.replace]]
## Only applicable for the case of migration from Greenplum to YMatrix
category="role"
[[ddl.replace.pairs]]
old="gpadmin"
new="mxadmin"
## Whether to disable ddl incremental migration, by default, it is true.
# disable-ddl-increment=true
## DDL transfer will stop and exit when executing sql encounter errors, unless the type of DDL is set to 'ignore error'.Types of DDL are:
## "all","collation","default acl","extension","external table","function","index","language",
## "namespace","operator","resource","role","table","tablespace","type","view"
# ignore-error=[]
[verify]
## Whether to enable data verify, by default, it is false.
# enabled=false
## mode for verifying data consistency between source and target database,
## and value restricted to simple-count/column-count/length-sum/hash-bit-bloom.
## 'simple-count' for counting number of rows
## 'column-count' for counting not-null value of a certain column
## 'length-sum' for summing the length of some columns(e.g. column of type text, xml, json, varchar)
## 'hash-bit-bloom' for calculating the hash for one or more columns, and compare it's binary content
## (except for column of floating-point type, the result of witch can be affected by software/hardware)
# mode="simple-count"
## Columns used in verification, will randomly select columns if not set
# columns=[]
## Maximum number of randomly selected columns used in verification
# max-column-count=3
# [[verify.table-list]]
# schema="test_schema_1"
# name="table_001"
# [[verify.exclude-table-list]]
# schema="test_schema_3"
# name="table_003"
# verify.schema-list=["test_schema_1"]
# verify.exclude-schema-list=["test_schema_5"]
# [[verify.override]]
# mode="length-sum"
# max-column-count=3
# columns=[]
# [[verify.override.table-list]]
# schema="test_schema_3"
# name="table_003"
# verify.override.schema-list=[]
# [[verify.override]]
# mode="column-count"
# max-column-count=2
# columns=[]
# [[verify.override.table-list]]
# schema="test_schema_1"
# name="table_001"
# verify.override.schema-list=[]
准备好 TOML 文件后,执行如下命令以启动 mxshift 进行全量迁移。
$ mxshift -c config_path_full.toml
2.2 增量迁移
[database]
[database.source]
## Name of database
db-database= "testdb"
## Hostname of database master
db-host="sdw3"
## password of database
db-password="xxxx"
## Port of database master
db-port=54322
## user name of database
db-user="gpadmin"
## Version of database(Please use the result of 'SELECT version();' as value). Required only when
## 1. Source database is un-reachable, and 'ddl.only-ddl' is enabled and 'ddl.mode' is 'input'
## 2. Target database is un-reachable, and 'ddl.mode' is 'output' */
# db-version="PostgreSQL 12 (MatrixDB 5.2.0-enterprise) (Greenplum Database 7.0.0+dev.17410.gedbdb5ef84 build dev) on arm-apple-darwin21.5.0, compiled by Apple clang version 13.0.0 (clang-1300.0.27.3), 64-bit compiled on Jun 5 2023 15:45:24"
## The installation directory of database
install-dir="/usr/local/greenplum-db-6.7.1"
[[database.source.hostname-to-ip]]
## The content within <> should be replaced with actual information and <> should be removed
node-hostname="<mdw>"
node-ip="<127.0.0.1>"
[[database.source.hostname-to-ip]]
node-hostname="<sdw1>"
node-ip="<127.0.0.2>"
[[database.source.hostname-to-ip]]
node-hostname="<sdw2>"
node-ip="<127.0.0.3>"
[database.target]
## Name of database
db-database="destdb"
## Hostname of database master
db-host="172.16.100.32"
## password of database
db-password="yyyy"
## Port of database master
db-port=5432
## user name of database
db-user="mxadmin"
## Version of database(Please use the result of 'SELECT version();' as value). Required only when
## 1. Source database is un-reachable, and 'ddl.only-ddl' is enabled and 'ddl.mode' is 'input'
## 2. Target database is un-reachable, and 'ddl.mode' is 'output' */
# db-version="PostgreSQL 12 (MatrixDB 5.2.0-enterprise) (Greenplum Database 7.0.0+dev.17410.gedbdb5ef84 build dev) on arm-apple-darwin21.5.0, compiled by Apple clang version 13.0.0 (clang-1300.0.27.3), 64-bit compiled on Jun 5 2023 15:45:24"
[scope]
disable-connector=false
## The compress method for transferring data, methods restricted to 0/gzip/lz4/zstd
compress-method="lz4"
## mode for transferring data from source to target database, and value restricted to normal/dryrun/fetch/motion.
##dryrun for only executing ddl, not transferring data
##fetch for fetching data from source and abandon
##motion for fetching data from source, redistributing and finally abandon
mode="normal"
## Sql for select segment information from source database
# select-source-segment-sql="SELECT dbid, content, port, hostname FROM gp_segment_configuration WHERE status = 'u' AND role = 'p' ORDER BY CONTENT;"
## Sql for select segment information from target database
# select-target-segment-sql="SELECT dbid, content, port, hostname FROM gp_segment_configuration WHERE status = 'u' AND role = 'p' ORDER BY CONTENT;"
## Whether to disable data incremental migration, by default, it is true.
disable-data-increment=false
[[scope.table-list]]
schema="test_schema_1"
name="table_001"
[[scope.table-list]]
schema="test_schema_2"
name="table_002"
[[scope.exclude-table-list]]
schema="test_schema_3"
name="table_003"
schema-list=["test_schema_1", "test_schema_2"]
exclude-schema-list=["test_schema_5", "test_schema_8"]
[log]
## The log level, value restricted to: debug/verbose/info.
log-level="info"
## Print log without color.
# no-color=false
[controller]
## By default, transfer work will start from the largest table. If set 'bothway' it will start from both the largest and the smallest table
both-way=true
## The number of table transferred at the same time
concurrency=3
[transfer]
## Verity the number of record of every table
verify=true
with-index=true
## Whether to disable transfer Data, by default, it is false.
# disabled-data-transfer=false
## Table with size under this threshold are categorized as "small",
## transfer between masters instead of segments to avoid creating external table.
# small-table-mb=32
[ddl]
enabled=true
# file-path="/tmp/mxshift.sql"
# mode="output"
only-ddl=false
## During the DDL transfer, whether to skip the transfer of resource queue or group, by default, it is true.
# skip-resource-queue-and-group=true
## During the DDL transfer, whether to skip the transfer of tablespace, by default, it is true.
# skip-table-space=true
[[ddl.replace]]
## Only applicable for the case of migration from Greenplum to YMatrix
category="role"
[[ddl.replace.pairs]]
old="gpadmin"
new="mxadmin"
## Whether to disable ddl incremental migration, by default, it is true.
disable-ddl-increment=false
## DDL transfer will stop and exit when executing sql encounter errors, unless the type of DDL is set to 'ignore error'.Types of DDL are:
## "all","collation","default acl","extension","external table","function","index","language",
## "namespace","operator","resource","role","table","tablespace","type","view"
# ignore-error=[]
[verify]
## Whether to enable data verify, by default, it is false.
# enabled=false
## mode for verifying data consistency between source and target database,
## and value restricted to simple-count/column-count/length-sum/hash-bit-bloom.
## 'simple-count' for counting number of rows
## 'column-count' for counting not-null value of a certain column
## 'length-sum' for summing the length of some columns(e.g. column of type text, xml, json, varchar)
## 'hash-bit-bloom' for calculating the hash for one or more columns, and compare it's binary content
## (except for column of floating-point type, the result of witch can be affected by software/hardware)
# mode="simple-count"
## Columns used in verification, will randomly select columns if not set
# columns=[]
## Maximum number of randomly selected columns used in verification
# max-column-count=3
# [[verify.table-list]]
# schema="test_schema_1"
# name="table_001"
# [[verify.exclude-table-list]]
# schema="test_schema_3"
# name="table_003"
# verify.schema-list=["test_schema_1"]
# verify.exclude-schema-list=["test_schema_5"]
# [[verify.override]]
# mode="length-sum"
# max-column-count=3
# columns=[]
# [[verify.override.table-list]]
# schema="test_schema_3"
# name="table_003"
# verify.override.schema-list=[]
# [[verify.override]]
# mode="column-count"
# max-column-count=2
# columns=[]
# [[verify.override.table-list]]
# schema="test_schema_1"
# name="table_001"
# verify.override.schema-list=[]
准备好 TOML 文件后,执行如下命令以启动 mxshift 进行增量迁移。
$ mxshift -c config_path_incremental.toml
2.3 条件迁移
通过 WHERE
条件实现满足某些条件的数据迁移或实现过期数据不迁移:
[database]
[database.source]
## Name of database
db-database= "testdb"
## Hostname of database master
db-host="sdw3"
## password of database
db-password="xxxx"
## Port of database master
db-port=54322
## user name of database
db-user="gpadmin"
## The installation directory of database
install-dir="/usr/local/greenplum-db-6.7.1"
[database.target]
## Name of database
db-database="destdb"
## Hostname of database master
db-host="172.16.100.32"
## password of database
db-password="yyyy"
## Port of database master
db-port=5432
## user name of database
db-user="mxadmin"
[scope]
## The compress method for transferring data, methods restricted to 0/gzip/lz4/zstd
compress-method="lz4"
## mode for transferring data from source to target database, and value restricted to normal/dryrun/fetch/motion.
##dryrun for only executing ddl, not transferring data
##fetch for fetching data from source and abandon
##motion for fetching data from source, redistributing and finally abandon
mode="normal"
## Sql for select segment information from source database
# select-source-segment-sql="SELECT dbid, content, port, hostname FROM gp_segment_configuration WHERE status = 'u' AND role = 'p' ORDER BY CONTENT;"
## Sql for select segment information from target database
# select-target-segment-sql="SELECT dbid, content, port, hostname FROM gp_segment_configuration WHERE status = 'u' AND role = 'p' ORDER BY CONTENT;"
[[scope.table-list]]
schema=......
name=......
[[scope.table-list]]
......
[[scope.exclude-table-list]]
......
schema-list=["test_schema_1", "test_schema_2"]
exclude-schema-list=["test_schema_5", "test_schema_8"]
disable-connector=false
[log]
## The log level, value restricted to: debug/verbose/info.
log-level="info"
## Print log without color.
# no-color=false
[controller]
## By default, transfer work will start from the largest table. If set 'bothway' it will start from both the largest and the smallest table
both-way=true
## The number of table transferred at the same time
concurrency=3
[transfer]
## Verity the number of record of every table
verify=true
## Table with size under this threshold are categorized as "small",
## transfer between masters instead of segments to avoid creating external table.
# small-table-mb=32
[transfer.table-data-where-sql]
enabled=true
global="txdate >= '2022-10-01' AND batchnum >= 100000000"
[[transfer.table-data-where-sql.override]]
where="abc > 10"
[transfer.table-data-where-sql.override.table]
schema="test_schema_1"
name="table_001"
[[transfer.table-data-where-sql.override]]
where="tag != 'aabbcc' AND ts > '2022-01-01'"
[transfer.table-data-where-sql.override.table]
schema="test_schema_2"
name="another_table"
[verify]
## Whether to enable data verify, by default, it is false.
# enabled=false
## mode for verifying data consistency between source and target database,
## and value restricted to simple-count/column-count/length-sum/hash-bit-bloom.
## 'simple-count' for counting number of rows
## 'column-count' for counting not-null value of a certain column
## 'length-sum' for summing the length of some columns(e.g. column of type text, xml, json, varchar)
## 'hash-bit-bloom' for calculating the hash for one or more columns, and compare it's binary content
## (except for column of floating-point type, the result of witch can be affected by software/hardware)
# mode="simple-count"
## Columns used in verification, will randomly select columns if not set
# columns=[]
## Maximum number of randomly selected columns used in verification
# max-column-count=3
# [[verify.table-list]]
# schema="test_schema_1"
# name="table_001"
# [[verify.exclude-table-list]]
# schema="test_schema_3"
# name="table_003"
# verify.schema-list=["test_schema_1"]
# verify.exclude-schema-list=["test_schema_5"]
# [[verify.override]]
# mode="length-sum"
# max-column-count=3
# columns=[]
# [[verify.override.table-list]]
# schema="test_schema_3"
# name="table_003"
# verify.override.schema-list=[]
# [[verify.override]]
# mode="column-count"
# max-column-count=2
# columns=[]
# [[verify.override.table-list]]
# schema="test_schema_1"
# name="table_001"
# verify.override.schema-list=[]
准备好 TOML 文件后,执行如下命令以启动 mxshift 进行条件迁移。
$ mxshift -c config_path_conditional.toml
2.4 迁移复制表
如果是迁移某个数据库,或者某个 Schema 下的所有复制表,不需要在 config_path.toml
里面做额外的配置,mxshfit 会自动判断哪些表是复制表,执行相应的迁移策略,防止重复迁移数据的问题发生。你仅需要在 scope.table-list
中配置好 schema
和 name
即可。
[[scope.table-list]]
schema=“public1”
name=“table_replicated1”
[[scope.table-list]]
schema=“public2”
name=“table_replicated2”
准备好 TOML 文件后,执行如下命令以启动 mxshift 进行复制表迁移。
$ mxshift -c config_path_replicated.toml
注意!
数据迁移思路及完整步骤介绍请见 数据迁移
3 功能提示
3.1 导入/导出表元信息
对于数据迁移,mxshift 需要先查询 catalog 目录获得迁移表的目录信息,再执行数据迁移任务。
随着数据库使用的加重,catalog 膨胀,查询的耗时也随之增长。为快捷调试以及缩短迁移期间业务停机时间,mxshift 增加以下命令行参数实现表元信息导出,并支持在后续迁移过程中导入复用:
## 将迁移表的元信息导出到 mxshift_meta_file 文件,并退出
$ mxshift -c <迁移配置文件> --write-table-meta-file mxshift_meta_file
## 从 mxshift_meta_file 文件导入迁移表的元信息,并进行数据迁移
$ mxshift -c <迁移配置文件> --read-table-meta-file mxshift_meta_file
注意!
导出的表元信息并不建议长期使用。当迁移表范围改变或者表元信息发生更改时,表元信息需要重新生成。
3.2 导入/导出表 DDL
在 mshift 的配置文件中作如下配置:
[ddl]
enabled=true
file-path="/tmp/mxshift.sql"
mode="output"
- file-path:用于导入/导出 DDL sql 的文件路径,详见下文附录。禁止修改或者删除任何以
-- DO NOT EDIT.
开头的注释内容。 - mode:表示 DDL 的处理模式,分
exec
,output
以及input
三种,含义分别为:exec
:默认值,直接在目标库执行output
:输出到文件。此时 DDL 导出完毕后将直接退出,不会进行数据迁移input
:从文件读入
若在 output
下无法连接目标库,则可按照如下示例配置 database.target.db-version
,保障导出正常执行:
[database]
[database.target]
...
## Version of database(Please use the result of 'SELECT version();' as value). Required only when
## 1. Source database is un-reachable, and 'ddl.only-ddl' is enabled and 'ddl.mode' is 'input'
## 2. Target database is un-reachable, and 'ddl.mode' is 'output'
db-version="PostgreSQL 12 (MatrixDB 5.2.0-enterprise) (Greenplum Database 7.0.0+dev.17410.gedbdb5ef84 build dev) on arm-apple-darwin21.5.0, compiled by Apple clang version 13.0.0 (clang-1300.0.27.3), 64-bit compiled on Jul 5 2022 15:45:24"
若在input
下无法连接源库,则可按照如下示例配置 database.source.db-version
以及 ddl.only-ddl
,保障导入正常执行:
[database]
[database.target]
## Version of database(Please use the result of 'SELECT version();' as value). Required only when
## 1. Source database is un-reachable, and 'ddl.only-ddl' is enabled and 'ddl.mode' is 'input'
## 2. Target database is un-reachable, and 'ddl.mode' is 'output'
db-version="PostgreSQL 12 (MatrixDB 4.5.0-enterprise) (Greenplum Database 7.0.0+dev.17410.gedbdb5ef84 build dev) on arm-apple-darwin21.5.0, compiled by Apple clang version 13.0.0 (clang-1300.0.27.3), 64-bit compiled on Jul 5 2022 15:45:24"
[ddl]
only-ddl=true
3.3 动态调整并发数
mxshift 工具支持在迁移运行时调节并发数,步骤如下:
- 修改配置文件(启动 mxshift 时使用的配置文件)中的
controller.concurrency
[controller]
## The number of table transferred at the same time
concurrency=xxx
-
获取 mxshift 当前的进程号
命令行运行
ps aux | grep mxshift
,记下第二列的数字{{mxshift PID}}
-
向 mxshift 进程发
SIGUSR2
信号,令 mxshift 重新载入配置文件,更新并发数命令行运行
kill -SIGUSR2 {{mxshift PID}}
3.4 数据一致性检验
3.4.1 旧版数据检验与数据迁移解耦
在 v5.2.1 之前的版本中,[transfer]
下有配置项 with-index
和 verify
分别用以控制是否迁移索引和是否检验表行数,但是索引迁移、数据检验(旧版,仅检验表行数)均与数据迁移存在绑定关系,即索引迁移和数据检验均不能单独运行,必须在数据迁移的进程中进行。
为了增加灵活性,从 v5.2.1 版本开始支持索引迁移和旧版数据检验单独运行。在 [transfer]
类别下增加配置参数 disabled-data-transfer
用以跳过数据迁移。使用示例如下:
- 同时迁移数据和索引,可采用以下配置:
[transfer]
## Whether to disable transfer Data, by default, it is false.
disabled-data-transfer=false
## Transfer indexes of the tables, by default, it is false.
with-index=true
- 迁移数据并检验表行数,可采用以下配置:
[transfer]
## Whether to disable transfer Data, by default, it is false.
disabled-data-transfer=false
## Verify the number of record of every table, by default, it is disabled.
verify=true
- 同时迁移数据和索引,并检验表行数,可采用以下配置:
[transfer]
## Whether to disable transfer Data, by default, it is false.
disabled-data-transfer=false
## Transfer indexes of the tables, by default, it is false.
with-index=true
## Verify the number of record of every table, by default, it is disabled.
verify=true
- 仅迁移数据,不迁移索引,也不检验表行数,可采用以下配置:
[transfer]
## Whether to disable transfer Data, by default, it is false.
disabled-data-transfer=false
## Transfer indexes of the tables, by default, it is false.
with-index=false
## Verify the number of record of every table, by default, it is disabled.
verify=false
- 不迁移数据,仅迁移索引,可采用以下配置:
[transfer]
## Whether to disable transfer Data, by default, it is false.
disabled-data-transfer=true
## Transfer indexes of the tables, by default, it is false.
with-index=true
- 不迁移数据,仅检验表行数,可采用以下配置:
[transfer]
## Whether to disable transfer Data, by default, it is false.
disabled-data-transfer=true
## Verify the number of record of every table, by default, it is disabled.
verify=true
3.4.2 v5.2.1 版数据检验
YMatrix mxshift 从 v5.2.1 版本开始支持新的数据检验算法,可通过配置新增的 [verify]
类别运行。
详细的全局配置参数信息请见下文附录,完整的 [verify]
类别展示见上文全量迁移配置文件。这里特别介绍检验范围的配置。
[verify]
# [[verify.table-list]]
# schema="test_schema_1"
# name="table_001"
# [[verify.exclude-table-list]]
# schema="test_schema_3"
# name="table_003"
# verify.schema-list=["test_schema_1"]
# verify.exclude-schema-list=["test_schema_5"]
# [[verify.override]]
# mode="length-sum"
# max-column-count=3
# columns=[]
# [[verify.override.table-list]]
# schema="test_schema_3"
# name="table_003"
# verify.override.schema-list=[]
# [[verify.override]]
# mode="column-count"
# max-column-count=2
# columns=[]
# [[verify.override.table-list]]
# schema="test_schema_1"
# name="table_001"
# verify.override.schema-list=[]
指定检验范围相关配置存在于 [verify.table-list]
、[verify.exclude-table-list]
、[verify.schema-list]
以及 [verify.exclude-schema-list]
四个子类别。
检验优先级顺序如下:
-
如果
table-list
有配置具体的表,则只检验table-list
中的表。 -
如果没有配置具体表,则检查
schema-list
中是否配置了模式(Schema)。 -
如果配置了模式,则检验该模式下的所有表。
-
如果没有配置模式,则检验最近一次传输的所有表。
verify.table-list
:参与 v5.2.1 版数据检验的表名列表。verify.exclude-table-list
:不参与 v5.2.1 版数据检验的表名列表。verify.schema-list
:参与 v5.2.1 版数据检验的模式列表。verify.exclude-schema-list
:不参与 v5.2.1 版数据检验的模式列表。
[verify.override]
子类别用于覆盖指定的全局检验配置。对指定范围内的表使用此子类别下的配置。
附录:参数信息
此部分主要介绍 mxshift 工具相关参数信息。
配置文件
使用 mxshift 工具需要先准备一个 TOML 文件。以下是配置文件中的详细参数介绍:
如果你想要通过 WHERE 条件实现条件迁移,那么可能会使用到以下参数:
类别 | 子类别 | 子参数 | 子子类别 | 子子参数 | 描述 | 必选 |
[transfer]类别 | [transfer.table-data-where-sql]类别 | --enabled | 如果为 `true`,则利用 WHERE 条件过滤源数据库数据, 仅迁移满足 WHERE 条件的数据,实现增量迁移。本功能默认关闭,启用需手动指认 `true` | 否 | ||
--global | 此参数指 SQL 的 `WHERE` 表达式,mxshift 会对本次迁移的所有表附加此条件表达式,如果表中不包含 `WHERE` 关键字,则提示 `SKIPPED` 自动跳过该表 | 否 | ||||
[[transfer.table-data-where-sql.override]]类别 | --where | 表示 WHERE 子表达式,优先级高于 global 中的 WHERE 表达式,仅对某个特定表进行过滤的表级应用 | 否 | |||
[transfer.table-data-where-sql.override.table]类别 | --schema | 需迁移的模式名 | 否 | |||
--name | 需迁移的表名 | 否 |
命令行参数
mxshift 命令行参数如下:
参数名 | 描述 |
---|---|
--config | 打印配置文件默认内容 |
-c / --config-path | mxshift 工具全局配置文件路径 |
-a / --all | 无需确认,直接执行迁移。默认跳过已完成的表和空表 |
-v / --version | mxshift 工具版本 |
-l / --show-progress | 打印最近一次迁移的进度。若目前未在迁移,则展示上一次的迁移结果;若目前正在迁移,则展示当前迁移进度。其中,“STATUS”取值及其含义为:Not start:不在迁移范围;Skip:跳过(已迁移过或者为空表);Processing/Interrupted:正在迁移或者迁移过程中被打断;Complete:完成迁移;Failed:迁移失败 |
-R / --redo | 若表已迁移过,不跳过,重新迁移(该选项会清除目标库中表的所有数据,并重新迁移) |
--write-table-meta-file | 将迁移表的元信息导出到 mxshift_meta_file 文件,并退出 |
--read-table-meta-file | 从 mxshift_meta_file 文件导入迁移表的元信息,并进行数据迁移 |