Write Path

To accommodate diverse workload requirements, YMatrix supports three write modes. The active mode is determined by the table-level parameters prefer_load_mode and rowstore_size. The default mode is Normal.

Image

Single Mode

In Single mode, data accumulates in local memory up to a maximum of 1 MB before being inserted directly into the rowstore. This process resembles traditional PostgreSQL Heap insertion. Data resides directly in Shared Buffers, the size of which is controlled by the shared_buffers parameter. In Single mode, consider increasing this parameter to prevent memory swapping.

Image

Advantages:

  • Minimal insertion latency.
  • Minimal memory consumption.

Disadvantages:

  • For large data loads, data enters the rowstore first, requiring an additional merge operation later.
  • If data in the rowstore is not converted to columnstore promptly, the reported table size (\dt+) may appear inflated. Manually executing VACUUM or VACUUM FULL before checking can resolve this.
  • Analytical Processing (AP) queries and BRIN index queries perform less efficiently compared to columnstore.
  • Lack of compression results in data and XLOG sizes several times larger than columnstore. This mode is unsuitable for environments with poor I/O performance.

Bulk Mode

In Bulk mode, data is copied to local memory until it reaches the rowstore_size limit. Once this threshold is met, the data is immediately converted to columnstore format and flushed to disk. Upon completion of the insert operation, any remaining data in local memory (even if below rowstore_size) is also converted to columnstore and flushed to disk.

Advantages:

  • Data is compressed, reducing disk I/O pressure.
  • AP queries and BRIN index queries perform better than in rowstore.
  • Eliminates one instance of write amplification.

Disadvantages:

  • Higher memory consumption, often several times that of rowstore, particularly for tables with many partitions.
  • Insertion latency is higher compared to rowstore.

Normal Mode

Normal mode is an intelligent insertion strategy. Data is copied to local memory.

  • If the memory usage reaches rowstore_size, the data is immediately converted to columnstore and flushed to disk.
  • When the insert operation completes:
    • If the remaining data in local memory is less than rowstore_size / 2, it is written to the current rowstore.
    • Otherwise, it is converted to columnstore and flushed to disk.

This is the default mode.

Write Performance Comparison by Mode

  1. Single Row Insertion: pgbench -n -f single_insert.sql -c 1 -j 1 -t 100000 test
Test Type Transactions per Client Clients Threads Avg Latency (ms) TPS (incl. connection) TPS (excl. connection)
single_insert.sql 100,000 1 1 1.98 505.03 505.06
bulk_insert.sql 100,000 1 1 15.282 65.44 65.44
normal_insert.sql 100,000 1 1 1.968 508.02 508.05
  1. Small Batch Insertion: time for i in {1..2000}; do psql test -f batch_insert.sql; done
Mode Total Duration Total Seconds (s) Rows per Second (rows/s)
single 2m27s 147 ≈ 680
normal 2m28s 148 ≈ 676
bulk 4m55s 295 ≈ 339
  1. Large Batch Load: Generated a 50 million row CSV file and imported it using COPY into tables configured with different modes.
Mode Rows Duration (seconds)
bulk 50,000,000 69.287
single 50,000,000 50.868
normal 50,000,000 70.612
  1. MARS3 Single Mode vs. Heap Write Comparison: Compression ratios approach 10:1. TPS is approximately 25% lower compared to standard Heap tables.
Table Name Compression Ratio Storage Savings
fi_voucher 7:1 86%
fi_voucher_b 15:1 93%
aai_voucher 4.5:1 78%
aai_voucher_record 8.9:1 89%

WAL Generation Comparison by Mode

In a real-world scenario inserting identical data volumes, the WAL generation rate remains consistent across modes. However, the total WAL generated is approximately 1/3 of that produced by standard Heap tables. This significantly reduces storage space occupied by WAL files.

Image

Return to previous section: Storage Engine Principles