Data Types

This document describes the data types supported by YMatrix.

YMatrix supports the following categories of data types:

Basic data types
1. Numeric
2. String
3. Date and time
Extended data types
1. JSON / jsonb
2. MXKV
Spatial data types

1 Basic Data Types

1.1 Numeric

Type	Alias	Size	Range	Description
smallint	int2	2 bytes	-32768 ~ 32767	Small-range integer
int	int4	4 bytes	-2147483648 ~ 2147483647	Integer type
bigint	int8	8 bytes	-9223372036854775808 ~ 9223372036854775807	Large-range integer
serial	serial4	4 bytes	1 ~ 2147483647	Auto-incrementing integer
bigserial	serial8	8 bytes	1 ~ 9223372036854775807	Large-range auto-incrementing integer
real	float4	4 bytes	6-digit precision	Variable precision
double precision	float8	8 bytes	15-digit precision	Variable precision
decimal	numeric	variable	unlimited	Fixed precision

As shown in the table, YMatrix numeric types fall into four categories:

Integers: Different sizes provide different ranges; all are signed
1. smallint
2. int
3. bigint
Fixed-point numbers: Numbers with fixed decimal precision
1. decimal
Floating-point numbers: Numbers with variable decimal precision
1. real
2. double precision
Auto-increment types: Used for auto-incrementing ID columns
1. serial
2. bigserial

1.2 String

Type	Alias	Size	Range	Description
character [ (n) ]	char [ (n) ]	1 byte + n	Up to n characters	Fixed-length string, padded with spaces if not fully filled
character varying [ (n) ]	varchar [ (n) ]	1 byte + string size	Up to n characters	Variable-length string with length limit; maximum 10,485,760 bytes
text		1 byte + string length	Unlimited length	Variable-length string without length limit

As shown in the table, YMatrix string types fall into two categories:

Fixed-length: char
Variable-length
1. varchar: Length-limited
2. text: Unlimited length

1.3 Date and Time

Type	Alias	Size	Range	Description
date		4 bytes	4713 BC ~ 294,277 AD	Calendar date (year, month, day)
time [ (p) ] [ without time zone ]		8 bytes	00:00:00[.000000] ~ 24:00:00[.000000]	Time of day
time [ (p) ] with time zone	timetz	12 bytes	00:00:00+1359 ~ 24:00:00-1359	Time of day with time zone
timestamp [ (p) ] [ without time zone ]		8 bytes	4713 BC ~ 294,277 AD	Date and time
timestamp [ (p) ] with time zone	timestamptz	8 bytes	4713 BC ~ 294,277 AD	Date and time with time zone

As shown in the table, YMatrix date/time types fall into three categories:

Date: Represents only year-month-day, no time
1. date
Time: Represents only time, no date
1. time [without time zone] (without time zone)
2. time with time zone (with time zone)
Timestamp: Represents both date and time
1. timestamp [without time zone] (without time zone)
2. timestamp with time zone (with time zone)

Impact of Time Zone on Time Types

A notable feature in time types is the time zone setting. By default, time zone is not included (i.e., without time zone). What impact does time zone have on time types? The following example demonstrates this behavior.

CREATE TABLE test (
    c1 int,
    c2 timestamp,
    c3 timestamp with time zone
) DISTRIBUTED BY (c1);

The test table contains two timestamp columns: one without time zone and one with time zone. Insert test data and query:

INSERT INTO test VALUES(1, now(), now());

SELECT * FROM test;
 c1 |            c2            |             c3
----+--------------------------+-----------------------------
  1 | 2021-12-01 14:54:09.4783 | 2021-12-01 14:54:09.4783+08
(1 row)

SHOW timezone;
   TimeZone
---------------
 Asia/Shanghai
(1 row)

The query result shows that both timestamps appear identical. However, the c3 column (with time zone) includes +08, indicating it belongs to UTC+8. The SHOW timezone command confirms the current session time zone is 'Asia/Shanghai' (Beijing time).

Now change the time zone and requery:

SET timezone TO 'Japan';

SELECT * FROM test;
 c1 |            c2            |             c3
----+--------------------------+-----------------------------
  1 | 2021-12-01 14:54:09.4783 | 2021-12-01 15:54:09.4783+09
(1 row)

After changing the database time zone to Japan, the c3 column now displays a different time (15:54) and the time zone suffix changes to +09 (UTC+9).

Therefore, timestamp columns with time zone store time zone information and display local time accordingly in different time zones. In contrast, timestamp columns without time zone always display the stored time unchanged, regardless of session time zone.

When deploying systems across multiple geographic regions with different time zones, using timestamp with time zone is strongly recommended. Note that both timestamp variants use the same storage size, so including time zone information is advised.

Recommendations for Basic Data Types in Time-Series Scenarios

Time-series data typically includes:

Timestamps: Must include both date and time. Use timestamp with time zone.
Device identifiers:
1. For string-based identifiers, use varchar or text. Use char only for fixed-length identifiers such as serial numbers.
2. If maintaining a separate device table, use serial or bigserial as auto-incrementing primary keys to uniquely identify devices.
Metrics:
1. For integer values, choose smallint, int, or bigint based on expected range.
2. For decimal values with fixed precision and moderate range (e.g., temperature), use decimal.
3. For decimal values with variable precision and large range, choose real or double precision based on required accuracy.

2 Extended Data Types

In relational models, basic data types represent one attribute per column (e.g., name, height, weight). However, two challenges arise:

The set of metrics is unpredictable, requiring frequent schema changes.
The number of metrics exceeds YMatrix's column limit (1600 columns per table).

In such cases, extended data types are needed. These allow storing multiple attributes within a single column.

There are many ways to implement extended types (e.g., custom serialization in string columns). YMatrix provides two built-in extended types:

JSON / jsonb
MXKV

2.1 JSON / jsonb

The JSON / jsonb types are inherited from PostgreSQL. Unlike storing raw JSON strings in text columns, JSON / jsonb types support built-in functions for accessing keys, merging, and deletion.

JSON and jsonb are functionally similar and can be converted between each other. Key differences:

JSON is stored as text; jsonb is stored in binary format.
JSON writes are faster (no binary conversion).
jsonb reads are faster due to binary representation.

Common JSON / jsonb operations:

Extract key using ->

SELECT '{"a":1, "b":2, "c":3}'::json->'a';
 ?column?
----------
 1
(1 row)

For nested JSON, chain multiple -> operators:

SELECT '{"a":1, "b":2, "c":{"d":2}}'::json->'c'->'d';
 ?column?
----------
 2
(1 row)

Delete key

Use the - operator to delete a key. Write operations require jsonb:

SELECT '{"a":1, "b":2, "c":3}'::jsonb - 'a';
     ?column?
------------------
 {"b": 2, "c": 3}
(1 row)

Merge

Use the || operator to merge two jsonb objects:

SELECT '{"a":1, "b":2}'::jsonb || '{"c":3}'::jsonb;
         ?column?
--------------------------
 {"a": 1, "b": 2, "c": 3}
(1 row)

Update

When merging with duplicate keys, || updates the value:

SELECT '{"a":1, "b":2}'::jsonb || '{"b":3}'::jsonb;
     ?column?
------------------
 {"a": 1, "b": 3}
(1 row)

For more JSON functions, see the PostgreSQL documentation.

2.2 MXKV

MXKV is a high-performance, scalable storage type developed by YMatrix. It is used similarly to JSON but offers key advantages: binary storage with compression, resulting in smaller storage footprint and higher query performance.

For detailed usage, refer to the documentation and blog.

3 Spatial Data Types

YMatrix provides an enhanced version of the PostGIS extension, supporting storage and computation of spatial data types. For usage details, see Spatial Data.

← Previous

Tool Guide

Storage Engine