Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug(sink): ClickHouse DateTime64 insertion #13645

Closed
xiangjinwu opened this issue Nov 24, 2023 · 1 comment
Closed

bug(sink): ClickHouse DateTime64 insertion #13645

xiangjinwu opened this issue Nov 24, 2023 · 1 comment

Comments

@xiangjinwu
Copy link
Contributor

xiangjinwu commented Nov 24, 2023

In ClickHouse, the data type DateTime can optionally be parameterized by precision and timezone. With different parameters, the proper insert can be different. However, our sink is issuing insert without checking these parameters.

https://clickhouse.com/docs/en/sql-reference/data-types/datetime64

  • When inserting datetime as an integer, it is treated as an appropriately scaled Unix Timestamp (UTC).
  • Inserting datetime as a decimal will treat it similarly as an integer, except the value before the decimal point is the Unix Timestamp up to and including the seconds, and after the decimal point will be treated as the precision.
  • When inserting string value as datetime, it is treated as being in column timezone.

In the example below, we created 3 columns using different parameters. For each row t0 = t1 = t2 but note how they require different insert:

  • Inserting 0 is simple
  • For integers, t0 requires 3 trailing zeros but t1 and t2 requires 6 trailing zeros.
  • For string, t0 and t1 requires UTC but t2 requires a local time in that zone.
  • For decimal, all 3 columns can use the same value in insert. But at least 1 zero is required to differentiate it from integer form.
create table t (rid Int32, t0 DateTime64(3, 'UTC'), t1 DateTime(6, 'UTC'), t2 DateTime(6, 'Asia/Singapore')) ENGINE = ReplacingMergeTree PRIMARY KEY (rid);

insert into t values (0, 0, 0, 0);
insert into t values (1, 1689130892000, 1689130892000000, 1689130892000000);
insert into t values (2, '2023-09-01 03:00:00', '2023-09-01 03:00:00', '2023-09-01 11:00:00');
insert into t values (3, 1689130892.0, 1689130892.0, 1689130892.0);

select *, t0 = t1, t1 = t2 from t order by rid;
┌─rid─┬──────────────────────t0─┬─────────────────────────t1─┬─────────────────────────t2─┬─equals(t0, t1)─┬─equals(t1, t2)─┐
│   0 │ 1970-01-01 00:00:00.000 │ 1970-01-01 00:00:00.000000 │ 1970-01-01 07:30:00.000000 │              1 │              1 │
└─────┴─────────────────────────┴────────────────────────────┴────────────────────────────┴────────────────┴────────────────┘
┌─rid─┬──────────────────────t0─┬─────────────────────────t1─┬─────────────────────────t2─┬─equals(t0, t1)─┬─equals(t1, t2)─┐
│   1 │ 2023-07-12 03:01:32.000 │ 2023-07-12 03:01:32.000000 │ 2023-07-12 11:01:32.000000 │              1 │              1 │
└─────┴─────────────────────────┴────────────────────────────┴────────────────────────────┴────────────────┴────────────────┘
┌─rid─┬──────────────────────t0─┬─────────────────────────t1─┬─────────────────────────t2─┬─equals(t0, t1)─┬─equals(t1, t2)─┐
│   2 │ 2023-09-01 03:00:00.000 │ 2023-09-01 03:00:00.000000 │ 2023-09-01 11:00:00.000000 │              1 │              1 │
└─────┴─────────────────────────┴────────────────────────────┴────────────────────────────┴────────────────┴────────────────┘
┌─rid─┬──────────────────────t0─┬─────────────────────────t1─┬─────────────────────────t2─┬─equals(t0, t1)─┬─equals(t1, t2)─┐
│   3 │ 2023-07-12 03:01:32.000 │ 2023-07-12 03:01:32.000000 │ 2023-07-12 11:01:32.000000 │              1 │              1 │
└─────┴─────────────────────────┴────────────────────────────┴────────────────────────────┴────────────────┴────────────────┘

4 rows in set. Elapsed: 0.010 sec. 
@github-actions github-actions bot added this to the release-1.5 milestone Nov 24, 2023
@xiangjinwu
Copy link
Contributor Author

let accuracy_time = if ck_column.r#type.contains("DateTime64(") {
ck_column
.r#type
.split("DateTime64(")
.last()
.ok_or_else(|| SinkError::ClickHouse("must have last".to_string()))?
.split(')')
.next()
.ok_or_else(|| SinkError::ClickHouse("must have next".to_string()))?
.parse::<u8>()
.map_err(|e| SinkError::ClickHouse(format!("clickhouse sink error {}", e)))?
} else {
0_u8
};

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant