Fully support date/time legacy rebase for nested input [databricks] #9660

Merged
merged 78 commits on Nov 16, 2023

Changes from all commits (78 commits)
c578a64
Add check for nested types
ttnghia Aug 28, 2023
e368aa6
Add check for nested types
ttnghia Aug 28, 2023
7da416b
Recursively check for rebasing
ttnghia Nov 2, 2023
df8f861
Extract common code
ttnghia Nov 2, 2023
95d19ee
Allow nested type in rebase check
ttnghia Nov 2, 2023
b426610
Enable nested timestamp in roundtrip test
ttnghia Nov 2, 2023
7343b17
Fix another test
ttnghia Nov 2, 2023
0d48f57
Merge branch 'check_rebase_nested' into rebase_datatime
ttnghia Nov 2, 2023
024e6c9
Enable `LEGACY` rebase in read
ttnghia Nov 2, 2023
9a39628
Remove comment
ttnghia Nov 2, 2023
e686bb0
Change function/class signatures
ttnghia Nov 2, 2023
b49963e
Merge branch 'branch-23.12' into rebase_datatime
ttnghia Nov 3, 2023
2c232f8
Complete modification
ttnghia Nov 3, 2023
ac0f3e4
Misc
ttnghia Nov 3, 2023
c773794
Add explicit type
ttnghia Nov 3, 2023
29df7cd
Rename file and add some stuff in DateTimeRebaseHelpers.scala
ttnghia Nov 3, 2023
1b5112d
Move file and rename class
ttnghia Nov 4, 2023
63342a9
Adopt new enum type
ttnghia Nov 4, 2023
6b2d795
Add name for the enum classes
ttnghia Nov 4, 2023
37aa40b
Change exception messages
ttnghia Nov 4, 2023
d4cdc1b
Merge branch 'branch-23.12' into refactor_parquet_scan
ttnghia Nov 4, 2023
03f681e
Does not yet support legacy rebase in read
ttnghia Nov 5, 2023
14f230f
Change legacy to corrected mode
ttnghia Nov 5, 2023
1b464ec
Extract common code
ttnghia Nov 5, 2023
0d26d97
Rename functions
ttnghia Nov 5, 2023
c2504fd
Reformat
ttnghia Nov 5, 2023
edb6c81
Make classes serializable
ttnghia Nov 5, 2023
ea86e8f
Revert "Support rebase checking for nested dates and timestamps (#9617)"
ttnghia Nov 6, 2023
b14463f
Merge branch 'refactor_parquet_scan' into rebase_datatime
ttnghia Nov 6, 2023
adc8ae2
Implement date time rebase
ttnghia Nov 6, 2023
791573c
Optimize rebase op
ttnghia Nov 6, 2023
54e959f
Merge branch 'branch-23.12' into refactor_parquet_scan
ttnghia Nov 6, 2023
3f01690
Change comment
ttnghia Nov 6, 2023
6d9c20b
Merge branch 'refactor_parquet_scan' into rebase_datatime
ttnghia Nov 6, 2023
8c63273
Move tests
ttnghia Nov 6, 2023
1b1fdc3
Add test for datatime rebase
ttnghia Nov 6, 2023
e6559ce
Various changes
ttnghia Nov 6, 2023
74fe84a
Various changes
ttnghia Nov 6, 2023
a455a90
Fix compile errors
ttnghia Nov 6, 2023
b87493c
Fix comments
ttnghia Nov 6, 2023
321e516
Fix indentations
ttnghia Nov 6, 2023
4bc33be
Merge branch 'refactor_parquet_scan' into rebase_datatime
ttnghia Nov 6, 2023
4aab36b
Change comments and indentations
ttnghia Nov 6, 2023
1b4744a
Merge branch 'rebase_datatime' into rebase_nested_timestamp
ttnghia Nov 6, 2023
70310db
Allow nested check for rebase
ttnghia Nov 7, 2023
c615925
Merge branch 'branch-23.12' into rebase_datatime
ttnghia Nov 7, 2023
be92368
Write different timestamp types in test
ttnghia Nov 7, 2023
b09c61f
Fix conversion if timestamp is not micros
ttnghia Nov 7, 2023
00d96e4
Rename var
ttnghia Nov 7, 2023
7d81311
Don't have to down cast after up cast
ttnghia Nov 7, 2023
116bf3e
Change comment
ttnghia Nov 7, 2023
273b2c4
Still cast timestamp to the old type after rebasing
ttnghia Nov 7, 2023
996d9d4
Rename test
ttnghia Nov 7, 2023
5fd6ef5
Should not transform non-datetime types
ttnghia Nov 7, 2023
d53ecfa
Merge branch 'rebase_datatime' into rebase_nested_timestamp
ttnghia Nov 7, 2023
4144655
Fix test
ttnghia Nov 7, 2023
5a8b44c
Update tests
ttnghia Nov 7, 2023
a33bfd6
Merge branch 'rebase_datatime' into rebase_nested_timestamp
ttnghia Nov 7, 2023
e366e5a
Enable int96 rebase in write
ttnghia Nov 7, 2023
247f47f
Change tests
ttnghia Nov 7, 2023
8eba053
Complete tests
ttnghia Nov 7, 2023
bda59ef
Revert unrelated changes
ttnghia Nov 7, 2023
bbcd9d9
Merge branch 'branch-23.12' into int96_rebase_write
ttnghia Nov 7, 2023
fbe37d7
Merge branch 'branch-23.12' into rebase_datatime
ttnghia Nov 7, 2023
4a92d54
Change configs
ttnghia Nov 8, 2023
54c53d3
Merge branch 'rebase_datatime' into rebase_nested_timestamp
ttnghia Nov 8, 2023
2f30ce9
Merge branch 'int96_rebase_write' into rebase_nested_timestamp
ttnghia Nov 8, 2023
af817de
Merge tests
ttnghia Nov 8, 2023
13242f4
Simplify test data
ttnghia Nov 8, 2023
e1d9f74
Add a new write test
ttnghia Nov 8, 2023
82012b6
Add a mixed rebase test
ttnghia Nov 8, 2023
76694ad
Merge branch 'branch-23.12' into rebase_nested_timestamp
ttnghia Nov 15, 2023
cbef912
Change tests
ttnghia Nov 15, 2023
1474dda
Merge branch 'branch-23.12' into rebase_nested_timestamp
ttnghia Nov 15, 2023
14487bf
Fix `seed` in tests
ttnghia Nov 15, 2023
0fff5e6
Rename tests
ttnghia Nov 15, 2023
8bfca59
Merge branch 'branch-23.12' into rebase_nested_timestamp
ttnghia Nov 16, 2023
61d7d3d
Add default seed
ttnghia Nov 16, 2023
37 changes: 6 additions & 31 deletions integration_tests/src/main/python/parquet_test.py
@@ -18,6 +18,7 @@
from asserts import assert_cpu_and_gpu_are_equal_collect_with_capture, assert_cpu_and_gpu_are_equal_sql_with_capture, assert_gpu_and_cpu_are_equal_collect, assert_gpu_and_cpu_row_counts_equal, \
assert_gpu_fallback_collect, assert_gpu_and_cpu_are_equal_sql, assert_gpu_and_cpu_error, assert_spark_exception
from data_gen import *
from parquet_write_test import parquet_nested_datetime_gen, parquet_ts_write_options
from marks import *
import pyarrow as pa
import pyarrow.parquet as pa_pq
@@ -310,42 +311,16 @@ def test_parquet_pred_push_round_trip(spark_tmp_path, parquet_gen, read_func, v1
lambda spark: rf(spark).select(f.col('a') >= s0),
conf=all_confs)


parquet_ts_write_options = ['INT96', 'TIMESTAMP_MICROS', 'TIMESTAMP_MILLIS']

# Once https://github.com/NVIDIA/spark-rapids/issues/1126 is fixed delete this test and merge it
# into test_parquet_read_roundtrip_datetime
Comment on lines -316 to -317 (Collaborator Author):

This deleted test is combined with the test_parquet_read_roundtrip_datetime.

@pytest.mark.parametrize('gen', [ArrayGen(TimestampGen(start=datetime(1900, 1, 1, tzinfo=timezone.utc))),
ArrayGen(ArrayGen(TimestampGen(start=datetime(1900, 1, 1, tzinfo=timezone.utc))))], ids=idfn)
@pytest.mark.parametrize('ts_write', parquet_ts_write_options)
@pytest.mark.parametrize('ts_rebase', ['CORRECTED', 'LEGACY'])
@pytest.mark.parametrize('reader_confs', reader_opt_confs)
@pytest.mark.parametrize('v1_enabled_list', ["", "parquet"])
@pytest.mark.xfail(reason='https://github.com/NVIDIA/spark-rapids/issues/1126')
def test_parquet_ts_read_round_trip_nested(gen, spark_tmp_path, ts_write, ts_rebase, v1_enabled_list, reader_confs):
data_path = spark_tmp_path + '/PARQUET_DATA'
with_cpu_session(
lambda spark : unary_op_df(spark, gen).write.parquet(data_path),
conf={'spark.sql.legacy.parquet.datetimeRebaseModeInWrite': ts_rebase,
'spark.sql.legacy.parquet.int96RebaseModeInWrite': ts_rebase,
'spark.sql.parquet.outputTimestampType': ts_write})
all_confs = copy_and_update(reader_confs, {'spark.sql.sources.useV1SourceList': v1_enabled_list})
assert_gpu_and_cpu_are_equal_collect(
lambda spark : spark.read.parquet(data_path),
conf=all_confs)

parquet_gens_legacy_list = [[byte_gen, short_gen, int_gen, long_gen, float_gen, double_gen,
string_gen, boolean_gen, date_gen, timestamp_gen]]

@pytest.mark.parametrize('parquet_gens', parquet_gens_legacy_list, ids=idfn)
@datagen_overrides(seed=0, reason='https://github.com/NVIDIA/spark-rapids/issues/9701')
@pytest.mark.parametrize('parquet_gens', [parquet_nested_datetime_gen], ids=idfn)
@pytest.mark.parametrize('ts_type', parquet_ts_write_options)
@pytest.mark.parametrize('ts_rebase_write', [('CORRECTED', 'LEGACY'), ('LEGACY', 'CORRECTED')])
@pytest.mark.parametrize('ts_rebase_read', [('CORRECTED', 'LEGACY'), ('LEGACY', 'CORRECTED')])
@pytest.mark.parametrize('reader_confs', reader_opt_confs)
@pytest.mark.parametrize('v1_enabled_list', ["", "parquet"])
def test_parquet_read_roundtrip_datetime(spark_tmp_path, parquet_gens, ts_type,
ts_rebase_write, ts_rebase_read,
reader_confs, v1_enabled_list):
def test_parquet_read_roundtrip_datetime_with_legacy_rebase(spark_tmp_path, parquet_gens, ts_type,
ts_rebase_write, ts_rebase_read,
reader_confs, v1_enabled_list):
gen_list = [('_c' + str(i), gen) for i, gen in enumerate(parquet_gens)]
data_path = spark_tmp_path + '/PARQUET_DATA'
write_confs = {'spark.sql.parquet.outputTimestampType': ts_type,
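For readers following the diff: the deleted nested-roundtrip test above was folded into test_parquet_read_roundtrip_datetime_with_legacy_rebase. Below is a minimal PySpark sketch of the scenario that test exercises; it is not taken from the test suite, and the path and sample data are illustrative assumptions.

```python
from datetime import datetime
from pyspark.sql import Row, SparkSession

spark = SparkSession.builder.getOrCreate()
path = "/tmp/parquet_legacy_rebase_demo"  # hypothetical path

# Array of timestamps, including a pre-Gregorian-switch value that needs rebasing.
df = spark.createDataFrame([Row(ts=[datetime(1582, 1, 1), datetime(2020, 1, 1)])])

spark.conf.set("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MICROS")
spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInWrite", "LEGACY")
spark.conf.set("spark.sql.legacy.parquet.int96RebaseModeInWrite", "CORRECTED")
df.write.mode("overwrite").parquet(path)

# The read-side rebase configs are effectively ignored for these files, because the
# write-time mode is recorded in the Parquet footer and the reader uses that instead.
spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "CORRECTED")
spark.read.parquet(path).show(truncate=False)
```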
43 changes: 32 additions & 11 deletions integration_tests/src/main/python/parquet_write_test.py
@@ -75,11 +75,12 @@
TimestampGen(start=datetime(1, 1, 1, tzinfo=timezone.utc),
end=datetime(2000, 1, 1, tzinfo=timezone.utc))
.with_special_case(datetime(1000, 1, 1, tzinfo=timezone.utc), weight=10.0)]
parquet_datetime_in_struct_gen = [StructGen([['child' + str(ind), sub_gen] for ind, sub_gen in enumerate(parquet_datetime_gen_simple)]),
StructGen([['child0', StructGen([['child' + str(ind), sub_gen] for ind, sub_gen in enumerate(parquet_datetime_gen_simple)])]])]
parquet_datetime_in_array_gen = [ArrayGen(sub_gen, max_length=10) for sub_gen in parquet_datetime_gen_simple + parquet_datetime_in_struct_gen] + [
ArrayGen(ArrayGen(sub_gen, max_length=10), max_length=10) for sub_gen in parquet_datetime_gen_simple + parquet_datetime_in_struct_gen]
parquet_nested_datetime_gen = parquet_datetime_gen_simple + parquet_datetime_in_struct_gen + parquet_datetime_in_array_gen
parquet_datetime_in_struct_gen = [
StructGen([['child' + str(ind), sub_gen] for ind, sub_gen in enumerate(parquet_datetime_gen_simple)])]
parquet_datetime_in_array_gen = [ArrayGen(sub_gen, max_length=10) for sub_gen in
parquet_datetime_gen_simple + parquet_datetime_in_struct_gen]
parquet_nested_datetime_gen = parquet_datetime_gen_simple + parquet_datetime_in_struct_gen + \
parquet_datetime_in_array_gen
Comment on lines +78 to +83 (Collaborator Author):

Simplify the data generators a bit, since they are too heavy and the tests using them (especially in parquet read tests) become very very slow now.


parquet_map_gens = parquet_map_gens_sample + [
MapGen(StructGen([['child0', StringGen()], ['child1', StringGen()]], nullable=False), FloatGen()),
@@ -460,15 +461,35 @@ def generate_map_with_empty_validity(spark, path):
@datagen_overrides(seed=0, reason='https://github.com/NVIDIA/spark-rapids/issues/9701')
@pytest.mark.parametrize('data_gen', parquet_nested_datetime_gen, ids=idfn)
@pytest.mark.parametrize('ts_write', parquet_ts_write_options)
@pytest.mark.parametrize('ts_rebase_write', ['CORRECTED', 'LEGACY'])
@pytest.mark.parametrize('ts_rebase_read', ['CORRECTED', 'LEGACY'])
def test_datetime_roundtrip_with_legacy_rebase(spark_tmp_path, data_gen, ts_write, ts_rebase_write, ts_rebase_read):
@pytest.mark.parametrize('ts_rebase_write', ['EXCEPTION'])
def test_parquet_write_fails_legacy_datetime(spark_tmp_path, data_gen, ts_write, ts_rebase_write):
data_path = spark_tmp_path + '/PARQUET_DATA'
all_confs = {'spark.sql.parquet.outputTimestampType': ts_write,
'spark.sql.legacy.parquet.datetimeRebaseModeInWrite': ts_rebase_write,
'spark.sql.legacy.parquet.int96RebaseModeInWrite': ts_rebase_write,
'spark.sql.legacy.parquet.datetimeRebaseModeInRead': ts_rebase_read,
'spark.sql.legacy.parquet.int96RebaseModeInRead': ts_rebase_read}
'spark.sql.legacy.parquet.int96RebaseModeInWrite': ts_rebase_write}
def writeParquetCatchException(spark, data_gen, data_path):
with pytest.raises(Exception) as e_info:
unary_op_df(spark, data_gen).coalesce(1).write.parquet(data_path)
assert e_info.match(r".*SparkUpgradeException.*")
with_gpu_session(
lambda spark: writeParquetCatchException(spark, data_gen, data_path),
conf=all_confs)

@datagen_overrides(seed=0, reason='https://github.com/NVIDIA/spark-rapids/issues/9701')
@pytest.mark.parametrize('data_gen', parquet_nested_datetime_gen, ids=idfn)
@pytest.mark.parametrize('ts_write', parquet_ts_write_options)
@pytest.mark.parametrize('ts_rebase_write', [('CORRECTED', 'LEGACY'), ('LEGACY', 'CORRECTED')])
@pytest.mark.parametrize('ts_rebase_read', [('CORRECTED', 'LEGACY'), ('LEGACY', 'CORRECTED')])
def test_parquet_write_roundtrip_datetime_with_legacy_rebase(spark_tmp_path, data_gen, ts_write,
ts_rebase_write, ts_rebase_read):
data_path = spark_tmp_path + '/PARQUET_DATA'
all_confs = {'spark.sql.parquet.outputTimestampType': ts_write,
'spark.sql.legacy.parquet.datetimeRebaseModeInWrite': ts_rebase_write[0],
'spark.sql.legacy.parquet.int96RebaseModeInWrite': ts_rebase_write[1],
# The rebase modes in read configs should be ignored and overridden by the same
# modes in write configs, which are retrieved from the written files.
'spark.sql.legacy.parquet.datetimeRebaseModeInRead': ts_rebase_read[0],
'spark.sql.legacy.parquet.int96RebaseModeInRead': ts_rebase_read[1]}
assert_gpu_and_cpu_writes_are_equal_collect(
lambda spark, path: unary_op_df(spark, data_gen).coalesce(1).write.parquet(path),
lambda spark, path: spark.read.parquet(path),
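Regarding the in-code comment that the read-side rebase configs are ignored: Spark records the write-time rebase mode in the Parquet footer key/value metadata, and the reader keys off those markers (the SPARK_LEGACY_DATETIME / SPARK_LEGACY_INT96 markers referenced later in DateTimeRebaseUtils). A hedged pyarrow sketch of how to inspect them; treat the literal key strings and the file path as assumptions.

```python
import pyarrow.parquet as pq

def written_rebase_markers(parquet_file_path):
    # Footer key/value metadata written by Spark; keys and values are bytes.
    kv = pq.ParquetFile(parquet_file_path).metadata.metadata or {}
    return {
        "legacy_datetime": b"org.apache.spark.legacyDateTime" in kv,
        "legacy_int96": b"org.apache.spark.legacyINT96" in kv,
    }

# Hypothetical part file produced by a LEGACY-mode write like the test above.
print(written_rebase_markers("/tmp/PARQUET_DATA/part-00000.parquet"))
```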
@@ -158,8 +158,6 @@ object GpuParquetScan {

def tagSupport(sparkSession: SparkSession, readSchema: StructType,
meta: RapidsMeta[_, _, _]): Unit = {
val sqlConf = sparkSession.conf

if (ParquetLegacyNanoAsLongShims.legacyParquetNanosAsLong) {
meta.willNotWorkOnGpu("GPU does not support spark.sql.legacy.parquet.nanosAsLong")
}
@@ -176,25 +174,6 @@

FileFormatChecks.tag(meta, readSchema, ParquetFormatType, ReadFileOp)

val schemaHasTimestamps = readSchema.exists { field =>
TrampolineUtil.dataTypeExistsRecursively(field.dataType, _.isInstanceOf[TimestampType])
}

def isTsOrDate(dt: DataType): Boolean = dt match {
case TimestampType | DateType => true
// Timestamp without timezone (TimestampNTZType, since Spark 3.4) is not yet supported
// See https://github.com/NVIDIA/spark-rapids/issues/9707.
case _ => false
}

val schemaMightNeedNestedRebase = readSchema.exists { field =>
if (DataTypeUtils.isNestedType(field.dataType)) {
TrampolineUtil.dataTypeExistsRecursively(field.dataType, isTsOrDate)
} else {
false
}
}

// Currently timestamp conversion is not supported.
// If support needs to be added then we need to follow the logic in Spark's
// ParquetPartitionReaderFactory and VectorizedColumnReader which essentially
@@ -204,43 +183,12 @@
// were written in that timezone and convert them to UTC timestamps.
// Essentially this should boil down to a vector subtract of the scalar delta
// between the configured timezone's delta from UTC on the timestamp data.
val schemaHasTimestamps = readSchema.exists { field =>
TrampolineUtil.dataTypeExistsRecursively(field.dataType, _.isInstanceOf[TimestampType])
}
if (schemaHasTimestamps && sparkSession.sessionState.conf.isParquetINT96TimestampConversion) {
meta.willNotWorkOnGpu("GpuParquetScan does not support int96 timestamp conversion")
}

DateTimeRebaseMode.fromName(sqlConf.get(SparkShimImpl.int96ParquetRebaseReadKey)) match {
case DateTimeRebaseException => if (schemaMightNeedNestedRebase) {
meta.willNotWorkOnGpu("Nested timestamp and date values are not supported when " +
s"${SparkShimImpl.int96ParquetRebaseReadKey} is EXCEPTION")
}
case DateTimeRebaseCorrected => // Good
case DateTimeRebaseLegacy =>
if (schemaMightNeedNestedRebase) {
meta.willNotWorkOnGpu("Nested timestamp and date values are not supported when " +
s"${SparkShimImpl.int96ParquetRebaseReadKey} is LEGACY")
}
// This should never be reached out, since invalid mode is handled in
// `DateTimeRebaseMode.fromName`.
case other => meta.willNotWorkOnGpu(
DateTimeRebaseUtils.invalidRebaseModeMessage(other.getClass.getName))
}

DateTimeRebaseMode.fromName(sqlConf.get(SparkShimImpl.parquetRebaseReadKey)) match {
case DateTimeRebaseException => if (schemaMightNeedNestedRebase) {
meta.willNotWorkOnGpu("Nested timestamp and date values are not supported when " +
s"${SparkShimImpl.parquetRebaseReadKey} is EXCEPTION")
}
case DateTimeRebaseCorrected => // Good
case DateTimeRebaseLegacy =>
if (schemaMightNeedNestedRebase) {
meta.willNotWorkOnGpu("Nested timestamp and date values are not supported when " +
s"${SparkShimImpl.parquetRebaseReadKey} is LEGACY")
}
// This should never be reached out, since invalid mode is handled in
// `DateTimeRebaseMode.fromName`.
case other => meta.willNotWorkOnGpu(
DateTimeRebaseUtils.invalidRebaseModeMessage(other.getClass.getName))
}
}

/**
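The blocks removed above are what previously tagged scans with willNotWorkOnGpu when nested dates or timestamps were combined with a LEGACY or EXCEPTION rebase mode. A rough usage sketch of the effect, assuming the RAPIDS plugin is on the classpath; the data path is hypothetical and the exact plan node names may differ.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
         .config("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "LEGACY")
         .config("spark.sql.legacy.parquet.int96RebaseModeInRead", "LEGACY")
         .getOrCreate())

# Before this PR, a nested date/timestamp schema here would have triggered
# willNotWorkOnGpu and fallen back to a CPU scan; now the read can stay on the GPU.
df = spark.read.parquet("/data/nested_legacy_dates.parquet")  # hypothetical path
df.explain()  # expect a GPU scan node in the plan rather than a CPU FileSourceScan
```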
@@ -18,7 +18,7 @@ package com.nvidia.spark.rapids

import java.util.TimeZone

import ai.rapids.cudf.{ColumnVector, DType, Scalar}
import ai.rapids.cudf.{ColumnView, DType, Scalar}
import com.nvidia.spark.rapids.Arm.withResource
import com.nvidia.spark.rapids.shims.SparkShimImpl

Expand Down Expand Up @@ -117,54 +117,50 @@ object DateTimeRebaseUtils {
SPARK_LEGACY_INT96_METADATA_KEY)
}

private[this] def isDateRebaseNeeded(column: ColumnVector,
startDay: Int): Boolean = {
// TODO update this for nested column checks
// https://github.com/NVIDIA/spark-rapids/issues/1126
private[this] def isRebaseNeeded(column: ColumnView, checkType: DType,
minGood: Scalar): Boolean = {
val dtype = column.getType
if (dtype == DType.TIMESTAMP_DAYS) {
val hasBad = withResource(Scalar.timestampDaysFromInt(startDay)) {
column.lessThan
}
val anyBad = withResource(hasBad) {
_.any()
}
withResource(anyBad) { _ =>
anyBad.isValid && anyBad.getBoolean
}
} else {
false
}
}
require(!dtype.hasTimeResolution || dtype == DType.TIMESTAMP_MICROSECONDS)

private[this] def isTimeRebaseNeeded(column: ColumnVector,
startTs: Long): Boolean = {
val dtype = column.getType
if (dtype.hasTimeResolution) {
require(dtype == DType.TIMESTAMP_MICROSECONDS)
withResource(
Scalar.timestampFromLong(DType.TIMESTAMP_MICROSECONDS, startTs)) { minGood =>
dtype match {
case `checkType` =>
withResource(column.lessThan(minGood)) { hasBad =>
withResource(hasBad.any()) { a =>
a.isValid && a.getBoolean
withResource(hasBad.any()) { anyBad =>
anyBad.isValid && anyBad.getBoolean
}
}
}
} else {
false

case DType.LIST | DType.STRUCT => (0 until column.getNumChildren).exists(i =>
withResource(column.getChildColumnView(i)) { child =>
isRebaseNeeded(child, checkType, minGood)
})

case _ => false
}
}

private[this] def isDateRebaseNeeded(column: ColumnView, startDay: Int): Boolean = {
withResource(Scalar.timestampDaysFromInt(startDay)) { minGood =>
isRebaseNeeded(column, DType.TIMESTAMP_DAYS, minGood)
}
}

private[this] def isTimeRebaseNeeded(column: ColumnView, startTs: Long): Boolean = {
withResource(Scalar.timestampFromLong(DType.TIMESTAMP_MICROSECONDS, startTs)) { minGood =>
isRebaseNeeded(column, DType.TIMESTAMP_MICROSECONDS, minGood)
}
}

def isDateRebaseNeededInRead(column: ColumnVector): Boolean =
def isDateRebaseNeededInRead(column: ColumnView): Boolean =
isDateRebaseNeeded(column, RebaseDateTime.lastSwitchJulianDay)

def isTimeRebaseNeededInRead(column: ColumnVector): Boolean =
def isTimeRebaseNeededInRead(column: ColumnView): Boolean =
isTimeRebaseNeeded(column, RebaseDateTime.lastSwitchJulianTs)

def isDateRebaseNeededInWrite(column: ColumnVector): Boolean =
def isDateRebaseNeededInWrite(column: ColumnView): Boolean =
isDateRebaseNeeded(column, RebaseDateTime.lastSwitchGregorianDay)

def isTimeRebaseNeededInWrite(column: ColumnVector): Boolean =
def isTimeRebaseNeededInWrite(column: ColumnView): Boolean =
isTimeRebaseNeeded(column, RebaseDateTime.lastSwitchGregorianTs)

def newRebaseExceptionInRead(format: String): Exception = {
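The new isRebaseNeeded walks LIST and STRUCT children recursively instead of handling only flat columns. The same idea can be mirrored in Python with pyarrow; this is a sketch for illustration only, where the cutoff constants are simplified stand-ins for RebaseDateTime.lastSwitchJulianDay/lastSwitchJulianTs and the helper name is made up.

```python
import datetime
import pyarrow as pa
import pyarrow.compute as pc

# Simplified cutoff: the Julian-to-Gregorian switch date (1582-10-15).
JULIAN_END_DATE = datetime.date(1582, 10, 15)
JULIAN_END_TS = datetime.datetime(1582, 10, 15)

def needs_rebase(arr: pa.Array) -> bool:
    """Recursively check whether any date/timestamp value predates the cutoff,
    descending into list and struct children, mirroring isRebaseNeeded above."""
    t = arr.type
    if pa.types.is_date(t):
        return bool(pc.any(pc.less(arr, pa.scalar(JULIAN_END_DATE, type=t))).as_py())
    if pa.types.is_timestamp(t):
        return bool(pc.any(pc.less(arr, pa.scalar(JULIAN_END_TS, type=t))).as_py())
    if pa.types.is_list(t):
        return needs_rebase(arr.flatten())
    if pa.types.is_struct(t):
        return any(needs_rebase(arr.field(i)) for i in range(t.num_fields))
    return False

# Example: a struct containing an array of timestamps with one pre-1582 value.
col = pa.array([{"ts": [datetime.datetime(1581, 1, 1), datetime.datetime(2020, 1, 1)]}])
print(needs_rebase(col))  # True
```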