-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-29383: Iceberg: [V3] Add support for timestamp with nanosecond precession #6242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
5740926 to
76ac1b7
Compare
| return DateTimeUtil.dateFromDays((Integer) value); | ||
| } | ||
| break; | ||
| case TIMESTAMP: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to add TIMESTAMP_NANO in convertToWriteType ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It won't help as of now, because of bug in Iceberg:
apache/iceberg#13160
Unless we upgrade, so I thought I will handle that case independently once we upgrade as a follow-up. I didn't dig in deep once I hit the above error & it had a fix to find a hack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following completely, can you please help me understand. The iceberg issue you mentioned is fixed as part of iceberg 1.10.0 (PR 13746 and 13747) and hive is on iceberg 1.10.0. Will the following won't work?
case TIMESTAMP_NANO:
if (value instanceof Long) {
Types.TimestampNanoType timestampType = (Types.TimestampNanoType) type;
return timestampType.shouldAdjustToUTC()
? DateTimeUtil.timestamptzFromNanos((Long) value)
: DateTimeUtil.timestampFromNanos((Long) value);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue that I was hitting was this: apache/iceberg#14359
my code wasn't reaching there, so, I didn't try logically it should work
...r/src/main/java/org/apache/iceberg/mr/hive/serde/objectinspector/IcebergObjectInspector.java
Outdated
Show resolved
Hide resolved
| return IcebergTimeObjectInspector.get(); | ||
| case TIMESTAMP_NANO: | ||
| boolean adjust = ((Types.TimestampNanoType) primitiveType).shouldAdjustToUTC(); | ||
| return adjust ? TIMESTAMP_INSPECTOR_WITH_TZ : TIMESTAMP_INSPECTOR; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
case TIMESTAMP_NANO:
boolean adjust = ((Types.TimestampNanoType) primitiveType).shouldAdjustToUTC();
return adjust ? TIMESTAMP_INSPECTOR_WITH_TZ : TIMESTAMP_INSPECTOR;
I don't see any "nanos" usage in IcebergTimestampWithZoneObjectInspectorHive3 i.e. TIMESTAMPTZ_INSPECTOR_CLASS , should we create a new class extending AbstractPrimitiveJavaObjectInspector for handling nano case? Also TIMESTAMP_INSPECTOR will not return "nanosecond timestamp"
Lines 41 to 43 in eaa8189
| private IcebergTimestampObjectInspectorHive3() { | |
| super(TypeInfoFactory.timestampTypeInfo); | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK
Nanosecond precision is preserved because all conversions are done using:
- java.time.ZonedDateTime
- java.time.OffsetDateTime
- org.apache.hadoop.hive.common.type.TimestampTZ
All of these internally store time with nanosecond resolution
If you check the q file output, I have a case for all 4 types, and for the nanosecond timezones, it would be more digits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Append the following in timestamp_ns.q and run once.
CREATE TABLE t_nano_source (
id int,
ts_ns nanosecond timestamp
) STORED BY ICEBERG TBLPROPERTIES ('format-version'='3');
INSERT INTO t_nano_source VALUES (1, '2025-12-24 10:00:00.123456789');
select * from t_nano_source;
CREATE TABLE t_nano_target STORED BY ICEBERG AS SELECT * FROM t_nano_source;
select * from t_nano_target;
DESCRIBE t_nano_target;
The CTAS table (t_nano_target) is having column as timestamp:
POSTHOOK: query: select * from t_nano_target
POSTHOOK: type: QUERY
POSTHOOK: Input: default@t_nano_target
POSTHOOK: Output: hdfs://### HDFS PATH ###
1 2025-12-24 10:00:00.123456
.....
POSTHOOK: Input: default@t_nano_target
id int
ts_ns timestamp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on this I guess, new class for timestamp nano needs to be created. Maybe IcebergTimestampNanoWithZoneObjectInspector and IcebergTimestampNanoObjectInspector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Aggarwal-Raghav thanx for sharing the test, I was able to repro it, updated the code and have added a test case for it.
Lemme know how things are now
|
If i take your #6243 and added TIMESTAMP_NANO in switch and added But correct/expected Output should be as per my understanding: |
I changed that PR, Add the case like this: and it should give the intended result |
|



What changes were proposed in this pull request?
Add support for timestamp with nanosecond precession
Why are the changes needed?
To support the new timestamp types introduced as part of iceberg v3
Does this PR introduce any user-facing change?
Yes, users can leverage timestamp_ns & timestamptz_ns data types for iceberg v3 tables
How was this patch tested?
UT