[fix](parquet)fix parquet write timestamp int96 type. (1/2) #61760

Merged
morningman merged 4 commits into apache:master from hubgeter:arrow_int96_fix
Mar 28, 2026

@hubgeter
Contributor

@hubgeter hubgeter commented Mar 26, 2026

What problem does this PR solve?

Related PR: #60946

Problem Summary:
This pull request fixes a regression from the patch introduced in #60946, which caused Doris exports to fail when writing Parquet int96 timestamps. It adds a new patch to Arrow that introduces a parameter to force timestamps to be written as int96.

This PR only updates thirdparty; a follow-up PR will update the BE code.
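
As background for reviewers, the PR text does not show the Arrow patch itself, so the following is only a minimal sketch of what "writing to int96" means at the byte level, not the patch's code. It assumes the legacy Parquet INT96 timestamp layout (8 bytes of nanoseconds within the day followed by a 4-byte Julian day number, both little-endian). The function names `encode_int96` and `decode_int96` are hypothetical and do not come from Arrow or Doris.

```python
import struct

# Julian day number that corresponds to 1970-01-01 (the Unix epoch).
JULIAN_DAY_OF_UNIX_EPOCH = 2440588
MICROS_PER_DAY = 86_400 * 1_000_000


def encode_int96(epoch_micros):
    """Pack microseconds since the Unix epoch into 12 INT96 bytes (illustrative)."""
    # divmod floors, so timestamps before 1970 still get a non-negative time of day.
    days, micros_in_day = divmod(epoch_micros, MICROS_PER_DAY)
    nanos_in_day = micros_in_day * 1_000
    # "<qi" = little-endian, no padding: 8-byte nanoseconds, then 4-byte Julian day.
    return struct.pack("<qi", nanos_in_day, days + JULIAN_DAY_OF_UNIX_EPOCH)


def decode_int96(raw):
    """Inverse of encode_int96: 12 bytes back to microseconds since the epoch."""
    nanos_in_day, julian_day = struct.unpack("<qi", raw)
    days = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    return days * MICROS_PER_DAY + nanos_in_day // 1_000


if __name__ == "__main__":
    ts = 1_700_000_000_123_456  # an arbitrary microsecond timestamp
    raw = encode_int96(ts)
    print(len(raw), raw.hex(), decode_int96(raw) == ts)
```

The round trip is lossless for microsecond inputs because the encoded value carries nanosecond precision.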

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Mar 26, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hubgeter
Contributor Author

run buildall

@hubgeter
Contributor Author

run buildall

@morningman morningman marked this pull request as ready for review March 26, 2026 15:18
morningman previously approved these changes Mar 26, 2026
@github-actions github-actions bot added the approved label Mar 26, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@github-actions github-actions bot removed the approved label Mar 26, 2026
@hello-stephen
Contributor

skip buildall

@morningman
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 27044 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1c7230b935be95fd147570e4d7478276c4c56570, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17626	4563	4298	4298
q2
q3	10651	823	564	564
q4	4675	354	254	254
q5	7565	1231	1015	1015
q6	178	172	147	147
q7	799	840	678	678
q8	9735	1512	1385	1385
q9	5259	4755	4783	4755
q10	6330	1909	1680	1680
q11	483	243	257	243
q12	764	591	467	467
q13	18068	2700	1953	1953
q14	230	241	212	212
q15
q16	761	761	681	681
q17	740	854	432	432
q18	6284	5459	5190	5190
q19	1334	972	616	616
q20	548	481	376	376
q21	4560	1837	1807	1807
q22	435	337	291	291
Total cold run time: 97025 ms
Total hot run time: 27044 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4690	4623	4584	4584
q2
q3	3894	4352	3869	3869
q4	936	1264	808	808
q5	4178	4396	4335	4335
q6	191	184	143	143
q7	1765	1680	1595	1595
q8	2609	2685	2641	2641
q9	7595	7273	7417	7273
q10	3749	4009	3626	3626
q11	564	474	437	437
q12	521	590	467	467
q13	2595	2946	2057	2057
q14	296	313	297	297
q15
q16	753	761	706	706
q17	1160	1383	1336	1336
q18	7272	6878	6601	6601
q19	970	917	946	917
q20	2057	2144	1989	1989
q21	3998	3492	3496	3492
q22	452	441	380	380
Total cold run time: 50245 ms
Total hot run time: 47553 ms
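
The report does not label its columns. Judging from the numbers, each row appears to hold the query name, one cold run, two hot runs, and the faster of the two hot runs, and the totals appear to be column sums. The sketch below checks that reading against the Round 1 rows copied from this comment. The name `summarize` is hypothetical and is not part of the Doris tpch-tools scripts.

```python
# Round 1 rows from the report: query, cold, hot1, hot2, best hot (all in ms).
# q2 and q15 reported no timings and are therefore absent here.
ROUND1 = """\
q1 17626 4563 4298 4298
q3 10651 823 564 564
q4 4675 354 254 254
q5 7565 1231 1015 1015
q6 178 172 147 147
q7 799 840 678 678
q8 9735 1512 1385 1385
q9 5259 4755 4783 4755
q10 6330 1909 1680 1680
q11 483 243 257 243
q12 764 591 467 467
q13 18068 2700 1953 1953
q14 230 241 212 212
q16 761 761 681 681
q17 740 854 432 432
q18 6284 5459 5190 5190
q19 1334 972 616 616
q20 548 481 376 376
q21 4560 1837 1807 1807
q22 435 337 291 291
"""


def summarize(report):
    """Return (total cold ms, total hot ms) for whitespace-separated report rows."""
    cold_total = 0
    hot_total = 0
    for line in report.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip rows without a full set of timings
        _, cold, hot1, hot2, best = fields
        # Assumed column meaning: the last value is the faster hot run.
        assert int(best) == min(int(hot1), int(hot2))
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


if __name__ == "__main__":
    print(summarize(ROUND1))
```

Under that reading, the sums reproduce the reported Round 1 totals of 97025 ms cold and 27044 ms hot.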

@doris-robot

TPC-DS: Total hot run time: 169606 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1c7230b935be95fd147570e4d7478276c4c56570, data reload: false

query5	4328	625	519	519
query6	328	228	204	204
query7	4225	459	264	264
query8	338	237	237	237
query9	8736	2712	2712	2712
query10	540	393	368	368
query11	6989	5072	4878	4878
query12	197	138	132	132
query13	1278	468	346	346
query14	5628	3730	3507	3507
query14_1	2842	2876	2887	2876
query15	208	198	185	185
query16	983	478	455	455
query17	926	731	641	641
query18	2457	487	347	347
query19	203	196	181	181
query20	140	132	124	124
query21	214	136	110	110
query22	13294	14033	15119	14033
query23	16706	16426	16255	16255
query23_1	16186	16058	15614	15614
query24	7174	1627	1200	1200
query24_1	1234	1221	1214	1214
query25	538	469	415	415
query26	1236	263	150	150
query27	2772	477	294	294
query28	4467	1826	1803	1803
query29	876	613	476	476
query30	303	230	191	191
query31	1013	965	862	862
query32	82	68	71	68
query33	528	344	286	286
query34	873	869	524	524
query35	632	684	601	601
query36	1055	1112	975	975
query37	136	93	82	82
query38	2913	2937	2855	2855
query39	852	848	821	821
query39_1	791	794	796	794
query40	230	150	136	136
query41	62	61	60	60
query42	258	256	255	255
query43	255	250	219	219
query44	
query45	208	191	187	187
query46	866	992	605	605
query47	2133	2504	2085	2085
query48	332	317	227	227
query49	641	444	377	377
query50	680	279	221	221
query51	4119	4041	4037	4037
query52	264	266	256	256
query53	287	330	297	297
query54	307	267	271	267
query55	94	86	82	82
query56	325	324	308	308
query57	1902	1822	1686	1686
query58	322	280	272	272
query59	2827	2949	2754	2754
query60	357	341	332	332
query61	162	157	165	157
query62	628	592	544	544
query63	305	278	274	274
query64	5193	1299	1023	1023
query65	
query66	1467	452	356	356
query67	24316	24400	24276	24276
query68	
query69	400	316	286	286
query70	979	991	955	955
query71	354	315	309	309
query72	2876	2864	2736	2736
query73	546	551	321	321
query74	9639	9616	9397	9397
query75	2888	2751	2502	2502
query76	2287	1068	675	675
query77	380	399	321	321
query78	10938	11049	10488	10488
query79	1856	775	572	572
query80	1427	659	578	578
query81	549	266	230	230
query82	982	152	123	123
query83	342	281	253	253
query84	306	129	106	106
query85	1018	519	471	471
query86	421	324	313	313
query87	3119	3143	2998	2998
query88	3515	2627	2631	2627
query89	427	372	345	345
query90	2018	177	176	176
query91	171	158	138	138
query92	76	71	74	71
query93	1000	836	503	503
query94	666	321	280	280
query95	590	345	318	318
query96	648	511	225	225
query97	2488	2482	2388	2388
query98	231	221	217	217
query99	993	987	932	932
Total cold run time: 251264 ms
Total hot run time: 169606 ms

@hubgeter
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 26281 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ecb25520238e5e88e2d000622eeea7856849d391, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17577	4459	4288	4288
q2
q3	10644	775	536	536
q4	4674	355	250	250
q5	7558	1200	1011	1011
q6	176	173	144	144
q7	765	867	670	670
q8	9303	1443	1346	1346
q9	4908	4705	4453	4453
q10	6247	1911	1637	1637
q11	476	265	237	237
q12	710	574	467	467
q13	18022	2698	1949	1949
q14	231	241	220	220
q15
q16	714	738	663	663
q17	723	850	457	457
q18	6046	5418	5283	5283
q19	1253	1125	597	597
q20	540	502	379	379
q21	4631	1816	1404	1404
q22	332	290	361	290
Total cold run time: 95530 ms
Total hot run time: 26281 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4816	4671	4550	4550
q2
q3	3893	4442	3821	3821
q4	865	1188	800	800
q5	4098	4390	4353	4353
q6	183	174	146	146
q7	1783	1654	1537	1537
q8	2470	2696	2591	2591
q9	7701	7438	7454	7438
q10	3846	3958	3624	3624
q11	504	457	419	419
q12	498	588	443	443
q13	2429	2912	2058	2058
q14	294	292	276	276
q15
q16	715	780	757	757
q17	1237	1399	1440	1399
q18	7057	7003	6598	6598
q19	872	868	871	868
q20	2058	2161	2007	2007
q21	4114	3547	3438	3438
q22	480	436	371	371
Total cold run time: 49913 ms
Total hot run time: 47494 ms

@doris-robot

TPC-DS: Total hot run time: 168743 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ecb25520238e5e88e2d000622eeea7856849d391, data reload: false

query5	4339	626	506	506
query6	354	227	209	209
query7	4218	462	260	260
query8	344	238	223	223
query9	8730	2732	2718	2718
query10	524	381	347	347
query11	6932	5082	4883	4883
query12	186	133	123	123
query13	1278	473	351	351
query14	5668	3663	3396	3396
query14_1	2838	2844	2820	2820
query15	202	193	176	176
query16	1011	470	460	460
query17	1122	748	641	641
query18	2479	452	357	357
query19	213	210	187	187
query20	136	126	127	126
query21	218	140	116	116
query22	13147	14060	14700	14060
query23	16735	16254	16141	16141
query23_1	15891	15734	15781	15734
query24	7158	1624	1219	1219
query24_1	1235	1272	1222	1222
query25	537	458	407	407
query26	1245	256	148	148
query27	2789	475	299	299
query28	4461	1871	1821	1821
query29	830	570	471	471
query30	299	222	188	188
query31	1020	943	869	869
query32	82	72	68	68
query33	511	315	290	290
query34	891	867	532	532
query35	618	706	595	595
query36	1104	1136	1019	1019
query37	130	97	81	81
query38	2936	2899	2904	2899
query39	856	836	805	805
query39_1	788	802	774	774
query40	229	148	135	135
query41	62	100	57	57
query42	261	257	259	257
query43	241	245	212	212
query44	
query45	189	188	180	180
query46	881	977	613	613
query47	2144	2200	2056	2056
query48	305	331	233	233
query49	637	465	378	378
query50	693	273	223	223
query51	4097	4065	3979	3979
query52	262	269	251	251
query53	288	357	283	283
query54	297	260	258	258
query55	88	83	82	82
query56	306	319	305	305
query57	1944	1892	1733	1733
query58	281	270	269	269
query59	2806	2926	2753	2753
query60	345	334	321	321
query61	157	149	147	147
query62	638	576	549	549
query63	307	276	270	270
query64	5076	1285	1002	1002
query65	
query66	1457	450	363	363
query67	24239	24330	24073	24073
query68	
query69	408	317	291	291
query70	985	956	881	881
query71	340	300	309	300
query72	2802	2759	2402	2402
query73	535	543	319	319
query74	9620	9545	9403	9403
query75	2890	2765	2454	2454
query76	2282	1039	673	673
query77	350	369	299	299
query78	11014	11262	10533	10533
query79	1109	767	571	571
query80	852	622	561	561
query81	529	266	227	227
query82	1345	148	122	122
query83	363	278	240	240
query84	258	125	93	93
query85	882	489	444	444
query86	400	310	323	310
query87	3169	3106	3004	3004
query88	3546	2646	2633	2633
query89	425	383	339	339
query90	1837	186	177	177
query91	164	160	140	140
query92	84	72	72	72
query93	920	841	489	489
query94	521	322	304	304
query95	601	341	331	331
query96	643	517	229	229
query97	2480	2487	2420	2420
query98	244	223	214	214
query99	1053	1000	908	908
Total cold run time: 249197 ms
Total hot run time: 168743 ms

@hubgeter hubgeter changed the title from "[fix](parquet)fix parquet write timestamp int96 type." to "[fix](parquet)fix parquet write timestamp int96 type. (1)" Mar 27, 2026
@hubgeter hubgeter changed the title from "[fix](parquet)fix parquet write timestamp int96 type. (1)" to "[fix](parquet)fix parquet write timestamp int96 type. (1/2)" Mar 27, 2026
@doris-robot

BE UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.90% (19936/37689)
Line Coverage 36.42% (186817/512906)
Region Coverage 32.72% (145031/443244)
Branch Coverage 33.89% (63533/187485)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100% (0/0) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.51% (27134/36911)
Line Coverage 57.04% (291701/511358)
Region Coverage 54.46% (243629/447366)
Branch Coverage 56.11% (105511/188051)

@github-actions github-actions bot added the approved label Mar 27, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@morningman morningman merged commit f1e1020 into apache:master Mar 28, 2026
32 of 33 checks passed
github-actions bot pushed a commit that referenced this pull request Mar 28, 2026

Labels

approved (indicates a PR has been approved by one committer), dev/4.0.x, dev/4.0.x-conflict, dev/4.1.x, reviewed

6 participants