[fix](fe) Improve MaxCompute catalog validation#64119

Open
hubgeter wants to merge 1 commit into apache:master from hubgeter:fix_mc_epic

@hubgeter
Contributor

@hubgeter hubgeter commented Jun 4, 2026

What problem does this PR solve?

Problem Summary:
Add the mc.validate_connection parameter, which defaults to false. When it is set to true, the AK/SK, project, and schema are validated when a catalog is created.
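
To illustrate the opt-in behavior described above, here is a minimal, language-neutral sketch in Python. It is not the Doris FE code: `should_validate`, `create_catalog`, and the `probe` callback are hypothetical names, and only the property name `mc.validate_connection` and its default of false come from this description.

```python
from typing import Callable, Mapping

# Property name taken from the PR description; everything else is illustrative.
VALIDATE_KEY = "mc.validate_connection"


def should_validate(props: Mapping[str, str]) -> bool:
    """Return True only when the flag is explicitly set to "true" (default: false)."""
    return props.get(VALIDATE_KEY, "false").strip().lower() == "true"


def create_catalog(
    props: Mapping[str, str],
    probe: Callable[[Mapping[str, str]], None],
) -> str:
    """Hypothetical creation flow: run the remote checks only when the flag is on.

    `probe` stands in for the AK/SK, project, and schema checks and is
    expected to raise ValueError when any of them fails.
    """
    if should_validate(props):
        probe(props)  # fail fast, before the catalog is registered
    return "created"
```

With the flag absent, `create_catalog` never calls `probe`, so existing catalogs behave as before. With the flag set to true, a failing probe aborts creation with its own error.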

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hubgeter
Contributor Author

hubgeter commented Jun 4, 2026

run buildall

@hubgeter hubgeter marked this pull request as draft June 4, 2026 10:16
@hello-stephen
Contributor

TPC-H: Total hot run time: 28829 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f89dad9bfaf645f55869eb3c9b83501c7ac133db, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17713	3999	3920	3920
q2	q3	10744	1470	824	824
q4	4682	472	350	350
q5	7547	874	581	581
q6	183	169	134	134
q7	775	849	654	654
q8	9566	1674	1644	1644
q9	6722	4503	4535	4503
q10	6652	1838	1523	1523
q11	443	270	253	253
q12	637	437	294	294
q13	18113	3400	2791	2791
q14	268	259	246	246
q15	q16	822	781	714	714
q17	1175	890	948	890
q18	6844	5743	5528	5528
q19	1564	1256	1076	1076
q20	512	409	265	265
q21	5840	2618	2337	2337
q22	441	361	302	302
Total cold run time: 101243 ms
Total hot run time: 28829 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4316	4244	4247	4244
q2	q3	4540	4980	4336	4336
q4	2082	2230	1385	1385
q5	4454	4296	4320	4296
q6	228	174	128	128
q7	1749	1722	1850	1722
q8	2559	2196	2133	2133
q9	8002	7979	7968	7968
q10	4787	4772	4253	4253
q11	608	434	397	397
q12	760	765	544	544
q13	3465	3648	2995	2995
q14	309	304	294	294
q15	q16	724	762	665	665
q17	1369	1341	1323	1323
q18	7941	7256	7130	7130
q19	1150	1087	1105	1087
q20	2216	2232	1949	1949
q21	5286	4547	4445	4445
q22	536	479	423	423
Total cold run time: 57081 ms
Total hot run time: 51717 ms
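
Reading the Round 1 table above, the last number in each row equals the smaller of the two preceding hot runs, and the reported totals equal the column sums (the first column adds up to 101243 and the last to 28829). The sketch below recomputes the totals under that reading. The reading is inferred from the numbers, not from the benchmark scripts, and `summarize` is a hypothetical helper.

```python
from typing import List, Tuple


def summarize(rows: List[str]) -> Tuple[int, int]:
    """Sum the cold column and the best-hot column of whitespace-separated rows.

    Each row is assumed to end with four integers: cold, hot run 1,
    hot run 2, best hot. Leading labels such as "q2 q3" are ignored, and
    rows with fewer than four numbers (a label with no timings) are skipped.
    """
    cold_total = 0
    hot_total = 0
    for row in rows:
        numbers = [int(field) for field in row.split() if field.isdigit()]
        if len(numbers) < 4:
            continue
        cold, _hot1, _hot2, best = numbers[-4:]
        cold_total += cold
        hot_total += best
    return cold_total, hot_total


# Three rows copied from Round 1 above, plus one label-only row.
sample = [
    "q6\t183\t169\t134\t134",
    "q2\tq3\t10744\t1470\t824\t824",
    "q11\t443\t270\t253\t253",
    "query44\t",
]
print(summarize(sample))  # (11370, 1211)
```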

@hello-stephen
Contributor

TPC-DS: Total hot run time: 169519 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f89dad9bfaf645f55869eb3c9b83501c7ac133db, data reload: false

query5	4318	643	475	475
query6	439	205	171	171
query7	4851	598	321	321
query8	365	217	204	204
query9	8773	4002	4023	4002
query10	446	297	260	260
query11	5950	2367	2192	2192
query12	153	103	99	99
query13	1275	611	446	446
query14	6463	5447	5096	5096
query14_1	4427	4427	4406	4406
query15	209	199	178	178
query16	996	511	480	480
query17	1142	716	617	617
query18	2538	481	354	354
query19	207	188	152	152
query20	118	111	106	106
query21	224	141	121	121
query22	13660	13614	13388	13388
query23	17345	16493	16234	16234
query23_1	16409	16292	16325	16292
query24	7695	1776	1333	1333
query24_1	1339	1323	1344	1323
query25	576	470	414	414
query26	1305	348	166	166
query27	2644	547	347	347
query28	4436	2023	2028	2023
query29	1109	652	502	502
query30	316	243	204	204
query31	1143	1066	960	960
query32	108	63	61	61
query33	546	330	264	264
query34	1214	1143	655	655
query35	769	821	675	675
query36	1396	1391	1252	1252
query37	157	101	89	89
query38	3210	3134	3047	3047
query39	955	932	910	910
query39_1	897	892	896	892
query40	217	121	100	100
query41	64	62	60	60
query42	95	95	90	90
query43	332	330	278	278
query44	
query45	196	183	180	180
query46	1127	1259	753	753
query47	2392	2364	2276	2276
query48	413	417	281	281
query49	635	481	348	348
query50	986	354	253	253
query51	4360	4276	4347	4276
query52	90	90	78	78
query53	244	269	195	195
query54	266	215	215	215
query55	78	74	70	70
query56	232	229	219	219
query57	1448	1396	1300	1300
query58	250	213	209	209
query59	1555	1666	1514	1514
query60	289	266	236	236
query61	151	155	165	155
query62	691	665	589	589
query63	237	187	193	187
query64	2599	784	641	641
query65	
query66	1764	498	340	340
query67	29758	29109	29584	29109
query68	
query69	417	304	268	268
query70	989	972	975	972
query71	322	236	212	212
query72	3058	2672	2458	2458
query73	851	762	447	447
query74	5218	4970	4770	4770
query75	2664	2591	2244	2244
query76	2326	1189	792	792
query77	362	373	279	279
query78	12353	12384	11986	11986
query79	1441	1042	774	774
query80	1274	462	394	394
query81	529	281	251	251
query82	600	157	119	119
query83	327	273	248	248
query84	258	147	113	113
query85	901	538	449	449
query86	443	294	284	284
query87	3422	3314	3183	3183
query88	3682	2779	2782	2779
query89	435	379	326	326
query90	1899	184	187	184
query91	179	168	136	136
query92	65	63	58	58
query93	1645	1437	857	857
query94	730	354	308	308
query95	681	478	375	375
query96	1023	758	333	333
query97	2717	2687	2567	2567
query98	214	208	204	204
query99	1160	1190	1027	1027
Total cold run time: 253239 ms
Total hot run time: 169519 ms

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/86) 🎉
Increment coverage report
Complete coverage report

@hubgeter
Contributor Author

hubgeter commented Jun 5, 2026

run buildall

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/71) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

TPC-H: Total hot run time: 29310 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0de2e53254b46d4957da799836067099dded8685, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17803	4025	4011	4011
q2	q3	10797	1396	839	839
q4	4683	500	356	356
q5	7536	903	587	587
q6	183	173	138	138
q7	774	852	658	658
q8	9342	1628	1708	1628
q9	5976	4578	4505	4505
q10	6740	1827	1524	1524
q11	437	267	244	244
q12	632	424	292	292
q13	18167	3355	2687	2687
q14	265	259	242	242
q15	q16	816	783	712	712
q17	990	962	941	941
q18	6973	5927	5555	5555
q19	1300	1253	1125	1125
q20	538	401	265	265
q21	6106	2812	2684	2684
q22	457	380	317	317
Total cold run time: 100515 ms
Total hot run time: 29310 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5031	4703	4826	4703
q2	q3	4763	5255	4781	4781
q4	2167	2196	1448	1448
q5	4922	4861	4709	4709
q6	234	175	133	133
q7	1819	1735	1616	1616
q8	2438	2085	2072	2072
q9	8010	7852	7413	7413
q10	4755	4656	4199	4199
q11	531	392	352	352
q12	735	735	531	531
q13	2952	3422	2828	2828
q14	275	277	263	263
q15	q16	692	700	609	609
q17	1278	1247	1250	1247
q18	7271	6850	6672	6672
q19	1118	1081	1158	1081
q20	2239	2212	1957	1957
q21	5265	4617	4483	4483
q22	503	466	396	396
Total cold run time: 56998 ms
Total hot run time: 51493 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 169725 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0de2e53254b46d4957da799836067099dded8685, data reload: false

query5	4314	645	481	481
query6	449	216	190	190
query7	4824	585	317	317
query8	368	227	218	218
query9	8760	4040	4033	4033
query10	465	322	269	269
query11	5884	2362	2189	2189
query12	161	111	100	100
query13	1285	589	459	459
query14	6419	5435	5082	5082
query14_1	4405	4416	4401	4401
query15	213	198	178	178
query16	1016	457	440	440
query17	1128	749	598	598
query18	2728	478	352	352
query19	210	191	146	146
query20	113	108	119	108
query21	218	142	134	134
query22	13696	13677	13490	13490
query23	17481	16593	16084	16084
query23_1	16274	16361	16385	16361
query24	7473	1765	1316	1316
query24_1	1307	1323	1308	1308
query25	573	426	372	372
query26	1313	323	156	156
query27	2652	551	341	341
query28	4427	2075	2008	2008
query29	1052	614	470	470
query30	310	239	199	199
query31	1117	1076	950	950
query32	109	63	59	59
query33	513	311	242	242
query34	1155	1120	625	625
query35	769	773	701	701
query36	1398	1405	1229	1229
query37	150	105	93	93
query38	3211	3119	3049	3049
query39	933	906	903	903
query39_1	893	874	866	866
query40	216	127	104	104
query41	68	66	66	66
query42	97	96	96	96
query43	326	323	277	277
query44	
query45	193	188	183	183
query46	1083	1183	737	737
query47	2369	2412	2243	2243
query48	408	428	267	267
query49	620	469	352	352
query50	935	360	255	255
query51	4415	4337	4272	4272
query52	88	92	76	76
query53	253	301	196	196
query54	278	224	200	200
query55	83	76	75	75
query56	222	230	226	226
query57	1449	1404	1339	1339
query58	237	219	218	218
query59	1590	1632	1425	1425
query60	281	240	231	231
query61	192	158	158	158
query62	707	643	573	573
query63	240	187	186	186
query64	2508	770	620	620
query65	
query66	1746	454	338	338
query67	29789	29665	29577	29577
query68	
query69	430	311	257	257
query70	1005	942	912	912
query71	322	223	206	206
query72	3043	2677	2374	2374
query73	857	779	421	421
query74	5173	4985	4778	4778
query75	2650	2607	2230	2230
query76	2323	1140	781	781
query77	347	377	284	284
query78	12434	12397	11864	11864
query79	1295	1007	751	751
query80	628	466	385	385
query81	467	284	245	245
query82	569	159	124	124
query83	360	284	259	259
query84	261	148	110	110
query85	944	530	434	434
query86	365	307	281	281
query87	3387	3353	3194	3194
query88	3693	2751	2766	2751
query89	443	381	328	328
query90	1942	182	179	179
query91	174	167	135	135
query92	67	64	55	55
query93	1400	1449	841	841
query94	538	358	318	318
query95	673	480	348	348
query96	1013	789	327	327
query97	2690	2700	2573	2573
query98	215	211	209	209
query99	1173	1176	1031	1031
Total cold run time: 251330 ms
Total hot run time: 169725 ms

@hubgeter
Contributor Author

hubgeter commented Jun 8, 2026

/review

Contributor

@github-actions github-actions Bot left a comment

I found one blocking issue in the new MaxCompute namespace-schema validation path. The PR otherwise stays focused on MaxCompute catalog validation and unsupported table/view rejection, with targeted unit tests added, but this validation path can reject valid existing deployments that use private/intranet MaxCompute endpoints.

Critical checkpoint conclusions:

  • Goal/test: The goal appears to be earlier MaxCompute validation and explicit rejection of unsupported external tables/logical views. The new tests cover local branching and unsupported table/view checks, but not private/intranet endpoint behavior for the new OpenAPI validation.
  • Scope/focus: The code change is mostly focused, though it adds a new SDK dependency and a second MaxCompute client path.
  • Concurrency/lifecycle: No new shared mutable concurrency path found. Catalog initialization remains synchronized by ExternalCatalog, but the new remote validation runs during lazy initialization.
  • Configuration/compatibility: The new OpenAPI validation does not honor the existing mc.endpoint/private endpoint configuration, which is a compatibility regression for namespace-schema catalogs.
  • Parallel paths: Read and write unsupported-table checks were both updated.
  • Error handling/observability: Errors are surfaced during catalog initialization with useful context, but the incorrect endpoint selection makes the error misleading for private endpoint users.
  • Data correctness/transactions/persistence: No transaction visibility or persisted metadata format issue found in the reviewed diff.
  • Performance: No hot-path performance issue found; validation runs at initialization.
  • Security model/focus: I read SECURITY.md and threat-model.md because this PR touches external catalog connection/auth material. The finding is an operational correctness/compatibility issue, not a security vulnerability under the model. No additional user-provided review focus was present.
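
The compatibility point above can be sketched as follows: a validation client that prefers the user-configured mc.endpoint keeps working for private/intranet deployments. This is a language-neutral Python illustration, not the PR's code. `resolve_validation_endpoint` and `DEFAULT_PUBLIC_ENDPOINT` are hypothetical, and the fallback URL is a placeholder rather than a real MaxCompute address.

```python
from typing import Mapping

ENDPOINT_KEY = "mc.endpoint"  # existing property named in the review

# Hypothetical placeholder, not a real MaxCompute address.
DEFAULT_PUBLIC_ENDPOINT = "https://maxcompute-public.example.invalid/api"


def resolve_validation_endpoint(props: Mapping[str, str]) -> str:
    """Pick the endpoint that the validation request should be sent to.

    A configured mc.endpoint (public or private/intranet) always wins;
    the public default is used only when nothing is configured.
    """
    configured = props.get(ENDPOINT_KEY, "").strip()
    if configured:
        return configured
    return DEFAULT_PUBLIC_ENDPOINT
```

Routing the validation path through the same resolution as the regular data path also means that an error seen during catalog initialization names the endpoint the user actually configured.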

@hubgeter
Contributor Author

hubgeter commented Jun 8, 2026

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 43.24% (32/74) 🎉
Increment coverage report
Complete coverage report

@hubgeter hubgeter force-pushed the fix_mc_epic branch 2 times, most recently from afb3991 to d1e407c Compare June 8, 2026 10:34
@hubgeter
Contributor Author

hubgeter commented Jun 8, 2026

run buildall

@hubgeter hubgeter marked this pull request as ready for review June 8, 2026 10:36
@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/142) 🎉
Increment coverage report
Complete coverage report

2 participants