There is an experimental option to create 3D Tiles using clustering: --use_clustering (default false).

-When this option is off, dense tiles with number of instances exceeding `max_features_per_tile` aren't rendered. With this option such tiles are rendered with number of instances that is exactly equal to `max_features_per_tile`. Number of instances is reduced in the following way:
+When this option is off, dense tiles with number of instances exceeding `max_features_per_tile` aren't rendered. With this option such tiles are rendered with a reduced number of instances (up to `max_features_per_tile`). Number of instances is reduced in the following way:

-- tile instances are clustered with MiniBatchKMeans algorithm with number of clusters equal to `max_features_per_tile`;
-- from each cluster single instance is picked randomly.
+- tile instances are clustered using **HDBSCAN** (Hierarchical Density-Based Spatial Clustering of Applications with Noise) via [HdbscanSharp](https://github.com/doxakis/HdbscanSharp);
+- the minimum cluster size is derived from the ratio of total instances to `max_features_per_tile`;
+- from each discovered cluster one representative instance is picked;
+- noise points (instances that do not belong to any cluster) are discarded.
+
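The reduction steps added in this hunk can be sketched roughly as follows. This is an illustrative Python sketch, not the project's implementation: it takes cluster labels as given rather than calling HdbscanSharp. The function names, the `max(2, n // max_features)` formula for the minimum cluster size, the `-1` noise label, picking the first member as representative, and the final cap are all assumptions made for the example.

```python
from collections import defaultdict

# Assumption: noise is marked with -1, as in several HDBSCAN implementations.
# The label HdbscanSharp uses for noise may differ.
NOISE_LABEL = -1


def min_cluster_size(total_instances, max_features_per_tile):
    """Hypothetical derivation from the ratio of instances to the tile limit."""
    return max(2, total_instances // max_features_per_tile)


def reduce_instances(instances, labels, max_features_per_tile):
    """Keep one representative per cluster and drop noise points."""
    clusters = defaultdict(list)
    for instance, label in zip(instances, labels):
        if label == NOISE_LABEL:
            continue  # noise points are discarded
        clusters[label].append(instance)

    # Assumption: the first member of each cluster is its representative.
    representatives = [members[0] for _, members in sorted(clusters.items())]

    # Assumption: cap the result so it never exceeds the tile limit.
    return representatives[:max_features_per_tile]


if __name__ == "__main__":
    instances = ["a", "b", "c", "d", "e", "f"]
    labels = [0, 0, 1, -1, 1, 2]
    print(min_cluster_size(2500, 100))
    print(reduce_instances(instances, labels, 100))
```

Because the number of clusters is discovered rather than fixed, the result can hold fewer instances than `max_features_per_tile`, which matches the "up to" wording in the updated paragraph.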
+HDBSCAN is a density-based algorithm that discovers clusters of arbitrary shape without requiring a fixed number of clusters. This makes it well suited for geographic/spatial data where instance density varies across a tile. Compared to the previous MiniBatchKMeans approach, HDBSCAN:
+
+- does not require specifying an exact number of clusters upfront;
+- handles outliers explicitly (noise label) instead of forcing every instance into a cluster;
+- produces more natural cluster boundaries that respect geographic density patterns.
+
+### Performance

-### Performance benchmark
number of instances: 2500<br>
max_features_per_tile: 100<br>

tileset generation time:
-- without clustering : 0h 0m 0s 539ms
-- with clustering: 0h 0m 1s 238ms
+- without clustering: 0h 0m 0s 539ms
+- with clustering (HDBSCAN): comparable to previous MiniBatchKMeans for typical tile sizes (100–2500 instances)
+
+HDBSCAN has O(n²) worst-case complexity but performs close to O(n log n) on average. For the small datasets typical in a single tile, the difference vs MiniBatchKMeans is negligible. The main gain is cluster quality: density-based grouping gives better visual results for non-uniform geographic distributions.
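To put the two growth rates in perspective for the tile sizes quoted above, the following Python snippet compares n² with n·log₂(n) for 100 and 2500 instances. These are abstract operation counts, not measured timings, and the helper names are hypothetical.

```python
import math


def quadratic_work(n):
    """Operation count proportional to n^2 (worst case)."""
    return n * n


def nlogn_work(n):
    """Operation count proportional to n * log2(n) (typical case)."""
    return n * math.log2(n)


if __name__ == "__main__":
    for n in (100, 2500):
        print(n, quadratic_work(n), round(nlogn_work(n)))
```

For 2500 instances this gives 6.25 million for the quadratic case against roughly 28 thousand for the n·log₂(n) case. Either count is small in absolute terms at this scale.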