Fractal Indexing: Hilbert, Z-order, HNSW

CHAPTER 14 STUDIO

The indexes you ship every day are fractal. Watch the math draw.

Apache Iceberg added Hilbert curve clustering in 2025. Delta Lake's Liquid Clustering reports up to 10x query acceleration. Uber's H3 and Google's S2 are hierarchical fractal subdivisions. HNSW is a hierarchical small-world graph. Most engineers who use these indexes have never seen the curve trace through a grid. This page fixes that. Three labs make the locality property, the page-pruning win, and the HNSW scale separation visible at the speed of a click.

9 notebooks in the cluster
3 interactive labs
10x production speedup target

LAB 1: CURVE TRACE ANIMATOR

Watch a Hilbert or Z-order curve fill a 32 x 32 grid. Drag the query rectangle.

Pick a curve. Pick a speed. Drag the green rectangle anywhere on the grid. The page-fetch counters update in real time and show how many "tiles" each ordering forces a query to read. Hilbert wins on average; the gap is bigger for small queries.

Drag the green rectangle. Click PLAY to retrace.
14 cells per second
Pages touched (this curve) 0
Pages touched (row-major) 0
Pages saved vs row-major 0%

Each "page" holds a run of 16 contiguous curve cells. Production engines tune this to thousands of rows; the savings ratio is the same.
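The page counter can be sketched in a few lines. This is an illustrative model, not the lab's actual code: a 32 x 32 grid, Morton (Z-order) bit interleaving, and 16-cell pages as described above.

```python
def z_order(x, y, bits=5):
    """Morton code: interleave the bits of x and y (32 x 32 grid -> 5 bits each)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits land in even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits land in odd positions
    return code

def row_major(x, y):
    """Plain row-by-row ordering for comparison."""
    return y * 32 + x

def pages_touched(order_fn, x0, y0, w, h, page_size=16):
    """Distinct 16-cell pages a query rectangle forces a scan to read."""
    pages = {order_fn(x, y) // page_size
             for x in range(x0, x0 + w)
             for y in range(y0, y0 + h)}
    return len(pages)

# An 8 x 8 query box at (4, 4): Z-order keeps the box inside four
# aligned 4 x 4 blocks, while row-major slices it into one page per row.
print(pages_touched(z_order, 4, 4, 8, 8))    # → 4
print(pages_touched(row_major, 4, 4, 8, 8))  # → 8
```

The 2x gap here is the same effect the counters above show live; it widens or narrows with the box's position and size.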

LAB 2: INDEX RACE TRACK

5,000 random points. Four indexes. One query box. Watch them race.

Drag the query rectangle anywhere on the canvas. The bar chart on the right updates instantly with pages-fetched and points-returned for each ordering. Use the skew slider to redistribute the points; Hilbert and Z-order pull farther ahead as skew increases.

Drag the green box. Click REGENERATE to draw new points.
Skew: 60, on a scale from 0 (uniform) to 100 (heavy clusters)
Points in query box 0
Best index for this query --
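The Hilbert ordering that usually wins the race can be sketched with the classic rotate-and-flip mapping. A sketch for illustration, not the lab's code:

```python
def hilbert_d(n, x, y):
    """Distance of cell (x, y) along the Hilbert curve on an n x n grid
    (n a power of two). Classic iterative rotate-and-flip formulation."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:              # rotate/flip the quadrant so recursion lines up
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

# Locality check: consecutive curve positions are always adjacent cells,
# which is exactly why Hilbert pulls ahead as the points cluster.
n = 32
pos = {hilbert_d(n, x, y): (x, y) for x in range(n) for y in range(n)}
steps = [abs(pos[d][0] - pos[d + 1][0]) + abs(pos[d][1] - pos[d + 1][1])
         for d in range(n * n - 1)]
print(all(step == 1 for step in steps))  # → True
```

Z-order has no such guarantee: its curve takes long diagonal jumps between quadrants, and those jumps are where its extra page fetches come from.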

LAB 3: HNSW CLIMBER

Watch a search descend through the layers.

Click anywhere on the canvas to drop a query point. The HNSW search starts at the top layer and descends one layer per step. Visited nodes flash. The visit-count panel on the right shows why the descent is logarithmic instead of linear. Use the M slider to rebuild the graph with denser or sparser connections.

Click anywhere to drop a query.
M = 5 (production HNSW defaults to 16)
80 points across all layers
Layers built 0
Visited nodes 0
Speedup vs brute force --

Brute force checks every point. HNSW visits a handful of nodes on each of roughly log(N) layers, so total work grows like log(N) instead of N. The speedup grows with N.
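The scale separation behind that claim comes from HNSW's layer-assignment rule: each node's top layer is drawn as floor(-ln(U) * m_L), so each layer holds roughly an M-th of the nodes of the layer below it. A minimal sketch of just that rule (variable names are ours, not any library's API):

```python
import math
import random

def assign_layer(m_l, rng):
    """Draw a node's top layer: floor(-ln(U) * m_L), the HNSW rule.
    Higher layers are exponentially rarer, which makes the descent logarithmic."""
    return int(-math.log(rng.random()) * m_l)

rng = random.Random(0)
M = 16                     # max connections per node (production default)
m_l = 1 / math.log(M)      # level-normalization factor from the HNSW paper
layers = [assign_layer(m_l, rng) for _ in range(10_000)]

# Roughly 1/M of the nodes reach layer >= 1, 1/M^2 reach layer >= 2, ...
for lvl in range(4):
    print(lvl, sum(1 for l in layers if l >= lvl))
```

With M = 16 and 10,000 nodes, expect on the order of 600 nodes at layer 1 and a few dozen at layer 2: a sparse top graph for coarse routing, dense bottom layers for the final refinement.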

CHAPTER 14 NOTEBOOK PATH

Each lab has notebooks behind it.

The studio is the surface. The chapter is where the algorithms are built and the failure modes are named.

14.0

Why indexes are already fractal

Frames the chapter. Three production examples (Iceberg, Liquid Clustering, HNSW). One-paragraph history.

14.1 - 14.2

Curves, locality, and Hilbert R-tree

Build Z-order and Hilbert in NumPy. Reproduce Kamel-Faloutsos (1994) Hilbert R-tree bulk loading.

14.3

Fractal dimension as a selectivity oracle

Reproduce the 1994 Faloutsos selectivity formula. Beat the uniform predictor by an order of magnitude on skewed data.
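The core measurement can be sketched with a box-counting dimension estimate: count occupied grid cells at several scales and regress in log-log space. A toy under our own function names, not the notebook's code (the notebook's selectivity formula plugs a fractal dimension like this into its prediction):

```python
import numpy as np

def box_count_dimension(points, scales=(2, 4, 8, 16, 32)):
    """Box-counting dimension of 2-D points in [0, 1)^2:
    slope of log(occupied boxes) against log(scale)."""
    counts = [len({(int(x * s), int(y * s)) for x, y in points})
              for s in scales]
    slope, _ = np.polyfit(np.log(scales), np.log(counts), 1)
    return slope

rng = np.random.default_rng(0)
uniform = rng.random((5000, 2))       # fills the plane: dimension near 2
t = rng.random(5000)
diagonal = np.column_stack([t, t])    # points on a line: dimension near 1

print(box_count_dimension(uniform))   # ~ 2
print(box_count_dimension(diagonal))  # ~ 1
```

A uniform predictor implicitly assumes dimension 2; skewed real data sits well below that, which is where the order-of-magnitude selectivity win comes from.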

14.4

HNSW as a small-world index

Pure-Python HNSW. Visualize layer assignment. Connect to Watts-Strogatz, Barabasi, Song-Havlin-Makse.

14.5 - 14.6

DuckDB clustering and Hurst-aware partitioning

Run Liquid Clustering at home. Drive time-series chunk boundaries by the Hurst exponent of the stream.
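A toy version of the Hurst measurement, using the rescaled-range (R/S) method; function name and window sizes are our own choices, not the notebook's:

```python
import numpy as np

def hurst_rs(series, window_sizes=(8, 16, 32, 64, 128)):
    """Rescaled-range (R/S) estimate of the Hurst exponent:
    slope of log(mean R/S) against log(window size)."""
    series = np.asarray(series, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        rs = []
        for start in range(0, len(series) - n + 1, n):
            w = series[start:start + n]
            dev = np.cumsum(w - w.mean())   # cumulative deviation from the mean
            r = dev.max() - dev.min()       # range R
            s = w.std()                     # scale S
            if s > 0:
                rs.append(r / s)
        log_n.append(np.log(n))
        log_rs.append(np.log(np.mean(rs)))
    return np.polyfit(log_n, log_rs, 1)[0]

rng = np.random.default_rng(0)
h_noise = hurst_rs(rng.standard_normal(4096))            # near 0.5: no memory
h_walk = hurst_rs(np.cumsum(rng.standard_normal(4096)))  # near 1: strong trend
print(h_noise, h_walk)
```

A stream with high Hurst exponent trends for long stretches, so larger chunks stay internally similar; a noisy stream gains nothing from them. That is the signal a Hurst-aware partitioner reads.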

14.7 - 14.8

Capstone and the failure modes

Recommend an index for your workload. Then read the four failure modes before you ship anything.