Research Summary

Problem: Maintain non-overlapping 2D rectangles with last-writer-wins semantics
Solution: Rectangle decomposition (A \ B → ≤4 fragments)
Target: Pure JavaScript environments (browsers, Node.js, Deno, constrained runtimes like Google Apps Script)

Production Recommendations

n (size)	Workload	Use	Why
Any	Multi-attribute	LazyPartitioned wrapper	Independent attribute updates
< 100	All	Morton spatial locality	O(n) ≈ O(1), faster via locality
100-200	Write-heavy	Context-dependent	See transition zone analysis
100-600	High overlap	Morton spatial locality	Decomposition cost dominates
> 200	Read-heavy	R-tree (R* split)	O(log n) query pruning wins
> 600	All	R-tree (R* split)	O(log n) hierarchical indexing

See PRODUCTION-GUIDE for implementation details.

Key Findings

This section summarizes research outcomes. For detailed methodology and data, see individual analysis documents linked below.

1. Morton Curve Optimization (5-53% speedup, 25% average)

Finding: Morton curve (Z-order) via bit interleaving provides 25% average speedup over Hilbert curve at small n (range: 5-53% depending on scenario), while being simpler to implement.

Impact: Production algorithm for n<100. Constant-time encoding vs iterative Hilbert.

See morton-vs-hilbert-analysis.md for detailed comparison.

2. Linear Scan Wins for Sparse Data (n < 100)

Finding: O(n) linear scan with spatial locality is 5-10x faster than O(log n) R-trees for n < 100.

Why: Tree construction overhead (node allocation, bbox updates) exceeds traversal benefits at small scales.

Impact: Use Morton spatial locality for n<100, R-tree for n≥100. Crossover varies by workload (see transition zone analysis).

See sparse-data-analysis.md for performance data.

3. R* Split Algorithm (Construction + Quality)

Finding: R* split (Beckmann 1990) is fastest for tree construction and provides best query performance on overlapping/large datasets.

Performance: 20-25x faster construction than Quadratic split, 30-35% faster queries on overlapping data vs Midpoint.

Impact: Production R-tree algorithm. Optimal balance of construction speed and tree quality.

See r-star-analysis.md for split algorithm comparison.

4. Transition Zone Mapped (100 < n < 600)

Finding: Crossover point between linear scan and R-tree varies by workload:

Read-heavy: R-tree wins at n > 100
Write + low overlap: R-tree wins at n > 200
Write + high overlap: Linear scan wins until n > 600

Impact: Concrete thresholds replace "workload-dependent" guidance.

See transition-zone-analysis.md for 23-scenario benchmark matrix.

5. Failed Experiments

FastRTree (R* axis + midpoint split): 1.29x slower, rejected.
Bulk Insert API: 1.01-1.39x slower due to LWW sequential dependency, rejected.
Learned Indexes: Requires ML runtimes (TensorFlow/PyTorch), impractical for constrained environments.

See archive/docs/experiments/ for full analyses.

6. Implementation Constraints

TypedArrays: Only performance optimization compatible with constrained runtimes (Google Apps Script). WASM/WebGPU/SharedArrayBuffer unavailable.

Implementation style: Imperative + TypedArrays required for production (~14x faster than functional style with .flatMap/.filter).

Bundle sizes: Morton ~2.3KB, R-tree ~5.9KB, LazyPartitioned ~2.1KB (minified). Already optimal.

7. Comprehensive Testing (35 Benchmark Scenarios)

Coverage:

Algorithmic patterns: Sequential, grid, overlapping, large datasets (n=500-5000)
User patterns: Single cells, columns, rows, diagonal, striping, merge-like blocks
Workloads: Write-heavy (80/20), query-only (10k queries), mixed

Adversarial validation: Pathological patterns (concentric, diagonal, checkerboard) validate O(n) fragmentation bound. Empirical k ≈ 2.3 overlaps per insert.

See adversarial-patterns.md and benchmark-statistics.md.

8. Production Readiness (October 2025)

Optimization study: No further micro-optimizations viable (<10% impact threshold). Current performance is algorithmically determined.

Test coverage: Axiom-based conformance, ASCII snapshots, cross-implementation consistency, adversarial worst-case.

Conclusion: Implementations production-ready. Future gains require new algorithms, not micro-optimizations.

Algorithms

Linear Scan (O(n)): Flat array with Morton spatial locality. Best for n<100 (2x faster via cache locality). Bundle: ~2.3KB.

R-tree (O(log n)): Hierarchical index with R* split (Beckmann 1990). Best for n≥100. Bundle: ~5.9KB.

See packages/@jim/spandex/src/index/ for implementations.

Methodology

Benchmarks: 35 scenarios (algorithmic patterns + user patterns + workloads), 5 runs, CV% <5%.

Testing: Axiom-based correctness (LWW semantics, disjointness, fragment bounds), adversarial worst-case validation (k ≈ 2.3 overlaps/insert), cross-implementation consistency.

Reproduce: deno task bench:update • deno task test • deno task test:adversarial

Documentation Map

Document	Purpose
GETTING-STARTED	Tutorial for new users
PRODUCTION-GUIDE	Algorithm selection guide
TROUBLESHOOTING	Common issues and solutions
theoretical-foundation	Proofs, complexity analysis
morton-vs-hilbert-analysis	Space-filling curve comparison
sparse-data-analysis	Why O(n) wins for n<100
transition-zone-analysis	Crossover thresholds by workload
r-star-analysis	Split algorithm comparison
adversarial-patterns	Worst-case fragmentation validation
benchmark-statistics	Statistical methodology
alternatives-analysis	Why not quadtrees/grids?
`archive/docs/experiments/`	Failed experiments (preserved learning)

References

Academic Literature

Beckmann, N., Kriegel, H.-P., Schneider, R., & Seeger, B. (1990). "The R*-tree: An Efficient and Robust Access Method for Points and Rectangles." SIGMOD '90: Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, pp. 322-331. DOI: 10.1145/93597.98741
de Berg, M., Cheong, O., van Kreveld, M., & Overmars, M. (2008). Computational Geometry: Algorithms and Applications (3rd ed.). Springer-Verlag. ISBN: 978-3-540-77973-5 (Rectangle decomposition, geometric algorithms)
Guttman, A. (1984). "R-trees: A Dynamic Index Structure for Spatial Searching." SIGMOD '84: Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data, pp. 47-57. DOI: 10.1145/602259.602266
Hilbert, D. (1891). "Über die stetige Abbildung einer Linie auf ein Flächenstück." Mathematische Annalen, 38(3), pp. 459-460. DOI: 10.1007/BF01199431 (Space-filling curves)
Samet, H. (1990). The Design and Analysis of Spatial Data Structures. Addison-Wesley. ISBN: 978-0-201-50255-9 (Comprehensive survey of spatial indexing)
Shapiro, M., Preguiça, N., Baquero, C., & Zawirski, M. (2011). "Conflict-Free Replicated Data Types." In Stabilization, Safety, and Security of Distributed Systems, LNCS vol 6976, pp. 386-400. Springer. DOI: 10.1007/978-3-642-24550-3_29 (Last-Writer-Wins conflict resolution)

API Documentation

GridRange type - Custom interface matching Google Sheets GridRange (minus sheetId) defined in src/adapters/gridrange.ts
Deno Standard Library - TypeScript runtime and testing framework

Result: Algorithm choice depends on n and workload. Morton linear scan for n<100, R-tree for n≥100. See PRODUCTION-GUIDE for decision tree.