Rectangle Decomposition with Spatial Join
What you'll learn: How to use multiple simple indexes (one per property) and combine them only when you need the full picture.
Series: Part 3 of 3
- Part 1: Last-Writer-Wins - The simplest case
- Part 2: Shallow Merge - Combining properties
- Part 3: Spatial Join (this document) - Using multiple indexes
The Problem
In Part 2, we saw that Shallow Merge creates complexity - you have to store explicit overlap regions with combined properties. That's 4 rectangles instead of 2!
Question: What if we want cells to have multiple properties (background AND font color), but we don't want the merge complexity during inserts?
With Spatial Join: Keep background colors and font colors in SEPARATE simple indexes. Only combine them when you need to!
Let's see how this works...
Step-by-Step Walkthrough
Starting Point: Two Empty Indexes
Instead of one unified index, we create TWO separate indexes:
- backgrounds index: Stores only background colors (uses simple LWW)
- fontColors index: Stores only font colors (uses simple LWW)
Each index is independent and simple!
Step 1: Set Background Colors (First Index)
What we do: backgrounds.insert(A1:C2, 'RED')
Then: backgrounds.insert(B0:D2, 'GREEN')
What it means: Same as Part 1 - just normal LWW in the backgrounds index!
The result:
A B C D
+---+---+---+---+
0 | | G | G | G |
+---+---+---+---+
1 | R | G | G | G | ← B1:C1 now GREEN (LWW)
+---+---+---+---+
2 | R | G | G | G | ← B2:C2 now GREEN (LWW)
+---+---+---+---+
3 | | | | |
+---+---+---+---+
Storage in backgrounds index: [
{ bounds: A1:A2, value: 'RED' }, // Left fragment
{ bounds: B0:D2, value: 'GREEN' } // New range (LWW)
]
R = RED background
G = GREEN background
Notice: Just 2 ranges! Simple LWW decomposition, exactly like Part 1.
No merge complexity - the backgrounds index doesn't know or care about font colors!
Step 2: Set Font Colors (Second Index)
What we do: fontColors.insert(B1:D2, 'BLUE')
What it means: Give cells B1, C1, D1, B2, C2, D2 blue font color - but store it in a DIFFERENT index!
The result:
A B C D
+---+---+---+---+
0 | | | | |
+---+---+---+---+
1 | | B | B | B |
+---+---+---+---+
2 | | B | B | B |
+---+---+---+---+
3 | | | | |
+---+---+---+---+
Storage in fontColors index: [
{ bounds: B1:D2, value: 'BLUE' }
]
B = BLUE font color
Again, simple! Just 1 range in this index.
Key insight: We now have TWO simple indexes instead of ONE complex index!
- backgrounds index: 2 ranges (simple LWW)
- fontColors index: 1 range (simple LWW)
- Total storage: 3 ranges across 2 indexes
Compare to Shallow Merge (Part 2): 4 ranges in 1 unified index.
The Magic: Spatial Join at Query Time
Now here's where it gets interesting! When you want to see what properties a cell has, you combine the indexes on the fly.
The question: What are the properties for cells in viewport A0:D3?
Step 1: Query each index separately
const bgResults = backgrounds.query(A0:D3);
// Returns: [
// { bounds: A1:A2, value: 'RED' },
// { bounds: B0:D2, value: 'GREEN' }
// ]
const fontResults = fontColors.query(A0:D3);
// Returns: [
// { bounds: B1:D2, value: 'BLUE' }
// ]
Easy! Each index just returns its own ranges.
Step 2: Spatial join - combine the results
Now we find where the ranges overlap and combine the properties:
function spatialJoin(bgResults, fontResults) {
// For each distinct region in the viewport, find which properties apply
const joined = [];
// Process all unique regions (areas where coverage changes)
for each distinct region in union(bgResults, fontResults) {
const bg = findCoveringRange(region, bgResults);
const font = findCoveringRange(region, fontResults);
joined.push({
bounds: region,
background: bg?.value, // undefined if no background
fontColor: font?.value // undefined if no font color
});
}
return joined;
}
Think of it like putting two transparency sheets on top of each other - you see both!
Step 3: The combined result
A B C D
+---+---+---+---+
0 | | G | G | G | ← { background: GREEN }
+---+---+---+---+
1 | R |R+B|R+B|G+B| ← A1={ bg: RED }, B1:C1={ bg: GREEN, font: BLUE }, D1={ bg: GREEN, font: BLUE }
+---+---+---+---+
2 | R |R+B|R+B|G+B| ← A2={ bg: RED }, B2:C2={ bg: GREEN, font: BLUE }, D2={ bg: GREEN, font: BLUE }
+---+---+---+---+
3 | | | | |
+---+---+---+---+
Joined result: [
{ bounds: A1:A2, background: 'RED', fontColor: undefined },
{ bounds: B0:D0, background: 'GREEN', fontColor: undefined },
{ bounds: B1:C2, background: 'GREEN', fontColor: 'BLUE' }, ← Joined!
{ bounds: D1:D2, background: 'GREEN', fontColor: 'BLUE' } ← Joined!
]
R = { background: RED }
G = { background: GREEN }
B = { fontColor: BLUE }
R+B = { background: RED, fontColor: BLUE } (from join, not storage!)
G+B = { background: GREEN, fontColor: BLUE } (from join, not storage!)
Key insight: Combined properties only exist in the query result, NOT in storage!
What Just Happened?
Think of spatial join like organizing books:
Shallow Merge approach (Part 2):
- ONE bookshelf with books labeled "Red+Blue", "Red", "Blue"
- Every time you add a book, you might need to relabel existing books
- Complex to maintain, but easy to read
Spatial Join approach (Part 3):
- TWO bookshelves: "Background Colors" and "Font Colors"
- Adding books is simple - just put them on the right shelf
- When you need to find "what does B1 have?", you check BOTH shelves
The tradeoff:
- ✅ Simple inserts (just normal LWW in each index)
- ✅ Less total storage (3 ranges vs 4)
- ❌ Slightly more work at query time (check multiple indexes)
But here's the thing: In spreadsheets, you insert operations OFTEN (user edits) but query RARELY (only when rendering). So this tradeoff makes sense!
Key Concepts
Separate Indexes = Simple Inserts
Each property gets its own index:
- Background colors → backgrounds index
- Font colors → fontColors index
- Borders → borders index
- etc.
Each index uses simple LWW (exactly like Part 1!). No merge complexity!
Spatial Join = Combining at Query Time
When you need to see cell properties:
- Query each index independently
- Find where the results overlap
- Combine properties from all indexes for each region
The combination happens during the query, not during insert.
The Tradeoff
Compared to LWW (Part 1):
- More indexes to manage
- Slightly more complex queries
- But you GET multiple properties per cell!
Compared to Shallow Merge (Part 2):
- Simpler inserts (just normal LWW)
- Less total storage (3 ranges vs 4)
- Slightly more work at query time (join operation)
Why "Spatial" Join?
It's called spatial because we're joining based on where things are in space (which cells they cover), not on some ID or key field.
Academic term from spatial databases (like PostGIS for geographic data).
Summary
What we learned:
- Spatial join uses multiple simple indexes instead of one complex index
- Each index stores one property type using simple LWW
- Properties are combined at query time by finding overlaps
- Simpler inserts, slightly more complex queries
The tradeoff:
- ✅ Pro: Simple inserts (no merge logic), less storage per index
- ✅ Pro: Properties updated independently (changing background doesn't touch font colors)
- ❌ Con: Query requires joining multiple indexes
When to use this:
- ✅ Multiple independent properties (background, font, borders)
- ✅ Properties updated separately (user changes background, later changes font)
- ✅ Inserts more frequent than queries (batch edits, render once)
- ✅ Google Sheets batch API (separate request types per property)
Comparison table:
| Approach | Insert | Query | Storage (example) | Best for |
|---|---|---|---|---|
| LWW (Part 1) | Simple | Simple | 2 ranges | Single property only |
| Shallow Merge (Part 2) | Complex | Simple | 4 ranges | Properties always together |
| Spatial Join (Part 3) | Simple | Join | 3 ranges (across 2 indexes) | Independent properties |
What's Next?
You've now seen all three approaches! Each has different tradeoffs:
- Part 1 (LWW): Use when you only have one property (like cell values)
- Part 2 (Shallow Merge): Use when properties are always updated together
- Part 3 (Spatial Join): Use when properties are updated independently
For real-world spreadsheet systems (like Google Sheets), Spatial Join is often the best choice because:
- Users edit different properties at different times
- API has separate request types for each property
- Insert performance matters more than query performance
This is a standard technique in spatial databases and GIS systems!