Rectangle Decomposition with Spatial Join

What you'll learn: How to use multiple simple indexes (one per property) and combine them only when you need the full picture.

Series: Part 3 of 3


The Problem

In Part 2, we saw that Shallow Merge creates complexity - you have to store explicit overlap regions with combined properties. That's 4 rectangles instead of 2!

Question: What if we want cells to have multiple properties (background AND font color), but we don't want the merge complexity during inserts?

With Spatial Join: Keep background colors and font colors in SEPARATE simple indexes. Only combine them when you need to!

Let's see how this works...


Step-by-Step Walkthrough

Starting Point: Two Empty Indexes

Instead of one unified index, we create TWO separate indexes:

Each index is independent and simple!


Step 1: Set Background Colors (First Index)

What we do: backgrounds.insert(A1:C2, 'RED')

Then: backgrounds.insert(B0:D2, 'GREEN')

What it means: Same as Part 1 - just normal LWW in the backgrounds index!

The result:

     A   B   C   D
   +---+---+---+---+
 0 |   | G | G | G |
   +---+---+---+---+
 1 | R | G | G | G |  ← B1:C1 now GREEN (LWW)
   +---+---+---+---+
 2 | R | G | G | G |  ← B2:C2 now GREEN (LWW)
   +---+---+---+---+
 3 |   |   |   |   |
   +---+---+---+---+

Storage in backgrounds index: [
  { bounds: A1:A2, value: 'RED' },    // Left fragment
  { bounds: B0:D2, value: 'GREEN' }   // New range (LWW)
]

R = RED background
G = GREEN background

Notice: Just 2 ranges! Simple LWW decomposition, exactly like Part 1.

No merge complexity - the backgrounds index doesn't know or care about font colors!


Step 2: Set Font Colors (Second Index)

What we do: fontColors.insert(B1:D2, 'BLUE')

What it means: Give cells B1, C1, D1, B2, C2, D2 blue font color - but store it in a DIFFERENT index!

The result:

     A   B   C   D
   +---+---+---+---+
 0 |   |   |   |   |
   +---+---+---+---+
 1 |   | B | B | B |
   +---+---+---+---+
 2 |   | B | B | B |
   +---+---+---+---+
 3 |   |   |   |   |
   +---+---+---+---+

Storage in fontColors index: [
  { bounds: B1:D2, value: 'BLUE' }
]

B = BLUE font color

Again, simple! Just 1 range in this index.

Key insight: We now have TWO simple indexes instead of ONE complex index!

Compare to Shallow Merge (Part 2): 4 ranges in 1 unified index.


The Magic: Spatial Join at Query Time

Now here's where it gets interesting! When you want to see what properties a cell has, you combine the indexes on the fly.

The question: What are the properties for cells in viewport A0:D3?

Step 1: Query each index separately

const bgResults = backgrounds.query(A0:D3);
// Returns: [
//   { bounds: A1:A2, value: 'RED' },
//   { bounds: B0:D2, value: 'GREEN' }
// ]

const fontResults = fontColors.query(A0:D3);
// Returns: [
//   { bounds: B1:D2, value: 'BLUE' }
// ]

Easy! Each index just returns its own ranges.

Step 2: Spatial join - combine the results

Now we find where the ranges overlap and combine the properties:

function spatialJoin(bgResults, fontResults) {
  // For each distinct region in the viewport, find which properties apply
  const joined = [];

  // Process all unique regions (areas where coverage changes)
  for each distinct region in union(bgResults, fontResults) {
    const bg = findCoveringRange(region, bgResults);
    const font = findCoveringRange(region, fontResults);

    joined.push({
      bounds: region,
      background: bg?.value,    // undefined if no background
      fontColor: font?.value    // undefined if no font color
    });
  }

  return joined;
}

Think of it like putting two transparency sheets on top of each other - you see both!

Step 3: The combined result

     A   B   C   D
   +---+---+---+---+
 0 |   | G | G | G |  ← { background: GREEN }
   +---+---+---+---+
 1 | R |R+B|R+B|G+B|  ← A1={ bg: RED }, B1:C1={ bg: GREEN, font: BLUE }, D1={ bg: GREEN, font: BLUE }
   +---+---+---+---+
 2 | R |R+B|R+B|G+B|  ← A2={ bg: RED }, B2:C2={ bg: GREEN, font: BLUE }, D2={ bg: GREEN, font: BLUE }
   +---+---+---+---+
 3 |   |   |   |   |
   +---+---+---+---+

Joined result: [
  { bounds: A1:A2, background: 'RED', fontColor: undefined },
  { bounds: B0:D0, background: 'GREEN', fontColor: undefined },
  { bounds: B1:C2, background: 'GREEN', fontColor: 'BLUE' },  ← Joined!
  { bounds: D1:D2, background: 'GREEN', fontColor: 'BLUE' }   ← Joined!
]

R = { background: RED }
G = { background: GREEN }
B = { fontColor: BLUE }
R+B = { background: RED, fontColor: BLUE } (from join, not storage!)
G+B = { background: GREEN, fontColor: BLUE } (from join, not storage!)

Key insight: Combined properties only exist in the query result, NOT in storage!


What Just Happened?

Think of spatial join like organizing books:

Shallow Merge approach (Part 2):

Spatial Join approach (Part 3):

The tradeoff:

But here's the thing: In spreadsheets, you insert operations OFTEN (user edits) but query RARELY (only when rendering). So this tradeoff makes sense!


Key Concepts

Separate Indexes = Simple Inserts

Each property gets its own index:

Each index uses simple LWW (exactly like Part 1!). No merge complexity!

Spatial Join = Combining at Query Time

When you need to see cell properties:

  1. Query each index independently
  2. Find where the results overlap
  3. Combine properties from all indexes for each region

The combination happens during the query, not during insert.

The Tradeoff

Compared to LWW (Part 1):

Compared to Shallow Merge (Part 2):

Why "Spatial" Join?

It's called spatial because we're joining based on where things are in space (which cells they cover), not on some ID or key field.

Academic term from spatial databases (like PostGIS for geographic data).


Summary

What we learned:

  1. Spatial join uses multiple simple indexes instead of one complex index
  2. Each index stores one property type using simple LWW
  3. Properties are combined at query time by finding overlaps
  4. Simpler inserts, slightly more complex queries

The tradeoff:

When to use this:

Comparison table:

Approach Insert Query Storage (example) Best for
LWW (Part 1) Simple Simple 2 ranges Single property only
Shallow Merge (Part 2) Complex Simple 4 ranges Properties always together
Spatial Join (Part 3) Simple Join 3 ranges (across 2 indexes) Independent properties

What's Next?

You've now seen all three approaches! Each has different tradeoffs:

For real-world spreadsheet systems (like Google Sheets), Spatial Join is often the best choice because:

This is a standard technique in spatial databases and GIS systems!