RoomIQ — AI Interior Design Platform
At a glance
- Industry: Retail / AI-powered interior design
- Focus: LLM orchestration, spatial geometry, 3D scene generation
- Goal: Generate physically valid room layouts with real catalog furniture from natural language input
- Duration: ~6 months from concept to production
Context
The client brief was deceptively simple: a user describes their room and preferred style — the system returns a complete interior with real furniture from a catalog. We built a full orchestration engine combining LLMs, deterministic algorithms, and evolutionary optimization into a single pipeline. Here is what we learned along the way.
Insight 1: GPT can reason about geometry. It cannot compute it.
The first prototype was naive: pass room parameters in a prompt, ask for furniture placement, parse coordinates from JSON output.
It worked about 60% of the time. The other 40%: a sofa clipping through the wall, a bed blocking the doorway, a chair with no room to pull out.
The problem is fundamental. An LLM is a statistical machine over text. It knows that “a bed usually goes against the wall” — but it cannot compute that a specific 180×200cm bed in a 3.2×4.1m room with a door 80cm from the corner only fits if you rotate it and shift it 15cm left.
The fix: the LLM no longer places furniture. It extracts user intent — style, priorities, budget, atmosphere. Everything downstream is deterministic algorithms and physics.
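The deterministic side reduces to exact geometry checks that an LLM cannot be trusted with. A minimal sketch using the 3.2×4.1 m room and 180×200 cm bed from above (the `Rect` helper and the door-clearance numbers are illustrative, not the production API):

```python
from dataclasses import dataclass

@dataclass
class Rect:
    """Axis-aligned footprint in metres; (x, y) is the lower-left corner."""
    x: float
    y: float
    w: float
    h: float

    def overlaps(self, other: "Rect") -> bool:
        # Strict AABB intersection: touching edges do not count as overlap.
        return (self.x < other.x + other.w and other.x < self.x + self.w and
                self.y < other.y + other.h and other.y < self.y + self.h)

def inside_room(item: Rect, room_w: float, room_l: float) -> bool:
    return (item.x >= 0 and item.y >= 0 and
            item.x + item.w <= room_w and item.y + item.h <= room_l)

# The 3.2 x 4.1 m room from above, with a 1.8 x 2.0 m bed in the corner.
bed = Rect(0.0, 0.0, 1.8, 2.0)
door_swing = Rect(0.8, 0.0, 0.9, 0.9)     # illustrative door-clearance zone
print(inside_room(bed, 3.2, 4.1))         # True: fits the shell...
print(bed.overlaps(door_swing))           # True: ...but blocks the door
```

Checks like these are exact, repeatable, and millisecond-cheap — exactly the properties the prompt-based placement lacked.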
Insight 2: Evolution beats hand-written rules
When the deterministic planner places furniture, conflicts arise — especially in small rooms or non-standard layouts. The first instinct was to write correction rules: “if the sofa overlaps the table, shift the sofa by X.”
The rules turned into spaghetti fast. Every fix broke something else.
We switched to evolutionary algorithms:
- Simulated Annealing for straightforward cases: runs in ~600ms, resolves most conflicts.
- Genetic Algorithm for complex scenes (8+ items): ~15 seconds, but finds solutions where SA gets stuck in a local minimum.
The system selects automatically. If SA reaches a sufficient fitness score, GA never runs.
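A toy one-dimensional version of the annealing step shows the mechanics; the cooling schedule, step size, and penalty here are simplified stand-ins for the production optimizer:

```python
import math
import random

def overlap_1d(a0, a1, b0, b1):
    """Length of the intersection of intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))

def penalty(xs, widths, room_w):
    """Total pairwise overlap plus out-of-bounds spill (lower is better)."""
    p = 0.0
    for i in range(len(xs)):
        p += max(0.0, -xs[i]) + max(0.0, xs[i] + widths[i] - room_w)
        for j in range(i + 1, len(xs)):
            p += overlap_1d(xs[i], xs[i] + widths[i],
                            xs[j], xs[j] + widths[j])
    return p

def anneal(xs, widths, room_w, steps=5000, t0=1.0, seed=0):
    rng = random.Random(seed)
    cur, cur_p = list(xs), penalty(xs, widths, room_w)
    best, best_p = list(cur), cur_p
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-6            # linear cooling
        cand = list(cur)
        cand[rng.randrange(len(cand))] += rng.uniform(-0.3, 0.3)
        cand_p = penalty(cand, widths, room_w)
        # Accept improvements always; accept worse moves with falling probability.
        if cand_p < cur_p or rng.random() < math.exp((cur_p - cand_p) / t):
            cur, cur_p = cand, cand_p
            if cur_p < best_p:
                best, best_p = list(cur), cur_p
    return best, best_p

# Three items stacked at x = 0 along a 4 m wall: heavy initial overlap.
xs, widths = [0.0, 0.0, 0.0], [1.2, 1.0, 0.8]
placed, residual = anneal(xs, widths, room_w=4.0)
print(residual < penalty(xs, widths, 4.0))            # True: overlap reduced
```

The "accept a worse move with falling probability" line is what lets SA escape shallow local minima that pure hill climbing cannot — and its failure on deep minima is what triggers the GA fallback.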
The fitness function deserves its own mention. It doesn’t just penalize collisions (−5000 for intersection) — it actively rewards ergonomics: +9 if the bed is accessible from both sides, +7 if the desk faces a window, +4 for balanced furniture distribution across the room. The algorithm optimizes directly for livability.
Insight 3: Three request paths instead of one
Early versions routed every request through the full LLM pipeline. Expensive, slow, and mostly unnecessary.
We introduced request classification into three paths:
| Path | Request type | LLM calls |
|---|---|---|
| PATH 1 | Structured — all parameters explicit | 0 |
| PATH 2 | Free-form text, straightforward requirements | ≤ 1 |
| PATH 3 | Complex context, many constraints | ≤ 2 |
Most production traffic lands on PATH 1 and PATH 2. The token savings are real. But the bigger benefit is predictability — both latency and cost per request become deterministic.
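A possible shape for the classifier, with invented field names and thresholds — the real heuristics are not disclosed in this write-up:

```python
REQUIRED = {"room_width", "room_length", "style", "budget"}

def classify(request: dict) -> str:
    """Route a request to the cheapest path that can handle it.

    Field names and thresholds here are illustrative, not the production rules.
    """
    if REQUIRED <= set(request) and "free_text" not in request:
        return "PATH_1"                   # fully structured: zero LLM calls
    text = request.get("free_text", "")
    # Crude complexity signal: long briefs with many clauses escalate.
    if len(text.split()) > 60 or text.count(",") > 5:
        return "PATH_3"                   # complex context: up to 2 LLM calls
    return "PATH_2"                       # simple free text: at most 1 call

print(classify({"room_width": 3.2, "room_length": 4.1,
                "style": "scandi", "budget": 2000}))          # PATH_1
print(classify({"free_text": "small bedroom, light wood"}))   # PATH_2
```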
Insight 4: The async render was the non-obvious call
Scene generation takes a few seconds. Photorealistic rendering takes 100+ seconds. Waiting synchronously means the user stares at a spinner for nearly two minutes.
We decoupled the two processes: the API returns a schematic floor-plan preview in ~23 seconds, while the render runs in the background. By the time the user is reviewing the layout and browsing furniture options, the render is nearly done. The photorealistic image appears exactly when the user needs it — not two minutes later.
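The pattern is the classic fire-and-forget background job. A minimal sketch with `threading` (a production system would use a task queue and a persistent job store; the short sleep stands in for the 100+ second render):

```python
import threading
import time
import uuid

RENDERS = {}   # job_id -> state; a real system would use a queue + job store

def render_photoreal(scene, job_id):
    time.sleep(0.1)                        # stand-in for the 100+ s render
    RENDERS[job_id] = {"status": "done", "image": f"render-{job_id}.png"}

def generate(scene):
    """Return the fast schematic preview immediately; render in the background."""
    job_id = uuid.uuid4().hex
    RENDERS[job_id] = {"status": "rendering"}
    threading.Thread(target=render_photoreal, args=(scene, job_id),
                     daemon=True).start()
    return {"floor_plan": "schematic.svg", "render_job": job_id}

job = generate({"room": "3.2x4.1"})
print(RENDERS[job["render_job"]]["status"])   # rendering
time.sleep(0.3)
print(RENDERS[job["render_job"]]["status"])   # done
```

The client polls (or subscribes to) the job id it got with the preview, so the UI can swap the schematic for the photorealistic image the moment it lands.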
Insight 5: Soft validation
The classic dilemma: fail loudly with an error, or show the user something imperfect but usable?
We defined two error classes:
Blockers — furniture outside the walls, a door obstructed, serious item overlap. These are physically impossible to reproduce in a real room. Export is disabled.
Warnings — a nightstand slightly tight against the wall, a rug edging past the zone boundary. Logged and surfaced to the user, but the process continues.
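The two classes map naturally onto a severity enum. This sketch assumes a simple issue dict, not the production schema:

```python
from enum import Enum

class Severity(Enum):
    BLOCKER = "blocker"   # physically impossible: export disabled
    WARNING = "warning"   # suboptimal but usable: surfaced, flow continues

def validate(issues):
    """Partition issues; a layout is exportable iff it has no blockers."""
    blockers = [i for i in issues if i["severity"] is Severity.BLOCKER]
    warnings = [i for i in issues if i["severity"] is Severity.WARNING]
    return {"exportable": not blockers,
            "blockers": blockers, "warnings": warnings}

report = validate([
    {"severity": Severity.WARNING,
     "msg": "nightstand clearance below recommended"},
])
print(report["exportable"])   # True: warnings alone never block export
```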
The psychological effect matters. A result with a note — “this area could be improved” — is far better than a blank screen with an error code. Users iterate. They don’t abandon.
Insight 6: The LLM router as production insurance
Depending on a single provider in production is a liability. OpenAI goes down. Rate limits hit. Latency spikes without warning.
We built a custom router across OpenAI, Anthropic, and OpenRouter (100+ models behind one API). It balances across four strategies: priority, round-robin, minimum cost, minimum latency. On provider failure — automatic failover to the next. Users see nothing. We get an alert and investigate.
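The priority strategy reduces to "walk the provider list until one answers". A minimal sketch — provider names and callables are placeholders, and the real router also implements round-robin, minimum-cost, and minimum-latency selection:

```python
class LLMRouter:
    """Priority strategy: try providers in order, fail over on any error."""

    def __init__(self, providers):
        self.providers = providers        # ordered list of (name, callable)

    def complete(self, prompt):
        last_err = None
        for name, call in self.providers:
            try:
                return name, call(prompt)
            except Exception as err:      # timeout, rate limit, outage...
                last_err = err            # in production: fire an alert here
        raise RuntimeError("all providers failed") from last_err

def flaky(prompt):                        # stand-in for a provider outage
    raise TimeoutError("provider down")

router = LLMRouter([("openai", flaky), ("anthropic", lambda p: "ok")])
print(router.complete("hello"))           # ('anthropic', 'ok')
```

The key property: failover is invisible to the caller, while the swallowed exception still surfaces through alerting.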
Results
| Metric | Result |
|---|---|
| Placement validity | 95%+ (up from a ~60% baseline) |
| API response with preview | ~23 seconds |
| Conflict resolution — SA | ~600ms |
| Conflict resolution — GA | ~15 seconds |
| Max LLM calls per request | ≤ 2 |
| Pipeline uptime | 99%+ |
Key takeaways
Don’t trust LLMs with what they do poorly. LLMs understand people well — they reason poorly about spatial constraints. A hybrid architecture where each component does what it’s actually good at is more reliable, faster, and cheaper than trying to make a neural network do everything.
YAML as the single source of rules. All placement rules — clearances, zone priorities, minimum walkway widths — live in one configuration file. The validator, the evolutionary optimizer, and the router all read the same rules. No desync. No “we forgot to update that part.” Changes deploy without a code push.
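For illustration, such a file might look like the fragment below. The keys and clearance values are invented; only the fitness weights repeat numbers quoted earlier in this write-up:

```yaml
# placement_rules.yaml -- single source of truth for the whole pipeline
clearances:
  bed_side_min_cm: 60        # illustrative value
  walkway_min_cm: 80         # illustrative value
  door_swing_cm: 90          # illustrative value
fitness:
  collision_penalty: -5000
  bed_both_sides: 9
  desk_faces_window: 7
  balanced_layout: 4
```

Because every component parses the same file, tuning a clearance is a config change reviewed once, not a hunt through three codebases.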