Case Notes: RoomIQ — Teaching AI to Place Furniture
This is harder than it sounds.
A user wants to redesign their living room. They have a 4×5m space, a door, two windows, a “Scandinavian” aesthetic, and a budget. They want to see the result now — real furniture from a catalog, in 3D, placed correctly.
What does the market offer?
- A human designer — expensive, slow, waitlist.
- An online configurator — drag every piece manually, know every dimension, calculate every clearance yourself. That’s work, not visualization.
- AI image generation — beautiful output, but the sofa has no legs, and the door is blocked by a cabinet. You can’t actually put that in a real room.
This is where the RoomIQ Interior Engine begins.
What we built
The Interior Engine takes a room description and user preferences in plain language, and returns a ready-to-export 3D scene — real furniture from a live catalog, with every item positioned, collision-checked, and physically validated.
Not an image hallucination. Not a rough sketch. A precise plan with coordinates for every object, verified for collisions, walkway clearances, and room geometry constraints.
Why it’s technically hard
When we first tried “just ask GPT to arrange furniture,” it worked about 60% of the time. The other 40%: sofas inside walls, beds blocking doorways, tables with no room for a chair.
LLMs can reason about geometry. They cannot compute it.
So we built a hybrid architecture.
The LLM owns intent — it understands “a cozy bedroom for a family with a toddler,” extracts style preferences, priorities, and budget, and forms a structured request profile.
Deterministic algorithms own physics — spatial zoning, furniture placement, walkway validation (minimum 600mm clearances), door swing verification, and bilateral access to beds.
Evolutionary algorithms resolve conflicts — if anything still overlaps, Simulated Annealing (~600ms for most cases) or a Genetic Algorithm (~15s for complex scenes) iteratively adjusts positions until everything fits. They literally “nudge” furniture until the scene is valid.
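The annealing step can be sketched in a few dozen lines. This is a minimal illustration, not RoomIQ's implementation: furniture is reduced to axis-aligned rectangles, the energy function penalizes pairwise overlap and out-of-room area, and the annealer randomly nudges one item at a time, always accepting improvements and occasionally accepting worse moves while the temperature is high.

```python
import math
import random

def overlap(a, b):
    # Overlap area of two axis-aligned rectangles given as (x, y, w, h).
    ox = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    oy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    return ox * oy

def energy(items, room_w, room_h):
    # Penalty: pairwise overlaps plus any furniture area outside the room.
    e = 0.0
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            e += overlap(a, b)
        e += a[2] * a[3] - overlap(a, (0.0, 0.0, room_w, room_h))
    return e

def anneal(items, room_w, room_h, steps=5000, seed=42):
    rng = random.Random(seed)  # fixed seed = reproducible layouts
    cur = [list(it) for it in items]
    best, best_e = [it[:] for it in cur], energy(cur, room_w, room_h)
    temp = 1.0
    for _ in range(steps):
        temp *= 0.999
        cand = [it[:] for it in cur]
        it = rng.choice(cand)
        it[0] += rng.uniform(-0.3, 0.3)  # nudge x
        it[1] += rng.uniform(-0.3, 0.3)  # nudge y
        e_cur = energy(cur, room_w, room_h)
        e_cand = energy(cand, room_w, room_h)
        # Always accept improvements; accept worse moves with
        # Boltzmann probability so the search can escape local minima.
        if e_cand < e_cur or rng.random() < math.exp(-(e_cand - e_cur) / max(temp, 1e-9)):
            cur = cand
            if e_cand < best_e:
                best, best_e = [it[:] for it in cand], e_cand
    return best, best_e
```

A real engine adds clearance and door-swing terms to the energy function, but the shape of the loop is the same: a cheap scoring function plus thousands of small nudges.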
Six features worth noting
1. Three request paths — minimum LLM calls
Every request is classified into one of three paths:
- PATH 1 — structured input, 0 LLM calls, pure deterministic.
- PATH 2 — free-form text, ≤1 call for profile extraction.
- PATH 3 — complex or ambiguous scenarios, ≤2 calls.
This isn’t just token economics. It means predictable cost and response time per request type — critical for production pricing models.
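A classifier for these paths can be very small. The sketch below is illustrative — the field names and the "vague wording" heuristic are assumptions, not RoomIQ's actual rules — but it shows the key property: the routing decision itself costs zero LLM calls.

```python
def classify_request(payload):
    """Route a request to PATH 1, 2, or 3.

    Heuristics and field names ("room", "items") are illustrative.
    """
    # PATH 1: fully structured input — no LLM needed at all.
    if isinstance(payload, dict) and {"room", "items"} <= payload.keys():
        return 1
    text = str(payload)
    # Vague wording or very long briefs get the two-call path.
    vague = any(w in text.lower() for w in ("cozy", "something", "not sure", "maybe"))
    return 3 if vague or len(text) > 400 else 2
```

Because the classifier is deterministic, each path's cost and latency envelope can be priced and monitored independently.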
2. Real catalog, real SKUs
No invented items. The engine matches furniture from a live database, filtered by style, dimensions, and budget. If the selected sofa doesn’t fit the room, it falls back to a smaller piece from the same collection — it doesn’t just disappear.
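The fallback logic looks roughly like this — a sketch with an assumed catalog schema (`category`, `style`, `price`, `width`, `collection`), not the production query:

```python
def pick_item(catalog, category, style, budget, max_width):
    """Pick the largest matching item; if it doesn't fit, fall back to a
    smaller piece from the same collection rather than dropping the category.

    Catalog field names are illustrative.
    """
    candidates = [it for it in catalog
                  if it["category"] == category
                  and it["style"] == style
                  and it["price"] <= budget]
    if not candidates:
        return None
    # Prefer the largest piece the budget allows.
    preferred = max(candidates, key=lambda it: it["width"])
    if preferred["width"] <= max_width:
        return preferred
    # Too wide for the available span: same collection, smaller size.
    fallback = [it for it in candidates
                if it["collection"] == preferred["collection"]
                and it["width"] <= max_width]
    return max(fallback, key=lambda it: it["width"]) if fallback else None
```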
3. Soft validation
There are two error classes: blockers (furniture outside walls, door obstructed) that prevent export, and warnings (slightly tight clearance near a nightstand) that are logged but don’t halt the process. Users get a result with notes — not a blank screen with an error.
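The two-tier report is simple to model. A minimal sketch (thresholds match the 600mm walkway rule from earlier; the rest of the names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class ValidationReport:
    blockers: list = field(default_factory=list)  # prevent export
    warnings: list = field(default_factory=list)  # logged, non-fatal

    @property
    def exportable(self):
        # Warnings never block export; blockers always do.
        return not self.blockers

def check_walkway(report, name, clearance_mm, minimum=600):
    if clearance_mm <= 0:
        report.blockers.append(f"{name}: walkway fully obstructed")
    elif clearance_mm < minimum:
        report.warnings.append(
            f"{name}: tight clearance ({clearance_mm}mm < {minimum}mm)")
```

The user-facing payload then carries the scene plus `warnings` as annotations, so a slightly tight nightstand never turns into a failed request.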
4. Async render pipeline
Scene generation is fast. Photorealistic rendering is slow (100+ seconds). These are decoupled: the API responds in ~23 seconds with a preview, while the full render runs in the background. By the time a user finishes reviewing the floor plan, the image is nearly ready.
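The decoupling pattern, reduced to its core — this sketch uses a thread pool as a stand-in for whatever queue the production pipeline uses, and the function names are hypothetical:

```python
import time
from concurrent.futures import ThreadPoolExecutor

_renders = ThreadPoolExecutor(max_workers=2)

def slow_render(scene_id):
    # Stand-in for the 100s+ photorealistic render.
    time.sleep(0.1)
    return f"render-{scene_id}.png"

def handle_request(scene_id):
    # Fast path: queue the render, respond with the preview immediately.
    future = _renders.submit(slow_render, scene_id)
    return {"preview": f"preview-{scene_id}.svg", "render_job": future}
```

The caller gets its preview without waiting on `slow_render`; the render result is fetched (or pushed) when it completes.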
5. Multi-provider LLM router
OpenAI, Anthropic, and OpenRouter (100+ models) sit behind a single interface. Routing is based on priority, cost, and latency. Automatic failover on provider errors. From the user’s perspective, provider outages are invisible.
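Failover itself is the easy part of such a router. A minimal sketch (provider ordering by priority/cost/latency is assumed to happen upstream; provider names here are placeholders):

```python
def call_with_failover(providers, prompt):
    """Try providers in priority order; fall through on any error.

    `providers` is a list of (name, callable) pairs, already sorted
    by whatever priority/cost/latency policy applies.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            # Record and move on — the caller only sees a failure
            # if every provider is down.
            errors.append((name, repr(exc)))
    raise RuntimeError(f"all providers failed: {errors}")
```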
6. Deterministic reproducibility
A dedicated deterministic mode ensures the same input always produces the same output. Essential for A/B testing, debugging, and client demos.
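One common way to get this property (a sketch, not necessarily how RoomIQ does it) is to derive every random seed from a hash of the request itself, so identical inputs drive identical placement decisions:

```python
import hashlib
import random

def seeded_rng(request_payload: str) -> random.Random:
    """Derive a stable RNG from the request content itself, so the
    same input always yields the same layout decisions."""
    digest = hashlib.sha256(request_payload.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))
```

Every stochastic component (annealing nudges, tie-breaking in catalog selection) then draws from this RNG instead of global randomness.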
The pipeline
Every stage has configurable rules in YAML, metrics in Prometheus, and errors in Sentry. Infrastructure runs on AWS (ECS Fargate, RDS, S3), provisioned with Terraform.
Metrics
| Metric | Value |
|---|---|
| API response with preview | ~23 seconds |
| Simulated Annealing (most scenes) | ~600ms |
| Genetic Algorithm (complex scenes) | ~15 seconds |
| Photorealistic render | 100+ seconds (async) |
| Max LLM calls per request | ≤2 |
| Pipeline uptime target | 99%+ |
Takeaways
- LLMs can reason about geometry. They cannot compute it. Separate these concerns explicitly.
- Classify requests early. Knowing which path a request takes lets you control cost, speed, and predictability.
- Soft validation over hard errors. A result with warnings is always better than a blank screen.
- Decouple fast from slow. Separate the ~23s interactive response from the 100s+ render — users don’t wait for what runs in the background.
- Reproducibility is infrastructure. Deterministic mode makes A/B tests, debugging, and demos reliable by design.
The bigger point
We didn’t build “another interior design chatbot.” We solved a specific engineering problem: how to make an LLM do what it’s bad at — geometry and physics — by using it only where it excels: intent understanding, style interpretation, and context reasoning.
The result is a system where neural networks and classical algorithms work as partners, not substitutes. That’s what production hybrid AI looks like.