Case Notes: RoomIQ — Teaching AI to Place Furniture
This is harder than it sounds.
A user wants to redesign their living room. They have a 4×5m space, a door, two windows, a “Scandinavian” aesthetic, and a budget. They want to see the result now — real furniture from a catalog, in 3D, placed correctly.
What does the market offer?
- A human designer — expensive, slow, waitlist.
- An online configurator — drag every piece manually, know every dimension, calculate every clearance yourself. That’s work, not visualization.
- AI image generation — beautiful output, but the sofa has no legs, and the door is blocked by a cabinet. You can’t actually put that in a real room.
This is where the RoomIQ Interior Engine begins.
What we built
The Interior Engine takes a room description and user preferences in plain language, and returns a ready-to-export 3D scene — real furniture from a live catalog, with every item positioned, collision-checked, and physically validated.
Not an image hallucination. Not a rough sketch. A precise plan with coordinates for every object, verified for collisions, walkway clearances, and room geometry constraints.
Why it’s technically hard
When we first tried “just ask GPT to arrange furniture,” it worked about 60% of the time. The other 40%: sofas inside walls, beds blocking doorways, tables with no room for a chair.
LLMs can reason about geometry. They cannot compute it.
So we built a hybrid architecture.
The LLM owns intent — it understands “a cozy bedroom for a family with a toddler,” extracts style preferences, priorities, and budget, and forms a structured request profile.
Deterministic algorithms own physics — spatial zoning, furniture placement, walkway validation (minimum 600mm clearances), door swing verification, and bilateral access to beds.
Evolutionary algorithms resolve conflicts — if anything still overlaps, Simulated Annealing (~600ms for most cases) or a Genetic Algorithm (~15s for complex scenes) iteratively adjusts positions until everything fits. They literally “nudge” furniture until the scene is valid.
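The annealing step can be sketched in a few dozen lines. This is a minimal illustration, not RoomIQ's implementation: furniture is reduced to axis-aligned rectangles, the energy function penalizes pairwise overlap and out-of-room area, and the annealer randomly nudges one item at a time, always accepting improvements and occasionally accepting worse moves while the temperature is high.

```python
import math
import random

def overlap(a, b):
    # Overlap area of two axis-aligned rectangles given as (x, y, w, h).
    ox = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    oy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    return ox * oy

def energy(items, room_w, room_h):
    # Penalty: pairwise overlaps plus any furniture area outside the room.
    e = 0.0
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            e += overlap(a, b)
        e += a[2] * a[3] - overlap(a, (0.0, 0.0, room_w, room_h))
    return e

def anneal(items, room_w, room_h, steps=5000, seed=42):
    rng = random.Random(seed)  # fixed seed = reproducible layouts
    cur = [list(it) for it in items]
    best, best_e = [it[:] for it in cur], energy(cur, room_w, room_h)
    temp = 1.0
    for _ in range(steps):
        temp *= 0.999
        cand = [it[:] for it in cur]
        it = rng.choice(cand)
        it[0] += rng.uniform(-0.3, 0.3)  # nudge x
        it[1] += rng.uniform(-0.3, 0.3)  # nudge y
        e_cur = energy(cur, room_w, room_h)
        e_cand = energy(cand, room_w, room_h)
        # Always accept improvements; accept worse moves with
        # Boltzmann probability so the search can escape local minima.
        if e_cand < e_cur or rng.random() < math.exp(-(e_cand - e_cur) / max(temp, 1e-9)):
            cur = cand
            if e_cand < best_e:
                best, best_e = [it[:] for it in cand], e_cand
    return best, best_e
```

A real engine adds clearance and door-swing terms to the energy function, but the shape of the loop is the same: a cheap scoring function plus thousands of small nudges.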
Six features worth noting
1. Three request paths — minimum LLM calls
Every request is classified into one of three paths:
- PATH 1 — structured input, 0 LLM calls, pure deterministic.
- PATH 2 — free-form text, ≤1 call for profile extraction.
- PATH 3 — complex or ambiguous scenarios, ≤2 calls.
This isn’t just token economics. It means predictable cost and response time per request type — critical for production pricing models.
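A classifier for these paths can be very small. The sketch below is illustrative — the field names and the "vague wording" heuristic are assumptions, not RoomIQ's actual rules — but it shows the key property: the routing decision itself costs zero LLM calls.

```python
def classify_request(payload):
    """Route a request to PATH 1, 2, or 3.

    Heuristics and field names ("room", "items") are illustrative.
    """
    # PATH 1: fully structured input — no LLM needed at all.
    if isinstance(payload, dict) and {"room", "items"} <= payload.keys():
        return 1
    text = str(payload)
    # Vague wording or very long briefs get the two-call path.
    vague = any(w in text.lower() for w in ("cozy", "something", "not sure", "maybe"))
    return 3 if vague or len(text) > 400 else 2
```

Because the classifier is deterministic, each path's cost and latency envelope can be priced and monitored independently.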
2. Real catalog, real SKUs
No invented items. The engine matches furniture from a live database, filtered by style, dimensions, and budget. If the selected sofa doesn’t fit the room, it falls back to a smaller piece from the same collection — it doesn’t just disappear.
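The fallback logic looks roughly like this — a sketch with an assumed catalog schema (`category`, `style`, `price`, `width`, `collection`), not the production query:

```python
def pick_item(catalog, category, style, budget, max_width):
    """Pick the largest matching item; if it doesn't fit, fall back to a
    smaller piece from the same collection rather than dropping the category.

    Catalog field names are illustrative.
    """
    candidates = [it for it in catalog
                  if it["category"] == category
                  and it["style"] == style
                  and it["price"] <= budget]
    if not candidates:
        return None
    # Prefer the largest piece the budget allows.
    preferred = max(candidates, key=lambda it: it["width"])
    if preferred["width"] <= max_width:
        return preferred
    # Too wide for the available span: same collection, smaller size.
    fallback = [it for it in candidates
                if it["collection"] == preferred["collection"]
                and it["width"] <= max_width]
    return max(fallback, key=lambda it: it["width"]) if fallback else None
```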
3. Soft validation
There are two error classes: blockers (furniture outside walls, door obstructed) that prevent export, and warnings (slightly tight clearance near a nightstand) that are logged but don’t halt the process. Users get a result with notes — not a blank screen with an error.
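The two-tier report is simple to model. A minimal sketch (thresholds match the 600mm walkway rule from earlier; the rest of the names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class ValidationReport:
    blockers: list = field(default_factory=list)  # prevent export
    warnings: list = field(default_factory=list)  # logged, non-fatal

    @property
    def exportable(self):
        # Warnings never block export; blockers always do.
        return not self.blockers

def check_walkway(report, name, clearance_mm, minimum=600):
    if clearance_mm <= 0:
        report.blockers.append(f"{name}: walkway fully obstructed")
    elif clearance_mm < minimum:
        report.warnings.append(
            f"{name}: tight clearance ({clearance_mm}mm < {minimum}mm)")
```

The user-facing payload then carries the scene plus `warnings` as annotations, so a slightly tight nightstand never turns into a failed request.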
4. Async render pipeline
Scene generation is fast. Photorealistic rendering is slow (100+ seconds). These are decoupled: the API responds in ~23 seconds with a preview, while the full render runs in the background. By the time a user finishes reviewing the floor plan, the image is nearly ready.
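The decoupling pattern, reduced to its core — this sketch uses a thread pool as a stand-in for whatever queue the production pipeline uses, and the function names are hypothetical:

```python
import time
from concurrent.futures import ThreadPoolExecutor

_renders = ThreadPoolExecutor(max_workers=2)

def slow_render(scene_id):
    # Stand-in for the 100s+ photorealistic render.
    time.sleep(0.1)
    return f"render-{scene_id}.png"

def handle_request(scene_id):
    # Fast path: queue the render, respond with the preview immediately.
    future = _renders.submit(slow_render, scene_id)
    return {"preview": f"preview-{scene_id}.svg", "render_job": future}
```

The caller gets its preview without waiting on `slow_render`; the render result is fetched (or pushed) when it completes.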
5. Multi-provider LLM router
OpenAI, Anthropic, and OpenRouter (100+ models) sit behind a single interface. Routing is based on priority, cost, and latency. Automatic failover on provider errors. From the user’s perspective, provider outages are invisible.
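Failover itself is the easy part of such a router. A minimal sketch (provider ordering by priority/cost/latency is assumed to happen upstream; provider names here are placeholders):

```python
def call_with_failover(providers, prompt):
    """Try providers in priority order; fall through on any error.

    `providers` is a list of (name, callable) pairs, already sorted
    by whatever priority/cost/latency policy applies.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            # Record and move on — the caller only sees a failure
            # if every provider is down.
            errors.append((name, repr(exc)))
    raise RuntimeError(f"all providers failed: {errors}")
```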
6. Deterministic reproducibility
A dedicated deterministic mode ensures the same input always produces the same output. Essential for A/B testing, debugging, and client demos.
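One common way to get this property (a sketch, not necessarily how RoomIQ does it) is to derive every random seed from a hash of the request itself, so identical inputs drive identical placement decisions:

```python
import hashlib
import random

def seeded_rng(request_payload: str) -> random.Random:
    """Derive a stable RNG from the request content itself, so the
    same input always yields the same layout decisions."""
    digest = hashlib.sha256(request_payload.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))
```

Every stochastic component (annealing nudges, tie-breaking in catalog selection) then draws from this RNG instead of global randomness.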
The pipeline
Every stage has configurable rules in YAML, metrics in Prometheus, and errors in Sentry. Infrastructure runs on AWS (ECS Fargate, RDS, S3), provisioned with Terraform.
Metrics
| Metric | Value |
|---|---|
| API response with preview | ~23 seconds |
| Simulated Annealing (most scenes) | ~600ms |
| Genetic Algorithm (complex scenes) | ~15 seconds |
| Photorealistic render | 100+ seconds (async) |
| Max LLM calls per request | ≤2 |
| Pipeline uptime target | 99%+ |
Takeaways
- LLMs can reason about geometry. They cannot compute it. Separate these concerns explicitly.
- Classify requests early. Knowing which path a request takes lets you control cost, speed, and predictability.
- Soft validation over hard errors. A result with warnings is always better than a blank screen.
- Decouple fast from slow. Separate the ~23s interactive response from the 100s+ render — users don’t wait for what runs in the background.
- Reproducibility is infrastructure. Deterministic mode makes A/B tests, debugging, and demos reliable by design.
The bigger point
We didn’t build “another interior design chatbot.” We solved a specific engineering problem: how to make an LLM do what it’s bad at — geometry and physics — by using it only where it excels: intent understanding, style interpretation, and context reasoning.
The result is a system where neural networks and classical algorithms work as partners, not substitutes. That’s what production hybrid AI looks like.