Case Notes: RoomIQ — Teaching AI to Place Furniture

How we built RoomIQ's Interior Engine — a hybrid system combining LLMs, deterministic geometry algorithms, and evolutionary optimization to generate physics-valid 3D room layouts from natural language.

This is harder than it sounds.

A user wants to redesign their living room. They have a 4×5m space, a door, two windows, a “Scandinavian” aesthetic, and a budget. They want to see the result now — real furniture from a catalog, in 3D, placed correctly.

What does the market offer?

This is where the RoomIQ Interior Engine begins.

  • Human designer: slow and expensive
  • Configurator: manual, no understanding of intent
  • AI image generation: pretty hallucinations

Three existing options — each with a fundamental gap.

What we built

The Interior Engine takes a room description and user preferences in plain language, and returns a ready-to-export 3D scene — real furniture from a live catalog, with every item positioned, collision-checked, and physically validated.

Not an image hallucination. Not a rough sketch. A precise plan with coordinates for every object, verified for collisions, walkway clearances, and room geometry constraints.

Why it’s technically hard

When we first tried “just ask GPT to arrange furniture,” it worked about 60% of the time. The other 40%: sofas inside walls, beds blocking doorways, tables with no room for a chair.

LLMs can reason about geometry. They cannot compute it.

So we built a hybrid architecture.

  • LLM layer: intent, style, budget
  • Geometry engine: physics, zones, clearances
  • Evolutionary optimizer: SA ~600ms / GA ~15s

Three layers — each responsible for what it does best.

The LLM owns intent — it understands “a cozy bedroom for a family with a toddler,” extracts style preferences, priorities, and budget, and forms a structured request profile.
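The structured request profile can be pictured as a small typed record. This is an illustrative sketch only — the field names and values below are hypothetical, not RoomIQ's actual schema:

```python
from dataclasses import dataclass, field

# Hypothetical shape of the structured profile the LLM layer emits.
# Field names are illustrative, not the actual RoomIQ schema.
@dataclass
class RequestProfile:
    style: str                      # e.g. "scandinavian"
    room_type: str                  # e.g. "bedroom"
    budget_max: float               # in the catalog's currency
    priorities: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

# "a cozy bedroom for a family with a toddler" might become:
profile = RequestProfile(
    style="scandinavian",
    room_type="bedroom",
    budget_max=3000.0,
    priorities=["storage", "toddler-safe"],
)
```

Downstream stages consume this record instead of free text, which is what lets the geometry layer stay fully deterministic.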

Deterministic algorithms own physics — spatial zoning, furniture placement, walkway validation (minimum 600mm clearances), door swing verification, and bilateral access to beds.
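The core of these checks is plain rectangle arithmetic. A minimal sketch of the overlap and clearance tests (footprints and helper names are illustrative; the real engine also handles door swings and bed access):

```python
# Minimal sketch of the deterministic geometry checks: axis-aligned
# overlap and walkway clearance between furniture footprints.

MIN_CLEARANCE_MM = 600  # minimum walkway width from the spec above

def overlaps(a, b):
    """a, b: (x, y, width, depth) footprints in mm; True if they intersect."""
    ax, ay, aw, ad = a
    bx, by, bw, bd = b
    return ax < bx + bw and bx < ax + aw and ay < by + bd and by < ay + ad

def gap(a, b):
    """Smallest axis-aligned gap between two non-overlapping footprints
    (a conservative lower bound on the walkway width between them)."""
    ax, ay, aw, ad = a
    bx, by, bw, bd = b
    dx = max(bx - (ax + aw), ax - (bx + bw), 0)
    dy = max(by - (ay + ad), ay - (by + bd), 0)
    return max(dx, dy)

sofa = (0, 0, 2200, 900)
table = (2900, 0, 1200, 700)
assert not overlaps(sofa, table)
assert gap(sofa, table) >= MIN_CLEARANCE_MM  # 700mm walkway: passes
```

Because these are pure functions over coordinates, they run in microseconds and never hallucinate — exactly the property the LLM lacks.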

Evolutionary algorithms resolve conflicts — if anything still overlaps, Simulated Annealing (~600ms for most cases) or a Genetic Algorithm (~15s for complex scenes) iteratively adjusts positions until everything fits. They literally “nudge” furniture until the scene is valid.
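The "nudge until valid" idea behind the simulated-annealing pass can be sketched in a few lines. This is a toy version under assumed simplifications — the cost function here only penalizes pairwise overlap, while the production optimizer also scores clearances and zoning:

```python
import math
import random

def total_overlap(items):
    """Sum of pairwise overlap areas of (x, y, w, d) footprints; 0 = valid."""
    total = 0.0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            (ax, ay, aw, ad), (bx, by, bw, bd) = items[i], items[j]
            ox = min(ax + aw, bx + bw) - max(ax, bx)
            oy = min(ay + ad, by + bd) - max(ay, by)
            if ox > 0 and oy > 0:
                total += ox * oy
    return total

def anneal(items, steps=20000, temp=1000.0, cooling=0.999, rng=None):
    rng = rng or random.Random(42)        # seeded: deterministic mode
    cur = [list(it) for it in items]
    best, best_cost = cur, total_overlap(cur)
    cur_cost = best_cost
    for _ in range(steps):
        if best_cost == 0:
            break                          # scene is collision-free
        cand = [list(it) for it in cur]
        k = rng.randrange(len(cand))
        cand[k][0] += rng.uniform(-100, 100)   # nudge x by up to 100mm
        cand[k][1] += rng.uniform(-100, 100)   # nudge y by up to 100mm
        c = total_overlap(cand)
        # Always accept improvements; worse moves with temperature-scaled odds
        if c < cur_cost or rng.random() < math.exp((cur_cost - c) / temp):
            cur, cur_cost = cand, c
            if c < best_cost:
                best, best_cost = cand, c
        temp *= cooling
    return best, best_cost

scene = [[0, 0, 1000, 1000], [500, 500, 1000, 1000]]  # two overlapping pieces
fixed, cost = anneal(scene)
assert cost < total_overlap(scene)  # overlap reduced; 0 means fully resolved
```

The genetic-algorithm path works on the same cost function but evolves a population of whole layouts, which is why it costs seconds rather than milliseconds.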

Six features worth noting

1. Three request paths — minimum LLM calls

Every request is classified into one of three processing paths, each with a bounded budget of LLM calls (at most two per request).

This isn’t just token economics. It means predictable cost and response time per request type — critical for production pricing models.

2. Real catalog, real SKUs

No invented items. The engine matches furniture from a live database, filtered by style, dimensions, and budget. If the selected sofa doesn’t fit the room, it falls back to a smaller piece from the same collection — it doesn’t just disappear.
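The fallback rule can be sketched as a filtered search over the catalog. The rows and field names below are illustrative stand-ins, not the real database schema:

```python
# Sketch of the fallback described above: if the preferred piece doesn't
# fit, try smaller items from the same collection before giving up.
catalog = [
    {"sku": "SOFA-L-01", "collection": "nord", "width_mm": 2400, "price": 1200},
    {"sku": "SOFA-M-01", "collection": "nord", "width_mm": 1900, "price": 950},
    {"sku": "SOFA-S-01", "collection": "nord", "width_mm": 1600, "price": 780},
]

def pick_sofa(max_width_mm, budget, preferred_sku="SOFA-L-01"):
    preferred = next(i for i in catalog if i["sku"] == preferred_sku)
    if preferred["width_mm"] <= max_width_mm and preferred["price"] <= budget:
        return preferred
    # Fallback: smaller pieces from the same collection, largest first
    candidates = sorted(
        (i for i in catalog
         if i["collection"] == preferred["collection"]
         and i["width_mm"] <= max_width_mm and i["price"] <= budget),
        key=lambda i: -i["width_mm"],
    )
    return candidates[0] if candidates else None

assert pick_sofa(2000, 1000)["sku"] == "SOFA-M-01"  # large sofa won't fit
```

Sorting fallbacks largest-first keeps the substitution as close to the original design intent as the room allows.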

3. Soft validation

There are two error classes: blockers (furniture outside walls, door obstructed) that prevent export, and warnings (slightly tight clearance near a nightstand) that are logged but don’t halt the process. Users get a result with notes — not a blank screen with an error.
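The two-tier result can be modeled as a list of issues where only blockers gate export. A minimal sketch (issue codes are illustrative):

```python
from dataclasses import dataclass

@dataclass
class Issue:
    code: str
    message: str
    blocking: bool  # blockers prevent export; warnings are only surfaced

def can_export(issues):
    """Warnings are logged and shown to the user; only blockers halt export."""
    return not any(i.blocking for i in issues)

issues = [
    Issue("TIGHT_CLEARANCE", "520mm next to nightstand (600mm target)", False),
]
assert can_export(issues)                     # warning only: export allowed
issues.append(Issue("DOOR_BLOCKED", "wardrobe obstructs door swing", True))
assert not can_export(issues)                 # blocker: export halted
```

The user-facing payload carries the full issue list either way, which is what turns "error" into "result with notes."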

4. Async render pipeline

Scene generation is fast. Photorealistic rendering is slow (100+ seconds). These are decoupled: the API responds in ~23 seconds with a preview, while the full render runs in the background. By the time a user finishes reviewing the floor plan, the image is nearly ready.
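The decoupling amounts to a queue between the request handler and a render worker. A toy sketch using a thread and an in-memory queue (timings are faked; a production system would use a durable job queue):

```python
import queue
import threading
import time

render_jobs = queue.Queue()
renders = {}

def render_worker():
    while True:
        scene_id = render_jobs.get()
        time.sleep(0.05)                   # stand-in for the 100s+ render
        renders[scene_id] = f"render-{scene_id}.png"
        render_jobs.task_done()

threading.Thread(target=render_worker, daemon=True).start()

def handle_request(scene_id):
    preview = f"floorplan-{scene_id}.svg"  # fast path (~23s in production)
    render_jobs.put(scene_id)              # slow path runs in the background
    return {"scene": scene_id, "preview": preview, "render": "pending"}

resp = handle_request("abc")
assert resp["render"] == "pending"         # API has already responded
render_jobs.join()                         # later, the render completes
assert renders["abc"] == "render-abc.png"
```

The client polls (or is notified) for the finished render, so the slow path never sits on the request's critical path.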

5. Multi-provider LLM router

OpenAI, Anthropic, and OpenRouter (100+ models) sit behind a single interface. Routing is based on priority, cost, and latency. Automatic failover on provider errors. From the user’s perspective, provider outages are invisible.
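Failover over a priority-ordered provider list can be sketched in a few lines. The provider callables below are stubs, and the real router also weighs cost and latency when ordering them:

```python
class ProviderError(Exception):
    pass

def flaky_provider(prompt):
    raise ProviderError("rate limited")       # simulated outage

def stable_provider(prompt):
    return f"ok: {prompt}"

# Priority order; names are illustrative
PROVIDERS = [("openai", flaky_provider), ("anthropic", stable_provider)]

def complete(prompt):
    errors = []
    for name, call in PROVIDERS:              # try providers in priority order
        try:
            return name, call(prompt)
        except ProviderError as e:
            errors.append((name, str(e)))     # record and fail over
    raise RuntimeError(f"all providers failed: {errors}")

name, text = complete("hello")
assert name == "anthropic" and text == "ok: hello"
```

Because callers only see the single `complete` interface, a provider outage shows up as slightly higher latency, not an error.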

6. Deterministic reproducibility

A dedicated deterministic mode ensures the same input always produces the same output. Essential for A/B testing, debugging, and client demos.
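One common way to get this property — assumed here, not confirmed as RoomIQ's exact mechanism — is to derive every stochastic stage's RNG from a hash of the request, so identical input yields identical random draws:

```python
import hashlib
import random

def rng_for(request_text: str, seed: int = 0) -> random.Random:
    """Derive a seeded RNG from the request so runs are reproducible."""
    digest = hashlib.sha256(f"{seed}:{request_text}".encode()).hexdigest()
    return random.Random(int(digest[:16], 16))

a = [rng_for("cozy bedroom").random() for _ in range(3)]
b = [rng_for("cozy bedroom").random() for _ in range(3)]
assert a == b                       # same input: same layout decisions
assert a != [rng_for("loft studio").random() for _ in range(3)]
```

Passing such an RNG into the optimizer (as in the annealing sketch earlier) is what makes A/B tests and demo runs repeatable.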

The pipeline

Text → Profile → Zones → Furniture → Physics → Optimizer → 3D Export → Render → Photo

End-to-end pipeline — each stage is independently configurable and observable.

Every stage has configurable rules in YAML, metrics in Prometheus, and errors in Sentry. Infrastructure runs on AWS (ECS Fargate, RDS, S3), provisioned with Terraform.

Metrics

  • API response (preview): ~23s. Full scene with floor plan and catalog matches.
  • Conflict resolution: 600ms – 15s. SA for simple scenes, GA for complex ones.
  • Placement accuracy: ~95%+. Valid scenes without manual correction, up from the 60% baseline.

Metric                                 Value
API response with preview              ~23 seconds
Simulated Annealing (most scenes)      ~600ms
Genetic Algorithm (complex scenes)     ~15 seconds
Photorealistic render                  100+ seconds (async)
Max LLM calls per request              ≤2
Pipeline uptime target                 99%+
Key takeaways
  • LLMs can reason about geometry. They cannot compute it. Separate these concerns explicitly.
  • Classify requests early. Knowing which path a request takes lets you control cost, speed, and predictability.
  • Soft validation over hard errors. A result with warnings is always better than a blank screen.
  • Decouple fast from slow. Separate the ~23s interactive response from the 100s+ render — users don’t wait for what runs in the background.
  • Reproducibility is infrastructure. Deterministic mode makes A/B tests, debugging, and demos reliable by design.

The bigger point

We didn’t build “another interior design chatbot.” We solved a specific engineering problem: how to make an LLM do what it’s bad at — geometry and physics — by using it only where it excels: intent understanding, style interpretation, and context reasoning.

The result is a system where neural networks and classical algorithms work as partners, not substitutes. That’s what production hybrid AI looks like.

Ready to build production AI systems?

We help teams ship AI that works in the real world. Let's discuss your project.
