AgrigateVision
At a glance
- Industry: Dairy agriculture
- Focus: Computer vision, edge inference, monitoring
- Goal: Early lameness detection with production‑grade reliability
Context
We built an early‑lameness detection system for dairy farms. On paper, this looks like a standard computer vision problem. In practice, it’s a physical system with messy inputs and delayed feedback.
What the environment really looked like:
- Cameras get bumped during cleaning.
- Lighting shifts by season and time of day.
- Lenses collect dust, steam, and condensation.
- Labels arrive weeks later from veterinary checks.
The model worked in the lab. The pipeline failed in the barn.
Challenge
Deploy CV where data drifts daily, ground truth is delayed, and connectivity is unreliable. The core risk was not a weak model; it was a brittle system that could not survive real‑world variability — a classic Applied AI systems problem.
Why “model‑first” broke
We learned quickly that accuracy in notebooks did not translate to consistent results in the field. Small errors early in the pipeline (missed detections, ID switches, occlusions) compounded downstream into unstable risk scores — a common production ML failure mode. The system needed to survive missing frames, partial visibility, and incomplete labels rather than assume perfect inputs.
What we built
A system‑first CV pipeline where every stage was designed to handle uncertainty:
- Multi‑stage pipeline for detection → tracking → gait feature extraction → risk scoring.
- Negotiation layer for low‑confidence frames (interpolation across recent history instead of hard drops).
- Temporal aggregation to stabilize decisions over time, not per‑frame (see the sketch after this list).
- Input health monitoring (brightness, sharpness, and occlusion signals) to detect “data decay” like dust or lens shifts.
- Asynchronous supervision with delayed ground truth, merging veterinary confirmations back into historical inference logs.
- Active learning loop to prioritize human review where uncertainty is highest.
- Offline‑first edge behavior with local buffering and delayed consistency when connectivity drops.
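To make the negotiation and aggregation ideas concrete, here is a minimal Python sketch. The class names, thresholds, and window sizes (`FrameNegotiator`, `TemporalRiskAggregator`, `CONF_THRESHOLD`, `WINDOW_FRAMES`) are illustrative assumptions rather than the production code; it assumes per‑frame pose keypoints with a confidence score and a per‑frame risk value in [0, 1].

```python
from collections import deque

import numpy as np

CONF_THRESHOLD = 0.5   # below this, a frame is treated as "low confidence" (illustrative)
WINDOW_FRAMES = 150    # e.g. ~5 s at 30 fps; decisions are made over this window


class FrameNegotiator:
    """Fill low-confidence frames from recent history instead of dropping them."""

    def __init__(self, history_len: int = 10):
        self.history = deque(maxlen=history_len)  # recently accepted keypoint arrays

    def negotiate(self, keypoints: np.ndarray, confidence: float):
        if confidence >= CONF_THRESHOLD:
            self.history.append(keypoints)
            return keypoints
        if self.history:
            # Low confidence: interpolate over recent accepted frames
            # (a simple mean here; the point is "borrow from history, don't hard-drop").
            return np.mean(np.stack(self.history), axis=0)
        return None  # no history yet: skip the frame rather than emit a guess


class TemporalRiskAggregator:
    """Turn noisy per-frame risk into a stable, windowed decision signal."""

    def __init__(self, window: int = WINDOW_FRAMES):
        self.scores = deque(maxlen=window)

    def update(self, frame_risk: float) -> float:
        self.scores.append(frame_risk)
        # Median is robust to single-frame spikes from occlusions or ID switches.
        return float(np.median(self.scores))
```

Borrowing from recent history and scoring over a window is what turns an occasional bad frame into noise rather than an alert.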
Pipeline flow (system view)
- Detect animals with conservative thresholds.
- Track identities across frames to avoid ID switches.
- Extract gait features and normalize for partial visibility.
- Aggregate over time to avoid noisy, frame‑level alerts.
- Score risk and feed the review queue (the sketch below wires these stages together).
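Building on the classes sketched above, one frame's path through these stages might look like the following. The callables, field names (`track_id`, `keypoints`, `visibility`), and `REVIEW_THRESHOLD` are hypothetical placeholders for the real detector, tracker, gait extractor, and risk model.

```python
from typing import Callable, Dict, List, Tuple

REVIEW_THRESHOLD = 0.7  # illustrative: windowed risk above this feeds the review queue


def process_frame(
    frame,
    detect: Callable,                 # conservative detector: frame -> detections
    track: Callable,                  # tracker: detections -> tracks with stable IDs
    extract_gait: Callable,           # (keypoints, visibility) -> gait feature vector
    score_risk: Callable,             # features -> per-frame risk in [0, 1]
    negotiators: Dict[int, "FrameNegotiator"],         # per-animal, from the sketch above
    aggregators: Dict[int, "TemporalRiskAggregator"],  # per-animal rolling windows
) -> List[Tuple[int, float]]:
    """Run one frame through detect -> track -> gait features -> aggregate -> score."""
    alerts = []
    for t in track(detect(frame)):
        # Uncertainty handling and aggregation follow the animal, not the frame.
        negotiator = negotiators.setdefault(t.track_id, FrameNegotiator())
        aggregator = aggregators.setdefault(t.track_id, TemporalRiskAggregator())

        keypoints = negotiator.negotiate(t.keypoints, t.confidence)
        if keypoints is None:
            continue                   # no usable history yet: skip, don't guess

        features = extract_gait(keypoints, t.visibility)  # normalized for partial visibility
        stable_risk = aggregator.update(score_risk(features))

        if stable_risk > REVIEW_THRESHOLD:
            alerts.append((t.track_id, stable_risk))      # feed the review queue
    return alerts
```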
Data & feedback loop
- Asynchronous supervision: delayed labels merged back into historical inference logs.
- Input health monitoring: brightness, sharpness, occlusion, and drift signals.
- Active learning: prioritize human review where uncertainty is highest (see the sketch after this list).
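A sketch of this loop, assuming inference events and veterinary confirmations land as timestamped tables (pandas here; `animal_id`, `risk_score`, and `lame` are illustrative column names): delayed labels are joined back onto the historical predictions they supervise, and whatever stays unlabeled is ranked for human review by uncertainty.

```python
import pandas as pd


def backfill_labels(inference_log: pd.DataFrame, vet_checks: pd.DataFrame) -> pd.DataFrame:
    """Attach delayed veterinary confirmations to the historical predictions they supervise.

    Both frames need a datetime `timestamp` column and an `animal_id` column;
    vet_checks carries the ground-truth `lame` flag that arrives weeks later.
    """
    inference_log = inference_log.sort_values("timestamp")
    vet_checks = vet_checks.sort_values("timestamp")
    # Match each prediction to the nearest vet check for the same animal,
    # within a tolerance window, rather than assuming labels arrive in order.
    return pd.merge_asof(
        inference_log,
        vet_checks[["animal_id", "timestamp", "lame"]],
        on="timestamp",
        by="animal_id",
        tolerance=pd.Timedelta(days=14),
        direction="nearest",
    )


def review_queue(labeled_log: pd.DataFrame, top_k: int = 50) -> pd.DataFrame:
    """Rank still-unlabeled predictions for human review, most uncertain first."""
    unlabeled = labeled_log[labeled_log["lame"].isna()].copy()
    # For a binary risk score, uncertainty peaks near 0.5.
    unlabeled["uncertainty"] = 1.0 - (unlabeled["risk_score"] - 0.5).abs() * 2.0
    return unlabeled.sort_values("uncertainty", ascending=False).head(top_k)
```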
Failure modes and mitigations
| Failure mode | Mitigation |
|---|---|
| Camera moved or dirty | Input health alerts (sketched below the table) + maintenance workflow |
| Occlusions & ID switches | Tracker confidence + temporal smoothing |
| Label latency (weeks) | Delayed‑label ingestion + backfilled training |
| Connectivity drop | Offline buffer + delayed consistency |
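As an illustration of the first row, a minimal per‑frame input‑health check (OpenCV + NumPy). The metrics and thresholds are stand‑ins for the production signals: mean brightness, Laplacian‑variance sharpness, and a crude occlusion proxy based on how much of the frame is near‑black.

```python
import cv2
import numpy as np


def input_health(frame_bgr: np.ndarray) -> dict:
    """Cheap per-frame health signals; alert when a camera drifts out of spec."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)

    brightness = float(gray.mean())                            # lighting shifts by season and time of day
    sharpness = float(cv2.Laplacian(gray, cv2.CV_64F).var())   # dust, steam, condensation blur the lens
    occluded_frac = float((gray < 10).mean())                  # near-black pixels: blocked or moved camera

    return {
        "brightness": brightness,
        "sharpness": sharpness,
        "occluded_frac": occluded_frac,
        # Illustrative thresholds; in practice these would be tracked per camera over time.
        "needs_maintenance": sharpness < 50.0 or occluded_frac > 0.3,
    }
```

Tracking these signals per camera over time is what surfaces slow data decay (dust, drift, lens shifts) before it silently degrades the model.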
Trade‑offs we made
- Robustness over peak accuracy. A slightly less accurate model that fails predictably beats a brittle SOTA model in production.
- System observability before optimization. We invested early in input health metrics and pipeline logging to avoid silent drift.
- Latency vs reliability. We optimized for stable, consistent signals rather than chasing single‑frame real‑time precision.
Results
- 20–40% reduction in manual review workload (typical range)
- 100–300 ms edge inference latency for real‑time decisions
- 95–99% uptime for edge pipeline stability
Stack
- Edge inference
- Tracking + gait feature extraction
- Active learning workflow
- Monitoring dashboards for pipeline health
Takeaways
- Applied CV in the wild is a systems problem, not a model problem.
- Models degrade; pipelines survive when uncertainty is designed in.
- The winning architecture accepted messy inputs and made failures observable.