Architecture · decision brief · RFC #403

BenzAuto — Scheduler & bay-management engine: the call

The written decision behind the two diagrams. Read this next to the current architecture (how booking works today and where it fails) and the four candidate engines (A–D, the shared policy config, the axes). Issue #403. Pilot floor-ops source of truth: KL Chan.

1 · Evidence from staging (org 0001) — grounds the whole decision

Four facts pulled from the live database. They rule out "just add more capacity" and point straight at engine logic as the failure.

2 · Current engine gaps — what #403 must fix

The whole "do we have a slot?" decision lives in one function (packages/db/src/services/slot-proposal.ts). Here is everything it gets wrong.

3 · The four candidate engines

Same problem (bays × time × job length, constrained by hours and policy), four shapes of answer. Full flow sketches live in the options diagram.

A · Patch the 30-min slot grid

Keep today's forward-walking loop, but collect all fitting slots (not just the first 3), add a time-of-day filter, apply reserve/caps as post-filters, and match bay_type.
Effort: S
Pros: smallest change; closes the live "afternoon" / "next week" failures on today's engine.
Cons: reserve/caps are bolt-ons; still can't model variable duration cleanly.
Ceiling: still a rigid grid.
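A minimal sketch of what option A could look like, assuming times are minutes since midnight within one working day. Every type and function name here (GridBay, collectSlots, applyReserve) is hypothetical; the real code in slot-proposal.ts is not shown in this brief. The reserve rule is a simplified reading of keepBaysOpen: a start time is only offered if more matching bays are free than the reserve holds back.

```typescript
// Hypothetical shapes for illustration only; not the real slot-proposal.ts types.
// All times are minutes since midnight within one working day.
type Booking = { start: number; end: number };
type GridBay = { id: string; bayType: string; booked: Booking[] };
type GridSlot = { bayId: string; start: number; end: number };
type TimeOfDay = { from: number; to: number };

const GRID_MINUTES = 30;

// Half-open overlap test: does [start, end) collide with an existing booking?
function isBayFree(bay: GridBay, start: number, end: number): boolean {
  for (const b of bay.booked) {
    if (start < b.end && b.start < end) return false;
  }
  return true;
}

// Walk the 30-min grid forward and keep EVERY fitting slot,
// instead of stopping after the first three.
function collectSlots(
  bays: GridBay[],
  opensAt: number,
  closesAt: number,
  durationMinutes: number,
  requiredBayType: string,
  timeOfDay?: TimeOfDay, // optional filter, e.g. "afternoon"
): GridSlot[] {
  const slots: GridSlot[] = [];
  for (let start = opensAt; start + durationMinutes <= closesAt; start += GRID_MINUTES) {
    if (timeOfDay && (start < timeOfDay.from || start >= timeOfDay.to)) continue;
    for (const bay of bays) {
      if (bay.bayType !== requiredBayType) continue; // match bay_type
      if (isBayFree(bay, start, start + durationMinutes)) {
        slots.push({ bayId: bay.id, start, end: start + durationMinutes });
      }
    }
  }
  return slots;
}

// Reserve as a post-filter (assumed semantics): only offer a start time
// when more than keepBaysOpen matching bays are free at that time.
function applyReserve(slots: GridSlot[], keepBaysOpen: number): GridSlot[] {
  return slots.filter((s) => {
    let freeAtStart = 0;
    for (const other of slots) {
      if (other.start === s.start) freeAtStart++;
    }
    return freeAtStart > keepBaysOpen;
  });
}

// Usage: a 60-min lift job, afternoon only (12:00-17:00), shop open 08:00-17:00.
const demoBays: GridBay[] = [
  { id: "lift-1", bayType: "lift", booked: [{ start: 780, end: 840 }] }, // busy 13:00-14:00
  { id: "lift-2", bayType: "lift", booked: [] },
  { id: "align-1", bayType: "alignment", booked: [] },
];
const afternoon: TimeOfDay = { from: 720, to: 1020 };
const demoSlots = collectSlots(demoBays, 480, 1020, 60, "lift", afternoon);
const demoAfterReserve = applyReserve(demoSlots, 1);
```

The post-filter shape is what makes reserve/caps "bolt-ons" here: the walker knows nothing about policy, so every rule has to re-derive context from the slot list.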

B · Interval capacity planner (rule-driven) · recommended

Model each bay as a timeline of free intervals. Place a job of duration[min,max] + buffer into an equipment-matched bay, then run a per-tenant policy layer (reserve-bays, reserve-hours, caps). Emit slots or windows for staff to confirm.
Effort: M
Pros: real per-bay scheduling; variable duration + buffer + equipment; clean policy layer; explainable; stays in Bun/TypeScript.
Cons: more code than A; we own the interval-fitting logic.
The recommended middle ground.
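A rough sketch of the interval-fitting idea behind option B, under stated assumptions: times are minutes since midnight, the planner reserves the worst case (max duration + buffer), and policy rules are plain functions that return a rejection reason. All names (PlannerBay, fitJob, applyRules, dailyCapRule) are hypothetical and not taken from the codebase.

```typescript
// Hypothetical types for illustration; minutes since midnight, half-open intervals.
type FreeInterval = { start: number; end: number };
type PlannerBay = { id: string; equipment: string[]; free: FreeInterval[] };
type PlannerJob = { minMinutes: number; maxMinutes: number; bufferMinutes: number; needs: string[] };
type WindowOffer = { bayId: string; earliestStart: number; latestStart: number };

// A rule returns null to allow an offer, or a human-readable reason to reject it.
type PolicyRule = (offer: WindowOffer) => string | null;
type RuleOutcome = { kept: WindowOffer[]; rejected: Array<{ offer: WindowOffer; reason: string }> };

function hasEquipment(bay: PlannerBay, needs: string[]): boolean {
  for (const n of needs) {
    if (bay.equipment.indexOf(n) === -1) return false;
  }
  return true;
}

// Fit the job into every free interval that can hold max duration + buffer.
// Each hit becomes a start window [earliestStart, latestStart].
function fitJob(bays: PlannerBay[], job: PlannerJob): WindowOffer[] {
  const needed = job.maxMinutes + job.bufferMinutes; // assumption: plan for the worst case
  const offers: WindowOffer[] = [];
  for (const bay of bays) {
    if (!hasEquipment(bay, job.needs)) continue;
    for (const gap of bay.free) {
      if (gap.end - gap.start >= needed) {
        offers.push({ bayId: bay.id, earliestStart: gap.start, latestStart: gap.end - needed });
      }
    }
  }
  return offers;
}

// Policy layer: run each rule in order and record why an offer was dropped.
function applyRules(offers: WindowOffer[], rules: PolicyRule[]): RuleOutcome {
  const kept: WindowOffer[] = [];
  const rejected: Array<{ offer: WindowOffer; reason: string }> = [];
  for (const offer of offers) {
    let reason: string | null = null;
    for (const rule of rules) {
      reason = rule(offer);
      if (reason !== null) break;
    }
    if (reason === null) kept.push(offer);
    else rejected.push({ offer, reason });
  }
  return { kept, rejected };
}

// Example rule (assumed semantics): a per-day cap such as caps.heavyPerDay.
function dailyCapRule(alreadyBooked: number, cap: number): PolicyRule {
  return () => (alreadyBooked >= cap ? "daily cap of " + cap + " reached" : null);
}

// Usage: a 60-90 min job with a 15-min buffer that needs a lift.
const plannerBays: PlannerBay[] = [
  { id: "bay-1", equipment: ["lift"], free: [{ start: 480, end: 600 }, { start: 660, end: 1020 }] },
  { id: "bay-2", equipment: [], free: [{ start: 480, end: 1020 }] },
];
const liftJob: PlannerJob = { minMinutes: 60, maxMinutes: 90, bufferMinutes: 15, needs: ["lift"] };
const liftOffers = fitJob(plannerBays, liftJob);
const capReached = applyRules(liftOffers, [dailyCapRule(2, 2)]);
```

Carrying the rejection reason alongside each dropped offer is one way to get the "explainable" property: a "why no slot" answer can be read straight off the rejected list.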

C · Constraint solver (OR-Tools CP-SAT)

Jobs = time intervals, bays = resources, policy = constraints; a solver finds a feasible/optimal assignment. Generalises to technicians and skills. Likely a separate Python service.
Effort: L
Pros: most powerful, future-proof; provably optimal.
Cons: separate runtime; heavy ops; added latency; hard to explain "why no slot" to a customer.
Biggest jump; hold in reserve.

D · Drop-off / day-bucket capacity

No exact start times. Sell capacity per day or per AM/PM by job class (light / standard / heavy), capped, with headroom held back for walk-ins. The Tekmetric / Shopmonkey model.
Effort: S–M
Pros: dead-simple "leave the car" UX; robust to estimate error; easy caps + walk-in headroom.
Cons: no exact start time; wrong for customers who want "2pm sharp".
Best fit for drop-off shops.
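A small sketch of the day-bucket model, assuming each AM/PM bucket carries a per-class capacity plus a walk-in headroom that is held back from the bucket's total. The names (CapacityBucket, remainingFor, bookInto) and the way headroom is subtracted are assumptions for illustration, not a description of how Tekmetric or Shopmonkey implement it.

```typescript
// Hypothetical types for illustration.
type JobClass = "light" | "standard" | "heavy";
type DayHalf = "AM" | "PM";
type CapacityBucket = {
  date: string; // e.g. "2025-01-15"
  half: DayHalf;
  capacity: Record<JobClass, number>; // max bookable jobs per class
  booked: Record<JobClass, number>;
  walkInHeadroom: number; // jobs held back from the bucket total for walk-ins
};

function totalOf(counts: Record<JobClass, number>): number {
  return counts.light + counts.standard + counts.heavy;
}

// Sellable capacity for a class is limited both by the class cap and by
// what is left in the bucket once the walk-in headroom is set aside.
function remainingFor(bucket: CapacityBucket, jobClass: JobClass): number {
  const classRemaining = bucket.capacity[jobClass] - bucket.booked[jobClass];
  const bucketRemaining = totalOf(bucket.capacity) - bucket.walkInHeadroom - totalOf(bucket.booked);
  return Math.max(0, Math.min(classRemaining, bucketRemaining));
}

// Book one job into the bucket; returns false when the bucket is sold out for that class.
function bookInto(bucket: CapacityBucket, jobClass: JobClass): boolean {
  if (remainingFor(bucket, jobClass) <= 0) return false;
  bucket.booked[jobClass] += 1;
  return true;
}

// Usage: 8 jobs of total capacity, 2 held back for walk-ins, so 6 are sellable.
const morningBucket: CapacityBucket = {
  date: "2025-01-15",
  half: "AM",
  capacity: { light: 4, standard: 3, heavy: 1 },
  booked: { light: 0, standard: 0, heavy: 0 },
  walkInHeadroom: 2,
};
```

There is no start time anywhere in this model, which is why it tolerates estimate error: a job that overruns only eats into the bucket, not into the next customer's slot.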

Option | Model | Booking UX | Effort | Best for
A | Patched fixed 30-min grid | Exact slots, spread + time-of-day filter | S | Shipping a fix now on today's engine
B | Interval capacity planner (rules) | Offer windows / slots, staff-confirm | M | Realistic per-bay scheduling, in our stack
C | CP-SAT constraint solver | Optimal slots / windows | L | Long term: technicians, skills, optimisation
D | Day / AM-PM capacity buckets | "Leave the car", no exact time | S–M | Drop-off shops

4 · Per-tenant policy schema — the real "bay management"

Every option reads the same per-org configuration. A treats reserve/caps as post-filters; B/C make them first-class; D leans on caps + reserve. One shape keeps workshops portable across whichever engine wins.

bayManagement: {
  bays:     [ { name, bayType/equipment[], active } ]          # bay_type EXISTS today
  services: # per service → { class: from service_type, duration:{min,max}, buffer, multiDay? }
  reserve:  { keepBaysOpen: 1, reserveHoursPct: 0, releaseAfter: "15:00" }  # walk-in protection
  caps:     { heavyPerDay, majorPerDay }                       # max_per_day exists, unused
  bookingMode: slots | windows | dropoff
  requireBayEquipment: true                                       # match job → bay_type
}
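One way the shape above might be typed in TypeScript, with a small sanity check. This is a sketch under assumptions: durations and buffers are taken to be minutes, "bayType/equipment[]" is modelled as two separate fields, class is left as a free string because the service_type values are not listed here, and validatePolicy is a hypothetical helper. The example values are illustrative, not pilot data.

```typescript
// Hypothetical typing of the bayManagement config; field names follow the sketch above.
type BookingMode = "slots" | "windows" | "dropoff";

interface BayPolicy {
  name: string;
  bayType: string; // assumption: one primary type per bay
  equipment: string[]; // assumption: extra equipment tags
  active: boolean;
}

interface ServicePolicy {
  class: string; // derived from service_type
  duration: { min: number; max: number }; // assumed minutes
  buffer: number; // assumed minutes
  multiDay?: boolean;
}

interface BayManagementPolicy {
  bays: BayPolicy[];
  services: { [serviceKey: string]: ServicePolicy };
  reserve: { keepBaysOpen: number; reserveHoursPct: number; releaseAfter: string }; // "HH:MM"
  caps: { heavyPerDay: number; majorPerDay: number };
  bookingMode: BookingMode;
  requireBayEquipment: boolean;
}

// Returns a list of problems; an empty list means the config passed these basic checks.
function validatePolicy(policy: BayManagementPolicy): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(policy.services)) {
    const d = policy.services[key].duration;
    if (d.min <= 0 || d.min > d.max) {
      problems.push("service " + key + ": duration.min must be > 0 and <= duration.max");
    }
  }
  const pct = policy.reserve.reserveHoursPct;
  if (pct < 0 || pct > 100) problems.push("reserve.reserveHoursPct must be between 0 and 100");
  if (!/^([01]\d|2[0-3]):[0-5]\d$/.test(policy.reserve.releaseAfter)) {
    problems.push("reserve.releaseAfter must be HH:MM");
  }
  return problems;
}

// Illustrative config only.
const examplePolicy: BayManagementPolicy = {
  bays: [
    { name: "Bay 1", bayType: "lift", equipment: ["lift"], active: true },
    { name: "Bay 2", bayType: "flat", equipment: [], active: true },
  ],
  services: {
    "oil-change": { class: "light", duration: { min: 30, max: 45 }, buffer: 10 },
  },
  reserve: { keepBaysOpen: 1, reserveHoursPct: 0, releaseAfter: "15:00" },
  caps: { heavyPerDay: 2, majorPerDay: 1 },
  bookingMode: "windows",
  requireBayEquipment: true,
};
```

Keeping validation next to the type is one way to protect the "one shape for every engine" goal: whichever option wins can assume it has been handed a config that passed the same checks.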

5 · The decision, on three axes

The four options aren't a ranking — they're points on three independent axes. Pick a position on each and the option falls out.

Exact start time ↔ Day / AM-PM bucket: A and C anchor on exact times · B offers windows · D drops start times entirely
Rule-based ↔ Constraint solver: A, B, D are rule-driven · C is the only constraint solver
Bays only ↔ Schedule people too: A, B, D schedule bays × time · only C adds technicians & skills

6 · Recommendation going in (to be grilled, not assumed)

7 · Open questions to grill (some answered by the data)