The Gas-Station Date Night Experiment

We asked four AIs to plan romance at pump price. It got weird, then useful.

The Setup

We gave four well-known models a simple challenge: plan a 90-minute date night that starts and ends at the same gas station. Budget, forty dollars, tax included. Window, seven to nine in the evening. Safety, stay in well-lit areas, avoid sketchy crossings, keep it practical. Bonus points for creativity and a backup plan.

Why a gas station? Because convenience retail is a data stress test in the wild. Hours shift by day, hot cases close for cleaning, ATMs hiccup, promos expire, and car washes take mysterious breaks right when you want rainbow foam. If an AI can wrangle that, it can help shoppers find the right store at the right time. And if it cannot, the misses are a free lesson for operators on what facts to publish.

The contenders:

  • Claude with the “Starlight Station Date.” A cinematic itinerary with a car-wash finale.

  • Grok with “Gas Stop Snack Stroll.” Confident, thrifty, and very sure of itself.

  • Gemini with, well, a detour into municipal PDFs. Charming, in a DMV kind of way.

  • ChatGPT with “Gas Station Gourmet & Chill.” A long, careful plan with a spare tire in the trunk.

Let’s rank the romance, roast the weak spots, and pull out a few operator to-dos.

Round 1: Budget Honesty

Claude comes in hot with hot dogs, Slurpees, Takis, donuts, two pints of Ben & Jerry’s, waters, and a paid car wash. The cart tallies to $41.12, then Claude trims one water to land at $38.83. A gentle self-audit is rare for bots, and it is appreciated. Verdict, theatrical spender who will put the rose on your dashboard, then look for a promo code.

Grok does the opposite. Two chili dogs, two waters, one pint of ice cream, and plastic spoons, then declares a triumphant $16.16 all-in. That leaves $23.84 for “gas or tips.” Frugal, yes. Realistic, maybe. Also a little suspicious when every amenity sparkles and prices all align perfectly in a world with taxes and shrinkflation.

Gemini never really gets to the snack aisle. It finds rules, handbooks, and environmental reports. That is free, which is on budget, but not on date.

ChatGPT aims for a middle lane. Pizza to share, two fountain drinks, chips, two donuts, two scratchers, tax calculated, and a final estimate around $26.25. Conservative and safe with a decent buffer for price variance.

Scorecard

  • Gold star for honesty: Claude

  • Best buffer for reality: ChatGPT

  • Thrifty to a fault: Grok

  • Paperwork picnic: Gemini

Operator takeaway: publish item price bands and mark each item as taxable or non-taxable. If a model can see “pizza slice $2.99, sales tax applies, scratchers tax-exempt,” the totals stop wobbling.
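
To make that concrete, here is a minimal sketch of how a planning tool might total a cart once taxability is published per item. The pizza slice price and the tax-exempt scratcher come from the example above, and the two-dollar scratcher matches the tickets in ChatGPT's plan. The 8% tax rate, the field names, and the `cart_total` helper are assumptions for illustration, not any real store's data or API.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical price sheet with one taxability flag per item.
PRICE_SHEET = {
    "pizza_slice": {"price": Decimal("2.99"), "taxable": True},
    "scratcher": {"price": Decimal("2.00"), "taxable": False},
}

# Assumed sales tax rate; real rates vary by location.
ASSUMED_TAX_RATE = Decimal("0.08")


def cart_total(cart, sheet=PRICE_SHEET, tax_rate=ASSUMED_TAX_RATE):
    """Return the cart total, applying tax only to items flagged taxable."""
    taxable = Decimal("0")
    exempt = Decimal("0")
    for name, qty in cart.items():
        line = sheet[name]["price"] * qty
        if sheet[name]["taxable"]:
            taxable += line
        else:
            exempt += line
    # Round the tax to whole cents, half up.
    tax = (taxable * tax_rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return taxable + exempt + tax


total = cart_total({"pizza_slice": 2, "scratcher": 2})
print(total)  # 10.46
```

Using `Decimal` rather than floats keeps the cents exact, which matters when the whole point is a budget that does not wobble.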

Round 2: Romance Factor

Claude sets a tone. A night drive, a park picnic, shared ice cream pints, then a “car-wash tunnel with rainbow foam.” It reads like a music video with a very practical exit time. Cute and almost cinematic.

Grok is intentional and low-key. A short hand-holding lap around the pumps, a tidy snack order, a stroll at Garden Grove Park, then back for a pint to share. It is calm, low drama, and very date-night compatible if you both like routine.

Gemini brings documentation. Nothing kills the mood like reading about a zoning ordinance while holding a Slim Jim. We respect the research muscle, just not during dessert.

ChatGPT leans into “make a parking-lot picnic feel like a choice” with mini games. Taste test the pizza, play micro questions, scratch a couple of two-dollar tickets, and sit at the bus-stop bench if it feels safe. Thoughtful, cozy, and realistic.

Scorecard

  • Most rom-com energy: Claude

  • Coziest vibe for actual humans: ChatGPT

  • Practically sweet: Grok

  • Romantic only if your love language is PDF pagination: Gemini

Operator takeaway: publish amenities with counts, even small ones. “Seating, two benches by entry” or “no seating, car-friendly lot.” A little clarity helps an AI pitch the vibe you actually provide.

Round 3: Safety Sense

Claude checks lighting, suggests a short drive to a nearby park, and flags restroom access as uncertain. Sensible. The only eyebrow raise is the freeway-adjacent “scenic route” that showed up in its earlier plans of this kind, but the submitted version keeps crossings minimal.

Grok is very sure of itself. It cites outdoor tables, indoor stools, keyless restrooms, staff counts, and app-verified cleaning windows, all sourced from reviews and “stickers in photos.” That confidence feels great until a bench vanishes or a restroom gets a lock.

Gemini is the safe one by accident, because nothing in a transportation agenda requires you to cross the lot at all. You are too busy reading about access plans.

ChatGPT treats safety like a checklist. Lighting and sightlines, staff present, restrooms uncertain until you ask, bench is optional, car seating as default. Not fancy, just practical.

Scorecard

  • Clearest safety framing: ChatGPT

  • Safety with sparkle: Claude

  • Safety by certainty that might not exist: Grok

  • Safety by paralysis: Gemini

Operator takeaway: publish safety cues as small, factual flags. “Well-lit forecourt,” “attendant until 10 p.m.,” “cameras cover entry,” “restroom currently open, yes or no.” These are easy to post and hugely valuable to planning bots.

Round 4: Amenity Optimism

Claude dreams big and then caveats. Car wash next door, restrooms maybe, park lighting probably, ATMs who knows. It tries to keep it honest with “uncertain” labels and a phone number. That is fair, and it reads human.

Grok sees the world as it wishes to be. Four outdoor picnic tables, six indoor stools, a Chase ATM inside, a 24-hour touchless car wash bay, app discounts that scan every time. Beautiful, if true.

Gemini misses the store to admire the permit history. Again, not wrong, not helpful.

ChatGPT publishes a plain reality. No car wash on site, ATM listed but status unknown, bus-stop bench exists, contactless is likely, plan around the car. You get a sober match to what a typical convenience site offers at night.

Scorecard

  • Realistic restraint: ChatGPT

  • Charming exaggerations with disclaimers: Claude

  • A wish list written as facts: Grok

  • Amenities available in council minutes only: Gemini

Operator takeaway: expose binary amenity flags in one place, kept current. Restroom open yes, ATM in service yes, contactless yes, indoor seating count eight. Bots reuse simple fields far more reliably than paragraphs.
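
As a rough sketch, those flags could live in one small record that a bot reads before it pitches anything. The field names and the `seating_advice` helper below are hypothetical. The one design choice worth noting is that "unknown" is kept separate from "no," so a missing field never gets narrated as a fact.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Amenities:
    """Hypothetical amenity record; None means the store has not published it."""
    restroom_open: Optional[bool] = None
    atm_in_service: Optional[bool] = None
    contactless: Optional[bool] = None
    indoor_seats: Optional[int] = None


def seating_advice(a: Amenities) -> str:
    """Turn the seating field into a hedged planning note."""
    if a.indoor_seats is None:
        return "seating unknown, plan around the car"
    if a.indoor_seats > 0:
        return f"{a.indoor_seats} indoor seats listed"
    return "no seating, plan around the car"


published = Amenities(restroom_open=True, atm_in_service=True,
                      contactless=True, indoor_seats=8)
print(seating_advice(published))    # 8 indoor seats listed
print(seating_advice(Amenities()))  # seating unknown, plan around the car
```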

Round 5: Promo Magic

Claude references 7-Eleven app promos, reminds you to check, and does not depend on them to hit budget. Responsible.

Grok claims hot-dog discounts through a Chevron app with date certainty. It might exist, it might not, it might be a seasonal thing, it might be a different franchise policy. If every night is promo night, no night is promo night.

Gemini links to programs unrelated to snacks. It is still reading.

ChatGPT assumes zero promos and builds a plan that still works. If you find a deal in the app, great, treat it as a bonus.

Scorecard

  • Best promo posture for reality: ChatGPT

  • Promo-aware, not promo-dependent: Claude

  • Promo confidence without proof: Grok

  • Promo silence: Gemini

Operator takeaway: publish limited-time offer (LTO) windows with start and end dates, plus eligibility. “Two slices and a fountain, $4.99, valid Oct 1 to Nov 15, excluded during sanitation window.” That line alone prevents half the false hopes.
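
Here is one way a tool might check that kind of promo line before counting on it. This is a sketch under stated assumptions: the year, the sanitation window times, and the `Promo` class are all hypothetical, since the example above gives only the month and day range.

```python
from dataclasses import dataclass
from datetime import date, datetime, time


@dataclass(frozen=True)
class Promo:
    """Hypothetical promo record with a date range and a daily blackout."""
    label: str
    starts: date         # first valid day, inclusive
    ends: date           # last valid day, inclusive
    blackout_from: time  # sanitation window start, inclusive
    blackout_to: time    # sanitation window end, exclusive

    def valid_at(self, when: datetime) -> bool:
        in_dates = self.starts <= when.date() <= self.ends
        in_blackout = self.blackout_from <= when.time() < self.blackout_to
        return in_dates and not in_blackout


# The year and the blackout times are assumptions for illustration.
deal = Promo(
    label="Two slices and a fountain, $4.99",
    starts=date(2025, 10, 1),
    ends=date(2025, 11, 15),
    blackout_from=time(20, 0),
    blackout_to=time(20, 30),
)

print(deal.valid_at(datetime(2025, 10, 10, 19, 0)))   # True
print(deal.valid_at(datetime(2025, 10, 10, 20, 15)))  # False, sanitation window
print(deal.valid_at(datetime(2025, 11, 16, 19, 0)))   # False, promo expired
```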

Round 6: Itinerary Clarity

Claude gives a true minute-by-minute plan with drive times, snack lists, and a hard stop at 8:45. It flows and feels like a night out.

Grok is structured but occasionally reads like a highly confident tour guide. It assigns exact minutes to strolls and bench sits, then nails a clean return to the lot. Clear, if a bit robotic.

Gemini attempts an itinerary on a separate day in a different universe. The outline exists, the date does not.

ChatGPT uses a practical rhythm. Scout the store, make the main food run, car picnic with micro games, dessert and scratchers, restroom, wrap. It also builds a plan B at a nearby Shell with clear tradeoffs. Easy to follow, easy to adapt on the fly.

Scorecard

  • Best narrative flow: Claude

  • Best human-usable checklist: ChatGPT

  • Clean but overconfident timing: Grok

  • Agenda without a date: Gemini

Operator takeaway: make it easy for any tool to build a three-act plan from your data. Arrival checks, primary purchase, wind-down. That means surfacing hours by day, hot-case status, and a couple of safe seating cues.
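
For the hours piece, a minimal sketch of the arrival check might look like this. The weekday hours and the `covers_window` helper are made up for illustration; the seven-to-nine window is the one from the challenge. It assumes same-day open and close times, so a store that closes after midnight would need extra handling.

```python
from datetime import time

# Hypothetical per-day hours; a real feed would list all seven days.
HOURS_BY_DAY = {
    "fri": (time(6, 0), time(23, 0)),
    "sun": (time(7, 0), time(20, 0)),
}


def covers_window(day, start, end, hours=HOURS_BY_DAY):
    """True only if published hours fully cover the planned window."""
    entry = hours.get(day)
    if entry is None:
        # Unpublished hours count as "cannot confirm", not as open.
        return False
    opens, closes = entry
    return opens <= start and end <= closes


date_window = (time(19, 0), time(21, 0))  # seven to nine in the evening
print(covers_window("fri", *date_window))  # True
print(covers_window("sun", *date_window))  # False, closes at 8 p.m.
```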

The Silly Awards

  • “Rainbow Foam for the Win” award: Claude for the car-wash tunnel romance arc.

  • “Budget Zen” award: Grok, who spent less than a movie ticket and went home happy.

  • “Most Likely to Bring a Binder” award: Gemini, because someone has to keep the region compliant.

  • “Dad Who Packed Snacks and a Flashlight” award: ChatGPT, for the sensible buffer, the plan B, and the reminder to ask the clerk about restrooms.

What This Reveals About AI Planning Data

All four models did exactly what their inputs and the open web allowed.

  • When hours are published per day, the plans start on time. When hours float as “open 24/7” without exceptions, the plan drifts.

  • When menus show items and sanitation windows, hot food appears at realistic times. When menus are just marketing copy, the bot imagines.

  • When amenities are flagged as yes or no with counts, the seating and ATM calls stop being fan fiction.

  • When promos have clear dates and eligibility, budgets stop bouncing.

  • When safety cues exist as factual fields, the models steer toward well-lit, staff-present choices without guesswork.

In short, models will make a story from whatever they can find. Give them short, labeled, current facts, and they will reuse those lines again and again, which turns cute ideas into repeatable traffic.

Final Takeaway

Claude wrote the movie. Grok balanced the checkbook. Gemini filed it with the county. ChatGPT packed the cooler and grabbed napkins. The only reason any of this is funny is that structured facts were missing or buried. When you publish short, verifiable fields and keep them fresh, planning bots stop guessing. That means fewer awkward detours, more satisfied shoppers, and more nights where your store is the easy, confident choice.

Call it romance by inventory accuracy. Or just call it a win.

No more blind dates with bad info. Give AIs the right facts and create your CRSTBL account today.