Lisa Zulu
Posted on May 30

Why the Treasure Hunt Demo Broke Every Query Tool We Fed It

#webdev #programming #ai #machinelearning

The Problem We Were Actually Solving

We were not building a demo. We needed to let Veltrix operators run A/B experiments on synthetic user journeys without melting the underlying SQL warehouse. The real question was: how close could we push the warehouse to the AI inference layer before the planner started dropping predicates and the warehouse returned rows that made no sense for the user journey?

The warehouse in question was a Snowflake XL on AWS, billed by the second. Our synthetic user model generated 250k journeys per minute during peak. The AI layer had to annotate each journey with intent tags (shopping, support, fraud) within 200 ms to stay ahead of the next batch. That was the operating envelope, not the sales slide.

What We Tried First (And Why It Failed)

First cut: put the intent model in a sidecar container next to the Spark cluster that generated the journeys. We picked ONNX Runtime v1.14 with a DistilBERT fine-tuned on our own corpus because the latency slide said 30 ms.

Reality: ONNX packaged the tokenizer as a separate DLL. Tokenization alone took 85–110 ms on c6i.large instances, pushing the total inference time to 190 ms when the warehouse was cold and 280 ms when Snowflake decided to spike the warehouse cluster. The operator dashboards immediately showed orange pings; the business called it a red fire drill.

Worse, the tokenizer DLL leaked memory. After two hours on a 64-core cluster, each pod's RSS climbed to 2.4 GB, and the Kubernetes scheduler evicted five pods in a row. The warehouse downstream received duplicate rows with NULL intents, so every metric we exported was off by 7–12 %.

The Architecture Decision

We ripped out the sidecar entirely. Instead, the Spark jobs write raw event JSON to an S3 bucket every 60 seconds. A Lambda function (Python 3.12 runtime) picks up the bucket, tokenizes offline, and stores th
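
As a rough check on the sidecar numbers quoted earlier, the arithmetic behind the 200 ms envelope can be sketched in a few lines of Python. This is only an illustration built from the figures in the post (200 ms budget, 190 ms and 280 ms totals, 85–110 ms tokenization, 250k journeys per minute); the function names are hypothetical and not part of the original pipeline.

```python
# Back-of-the-envelope latency budget check using figures quoted in the post.
# All function names here are hypothetical, chosen for illustration only.

BUDGET_MS = 200                 # annotation deadline per journey batch
JOURNEYS_PER_MINUTE = 250_000   # peak synthetic journey rate


def headroom_ms(total_ms, budget_ms=BUDGET_MS):
    """Milliseconds left in the budget; negative means the deadline is missed."""
    return budget_ms - total_ms


def budget_share(part_ms, budget_ms=BUDGET_MS):
    """Fraction of the budget consumed by one stage (e.g. tokenization)."""
    return part_ms / budget_ms


def journeys_per_window(per_minute=JOURNEYS_PER_MINUTE, window_ms=BUDGET_MS):
    """Journeys generated during one budget window at the given rate."""
    return per_minute * window_ms / 60_000


if __name__ == "__main__":
    for label, total in [("cold warehouse", 190), ("warehouse spike", 280)]:
        print(f"{label}: headroom {headroom_ms(total)} ms")
    for tok in (85, 110):
        print(f"tokenization {tok} ms uses {budget_share(tok):.1%} of the budget")
    print(f"journeys arriving per 200 ms window: {journeys_per_window():.0f}")
```

Read this way, the cold case leaves only 10 ms of slack, the spike case overshoots by 80 ms, and tokenization alone eats roughly 43 to 55 percent of the budget before the model runs.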
Blizine Admin
📰Dev.to — dev.to