The Apparent Meaninglessness of AI Benchmarks, plus How to Explain AI Opportunities to Others

Raw Data with Rob Collie

Release Date: 12/16/2025

Every week brings a new AI benchmark. Higher scores. Bigger claims. Louder voices insisting this changes everything. And yet, when you put AI in front of a real business problem, none of that noise seems to help. In this episode, Rob and Justin dig into why AI benchmarks often feel strangely meaningless in practice and why that disconnect is the point. Benchmarks aren’t useless. They’re just answering a different question than the one most businesses are asking.

This isn’t just random conjecture either. Rob walks through what he’s learned building actual AI workflows and why a twenty percent improvement on a leaderboard rarely translates into anything you can feel on the job. They talk about why model choice usually isn’t the bottleneck, why swapping models should be easy if you’ve built things the right way, and why the most successful AI work rarely shows up as a flashy demo. Most of the value is happening quietly, off-screen, inside systems that look a lot more like normal software than artificial intelligence.

Rob and Justin also talk about why explaining AI is often harder than building it. The first demo people see tends to stick, even when it’s the wrong one. Consumer AI feels magical. Business AI face-plants unless it’s built with intent, structure, and real context. This episode gives leaders better language for that gap, without hype or panic. If you’re done chasing benchmarks and just want a way to think about AI that survives contact with reality, this episode’s for you.