Search Off the Record
In this episode of Search Off the Record, Martin and Gary turn a simple robots.txt question into a data‑driven deep dive using HTTP Archive, WebPageTest, custom JavaScript metrics, and BigQuery. They explore how millions of real robots.txt files are actually written in 2025–2026, which directives and user‑agents are most common, and what that means for modern crawling and AI bots. Aimed at beginner to mid‑level developers and SEOs, the episode explains how large‑scale web measurement works (HTTP Archive, Chrome UX Report, Web Almanac) and how to turn raw crawl data into actionable SEO...
In this episode of Search Off the Record, Gary and Martin dig into what “page size” and “page weight” actually mean for developers, users, and search engines. They discuss exploding web page sizes (the median mobile homepage hit 2.3 MB in the 2025 Web Almanac, up 3x from 2015), how page weight is defined, Googlebot's crawl limits, HTML bloat from structured data and images, and why size still hurts UX on slow connections despite faster networks. If you build or maintain websites, this conversation will help you rethink how much data your pages ship, where bloat really...
Developers often talk about Googlebot as if it were a single program you could just run as “googlebot.exe”, but that is not how Google’s crawling actually works. In this episode of Search Off the Record, Martin and Gary from the Search Relations team unpack how Google’s crawling infrastructure is really built and operated. They cover why “Googlebot” is a misnomer and how it relates to a central crawling software-as-a-service used by many Google products, how crawl behavior is controlled centrally to avoid overwhelming sites (throttling, handling 503s, and “don’t break the...
Martin and Gary unpack how HTML parsing really works, why the HTML standard is so lenient, and how messy markup can silently break key SEO signals like hreflang and rel=canonical. They revisit validators and cross‑browser hacks from the Netscape/IE days, and discuss whether semantic HTML and strict validity truly matter for search. You’ll also hear when link hints like preload, prefetch, and DNS prefetch help performance (and indirectly SEO), and where meta and link tags really belong. Resources: HTML Living Standard → https://html.spec.whatwg.org/ Episode transcript →...
In this episode of Search Off the Record, Martin and Gary from the Google Search Relations team tackle a deceptively simple question: do you still need a website in 2026? Starting from the recurring industry claim that “the web is dead,” they explore how the web has evolved through the rise of apps, AI chatbots, and social platforms, and why the answer almost always ends up being “it depends.” Tune in for an engaging discussion on how websites remain relevant and what it means for content creation and discovery. Episode transcript → Listen to more Search Off the Record → ...