nanochat by Karpathy - How to build your own ChatGPT for $100
Artificial Intelligence : Papers & Concepts
Release Date: 10/21/2025
“The best ChatGPT that $100 can buy.” That’s Andrej Karpathy’s positioning for nanochat, a compact, end-to-end stack that goes from tokenizer training to a ChatGPT-style web UI in a few thousand lines of Python (plus a tiny Rust tokenizer).
It’s meant to be read, hacked, and run, so that students, researchers, and tech enthusiasts can understand the entire pipeline needed to train a baby version of ChatGPT.
In this episode, we walk you through the nanochat repository.
Resources
- nanochat GitHub repo: https://github.com/karpathy/nanochat/
- AI Consulting & Product Development Services: https://bigvision.ai
- Start a career in computer vision & AI: https://opencv.org/university