
🧠 How Do LLMs Like ChatGPT Actually Work?
This interactive guide, based on Andrej Karpathy’s technical deep dive, walks you through the entire process: from downloading the internet to building a conversational assistant.
🌐 Data Collection — Billions of web pages are crawled, filtered, and cleaned down to ~44 TB of high-quality text (FineWeb).
🔤 Tokenization — Text is split into sub-word chunks (tokens). GPT-4 uses a vocabulary of roughly 100K tokens, built with the byte pair encoding (BPE) algorithm.
⚙️ Pre-training — A Transformer neural network learns to predict the next token, tuning billions of parameters over months of compute.
🤖 Base Model — The result is a text simulator: it completes sequences in a sophisticated way, but it is not yet an assistant.
🎓 Post-training (SFT + RLHF) — Humans create ideal conversations and rank responses. The model learns to imitate the best possible human labeler.
🌀 LLM Psychology — It hallucinates because its training examples almost always answered confidently, so it imitates that confident tone even when it does not know the answer. It has no persistent memory: every conversation starts fresh.
📚 RAG — For up-to-date data, relevant documents are retrieved and injected into context before generating the answer.
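To make the tokenization step above more concrete, here is a minimal sketch of the core BPE idea: start from raw bytes and repeatedly merge the most frequent adjacent pair into a new token. The function names (`most_frequent_pair`, `merge`, `train_bpe`) and the toy string are illustrative assumptions. Production tokenizers add pre-splitting rules and special tokens that are left out here.

```python
from collections import Counter

def most_frequent_pair(ids):
    """Return the most common adjacent pair of token ids, or None if there is none."""
    pairs = Counter(zip(ids, ids[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def merge(ids, pair, new_id):
    """Replace each left-to-right, non-overlapping occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Toy BPE training: begin with UTF-8 bytes (ids 0..255), then learn merges."""
    ids = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        pair = most_frequent_pair(ids)
        if pair is None:
            break
        new_id = 256 + step  # new tokens get ids above the byte range
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges

# Hypothetical toy input: 11 bytes compressed by three learned merges.
ids, merges = train_bpe("aaabdaaabac", 3)
print(len(ids), merges)
```

Each merge shortens the sequence and grows the vocabulary by one entry. Repeating this on a huge corpus is how a vocabulary on the order of 100K tokens can be built.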
💡 Explanation in a nutshell
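An LLM is essentially a system that learned to predict the next word (more precisely, the next token) by reading trillions of tokens of internet text. It was then “fine-tuned” by humans to behave like a helpful assistant. Every response it generates is like flipping a very well-informed, weighted coin: it assigns a probability to every possible next token and samples one, so likely tokens are chosen most often, and it repeats this one token at a time until the answer is complete.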
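The "informed coin flip" can be sketched in a few lines. The example below assumes a made-up four-word vocabulary and made-up scores (logits); a real model would produce such scores for its whole vocabulary at every step. The helper names `softmax` and `sample_next` and the `temperature` parameter are illustrative choices, not any particular library's API.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next(vocab, logits, temperature=1.0, rng=random):
    """Draw one token at random, weighted by its predicted probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical candidates for the next word after "The cat sat on the".
vocab = ["mat", "sofa", "roof", "moon"]
logits = [3.0, 2.0, 1.0, -1.0]  # made-up scores, not real model output

rng = random.Random(0)  # seeded so repeated runs draw the same samples
samples = [sample_next(vocab, logits, rng=rng) for _ in range(1000)]
print({word: samples.count(word) for word in vocab})
```

Because tokens are sampled rather than always taking the top choice, the same prompt can produce different answers. Lowering the temperature pushes the choice toward the single most likely token.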
More information at the link 👇
