Understand the "Brain" of the operation before we start performing surgery on it.
1. What is an LLM? (No Math)
An LLM (Large Language Model) is not a "knowledge base" or a "search engine." It is a Next-Token Prediction Engine.
The Mental Model: Imagine the world's best auto-complete. If you type "The capital of France is", the model doesn't "know" geography. It calculates that "Paris" is statistically the most likely next word.
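To make the "auto-complete" picture concrete, here is a minimal sketch that asks a small open model for its next-token probabilities. It uses the Hugging Face transformers library and GPT-2 purely as assumptions; any causal language model exposes the same kind of scores.

```python
# Minimal sketch: inspect next-token probabilities with a small causal LM.
# Assumptions: the `transformers` and `torch` packages are installed, and
# GPT-2 is used only because it is small -- any causal model works the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits      # a score for every vocabulary token, at every position

next_token_logits = logits[0, -1]        # scores for the *next* position only
probs = torch.softmax(next_token_logits, dim=-1)
top = torch.topk(probs, k=5)

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id)):>10}  {prob.item():.2%}")
# " Paris" should dominate the list -- not because the model "knows" geography,
# but because it is statistically the most likely continuation.
```

Run it and the model is doing nothing more than ranking likely continuations, which is exactly the mental model to keep for everything that follows.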
2. The 3 Pillars of Control
Tokens: The currency of LLMs. The model processes text in chunks (tokens), not words. Rough math: 1,000 tokens ≈ 750 words (see the counting sketch after this list).
Context Window: The "Short-Term Memory." It's the amount of text the model can see at once when answering you. If the conversation exceeds this limit, the model "forgets" the beginning.
Temperature: The "Creativity Knob."
Temp = 0.0: Precise, near-deterministic, factual (use for coding/data extraction).
Temp = 1.0: Creative, varied, diverse (use for brainstorming/poetry). A sketch comparing both settings follows this list.
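To see the first two pillars in action, here is a minimal sketch that counts tokens with the tiktoken library and trims old messages so a conversation still fits a context window. The cl100k_base encoding and the 8,000-token budget are assumptions; check the real limits of the model you actually use.

```python
# Minimal sketch: count tokens and trim a conversation to a context budget.
# Assumptions: the `tiktoken` package is installed; the encoding name and the
# 8,000-token budget are placeholders, not limits of any specific model.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
CONTEXT_BUDGET = 8_000  # hypothetical window size, in tokens

def count_tokens(text: str) -> int:
    return len(enc.encode(text))

def trim_to_budget(messages: list[str], budget: int = CONTEXT_BUDGET) -> list[str]:
    """Drop the oldest messages until the rest fit inside the window."""
    kept, used = [], 0
    for msg in reversed(messages):       # walk newest-first, keep what fits
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order

print(count_tokens("The capital of France is Paris."))  # a handful of tokens, not 7 "words"
```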
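And for the third pillar, a minimal sketch of the temperature knob using the OpenAI Python SDK. The gpt-4o-mini model name is an assumption, the call needs an OPENAI_API_KEY in your environment, and temperature 0.0 should be read as "near-deterministic" rather than perfectly repeatable.

```python
# Minimal sketch: the same prompt at two temperatures via the OpenAI SDK.
# Assumptions: `openai` v1+ is installed, OPENAI_API_KEY is set, and
# "gpt-4o-mini" is just an example model name.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str, temperature: float) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    return response.choices[0].message.content

# Temperature 0.0: run it twice and the answers should be (nearly) identical.
print(ask("List three facts about Paris.", temperature=0.0))

# Temperature 1.0: answers vary from run to run -- useful for brainstorming.
print(ask("Give me three startup name ideas.", temperature=1.0))
```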
3. Cloud vs. Local (Ollama)
Cloud (OpenAI/Gemini): Smarter models, but your data leaves your device and you pay per token.
Local (Ollama): Runs on your laptop. Private. Free. Works offline. Great for testing agents without burning money.
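As a starting point for the local route, here is a minimal sketch that talks to Ollama's REST API on its default port. It assumes the Ollama server is already running and that you have pulled a model (llama3 is used here purely as an example name).

```python
# Minimal sketch: one completion from a local model via Ollama's REST API.
# Assumptions: Ollama is running on the default port 11434 and the model has
# been pulled beforehand (e.g. `ollama pull llama3`).
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Explain tokens in one sentence.",
        "stream": False,                  # return a single JSON object instead of a stream
        "options": {"temperature": 0.0},  # same knobs as the cloud APIs
    },
    timeout=120,
)
print(response.json()["response"])
```

Nothing leaves your machine here, which is why local models are such a cheap, private sandbox for testing agents before pointing them at a paid cloud API.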