Day 4 - Jan 24 2026

Understand the "Brain" of the operation before we start performing surgery on it.

1. What is an LLM? (No Math)

An LLM (Large Language Model) is not a "knowledge base" or a "search engine." It is a Next-Token Prediction Engine.

  • The Mental Model: Imagine the world's best auto-complete. If you type "The capital of France is", the model doesn't "know" geography. It calculates that "Paris" is statistically the most likely next word.
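To see this mental model in code: a minimal sketch using the Hugging Face `transformers` library and the small GPT-2 model (both my choice for illustration; nothing in these notes prescribes a library). It asks the model for a probability distribution over the next token and prints the top candidates.

```python
# Minimal next-token prediction demo (assumes `pip install transformers torch`).
# Model ("gpt2") and prompt are illustrative choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, sequence_length, vocab_size)

# The scores at the last position are the model's guesses for the *next* token.
next_token_logits = logits[0, -1]
probs = torch.softmax(next_token_logits, dim=-1)

# Show the five most likely continuations and their probabilities.
top = torch.topk(probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r:>12}  {prob.item():.3f}")
```

No lookup, no database: just "given this text, which token is most likely to come next?"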

2. The 3 Pillars of Control

  • Tokens: The currency of LLMs. The model processes text in chunks (tokens), not words. Rough math: 1,000 tokens ≈ 750 words (see the token-counting sketch after this list).

  • Context Window: The "Short-Term Memory." It’s how much text the model can look at right now to answer you. If the conversation exceeds this limit, the model "forgets" the beginning (the same sketch below shows one way to trim for this).

  • Temperature: The "Creativity Knob" (see the sampling sketch after this list).

    • Temp = 0.0: Precise and repeatable; the model picks its most likely next token every time (Use for coding/data extraction).

    • Temp = 1.0: Creative, varied, less predictable (Use for brainstorming/poetry).
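Tokens and the context window are easy to poke at directly. A rough sketch using OpenAI's `tiktoken` tokenizer (my pick; the `cl100k_base` encoding and the 8,000-token budget are illustrative assumptions): count tokens in a string, then trim a conversation from the oldest end so it fits a fixed window.

```python
# Token counting + context-window trimming sketch (assumes `pip install tiktoken`).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    return len(enc.encode(text))

print(count_tokens("The capital of France is Paris."))  # token count, not word count

def trim_to_window(messages: list[str], budget: int = 8_000) -> list[str]:
    """Keep the most recent messages whose combined token count fits the budget."""
    kept, used = [], 0
    for msg in reversed(messages):      # walk from newest to oldest
        cost = count_tokens(msg)
        if used + cost > budget:
            break                       # older messages are "forgotten"
        kept.append(msg)
        used += cost
    return list(reversed(kept))
```

And the temperature knob itself is just arithmetic on the model's raw scores before sampling. A small self-contained sketch with made-up numbers (the token labels and logits are invented for illustration):

```python
# What temperature does: rescale logits before turning them into probabilities.
import numpy as np

tokens = ["Paris", "Lyon", "France", "beautiful"]
logits = np.array([5.0, 2.0, 1.5, 1.0])       # hypothetical next-token scores

def next_token_probs(logits: np.ndarray, temperature: float) -> np.ndarray:
    scaled = logits / max(temperature, 1e-6)  # temperature near 0 -> near-greedy
    exp = np.exp(scaled - scaled.max())       # softmax, numerically stable
    return exp / exp.sum()

for temp in (0.01, 1.0):
    probs = next_token_probs(logits, temp)
    print(f"T={temp}: " + ", ".join(f"{t}={p:.2f}" for t, p in zip(tokens, probs)))
# At T≈0 nearly all probability piles onto the top token; at T=1 the distribution spreads out.
```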

3. Cloud vs. Local (Ollama)

  • Cloud (OpenAI/Gemini): Generally more capable models, but your data leaves your device and you pay per token.

  • Local (Ollama): Runs on your laptop. Private. Free. Works offline. Great for testing agents without burning money.
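To make the local option concrete, a quick sketch of calling Ollama's built-in HTTP API with plain `requests` (assumes Ollama is running on its default port 11434 and that a model, here `llama3`, has already been pulled with `ollama pull llama3`; model name and prompt are placeholders).

```python
# Local generation via Ollama's HTTP API; nothing leaves the machine.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "The capital of France is",
        "stream": False,                 # return one JSON object instead of a stream
        "options": {"temperature": 0},   # the same knob from pillar 3
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["response"])
```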
