Week 8 - BALT 4364 - Language Models
Chapter 8 takes a deep dive into Large Language Models (LLMs), one of the biggest recent breakthroughs in artificial intelligence. LLMs like ChatGPT, along with related generative systems such as the text-to-image model DALL·E 2, are changing the way machines understand and generate human-like text and images. In this chapter, I explore how LLMs work, how they're trained, and why they matter, along with a hands-on exercise to help me apply what I'm learning. LLMs are built on the transformer architecture, whose attention mechanism lets the model weigh how each word relates to the others in a passage and so recognize patterns and relationships in language. That's what makes them so powerful for tasks like answering questions, summarizing long passages, writing emails, or even creating poetry. The chapter also breaks down the two major stages of training: pre-training, where the model learns general language patterns from massive amounts of text, and fine-tuning, where it's shaped for a specific task. Understanding this process helps me see why LLMs are so capable and where their limitations come from. ...
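To make the two training stages concrete for myself, here is a minimal sketch in Python. It is not a transformer and not how any real LLM is trained: it is a toy bigram counter that predicts the next word from the previous one. The corpora, the function names `train` and `predict_next`, and the `weight` parameter are all made up for illustration. Real fine-tuning adjusts neural network weights with gradient updates. Here I only approximate that idea by counting the task-specific text more heavily.

```python
from collections import Counter, defaultdict


def train(model, text, weight=1):
    """Update bigram counts in place: model[prev][next] += weight."""
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += weight
    return model


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Stage 1: "pre-training" on broad, general text (a made-up corpus).
general_corpus = (
    "the model reads the text . the model learns the patterns . "
    "the weather is nice . the model reads the news ."
)
model = defaultdict(Counter)
train(model, general_corpus)
before = predict_next(model, "the")

# Stage 2: "fine-tuning" on a small task-specific corpus (also made up),
# weighted more heavily so it shifts the counts learned in stage 1.
support_corpus = "the customer needs help . the customer wants a refund ."
train(model, support_corpus, weight=5)
after = predict_next(model, "the")

print(before, "->", after)  # prints: model -> customer
```

The same model object is used in both stages, which is the point: fine-tuning starts from what pre-training already learned and nudges it toward one task. It also hints at a limitation, since the model can only predict words it has seen in its training text.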