Week 8 - BALT 4364 - Language Models

    Chapter 8 takes a deep dive into Large Language Models (LLMs), one of the biggest breakthroughs in artificial intelligence. These models—like ChatGPT and DALL·E 2—are changing the way machines understand and generate human-like text and images. In this chapter, I explore how LLMs work, how they’re trained, and why they matter, along with a hands-on exercise to help me actually apply what I’m learning.

LLMs are built on the transformer architecture, whose attention mechanism lets them recognize patterns and relationships across an entire passage of text. That's what makes them so powerful for tasks like answering questions, summarizing long passages, writing emails, or even creating poetry.
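The heart of the transformer is scaled dot-product attention, where each token builds a weighted mix of every other token's information. Here's a minimal NumPy sketch (the shapes and random values are just for illustration, not from the chapter):

```python
import numpy as np

def softmax(z, axis=-1):
    # Subtract the row max first for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scores measure how strongly each token attends to every other token.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted mix of the value vectors

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))  # 4 tokens, 8-dimensional vectors
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = attention(Q, K, V)
print(out.shape)  # one 8-dim output vector per token
```

Real models run many of these attention "heads" in parallel and stack dozens of layers, but the core computation is this small.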

The chapter also breaks down the two major stages of training: pre-training, where the model learns general language patterns from massive amounts of text, and fine-tuning, where it’s shaped for a specific task. Understanding this process helps me see why LLMs are so capable—and where their limitations come from.
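During pre-training, the model repeatedly plays a fill-in-the-next-word game: it outputs a probability for every word in its vocabulary and is penalized (via cross-entropy loss) when it puts low probability on the word that actually came next. A toy sketch with made-up numbers (the tiny vocabulary and logits are my own illustration, not from the chapter):

```python
import numpy as np

# Tiny made-up vocabulary for illustration.
vocab = ["the", "cat", "sat", "on", "mat"]

def softmax(z):
    z = z - z.max()  # for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Suppose the model has read "the cat" and emits one score (logit)
# per vocabulary word for what comes next.
logits = np.array([0.1, 0.2, 3.0, 0.1, 0.4])
probs = softmax(logits)

# The true next word is "sat"; cross-entropy loss is -log p("sat").
target = vocab.index("sat")
loss = -np.log(probs[target])
print(vocab[int(np.argmax(probs))])  # the model's top guess
```

Pre-training repeats this over billions of examples; fine-tuning uses the same loss but on a smaller, task-specific dataset, which is why the model keeps its general language ability while adapting to the new task.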

Of course, LLMs come with challenges too, like high computational costs, data bias, and ethical concerns.

To bring everything together, the chapter ends with a practical Google Colab exercise using the T5 model for text summarization. This hands-on part is especially valuable, giving me a chance to build real skills I can use in my own projects.
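The Colab exercise uses T5, a "text-to-text" model where the task is signaled with a prefix on the input. A minimal sketch of how that looks with the Hugging Face transformers library (the `"t5-small"` checkpoint and generation settings are my assumptions, not necessarily the exact ones from the chapter):

```python
def build_input(text):
    # T5 treats every task as text-to-text; summarization is requested
    # by prefixing the input with "summarize: ".
    return "summarize: " + text

def summarize(text, model_name="t5-small"):
    # Imports kept inside the function so the helper above works
    # without the (large) model downloaded.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    inputs = tokenizer(build_input(text), return_tensors="pt",
                       truncation=True, max_length=512)
    ids = model.generate(inputs["input_ids"], max_length=60,
                         num_beams=4, early_stopping=True)
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```

In Colab, calling `summarize(article)` downloads the checkpoint on first use and returns a short summary of the passage.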
