Foundations of Language Modeling
Weeks 1–3
- Understand the ML development pipeline and strengths/limitations of language models
- Compare n-gram and transformer models through hands-on coding
- Define a community problem to tackle with machine learning
Tobias Lorenz
Tobias Lorenz & Deborah Dormah Kanubala
- N-grams
- Lab: Experiment with N-grams
- Knowledge Check 2
- The limitations of N-grams
- Lab: Compare N-gram and Transformer language models
- AlphaFold: the power of machine learning
- Weighing values: culture and ethics in the trolley problem
- Applying a local ethical lens to the trolley problem
- Core aspects of Ubuntu
- Develop a local value framework
- Anatomy of a language model
- What does it mean to train a model?
- Knowledge Check 3
- Machine learning development pipeline
- Lab: Prepare the dataset for training an SLM
- Lab: Train your own Small Language Model (SLM)
- Evaluating a model
- Knowledge Check 4
- Anticipating benefits
- Challenge: Develop your problem statement
- Knowledge Check 5
- Summary
- Looking forward
- Additional resources and further reading
- Glossary
- Feedback
Text Data: Tokenization & Embeddings
Weeks 4–6
- Learn to prepare, structure, and represent text data for language models
- Investigate tokenization strategies (character, word, subword/BPE) and embeddings
- Design a dataset ethically using data cards
- Teaching a machine the soul of your language
- A world of text: types and sources
- Exploring raw data
- Lab: Preprocess data
- Harnessing the potential of low-resource languages
- Data resources
- Who owns the data?
- Knowledge Check 1
- Class activity: Why document data?
- Challenge: Build a dataset ethically with a Data card
- Learning objectives
- How to get the most out of this course
- What is tokenization?
- Lab: Tokenize texts into characters and words
- Lab: Tokenize texts into subword tokens
- Subword tokenization
- Lab: Implement a BPE tokenizer
- Whose voice is missing?
- Knowledge Check 2
Mid-Cohort Break
Week 7
Rest & Catch-up Week
No scheduled sessions. Use this time to review materials, work on your team project, or get ahead on challenge submissions.
Neural Networks & Training
Weeks 8–10
- Implement and evaluate the multilayer perceptron (MLP)
- Understand backpropagation, gradients, and stochastic gradient descent
- Spot and mitigate overfitting/underfitting in practice
- The deep learning revolution
- Signal and noise
- Lab: Distinguish between signal and noise
- Generalization and the bias-variance trade-off
- Training and test splits
- Predicting cyclones with AI weather models
- Anticipating risks
- Knowledge Check 1
- The multilayer perceptron (MLP)
- Lab: Make predictions with a single-layer neural network
- Learning objectives
- How to get the most out of this course
- Gradients
- Lab: Gradients
- Knowledge Check 4
- Backpropagation
- Stochastic gradient descent (SGD)
- Lab: Train your model with Keras
- Knowledge Check 5
- Anticipating social impacts
- Challenge: Create an impact statement card
- Knowledge Check 6
- Summary
- Additional resources and further reading
- Looking forward
- Glossary
- Feedback
Transformer Architecture
Weeks 11–14
- Explore the attention mechanism, masked attention, and multi-head attention
- Understand positional embeddings, layer normalization, and transformer blocks
- Build neural networks suited to language modeling
Josiah Isong
Olumide Okubadejo
- Positional embeddings
- Lab: Positional embeddings
- Sinusoidal and rotary positional embeddings
- Knowledge Check 4
- Community values and meaning in an automated world
- Why engagement matters: Gendered chatbots in Nigerian banks
- Mapping stakeholders and social values
- Challenge: Design a mini-engagement plan
Josiah Isong
Olumide Okubadejo
- Transformer blocks
- Multi-layer perceptron (in transformers)
- Layer normalization
- Lab: Trainable parameters in the transformer model
- Knowledge Check 3
- Pros and cons of transformers
- Decoding and generation
- Knowledge Check 5
- Summary
- Looking forward
- Additional resources and further reading
- Glossary
- Feedback
Project Week
Sep 6–12, 2026
Final project sprint (no lectures)
Teams finalise their Jupyter Notebook, presentation slides, and demo. Mentors available for check-ins. Challenge Lab: Train A Small Language Model closes this week.
🎉 Demo Day
September 19, 2026
Online Demo Day
Each team presents their problem & motivation, dataset & model, ethical considerations, and results. Top teams receive prizes. Format: Remote (Online).
Full capstone project guidelines — including team structure, deliverables, project requirements, and mentor details — will be published here once finalised. Check back soon!
Four challenges run alongside the weekly curriculum, each tied to a course. A final challenge lab caps the course's technical work before Demo Day.
All 4 Cohort Challenges
- Challenge 1 — Develop your problem statement
- Challenge 2 — Build a dataset with a Data Card
- Challenge 3 — Create an impact statement card
- Challenge 4 — Design a mini-engagement plan
Final Challenge Lab
- Train A Small Language Model
Submission Deadlines
- Challenge 1 closes → Week 5
- Challenge 2 closes → Week 7
- Challenge 3 closes → Week 9
- Challenge 4 closes → Week 11
- Project Leaderboard opens → Week 8
- Project Leaderboard closes → Week 14
- Challenge Lab closes → Week 14
- At least 60% lecture attendance
- Complete the "Train A Small Language Model" lab
- Submit all four cohort challenges (as a team)
- Participate in a final capstone project (as a team)
- Attend tutorial sessions regularly
- Engage in discussions & community activities
- Contribute actively to your team project
- Complete weekly materials on Google Cloud Skills Boost