Applications of Transformers has helped countless students
understand how multimodal models really work.
We'll use PyTorch to build an Image Captioning Model.
Amit R
I love the attention to detail that Dev provides, with minimal theory to get started with implementation. This is exactly how I wanted to learn.

Shekhar C
The courses have been a LIFESAVER for me. They're comprehensive and easy to understand, almost like having a personal expert by your side.

Chang L
The structure has significantly boosted my confidence in AI/ML. Highly recommend for anyone serious about advancing in this field.
Course Content
01. Basics of multimodal AI
02. Encoder Decoder Models
03. Review of Transformers
04. Data Processing Techniques
05. Generating text based on images
Simon Wang, Singapore
"Dev has a knack for simplifying complex concepts."
Jane Grant, New York
"This course contains a wealth of detailed concepts that you won't find collected together anywhere else. Highly recommend."
Rafael Ruiz, Spain
"I love how all the concepts are tied together from module to module."
What's Inside
Applications of Transformers consists of three main chapters.

Chapter 1: Multimodal AI
We start the course by discussing the neural networks behind multimodal models.

Chapter 2: Building The Model
Next, we'll build the model using Convolutional Neural Networks & Transformers.

Chapter 3: Image Captioning
Finally, we'll train our model, pass in images as input, and observe the generated captions.
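
To give a feel for how the three chapters fit together, here is a minimal PyTorch sketch of the general idea: a small CNN encodes the image, a Transformer decoder predicts caption tokens, one training step runs with teacher forcing, and a greedy loop generates a caption. This is not the course's actual code. The class name `TinyCaptioner`, the helper `train_step`, the layer sizes, the token ids for start and end of caption, and the random tensors standing in for images and captions are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class TinyCaptioner(nn.Module):
    """Toy CNN encoder + Transformer decoder. All sizes are illustrative."""

    def __init__(self, vocab_size=50, d_model=32, max_len=16):
        super().__init__()
        # CNN encoder: turns an image into a grid of d_model-dim features.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, d_model, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerDecoderLayer(
            d_model=d_model, nhead=4, dim_feedforward=64, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.out = nn.Linear(d_model, vocab_size)
        self.max_len = max_len

    def encode(self, images):
        feats = self.cnn(images)                 # (B, d_model, H', W')
        return feats.flatten(2).transpose(1, 2)  # (B, H'*W', d_model)

    def forward(self, images, tokens):
        memory = self.encode(images)
        seq_len = tokens.size(1)
        pos = torch.arange(seq_len, device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(pos)
        # Causal mask: position t may only attend to positions <= t.
        mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=tokens.device),
            diagonal=1,
        )
        hidden = self.decoder(x, memory, tgt_mask=mask)
        return self.out(hidden)                  # (B, T, vocab_size)

    @torch.no_grad()
    def greedy_caption(self, image, bos_id=1, eos_id=2):
        # bos_id / eos_id are hypothetical start/end-of-caption token ids.
        self.eval()
        tokens = torch.tensor([[bos_id]], device=image.device)
        for _ in range(self.max_len - 1):
            logits = self.forward(image, tokens)
            next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
            tokens = torch.cat([tokens, next_id], dim=1)
            if next_id.item() == eos_id:
                break
        return tokens.squeeze(0).tolist()


def train_step(model, optimizer, images, captions):
    """One teacher-forced step: predict token t+1 from tokens up to t."""
    model.train()
    logits = model(images, captions[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), captions[:, 1:].reshape(-1)
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Random stand-ins for a real image/caption dataset (assumption).
torch.manual_seed(0)
model = TinyCaptioner()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
images = torch.randn(4, 3, 32, 32)
captions = torch.randint(3, 50, (4, 8))

loss = train_step(model, optimizer, images, captions)
caption_ids = model.greedy_caption(images[:1])
print(loss, caption_ids)
```

With random data the generated ids are meaningless. A real setup would replace the random tensors with tokenized captions and preprocessed images, and would usually swap the toy CNN for a pretrained backbone.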

Intro to Multimodal AI & Encoder-Decoders
11:33

The Image Captioning Architecture
26:13

Setting Up Dependencies
06:28

Implementing Neural Network Layers
14:43
Note for Members: The Video Info tab has a description with important clarifications and information for each module.