DDPM are generative models that learn to transform Gaussian noise into data samples by iteratively denoising through a Markovian diffusion process. The model is trained on CIFAR10.
Generative Adversarial Networks (GANs) consist of a generator and a discriminator trained in a competitive framework. This implementation is trained on the CelebA dataset and conditioned on attributes such as 'Male' and 'Blond Hair' to generate realistic face images.
A multimodal model by OpenAI that learns visual concepts from natural language supervision, enabling it to understand and relate images and text efficiently.
Implementation the AlexNet architecture, the landmark model that won the 2012 ImageNet challenge. It uses the CIFAR-10 dataset for training and integrates TensorBoard for real-time visualization of metrics.
Spatial Transformer Networks allow a neural network to learn how to perform spatial transformations on the input image in order to enhance the geometric invariance of the model.