Image Inpainting for Content Localization

In our last post, we trained StyleGAN2 over a corpus of hundreds of thousands of theatrical posters we scraped from sites like IMDb. Then we explored image retrieval applications of StyleGAN2 after extracting embeddings by projecting our image corpus onto the learned latent factor space. Image retrieval techniques can form the basis of personalized image recommendations, since content similarity can be used to generate new recommendations. Netflix engineers posted about testing the impact on user engagement from artwork produced by their content creation team....

 · 3 min · Terry Rodriguez & Salma Mayorquin

Applying GAN Latent Factors for Image Retrieval

GANs consistently achieve state-of-the-art performance in image generation by learning the distribution of an image corpus. The newest models often use explicit mechanisms to learn factored representations for images, which can help provide faceted image retrieval capable of conditioning output on key attributes. In this post, we explore applying StyleGAN2 embeddings in image retrieval tasks. StyleGAN2: To begin, we train a StyleGAN2 model to generate theatrical posters from our image corpus....
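
As a rough illustration of retrieval over embeddings, here is a minimal sketch that ranks a corpus by cosine similarity to a query vector. It assumes the embeddings are already available as rows of a NumPy array; the random vectors, the 512-dimensional size, and the `top_k_similar` helper are hypothetical stand-ins, not the code or projection pipeline from the post.

```python
import numpy as np


def top_k_similar(query, embeddings, k=3):
    """Return (indices, scores) of the k rows most similar to `query`.

    Similarity is cosine similarity; `top_k_similar` is a hypothetical
    helper written for this sketch.
    """
    # Normalize so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = e @ q
    # Sort descending by score and keep the first k entries.
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Stand-in data: random vectors in place of real projected latent codes.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(100, 512))

# A query that is a slightly perturbed copy of item 7 in the corpus.
query = corpus[7] + 0.01 * rng.normal(size=512)

idx, scores = top_k_similar(query, corpus, k=3)
print(idx, scores)
```

For a large corpus, an approximate nearest-neighbor index would likely replace the brute-force dot product used here.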

 · 3 min · Terry Rodriguez & Salma Mayorquin

Deepfake Detection With NVIDIA TLT 3.0 and DeepStream SDK

Last year, over 2 thousand teams participated in Kaggle’s Deepfake detection video classification challenge. For this task, contestants were provided 470 GB of high-resolution video and required to submit a notebook which, within a 9-hour run-time limit, predicts whether each sample video file has been deepfaked. Since most deepfake technology performs a faceswap, contestants concentrated on face detection and analysis. Beginning with face detection, contestants could develop an image classifier using the provided labels....

 · 5 min · Terry Rodriguez & Salma Mayorquin

Deepfake Detection: Challenge Accepted

Advances in methods to generate photorealistic but synthetic images have prompted concerns about abusing the technology to spread misinformation. In response, major tech companies like Facebook, Amazon, and Microsoft partnered to sponsor a contest hosted by Kaggle to mobilize machine learning talent to tackle the challenge. With $1 million in prizes and nearly half a terabyte of samples to train on, this contest requires the development of models that can be deployed to combat deepfakes....

 · 2 min · Terry Rodriguez & Salma Mayorquin

Everybody Dance Faster

Check out the repo and the video! “Everybody Dance Now” offers a sensational demonstration of combining image-to-image translation with pose estimation to produce photo-realistic ‘do-as-i-do’ motion transfer. Researchers used roughly 20 minutes of video shot at 120 fps of a subject moving through a normal range of body motion. It is also important for source and target videos to be taken from similar perspectives. Generally, this is a fixed camera angle at a third-person perspective with the subject’s body occupying most of the image....

 · 7 min · Terry Rodriguez & Salma Mayorquin

Editing Images With CycleGANs

GANs represent the state of the art in image-to-image translation. However, it can be difficult to acquire aligned image pairs to learn the mapping between image domains. CycleGANs introduced the “cycle consistency” constraint to learn to transfigure images, transfer style, and enhance photos from unaligned source and target domain samples. This technique has been used to render historic black & white images in full color or to represent an image at higher resolution, but here we explore applications in agriculture....
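
To make the “cycle consistency” idea concrete, here is a minimal NumPy sketch of the loss term: translating an image to the other domain and back should recover the original. The generators `G` and `F` below are toy affine functions chosen purely for illustration, and the mean absolute error stands in for the L1 penalty; a real CycleGAN uses trained networks and adds adversarial losses.

```python
import numpy as np


def cycle_consistency_loss(G, F, x, y):
    """L1 cycle loss: x -> G -> F should return x, and y -> F -> G should return y."""
    forward = np.mean(np.abs(F(G(x)) - x))   # source -> target -> source
    backward = np.mean(np.abs(G(F(y)) - y))  # target -> source -> target
    return forward + backward


# Toy "generators" (assumptions for this sketch, not trained models).
def G(a):
    return 2.0 * a + 1.0          # maps domain X to domain Y


def F_exact(b):
    return (b - 1.0) / 2.0        # exact inverse of G


def F_off(b):
    return b / 2.0                # imperfect inverse of G


# Unaligned batches of stand-in "images" (batch, height, width, channels).
rng = np.random.default_rng(0)
x = rng.uniform(size=(4, 8, 8, 3))
y = rng.uniform(size=(4, 8, 8, 3))

loss_exact = cycle_consistency_loss(G, F_exact, x, y)
loss_off = cycle_consistency_loss(G, F_off, x, y)
print(loss_exact, loss_off)
```

The loss is near zero when the two mappings invert each other and grows when they do not, which is what lets training proceed without aligned image pairs.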

 · 2 min · Terry Rodriguez & Salma Mayorquin