Hey, we're Terry & Salma 👋

The technical tag team behind this blog. We aim to showcase the latest research, tools, and hardware for developing AI applications.

Bitrate Optimization using Spark and FFmpeg

Check out this part 1 notebook and this part 2 notebook that accompany this post! Streaming video is a major part of how users consume information across a variety of applications. As more users turn to mobile devices, the screen sizes are also increasing. At the same time, consumers expect high quality video without lag or distortion. This frames an engineering challenge to optimize the way video is streamed for consumers using a wide variety of hardware....

3 min · Terry Rodriguez & Salma Mayorquin

Scalable Image Deduplication With Spark

Make sure to check out the Databricks notebook for this post! Modern internet companies maintain many image/video assets rendered at various resolutions to optimize content delivery. This demand gives rise to very interesting optimization problems. Groups like Netflix have even taken steps to personalize the images presented to each user, but as they describe, this involves subproblems in organizing the collection of images. In particular, Netflix researchers described extracting image metadata to help cluster near-duplicate images so they could more efficiently apply techniques like contextual bandits for image personalization....

2 min · Terry Rodriguez & Salma Mayorquin

Image Inpainting for Content Localization

In a prior post, we explored training StyleGAN2 on a corpus of many theatrical posters we scraped from places like IMDb. Then we considered applications of StyleGAN2 for image retrieval after extracting embeddings by projecting our corpus to the latent factor space. These image retrieval techniques can form the basis of personalized image recommendations. Netflix engineering posted on how they partner with their content creation team to test artwork for driving user engagement....

3 min · Terry Rodriguez & Salma Mayorquin

Applying GAN Latent Factors for Image Retrieval

GANs represent the state of the art in learning an image distribution for an image corpus. These models often use explicit mechanisms for learning factored representations of images over a continuous embedding space. These are desirable features for image embeddings in the context of image retrieval. In this post, we explore applications of StyleGAN2 variants to the image retrieval task. StyleGAN2: Using an unsupervised approach, we train a StyleGAN2 model to generate theatrical posters from our image corpus....

3 min · Terry Rodriguez & Salma Mayorquin

Deepfake Detection With NVIDIA TLT 3.0 and DeepStream SDK

Last year, over 2,000 teams participated in Kaggle’s Deepfake detection video classification challenge. For this task, contestants were provided 470 GB of high-resolution video and required to submit a notebook that, within a 9-hour run-time limit, predicts whether each sample video file has been deepfaked. Since most deepfake technology performs a faceswap, contestants concentrated on face detection and analysis. Beginning with face detection, contestants could develop an image classifier using the provided labels....

5 min · Terry Rodriguez & Salma Mayorquin