Bitrate Optimization using Spark and FFmpeg

Check out the part 1, part 2, and part 3 notebooks that accompany this post! Streaming video is quickly occupying the lion’s share of digital content consumed by users of many applications. More users are streaming from mobile devices even as screen sizes increase, and consumers expect high-quality video without lag or distortion artifacts. This frames an engineering challenge: optimizing the way video is streamed for consumers across a multitude of hardware platforms....

 · 5 min · Terry Rodriguez & Salma Mayorquin

Scalable Image Deduplication With Spark

Make sure to check out the Databricks notebook that complements this post! Modern internet companies maintain many image and video assets rendered at various resolutions to optimize content delivery. This demand gives rise to interesting optimization problems. Groups like Netflix have even taken steps to personalize the images presented to each user, but as they describe, this involves subproblems in organizing the collection of images. In particular, researchers described extracting image metadata to help cluster near-duplicate images so they could more efficiently apply techniques like contextual bandits for image personalization....

 · 2 min · Terry Rodriguez & Salma Mayorquin