Smells Like ML

Movie Poster Similarity for Recommendation

The use of streaming services has sharply increased over this past year. Many video streaming platforms prominently feature theatrical posters in content representation. As movie posters are designed to signal theme, genre and era, this representation strongly influences a user’s propensity to watch the title. Domain experts have remarked on how poster elements can convey an emotion or capture attention. Exploring this thesis, Netflix conducted a UX study, using eye tracking to find that 91% of titles are rejected after roughly 1 second of view time....

TF-Recommenders & Kubernetes for flexible RecSys Model Development & Deployment

Introducing TF-Recommenders: Recently, Google open-sourced a Keras API for building recommender systems called TF-Recommenders. TF-Recommenders is flexible, making it easy to integrate heterogeneous signals like implicit ratings from user interactions, content embeddings, or real-time context info. This module also introduces losses specialized for ranking and retrieval, which can be combined to benefit from multi-task learning. The developers emphasize its ease of use in research, as well as its robustness for deployment in web-scale applications....

TF-Ranking and BERT for Movie Recommendations

Check out our repo for all the code referenced in this blog! Recommender systems are used by many groups to optimize how products are presented to users. There are a variety of ways to build recommender systems, but at their core, these systems are designed to sort a universe of items by their relevance to a user based on user information, item information, or both. One well-known approach to the sorting problem is the Learn-to-Rank model, where the objective is to rank a list of examples by each item’s relevance to a particular user....
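As a rough illustration of the sorting objective described above, the sketch below ranks a handful of items by a relevance score for one user. The `score` function, its dot-product rule, and the toy feature vectors are hypothetical stand-ins and not the model from the post; in practice a learned ranker would take the place of `score`.

```python
from typing import Dict, List, Sequence


def score(user: Sequence[float], item: Sequence[float]) -> float:
    """Hypothetical relevance score: dot product of user and item features."""
    return sum(u * i for u, i in zip(user, item))


def rank_items(user: Sequence[float], items: Dict[str, Sequence[float]]) -> List[str]:
    """Return item names sorted from most to least relevant for this user."""
    return sorted(items, key=lambda name: score(user, items[name]), reverse=True)


# Toy data (made up for illustration): two features per vector,
# read here as affinity for "action" and "comedy".
user = [0.9, 0.1]
items = {
    "action_movie": [1.0, 0.0],
    "comedy_movie": [0.0, 1.0],
    "action_comedy": [0.5, 0.5],
}

ranking = rank_items(user, items)
print(ranking)
```

With these made-up vectors, the user leans heavily toward the first feature, so the item that is strongest on that feature is placed first.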

IVA Pipelines with NVIDIA TLT and Deepstream SDK 5.0

We have seen applications in industries like retail, telemedicine, and robotics enabled by video analytics with machine learning. ML practitioners often leverage transfer learning with pretrained models to expedite development. Computer vision applications can benefit from using video analytics frameworks to facilitate faster iteration and experimentation. NVIDIA’s Transfer Learning Toolkit (TLT) and the DeepStream SDK 5.0 have made it easy to experiment with various network architectures and quickly deploy them on an NVIDIA-powered device for optimized inference....

Population Health Modeling

In a matter of months, the COVID-19 pandemic has besieged humanity, and now the world wrestles to manage the population health challenges of a novel coronavirus with remarkable infectivity. Organizing an effective response to blunt the impact of such a large, complex challenge demands a principled and scientific approach. Better Planning by Forecasting Infections: Reliable forecasting is crucial for planning and allocating limited resources efficiently and minimizing casualties....

Deepfake Detection: Challenge Accepted

Advances in methods to generate photorealistic but synthetic images have prompted concerns about abusing the technology to spread misinformation. In response, major tech companies like Facebook, Amazon, and Microsoft partnered to sponsor a contest hosted by Kaggle to mobilize machine learning talent to tackle the challenge. With $1 million in prizes and nearly half a terabyte of samples to train on, this contest requires the development of models that can be deployed to combat deepfakes....

Protecting Privacy With Computer Vision

Check out and contribute to our collection of data privacy resources! AI researchers have developed models to identify image pixels featuring people. We apply this to promote privacy by helping you redact personally identifiable info in images. This demo is powered by TensorFlow.js! Drop an image and retrieve the redacted output without ever sending data over the internet. Click on your redacted image when it’s done to save. Consider another use case of delivery robots roaming the streets....

Everybody Dance Faster

Check out the repo and the video! “Everybody Dance Now” offers a sensational demonstration of combining image-to-image translation with pose estimation to produce photo-realistic ‘do as I do’ motion transfer. Researchers used roughly 20 minutes of video shot at 120 fps of a subject moving through a normal range of body motion. It is also important for source and target videos to be taken from similar perspectives. Generally, this is a fixed camera angle at a third-person perspective with the subject’s body occupying most of the image....

Human Activity Recognition with Pose Estimation

Check out the repo and enjoy the video on YogAI and ActionAI! Wanting a personal trainer to help track our fitness goals, we figured we could build our own. The goal was to build an application that could track how we were exercising, and we began with yoga as a simple context. We dubbed the first iteration of this application YogAI. We thought about the YogAI concept for some time....

Editing Images With CycleGANs

GANs represent the state of the art in image-to-image translation. However, it can be difficult to acquire aligned image pairs to learn the mapping between image domains. CycleGANs introduced the “cycle consistency” constraint to learn to transfigure images, transfer style, and enhance photos from unaligned source and target domain samples. This technique has been used to render historic black-and-white images in full color or to represent an image at greater resolution, but here we explore applications in agriculture....
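To make the “cycle consistency” idea concrete, here is a minimal sketch using toy one-dimensional “images” (plain lists of floats). The mappings `g`, `f_good`, and `f_bad` are hypothetical fixed functions standing in for the two learned generators, and the loss is simply the L1 reconstruction error after a round trip through both domains. A real CycleGAN trains neural networks and adds adversarial losses, which this sketch leaves out.

```python
from typing import Callable, List

Image = List[float]
Mapping = Callable[[Image], Image]


def l1(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(g: Mapping, f: Mapping, x: Image, y: Image) -> float:
    """Sum of L1 round-trip errors: x -> g -> f and y -> f -> g."""
    forward = l1(f(g(x)), x)   # source -> target -> back to source
    backward = l1(g(f(y)), y)  # target -> source -> back to target
    return forward + backward


# Hypothetical stand-ins for the two generators (not learned models).
def g(img: Image) -> Image:
    """Source -> target: double every pixel."""
    return [2.0 * p for p in img]


def f_good(img: Image) -> Image:
    """Target -> source: exact inverse of g."""
    return [0.5 * p for p in img]


def f_bad(img: Image) -> Image:
    """Target -> source: a poor inverse of g."""
    return [0.25 * p for p in img]


x = [1.0, 2.0, 4.0]  # toy sample from the source domain
y = [2.0, 4.0, 8.0]  # toy, unpaired sample from the target domain

good_loss = cycle_consistency_loss(g, f_good, x, y)
bad_loss = cycle_consistency_loss(g, f_bad, x, y)
print(good_loss, bad_loss)
```

The pair of mappings that undo each other incurs no round-trip error, while the mismatched pair is penalized, which is the pressure that lets training proceed without aligned image pairs.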