FilmGeeks 3

Check out the FilmGeeks3 Collection. In last year’s post on generative models, we showcased theatrical posters synthesized with GANs and diffusion models. Since then, Latent Diffusion has gained popularity for its faster training and its ability to condition output on text and images. This variant performs diffusion in the embedding space after encoding text or image inputs with CLIP, giving the user much more influence over the results....

 · 3 min · Terry Rodriguez & Salma Mayorquin

Go Nerf Yourself

While prototyping YogAI, our smart mirror fitness application, we dreamed of using generative models like GANs to render realistic avatars. For the TFWorld 2.0 Challenge, we came a bit closer to that vision by demonstrating a pipeline that quickly creates motion transfer videos. More recently, we have been learning about reconstruction techniques and have been excited about the work on Neural Radiance Fields (NeRF). With this method, one learns an implicit representation of a scene from posed monocular videos....

 · 2 min · Terry Rodriguez & Salma Mayorquin

Next Steps for ActionAI

Some of our earliest work applying ML to video was done in the context of prototyping IoT products like YogAI. A couple of years ago, we described a more generalized pipeline called ActionAI. ActionAI was designed to streamline prototyping IoT products using lightweight activity recognition pipelines on devices like NVIDIA’s Jetson Nano or the Coral Dev Board. Since then, NVIDIA has introduced action recognition modules into their DeepStream SDK. These model a classifier using 3D convolutional kernels over the space-time volume of normalized regions of interest, batched over a k-window in time....

 · 4 min · Terry Rodriguez & Salma Mayorquin

Everybody Dance Faster

Check out the repo and the video! “Everybody Dance Now” offers a sensational demonstration of combining image-to-image translation with pose estimation to produce photo-realistic ‘do-as-I-do’ motion transfer. Researchers used roughly 20 minutes of video, shot at 120 fps, of a subject moving through a normal range of body motion. It is also important for source and target videos to be taken from similar perspectives. Generally, this is a fixed camera angle at a third-person perspective, with the subject’s body occupying most of the image....

 · 7 min · Terry Rodriguez & Salma Mayorquin

Human Activity Recognition with Pose Estimation

Check out the repo and enjoy the video on YogAI and ActionAI. Wanting a personal trainer to help track our fitness goals, we figured we could build our own. The goal was to build an application that could track how we were exercising, beginning with yoga as a simple context. We dubbed the first iteration of this application YogAI. We thought about the YogAI concept for some time....

 · 7 min · Terry Rodriguez & Salma Mayorquin