Hey, we're Terry & Salma 👋

The technical tag team behind this blog. We aim to showcase the latest research, tools, and hardware for developing AI applications.

Efficient Transformers

Convolutional Neural Networks have been a workhorse in our projects as we apply deep learning to image and video with an eye toward fast inference over big data or on small hardware. As is common practice, we have turned to RNNs, particularly LSTMs, to model sequential dependence in our data. However, Transformer architectures are increasingly being used to analyze image and video, two applications traditionally dominated by CNNs. Vaswani et al.’s pioneering work in machine translation introduced the Transformer, which relies on attention mechanisms rather than recurrent or convolutional layers, along with positional embeddings to encode the sequential relation between tokens....
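The attention mechanism at the core of the Transformer can be sketched in a few lines of NumPy; the shapes and variable names below are illustrative placeholders, not code from the paper:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # pairwise token-to-token similarity
    weights = softmax(scores, axis=-1)   # each row is a distribution over tokens
    return weights @ V                   # weighted mix of the value vectors

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 tokens, d_k = 8
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one mixed value vector per query token
```

Since nothing here depends on sequence order, positional embeddings are added to the token vectors so the model can still encode where each token sits.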

 · 5 min · Terry Rodriguez & Salma Mayorquin

TF Microcontroller Challenge: Droop, There It Is

Repo for this project here! A seasoned gardener can diagnose plant stress by visual inspection. For our entry to the TensorFlow Microcontroller Challenge, we chose to highlight the issue of water conservation while pushing the limits of computer vision applications. Our submission, dubbed “Droop, There It Is,” builds on previous work to identify droopy, wilted plants. Drought stress in plants typically manifests as visually discernible drooping and wilting, also known as plasmolysis, indicating low turgidity, or water pressure....

 · 5 min · Terry Rodriguez & Salma Mayorquin

Make Some Noise for Score Based Models

Generative models have achieved remarkable capabilities, learning a data distribution and then synthesizing original instances from it. In the arena of image generation, the recent SOTA tracks alongside advances in a family of models called generative adversarial networks, or GANs. This technique supplies the generator network with a corrective signal when its samples are not photorealistic, accomplished with a jointly trained classifier dubbed the discriminator....
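That corrective signal is just a classification loss. A minimal NumPy sketch of the standard GAN objective, with fixed arrays standing in for discriminator outputs (the numbers are illustrative placeholders, not trained values):

```python
import numpy as np

def bce(p, label):
    # Binary cross-entropy of discriminator probabilities p against a 0/1 label.
    p = np.clip(p, 1e-7, 1 - 1e-7)  # avoid log(0)
    return -np.mean(label * np.log(p) + (1 - label) * np.log(1 - p))

# Placeholder discriminator outputs: probability each sample is "real".
d_real = np.array([0.9, 0.8, 0.95])  # discriminator scores on real images
d_fake = np.array([0.1, 0.3, 0.2])   # discriminator scores on generator samples

# The discriminator is trained to push real -> 1 and fake -> 0.
d_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)

# The generator's corrective signal: it is penalized whenever its
# samples fail to fool the discriminator (i.e., score far from 1).
g_loss = bce(d_fake, 1.0)
```

As the generator's samples become more photorealistic, `d_fake` rises toward 1 and `g_loss` shrinks, which is the adversarial pressure the teaser describes.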

 · 4 min · Terry Rodriguez & Salma Mayorquin

Machine Learning on Video

Many groups have found recent success in productizing computer vision applications with some formulation of transfer learning from pre-trained convolutional neural networks. Factors such as cheaper bandwidth and storage, expanded remote work, streaming entertainment, social media, robotics, and autonomous vehicles all contribute to the rapidly increasing volume of video data. Despite the increased data and research attention, benchmark ML video tasks in perception, activity recognition, and video understanding have thus far eluded simple recipes or the broad adoption enjoyed by image applications....

 · 5 min · Terry Rodriguez & Salma Mayorquin

Jacked About Jax

As others have discussed, we’ve noticed a recent uptick in research implemented using Jax. You could chalk it up as yet another ML platform, but Jax is emerging as the tool of choice for faster research iteration at DeepMind. After exploring it for ourselves, we’re excited to find that Jax is principally designed for fast differentiation. Excited, because differentiation is foundational to the gradient-based learning strategies behind many ML algorithms. Moreover, the derivative is ubiquitous in scientific computing, making Jax one powerful hammer!...
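A taste of that differentiation-first design (this sketch assumes `jax` is installed): `jax.grad` transforms a scalar function into its derivative, and composing it yields higher-order derivatives.

```python
import jax

# A scalar function: f(x) = x^2 + 3x
def f(x):
    return x ** 2 + 3.0 * x

df = jax.grad(f)    # f'(x) = 2x + 3, computed by autodiff, not finite differences
ddf = jax.grad(df)  # grad composes: f''(x) = 2

print(df(2.0))   # 7.0
print(ddf(2.0))  # 2.0
```

The same transformation machinery underlies `jax.jit` and `jax.vmap`, which is much of why research iteration in Jax feels fast.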

 · 4 min · Terry Rodriguez