Hey, we're Terry & Salma 👋

The technical tag team behind this blog. We aim to showcase experiments and innovations in Artificial Intelligence and applied Machine Learning.

Go Nerf Yourself

While prototyping YogAI, our smart mirror fitness application, we dreamed of using generative models like GANs to render realistic avatars. For the TFWorld 2.0 Challenge, we came a bit closer to that vision by demonstrating a pipeline that quickly creates motion transfer videos. More recently, we have been learning about reconstruction techniques and have been excited about the work around Neural Radiance Fields (NeRF). By this method, one learns an implicit representation of a scene from posed monocular videos....

 · 2 min · Terry Rodriguez & Salma Mayorquin

Cheaper to Fly

In recent experiments, we’ve generated high-quality reconstructions of our apartment from video. Once you learn the failure modes of these methods, you know to move the camera smoothly, avoid bright lights, and focus on textured regions of the field of view (FOV). If it all works out, you might spend more time recording video than processing it! Automating the data collection can really reduce the cost of mapping and reconstruction. Compared to a handheld phone or tablet, drones move more smoothly and swiftly....

 · 3 min · Terry Rodriguez & Salma Mayorquin

Meet the Flockers

In this post, we share one of our favorite “pet projects”. We first heard about the “parrots of Telegraph Hill” while looking for things to see in the city. But even after a couple of years, we had never managed to run into one of these accidentally re-wilded parrots. Eventually, we moved to a new apartment where we could hear their distant squawks and occasionally see a small flock of the cherry-headed conures....

 · 3 min · Terry Rodriguez & Salma Mayorquin

Toward Real-Time Mapping & Reconstruction

Modern archeologists have been able to survey larger structures more precisely using remote sensing and photogrammetry. More recently, researchers have demonstrated applications of multi-view stereo with networked embedded cameras to track geological disturbances. In scenarios where visibility comes with high cost or safety risk, the ability to quickly render high-fidelity reconstructions for offline analysis & review can be a powerful tool. Advances in techniques like multi-view stereo and structure from motion have reduced the cost by alleviating dependence on more expensive sensors like lidar....

 · 4 min · Terry Rodriguez & Salma Mayorquin

Detect-Track-Localize

In our latest experiment with DepthAI’s cameras, we consider visual localization. This relates to the simultaneous localization and mapping (SLAM) problem, which robots solve to localize consistently in a known environment. However, instead of feature matching with algorithms like ORB, we can try to directly regress the pose of a known object. This approach uses object detection, which is more robust to changes in illumination and perspective than classical techniques. And without the need to generate a textured map of a new environment, this technique can be quickly adapted to new scenes....

 · 3 min · Terry Rodriguez & Salma Mayorquin

Pointcloud Video

Lately, we’ve come to enjoy using the DepthAI OAK-D, which features an RGB camera with stereo depth, IMU, and Intel’s MyriadX VPU. Along with this powerful hardware combination, DepthAI provides a rich SDK to build your own embedded vision pipelines. Many projects are included to get you started. These specs could help bring spatial AI to the SpecMirror where we can test representing human activities with pointcloud video. The Data: First, we will generate training samples for activity recognition models like P4Transformer....

 · 2 min · Terry Rodriguez & Salma Mayorquin

Adding Vision to the Turtlebot3

Turtlebot open-sourced designs for many low-cost, personal robot kits, and the Burger is a great platform to learn ROS and mobile robotics. The documentation progresses from setup to more advanced robot behavior including SLAM, navigation, autonomous driving and more! Our Build: Since the TurtleBot3 Burger kit doesn’t include a camera, we added the OAK-FFC-3P camera made by Luxonis with a 100-degree Arducam to our build. This device includes deep learning accelerators on the camera so we can reduce the compute needs on the Raspberry Pi host....

 · 2 min · Terry Rodriguez & Salma Mayorquin

Framing SSL

Many recent successes in computer vision have been powered by the extension of BERTology beyond the mode of text-based data to image & video. Without a doubt, efficient Transformers which patchify input images a la ViT have initiated much of this progress. But in this post, we are interested in pretraining with self-supervised learning to develop compact representations we might use in various downstream tasks. Datasets of real world interest often exhibit structures which are not exploited in research on benchmark datasets....

 · 3 min · Terry Rodriguez & Salma Mayorquin

Learning on Synthetic Data

Sometimes, we succeed in applying transfer learning with relatively few labeled samples to develop custom models. However, there are times when the cost of acquisition is so great that even having a few examples to learn from is difficult. Scientists curate databases like FathomNet to share expertise about the ocean’s wildlife. Applying machine learning to classify marine species is quite challenging in practice, due in part to the rarity of encounters and challenging photographic environments....

 · 3 min · Terry Rodriguez & Salma Mayorquin

Next Steps for ActionAI

Some of our earliest work applying ML to video was done in the context of prototyping IoT products like YogAI. A couple of years ago, we described a more generalized pipeline called ActionAI. ActionAI was designed to streamline prototyping IoT products using lightweight activity recognition pipelines on devices like NVIDIA’s Jetson Nano or the Coral Dev Board. Since then, NVIDIA has introduced action recognition modules into their DeepStream SDK. They model a classifier using 3D convolutional kernels over the space-time volume of normalized regions of interest, batched over a k-window in time....

 · 4 min · Terry Rodriguez & Salma Mayorquin