Pointcloud Video

Lately, we’ve come to enjoy using the DepthAI OAK-D, which features an RGB camera with stereo depth, an IMU, and Intel’s Myriad X VPU. Along with this powerful hardware combination, DepthAI provides a rich SDK for building your own embedded vision pipelines, and it includes many example projects to get you started. These specs could help bring spatial AI to the SpecMirror, where we can test representing human activities with pointcloud video. The Data: First, we will generate training samples for activity recognition models like P4Transformer....

2 min · Terry Rodriguez & Salma Mayorquin

Efficient Transformers

Convolutional Neural Networks have been a boon to the computer vision community. Deep learning from high-bandwidth image and video datasets can be much more efficient, both computationally and statistically, when it exploits the inductive bias of strong locality. This streamlines inference over big datasets or on resource-limited hardware. To model sequential dependence in short sequences of low-dimensional data, we have often used LSTMs. However, researchers have recently found success adapting Transformer architectures to learn from images and video, both applications traditionally dominated by CNNs....

7 min · Terry Rodriguez & Salma Mayorquin