TensorFlow is one of the most popular Machine Learning frameworks out there – probably THE most popular one. One of the great things about TensorFlow is that many libraries are actively maintained and updated. One of my favorites is the TensorFlow Object Detection API. The Tensorflow Object Detection API classifies and provides the location of multiple objects in an image. It comes pre-trained on nearly 1000 object classes with a wide variety of pre-trained models that let you trade off speed vs. accuracy.
Sending real time communications from point A to point B? That functionality is relatively easy with WebRTC. Processing the media in real time to do something cool with it? That is an area I find a lot more interesting, but it is a lot tougher to do. When I was building my Motion Detecting Baby Monitor project, I wished I had some kind of media server to handle the motion detection processing. That would give me some flexibility to take the processor intensive algorithm off of my phone and stick that in the cloud if I wanted to save on battery. That also got me thinking – if you can do motion detection why not apply other more advanced image processing algorithms to the WebRTC stream? How about facial recognition, object detection, gesture tracking or many of the other cool features that are popping up all the time in the popular Open Source Computer Vision (OpenCV) project? I wrote this dream off as science fiction for another year or two.