Maybe I have been working with WebRTC for too long, but I constantly see use cases for it in my daily life. One of the more recent use cases has to do with my dog, Levy. Levy is an Old English Bulldog. Many years ago, when he was a cute little puppy, we would let him up on the couch. Over the years he has turned into a massive, gassy, dandruffy, shedding beast, so we gradually weaned him off this habit in favor of an oversized, ridiculously fluffy doggy bed. He had been hooked on this new amenity for a while, but in the past several weeks he has been sneakily returning to his old habit when we are not home.
One of my first WebRTC experiments was a motion-detecting baby monitor. I wondered if I could do something similar to automatically alert me when he got up on the couch. Then I could use a WebRTC session to vocally urge him to get off. I have also been playing with this JavaScript-based microcontroller from Tessel – perhaps I could make use of that to nudge him when he inevitably ignores me. Odds were I was not going to be in front of a computer when the dog-on-the-couch alert went off, so I needed a good way of getting alerted on my mobile. Lastly, if I was going to go through the hassle of doing this, I wanted to make sure I could get a good recording of it all.
Continue reading to see how I put this together, including how I did a quick prototype, code samples of various modules, and the overall architecture.
Early prototype
As a first step I did some basic tests to see if this concept would even work and to explore what features I might need. I had played with the Tessel and a relay module before. Tessel uses node.js, making it easy for any JavaScript programmer to muck around with hardware. My son has this Snap Circuits Deluxe Rover kit from Elenco, so it was fairly trivial to wire the Tessel up to this to get it to move forward.
See here for my Gist showing how simple it was to write some JavaScript on the Tessel to make a REST interface. The REST interface activates a relay module that is wired to the rover and makes it move forward (see the video below for more details).
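The Gist itself is only a few lines. A rough equivalent looks something like this – not the exact Gist code, and it assumes the standard relay-mono library with the relay module plugged into port A:

var tessel = require('tessel');
var relaylib = require('relay-mono');       // Tessel relay module library
var router = require('tiny-router');

var relay = relaylib.use(tessel.port['A']); // relay module on port A (assumption)

relay.on('ready', function () {
    router.get('/forward', function (req, res) {
        relay.turnOn(1);                    // close relay channel 1 - the rover moves forward
        setTimeout(function () {
            relay.turnOff(1);               // open it again a second later
        }, 1000);
        res.send('Moving forward for 1 second');
    });

    router.listen(8080);
    console.log('listening on port 8080');
});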
I have a home theater PC setup with a webcam already facing the couch. Once I had the Tessel-controlled Rover working I simply set up a WebRTC video session with talky.io to remotely monitor the couch. The Tessel was just on my local LAN, so I used Chrome Remote Desktop to run my REST command on that LAN while I was in my office. I used some desktop recording software to record the whole thing.
You can check out the prototype and get a glimpse of the wiring here.
Time to get serious: requirements
The early prototype worked well, but I really needed something more automated. Here is what I came up with for features:
- Motion detection to set off an alert – like the baby monitor, this should be an easy module to reuse
- Alert to my mobile – alerting my phone is best since that is the device I always have near
- WebRTC video call – at a minimum I should be able to see the couch/dog and talk to him
- Recording – to capture the experience and play it back or save it to a file
- Tessel controlled rover – something more sophisticated than my prototype that could move in different directions if needed
- Monitor app in my house to set everything up and act as a local proxy for the Tessel
- Remote viewing app for remote viewing and control
I also realized I could use the built-in software zoom functions on the webcam to zoom in on the couch to help prevent false alarms.
I also played around with a few JavaScript face tracking libraries (see here and here) to see if I could use those to turn off alerting when there were people in the room. Unfortunately I could not get these to work reliably out of the box in my larger space.
From here I went off and played with each of these features.
Motion detection
For the TADHack last June, I played around with a similar motion-detecting alerting app. For that I decided to use a slightly simpler implementation of the motion detection algorithm than the one used in the baby monitor. This new one is inspired by Rod Appledorn’s EasySec project. You can see that module here.
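The general frame-differencing idea is simple: draw the video to a small canvas, compare each pixel against the previous frame, and count how many changed by more than a threshold. This is only a sketch of that idea, not the EasySec-inspired module itself – the canvas size and thresholds here are arbitrary:

// draw the video into a small off-screen canvas and diff it against the last frame
var motionCanvas = document.createElement('canvas');
motionCanvas.width = 64;
motionCanvas.height = 48;
var motionCtx = motionCanvas.getContext('2d');
var lastFrame = null;

function checkForMotion(video) {
    motionCtx.drawImage(video, 0, 0, motionCanvas.width, motionCanvas.height);
    var frame = motionCtx.getImageData(0, 0, motionCanvas.width, motionCanvas.height).data;
    var changed = 0;

    if (lastFrame) {
        for (var i = 0; i < frame.length; i += 4) {     // step through RGBA pixels
            if (Math.abs(frame[i] - lastFrame[i]) > 30) // compare the red channel only
                changed++;
        }
    }
    lastFrame = frame;

    // motion if more than 5% of the pixels changed (threshold is arbitrary)
    return changed > (motionCanvas.width * motionCanvas.height * 0.05);
}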
Mobile alerts with Twilio MMS
I had made a few failed attempts at mobile WebRTC app development and I knew I did not have the skills to go down that path. Another option for alerting on mobile would be to use something like the IFTTT app and figure out how to hook into one of their channels. Then, when I got the notice that Twilio was now supporting MMS, I knew this would be a good fit for my project.
MMS has the advantage that it works out-of-the-box on pretty much any phone these days. Instead of just sending a notification, I could send a picture or short video to help verify that my dog was actually on the couch and not sitting in front of it.
Making this work was super simple. Full instructions for this are on Twilio’s site here. All I needed to do to send an MMS was this:
//require the Twilio module and create a REST client
var client = require('twilio')(process.env.twilioSid, process.env.twilioToken); //credentials stored in env vars
var twilioPhone = process.env.twilioPhone;

exports.mms = function(msg, to, url){
    client.messages.create({
        to: to,
        from: twilioPhone,
        body: msg,
        MediaUrl: url
    }, function (err, message) {
        if (err)
            debug(err);
        debug("MMS ID: " + message); //was message.sid
    });
    debug("sent MMS");
};
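If that snippet lives in its own module – I am assuming a file name of mms.js here, and the number and URL below are placeholders – calling it from the rest of the server is a one-liner:

var notify = require('./mms');   // the snippet above, assumed to live in mms.js
notify.mms("Levy is on the couch!", "+15555551234", "https://example.com/snapshot.png");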
I attempted to send a video recording with this but found it was temperamental. AT&T only allows messages smaller than 5MB, so it would take some effort to make sure my recordings stayed under that size. I found a simple photo worked just as well as an indicator.
The downside is that it usually takes at least several seconds – sometimes minutes – for the alert to come through. This rules out the ability to reliably catch my dog in the act, but it was good enough for my use case since most of the time I am not in a position to react within seconds anyway.
To grab a photo, I used this snippet from HTML5 Rocks:
//Take a still photo and send it up
//Inspiration: https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Taking_still_photos
function takePicture(sourceImg) {
    var canvas = document.createElement("canvas");

    //set picture to native size of video
    canvas.width = sourceImg.videoWidth;
    canvas.height = sourceImg.videoHeight;
    canvas.getContext('2d').drawImage(sourceImg, 0, 0, sourceImg.videoWidth, sourceImg.videoHeight);

    var data = canvas.toDataURL('image/png');
    var img = {
        type: 'image/png',
        dataURL: data
    };
    return (img);
}
Eventually I needed to figure out how to send this via a websocket and save it on my server, but I’ll get to that later.
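For a sense of scale, the client-side half of that send is only a couple of lines. Here is a rough sketch – the 'picture' event name and payload shape are assumptions, and socketio, room, and the localVideo element come from the code shown in later sections:

var img = takePicture(document.querySelector('#localVideo'));  // the local <video> element
socketio.emit('picture', { room: room, type: img.type, dataURL: img.dataURL });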
WebRTC video calling with SimpleWebRTC
Editing this blog for nearly 18 months has taught me enough to set up my own getUserMedia and createPeerConnection calls. Still, I would rather not deal with all that code or with setting up a signaling server if I didn’t have to. My calling needs are very basic, AppRTC-like functionality. In the past I have used EasyRTC, but to try something else out I used SimpleWebRTC this time.
They host a signaling server for demo usage, so I just hooked into that instead of adding theirs into my server. That meant I only had to worry about a small amount of client-side JavaScript:
/**
 * SimpleWebRTC setup
 */
$(document).ready( function() {
    var video = $('#localVideo')[0];    //use var so we only call jQuery selectors once
    var w = 1280,                       //max video width
        h = 720;                        //max video height

    var constraints = {
        audio: true,
        video: {
            //ask for HD but settle for 640x360
            mandatory: {
                minWidth: w * .5,
                minHeight: h * .5
            },
            optional: [
                {width: w},
                {height: h}
            ]
        }
    };

    webrtc = new SimpleWebRTC({
        localVideoEl: 'localVideo',     // the id/element dom element that will hold "our" video
        remoteVideosEl: 'remoteVideos', // the id/element dom element that will hold remote videos (not used)
        autoRequestMedia: true,         // immediately ask for camera access
        media: constraints
    });

    video.onloadeddata = function(){
        mediastream = webrtc.webrtc.localStreams[0];
        $('#startButton').show();       //let the user start when self-video is loaded
    };

    video.onloadedmetadata = function(){
        console.log("requested video size was: " + w + " x " + h );
        console.log("returned video size is: " + video.videoWidth + " x " + video.videoHeight);
    };
});
Then to start a call, all you need is a room name:
webrtc.joinRoom(room);
And then the receiving end can connect to the room with:
var webrtc = new SimpleWebRTC({
    localVideoEl: 'localVideo',
    remoteVideosEl: 'remoteVideos',
    media: {audio: true, video: false},
    autoRequestMedia: true
});

webrtc.on('readyToCall', function () {
    webrtc.joinRoom(room);
    console.log('Joined room ' + room);
});
Recording
Recording was one of the trickier aspects of the project. Muaz Khan has a bunch of recording options available under RecordRTC on webrtc-experiment.com, so I started there. It is rare that Firefox is ahead of Chrome, but I discovered Firefox does a better job of recording than Chrome. Firefox will put both the audio and video in a single WebM file. Chrome handles these separately. Muaz built a way to merge these with ffmpeg, but I was fine with using Firefox to act as the recorder for my project.
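Roughly speaking, that kind of merge can be done with a single ffmpeg call run from node. Here is a sketch – the file names and codec choice are placeholders, not Muaz’s exact command:

var exec = require('child_process').exec;

// combine a video-only WebM and a separate audio file into one WebM container
exec('ffmpeg -i video.webm -i audio.wav -c:v copy -c:a libvorbis merged.webm',
    function (err, stdout, stderr) {
        if (err) return console.log('ffmpeg merge failed: ' + err);
        console.log('wrote merged.webm');
    });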
I ended up modifying RecordRTC-over-SocketIO for my purposes. I started by making a simple self-recording app from RecordRTC that you can see here.
Here is what I ended up using to make the recording on the client side:
socketio.on('broadcast', function(message){
    console.log("broadcast message: " + JSON.stringify(message));

    if (message.command == "record"){
        window.recordRTC = RecordRTC(mediastream);
        recordRTC.startRecording();
    }

    if (message.command == "stop"){
        recordRTC.stopRecording(function () {
            recordRTC.getDataURL(function (audioVideoWebMURL) {
                var data = {
                    audio: {    //audio & video on Firefox, just audio on Chrome
                        type: recordRTC.getBlob().type || 'audio/wav',
                        dataURL: audioVideoWebMURL
                    },
                    room: room
                };
                socketio.emit('video', data);
                console.log("file type is " + recordRTC.getBlob().type);
            });
        });
    }
});
Then I used this, modified from Muaz’s examples, on the node.js server:
app.io.route('video', function(req){
    var vidId = shortId.generate();     //get a short, unique ID for the filename

    //To do: write the filename based on the data.audio.type
    var filetype = 'webm';
    fileName = vidId + '.' + filetype;

    writeToDisk(req.data.audio.dataURL, fileName);

    // if it is chrome
    if (req.data.video) {
        writeToDisk(req.data.video.dataURL, fileName);
        //merge(socket, fileName);  //replace this if you want to use audio+video on Chrome using ffmpeg
    }
});

function writeToDisk(dataURL, fileName) {
    debug("About to write " + fileName);
    var fileExtension = fileName.split('.').pop(),
        fileRootNameWithBase = './uploads/' + fileName,
        filePath = fileRootNameWithBase,
        fileID = 2,
        fileBuffer;

    //increment file name if it already exists
    while (fs.existsSync(filePath)) {
        filePath = fileRootNameWithBase.split('.')[0] + '(' + fileID + ').' + fileExtension;
        fileID += 1;
    }

    dataURL = dataURL.split(',').pop();
    fileBuffer = new Buffer(dataURL, 'base64');
    fs.writeFileSync(filePath, fileBuffer);
    debug('filePath:' + filePath);
}
In retrospect it would probably have been easier to use a media server (disclosure: I work for a media server company, so I kind of already knew that). It is possible to record client side and use tools like ffmpeg to modify the media, but sending the video up in a timely manner takes some effort and hogs a lot of bandwidth, hurting the quality of the video you are sending. Running ffmpeg also requires a lot more access to the server platform, and it does not work in most node.js PaaS offerings (like nodejitsu, which I used) since it requires its own installation beyond just node.
Lastly, it would have been nice to set up some kind of rolling buffer to start a recording of Levy a few seconds before he jumped up on the couch, but this was more logic than I really needed. Instead I opted for an on-demand recording ability without any automation.
I could have added video-only support for Chrome easily with the RecordRTC library, but this was not critical, so I kept it out for now to avoid adding other unintended complications.
Tessel Controlled Elenco Rover
As I mentioned earlier, my Tessel relay setup only allowed the rover to go forward. This could be limiting if Levy knocked it off to the side, so I needed a way to get more control. Here is how I set up the Rover with the Tessel.
Hardware setup
The Rover has 6 inputs on the back:
- orange (battery +)
- gray (battery -)
- white (right forward)
- yellow (right backward)
- green (left forward)
- blue (left backward)
So how to get forward, backward, and turning controls? My son’s model only came with an RF receiver module that did this, but there was nothing on that module to hook the Tessel into. To remedy this I could have built a complex circuit. Luckily, it turns out Elenco has a simpler U8 module that works perfectly for hooking into the Tessel’s GPIO, as shown below:
All the Elenco parts are color coded, so the only part I needed to pay close attention to was the pin assignments coming from the U8 to the Tessel GPIO. Here is a quick reference:
JS Object       | Label | Pin | U8 wire
gpio.digital[0] | G1    | 15  | green
gpio.digital[1] | G2    | 17  | red
gpio.digital[2] | G3    | 19  | white
gpio.digital[3] | G4    | 20  | yellow
(none)          | GND   | 1   | black
It is possible to power the Tessel from the Rover’s 9V power supply, but that involves soldering, so I used a cheap USB charger I needed anyway. Also, don’t forget to provide a common ground by hooking the black GND wire to the negative terminal on U8.
Software setup
Here is what the Rover JavaScript code looks like:
/**
 * REST control of the Tessel GPIO
 * For use with an Elenco Rover
 * Created by chad on 8/30/2014.
 */
var router = require('tiny-router'),
    tessel = require('tessel'),
    gpio = tessel.port['GPIO'];

var lights = {
    green: tessel.led[0],
    blue: tessel.led[1],
    red: tessel.led[2],
    amber: tessel.led[3]
};

//setup the gpio pins
var gpo = {
    rf: gpio.digital[0],    //RF : G1 pin 15, green wire
    rb: gpio.digital[1],    //RB : G2 pin 17, red wire
    lf: gpio.digital[2],    //LF : G3 pin 19, white wire
    lb: gpio.digital[3]     //LB : G4 pin 20, yellow wire
};

router
    .get('/', function(req, res) {
        res.send(
            '<h1>Simple tessel REST API for Elenco Rover</h1>' +
            '<p><h3>Commands</h3></p>' +
            '<ul>' +
            '<li><a href="/forward">/forward</a> - move forward for 1 second</li>' +
            '<li><a href="/backward">/backward</a> - move backward for 1 second</li>' +
            '<li><a href="/spinright">/spinright</a> - spin to the right for 1 second</li>' +
            '<li><a href="/spinleft">/spinleft</a> - spin to the left for 1 second</li>' +
            '<li>/rf/t - right forward for t seconds</li>' +
            '<li>/rb/t - right backward for t seconds</li>' +
            '<li>/lf/t - left forward for t seconds</li>' +
            '<li>/lb/t - left backward for t seconds</li>' +
            '</ul>');
        console.log("home");
    })
    .get('/forward/{t}', function(req, res){
        var t = parseInt(req.body.t);
        if (isNaN(t)) t = 1;
        res.send("Moving forward");
        console.log("Toggling rf & lf for " + t + " seconds");
        gpo.rf.output(1);
        gpo.lf.output(1);
        lights.blue.write(1);
        setTimeout(function(){
            gpo.rf.output(0);
            gpo.lf.output(0);
            lights.blue.write(0);
        }, t * 1000);
    })
    .get('/backward/{t}', function(req, res){
        var t = parseInt(req.body.t);
        if (isNaN(t)) t = 1;
        res.send("Moving backward");
        console.log("Toggling rb & lb for " + t + " seconds");
        gpo.rb.output(1);
        gpo.lb.output(1);
        lights.blue.write(1);
        setTimeout(function(){
            gpo.rb.output(0);
            gpo.lb.output(0);
            lights.blue.write(0);
        }, t * 1000);
    })
    .get('/spinright/{t}', function(req, res){
        var t = parseInt(req.body.t);
        if (isNaN(t)) t = 1;
        res.send("Spinning right");
        console.log("Toggling lf & rb for " + t + " seconds");
        gpo.lf.output(1);
        gpo.rb.output(1);
        lights.blue.write(1);
        setTimeout(function(){
            gpo.lf.output(0);
            gpo.rb.output(0);
            lights.blue.write(0);
        }, t * 1000);
    })
    .get('/spinleft/{t}', function(req, res){
        var t = parseInt(req.body.t);
        if (isNaN(t)) t = 1;
        res.send("Spinning left");
        console.log("Toggling rf & lb for " + t + " seconds");
        gpo.rf.output(1);
        gpo.lb.output(1);
        lights.blue.write(1);
        setTimeout(function(){
            gpo.rf.output(0);
            gpo.lb.output(0);
            lights.blue.write(0);
        }, t * 1000);
    })
    .get('/rf/{t}', function(req, res){
        var t = parseInt(req.body.t);
        if (isNaN(t)) t = 1;
        res.send("Toggling G1 for " + t);
        console.log("Toggling G1 on for " + t + " seconds");
        gpo.rf.output(1);
        lights.blue.write(1);
        setTimeout(function(){
            gpo.rf.output(0);
            lights.blue.write(0);
        }, t * 1000);
    })
    .get('/rb/{t}', function(req, res){
        var t = parseInt(req.body.t);
        if (isNaN(t)) t = 1;
        res.send("Toggling G2 for " + t);
        console.log("Toggling G2 on for " + t + " seconds");
        gpo.rb.output(1);
        lights.blue.write(1);
        setTimeout(function(){
            gpo.rb.output(0);
            lights.blue.write(0);
        }, t * 1000);
    })
    .get('/lf/{t}', function(req, res){
        var t = parseInt(req.body.t);
        if (isNaN(t)) t = 1;
        res.send("Toggling G3 for " + t);
        console.log("Toggling G3 on for " + t + " seconds");
        gpo.lf.output(1);
        lights.blue.write(1);
        setTimeout(function(){
            gpo.lf.output(0);
            lights.blue.write(0);
        }, t * 1000);
    })
    .get('/lb/{t}', function(req, res){
        var t = parseInt(req.body.t);
        if (isNaN(t)) t = 1;
        res.send("Toggling G4 for " + t);
        console.log("Toggling G4 on for " + t + " seconds");
        gpo.lb.output(1);
        lights.blue.write(1);
        setTimeout(function(){
            gpo.lb.output(0);
            lights.blue.write(0);
        }, t * 1000);
    });

function start(){
    //turn the GPIO off
    gpo.rf.output(0);
    gpo.rb.output(0);
    gpo.lf.output(0);
    gpo.lb.output(0);

    //Delay 10 seconds to give WiFi time to start
    //To do: Look for wifi acquired event
    setTimeout(function(){
        router.listen(8080);
        console.log("listening on port 8080");
        lights.green.write(1);
    }, 10000);
}

start();
You can also find the code on GitHub here.
Basically I set up a bunch of tiny-router routes to fire off the various combinations of GPIO pins for a specified amount of time using setTimeout. Each route is just a different combination corresponding to forward, reverse, spin right, or spin left. It could easily be written more elegantly in about half the lines, but it got the job done in no time.
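For example, a small helper reusing the gpo and lights objects above could collapse the repeated handlers into something like this – an untested sketch, not what is in the repo:

// build a route handler that toggles a set of pins for t seconds
function moveRoute(pins, label) {
    return function (req, res) {
        var t = parseInt(req.body.t, 10);
        if (isNaN(t)) t = 1;
        res.send(label);
        pins.forEach(function (pin) { pin.output(1); });
        lights.blue.write(1);
        setTimeout(function () {
            pins.forEach(function (pin) { pin.output(0); });
            lights.blue.write(0);
        }, t * 1000);
    };
}

router
    .get('/forward/{t}',   moveRoute([gpo.rf, gpo.lf], 'Moving forward'))
    .get('/backward/{t}',  moveRoute([gpo.rb, gpo.lb], 'Moving backward'))
    .get('/spinright/{t}', moveRoute([gpo.lf, gpo.rb], 'Spinning right'))
    .get('/spinleft/{t}',  moveRoute([gpo.rf, gpo.lb], 'Spinning left'));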
I want to clean this up and make a more responsive, WebSocket-based setup eventually, but that will be another project.
Putting it all together
Now that I had all of the above components working independently, it was time to put them all together. There is way too much code to include snippets on everything here, so please refer to the webrtcDogTrainer repository on github if you want to see everything.
Choosing a host
To give access to the video feed remotely I needed to host this on a public server somewhere, so I decided to use nodejitsu as my host, where I already had an account set up from some earlier experiments. Hosting on nodejitsu is cheap – $9/month for a micro package – and deploying was super easy: just run “jitsu deploy” from a shell in your root node.js directory and it does the rest for you.
Setting up node.js
I started with a simple node.js, express, and socket.io template. Socket.io is required by RecordRTC, and if I ever wanted to run any of the server-side SimpleWebRTC pieces myself, I would also need it there.
I soon realized this was not remotely close to running the latest version of Express or socket.io, so I updated them. That was a huge mistake. I spent countless hours trying to get Express 4.x (using the express generator) and Socket.io 1.0 to work together with sessions so I could keep track of returning users, with no luck. This part of the project was not very fun and left me unimpressed with the node.js experience.
Eventually I gave up on using the latest versions of express and socket.io. I ended up using express.io, which is basically a package that lets you handle socket.io messages the same way you do express routes (it bundles express 3.4 and socket.io 0.9). As I will show soon, this helped to keep my routing clean.
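For context, a minimal express.io skeleton looks roughly like this – the page name, port, and event names are placeholders, not my actual routes:

var express = require('express.io');
var app = express();
app.http().io();                        // set up both the HTTP server and socket.io

app.set('view engine', 'jade');         // jade templates for the monitor and remote pages

app.get('/monitor', function (req, res) {
    res.render('monitor');              // views/monitor.jade (name is an assumption)
});

app.io.route('ready', function (req) {  // socket.io events handled like routes
    req.io.emit('talk', { message: 'io event from the server' });
});

app.listen(3000);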
I also dumped session support after I realized I did not really need it at all for my personal use.
Architecture
Now I was back on track and had to do some thinking on my application logic and how to make all these pieces tie together. It took me many, many iterations, but this is what I ultimately came up with for a flow:
I render the monitor and remote pages using jade and provide corresponding JavaScript files for each from my public/javascripts directory.
Socket.io’s room broadcast capabilities on node.js send messages back and forth between the monitor and remote. These messages in turn trigger various actions locally.
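On the server, that relay is only a couple of express.io routes. Here is a sketch – the 'join' and 'command' route names and the payload shape are assumptions; only the 'broadcast' event matches the client code shown earlier:

app.io.route('join', function (req) {
    req.io.join(req.data.room);         // both monitor and remote join the same room
});

app.io.route('command', function (req) {
    // e.g. {room: 'levy', command: 'record'} sent from the remote page
    req.io.room(req.data.room).broadcast('broadcast', req.data);
});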
Twilio requires a URL to grab the image/multimedia for the MMS, so I needed to create a route that would allow access to that. I also needed the same mechanism to serve the recorded video when it was requested.
The Tessel is set to a fixed IP address on my local network, so I needed the monitor to call the Tessel REST commands since that was the only other local element. I kept the functionality very simple and just allowed the remote user to enter the REST command as a string. It is then broadcast through the node.js socket.io server to the monitor, which then GETs the URL. After I first set everything up I realized this would allow anyone to call any URL they wanted from the monitor. I did not show it in the diagram above, but to improve security I only allow this functionality if a predefined string is entered after the host URL. I set up alternative jade files to request additional JavaScript files for the Tessel REST proxy. This is still not all that secure, but was quick and better than nothing.
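The monitor-side proxy logic boils down to something like this sketch – the event shape, the secret string, and the Tessel’s LAN address are all assumptions for illustration:

var tesselBase = 'http://192.168.1.99:8080';     // hypothetical fixed LAN IP of the Tessel
var secret = '/levy';                            // predefined string that must prefix the command

socketio.on('broadcast', function (message) {
    if (message.command === 'tessel' && message.path.indexOf(secret) === 0) {
        // strip the secret and GET the real Tessel route, e.g. /forward/2
        $.get(tesselBase + message.path.slice(secret.length))
            .fail(function () { console.log('Tessel request failed'); });
    }
});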
Does it work?
The other day I set up the monitor prior to leaving on a family trip for the afternoon. Before I got down the street my phone buzzed, so I pulled over to see this:
So I clicked on the link to see him just starting to get cozy. From there I pressed “record”, entered my Tessel REST command, and here is what happened:
Worked like a charm!
We’re hoping his Pavlovian response eventually kicks in at the sight of the rover so we get fewer alerts.
Note for the PETA members out there: no bulldogs were hurt during the making of this experiment. We actually felt a little bad after experimenting on him, so we bought him a doggy bed heater for his dog bed too. Despite these additional comforts, he still tries to jump on the couch occasionally.
What’s next
I proved you can use JavaScript to train your dog, but this is still a very limited prototype. There is a lot more work to do to make this a beta-ready app. I’m probably not going to do most (or any) of it, but here is my wish list:
- Add Chrome support
- Make the monitor and remote pages look respectable with a front-end framework (I started using bootstrap for this)
- Make the remote app responsive so it works better on mobile
- Host the SimpleWebRTC server on node.js instead of remotely
- Automatically record the initial motion event and give an option to play that back when you load the remote page
- Better support for non-WebRTC browsers – stream the video with a media server and/or use a normal phone to call in for audio
- Use sessions to allow persistence across refreshes
- Use a media server for higher quality recordings
- Add HTTPS to simplify getUserMedia access permissions
- Store the files somewhere for longer-term access
- Provide some usage controls/limits on the server
- Use security tokens or another authentication mechanism to allow access to the Tessel
- Reconcile all the socket.io messaging into a common framework – perhaps just use SimpleWebRTC to do this
- Allow peer-to-peer communication of messages using the DataChannel
- Tsahi also recommended adding in some report logs to see if the training is working over time
Have other ideas? Please feel free to fork the webrtcDogTrainer and have a go at it and show me what I can do better!
Robert Welbourn says
I wonder whether I could use a similar approach to get my son to do his homework?
Anirban Dutta says
That’s a good one ! 😀 May be instead of using a rolling stick, you can help him with homework using RTC. 😛