An Interview With Sidecar

Machine Learning that predicts San Francisco brunch.

Posted by Vaibhav Mallya on March 5, 2014
Sidecar was the first ride-sharing app - drivers with spare cars, passengers with transit needs, and a prediction model to match them. This week, I chatted with the engineers about their stack and their challenges in growing to a dozen cities (and beyond). Also: brunch.
Jahan Khanna - Founder, CTO. University of Michigan grad, guitar player. Loves Outside Lands.

Jahan Khanna

How did Sidecar start?
I was at a party in the Marina. It was late and I was stuck - no taxis around. I paid a pizza guy $20 to take me home. That's how. Got hacking on it soon after.
Where did your stack start off? How has it evolved over time?
We've been on AWS this whole time - EC2 and S3 from the start. The core is a Node.js state machine that manages the state of the drivers, passengers, and their respective payments. We added RDS and Redshift later. But overall, it's close to what it was at the start. Whatever keeps things running is key - we all use Sidecar all the time too, as drivers and passengers.
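For readers who want a concrete picture, here's a minimal sketch of what a ride-lifecycle state machine can look like in TypeScript. The states, transitions, and names are my own illustration, not Sidecar's actual code:

```typescript
// Hypothetical ride lifecycle; the real states and transitions are assumptions.
type RideState = "requested" | "accepted" | "picked_up" | "completed" | "paid" | "cancelled";

const transitions: Record<RideState, RideState[]> = {
  requested: ["accepted", "cancelled"],
  accepted: ["picked_up", "cancelled"],
  picked_up: ["completed"],
  completed: ["paid"],
  paid: [],
  cancelled: [],
};

class Ride {
  constructor(public id: string, public state: RideState = "requested") {}

  advance(next: RideState): void {
    if (!transitions[this.state].includes(next)) {
      throw new Error(`Illegal transition ${this.state} -> ${next} for ride ${this.id}`);
    }
    this.state = next; // in production this would be a durable, transactional write
  }
}

// Usage: a ride moves through the lifecycle one legal step at a time.
const ride = new Ride("ride-123");
ride.advance("accepted");
ride.advance("picked_up");
```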
Speaking of dogfooding - it's a huge problem for location-dependent apps in general. How can you keep the customer experience consistent in cities you're not living in?
Good question. We've gotten a pretty solid process down. We can spin up an instance of our stack, or allocate resources for it, for each city pretty quickly. The dogfooding - well, we have dedicated city managers, but we also all make an effort to travel. Basically, we invest a ton of time and energy in it. And we need to, since we want to be in every city in the world.
Rob Moran - Algorithms and ML engineer, University of Michigan grad. Likes frisbee. Does a good Zoolander impression.

Rob Moran

So how do you model geographic supply and demand?
We experimented with a few different algorithms. We settled on a Kalman filter, along with some other algorithms for finer levels of granularity. Reasonable set of tradeoffs for our use case.
A universal Kalman Filter?
No, we tune it a bit for each city. We examine the graphs to make sure things make sense, iterate, and look for patterns. Los Angeles, for example, has a much more active nightlife on average. But you can't beat brunch in San Francisco.
So it models brunch?
Yeah. Our ML literally models San Francisco brunch. Demand graphs spike and recede and spike again as people head to brunch in Sidecars, wait in line, then return in Sidecars. Same pattern for bars. Really cool.
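For the curious, here's roughly what a per-city demand tracker built on a scalar Kalman filter can look like. This is a minimal sketch under a simple random-walk demand model; the tuning knobs and the numbers are invented for illustration, not Sidecar's:

```typescript
// Minimal scalar Kalman filter for a noisy demand signal (ride requests per interval).
// Model: demand_t = demand_{t-1} + process noise;  observation_t = demand_t + measurement noise.
class ScalarKalman {
  constructor(
    private estimate: number,       // current demand estimate
    private variance: number,       // uncertainty of that estimate
    private processVar: number,     // how fast true demand drifts (tuned per city)
    private measurementVar: number  // how noisy each observed count is
  ) {}

  update(observedDemand: number): number {
    // Predict: demand is assumed to drift, so uncertainty grows.
    this.variance += this.processVar;

    // Update: blend the prediction with the new observation.
    const gain = this.variance / (this.variance + this.measurementVar);
    this.estimate += gain * (observedDemand - this.estimate);
    this.variance *= 1 - gain;
    return this.estimate;
  }
}

// Hypothetical tuning: a brunch-heavy city gets a higher process variance on weekend
// mornings so the filter can follow the spike-wait-spike pattern quickly.
const sfBrunchDemand = new ScalarKalman(50, 10, 4, 25);
console.log([48, 55, 90, 120, 130].map(obs => sfBrunchDemand.update(obs).toFixed(1)));
```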
How do you measure your accuracy?
We track driver utilization, and passengers who request a ride but don't get one. What's interesting is that for forecasting, it helps a lot to understand each city's culture. Like, we all love music festivals here - Outside Lands, Treasure Island - and we sort of have to, so we know why demand is spiking in a location and whether we anticipated it properly. Lots to do, though. We need to cut false negatives. We might even use another set of algorithms down the road.
Which one? Deep learning is pretty hot.
Keeping that under wraps for now. [Rob grins]
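To make the accuracy discussion concrete, here's one way the two signals Rob mentions - driver utilization, and unfulfilled requests the forecast failed to flag (false negatives) - could be computed. The data shapes are my assumptions, not Sidecar's schema:

```typescript
// Hypothetical log-entry shapes for illustration only.
interface RideRequest {
  fulfilled: boolean;          // did a driver actually take the ride?
  predictedShortage: boolean;  // did the forecast flag this time/zone as under-supplied?
}

interface DriverShift {
  minutesOnline: number;
  minutesWithPassenger: number;
}

// Driver utilization: share of online time spent carrying a passenger.
function utilization(shifts: DriverShift[]): number {
  const online = shifts.reduce((sum, d) => sum + d.minutesOnline, 0);
  const busy = shifts.reduce((sum, d) => sum + d.minutesWithPassenger, 0);
  return online === 0 ? 0 : busy / online;
}

// False negatives: unfulfilled requests the model did NOT anticipate as a shortage.
function falseNegativeRate(requests: RideRequest[]): number {
  const misses = requests.filter(r => !r.fulfilled);
  if (misses.length === 0) return 0;
  const unanticipated = misses.filter(r => !r.predictedShortage);
  return unanticipated.length / misses.length;
}
```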
Tommy Gellatly - Tech Lead for mobile and infrastructure teams. Likes a good Cuban Mojito.

Tommy Gellatly

Obligatory question: Why AWS?
Obligatory answer: Simplest, fastest, most scalable. EC2, S3, SQS, RDS, EBS. Standard. But honestly, the stack is just one part of a larger picture. We're relatively low-volume for the dollar amount of transactions we do. In theory we could run chunks of the backend off our MacBooks.
So where are the engineering challenges in keeping this up? Your overall setup seems pretty straightforward.
Redundancy and durability of data are big. Everything is transactional - we're dealing with money. People rely on us. Not to mention that latency is important. Take too long on a request and more people will drop off. So we have to optimize our Android and iOS apps to be efficient with respect to their network requests.
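Since retries on a slow mobile network can easily double-submit a money-moving request, a common pattern here is an idempotency key supplied by the client. A minimal sketch follows; the function and in-memory store are stand-ins, not Sidecar's payment code:

```typescript
// Stand-in for a durable store keyed by a client-generated idempotency key.
const processedCharges = new Map<string, { amountCents: number }>();

async function chargeOnce(idempotencyKey: string, amountCents: number) {
  // Mobile clients retry on timeouts, so the same charge may arrive more than once.
  const existing = processedCharges.get(idempotencyKey);
  if (existing) return existing; // replay: return the original result, charge nothing new

  const result = { amountCents };               // placeholder for the real charge call
  processedCharges.set(idempotencyKey, result); // in production: a transactional DB write
  return result;
}
```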
AWS had a few issues last year. US-EAST took hits. How did you handle that?
The short answer is that it was painful at first... We were totally reliant on US-EAST. After it happened again and hurt the customer experience, we ended up doing some amount of cross-region replication and backups to help stop the bleeding when it happened and let us recover more quickly.
But it sounds like things weren't bad enough to move off? Even though RDS didn't support cross-region replication until recently?
Everything else we've looked at seems worse. So no, it didn't make sense then, and it still doesn't. The switching cost wasn't even the issue - the other solutions we've investigated just aren't much better. Node on EC2 can take you a surprisingly long way.
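For reference, the kind of cross-region backup Tommy describes can be scripted against the RDS API. This sketch uses today's AWS SDK for JavaScript (v3), which postdates this interview; the account ID, snapshot names, and regions are placeholders:

```typescript
// Illustrative cross-region copy of an RDS snapshot out of US-EAST.
import { RDSClient, CopyDBSnapshotCommand } from "@aws-sdk/client-rds";

async function replicateSnapshot(): Promise<void> {
  // Client pointed at the *target* region, so the copy lands outside US-EAST.
  const rds = new RDSClient({ region: "us-west-2" });

  await rds.send(new CopyDBSnapshotCommand({
    // Cross-region copies reference the source snapshot by its ARN.
    SourceDBSnapshotIdentifier: "arn:aws:rds:us-east-1:123456789012:snapshot:nightly-backup",
    TargetDBSnapshotIdentifier: "nightly-backup-uswest2",
    SourceRegion: "us-east-1",
  }));
}

replicateSnapshot().catch(console.error);
```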