Uber

Use Case

driver reports location periodically
rider requests an Uber and is matched with a set of neighboring drivers
driver accepts a request
driver / rider cancels a matched request
driver picks up / drops off a rider

Assumption

Driver reports every 4s
Active drivers = 200K
QPS = 200K / 4 = 50K
Peak = 5 * 50K = 250K
Storage: record current location in memory = 200K * 100 bytes = 20 MB
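
The estimate above can be checked with a few lines of arithmetic. This is only a sketch: the constant names are made up for illustration, and the peak factor of 5 and the 100 bytes per driver are the assumptions stated in the notes.

```python
# Back-of-the-envelope capacity check using the assumptions from the notes.
ACTIVE_DRIVERS = 200_000      # active drivers
REPORT_INTERVAL_S = 4         # each driver reports every 4 seconds
PEAK_FACTOR = 5               # assumed peak-to-average ratio
BYTES_PER_LOCATION = 100      # assumed size of one in-memory location record

avg_qps = ACTIVE_DRIVERS // REPORT_INTERVAL_S
peak_qps = PEAK_FACTOR * avg_qps
memory_bytes = ACTIVE_DRIVERS * BYTES_PER_LOCATION

print(f"average QPS: {avg_qps}")
print(f"peak QPS:    {peak_qps}")
print(f"memory:      {memory_bytes / 1_000_000:.0f} MB")
```

At about 20 MB, the current locations of all active drivers fit comfortably in the memory of a single machine, so throughput rather than storage is the main constraint.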

High level

Data Schema

Location: Redis, hierarchical (one entry per geohash level)
    geohash
    Set<driver_id>
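
A minimal sketch of how a position could map to hierarchical keys, assuming the standard base-32 geohash encoding. The function names and the choice of levels 4, 5 and 6 are hypothetical; the notes do not say which precisions are used.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet

def geohash_encode(lat, lng, precision):
    """Encode a position as a geohash string with `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lng = True  # bits are interleaved, starting with longitude
    while len(chars) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits = (bits << 1) | 1
                lng_lo = mid
            else:
                bits <<= 1
                lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

LEVELS = (4, 5, 6)  # assumed prefix lengths, coarse to fine

def location_keys(lat, lng):
    """Return one geohash key per level; each is a prefix of the next."""
    full = geohash_encode(lat, lng, max(LEVELS))
    return [full[:level] for level in LEVELS]
```

Because a shorter geohash is a prefix of the longer one for the same point, a search can start at the finest level and fall back to a coarser key when too few drivers are found.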

Driver: SQL
    driver_id
    status
    lng
    lat
    update_time

Trip: SQL
    trip_id
    driver_id
    user_id
    status

Business Logic

Driver report location

  1. read driverDB to get the old geohash (derived from the stored lng / lat)
  2. update driverDB
  3. update locationDB: at each level, move driver_id from the old slot to the new slot
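
The three steps above can be sketched with in-memory stand-ins. Here `driver_db` and `location_db` are hypothetical dictionaries in place of the SQL table and Redis, the driver's position is assumed to be already encoded as a full-precision geohash, and the levels 4, 5 and 6 are assumptions.

```python
from collections import defaultdict

LEVELS = (4, 5, 6)               # assumed geohash prefix lengths

driver_db = {}                   # driver_id -> latest geohash (stand-in for driverDB)
location_db = defaultdict(set)   # geohash prefix -> driver_ids (stand-in for Redis sets)

def report_location(driver_id, new_hash):
    old_hash = driver_db.get(driver_id)   # step 1: read the old geohash
    driver_db[driver_id] = new_hash       # step 2: update driverDB
    for level in LEVELS:                  # step 3: move between slots per level
        new_slot = new_hash[:level]
        old_slot = old_hash[:level] if old_hash else None
        if old_slot == new_slot:
            continue                      # same cell at this level, nothing to move
        if old_slot is not None:
            location_db[old_slot].discard(driver_id)
            if not location_db[old_slot]:
                del location_db[old_slot]  # drop empty slots
        location_db[new_slot].add(driver_id)

report_location("d1", "9q8yyk")
report_location("d1", "9q8yym")  # same level-4 and level-5 cell, new level-6 cell
```

Most reports leave the coarse levels untouched, so only the finest slots see frequent writes. With Redis, the move in step 3 would typically be an SREM followed by an SADD, or a single SMOVE.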

Match a user’s request

  1. insert into tripDB
  2. compute the geohash and find neighboring drivers
  3. read status of drivers from driverDB, send request to available drivers
  4. available drivers accept / cancel the trip (handle the data race when several drivers respond at once)
  5. user periodically queries tripDB. If no driver is matched, send a new request
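
One common way to handle the data race in step 4 is a conditional update: the first driver to accept flips the trip status, and later attempts change no rows. The sketch below uses SQLite only so that it runs on its own; the status values 'waiting' and 'matched' and the function name are assumptions, not part of the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trip ("
    "trip_id INTEGER PRIMARY KEY, driver_id INTEGER, user_id INTEGER, status TEXT)"
)
conn.execute("INSERT INTO trip (trip_id, user_id, status) VALUES (1, 42, 'waiting')")
conn.commit()

def accept_trip(conn, trip_id, driver_id):
    """Return True only for the driver whose accept actually claimed the trip."""
    cur = conn.execute(
        "UPDATE trip SET driver_id = ?, status = 'matched' "
        "WHERE trip_id = ? AND status = 'waiting'",  # only succeeds while unclaimed
        (driver_id, trip_id),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows changed means another driver won

first = accept_trip(conn, 1, 7)
second = accept_trip(conn, 1, 8)
```

The check and the write happen in one statement, so there is no window between reading the status and updating it.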

Scale

Sharding by city: add a city field to tripDB and driverDB. If sharding by user_id or driver_id, a match would have to query many shards, so response time (RT) would be longer
Add a Push Service Cluster so drivers can report location faster
Uber Pool: add a seat field to tripDB, and change the user_id field to a list of user_ids
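
To illustrate the sharding point, the sketch below compares how many shards a single match request touches under each scheme. The shard count, the CRC32-based routing and the function names are all hypothetical.

```python
import zlib

NUM_SHARDS = 8  # assumed shard count

def shard_for_city(city):
    """Stable routing: the same city always maps to the same shard."""
    return zlib.crc32(city.encode("utf-8")) % NUM_SHARDS

def shards_for_match_by_city(city):
    # All drivers and trips of a city live together, so one shard is enough.
    return {shard_for_city(city)}

def shards_for_match_by_driver_id(candidate_driver_ids):
    # Nearby drivers have unrelated ids, so their rows are spread across shards.
    return {driver_id % NUM_SHARDS for driver_id in candidate_driver_ids}

by_city = shards_for_match_by_city("san_francisco")
by_driver = shards_for_match_by_driver_id(range(100, 108))
```

With city sharding the lookup stays on one shard, while the same eight candidate drivers sharded by id land on all eight shards.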