Table of Contents |
---|
...
Reference_description_with_linked_URLs_______________________ | Notes______________________________________________________________ |
---|---|
gq>> Uber system design - incentives, logistics | |
gq>> uber technical whitepaper queries | |
Key Concepts
Uber elevate - Uber flight concepts white paper
...
Uber-paper_ Real-time Data Infrastructure at Uber – Distributed Computing Musings.pdf. file
Uber may seem simple at first glance, but it hides a great deal of complexity in order to provide a good user experience. Achieving this requires processing large volumes of data in real time and making decisions based on that data. Time is of the essence in these decisions, because they affect customers who are using the application at that very moment. Use cases such as fraud detection and surge pricing require processing petabytes of data in a scalable way. Beyond the processing itself, the system needs to be extensible to accommodate future use cases.
This data comprises both client-side events and system logs from microservices operating within the Uber application. Real-time data is also generated from the change-logs of production databases where live transactions are being processed. Processing this data covers a large set of use cases, which fall at a high level into three categories:
- Messaging platform
- Stream processing
- Online analytical processing
Data challenges
- FACTUR3DT.IO for big data, multiple sources, formats,
- Multiple use cases - detailed data models for ML/AI, event streams, smart state change management, version management of schema, function changes, smart queries, extensible models
- Logical models and flows can simplify complex analytics to simple queries
- Need to version-manage schemas and functions as data changes
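
The schema versioning point above can be illustrated with a minimal sketch. It assumes each event carries a `schema_version` field and that older events are upgraded step by step to the latest version before they are queried. The field names (`fare`, `fare_cents`, `currency`) and the upgrade functions are hypothetical, chosen only for illustration; they do not come from the Uber paper.

```python
# Hypothetical upgrade steps: each takes an event at version N
# and returns a new event at version N + 1.
def v1_to_v2(event):
    out = dict(event)                                  # copy, leave input untouched
    out["fare_cents"] = int(round(out.pop("fare") * 100))
    out["schema_version"] = 2
    return out


def v2_to_v3(event):
    out = dict(event)
    out.setdefault("currency", "USD")                  # assumed default, illustration only
    out["schema_version"] = 3
    return out


UPGRADES = {1: v1_to_v2, 2: v2_to_v3}
LATEST_VERSION = 3


def upgrade(event):
    """Apply upgrade steps in order until the event is at the latest version."""
    while event["schema_version"] < LATEST_VERSION:
        event = UPGRADES[event["schema_version"]](event)
    return event


old_event = {"schema_version": 1, "trip_id": 7, "fare": 12.5}
new_event = upgrade(old_event)
```

Keeping each upgrade as a small versioned function means queries only need to understand the latest schema, while older data stays readable.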
...
Kafka is key to integrating data processing and events as transaction messages
At Uber, Kafka is responsible for transferring streaming data to both batch and real-time processing systems. Use cases range from sending events from the driver and rider apps to the underlying analytics platform, to streaming database change-logs to subscribers that perform computation based on these events.
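
The fan-out described above, where one stream feeds both batch and real-time consumers, can be sketched with a toy in-memory log. This is not Kafka and does not use any Kafka client API; `InMemoryTopic`, the group names, and the event fields are hypothetical stand-ins meant only to show that each consumer group tracks its own read offset over the same ordered log.

```python
from collections import defaultdict


class InMemoryTopic:
    """Toy append-only log standing in for a single topic partition."""

    def __init__(self):
        self._log = []                       # events in arrival order
        self._offsets = defaultdict(int)     # next offset to read, per consumer group

    def publish(self, event):
        self._log.append(event)
        return len(self._log) - 1            # offset of the appended event

    def poll(self, group, max_events=10):
        """Return the next events for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._log[start:start + max_events]
        self._offsets[group] = start + len(batch)
        return batch


topic = InMemoryTopic()
topic.publish({"source": "rider_app", "type": "trip_requested", "trip_id": 1})
topic.publish({"source": "db_changelog", "type": "row_updated", "trip_id": 1})
topic.publish({"source": "driver_app", "type": "trip_accepted", "trip_id": 1})

# Two independent consumer groups read the same log at their own pace.
realtime_first = topic.poll("realtime", max_events=2)
batch_all = topic.poll("batch", max_events=10)
realtime_rest = topic.poll("realtime", max_events=10)
```

Because offsets are kept per group, the batch consumer sees all three events even though the real-time consumer has already read some of them.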
Potential Value Opportunities
...