Google Mesa: A Geo-Replicated, Near Real-Time Data Warehouse
Manage episode 487366627 series 3670304
**Mesa** is a highly scalable, geo-replicated data warehousing system developed at Google to handle petabytes of data related to its advertising business. **Designed for near real-time data ingestion and querying**, it processes millions of updates per second and serves billions of queries daily. **Key features include strong consistency, high availability, and fault tolerance**, achieved through techniques like multi-version concurrency control and Paxos-based distributed synchronization. The paper details Mesa's architecture, including its storage subsystem using versioned data management with delta compaction, and its multi-datacenter deployment. Finally, it explores operational challenges and lessons learned in building and maintaining such a large-scale system.
https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=bb1af5424e972c0c15f21e3847708e4d393abfae
43 episodes