Learn how to build and scale a D2C e-commerce platform to handle a 20x volume increase without code rewrites while staying within a 100 ms latency target. This talk covers the architectural decisions, open-source infrastructure, cloud deployment strategies, and observability practices that enabled a high-performance, cost-efficient, and scalable system.
Agenda:
1. High-Level Architecture Overview
• Microservices for Customer, Catalog, Order Management, Payment, and Inventory, fronted by an API gateway
• Protobuf for lightweight messaging
• Synchronous requests converted to asynchronous messages and back, using deferred results
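The sync-to-async-and-back bullet above can be illustrated with a small in-memory sketch. This is not the platform's actual code: the class `DeferredResults` and its methods `register`, `complete`, and `pendingCount` are hypothetical names, and the message broker between the two calls is left out. The assumed flow is that the request path registers a correlation ID and holds a future, and the reply consumer completes that future when the asynchronous response arrives.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: bridges a synchronous caller to an asynchronous reply.
final class DeferredResults<T> {
    // Requests still waiting for a reply, keyed by correlation ID.
    private final Map<String, CompletableFuture<T>> pending = new ConcurrentHashMap<>();

    /** Request path: register a correlation ID and get a future to wait on (needs Java 9+ for orTimeout). */
    CompletableFuture<T> register(String correlationId, long timeoutMillis) {
        CompletableFuture<T> future = new CompletableFuture<>();
        pending.put(correlationId, future);
        return future
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                // Clean up whether the reply arrived or the wait timed out.
                .whenComplete((result, error) -> pending.remove(correlationId));
    }

    /** Reply path: complete the waiting future; returns false if nobody is waiting. */
    boolean complete(String correlationId, T reply) {
        CompletableFuture<T> future = pending.remove(correlationId);
        return future != null && future.complete(reply);
    }

    /** Number of requests still waiting for a reply. */
    int pendingCount() {
        return pending.size();
    }
}
```

In this sketch an HTTP handler would call `register`, publish the message carrying the correlation ID, and return or wait on the future, while the consumer of the reply topic would call `complete`.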
2. Core Infrastructure
Postgres as an advanced open-source database, with extensive use of its features
• Partitioning
• Full-text search (FTS)
• Advisory Locks
• The bytea type for storing read models and entire protobuf messages, used much like a NoSQL database
• Favoring inserts of new records over updates to existing rows
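As one illustration of the advisory-locks bullet above, the sketch below derives the 64-bit key that Postgres advisory lock functions expect from a business identifier. `pg_advisory_xact_lock(bigint)` is a real Postgres function; everything else is an assumption made for illustration, including the class name `AdvisoryLockKeys`, the `order:42` key format, and the choice of SHA-256 for hashing. The talk does not say how keys are actually derived.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: maps business keys to Postgres advisory lock keys.
final class AdvisoryLockKeys {
    // Blocks until the lock is granted; Postgres releases it when the transaction ends.
    static final String LOCK_SQL = "SELECT pg_advisory_xact_lock(?)";

    /** Maps a business key such as "order:42" (assumed format) to a stable 64-bit lock key. */
    static long keyFor(String businessKey) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(businessKey.getBytes(StandardCharsets.UTF_8));
            // Use the first 8 bytes of the digest, read as a big-endian long.
            return ByteBuffer.wrap(digest).getLong();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 support is mandatory for Java platform implementations.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

With JDBC, the key would be bound via `setLong(1, AdvisoryLockKeys.keyFor("order:42"))` on a prepared statement for `LOCK_SQL`, inside the transaction that needs exclusive access to that order.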
Kafka Beyond Messaging
• Kafka as a load balancer and for lockless synchronisation
• Topic organization: service messages vs. service events
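The load-balancing and lockless-synchronisation idea above plausibly rests on keyed partitioning: messages with the same key land on the same partition, and each partition is read by one consumer in a group, so work for a given key is serialized without explicit locks. The sketch below simulates this in memory with one single-threaded worker per "partition". It does not use the Kafka client; `KeyedDispatcher` and its methods are hypothetical names, and `String.hashCode` stands in for the murmur2 hash that Kafka's default partitioner applies to keyed records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// In-memory stand-in for a keyed topic: one single-threaded worker per "partition".
final class KeyedDispatcher implements AutoCloseable {
    private final List<ExecutorService> partitions = new ArrayList<>();

    KeyedDispatcher(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(Executors.newSingleThreadExecutor());
        }
    }

    /** The same key always maps to the same partition index. */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    /** Handlers for one key run one at a time, in submission order, on a single thread. */
    CompletableFuture<Void> dispatch(String key, Runnable handler) {
        return CompletableFuture.runAsync(handler, partitions.get(partitionFor(key)));
    }

    /** Stops accepting work; already queued handlers still run. */
    @Override
    public void close() {
        partitions.forEach(ExecutorService::shutdown);
    }
}
```

Because all handlers for one key share a thread, per-key state such as an order's running total can be mutated without locks or atomics, while different keys spread across partitions and are processed in parallel.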
Distributed Caching
• Caffeine Cache and JGroups
Cloud Deployment: Kubernetes
• Low-cost scaling and cloud-agnostic deployment
• Running on credits from cloud providers' startup programs
3. Observability and Debugging
• Ensuring correctness with idempotent request handling
• Distributed tracing for debugging cross-service interactions
• Proactive monitoring
• All exceptions logged to Slack and actively monitored by the team
• Reports and business metrics also posted to Slack
• An SLF4J Slack appender, Prometheus, Griffin, and Sentry to stay on top of everything happening in the system
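Idempotent request handling, the first item in the list above, can be sketched as a guard that runs an action at most once per request ID and returns the stored result on retries. This is a minimal in-memory illustration under stated assumptions: `IdempotentHandler` and `handle` are hypothetical names, and a production version would persist request IDs (for example in the database) rather than hold them in a map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical in-memory idempotency guard keyed by request ID.
final class IdempotentHandler<R> {
    private final Map<String, R> results = new ConcurrentHashMap<>();

    /**
     * Runs the action once per request ID; repeated calls with the same ID
     * return the stored result instead of running the action again.
     * Note: a null result is not stored, so such an action would run again.
     */
    R handle(String requestId, Supplier<R> action) {
        return results.computeIfAbsent(requestId, id -> action.get());
    }
}
```

A retried payment request carrying the same request ID would then receive the original outcome rather than triggering a second charge.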
4. Conclusion and Q&A