Monolith: From Zero to Millions

Nic Lasdoce

14 May 2024 · 4 minutes read

Facing performance issues as your user base grows? Learn how to scale your monolithic architecture to handle millions of users without migrating to microservices. Discover practical strategies for optimizing your database, managing high request volumes, handling long workloads, and offloading large media files. Transform your monolith into a robust, scalable system in manageable phases.

The Problem

Monolithic architectures, particularly the three-layer monolith (UI, API, database), are popular among startups due to their simplicity and familiarity. As the user base grows, however, these monoliths often hit performance bottlenecks. Some startups have the time and resources to migrate to microservices to address these issues fully. Many others worry that such a migration will turn into a long-running refactoring effort they cannot afford at the moment.

The Solution

Fortunately, there are ways to scale monoliths to handle millions of users in a reasonable amount of time, and these can be done in phases.

Phase 1: The Database

A common issue is database bottlenecks due to long or complex queries, as well as analytics and reporting-related queries. Two common patterns to address this are sharding and replicas.

Sharding: This is an option when your data clusters naturally around a key (for example, a tenant or a region) that can serve as the shard key.
Read Replicas: These are more common because they are easy to integrate into the system. If your API framework uses an ORM, chances are it has a database routing configuration that lets you send all reads to a replica while writes go to the primary.
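To illustrate the sharding option, here is a minimal sketch of picking a shard from a clustering key. The shard names and the use of a tenant ID as the key are assumptions made for the example, not something the article prescribes.

```python
import hashlib
from typing import Sequence

# Hypothetical shard aliases; in a real system each would map to a database connection.
SHARDS = ["shard_0", "shard_1", "shard_2", "shard_3"]


def shard_for(tenant_id: str, shards: Sequence[str] = SHARDS) -> str:
    """Map a clustering key (here a tenant ID) to one shard, deterministically."""
    # Use a stable hash: Python's built-in hash() is randomized per process for strings.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Note that simple modulo hashing remaps most keys when the number of shards changes, so a lookup table or consistent hashing may be a better fit if you expect to add shards later.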

If multiple read replicas are needed, implement load balancing on the database side so your application only needs to connect to a single load balancer, which will handle which database serves the query.
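As a rough sketch of the read/write split, this is what a Django-style database router could look like. The aliases "default", "replica_1", and "replica_2" are hypothetical and would need to match the keys in your own DATABASES setting.

```python
import random


class PrimaryReplicaRouter:
    """Django-style database router: reads go to replicas, writes to the primary.

    The aliases below are assumptions for illustration only.
    """

    primary = "default"
    replicas = ["replica_1", "replica_2"]

    def db_for_read(self, model, **hints):
        # Spread reads across replicas; fall back to the primary if none are configured.
        return random.choice(self.replicas) if self.replicas else self.primary

    def db_for_write(self, model, **hints):
        # Every write goes to the primary.
        return self.primary

    def allow_relation(self, obj1, obj2, **hints):
        # All aliases hold the same data, so relations between them are allowed.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Run migrations only on the primary; replicas receive them through replication.
        return db == self.primary
```

In Django, a router like this is registered through the DATABASE_ROUTERS setting. If a database-side load balancer or single read endpoint is in place, the replica list shrinks to one alias. Keep in mind that replicas can lag behind the primary, so a read issued right after a write may return stale data.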

AWS Quicktip: For PostgreSQL and MySQL databases, you can use Amazon Aurora to auto-scale the number of replicas and obtain a single read endpoint that acts as a load balancer for read queries. This significantly reduces code changes and setup time, and it also provides a solid multi-region disaster recovery option.

Phase 2: Too Many Requests

Scaling the database is not enough because the API is the first to receive requests. Phase 1 and Phase 2 can be reversed depending on which step is most impactful. If your API times out due to database issues, go with Phase 1. If the database can handle the load but the API server is slow, start with this phase.

This phase involves horizontal scaling: adding more servers instead of adding more CPU and memory to the existing server. This approach handles increasing traffic without downtime for upgrades and lets you reduce the number of servers when traffic decreases. A load balancer is essential to distribute traffic among the servers.
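To make the load balancer's role concrete, here is a toy round-robin balancer that supports adding and removing servers. It is an illustration of the idea only, with hypothetical server names, and not how any particular managed load balancer is implemented.

```python
class RoundRobinBalancer:
    """Toy round-robin load balancer over a changing pool of servers."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._index = 0

    def add_server(self, server):
        # Scale out: the new server starts receiving traffic on later picks.
        self._servers.append(server)

    def remove_server(self, server):
        # Scale in: stop routing to this server.
        self._servers.remove(server)

    def next_server(self):
        if not self._servers:
            raise RuntimeError("no servers available")
        server = self._servers[self._index % len(self._servers)]
        self._index += 1
        return server
```

This only works well when the API servers are interchangeable, so any per-user state such as sessions should live in a shared store rather than in one server's memory.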

AWS Quicktip: For EC2 users, create an auto-scaling group and Elastic Load Balancer for easy setup. For ECS users, set up auto-scaling rules within the ECS Service configuration and attach an Elastic Load Balancer.

Phase 3: Long Workloads

Monoliths often have large functions or processes that take a long time to execute, tying up the server and causing some users to drop off. Addressing this requires planning and effort. First, profile your system to identify bottleneck functions, then determine which processes are not needed to build the response and can be moved to the background.
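One way to find bottleneck functions in a Python service is the standard library's cProfile. The sketch below profiles a hypothetical request handler; register_user and send_welcome_email are made-up names standing in for your own code.

```python
import cProfile
import io
import pstats
import time


def send_welcome_email(user_id: int) -> None:
    # Hypothetical slow side effect standing in for an SMTP call.
    time.sleep(0.05)


def register_user(user_id: int) -> dict:
    # Hypothetical request handler: the response does not depend on the email.
    send_welcome_email(user_id)
    return {"id": user_id, "status": "created"}


def profile_call(func, *args) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return buffer.getvalue()
```

Calling print(profile_call(register_user, 1)) should list send_welcome_email near the top. Since its result is not part of the response, it is a candidate for background processing.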

Move these processes to a worker, preferably on a different server than your API. Use a message broker or queue system for communication between the API and the workers. For Python applications, python-rq and django-rq are good options, while BullMQ is suitable for Node.js applications. These libraries are built on Redis, which is simpler to set up than more full-featured brokers like RabbitMQ.
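The hand-off between the API and a worker can be sketched with the standard library alone. This in-process version uses queue.Queue and a thread as a stand-in for a Redis-backed queue and a separate worker server; generate_report and handle_request are hypothetical names.

```python
import queue
import threading

jobs = queue.Queue()  # stand-in for a Redis-backed queue
results = []          # stand-in for wherever finished work is stored


def generate_report(user_id: int) -> str:
    # Hypothetical long-running task.
    return f"report-for-{user_id}"


def worker() -> None:
    """Consume jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        func, args = job
        try:
            results.append(func(*args))
        finally:
            jobs.task_done()


def handle_request(user_id: int) -> dict:
    # Enqueue the slow work and respond immediately instead of waiting for it.
    jobs.put((generate_report, (user_id,)))
    return {"status": "accepted", "user_id": user_id}
```

With python-rq, the enqueue step would look roughly like Queue(connection=Redis()).enqueue(generate_report, user_id), with a separate rq worker process picking up the job. The shape of the request handler stays the same.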

AWS Quicktip: Use ElastiCache for Redis, which offers scalability through replicas and sharding, simplifying infrastructure management.

Phase 4: Large Media Files

Many applications use their API to upload and serve static media files like images and videos, consuming significant memory. A better alternative is using storage services like AWS S3 or Google Cloud Storage. This step removes the load from your API, allowing it to handle more requests by simply providing upload/download links that utilize the storage providers' infrastructure. Implementing a Content Delivery Network (CDN) can further enhance performance by caching static files.
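The idea of handing out time-limited links can be illustrated with a small HMAC-based sketch. This is not the signing scheme used by S3 or any other provider; the secret, the base URL, and the query parameter names are assumptions made for the example.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"replace-me"              # hypothetical signing secret
BASE_URL = "https://media.example.com"  # hypothetical media host


def _signature(path: str, expires: int) -> str:
    message = f"{path}:{expires}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(path: str, ttl_seconds: int = 300, now=None) -> str:
    """Return a URL for path that stops being valid after ttl_seconds."""
    issued_at = now if now is not None else time.time()
    expires = int(issued_at) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{BASE_URL}{path}?{query}"


def verify_url(url: str, now=None) -> bool:
    """Check that the URL is unexpired and its signature matches its path."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = now if now is not None else time.time()
    if current > expires:
        return False
    return hmac.compare_digest(signature, _signature(parsed.path, expires))
```

In practice the storage provider does this for you. With S3, for example, boto3's generate_presigned_url produces such links, so the API only has to return the URL and the client transfers the file directly to or from storage.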

AWS Quicktip: Use Amazon CloudFront for content distribution. It integrates well with S3, lets you restrict direct access to the S3 bucket, and supports signed URLs that grant authorized access with a controlled expiry.

Phase 5: The UI

I placed this phase last because modern web applications running on JavaScript frameworks are typically efficient and performant out of the box. However, as user traffic grows, you may still need to optimize and scale your UI to ensure a smooth experience for all users. Here are some strategies for scaling your JavaScript-based UI:

Code Splitting and Lazy Loading - Divide code into smaller chunks and load components only when needed to reduce initial load times.
Content Delivery Network (CDN) - Use a CDN like AWS CloudFront to distribute static assets globally, reducing load times and server load.
Server-Side Rendering (SSR) and Static Site Generation (SSG) - Use frameworks like Next.js for SSR and SSG to improve initial load times and SEO.
Load Balancing and Auto Scaling - As with the API, distribute traffic across multiple servers with load balancers and use auto-scaling to manage server capacity.

AWS Quicktip: For AWS users, consider using AWS Amplify to host and manage your front-end application. Amplify provides built-in support for CDNs, automatic builds, and deployments, making it easier to scale your UI.

Conclusion

Scaling a monolithic architecture to handle millions of users may seem daunting, but with a phased approach it is entirely achievable. By addressing key bottlenecks in manageable steps (optimizing your database, managing high request volumes, handling long-running workloads, offloading large media files, and scaling the UI), you can significantly improve your system's performance and scalability. This approach keeps the simplicity and familiarity of your existing monolith while incrementally increasing its capacity, so your startup can keep growing without an immediate, complex, and resource-intensive migration to microservices.