A scalable future for our little app

May 11, 2020

Growing your app is not as simple as adding more power to the servers. This post discusses what we can do to our software architecture to improve scalability, using our own application as the concrete example.

This article is written as part of the individual review criteria of PPL CS UI 2020.

A brief introduction to our application

For PPL, we were assigned a Tuberculosis Case Monitoring and Tracking System project. So far, it is only meant to be used in one region, with fewer than about a thousand transactions per day.

Illustration by Dave Nathanael (teammate from PPL C5)

Our app consists of 3 services: a backend service, a mobile app, and a dashboard for administrators. It has run quite well so far, but what if we had to build something bigger?

The case for our application's future

Let us say our app is meant to be used nationwide. It currently serves a single team in one province, but tuberculosis is tuberculosis; there is no regional variation. So we drop the hard-coded region, make it a variable, and now it can run nationwide, right?

Not quite. The whole structure of the software has to respond to this change as well, and it has to change on several layers of the architecture.

Scalable software architecture tunings

Application scalability is the potential for an application to grow over time — being able to efficiently handle more and more requests per minute (RPM). (https://www.netguru.com/blog/how-to-effectively-scale-your-web-application)

If we analyze the factors of scalability, we find something roughly described by this graph:

It says that different levels of tuning affect scalability by different magnitudes. Design Tuning has the greatest impact, followed by Code Tuning, Product / Software Tuning, and finally Hardware Tuning.

We will discuss each of them in turn.

Design Tuning

One of the most impactful layers to tune is the design layer, which refers to the underlying structure of your application.

Microservices

A microservice architecture is composed of several loosely coupled services. It scales well because each service can be deployed, improved, and scaled individually without disturbing the others. Having several services also lets the load be spread across them.

For example, our backend service, which handles all of the accounts and cases, can be split into a separate account service and case service. This way they can be maintained separately, and the load is spread across several services instead of just one.

This alone already increases our scalability potential considerably.

Database Safety

As the numbers grow, one thing we must pay attention to (especially with vital data like ours) is the consistency and safety of the data.

Consistency is easy to achieve if you use just one proper database. But if you have more than one database that needs to stay synchronized, you may need a way to roll back changes when an error or constraint violation occurs, so that the databases do not drift apart.

But what if you need both safety and efficiency? You may want to look at a replica database. A read replica is a copy of a database that you can only read from. This suits processes like the statistics generation in our app. You can also use it to serve some of the regular reads, and you can have more than one replica.
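As a rough illustration of the replica idea, here is a minimal sketch of routing writes to the primary and spreading reads across replicas. The class name `ReplicaRouter` and the connection names are hypothetical, and plain strings stand in for real database connections.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and spread reads across read replicas.

    Hypothetical helper: the connections are plain names here, standing in
    for real database connections.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replica is configured.
        self._read_pool = itertools.cycle(replicas or [primary])

    def for_write(self):
        # Replicas are read-only, so every write goes to the primary.
        return self.primary

    def for_read(self):
        # Rotate through the replicas so reads are shared between them.
        return next(self._read_pool)
```

A heavy job such as statistics generation would ask for `for_read()`, while creating or editing a case would use `for_write()`. Keep in mind that a replica can lag slightly behind the primary, so reads that must see the latest write may still need the primary.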

Asynchronous Programming

Do not wait. This is an important rule when building scalable applications, especially with numerous services. Imagine having to wait for one service, then another, then another, and the chain goes on.

If you can, just make a request and move on. If something fails along the line, report the error back to the requester. This way your services do not block while waiting for each other.
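To make the "do not wait" rule concrete, here is a small sketch using Python's `asyncio`. The service names and the `call_service` function are made up for illustration; `asyncio.sleep` stands in for a real network call.

```python
import asyncio


async def call_service(name, delay, fail=False):
    """Stand-in for a network call to another service (hypothetical)."""
    await asyncio.sleep(delay)  # simulates latency without blocking the event loop
    if fail:
        raise RuntimeError(f"{name} failed")
    return f"{name}: ok"


async def handle_request(case_fails=False):
    # Start both calls at once: the total wait is roughly the slowest call,
    # not the sum of both.
    results = await asyncio.gather(
        call_service("account-service", 0.02),
        call_service("case-service", 0.01, fail=case_fails),
        return_exceptions=True,  # collect failures instead of aborting everything
    )
    # Report errors back to the requester rather than hiding them.
    return [str(r) if isinstance(r, Exception) else r for r in results]
```

Running `asyncio.run(handle_request())` waits for both calls concurrently. With `case_fails=True`, the failure comes back in the result list so the caller can decide what to do with it.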

Code Tuning

Code tuning mostly revolves around how efficient your algorithms are and how efficient your database queries are.

Use efficient algorithms/flows

Of course, aim for the most efficient algorithm possible. A small optimization in code that runs often adds up to a big improvement. But you must weigh the trade-off with readability.

Minimize queries to DB, always update in bulk

Minimize the number of queries to the database. If you can, update in bulk (multiple updates in one transaction) instead of updating objects one by one.

If you need the result of the same query over and over, save the result first and reuse it. And with lazy querying, you can prefetch the joined tables up front instead of triggering a query per row.
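Here is a minimal sketch of a bulk update using Python's built-in `sqlite3` module. The `cases` table and its columns are assumptions made up for this example, not our real schema.

```python
import sqlite3


def close_cases(conn, case_ids):
    """Mark many cases as closed inside a single transaction."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.executemany(
            "UPDATE cases SET status = 'closed' WHERE id = ?",
            [(case_id,) for case_id in case_ids],
        )


def demo():
    # Hypothetical schema, kept in memory so the example is self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cases (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO cases (id, status) VALUES (?, ?)",
        [(i, "open") for i in range(1, 6)],
    )
    close_cases(conn, [1, 2, 3])
    return conn.execute("SELECT id, status FROM cases ORDER BY id").fetchall()
```

All the updates are committed together, so either every case is closed or none is. When every row gets the same value, a single `UPDATE ... WHERE id IN (...)` statement is another option.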
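The two ideas above can be sketched together: fetch the related rows with one JOIN, then reuse that result instead of querying again. The `officers` and `cases` tables and the sample names are invented for this example.

```python
import sqlite3


def build_db():
    # Hypothetical schema and data, kept in memory for the example.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE officers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE cases (id INTEGER PRIMARY KEY, officer_id INTEGER, status TEXT);
        INSERT INTO officers VALUES (1, 'Ani'), (2, 'Budi');
        INSERT INTO cases VALUES (1, 1, 'open'), (2, 1, 'closed'), (3, 2, 'open');
        """
    )
    return conn


def cases_with_officers(conn):
    """Fetch cases and their officers with one JOIN, not one query per case."""
    return conn.execute(
        "SELECT c.id, c.status, o.name FROM cases c "
        "JOIN officers o ON o.id = c.officer_id ORDER BY c.id"
    ).fetchall()


def summarize(conn):
    rows = cases_with_officers(conn)  # query once, reuse the rows below
    open_count = sum(1 for _, status, _ in rows if status == "open")
    officers = sorted({name for _, _, name in rows})
    return open_count, officers
```

Both figures in `summarize` come from the same in-memory rows, so the database is hit once regardless of how many cases there are.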

Use UUID

Using a regular auto-incrementing ID across multiple databases forces you to consult one database for the current counter. Instead, use UUIDs, which can be generated independently and used across multiple databases.
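A short sketch with Python's standard `uuid` module shows the idea. The function name `new_case_id` is just an illustration.

```python
import uuid


def new_case_id():
    """Generate an ID locally, without asking any database for a counter."""
    # uuid4 is random; collisions are extremely unlikely, though not impossible.
    return str(uuid.uuid4())
```

Any service or instance can call this on its own, and the resulting IDs can be stored in different databases without coordinating a shared counter.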

Use libraries that are efficient

If you use libraries, pick ones that are credible and known to be stable and efficient.

Product / Software Tuning

Product tuning refers to the changes you make on the product level.

Deployment

Deployment may not be the first thing you think of when improving scalability. But containerization and continuous delivery help a great deal when you make your app available to the masses. For example, with Docker you can easily add and update instances.

And when you plan to scale your app, you may need to scale the servers that handle the requests as well, which brings us to the next point.

Multi-instance + stateless application (Design Tuning as well)

You may want to have multiple instances running, all of which can receive requests from any user. Of course, you will need a load balancer for this. But there is a catch! What if a request is handled by instance 1, but the next request is handled by instance 2? Would they be out of sync?

For example, your request to add a new case is handled by instance 1, but when you later edit that case, your request is handled by instance 2.

Of course, you want them to be in sync. This means you cannot store temporary data on each instance; you have to make the instances stateless. You can use a shared in-memory store such as Redis to hold that data instead.
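The sketch below illustrates the stateless idea. `SharedStore` is an in-memory stand-in that I made up to play the role of a shared store like Redis; it is not the Redis client API, and the `Instance` class and key format are likewise hypothetical.

```python
class SharedStore:
    """In-memory stand-in for a shared store such as Redis (hypothetical)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class Instance:
    """An application instance that keeps no request state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # every instance talks to the same store

    def save_draft(self, user_id, draft):
        self.store.set(f"draft:{user_id}", draft)

    def load_draft(self, user_id):
        return self.store.get(f"draft:{user_id}")
```

Because the draft lives in the shared store and not inside an instance, a request saved through instance 1 can be read back through instance 2.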

Proper Load balancing (Software + Hardware)

Say you have multiple instances handling your requests, like the ones discussed above. It would be a waste if all of the traffic were handled by just one instance. You need to spread it evenly with proper load balancing, be it Round Robin or whatever suits you.

Mind you, it is crucial to pick a load balancing strategy that fits your deployment method. You have to consider how a deployment can make your app unavailable for a while. For instance, you can redirect your traffic to one server while you deploy on the other, then switch.
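Round Robin itself is simple enough to sketch in a few lines. In practice the load balancer is usually a separate piece of software or hardware; the `RoundRobinBalancer` class below is only a hypothetical illustration of the rotation.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hand out instances in a fixed rotation (illustrative sketch)."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._cycle = itertools.cycle(instances)

    def pick(self):
        # Each call returns the next instance, wrapping around at the end.
        return next(self._cycle)


def distribute(balancer, request_count):
    """Count how many requests each instance would receive."""
    return Counter(balancer.pick() for _ in range(request_count))
```

With three instances and nine requests, each instance receives three. Round Robin ignores how busy each instance is, which is fine when requests cost about the same.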
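The deployment idea can be sketched as a rolling update: take one instance out of rotation, update it, put it back, then move on to the next. The function and the `deploy` callback are hypothetical; a real setup would also run health checks before routing traffic back.

```python
def rolling_deploy(instances, deploy):
    """Update instances one at a time so the others keep serving traffic."""
    in_rotation = set(instances)
    log = []
    for instance in instances:
        in_rotation.discard(instance)  # stop routing traffic to this instance
        if not in_rotation:
            raise RuntimeError("no instance left to serve traffic")
        # Record which instances carry the traffic during this deployment.
        log.append((instance, sorted(in_rotation)))
        deploy(instance)
        in_rotation.add(instance)  # updated, so route traffic back to it
    return log
```

The returned log shows, for each step, which instances stayed in rotation. With only one instance the function refuses to proceed, since the app would be unavailable during the update.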

Hardware Tuning

Of course, more hardware power can be a help as well.

More Computing Power and Memory

The more power and memory your servers have, the quicker they can process requests and the more load they can handle. Pretty straightforward. Mind you, this is a more costly upgrade…

Last…

I hope that with these changes, the app will be scalable enough to handle nationwide traffic and requests.