Back-of-the-envelope Estimation

Ishan Aggarwal
3 min read · Aug 13, 2023


In a system design interview, once you move on to the design/architecture of your application, you start using components such as load balancers (LBs), CDNs, cache servers, and multiple application and DB servers. At that point, the interviewer may ask why you need LBs, multiple application/DB servers, and so on.

We know that the idea behind such interviews is to design large-scale applications that are scalable and highly available. Even so, it is always better to clarify with the interviewer the expected scale (users, traffic, data volume) of the application you are asked to design.

We also need to be sure that we are neither over-utilizing nor under-utilizing our resources (servers). Back-of-the-envelope estimation drives our decisions for the system design architecture.

Few Points to Consider

  • Rough Estimates (T-shirt size numbers)
  • Do not spend much time on this; aim to cover it in 5–8 minutes.
  • Keep the assumed values simple. For example, use numbers that are multiples of 10 or 100 so that the arithmetic stays easy.

Cheat sheet

Computation/ Storage Size

Availability Numbers
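
Availability percentages translate directly into allowed downtime. The sketch below is a minimal illustration of that conversion, assuming a 365-day year; the function name `downtime_hours_per_year` is made up for this example.

```python
# Allowed downtime per year for a given availability percentage.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the hours per year a service may be down at this availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for availability in (99.0, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(availability)
    print(f"{availability}% -> {hours:.2f} hours/year ({hours * 60:.1f} minutes)")
```

Each extra "nine" cuts the allowed downtime by a factor of ten: roughly 87.6 hours at 99%, 8.76 hours at 99.9%, and under an hour at 99.99%.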

Data Type Size Assumption

Some Important Facts

  • Memory is fast but disk is slow.
  • Avoid disk seeks if possible.
  • Simple compression algorithms are fast.
  • Compress data before sending it over the internet if possible.
  • Data centers are usually in different regions, and it takes time to send data between them.

What To Calculate?

  • Number Of Servers
  • RAM
  • Storage Capacity
  • Trade Offs (CAP Theorem)

Calculation Formula

x million users * y MB = xy TB (because million and MB have 6 zeros each, 12 zeros in total, which is a TB)

For example,
5 million users * 2 KB = 10 GB (because million has 6 zeros and KB has 3 zeros, 9 zeros in total, which is a GB)
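
The "count the zeros" shortcut above can be checked with a few lines of Python. This is only a sketch, assuming decimal (power-of-ten) units; the helper name `total_bytes` is hypothetical.

```python
# Decimal (power-of-ten) units, matching the "count the zeros" shortcut.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12
MILLION = 10**6


def total_bytes(users: int, bytes_per_user: int) -> int:
    """Total storage in bytes for `users` users at `bytes_per_user` each."""
    return users * bytes_per_user


# 5 million users * 2 KB each, expressed in GB
print(total_bytes(5 * MILLION, 2 * KB) / GB)  # 10.0

# 3 million users * 4 MB each, expressed in TB (x = 3, y = 4, so xy = 12)
print(total_bytes(3 * MILLION, 4 * MB) / TB)  # 12.0
```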

Example: Facebook Queries Per Second and Storage Requirements

Traffic Estimation

Make some assumptions
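
As a rough sketch of how a traffic estimate might be worked out, the snippet below turns daily active users into queries per second (QPS). All the input numbers are illustrative assumptions for this example, not real Facebook figures, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400; often rounded to 100,000 in interviews


def average_qps(daily_active_users: int, requests_per_user_per_day: int) -> float:
    """Average queries per second, spreading the daily load evenly."""
    return daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY


def peak_qps(avg_qps: float, peak_factor: float = 2.0) -> float:
    """Peak QPS, assuming peak traffic is `peak_factor` times the average."""
    return avg_qps * peak_factor


# Assumed inputs: 100 million daily active users, 10 requests per user per day.
avg = average_qps(100_000_000, 10)
print(f"average ~ {avg:,.0f} QPS, peak ~ {peak_qps(avg):,.0f} QPS")
```

With these assumed inputs the average comes out a little under 12,000 QPS, which is the kind of round number to carry into the rest of the design.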

Storage Estimation

Make some assumptions
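
A storage estimate can follow the same pattern: data written per day, multiplied by how long it is kept. The numbers below are illustrative assumptions only (not real Facebook figures), and `storage_bytes` is a hypothetical helper.

```python
KB, GB, TB = 10**3, 10**9, 10**12
DAYS_PER_YEAR = 365


def storage_bytes(daily_active_users: int, posts_per_user_per_day: int,
                  bytes_per_post: int, retention_years: int) -> int:
    """Total bytes stored over the retention period, ignoring replication."""
    daily = daily_active_users * posts_per_user_per_day * bytes_per_post
    return daily * DAYS_PER_YEAR * retention_years


# Assumed inputs: 100 million users, 2 posts/day, 1 KB per post, kept 5 years.
daily = storage_bytes(100_000_000, 2, 1 * KB, 1) // DAYS_PER_YEAR
total = storage_bytes(100_000_000, 2, 1 * KB, 5)
print(f"{daily / GB:.0f} GB/day, {total / TB:.0f} TB over 5 years")
```

Replication and media attachments would multiply this figure, so it is worth stating explicitly whether they are included.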

RAM Estimation
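
One common heuristic for sizing a cache, used here purely as an assumption, is to keep the "hot" fraction of a day's data in memory (for example 20%, following the 80/20 rule). The sketch below applies it to an assumed daily data volume; the names and numbers are hypothetical.

```python
GB = 10**9


def cache_bytes(daily_data_bytes: int, hot_percent: int = 20) -> int:
    """Memory needed to cache the hot portion of one day's data."""
    return daily_data_bytes * hot_percent // 100


# Assumed input: 200 GB of data read per day, 20% of it hot.
print(cache_bytes(200 * GB) / GB)  # 40.0
```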

Number Of Servers Required
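
Once a peak QPS figure is available, the server count is a single division. The sketch below also leaves headroom so that servers are neither overloaded nor idle; the per-server capacity and target utilization are assumed values for illustration, and `servers_needed` is a hypothetical helper.

```python
import math


def servers_needed(peak_qps: float, qps_per_server: float,
                   target_utilization: float = 0.7) -> int:
    """Servers required so each runs at or below the target utilization."""
    usable_qps = qps_per_server * target_utilization
    return math.ceil(peak_qps / usable_qps)


# Assumed inputs: 24,000 peak QPS, 1,000 QPS per server, 70% utilization.
print(servers_needed(24_000, 1_000))  # 35
```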

Trade Offs (CAP Theorem)

For this use case, I would like to keep Availability and Partition Tolerance (AP), which means accepting eventual rather than strong consistency.

Thank you so much for reading this article. If you found this helpful, please do share it with others and on social media.

Stay tuned and follow me for more content on System Design (LLD and HLD)

https://www.linkedin.com/in/ishan-aggarwal/


Ishan Aggarwal

Consulting Principal MTS @ Oracle Cloud Infrastructures | Works on designing highly Scalable and Distributed Systems