Design Interview Process

In this article I lay out a process for the system design interview and go over examples for each part of the framework.

The whole system design interview is about breaking the given problem into smaller, explainable components within the interview setting. I think a 35-40 minute interview (5-10 minutes go to formalities and to questions at the end) is too short to cover any full-fledged system design. It is important to understand what exactly the interviewer is looking for and to direct the interview accordingly. In my experience the focus should be on quality rather than quantity: you can go on and on for 40 minutes, but if you didn't touch on the things the interviewer was looking for, you will get very generic feedback.

Overall, I think the design interview can be divided into three phases.

Requirement gathering & Analysis (10 mins)

This phase is about understanding what problem needs to be solved during the interview, and scoping it down to an MVP design that covers the core part of the problem. This requires active discussion with the interviewer to help narrow down the scope.

The requirements are usually categorized into two parts.

Functional requirements:

These are the requirements related to the user-facing functionality of the system, that is, which features need to be supported from the user's perspective in the MVP. For example, in the case of a taxi service this could be booking a cab, and in the case of Instagram it could be posting a picture.

Non-functional requirements:

These are the requirements the system must meet to keep the customer experience good as the number of users grows. An Instagram clone could be built to support 10 users, running as a monolith on a single server, and all 10 users would be able to use it just fine. But as the number of users increases, the system needs to be distributed as well as available and consistent, and that is what makes designing such a system challenging. So the requirements here are about understanding how many users are expected to use the system and, based on that, what choices we need to make to keep the system usable for that many users. There also needs to be a discussion of the read/write patterns of the application: is it read heavy or write heavy?

This part also requires a good discussion with the interviewer to understand what makes sense. Common topics to discuss are:

Scalability
Availability vs Consistency
Reliability
Latency
Durability

Example:

User-facing online systems are usually expected to be highly available, but can the same be assumed here, or is this a system that favors consistency, where downtime can be taken to make the data consistent?

What level of consistency is expected? Is eventual consistency good enough?

What level of latency is expected?

What about safety & security considerations in designing the system?

High level design (15 mins)

This is the next part of the interview. There are multiple aspects that can be covered here, but time is usually short, so it is important to be cognizant of the time in this part of the interview. My thought here is to first come up with a high-level block design. This design basically divides the whole application into smaller component-level pieces or micro-services, along with the interactions between these components.

Basically, this is the part where you take the functional requirements and list the components that can capture them. Doing so first helps to confirm that all the requirements have been captured, and drawing a picture brings the interviewer and interviewee onto the same page. Once the basic design picture is on the board, it is also easier to fill in more details. It also gives an idea of what kind of problem we are dealing with: is it a breadth problem (many components) or a depth problem (few components, with in-depth focus on one of them)? It is better to draw the picture of the high-level system while discussing it. Most of the components are common across system design questions; the important topics are the ones specific to the question being asked.

High Level Design Components

Once the high-level design, or 10k-foot view, of the system is captured, what details can be filled in? It is important to be cognizant of the time here and to ask the interviewer which part of the system should be discussed in further detail. Some of the topics which can be covered or discussed further:

Algorithm or Application server

In some design problems, the algorithm running in the application server is a critical piece that is very important to cover in the interview. For example: in the URL shortener problem, how to generate a short key for a URL; in a geolocation service, how to find places within a radius of the user's location; how to compose a news feed from friends' posts; or how to find a cab for a taxi service. I think this is the part of the design interview that changes across problems, and it is often the core of the problem being solved. In such problems, it is important to cover this part and allocate time appropriately.
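To make the URL shortener case concrete, here is a minimal sketch of one common way to generate a short key: encode a unique numeric ID in base 62. This is only an illustration, not the only approach. The alphabet ordering and the function names `encode_base62` and `decode_base62` are assumptions made for this example, and where the unique ID comes from (for instance a database sequence) is left out.

```python
import string

# Assumed alphabet: digits, then lowercase, then uppercase (62 symbols).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Turn a non-negative integer ID into a short base-62 key."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_base62(key: str) -> int:
    """Inverse of encode_base62: recover the integer ID from a key."""
    n = 0
    for ch in key:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

With this scheme a 7-character key covers 62^7, roughly 3.5 trillion, distinct IDs, which is the kind of number worth stating out loud in the interview.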

Data Flow

What data needs to be stored in the system, and what type of storage should be used? A discussion about SQL vs. NoSQL can happen here, based on the data model. The entity model diagram can also be discussed here, i.e. what the entity objects will look like, the relations among entities, etc.

API design

What kind of APIs will the system need in its components? Will these be RESTful APIs or RPC-style APIs? There could be a discussion about a RESTful OpenAPI spec versus RPC frameworks like gRPC and Thrift.

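As a rough illustration of what a data model and API discussion might produce, here is a small sketch for the URL shortener example. Everything in it is hypothetical: the `UrlMapping` entity, the `UrlShortenerApi` class, its method names, and the routes mentioned in the comments are made up for this example, and an in-memory dict stands in for the real data store.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UrlMapping:
    """Hypothetical entity: one row in a URL mapping table."""
    short_key: str
    long_url: str
    owner_id: str


class UrlShortenerApi:
    """Hypothetical service interface. Each method could be exposed as a
    REST route (e.g. POST /urls, GET /urls/{short_key}) or as an RPC method."""

    def __init__(self) -> None:
        self._store: Dict[str, UrlMapping] = {}  # stand-in for a real database
        self._next_id = 1

    def create_short_url(self, long_url: str, owner_id: str) -> UrlMapping:
        # Placeholder key scheme: hex of a counter. A real design would
        # discuss key generation separately.
        short_key = format(self._next_id, "x")
        self._next_id += 1
        mapping = UrlMapping(short_key, long_url, owner_id)
        self._store[short_key] = mapping
        return mapping

    def resolve(self, short_key: str) -> Optional[str]:
        """Return the long URL for a key, or None if the key is unknown."""
        mapping = self._store.get(short_key)
        return mapping.long_url if mapping else None
```

Writing down the entity fields and two or three method signatures like this is usually enough to decide whether the data model and API need more time or can be skipped.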
Back of the envelope calculations

This part is basically about understanding which part of the system will have scaling issues, i.e. is the bottleneck of the system too much data, or throughput (a lot of concurrent users)? Is the system read heavy or write heavy? This can be done by assuming a certain amount of traffic and then calculating the data storage or throughput needs. This will help in the next part, which is improving the scalability of the system.

For more details, see "What are some of the easy ways to Back of the envelope calculations".
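The arithmetic involved is simple enough to sketch in a few lines. The function below is a minimal example under stated assumptions: the name `estimate`, its parameters, and all the traffic numbers in the usage note are illustrative inputs, not figures from any real system.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(daily_active_users: int,
             writes_per_user_per_day: float,
             reads_per_write: float,
             bytes_per_write: int,
             retention_years: int) -> dict:
    """Rough average load and storage estimate from assumed traffic."""
    writes_per_day = daily_active_users * writes_per_user_per_day
    write_qps = writes_per_day / SECONDS_PER_DAY
    read_qps = write_qps * reads_per_write
    storage_bytes = writes_per_day * bytes_per_write * 365 * retention_years
    return {
        "write_qps": write_qps,
        "read_qps": read_qps,
        "storage_tb": storage_bytes / 1e12,  # decimal terabytes
    }


# Assumed inputs: 10M daily users, 1 write per user per day,
# 100 reads per write, 1,000 bytes per write, 5 years of retention.
result = estimate(10_000_000, 1, 100, 1_000, 5)
```

Under those assumptions the result is about 116 writes per second, about 11,574 reads per second, and 18.25 TB of storage, which suggests a read-heavy system where storage is modest. These are averages; peak traffic is usually assumed to be some multiple of them.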

There is no specific order in which to cover these points; the prioritization should be fluid, based on the problem. For example, if there is an algorithmic challenge in the problem, it is better to tackle that first. If the data model or API is quite straightforward, it is better to skip those after discussing with the interviewer and move directly to the back-of-the-envelope calculations.

Detailed Design, In-depth service scalability (10 - 15 mins)

This is the part of the interview where the problem needs to be discussed in more depth, in terms of the different components/micro-services used and how to scale them to meet the non-functional requirements. Here it is important to check in with the interviewer to understand which component to cover first, or to pick the most critical component that the interviewee knows best. Again, there are only 10-15 minutes available for this section: a lot of depth can be covered by going deep on scaling a single service, or a lot of breadth can be covered by going into shallow detail on each component. This depends on the interviewer and again needs clarification.

Scaling can be done for multiple aspects of the system. It is important to understand what we are scaling for; this depends on the problem, and the back-of-the-envelope calculations should have given us a good insight into that.

Do we need to scale for data, i.e. does the problem generate or need to store a lot of data?
Do we need to scale for throughput (CPU/IO), i.e. does the system have CPU or IO bottlenecks, in terms of data transfer or computation?
Do we need to scale for concurrent API requests, i.e. the number of requests users make at the same time?
Do we need to remove hotspots in the system?
Do we need to increase availability or geo-distribute the service?

The concept of scaling is a simple idea. There are two ways to scale:

Vertical scaling: Make the machine more beefy. This works in the short term, but cost rises steeply as the machine gets beefier, and there is a limit to how big a single machine can get. Probably not the best option to discuss in an interview.
Horizontal scaling: Add more commodity machines to make the system distributed. Once the system is distributed and working, you can add more machines as more scale is needed. Horizontal scaling comes with its own challenges, but it is able to handle the scale expected of the systems asked about in interviews. So that is the path to take for the interview.

Now let's look at ways to scale different parts of the system using horizontal scaling.

Data scaling:

Sharding / Partitioning
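As a minimal sketch of the idea, hash-based sharding picks a shard for each record by hashing its key. The function name `shard_for` and the key format are assumptions for this example; MD5 is used only as a stable, well-distributed hash (not for security), because Python's built-in `hash()` for strings is randomized per process by default and so would not give stable placement.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards) using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


# The same key always lands on the same shard.
shard = shard_for("user:42", 4)
```

One trade-off worth mentioning: with plain modulo placement, changing the number of shards moves most keys to a different shard. Consistent hashing is a common alternative when shards are expected to be added or removed often.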
