Hello ninjas, we are here on lecture number three, which is about scalability: the scalability of a system. In our past two lectures, the Introduction to System Design and Components of System Design, we have heard about scalability a lot, like how we scale a system. The overall crux of system design is that you want to create a system which is highly scalable, meaning it scales to a large extent. Now, when I use this word scaling, what does scaling mean here?

As I've told you in past lectures, when we talk about scaling we are trying to create an environment which supports a lot of users, and when I say "a lot" I mean millions of users at a particular time. So how can we create an application, a web app, a service, which can be used by millions of users concurrently? This is the problem statement that we are going to address now. Towards the end of the series we are going to do a lot of examples: how can we build a service like Instagram, how can we build a service like Uber or Google Maps, and so on.

So that's it for the introduction, the thin layer on top. Now let's deep dive into it, starting with what scalability is. Let's assume we have a computer which has some kind of code running on it.

This code is some kind of business logic for me. Let's say I am creating an application, and let's say it is some kind of map service for some location X. There are a lot of users in this location X and they want to use this map service to go and roam around. For example, some new tourists come to location X and they want to visit places. What are they going to do? They are going to open this map to get a sense of where places are, what the top restaurants are, and what the top places to visit in location X are. They are going to use the map service a lot. So, as you can see, with this problem statement there are going to be multiple users using this map service. We have a functional requirement here, we are working on our business logic, and this code is where our whole business logic sits.

All right, this is our business logic, and this is what people pay us money for. We have multiple users: let's say person A, person B and person C. They are paying us so that they can use this code. So they send a request to this application, to this machine, saying that they want to use the code they are paying for, and in return the application sends back a response to each of them. On the whole, a system has a request-response kind of architecture: the request is sent by the client, the client requests the server, and the server gives back a response to the client. So there are multiple users, A, B and C, who are requesting this code. Obviously the code will have other services attached to it, for example a database.

Your code is used by three users and the system is working fine. But someday, let's say a group of a thousand people comes into the city, into location X, and they want to use this map service. There were already a thousand people living in the city, and now there are a thousand new people who want to access the map. At this point my application starts slowing down; it is not able to perform well. This is the point where we can say that my system is not scalable.

If you've got the sense of it, I can say that scalability is proportional to performance. If my system is scalable, it keeps performing well as load grows, in proportion to the resources we add to the system. For example, let's say my application is working for users 1 to 100. If I then increase the traffic from 1,000 to 2,000 and my application is still highly performant, that means my system is scalable. So scalability is directly proportional to performance.

As you all know, an application cannot be accessed directly. We are on a network, on the internet, and this is our machine. We expose this code with the help of APIs. APIs help us expose this code to multiple users. The full form is application programming interface. These are specific protocols through which our code is exposed over the internet, and through which applications communicate with each other on the web. This is how multiple users are able to communicate with our code. I think you are getting this point.
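
To make the request-response idea concrete, here is a minimal Python sketch. It is only an illustration: the `PLACES` data, the `handle_request` function and the request format are all hypothetical and not from any real map service. The server side is one function that takes a client request and returns a response, and clients A, B and C each send one request.

```python
# Hypothetical map-service business logic; the place data is made up.
PLACES = {
    "restaurants": ["Spice Route", "Harbor Grill"],
    "sights": ["Old Fort", "City Museum"],
}


def handle_request(request: dict) -> dict:
    """Server side: turn one client request into one response."""
    category = request.get("category")
    if category not in PLACES:
        return {"status": 404, "body": []}
    return {"status": 200, "body": PLACES[category]}


# Three clients (A, B, C) each send a request and get a response back.
requests = {
    "A": {"category": "restaurants"},
    "B": {"category": "sights"},
    "C": {"category": "hotels"},
}
responses = {client: handle_request(req) for client, req in requests.items()}

for client, response in responses.items():
    print(client, response["status"], response["body"])
```

In a real system this function would sit behind an API on the network, but the shape stays the same: a request goes in and a response comes out.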

There are a lot of concerns related to this. When we talk about system design, there are three things that we want. We want a system to be reliable, which means fault tolerant. We want a system to be effective, which means it should meet the business requirements. And we want it to be maintainable, which means it can be scaled or maintained.

So let's suppose we have our business logic inside the system. What are the harmful things associated with it? What if someday this machine goes down? That means all the data inside this code is going to be empty, it is going to vanish, and we are going to leave a bad impression on the customers, because customers are paying us money for providing this use case, this code. So we never want our machine to go down. We want it to be fault tolerant and we also want it to be consistent. A customer will only be happy if my machine is fault tolerant, consistent and reliable.

How can we do that? We can host this code in two ways: I can host it either remotely or on my desktop. Now what is the difference between hosting it on a desktop and hosting it remotely? Fundamentally nothing is different, but a remote machine is provided by some kind of cloud provider, like GCP (Google Cloud Platform), Azure or Amazon Web Services. They provide us machines, compute engines, which are located somewhere else and which we access remotely. We pay them so that they give us resources, machines which are running in some other part of the world, and we run our algorithm or our code on these machines. So these are cloud providers which provide us remote machines, and we pay them money to access these resources, whereas a desktop is something which you own yourself. The code is something that I can run on a remote machine or on a desktop.

But the problem here is that there are many users. To support many users, whether it be a cloud machine or a desktop, there is one thing we want to ensure: even when a machine goes down or something bad happens, how can I maintain the system? How can I give reliable solutions and reliable information to the customers? This is one of the bigger concerns. And how am I able to process a large amount of data? As you all know, the current industry is more data intensive than compute intensive. So to work with a data-intensive application, the first thing we can do is buy bigger machines which can handle more data load, and the second thing we can do is buy more machines.

The example I can give here is this. Let's say you have to go on a trip and you have some clothes with you. You have just one bag and you are trying to fit all those clothes inside it, but a point comes when your bag gets full. Now how do you add more clothes? There are two solutions. Either you can buy a bigger bag and put all the clothes inside that bigger bag, or you can buy an identical second bag, the same kind of bag you already had, with the same size and the same weight capacity, and put the remaining clothes in that bag. So there are two ways: either you can buy a bigger bag or you can buy more bags. Essentially we have the same thing here.

To run more data-intensive applications, or compute-intensive applications, the first thing we can do is buy a bigger machine that has a better configuration. A system can have a better configuration: for example, a better processor, better caches, or more RAM. This is what you call vertical scaling. When you talk about buying more machines with the same configuration, we call this horizontal scaling. Applied to data, this is what sharding is: if someone asks you what sharding is, you can tell them it is horizontal scaling of the data across machines. So these are two kinds of scaling. To cater to load, we either make our machines bigger to handle that load, or we divide that load across multiple machines of similar configuration.

The next type of scaling that I want you to know is functional scaling, which a lot of people don't know about, so I'm going to explain it using an example. Let's take an example of an e-commerce website and divide it into four teams. Let's say this one is your authentication team, this one is your transaction team, this one is your catalog, and this one is your feedback. So we know that the whole e-commerce system is divided into these four teams.

Now let's say we have a number of users. Let's say transactions are done by some users, authentication is done by, let's say, two users, the catalog is seen by hundreds of users, and feedback is given by just one user. As you can see, there are different sections, but there is no separation between them in the system yet. I can see that the catalog part is seen by many users, whereas the transaction part and the auth part are used by far fewer users, and feedback by very few. So I can see that my catalog part is the one struggling, and what I can do is independently scale this particular part of the application. Let's say I create a second catalog, a different node for the catalog, so that we can send some of the users there and they can see the catalog through this node.
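
Here is a small Python sketch of that idea, scaling each function on its own. The user counts and the capacity of 100 users per node are hypothetical numbers chosen for illustration, and `nodes_needed` is a made-up helper, not part of any real tool.

```python
import math

# Hypothetical numbers: concurrent users per function and how many users
# one node is assumed to serve comfortably.
USERS_PER_FUNCTION = {"auth": 2, "transaction": 10, "catalog": 300, "feedback": 1}
USERS_PER_NODE = 100


def nodes_needed(users: int, users_per_node: int = USERS_PER_NODE) -> int:
    """At least one node per function, more only where load demands it."""
    return max(1, math.ceil(users / users_per_node))


plan = {name: nodes_needed(users) for name, users in USERS_PER_FUNCTION.items()}
print(plan)
```

Under these assumptions only the catalog gets extra nodes, while auth, transaction and feedback stay on a single node each.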

So what I've done is scale according to functionality. I can say: I want to scale the catalog part but not the transaction part, or I want to scale the auth part but not the feedback part. So I can do scaling at a function level as well. Here you are scaling depending on the function, whereas horizontal scaling is a homogeneous separation, a homogeneous scaling.

What I mean by homogeneous scaling is this. Let's say this is your big database, which is storing e-commerce data, and I've made three partitions: partition A, partition B and partition C. Partition A is going to have transaction data, it is going to have feedback, and it is also going to contain catalog data. Similarly, B is going to contain transactions, feedback and catalog, and the same goes for C. So there is no functional scheme behind how we are separating; we are evenly splitting the database into parts. This is called sharding: we are partitioning our database into some parts. But in functional scaling we are separating things on the basis of what functionality a particular node is handling. This is one important type of scaling, and a lot of people do not know about it.
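
The difference between the two kinds of split can be sketched in a few lines of Python. This is an assumption-laden illustration: hashing the record id with `zlib.crc32` is just one common way to pick a shard, and the record ids, `shard_for` and `node_for` are hypothetical names.

```python
import zlib

SHARDS = ["A", "B", "C"]


def shard_for(record_id: str) -> str:
    """Homogeneous split: pick a shard from a stable hash of the record id."""
    return SHARDS[zlib.crc32(record_id.encode("utf-8")) % len(SHARDS)]


def node_for(record_type: str) -> str:
    """Functional split: one node per kind of data."""
    return f"{record_type}-node"


records = [
    ("txn-1", "transaction"), ("txn-2", "transaction"),
    ("fb-1", "feedback"), ("cat-1", "catalog"), ("cat-2", "catalog"),
]

for record_id, record_type in records:
    print(record_id, "-> shard", shard_for(record_id), "| functional ->", node_for(record_type))
```

With sharding, the record type plays no role in placement, so any shard may end up holding transactions, feedback and catalog data together. With the functional split, the type alone decides the node.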

So make sure that you understand this, and I hope with that we are clearer, at a higher level, on what scalability is. Now let's come to the differences between horizontal and vertical scaling. On one side we have horizontal scaling and on the other we have vertical scaling. Let's start with the horizontal scaling part. Horizontal scaling has multiple machines; as I said, horizontal is buying more machines, so I will have multiple computers here.

On the vertical side I'm going to have one big computer. Now, with multiple computers, I will have multiple requests: request 1, request 2, up to request R. How can I pass them to these servers? I have to manage the load, so I need a load balancer here. The requests go through the load balancer, and the load balancer distributes them among the servers. So the first difference is that a load balancer is required in horizontal scaling; in vertical scaling no load balancer is required.

The second point: let's say someday one machine goes down. You can see I still have the others running. Let's say these are servers 3, 2 and 1. The requests routed to server 3 can be rerouted to another server because server 3 has failed. But on the vertical side, if we have a request and the machine fails, we don't have anywhere else to go, so this request can't be handled. So horizontal is resilient, whereas vertical is a single point of failure: if this bigger machine fails, everything fails.

Third, in horizontal scaling we are going to have more network calls between machines, in terms of APIs. In vertical scaling, because everything is inside one machine, the parts talk at the application level through inter-process communication, which is faster. Network calls are a bit slower, it depends on the bandwidth that you have, so there is some latency, whereas calls inside one system have lower latency.

Fourth, because it is horizontal, we have to maintain data across all the nodes, so there can be issues which lead to data inconsistency, whereas on a vertical machine the data is consistent because there is a single point.

Fifth, horizontal scales well and vertical does not. For example, let's say I have 500 more users. With horizontal scaling, at any point of time I can add more machines for them. But vertical scaling has a hardware limit, which depends on the type of chips that you use, on the semiconductors, on a lot of things. That's why it is not easy to scale vertically, but horizontal scaling can be done at any point of time: we can add more machines and we can remove machines.
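
The load balancer, the failover and the "add a machine at any time" points can be sketched together in Python. This is a minimal sketch under simple assumptions: round-robin is only one possible balancing policy, servers are plain names, and the `LoadBalancer` class with its methods is hypothetical, not a real library.

```python
class LoadBalancer:
    """Round-robin over servers, skipping any that are marked down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self._next = 0

    def add_server(self, name):
        self.servers.append(name)

    def mark_down(self, name):
        self.down.add(name)

    def route(self, request):
        # The request payload is ignored here; only the routing choice is shown.
        # Try each server at most once, starting from the round-robin cursor.
        for _ in range(len(self.servers)):
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            if server not in self.down:
                return server
        raise RuntimeError("no healthy server available")


lb = LoadBalancer(["server-1", "server-2", "server-3"])
first = [lb.route(f"request-{i}") for i in range(3)]

lb.mark_down("server-3")          # one machine fails
after_failure = [lb.route(f"request-{i}") for i in range(4)]

lb.add_server("server-4")         # scale out by adding a machine
after_scale_out = [lb.route(f"request-{i}") for i in range(3)]

print(first, after_failure, after_scale_out, sep="\n")
```

If the pool holds only one big machine and it is marked down, `route` raises an error, which is the single point of failure of the vertical setup.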

So horizontal scaling scales well. But you must be thinking: what do we actually use in a real system? The answer is that we use a mixture, that is, a hybrid. We take resilience and scaling well from horizontal, and we take fast communication and data consistency from vertical. How can you do it? We horizontally scale bigger machines. With bigger machines I have more compute power and faster processing, and when I scale those bigger machines horizontally, the system scales well and is more resilient, because I don't have a single point of failure now. So we use a hybrid kind of model.

As I said, there is also a distinction between performance and scalability, so let's see that too. Just an extra point for you: a lot of times there is a misconception about these terms. If it is a performance issue, the system is slow for a single user. If it is a scalability issue, the system is slow for many users but fast for a single user. So a scalability issue means that when I have many requests my system is not able to handle them, but for a low number of requests my system is working fine; my system is not highly scalable. When there is a performance issue, it is not fast even for a single user: even for a low number of requests my system is slow. So how is performance different from scalability? When the system is slow for a single user, that is a performance issue.
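
As a rough illustration, the rule can be written as a tiny Python function. The 500 ms threshold for "slow" and the `diagnose` name are assumptions made for this sketch; in practice you would pick a threshold from your own latency targets.

```python
def diagnose(single_user_ms: float, many_users_ms: float, slow_ms: float = 500.0) -> str:
    """Label a system from response times measured at low and high load."""
    if single_user_ms > slow_ms:
        return "performance issue"
    if many_users_ms > slow_ms:
        return "scalability issue"
    return "healthy"


print(diagnose(single_user_ms=900, many_users_ms=4000))
print(diagnose(single_user_ms=80, many_users_ms=4000))
print(diagnose(single_user_ms=80, many_users_ms=120))
```

The single-user check comes first on purpose: if the system is already slow for one user, it is a performance issue no matter how it behaves under load.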

When the system is slow for many users but fast for a single user, that is called a scalability issue. With that, we have learned about scalability, about horizontal scaling and vertical scaling, I've given you a bit of information on functional scaling, and we have seen the differences between horizontal and vertical scaling. That's all for the scalability lecture. We are going to look at distributed caches in the next lecture. Make sure that you like, share and subscribe to the channel, and if there's anything that you missed or want to know, you can comment in the comment section and I will be very happy to help with any information I can give. Let's meet in the next video, where we are going to study distributed caches. That is going to be very awesome. So yeah, bye.

L3: Scalability Of System Design: Horizontal vs Vertical Scaling | Lesson 3 | System Design