Database to be used

First of all, we need to figure out what type of data we need to store in the database, and then we will discuss how the database scales. Some points to weigh before choosing between SQL and NoSQL for the database:

- We only need to store which user has requested the shortening of the URL.
- The system would handle far more redirection requests than shortening requests.
- The system is expected to have minimum latency in the redirection.
- SQL databases are generally preferred when a system involves complex query retrievals and a complex database schema.
- Scaling relational (SQL) databases would increase the complexity of the system, considering the number of queries to be handled and the several billion records to be stored.

Considering the above points, a NoSQL database seems like a wise choice. Examples of NoSQL databases include Amazon's DynamoDB, Apache Cassandra (originally developed at Facebook), and many more.

Algorithm for URL shortening

To generate a short URL that is unique for each existing URL, we could use a hashing technique. The hash could be encoded in Base36 ([a-z, 0-9]), Base62 ([A-Z, a-z, 0-9]) or Base64 ([A-Z, a-z, 0-9], '+', '/'). Let us consider Base64 encoding, which offers the largest character set of the three, and ask what the appropriate length of the shortened URL should be.

URL Length = 6, => 64^6 = ~70 B possible URLs
URL Length = 7, => 64^7 = ~4.4 T possible URLs
URL Length = 8, => 64^8 = ~280 T possible URLs

Length of URL:

For our system's requirements, a length of 6 characters is more than sufficient. (Even if we store the data for 5 years, we will store only around 20B URLs, well below the ~70 B available.)

Using MD5 hashing:

The MD5 hashing algorithm generates a 128-bit hash value. Since Base64 encoding uses 6 bits to represent each character, encoding this hash gives 22 characters of output (128 / 6 rounded up, excluding padding).
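The keyspace sizes and the 22-character MD5 result discussed in this section can be checked with a minimal Python sketch. The helper names `keyspace` and `md5_base64` are hypothetical, and the sketch assumes the URL-safe Base64 alphabet ('-' and '_' in place of '+' and '/') with padding stripped, which the text does not specify.

```python
import base64
import hashlib

ALPHABET_SIZE = 64  # Base64 uses 64 symbols, i.e. 6 bits per character


def keyspace(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct short URLs of the given length."""
    return alphabet_size ** length


def md5_base64(long_url: str) -> str:
    """MD5-hash the URL and Base64-encode the 16-byte (128-bit) digest.

    Assumption: URL-safe alphabet, trailing '=' padding removed.
    """
    digest = hashlib.md5(long_url.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


for n in (6, 7, 8):
    print(f"length {n}: {keyspace(n):,} possible URLs")

encoded = md5_base64("https://example.com/some/very/long/path")
print(len(encoded), "characters")
```

Since 128 bits do not divide evenly by 6, the encoder emits 22 characters plus two padding characters, which is why the padding is stripped here.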
The size of each object is taken to be around 500 bytes, which gives a storage estimate of around 7.2 B * 500 bytes = 3.6 TB.

Estimating Memory Requirements:

According to the 80-20 rule, 80% of the traffic is generated by 20% of the URLs, so it is worth caching those 20% of URLs. We have around 5K redirection requests per second, which is around 5K requests * 60 secs * 60 mins * 24 hrs = 432 M requests per day. The memory required to cache around 20% of these would be 0.2 * 432 M * 500 bytes = 43.2 GB. Since many of these requests will be for the same URLs, 43.2 GB is an upper limit on the memory requirement.

Bandwidth Required:

For URL shortening, there would be about 100 requests per second (5K redirections per second divided by the 50:1 ratio), so the incoming data would be around 100 * 500 bytes = 50 KB/s. Similarly, for URL redirection the outgoing data would be around 5K * 500 bytes = 2.5 MB/s.
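The storage, cache, and bandwidth figures can be reproduced with a short Python sketch. The constant names are hypothetical, and the shortening rate is derived here as the redirection rate divided by the 50:1 ratio, which is an assumption of this sketch rather than a figure from the text.

```python
OBJECT_SIZE_BYTES = 500          # assumed size of one stored URL record
STORED_OBJECTS = 7_200_000_000   # 7.2 B objects over 2 years
REDIRECTS_PER_SEC = 5_000        # rounded redirection rate
READ_WRITE_RATIO = 50            # redirections per shortening request
CACHE_PERCENT = 20               # 80-20 rule: cache the hottest 20%

# Storage: every stored object costs OBJECT_SIZE_BYTES.
storage_bytes = STORED_OBJECTS * OBJECT_SIZE_BYTES

# Cache: 20% of one day's redirection requests (an upper bound,
# because repeated requests for the same URL share one entry).
redirects_per_day = REDIRECTS_PER_SEC * 60 * 60 * 24
cache_bytes = redirects_per_day * CACHE_PERCENT // 100 * OBJECT_SIZE_BYTES

# Bandwidth: incoming for shortening, outgoing for redirection.
shortens_per_sec = REDIRECTS_PER_SEC // READ_WRITE_RATIO
incoming_bytes_per_sec = shortens_per_sec * OBJECT_SIZE_BYTES
outgoing_bytes_per_sec = REDIRECTS_PER_SEC * OBJECT_SIZE_BYTES

print(f"storage:  {storage_bytes / 1e12:.1f} TB")
print(f"cache:    {cache_bytes / 1e9:.1f} GB")
print(f"incoming: {incoming_bytes_per_sec / 1e3:.0f} KB/s")
print(f"outgoing: {outgoing_bytes_per_sec / 1e6:.1f} MB/s")
```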
Let us assume 300M new URL shortening requests each month. It is safe to assume that our service would receive more requests for redirection than for shortening. Let's take the request ratio to be 50:1 between redirection and shortening. So, there would be around 50 * 300 M = 15 B redirections per month, which is 300 M / (30 days * 24 hrs * 3600 sec) * 50 ≈ 5.8K, taken as roughly 5K URL redirections per second.

Let's assume we store URL requests for a period of 2 years. So, the number of objects to be stored would be around 300 M * 2 yrs * 12 months = 7.2 B.
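The traffic arithmetic can be written out as a small Python sketch. The variable names are hypothetical, and a 30-day month is assumed, as in the text. The exact per-second rate comes out slightly above the rounded 5K figure used in the estimates.

```python
NEW_URLS_PER_MONTH = 300_000_000   # assumed shortening requests per month
READ_WRITE_RATIO = 50              # redirections per shortening request
RETENTION_YEARS = 2                # how long URLs are kept
SECONDS_PER_MONTH = 30 * 24 * 3600 # 30-day month

redirects_per_month = NEW_URLS_PER_MONTH * READ_WRITE_RATIO
shortens_per_sec = NEW_URLS_PER_MONTH / SECONDS_PER_MONTH
redirects_per_sec = redirects_per_month / SECONDS_PER_MONTH
stored_objects = NEW_URLS_PER_MONTH * 12 * RETENTION_YEARS

print(f"redirections per month: {redirects_per_month:,}")
print(f"shortenings per second: {shortens_per_sec:.0f}")
print(f"redirections per second: {redirects_per_sec:.0f}")
print(f"objects stored over {RETENTION_YEARS} years: {stored_objects:,}")
```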