CAP Theorem Super Simplified

Help me understand CAP theorem with an example.

Chapter 1: “Yatra Inc”, Your New Venture:

Last night, when your friends praised you for planning their group trip to Kashmir, a great idea struck you. While many people struggle to plan their dream vacations, you have an innate talent for crafting unforgettable journeys. So why not start a venture that puts your talent to use?

Yatra Inc! - Your remarkable journey partner!

Introducing “Yatra Inc.” - Your premier destination for personalized travel experiences. The idea is simple: Travelers can simply dial our hotline, 12KA4—42KA1-YATRA, and request tailor-made itineraries. And the best part? Our services come at an unbeatable price of just $10 per itinerary!

So, your typical phone conversation will look like this:

  • Customer: Hi, I'm looking to plan a once-in-a-lifetime trip. Can you assist?

  • You: Absolutely! What destinations are you considering, and what experiences are you craving?

  • Customer: I've always dreamt of exploring the Mediterranean coast and immersing myself in its rich history and culture.

  • You: (scribbling notes in your travel planner) Wonderful choice! Let's delve deeper into your preferences and create your trip plan. Call us back anytime for anything you need.

  • Customer: Thank you so much for your help!

  • You: It's our pleasure, and we've debited your account with $10 for the planning session.

Chapter 2: Expansion and Overload:

Your venture gets funded by Sequoia. Your idea spreads like wildfire and you experience an exponential surge in demand.

And there the problem starts. More and more of your customers have to wait in the queue to speak to you, and most of them hang up, tired of the long wait. Besides, when you were sick the other day and could not come to work, you lost a whole day of business.

You decide it’s time for you to scale up and bring in your friend to help you.

You start with a simple plan:

  1. You and your friend both get an extension phone.

  2. Customers still dial 12KA4—42KA1-YATRA and need to remember only one number.

  3. A PBX (private branch exchange) will route each customer call to whichever of you is free, spreading the calls evenly between you.
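The routing rule in step 3 can be sketched in a few lines. This is only a toy model, not how a real PBX is configured: the `route` and `hang_up` functions and the `busy`/`handled` fields are hypothetical names made up for this illustration, and "equally" is assumed to mean "prefer whoever has handled fewer calls so far".

```python
def route(operators):
    """Pick a free operator, preferring whoever has handled fewer calls."""
    free = [op for op in operators if not op["busy"]]
    if not free:
        return None  # everyone is busy: the caller waits in the queue
    chosen = min(free, key=lambda op: op["handled"])
    chosen["busy"] = True
    chosen["handled"] += 1
    return chosen["name"]


def hang_up(operators, name):
    """Mark an operator as free again once the call ends."""
    for op in operators:
        if op["name"] == name:
            op["busy"] = False


desk = [
    {"name": "you", "busy": False, "handled": 0},
    {"name": "friend", "busy": False, "handled": 0},
]

first = route(desk)    # both are free, so the first in the list takes it
second = route(desk)   # you are busy, so your friend takes it
third = route(desk)    # both busy, so nobody can take it
print(first, second, third)
```

Notice that nothing in this sketch looks at who the caller spoke to last time. The same customer can land on either desk.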

Chapter 3: The First Setback:

Two days after you implemented the new system, you get a call from your trusted customer Mohan. This is how it goes:

  • Mohan: Hey

  • You: Glad you called “Yatra Inc!”. What can I do for you?

  • Mohan: Can you tell me the hotel location in Gulmarg as per my plan?

  • You: Sure, one second, sir. (You look up your notebook. Wow! There is no entry for Gulmarg on Mohan’s page!)

  • You: Sir, I think there is a mistake. You never planned a trip to Gulmarg with us.

  • Mohan: What! I just planned my trip with you guys yesterday! (Hangs up the call!)

How did that happen? Could Mohan be lying? You think about it for a second and the reason hits you! Could Mohan’s call yesterday have reached your friend? You go to your friend’s desk and check her notebook. Sure enough, it’s there. You tell your friend, and she sees the problem too.

What a terrible flaw in your distributed design! Your distributed system is not consistent! There is always a chance that a customer’s update goes to one of you and the customer’s next call is routed to the other, who has no record of it. When that happens, Yatra Inc. cannot give a consistent reply!
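Here is a minimal sketch of what just went wrong. The `Operator` class and the itinerary string are made up for illustration; the only point is that each person writes to a private notebook that the other never sees.

```python
class Operator:
    """One person at Yatra Inc. with a private notebook (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.notebook = {}  # customer -> itinerary

    def update(self, customer, itinerary):
        self.notebook[customer] = itinerary

    def search(self, customer):
        return self.notebook.get(customer)  # None if there is no entry


you = Operator("you")
friend = Operator("friend")

# Mohan's planning call is routed to your friend...
friend.update("Mohan", "Gulmarg")

# ...but his follow-up call is routed to you.
answer_from_you = you.search("Mohan")
answer_from_friend = friend.search("Mohan")
print(answer_from_you, answer_from_friend)
```

The two answers differ depending on who picks up the phone, which is exactly the inconsistency Mohan ran into.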

Chapter 4: Striving for Consistency:

Well, your competitors may ignore bad service, but not you. You lie in bed thinking all night while your friend is sleeping, and by morning you have a beautiful plan. You wake up your friend and tell her:

“Bro, this is what we are going to do from now on:”

  • Whenever either of us gets a call for an update (when the customer wants us to plan or change the travel itinerary), we tell the other person before completing the call.

  • This way both of us note down any updates.

  • When there is a call for search (when the customer wants information about a plan he has already finalized), we don’t need to talk to the other person. Since both of our notebooks have the latest information, we can just look it up.

“There is only one problem though,” you say. “An ‘update’ request has to involve both of us, so we cannot work in parallel during that time. For example, when you get an update request and tell me to update too, I cannot take other calls. But that’s okay, because most of the calls we get are ‘search’ calls anyway (a customer updates once and asks many times). Besides, we cannot give wrong information at any cost.”

“Neat,” your friend says, “but there is one more flaw in this system that you haven’t thought of. What if one of us doesn’t report to work on a particular day? On that day we won’t be able to take any update calls, because the other person cannot be updated! We will have an availability problem. For example, if an update request comes to me, I will never be able to complete that call: even though I have written the update in my notebook, I cannot update you.”
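The plan from this chapter, and the flaw your friend spotted, can be sketched as follows. This is a toy model under stated assumptions: `Operator`, `update`, `search`, the `at_work` flag and the `Unavailable` exception are hypothetical names, and the sketch checks whether the other person is reachable before writing anything, so a failed update leaves both notebooks untouched.

```python
class Unavailable(Exception):
    """Raised when an update call cannot be completed."""


class Operator:
    def __init__(self, name):
        self.name = name
        self.notebook = {}   # customer -> itinerary
        self.at_work = True


def update(receiver, peer, customer, itinerary):
    # Assumption: check the peer first, so a failed update changes nothing.
    if not peer.at_work:
        raise Unavailable(f"{peer.name} is away, update cannot complete")
    receiver.notebook[customer] = itinerary
    peer.notebook[customer] = itinerary   # both notebooks written in one call


def search(receiver, customer):
    # Searches never need the other person.
    return receiver.notebook.get(customer)


you, friend = Operator("you"), Operator("friend")
update(friend, you, "Mohan", "Gulmarg")
both_agree = search(you, "Mohan") == search(friend, "Mohan")

you.at_work = False   # you call in sick
try:
    update(friend, you, "Mohan", "Pahalgam")
    update_completed = True
except Unavailable:
    update_completed = False

print(both_agree, update_completed)
```

Searches keep working on the day you are away, and the answers stay consistent, but no update call can be completed.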

Chapter 5: Balancing Consistency and Availability:

You begin to realize a little bit about why a distributed system might not be as easy as you thought at first. Is it that difficult to come up with a solution that could be both “Consistent and Available”? It could be difficult for others, but not for you!! The next morning you come up with a solution that your competitors cannot think of in their dreams! You wake your friend up eagerly again.

“Look,” you tell her, “this is what we can do to be consistent and available. The plan is mostly similar to what I told you yesterday:”

  • Whenever either of us gets a call for an update (when the customer wants us to plan or modify the itinerary), we tell the other person before completing the call, if the other person is available. This way both of us note down any updates.

  • But if the other person is not available (doesn’t report to work), we send the other person an email about the update.

  • The next day, when the other person comes back to work after the day off, he first goes through all the emails and updates his notebook accordingly before taking his first call.

“Genius!” your friend says. “I can’t find any flaws in this system. Let’s put it to use.” Yatra Inc. is now both consistent and available!
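The email plan can be sketched like this. Again, this is only an illustration under assumptions: the `inbox` list stands in for the email account, and `come_back` is a hypothetical name for "read all your emails before taking the first call".

```python
class Operator:
    def __init__(self, name):
        self.name = name
        self.notebook = {}   # customer -> itinerary
        self.inbox = []      # emails received while away
        self.at_work = True

    def come_back(self):
        # Replay the emails in arrival order before taking the first call.
        for customer, itinerary in self.inbox:
            self.notebook[customer] = itinerary
        self.inbox.clear()
        self.at_work = True


def update(receiver, peer, customer, itinerary):
    receiver.notebook[customer] = itinerary
    if peer.at_work:
        peer.notebook[customer] = itinerary        # tell them right away
    else:
        peer.inbox.append((customer, itinerary))   # send an email instead


you, friend = Operator("you"), Operator("friend")

you.at_work = False                        # you take a day off
update(friend, you, "Mohan", "Gulmarg")    # the call still completes
missed_while_away = "Mohan" not in you.notebook

you.come_back()                            # next morning: emails first
caught_up = you.notebook == friend.notebook
print(missed_while_away, caught_up)
```

The update call now completes even when one of you is away, and the notebooks agree again before the returning person answers any customer.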

Chapter 6: The Fallout of Discord:

Everything goes well for a while. Your system is consistent. Your system works well even when one of you doesn’t report to work. Your venture is making a lot of money and your friend is demanding more equity in the company.

Now, what if both of you report to work but one of you doesn’t update the other? Remember all those days you’ve been waking your friend up early with your Greatest-idea-ever-bullshit? What if your friend decides to take calls but is too angry with you to update you for a day? Your idea breaks! Your design so far is good for consistency and availability, but it is not partition tolerant!

You can tolerate the partition by refusing to take any calls until you patch things up with your friend. But then your system will not be “available” during that time…
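The choice you face while the two of you are not talking can be sketched as a single function. The `take_update` function, its `policy` values ("refuse" and "accept") and the `link_up` flag are hypothetical names invented for this illustration; notebooks are plain dictionaries.

```python
def take_update(policy, link_up, mine, theirs, customer, itinerary):
    """Return True if the call was completed, False if it was refused."""
    if link_up:
        mine[customer] = itinerary
        theirs[customer] = itinerary
        return True
    if policy == "refuse":
        return False              # stay consistent, give up availability
    mine[customer] = itinerary    # "accept": stay available, notebooks diverge
    return True


# Policy "refuse": no update is taken while you two are not talking.
yours, hers = {}, {}
refused = take_update("refuse", False, hers, yours, "Mohan", "Gulmarg")

# Policy "accept": the call completes, but only one notebook has the entry.
yours2, hers2 = {}, {}
accepted = take_update("accept", False, hers2, yours2, "Mohan", "Gulmarg")

print(refused, yours == hers)      # call refused, notebooks still agree
print(accepted, yours2 == hers2)   # call completed, notebooks disagree
```

Once the communication link is down, there is no third branch that both completes the call and keeps the notebooks identical.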

Chapter 7: Deciphering the CAP Theorem:

So let’s look at the CAP theorem now. It states that when you are designing a distributed system, you cannot achieve all three of Consistency, Availability, and Partition tolerance. You can pick only two of the following:

  • Consistency: Once your customers have updated information with you, they will always get the most up-to-date information on every subsequent call, no matter how quickly they call back.

  • Availability: Yatra Inc. will always be available for calls as long as at least one of you (you or your friend) reports to work.

  • Partition Tolerance: Yatra Inc. will work even if there is a communication loss between you and your friend!

Eventual Consistency and the Expedition Relay:

Here is some more food for thought. You can hire a run-around clerk who updates the other person’s notebook whenever your notebook or your friend’s notebook is updated. The greatest benefit is that he works in the background, so an “update” taken by you or your friend doesn’t have to block while waiting for the other one to write it down. This is how many NoSQL systems work: one node updates itself locally and a background process synchronizes all the other nodes accordingly. The only problem is that you lose consistency for some time. For example, a customer’s call reaches your friend first, and before the clerk has had a chance to update your notebook, the customer calls back and reaches you. Then he won’t get a consistent reply. That said, this is not a bad idea at all if such cases are rare, for example if you can assume a customer won’t change their plans and call back within 5 minutes.
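The clerk idea can be sketched as a queue of pending updates. This is a simplified model, not how any particular NoSQL system does it: `pending`, `update` and `clerk_sync` are hypothetical names, and the clerk is called by hand here instead of running in the background so that the stale window is easy to see.

```python
from collections import deque


class Operator:
    def __init__(self, name):
        self.name = name
        self.notebook = {}   # customer -> itinerary


pending = deque()  # updates the clerk has not carried over yet


def update(receiver, customer, itinerary):
    receiver.notebook[customer] = itinerary           # complete the call at once
    pending.append((receiver, customer, itinerary))   # leave a note for the clerk


def clerk_sync(operators):
    # In a real system this would run in the background; here we call it by hand.
    while pending:
        source, customer, itinerary = pending.popleft()
        for op in operators:
            if op is not source:
                op.notebook[customer] = itinerary


you, friend = Operator("you"), Operator("friend")

update(friend, "Mohan", "Gulmarg")
stale_answer = you.notebook.get("Mohan")   # the clerk has not arrived yet

clerk_sync([you, friend])
fresh_answer = you.notebook.get("Mohan")   # now both notebooks agree
print(stale_answer, fresh_answer)
```

Between the update and the clerk’s visit, a call routed to you gets a stale answer. After the visit, the notebooks agree again, which is what “eventual” means here.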