Wednesday, December 26, 2018

Consistency levels in Azure Cosmos DB.

There are 5 consistency models supported by Cosmos DB:



Strong Consistency: In this model, a read always returns the most recent committed write. A client never sees uncommitted data. This is the strongest of the 5 consistency levels. Historically, using this level required that the Azure Cosmos DB account be associated with only one region; check the current documentation for the restrictions that apply today.


Bounded Staleness Consistency: In this model, reads are allowed to lag behind writes, so a client may see stale (but still committed) data. The staleness is bounded by a threshold, which can be set either as a time interval or as a number of versions. For example, with a time threshold of one hour, a read may return data that is up to one hour old, but never older than that.


Let’s take the example of recording a student's exam marks. A teacher is correcting the exam paper online and awarding marks, while the student and a parent watch the marks online. Let’s assume the teacher, the student, and the parent are in different regions. Server A is in the region nearest to the teacher, Server B is in the region nearest to the student, and Server C is in the region nearest to the parent.

As the correction has not yet started, the servers in all regions show ‘0’ marks.





Suppose bounded-staleness consistency is used with a threshold of 3 hours, and we start watching at 8 AM. Before 8 AM the teacher has already awarded 10 marks, so all the servers show the value 10. At 9 AM the teacher raises the marks to 15; Server A shows 15 immediately, but because the threshold has not yet been crossed, Servers B and C may still show 10. In the same way, at 10 AM the teacher raises the marks to 20, and the other 2 servers may still not reflect it. At 11 AM the teacher awards 30 marks; in this example the 3-hour window that began at 8 AM has also passed, so the other 2 servers have caught up and show the recent write. Whatever the exact timing, the guarantee is that Servers B and C are never more than 3 hours behind Server A.
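The bound in the exam marks example can be sketched as a small check. This is a toy model, not Cosmos DB's API: the `Write` record, the `within_bound` function, and the 7 AM time for the first write are all assumptions made for illustration. A replica is within the bound if it is missing no more than `max_versions` writes, and if the oldest write it is missing is no older than `max_hours`.

```python
from dataclasses import dataclass


@dataclass
class Write:
    version: int   # write number, increasing by one per update
    hour: float    # time of the write in hours, e.g. 9.0 for 9 AM
    marks: int


def within_bound(log, replica_version, now_hour, max_versions=None, max_hours=None):
    """Return True if a replica that has applied writes up to
    `replica_version` still satisfies the staleness bound at `now_hour`."""
    latest = log[-1]
    if max_versions is not None and latest.version - replica_version > max_versions:
        return False
    if max_hours is not None:
        # Writes the replica has not applied yet, oldest first.
        missing = [w for w in log if w.version > replica_version]
        if missing and now_hour - missing[0].hour > max_hours:
            return False
    return True


# Hypothetical write log for the teacher's updates (7 AM is an assumed time).
log = [Write(1, 7.0, 10), Write(2, 9.0, 15), Write(3, 10.0, 20), Write(4, 11.0, 30)]

# Server B still shows 10 (version 1) at 11 AM: the oldest missing write is
# 2 hours old, so a 3-hour bound is still respected.
ok_at_11 = within_bound(log, 1, 11.0, max_hours=3)

# At 12:30 PM the same replica would be 3.5 hours behind, which violates the bound.
ok_at_1230 = within_bound(log, 1, 12.5, max_hours=3)
```

With a version threshold instead of a time threshold, the same replica would already be out of bounds at 11 AM if only 2 missing versions were allowed, since it is 3 writes behind.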




  
Session Consistency: With this model, a client always sees its own most recent write. All other sessions see that write only once it has been committed and replicated to them, which happens eventually.

In our example of recording a student's exam marks, if session consistency is used, the teacher will always see her latest write, but the student and the parent will see it only after the data eventually becomes consistent. So at 9 AM, when the teacher changes the value to 15 marks, it is written to Server A and the teacher immediately sees 15 marks, while the other 2 servers still show 10 marks. In the same way, at 10 AM the teacher sees 20 marks on Server A and the other 2 servers still show 10 marks. Once the changes have been replicated, the other 2 servers also show the most recent write, which is 30.
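One way to picture the "always see your own write" guarantee is a token that each session carries. The sketch below is a toy model under that assumption; the `Replica` and `Session` classes are hypothetical and are not part of any Cosmos DB SDK. A session remembers the highest version it has written or read, and only accepts a read from a replica that has caught up to that version.

```python
class Replica:
    """A toy replica holding the marks and the version it has applied."""

    def __init__(self):
        self.version = 0
        self.marks = 0

    def apply(self, version, marks):
        # Ignore anything older than what this replica already has.
        if version > self.version:
            self.version, self.marks = version, marks


class Session:
    """Tracks the highest version this client has written or read."""

    def __init__(self):
        self.token = 0

    def write(self, replica, marks):
        replica.apply(replica.version + 1, marks)
        self.token = replica.version

    def read(self, replicas):
        # Use the first replica that has caught up with this session's token.
        for replica in replicas:
            if replica.version >= self.token:
                self.token = replica.version
                return replica.marks
        raise RuntimeError("no replica has caught up with this session yet")


server_a, server_b = Replica(), Replica()
server_a.apply(1, 10)
server_b.apply(1, 10)

teacher, student = Session(), Session()
teacher.write(server_a, 15)          # the 9 AM update lands on Server A only

# The teacher's session skips Server B (still at version 1) and reads 15.
teacher_view = teacher.read([server_b, server_a])
# The student's session has no such token, so Server B's older value is fine.
student_view = student.read([server_b, server_a])

server_b.apply(2, 15)                # replication eventually reaches Server B
student_after = student.read([server_b, server_a])
```

Here the teacher reads 15 right away, while the student reads 10 until Server B receives the update.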








Consistent prefix Consistency: This model guarantees that sequential reads see writes only in the order in which they were made. For example, if the value is changed from 10 to 15 and then to 30, successive reads on the other servers may show 10, 10, 15 or 10, 15, 15 or 10, 15, 30, but they will never show out-of-order writes. That means reads on the other servers will never see values like 10, 30, 15 or 30, 10, 15, as that is not the sequence in which the value was updated. Reads always show the data in the same sequence in which it was written.

In our example, as the teacher updates the marks to 10, then to 15, then to 20 and finally to 30, the student and the parent also see the marks in the same sequence: 10, 15, 20 and 30.
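The ordering rule can be written down as a small check. This is only an illustrative sketch: `is_consistent_prefix` is a hypothetical helper, and it assumes every written value is distinct so that a value identifies its position in the write order.

```python
def is_consistent_prefix(writes, reads):
    """Return True if the sequence of `reads` never goes backwards
    relative to the order of `writes` (values assumed distinct)."""
    position = {value: i for i, value in enumerate(writes)}
    last = -1
    for value in reads:
        if value not in position:
            return False          # a value that was never written
        if position[value] < last:
            return False          # an older write seen after a newer one
        last = position[value]
    return True


writes = [10, 15, 20, 30]

allowed = [is_consistent_prefix(writes, r) for r in ([10, 10, 15], [10, 15, 15], [10, 15, 30])]
rejected = [is_consistent_prefix(writes, r) for r in ([10, 30, 15], [30, 10, 15])]
```

Repeating a value or skipping ahead passes the check; only going back to an older value fails it.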






Eventual Consistency: This is the weakest of all the consistency models. There is no guarantee of consistent reads while using this model. As there is no threshold set, we can never be sure that the data returned by a read is up to date. The Probabilistic Bounded Staleness (PBS) metric can be monitored to estimate how often reads are in fact strongly consistent.

In our example, let’s say that at 8 AM the teacher updates the marks to 5, but the other servers still show 0 marks. In the same way, at 9 AM, even though the marks have been updated to 15, the other servers may still show 0 marks. Once the data eventually becomes consistent, all the servers show the same value.
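The idea of replicas disagreeing for a while and then converging can be simulated in a few lines. This is a toy model with assumed names (`replicate`, `history`), not a description of how Cosmos DB replicates data. For simplicity it applies writes in order on each replica, which eventual consistency itself does not promise; the point is only that the lagging servers catch up at unpredictable moments and end on the same value.

```python
import random


def replicate(writes, replica_count=3, seed=0):
    """Simulate lagging replicas catching up one write at a time.

    Returns the value shown by every replica after each replication step.
    Replica 0 is the write region and already has every write.
    """
    rng = random.Random(seed)
    applied = [len(writes)] + [0] * (replica_count - 1)
    history = []
    while any(n < len(writes) for n in applied):
        # Pick a lagging replica at random and let it apply its next write.
        lagging = [i for i, n in enumerate(applied) if n < len(writes)]
        i = rng.choice(lagging)
        applied[i] += 1
        # A replica that has applied nothing yet still shows the initial 0 marks.
        history.append([writes[n - 1] if n else 0 for n in applied])
    return history


# The teacher's writes from the example: 5 marks, then 15 marks.
history = replicate([5, 15])
final_state = history[-1]
```

Every intermediate row in `history` shows Server A at 15 while the other servers lag; the last row has all three servers at 15.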






Most real-world applications use the Session consistency level. If consistency close to strong is required along with multiple regions, Bounded staleness consistency is used. When the highest availability and the lowest latency are required, Eventual consistency is used.


Please share the consistency level used in your environment and the reason for choosing it.



Thanks VV!!

