Thursday, January 24, 2019

Entity Hierarchy in Cosmos DB.

In this article, we look at the different entities of Cosmos DB and give a brief overview of each. The picture below illustrates the entity hierarchy of Cosmos DB.




We first need to create a Cosmos DB account under an Azure subscription. Once we have an account, we can start creating databases under it; one account can hold one or more databases. A database in Cosmos DB is comparable to a namespace: it is a logical grouping of containers and helps in managing them. The kinds of entities in the database differ depending on the API (Application Programming Interface) we select.
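As a rough mental model, the account, database and container levels can be sketched with plain Python dataclasses. This is not the Azure SDK: the class names, methods and sample names below are made up purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Container:
    """Illustrative stand-in for a container (not an SDK type)."""
    name: str
    items: List[dict] = field(default_factory=list)


@dataclass
class Database:
    """Logical grouping (namespace) of containers."""
    name: str
    containers: Dict[str, Container] = field(default_factory=dict)

    def create_container(self, name: str) -> Container:
        container = Container(name)
        self.containers[name] = container
        return container


@dataclass
class Account:
    """Top-level entity, created under an Azure subscription."""
    name: str
    databases: Dict[str, Database] = field(default_factory=dict)

    def create_database(self, name: str) -> Database:
        database = Database(name)
        self.databases[name] = database
        return database


# Hypothetical names, for illustration only.
account = Account("demo-account")
hr = account.create_database("hr")
employees = hr.create_container("employees")
employees.items.append({"id": "1", "name": "Asha"})
```

Walking from `account` down to an item mirrors the order in the hierarchy: account, then database, then container, then item.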

Each database can have one or more containers. A container is the unit at which throughput and storage are managed for the items it holds. That is, we select the throughput and storage capacity when creating the container, and these values can be altered later as required. The data in a container is logically partitioned automatically based on the partition key, so new logical partitions are created automatically as new data is added. These logical partitions are mapped to physical partitions. Items within a single logical partition can be updated in a transaction with snapshot isolation.
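To make the logical-to-physical mapping concrete, here is a small sketch in Python. It is only an illustration of the idea: the real hashing and placement are internal to Cosmos DB, so the MD5 hash, the fixed count of two physical partitions, and the `department` property are all assumptions made for this example.

```python
import hashlib
from collections import defaultdict

PHYSICAL_PARTITION_COUNT = 2  # assumed and fixed, only for this sketch


def physical_partition_for(partition_key_value: str) -> int:
    """Map a logical partition (one key value) to a physical partition.

    MD5 is only a stable stand-in hash; it is not what Cosmos DB uses.
    """
    digest = hashlib.md5(partition_key_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % PHYSICAL_PARTITION_COUNT


def group_by_logical_partition(items, partition_key):
    """Group items by the value of their partition key property."""
    partitions = defaultdict(list)
    for item in items:
        partitions[item[partition_key]].append(item)
    return dict(partitions)


# Hypothetical items, partitioned on a made-up "department" property.
items = [
    {"id": "1", "department": "sales"},
    {"id": "2", "department": "sales"},
    {"id": "3", "department": "engineering"},
]

logical = group_by_logical_partition(items, "department")
placement = {key: physical_partition_for(key) for key in logical}
```

Every item with the same partition key value lands in the same logical partition, and each logical partition maps to exactly one physical partition.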

The throughput of a container is also distributed across its partitions. It can be configured in two modes:

  • Dedicated: The throughput provisioned on the container is reserved exclusively for that container.
  • Shared: The container shares throughput with the other containers of the database that are also set to shared mode.
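The two modes can be pictured with a toy model in Python. This is a sketch, not the Azure SDK: the `Database` class, its fields and the throughput figures (in request units per second, RU/s) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Database:
    """Toy model of a database and the throughput mode of its containers."""
    shared_rus: int  # RU/s provisioned on the database (assumed value)
    # container name -> dedicated RU/s, or None when the container is shared
    containers: Dict[str, Optional[int]] = field(default_factory=dict)

    def throughput_source(self, container: str) -> str:
        """Return which pool a container draws its throughput from."""
        dedicated = self.containers[container]
        return "dedicated" if dedicated is not None else "shared"

    def shared_containers(self) -> List[str]:
        """Containers that draw from the database-level pool."""
        return [name for name, rus in self.containers.items() if rus is None]


# Hypothetical setup: one dedicated container, two shared ones.
db = Database(shared_rus=1000)
db.containers["orders"] = 400   # dedicated: reserved for "orders" alone
db.containers["logs"] = None    # shared with "audit"
db.containers["audit"] = None   # shared with "logs"
```

Here `orders` keeps its own 400 RU/s, while `logs` and `audit` compete for the 1000 RU/s set on the database.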


Containers are schema-agnostic, which means it is not mandatory to define a schema before using them, although we may still follow one if we wish. Because of this, all items of a container are indexed automatically by default. We can manage indexing through the container's indexing policy.
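An indexing policy is a small JSON document attached to the container. The sketch below builds one as a Python dictionary: every path is indexed except one subtree. The `/notes/*` path is a hypothetical property chosen for the example, and the exact shape your account accepts should be checked against the official documentation.

```python
import json

# Index everything automatically, except the subtree under a
# hypothetical "notes" property that we never query on.
indexing_policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [{"path": "/notes/*"}],
}

# The policy is sent to the service as JSON.
policy_json = json.dumps(indexing_policy, indent=2)
print(policy_json)
```

Excluding paths that are never queried is a common reason to move away from the index-everything default.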

A container can hold different types of items: one item may represent an employee, another a vehicle, and so on. What an item is depends on the API we select. For example, we can use the SQL API to build a non-relational database that is queried with SQL syntax, the Gremlin API to build a graph database, or the Table API if we are migrating from Azure Table storage to Cosmos DB.

A unique key constraint can be created on a container to enforce the uniqueness of one or more values per logical partition key. This prevents duplicate values for the properties named in the constraint. The properties a container exposes vary with the API we choose. We can also maintain an operations log of a container by using the change feed option. Through the change feed, we can maintain before/after images of the items of a container.
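The "per logical partition key" scope of a unique key is easy to miss, so here is a toy Python model of the rule. It is not the SDK: the `Container` class, the `UniqueKeyViolation` exception, and the `tenant` and `email` properties are all invented for this sketch.

```python
class UniqueKeyViolation(Exception):
    """Raised when an insert would duplicate a unique key in a partition."""


class Container:
    """Toy container that models only the unique key rule."""

    def __init__(self, partition_key, unique_keys):
        self.partition_key = partition_key
        self.unique_keys = tuple(unique_keys)
        self.items = []
        self._seen = set()

    def create_item(self, item):
        # Uniqueness is scoped to the logical partition, so the partition
        # key value is part of the lookup key.
        scope = (item[self.partition_key],) + tuple(
            item.get(key) for key in self.unique_keys
        )
        if scope in self._seen:
            raise UniqueKeyViolation(
                f"duplicate {self.unique_keys} in partition "
                f"{item[self.partition_key]!r}"
            )
        self._seen.add(scope)
        self.items.append(item)


users = Container(partition_key="tenant", unique_keys=["email"])
users.create_item({"id": "1", "tenant": "contoso", "email": "a@example.com"})
# Same email, different logical partition: allowed.
users.create_item({"id": "2", "tenant": "fabrikam", "email": "a@example.com"})

try:
    # Same email, same logical partition: rejected.
    users.create_item({"id": "3", "tenant": "contoso", "email": "a@example.com"})
    rejected = False
except UniqueKeyViolation:
    rejected = True
```

The same email is accepted under two different tenants but rejected the second time it appears under the same tenant.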

The life span of the items of a container can also be managed by using the Time To Live (TTL) option. TTL can be set on particular items so that they are deleted after a certain period of time, or it can be set on the container itself so that it applies to its items. Items with a TTL value are deleted from the container once that period has elapsed.
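How the container-level and item-level TTL values combine can be sketched as a small Python function. The assumptions are these: `ttl` is the item property, `_ts` is the item's last-modified time in epoch seconds, a container default of `None` means TTL is switched off, and `-1` means "never expire". The function name `is_expired` and the sample items are hypothetical; the service itself performs the deletion.

```python
from typing import Optional


def is_expired(item: dict, container_default_ttl: Optional[int], now: int) -> bool:
    """Decide whether an item would be eligible for deletion by TTL."""
    if container_default_ttl is None:
        # TTL is off for the container, so item-level values are ignored.
        return False
    # An item-level ttl overrides the container default.
    ttl = item.get("ttl", container_default_ttl)
    if ttl == -1:
        return False  # never expires
    return now - item["_ts"] >= ttl


now = 1_000
fresh = {"id": "1", "_ts": 950}
stale = {"id": "2", "_ts": 100}
pinned = {"id": "3", "_ts": 100, "ttl": -1}
short_lived = {"id": "4", "_ts": 990, "ttl": 5}

# With a container default of 300 seconds:
results = {
    item["id"]: is_expired(item, container_default_ttl=300, now=now)
    for item in (fresh, stale, pinned, short_lived)
}
```

With these values, only the `stale` and `short_lived` items are past their time to live; with the container default set to `None`, nothing expires at all.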


Thanks VV!!

#cosmosdb #Containers, #items, #Entity Hierarchy