
SQL vs NoSQL


This post covers the main high-level differences between SQL and NoSQL databases.


| SQL | NoSQL |
| --- | --- |
| Relational | Non-relational |
| Structured data stored in tables | Unstructured data, typically stored as JSON documents in files; graph databases additionally store relationships |
| Strict schema | Dynamic schema |
| Vertically scalable | Horizontally scalable |
| Structured Query Language | Unstructured query language |
| ACID transactions | CAP theorem |
| Requires downtime | Automatic in most cases; no outage required |
| Rigid schema bound to relationships | Non-rigid, flexible schema |
| Well suited to complex queries | No joins and no powerful tooling for building complex queries |
| Recommended and best suited for OLTP (Online Transaction Processing) systems | Less likely to be considered for OLTP systems |
| Storage: tables (row -> entity, column -> attribute). RDBMS: Oracle, MySQL, SQL Server, IBM DB2, etc. | Storage: key-value (Redis, Dynamo), document (MongoDB), graph (Neo4j, InfiniteGraph), wide-column (Cassandra, HBase) |
| Not a good fit for hierarchical data | A good fit for hierarchical data, since values are stored as key-value pairs |
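To make the contrast between tables and documents concrete, here is a minimal Python sketch. It stores the same customer with two orders first in relational tables (using the standard-library sqlite3 module with an in-memory database) and then as a single nested JSON document. The table names, column names, and sample data are illustrative assumptions, not taken from any particular system.

```python
import json
import sqlite3

# Relational form: fixed schema, one table per entity, rows linked by keys.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customer(id), "
    "item TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Asha')")
conn.executemany(
    "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
    [(1, "keyboard"), (1, "monitor")],
)

# A join reassembles the relationship at query time.
rows = conn.execute(
    "SELECT c.name, o.item FROM customer c "
    "JOIN orders o ON o.customer_id = c.id ORDER BY o.id"
).fetchall()

# Document form: the hierarchy is stored inline, and fields may vary per document.
document = {
    "_id": 1,
    "name": "Asha",
    "orders": [{"item": "keyboard"}, {"item": "monitor"}],
}
serialized = json.dumps(document)
items_from_document = [o["item"] for o in json.loads(serialized)["orders"]]
```

In the relational version the schema is declared up front and the relationship is recovered with a join; in the document version the orders travel with the customer and no join is needed.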

BASE Properties for Distributed Database Transactions


In the previous post, you learned about the ACID properties of database transactions in traditional databases. As transaction volumes grow over the internet, systems have become scalable and distributed. In some of these systems, availability is more important than consistency.

Examples include Amazon and eBay.

For such systems, Eric Brewer introduced the CAP theorem in 2000. It states that:

“A distributed system can provide only two of the following three: Consistency, Availability, and Partition Tolerance. One of them must be sacrificed. You can’t promise all three at once across read/write requests.”

Based on the CAP theorem, BASE was introduced as an alternative to ACID for distributed database transactions where scalability and availability matter most.

BASE stands for:

Basically Available

The system guarantees availability.

The focus is on availability, potentially with outdated data. There is no guarantee of global data consistency across the entire system.

Soft-state

The state of the system may change over time.

Even without explicit state updates, data may change due to the asynchronous propagation of updates and to nodes becoming available.

Eventually consistent

The system will eventually become consistent.

Updates are eventually propagated, so the system reaches a consistent state once no further updates arrive and network partitions are fixed.
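The three BASE properties can be illustrated with a small simulation. The following is a minimal Python sketch, not a real database: the `Replica` and `EventuallyConsistentStore` classes and the explicit `propagate` step are hypothetical names invented for illustration. A write is accepted by one replica right away, other replicas may serve stale data, and all replicas converge once the pending updates are applied.

```python
from collections import deque


class Replica:
    """One node holding its own copy of the data."""

    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Toy store: writes land on one replica and spread asynchronously."""

    def __init__(self, replica_count):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = deque()  # updates not yet applied on every replica

    def write(self, key, value, replica_index=0):
        # Basically available: one replica accepts the write immediately.
        self.replicas[replica_index].data[key] = value
        for i in range(len(self.replicas)):
            if i != replica_index:
                self.pending.append((i, key, value))

    def read(self, key, replica_index):
        # May return stale data (or None) until propagation catches up.
        return self.replicas[replica_index].data.get(key)

    def propagate(self):
        # Soft state: replicas change here without any new client write.
        while self.pending:
            i, key, value = self.pending.popleft()
            self.replicas[i].data[key] = value


store = EventuallyConsistentStore(replica_count=3)
store.write("cart", ["book"], replica_index=0)
stale = store.read("cart", replica_index=2)  # replica 2 has not seen the write yet
store.propagate()                            # background replication runs
fresh = store.read("cart", replica_index=2)  # replica 2 has now caught up
```

In a real system the propagation would run in the background over the network; here it is an explicit call so the stale read and the later consistent read are easy to observe.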


CAP Theorem


Nowadays, most enterprise applications are distributed (a collection of interconnected nodes that share data) over the internet or cloud, which increases the availability of the systems. As an application grows in users and transaction counts and requires persistence, database scalability becomes a big concern.

With these concerns in mind, Eric Brewer proposed the CAP theorem, also known as Brewer’s conjecture, in 2000.

The CAP theorem states that:

“A distributed system can provide only two of the following three: Consistency, Availability, and Partition Tolerance. One of them must be sacrificed. You can’t promise all three at once across read/write requests.”

  • Consistency: Every read request receives the most recent write or an error.
  • Availability: Every request should receive a (non-error) response, without the guarantee that it contains the most recent write.
  • Partition Tolerance: The system continues to work despite an arbitrary number of messages being dropped/delayed by the network between nodes/partitions.


In the CAP theorem, consistency means something quite different from consistency in ACID database transactions. In distributed systems, partition tolerance means the system continues to work unless there is a complete network failure. If a few nodes fail, the system should keep going.

CAP Theorem Example

You can choose your system technologies based on how much you prioritize Consistency, Availability, and Partition Tolerance. Here we take one example based on database selection:

CA (Consistency + Availability) Type

In this type of system, consistency and availability are the primary constraints, but the system does not tolerate partitions: if one node goes offline, the whole system goes offline. Otherwise, some nodes would be inconsistent and would not have the latest information.

For example, Oracle and MySQL are good at consistency and availability but are not partition tolerant.

CP (Consistency + Partition Tolerant) Type

In this type of system, consistency and partition tolerance are the primary constraints, but the system does not guarantee availability and returns an error for as long as the partitioned state is not resolved.

For example, Hadoop and MongoDB store redundant data on multiple slave nodes and tolerate the outage of a large number of nodes in the cluster.
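The CP trade-off can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Hadoop or MongoDB is implemented: the `CPStore` class, the `PartitionError` exception, and the `reachable` flags are hypothetical names made up for illustration. Every request needs a majority of nodes; without one, the store returns an error rather than risk an inconsistent answer.

```python
class PartitionError(Exception):
    """Raised when a majority of nodes cannot be reached."""


class CPStore:
    """Toy CP store: every request must reach a majority of nodes."""

    def __init__(self, node_count):
        self.nodes = [{} for _ in range(node_count)]  # key -> (version, value)
        self.reachable = [True] * node_count

    def _quorum(self):
        up = [i for i, ok in enumerate(self.reachable) if ok]
        if len(up) <= len(self.nodes) // 2:
            # Consistency is kept by giving up availability.
            raise PartitionError("no majority reachable; refusing request")
        return up

    def write(self, key, value):
        quorum = self._quorum()
        version = 1 + max(self.nodes[i].get(key, (0, None))[0] for i in quorum)
        for i in quorum:
            self.nodes[i][key] = (version, value)

    def read(self, key):
        # Any two majorities overlap, so the newest version is always visible.
        entries = [self.nodes[i].get(key, (0, None)) for i in self._quorum()]
        return max(entries, key=lambda entry: entry[0])[1]


store = CPStore(node_count=3)
store.write("balance", 100)

store.reachable = [True, False, False]  # this side of the partition is a minority
try:
    store.write("balance", 50)
    rejected = False
except PartitionError:
    rejected = True                     # the write is refused, not applied

store.reachable = [True, True, True]    # partition resolved
value = store.read("balance")           # still the last successfully written value
```

The version number stored with each value lets a read pick the most recent write even when one of the nodes it contacts missed an earlier update.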

AP (Availability + Partition Tolerant) Type

Such a system cannot guarantee consistency, because updates can still be made on either side of a partition when some nodes fail or the network has issues. As a result, different nodes can hold different values.

For example, CouchDB, DynamoDB, and Cassandra are AP-type databases.

Note: CouchDB stores values as JSON documents and DynamoDB stores them as key-value pairs, while Cassandra stores values in column families.
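The AP behavior described above can be sketched as follows. This is a minimal Python illustration, not the mechanism of any specific database: the `APNode` class and `reconcile` function are hypothetical, the timestamps are plain numbers supplied by the caller, and last-write-wins is only one of several possible conflict-resolution strategies, chosen here as an assumption for simplicity.

```python
class APNode:
    """Toy AP node: accepts every write, even when peers are unreachable."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def reconcile(a, b):
    """Last-write-wins merge: both nodes keep the entry with the newest timestamp."""
    for key in set(a.data) | set(b.data):
        candidates = [node.data[key] for node in (a, b) if key in node.data]
        winner = max(candidates, key=lambda entry: entry[0])
        a.data[key] = winner
        b.data[key] = winner


node_a, node_b = APNode("a"), APNode("b")

# During a partition, clients write to whichever node they can reach.
node_a.write("status", "shipped", timestamp=1)
node_b.write("status", "cancelled", timestamp=2)
diverged = (node_a.read("status"), node_b.read("status"))   # nodes disagree

reconcile(node_a, node_b)                                   # partition heals
converged = (node_a.read("status"), node_b.read("status"))  # nodes agree again
```

Both writes succeed while the nodes are partitioned, so the system stays available, at the cost of the two nodes temporarily returning different values for the same key.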
