Database Consistency Explained

May 27, 20227 minute read

John Noonan

What is database consistency?

Database consistency is defined by a set of values that all data points within the database system must align to in order to be properly read and accepted. Should any data that does not meet the preconditioned values enter the database, it will result in consistency errors for the dataset. Database consistency is achieved by establishing rules. Any transaction of data written to the database must only change affected data as defined by the specific constraints, triggers, variables, cascades, etc., established by the rules set by the database’s developer.

For example, let’s say you work for the National Traffic Safety Institute (NTSI). You’ve been tasked with creating a database of new California driver’s licenses. The population of California has exploded in the past ten years, creating the need for a new alphabet and numerical format for all first-time driver’s license holders. Your team has determined that the new set value for a California driver’s license in your database goes as follows: 1 Alpha + 7 Numeric. Every single entry must now follow this rule. An entry that reads “C08846024” – would return with an error. Why? Because the value entered was 1 Alpha + 8 Numeric, which is, in essence, a form of inconsistent data.

Consistency also implies that any data changes to any one particular object in one table need to be changed in all other tables where that object is present. Keeping the driver’s license example going, should the new driver’s home address change, that update must be represented across all tables where that prior address existed. If one table has the old address and all the others have the new address, that would be a prime example of data inconsistency.

Note: Database consistency doesn’t guarantee that the data introduced in any given transaction is correct. It only guarantees that the data written and read within the system meets all prerequisites of data that is eligible for entry into the database. To put it simpler, given the example above, you can very well enter a data transaction that meets the 1 Alpha + 7 Numeric rule, but that doesn’t guarantee that the data corresponds to an actual driver’s license. Database consistency doesn’t account for what the data represents, just its format.

Why is database consistency important?

Consistent data is what keeps a database working like a well-oiled machine. Established rules/values that keep inconsistent data out of primary databases and replicas keep its operations smooth with:

Accuracy
Increased database space
Faster and more efficient data retrieval

Database consistency regulates all data coming in. So although the database changes when accepting new data, it at least changes consistently and in accordance with the validation rules established at the onset. In today’s world, there are daily billion-dollar decisions made around the globe based on the perceived consistency of a database. When real-time information becomes the new status quo for modern-day digital businesses, it’s vitally important that validation rules are put in place to keep datasets clear of erroneous information, as that also increases latency, making real-time experiences not so real-time after all.

Database consistency examples

What are examples of a database consistency operation in the real world? We’ve already explored one example with our NTSI scenario above. Let’s pivot to the world of banking.

Say you’re transferring funds from one account to another. You’ve just transferred $1200 into an account that already has $300. You refresh, positive you’ll find a $1500 balance. Yet, this recent operation isn’t reflected in your balance. In fact, your new balance now reads $0. This technical slight is a prime example of weak consistency and will likely result in time spent troubleshooting the problem with a bank representative. Issues like these can tarnish a brand’s reputation and cost a significant amount of money. Strong consistency in database systems is becoming more and more of a non-negotiable, for developers and consumers alike.

Strong consistency vs weak consistency

Strong Consistency means that all data in a primary, replica and all its corresponding nodes fit the validation rules and are the same at any given time. With strong database consistency, it does not matter what client is accessing the data – they will always see the most recently updated data that follows the rules established for the database.

Weak consistency is a bit like the proverbial wild, wild west. There are no assurances that your data in your primary, replica, or nodes is the same at any given moment. One client in India could access the data and see information that passes the validation rules, but may not be the most recently updated data, resulting in consistency errors. They could very well be acting on information that is no longer relevant, even though at one point it may have been.

Consistency levels

Consistency levels are another set of preconditioned values that dictate how many replicas or nodes must respond with the new permissible data before it is acknowledged as a valid transaction. This operation can be changed on a per-transaction basis. So, for example, a programmer can dictate that only two nodes need to read the newly input data before it acknowledges data consistency. Once it crosses that barometer, it will be considered consistent data thereafter.

Isolation levels

Isolation levels are part of a database’s ACID (Atomicity, Consistency, Isolation, Durability) properties. ACID is a foundational concept of database consistency with SQL databases and is what certain databases follow in order to optimize database consistency. Isolation is one of ACID’s properties, and it compartmentalizes certain pieces of data away from all the information in a certain database network, keeping it from being modified by other user transactions. Isolation is leveraged to curtail reads and writes of inconsequential data produced in concurrent transactions.

There are four types of isolation levels:

Read Uncommitted: Lowest level. Stops row updates if a previous transaction provided an uncommitted update to that row.
Read Committed: This does not allow “dirty reads.” This blocks any reads or writes if a transaction has already been updated, but has yet committed.
Repeatable Read: This level keeps the row of data that is being read from being accessed and potentially updated.
Serializable: The highest isolation level, serializable typically locks an entire table rather than a specific row of data.

Database consistency FAQs

Data is consistent if it appears the same in all corresponding nodes at the same time, regardless of the user and where they are accessing the data, geographically.

No. Database consistency requires validation rules for data entering a network in order for it to be consistent, formula-wise, with all the other data in the table.

Data consistency is the process by which data is kept as uniformly as possible throughout the network and between numerous applications leveraging that data.

With eventual consistency, data that has undergone an update will eventually be reflected in all the nodes where that data is stored. Eventually, all nodes will produce the same data whenever any client accesses it in the network through eventual consistency.

All the data in a relational database is stored in tables, which consist of rows and columns. The data points are organized in these rows and columns. Rows, which are often referred to as “records” typically represent the categories of data, while the columns, or “fields,” stand-in for “instances.” A table is found within a database and helps keep your data from getting redundant with its subject-based design.

Tables

The main difference between ACID and BASE (Basically Available, Soft State, Eventually Consistent) models is that while ACID works to optimize database consistency, BASE strengthens high availability. ACID keeps transactions consistent, so if you go with a BASE model, make sure that consistency remains a top priority and is thoroughly addressed.

When Redis is used as a cache, the consistency in question could be between Redis instances (primary/replica) and between Redis cache and Redis as a primary database. In this instance, data can be inconsistent if data between those two do not match. Our blog, Three Ways to Maintain Cache Consistency, addresses how to solve this issue.

For open source Redis, there is weak consistency, but Redis Enterprise’s Active–Active Geo-Distribution provides strong eventual consistency.

Interested in cloud caching techniques for enterprise applications? Click below to read Lee Atchison’s Caching at Scale with Redis.

Read the e-book

Get started with Redis today

Speak to a Redis expert and learn more about enterprise-grade Redis today.

Try for free Talk to sales