

A Look Into the Latest DBA Discussions: Distributed SQL vs. NewSQL Databases

In this series on hot DBA discussion topics, we compare distributed SQL databases against NewSQL databases to understand the significant differences. Before we jump into the NewSQL category, however, it is vital to understand why NoSQL databases such as MongoDB and Cassandra started gaining popularity in the last decade. They entered the DBMS playground as innovative alternatives to relational SQL databases but fell short of that objective.

From experience, we now know that the relational SQL databases of the time were monolithic, and the distributed nature of NoSQL databases was attractive to applications that needed more scalability. Since most NoSQL systems focused on key-value (single-row) data models and did not handle the multi-row or relational structures of conventional SQL, they could not be tagged as “SQL” databases. That is how they came to be called NoSQL.

In fact, NoSQL originally meant “no support for SQL” but was later reinterpreted as “Not Only SQL,” on the realization that NoSQL databases may have to coexist with SQL databases but cannot replace them completely. The need for conventional SQL databases persisted, since relational models support both single-row and multi-row ACID transactions. As time passed, NoSQL databases proved architecturally unfit to serve consistency-first application needs.

The advent of NewSQL

Because of these limitations, large-scale OLTP workloads in which both scalability and data correctness were critical continued to suffer even with NoSQL. This ushered in the era of NewSQL databases, which started showing up in the early 2010s to address the issue. Matthew Aslett of 451 Research coined the term ‘NewSQL’ in 2011 to categorize this new set of “scalable” databases. NewSQL databases come in two flavors.

  1. One flavor of NewSQL databases offers an automated sharding layer over multiple independent instances of a monolithic SQL database. For example, Vitess does this for MySQL, whereas Citus does it for PostgreSQL. Each instance, taken independently, is still the same old monolithic database, so challenges such as native failover and distributed ACID transactions remain hard to handle. Above all, developers lose the agility they get from interacting with a single logical SQL database.
  2. The second flavor covers databases such as VoltDB, NuoDB, and Clustrix, which are built as distributed storage systems with the objective of keeping the concept of a single logical SQL database in place.

Next, let us evaluate some of the NewSQL databases against distributed SQL.

  1. Vitess

Vitess provides automated sharding for MySQL. Each MySQL instance acts as a shard. Vitess uses a strongly consistent key-value store, etcd, to store metadata about shard locations, such as which shard lives on which instance. Vitess also uses a set of coordinator nodes called VTGate, which accept client queries from applications and route them to the corresponding shard based on the mapping stored in etcd. Each instance uses standard MySQL master-slave replication.

However, according to RemoteDBA.com experts, SQL features that access multiple rows of data spread across multiple shards are strongly discouraged with Vitess. Examples of such discouraged features are global secondary indexes and cross-shard JOINs. All of this reiterates the point that a Vitess cluster lacks the notion of a single logical SQL database in a real application environment. Developers must be aware of the sharding and account for it when designing their schemas and executing their queries.
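The routing idea described above can be illustrated with a minimal Python sketch. It is not Vitess code: the range-based shard map, the instance names, and the functions `route`, `get_by_key`, and `find_by_city` are all hypothetical, and plain dictionaries stand in for both the metadata store and the MySQL instances. It only aims to show why a lookup by sharding key touches one instance while a lookup on any other column has to visit every shard.

```python
from typing import Dict, List, Tuple

# Hypothetical shard map: [low, high) key range -> instance name.
# In a real deployment this metadata lives in the topology store,
# not in a Python list.
SHARD_MAP: List[Tuple[int, int, str]] = [
    (0, 100, "mysql-a"),
    (100, 200, "mysql-b"),
    (200, 300, "mysql-c"),
]

# Toy stand-in for the MySQL instances: instance -> {user_id: city}.
INSTANCES: Dict[str, Dict[int, str]] = {
    "mysql-a": {7: "Oslo", 42: "Lima"},
    "mysql-b": {150: "Oslo"},
    "mysql-c": {250: "Pune"},
}


def route(user_id: int) -> str:
    """Return the instance that owns user_id according to the shard map."""
    for low, high, instance in SHARD_MAP:
        if low <= user_id < high:
            return instance
    raise KeyError(f"no shard owns key {user_id}")


def get_by_key(user_id: int) -> str:
    """Single-shard read: one map lookup, one instance contacted."""
    return INSTANCES[route(user_id)][user_id]


def find_by_city(city: str) -> List[int]:
    """Non-key lookup: with no global index, every shard must be scanned."""
    matches: List[int] = []
    for _, _, instance in SHARD_MAP:
        rows = INSTANCES[instance]
        matches.extend(uid for uid, c in rows.items() if c == city)
    return sorted(matches)
```

Calling `get_by_key(42)` contacts a single instance, whereas `find_by_city("Oslo")` fans out to all three. That fan-out is the cost behind the advice to avoid cross-shard access patterns.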

  2. Citus

Citus is essentially the PostgreSQL counterpart of Vitess. Running as an extension of PostgreSQL, Citus adds horizontal write scalability to PostgreSQL deployments through sharding that is transparent to the application. An installation begins with a number of PostgreSQL nodes, each of which has the Citus extension. One of these nodes then becomes the Citus coordinator node, and the remaining nodes act as worker nodes.

Applications interact only with the coordinator node and are not aware that the worker nodes exist. The replication architecture, which ensures availability during failures, is still master-slave replication based on Postgres standards. This single-coordinator design can create availability and performance bottlenecks. Any slowdown of the coordinator node may slow down the whole cluster even when the worker nodes are functioning normally. Similarly, a coordinator node outage may bring the whole cluster down. And because worker nodes cannot interact with client applications directly, there is no way to make the client drivers smarter by caching shard metadata.
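A toy model can make the single-coordinator concern concrete. The sketch below is not the Citus API: the `Worker` and `Coordinator` classes, the `up` flag, and the use of CRC32 to pick a worker are assumptions chosen for illustration only. The point it tries to show is that when every request passes through one node, healthy workers cannot serve clients once that node is unavailable.

```python
import zlib
from typing import Dict, List, Optional


class Worker:
    """Hypothetical worker node holding a slice of the rows."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: Dict[str, str] = {}


class Coordinator:
    """Hypothetical single entry point that hash-distributes rows."""

    def __init__(self, workers: List[Worker]) -> None:
        self.workers = workers
        self.up = True  # flip to False to simulate a coordinator outage

    def _check_available(self) -> None:
        if not self.up:
            raise ConnectionError("coordinator is down")

    def _worker_for(self, key: str) -> Worker:
        # CRC32 is only a deterministic stand-in for a real hash function.
        index = zlib.crc32(key.encode("utf-8")) % len(self.workers)
        return self.workers[index]

    def insert(self, key: str, value: str) -> None:
        self._check_available()
        self._worker_for(key).rows[key] = value

    def select(self, key: str) -> Optional[str]:
        self._check_available()
        return self._worker_for(key).rows.get(key)
```

After `coordinator.up = False`, every `select` raises `ConnectionError` even though the row is still sitting on a worker. A client that could cache the shard map and talk to workers directly would not have this dependency.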

  3. VoltDB

VoltDB is based on an auto-sharded distributed database architecture. It is a proprietary SQL database that has no foreign key support. Intra-cluster replication is based on the K-safety algorithm, in which K denotes the number of extra copies of the same data stored for each shard. For example, a configuration of K=2 maps to the default Replication Factor of 3 in distributed SQL databases such as YugabyteDB and Google Spanner.
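The mapping between the two terms is simple arithmetic: K counts only the extra copies, while a replication factor counts every copy including the original. A small sketch, with hypothetical function names, spells this out:

```python
def replication_factor(k_safety: int) -> int:
    """Total copies of each shard: the original plus K extra copies."""
    if k_safety < 0:
        raise ValueError("K-safety cannot be negative")
    return k_safety + 1


def total_shard_copies(num_shards: int, k_safety: int) -> int:
    """Number of shard copies the cluster stores for a given K."""
    return num_shards * replication_factor(k_safety)
```

With K=2, `replication_factor(2)` gives 3, matching the example above, and a cluster with 8 shards would store 24 shard copies in total.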

In VoltDB, the replicas of any given shard are updated synchronously and simultaneously by the client application. By contrast, distributed consensus protocols such as Paxos and Raft also send each write to every replica, but commit it as soon as a majority of the replicas acknowledge the request. In practice, waiting for responses from all replicas is unnecessary, because consensus can be established with a majority. Also, VoltDB may not detect network partitions on its own and requires an add-on network fault protection mode to be enabled. When a single node in the cluster is partitioned, fault protection mode gets activated, which may adversely impact the cluster by increasing the recovery time before it accepts writes again.
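The difference between the two commit rules can be sketched in a few lines of Python. This is a simplified model rather than an implementation of Paxos, Raft, or VoltDB: it only counts acknowledgements, and the function names are hypothetical.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    if replicas < 1:
        raise ValueError("need at least one replica")
    return replicas // 2 + 1


def commits_with_all_replicas(acks: int, replicas: int) -> bool:
    """Wait-for-all rule: every replica must acknowledge the write."""
    return acks == replicas


def commits_with_majority(acks: int, replicas: int) -> bool:
    """Majority rule: the write commits once a majority acknowledges."""
    return acks >= majority(replicas)


def failures_tolerated_by_majority(replicas: int) -> int:
    """Replicas that can be unreachable while a majority still exists."""
    return replicas - majority(replicas)
```

With three replicas and one of them slow or unreachable, `commits_with_majority(2, 3)` is true while `commits_with_all_replicas(2, 3)` is false, which is why a majority-based protocol can keep accepting writes in that situation.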

Other examples are NuoDB (a proprietary NewSQL database) and ClustrixDB (a scale-out SQL database). NewSQL in the cloud is still in its infancy, and distributed SQL databases like Google Spanner are steadily emerging to take advantage of cloud elasticity and to work even on inherently unreliable infrastructure.

Ariya Stark is a blogger and content writer who writes articles on Business, Web Design, Social Media, and Technology. She enjoys reading about new things on the internet and spends a lot of time on social media.


Copyright © 2020 Disrupt Magazine
