Recently, many large companies have switched to NoSQL, and the move has turned a lot of heads. Facebook, Hulu, Netflix, and Uber are just some examples.
Still, many remain in the dark about what NoSQL databases actually are. Today, we'll discuss what different types of non-relational DBs exist and how best to utilize them.
NoSQL Database Guide for Beginners 🏁
What is NoSQL? 🤔
NoSQL refers to a family of non-relational databases. A NoSQL database typically avoids joins, is schema-free and is easy to scale whenever required. NoSQL databases are used in big data and real-time web applications. The name is also read as ‘Not only SQL’, which signals that these systems may also support SQL-like query languages.
NoSQL databases are easy to scale and are often faster for the operations they are designed for. Their data structures are also more flexible than those of relational databases.
Before actually inserting the data in a relational database, you need to define a schema, create a table, set fields of data types, etc. But in NoSQL, you can update and insert data on the fly without any worry.
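To make the difference concrete, here is a minimal sketch in Python. It uses the standard-library `sqlite3` module for the relational side and a plain list of dictionaries as a stand-in for a schema-free collection. The `users` table, the `city` field and the sample names are made up for illustration; a real NoSQL database would be accessed through its own client library.

```python
import sqlite3

# Relational side: the table and its columns must exist before any insert.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Ada"))

# Inserting a field that the schema does not define is rejected.
try:
    conn.execute(
        "INSERT INTO users (id, name, city) VALUES (?, ?, ?)", (2, "Lin", "Oslo")
    )
    relational_accepts_new_field = True
except sqlite3.OperationalError:
    relational_accepts_new_field = False

# Schema-free side: a hypothetical "collection" is just a list of dicts,
# so every record can carry its own set of fields.
users = []
users.append({"id": 1, "name": "Ada"})
users.append({"id": 2, "name": "Lin", "city": "Oslo"})  # new field, no migration

print(relational_accepts_new_field)
print(sorted(users[1].keys()))
```

In the relational case you would first have to run an `ALTER TABLE` statement; in the schema-free case the new field simply shows up on the records that have it.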
It offers high availability with high performance and is easily scalable with a rich query language. And it can easily handle messy, unstructured and unpredictable data in real-time.
Relational databases weren’t designed for agile development or for the scale that modern applications require. So, organizations are now choosing geographically distributed architectures built on cloud computing, open software technologies and commodity servers instead of large monolithic servers and dedicated storage infrastructure.
All you need to know about NoSQL 📚
A Brief History of NoSQL 📜
In the year 1998, Carlo Strozzi coined the term NoSQL to name his lightweight, open-source, and relational database which didn’t have an SQL interface.
In 2009, the name was reused by Johan Oskarsson and Eric Evans to refer to databases that are distributed, non-relational and do not conform to atomicity, consistency, isolation and durability (ACID), the defining guarantees of traditional database systems.
The “NoSQL (east)” conference was held later that same year in Atlanta, USA, where NoSQL was debated and discussed at length. Its practice was a hot topic there, given its momentum, adoption and growth in just over 10 years.
Here's a brief timeline of the history of NoSQL:
- ☑️ 1998 – The term NoSQL used by Carlo Strozzi
- ☑️ 2000 – Neo4j graph database is launched
- ☑️ 2004 – Google launched BigTable
- ☑️ 2005 – Launch of CouchDB
- ☑️ 2007 – Release of research paper on Amazon Dynamo
- ☑️ 2008 – Cassandra project open-sourced by Facebook
- ☑️ 2009 – NoSQL term was reintroduced
Key Features of NoSQL 🔑
- ☑️ Flexible – NoSQL database has the flexibility to ingest structured, semi-structured and unstructured data
- ☑️ Preferable – It meets the requirements of developers, programmers and architects
- ☑️ Zero Downtime – Zero Downtime is another important feature of NoSQL
- ☑️ Scalability – Relational databases can be scaled, but doing so is neither easy nor cheap; NoSQL databases are built to scale out
- ☑️ Distributed - The design of NoSQL databases is such that it can distribute the data globally
Why NoSQL? 💭
Today, accessing and capturing website data is becoming easier through third parties like Facebook, Twitter and others. A user's information, geolocation data, social graphs, machine logging data, and user-generated content all contribute to the growing data deluge which we now face.
For these services - and many others - to function properly, it's essential to properly store and process these huge amounts of data. But SQL databases were never designed to handle this enormous amount of data (the term "big data" doesn't quite cut it).
Not only can NoSQL databases handle both structured and unstructured data, they can also process unstructured and big data quickly. This has led large organizations like Google, Facebook, LinkedIn, and Twitter to adopt NoSQL systems to deal with the terabytes of user data that they collect every single day.
These organizations process a tremendous amount of unstructured data every day, and coordinate it to gain business insights and make strategic decisions.
Types of NoSQL Databases
There are a couple 📋
In a document database, each key is paired with a complex data structure called a document. Documents can carry key-value pairs, key-array pairs, or nested documents.
Examples: CouchDB, MongoDB, Cloudant.
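As a rough illustration, a document can be pictured as a nested Python dictionary. The `order:1001` key and all field names below are invented for this sketch, and the plain dict called `store` only stands in for a real document database.

```python
import json

# A hypothetical "order" document: key-value pairs, a key-array pair,
# and a nested document all live under one key.
order_id = "order:1001"
order_doc = {
    "customer": "Ada",                               # key-value pair
    "items": ["keyboard", "mouse"],                  # key-array pair
    "shipping": {"city": "Oslo", "express": True},   # nested document
}

# A toy document store: key -> document.
store = {order_id: order_doc}

# Documents are commonly exchanged as JSON, so the structure survives
# a round trip through serialization.
serialized = json.dumps(store[order_id], sort_keys=True)
restored = json.loads(serialized)

print(restored["shipping"]["city"])
```

Everything needed to display the order sits in one document, so reading it takes a single lookup by key.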
Key-Value Store 🗝️
They are simple NoSQL databases where every item is stored as a ‘key’ (or attribute name) together with its value. Some key-value stores, like Redis, allow each value to have a type (like ‘integer’), which enables extra functionality.
Examples: Berkeley DB and Riak.
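The idea can be sketched with a tiny in-memory store. `TinyKV` is a hypothetical class written for this post, not the API of Redis, Riak or any other product. Its `incr` method shows the kind of extra functionality that becomes possible once the store knows a value is an integer.

```python
class TinyKV:
    """A toy in-memory key-value store (not a real client library)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def incr(self, key, amount=1):
        # Only valid when the stored value is an integer; a missing key
        # is treated as 0 in this sketch.
        current = self._data.get(key, 0)
        if not isinstance(current, int) or isinstance(current, bool):
            raise TypeError(f"value at {key!r} is not an integer")
        self._data[key] = current + amount
        return self._data[key]


kv = TinyKV()
kv.set("user:1:name", "Ada")
kv.incr("page:home:views")
kv.incr("page:home:views", 4)

print(kv.get("user:1:name"), kv.get("page:home:views"))
```

Note that the only way to reach a value is through its key; there is no query over the values themselves.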
Wide-Column Stores 🏬
They are optimized for large dataset queries and instead of rows they store columns of data together.
Examples: HBase and Cassandra.
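The "columns stored together" idea can be shown with plain Python lists. This is a deliberately simplified sketch with made-up sample data; it does not model how HBase or Cassandra actually lay out data on disk.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Oslo", "amount": 30},
    {"id": 2, "city": "Rome", "amount": 45},
    {"id": 3, "city": "Oslo", "amount": 25},
]

# Column-oriented layout: one list per column, values aligned by position.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# A query that needs one column only has to touch that column's list.
total_amount = sum(columns["amount"])

print(columns["city"])
print(total_amount)
```

With millions of records, scanning one compact column instead of every full row is what makes large dataset queries cheaper.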
Graph Stores 🏪
To store information about networks of data (like social connections), graph stores are used. Examples: Giraph and Neo4j.
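A small social network makes the idea easier to picture. The sketch below keeps a made-up friendship graph as an adjacency list and uses a breadth-first search to find everyone within two hops of a person. Real graph stores ship their own query languages; `within_hops` is just a hypothetical helper for illustration.

```python
from collections import deque

# Hypothetical social graph as an adjacency list: person -> set of friends.
friends = {
    "ada": {"lin", "sam"},
    "lin": {"ada", "kai"},
    "sam": {"ada"},
    "kai": {"lin", "zoe"},
    "zoe": {"kai"},
}


def within_hops(graph, start, max_hops):
    """Return everyone reachable from start in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(person, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    seen.discard(start)
    return seen


print(sorted(within_hops(friends, "ada", 2)))
```

Questions like "friends of friends" follow the connections directly, which is the kind of query graph stores are built around.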
SQL vs. NoSQL Databases
Time to compare 🧮
| |SQL Databases|NoSQL Databases|
|---|---|---|
|Types|Only one type, with minor variations.|Many types, including document, key-value, wide-column and graph stores.|
|History|SQL was developed in the 1970s to deal with the first wave of big data storage.|NoSQL databases were developed in the late 2000s to cope with the limitations of SQL, especially around multi-structured data, scalability, agile development sprints and geo-distribution.|
|Examples|MySQL, Oracle Database, Microsoft SQL Server, Postgres|MongoDB, Neo4j, HBase, Cassandra|
|Data Storage|Individual records are stored as rows in a table, much like a spreadsheet, where each column stores specific data about the record. Related data is stored in separate tables, which are joined together when complex queries are executed. Example: one table stores ‘offices’ and another stores ‘employees’. When searching for an employee’s work address, the database engine joins the ‘office’ and ‘employee’ tables to fetch all the necessary information.|It varies by database type. Example: a key-value store works much like a SQL table with only 2 columns (‘key’ and ‘value’), where the ‘value’ column sometimes stores more complex information as a BLOB.|
|Development Model|Mix of closed source (Example: Oracle Database) and open technologies (Example: MySQL, Postgres)|Open technologies|
|Scaling|Vertical scaling. To deal with increased demand, a single server must be made increasingly powerful. SQL databases can be spread over many servers, but this generally requires additional engineering, and relational features like JOINs, transactions and referential integrity are typically lost.|Horizontal scaling. To add capacity, the database administrator can simply add more commodity servers or cloud instances, and the database automatically spreads the data across the servers as required.|
|Consistency|Strong consistency can be configured|Consistency depends on the product. Some, such as MongoDB, offer strong consistency (read consistency is tunable), whereas others, like Cassandra, provide eventual consistency.|
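The ‘offices’ and ‘employees’ example from the Data Storage row can be tried out with Python's built-in `sqlite3` module. The table layout, the address and the `employee:10` key are assumptions made for this sketch, and the plain dict named `kv` only imitates a key-value store.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offices (id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT,
                            office_id INTEGER REFERENCES offices(id));
    INSERT INTO offices VALUES (1, '12 Harbor Road');
    INSERT INTO employees VALUES (10, 'Ada', 1);
""")

# SQL side: the work address comes from joining the two tables.
sql_address = conn.execute(
    "SELECT o.address FROM employees e "
    "JOIN offices o ON o.id = e.office_id WHERE e.name = ?",
    ("Ada",),
).fetchone()[0]

# Key-value side: one key, one opaque value that holds everything needed.
kv = {
    "employee:10": json.dumps(
        {"name": "Ada", "office": {"address": "12 Harbor Road"}}
    )
}
kv_address = json.loads(kv["employee:10"])["office"]["address"]

print(sql_address == kv_address)
```

The trade-off is visible even at this size: the SQL version stores the address once and joins on demand, while the key-value version duplicates it into every employee record but answers with a single lookup.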
Pros and cons 🖥️
Main Advantages of NoSQL ➕
The positives of using NoSQL database systems such as Cassandra and MongoDB are plenty. Two of the primary benefits are high scalability and high availability.
High Scalability ⚖️
Sharding is used by NoSQL databases like MongoDB for horizontal scaling. So, what is "sharding"?
Basically, sharding means partitioning the data and placing the partitions on different machines while preserving the order of the data.
Vertical scaling means adding more resources to an existing machine, whereas horizontal scaling means adding more machines to handle the data.
Vertical scaling eventually runs into the hardware and cost limits of a single machine, while horizontal scaling is something sharded NoSQL databases are designed to make easy.
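Here is a minimal sketch of order-preserving (range-based) sharding. The split points, the three in-memory dicts standing in for machines, and the `put`/`get` helpers are all assumptions for illustration; real databases pick and rebalance ranges automatically.

```python
from bisect import bisect_right

# Hypothetical split points: keys below "h" go to shard 0, keys from "h"
# up to (but not including) "q" go to shard 1, the rest go to shard 2.
SPLIT_POINTS = ["h", "q"]
shards = [dict() for _ in range(len(SPLIT_POINTS) + 1)]


def shard_for(key):
    """Pick the shard whose key range contains the key."""
    return bisect_right(SPLIT_POINTS, key)


def put(key, value):
    shards[shard_for(key)][key] = value


def get(key):
    return shards[shard_for(key)].get(key)


for name in ["ada", "kai", "zoe", "lin", "sam"]:
    put(name, {"name": name})

for index, shard in enumerate(shards):
    print(index, sorted(shard))
```

Because each shard owns a contiguous key range, neighbouring keys stay together, and scaling out amounts to adding a split point and a machine to hold the new range.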
High Availability ✈️
NoSQL databases have automatic replication features, which make them highly available: if a node fails, the data can be served or restored from a replica in its last consistent state.
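A toy model shows why replication helps. `ReplicatedStore` is a hypothetical class that copies every write to all nodes synchronously; real systems differ a lot in how and when they replicate, so treat this only as a sketch of the principle.

```python
class ReplicatedStore:
    """Toy store that copies every write to all live nodes."""

    def __init__(self, replica_count=2):
        # nodes[0] acts as the primary; the rest are replicas.
        self.nodes = [dict() for _ in range(replica_count + 1)]
        self.alive = [True] * len(self.nodes)

    def write(self, key, value):
        for node, up in zip(self.nodes, self.alive):
            if up:
                node[key] = value

    def read(self, key):
        # Serve the read from the first node that is still up.
        for node, up in zip(self.nodes, self.alive):
            if up:
                return node.get(key)
        raise RuntimeError("no node available")

    def fail(self, index):
        self.alive[index] = False


store = ReplicatedStore()
store.write("user:1", "Ada")
store.fail(0)  # the primary goes down
value_after_failure = store.read("user:1")

print(value_after_failure)
```

The read still succeeds after the primary fails because a replica already holds the same data.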
Other than high scalability and availability, NoSQL has many other advantages that include its ability to handle:
- ☑️ Large volumes of unstructured, semi-structured and structured data
- ☑️ Quick iteration, agile sprints and frequent code pushes
- ☑️ Geographically distributed architectures instead of a monolithic one
Disadvantages of NoSQL ❌
The following are the prominent disadvantages of NoSQL.
Narrow Focus 💻
Given that NoSQL databases are mainly designed for storage (offering little in the way of functionality), they are narrowly focused. So for transaction management, relational databases are a better option than NoSQL.
Data Management 👔
Managing a huge amount of data in a simple way is the work of big data tools. But it’s not easy. NoSQL data management is much more complicated than a relational database. Also, it’s quite tricky to install and even more troublesome to maintain and work with.
When to Use NoSQL? ❓
- ☑️ A huge amount of data needs to be stored and retrieved.
- ☑️ The data is not structured and changes over time.
- ☑️ Joins and constraints are not needed at the database level.
- ☑️ The relationships between the stored data items are not of much importance.
- ☑️ The data is ever-growing and you need to regularly scale the database to handle the growing data.
Before you go 👋
Given their features and benefits, NoSQL databases can be a better fit than relational (SQL) databases for many large-scale workloads. They have gained in popularity recently due to their adoption by internet giants like Facebook, Amazon, Google, and Microsoft, which produce massive volumes of data every day.
There are various NoSQL database types to handle unstructured, semi-structured and structured data efficiently. With replication in place, they can stay highly available without a single point of failure.