Unit 4
🎯 Unit 4 Overview
Big Data Unit 4 mainly covers NoSQL databases and MongoDB. NoSQL databases are used
to store and manage large-scale structured, semi-structured and unstructured data.
Exam Tip: NoSQL vs SQL, types of NoSQL databases, NoSQL architectural patterns and MongoDB are highly important for RGPV exams.
📘 Introduction to NoSQL
NoSQL stands for Not Only SQL. It is a type of database system that can store data
in flexible formats other than traditional tables. NoSQL is useful when data is huge,
fast-changing and not always structured.
Why is NoSQL Needed?
- Traditional RDBMSs are hard to scale out and a poor fit for very large, unstructured data.
- Modern applications generate data in different formats like JSON, XML, images and logs.
- NoSQL provides high scalability and flexibility.
- NoSQL supports distributed storage and fast access for simple lookups.
Simple Meaning: When data does not fit a fixed table format and its size is very large,
NoSQL databases are used.
⭐ Features of NoSQL
- Schema-less database design
- High scalability
- Distributed storage
- Handles unstructured and semi-structured data
- High performance
- Flexible data model
- Fault tolerance
- Horizontal scaling support
⚖️ SQL vs NoSQL
| SQL Database | NoSQL Database |
|---|---|
| Uses fixed schema | Schema-less or flexible schema |
| Stores data in tables | Stores data in documents, key-value, graphs or columns |
| Best for structured data | Best for semi-structured and unstructured data |
| Vertical scaling | Horizontal scaling |
| Examples: MySQL, Oracle | Examples: MongoDB, Cassandra, Redis, Neo4j |
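The fixed-schema versus flexible-schema row of the comparison can be illustrated with a small sketch. This is plain JavaScript, not a real database: `studentColumns`, `fitsFixedSchema`, `insertDocument` and the second student are hypothetical names invented for the illustration.

```javascript
// Hypothetical fixed schema, as a relational table would declare it.
const studentColumns = ["rollNo", "name", "branch"];

// A row is accepted only if it has exactly the declared columns.
function fitsFixedSchema(row, columns) {
  const keys = Object.keys(row);
  return keys.length === columns.length && columns.every((c) => keys.includes(c));
}

// A schema-less collection accepts documents whose fields differ.
const collection = [];
function insertDocument(doc) {
  collection.push(doc);
  return collection.length;
}

const plain = { rollNo: 101, name: "Amit", branch: "CSE" };
// Hypothetical second student: no "branch", but an extra "skills" field.
const extended = { rollNo: 102, name: "Riya", skills: ["Hadoop", "MongoDB"] };

console.log(fitsFixedSchema(plain, studentColumns));    // true
console.log(fitsFixedSchema(extended, studentColumns)); // false
insertDocument(plain);
insertDocument(extended);
console.log(collection.length); // 2
```

The second record would need a schema change in a table, while the document collection stores both shapes side by side.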
🏢 Business Drivers of NoSQL
Business drivers are the reasons why companies adopt NoSQL databases.
- Big Data Growth: Companies generate huge amounts of data from users, apps and devices.
- Need for Scalability: Applications require horizontal scaling across multiple servers.
- Real-Time Processing: Businesses need fast data access and quick analysis.
- Flexible Data: Modern data does not always follow fixed table structure.
- Cloud Applications: Cloud-based systems need distributed and scalable databases.
- Cost Reduction: NoSQL can run on commodity hardware and cloud infrastructure.
- High Availability: Businesses need systems that remain available all the time.
🏗️ Data Architectural Patterns
Data architectural patterns define how data is stored, processed and accessed in a system.
| Pattern | Description |
|---|---|
| Data Warehouse Pattern | Stores historical structured data for reporting and analysis. |
| Data Lake Pattern | Stores raw structured, semi-structured and unstructured data. |
| Lambda Architecture | Combines batch processing and real-time processing. |
| Microservices Data Pattern | Each service manages its own database. |
| Distributed Database Pattern | Data is distributed across multiple nodes or servers. |
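As a rough illustration of the Lambda Architecture row, the sketch below merges a batch view with a real-time view at query time. The event lists and the names `countByPage`, `batchView`, `realtimeView` and `query` are assumptions made up for this example; a real system would use dedicated batch and stream processing tools.

```javascript
// Hypothetical page-view events. The first list was already processed by the last batch run.
const batchEvents = ["home", "home", "about"];
// Events that arrived after the last batch run.
const recentEvents = ["home", "contact"];

function countByPage(events) {
  const counts = {};
  for (const page of events) counts[page] = (counts[page] || 0) + 1;
  return counts;
}

// Batch layer: recomputed periodically over all historical data.
const batchView = countByPage(batchEvents);
// Speed layer: covers only the recent events the batch has not seen yet.
const realtimeView = countByPage(recentEvents);

// Serving layer: a query merges both views.
function query(page) {
  return (batchView[page] || 0) + (realtimeView[page] || 0);
}

console.log(query("home"));    // 3
console.log(query("contact")); // 1
```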
🧩 NoSQL Architectural Patterns
NoSQL architectural patterns explain different ways in which NoSQL databases store data.
| NoSQL Type | Storage Style | Example |
|---|---|---|
| Key-Value Store | Stores data as key-value pairs | Redis, DynamoDB |
| Document Store | Stores data as documents like JSON/BSON | MongoDB, CouchDB |
| Column Family Store | Stores data in column families | Cassandra, HBase |
| Graph Database | Stores data as nodes and relationships | Neo4j |
📂 Types of NoSQL Databases
1. Key-Value Database
Data is stored in key-value pair format. It is simple and very fast.
- Example: Redis
- Use: Caching, session management
2. Document Database
Data is stored in document format such as JSON or BSON.
- Example: MongoDB
- Use: Web applications, content management
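3. Column-Oriented (Column-Family) Database
Data is stored in column families instead of fixed rows: each row is identified by a row key and may hold a different set of columns. It is useful for very large, write-heavy and analytical workloads.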
- Example: Cassandra, HBase
- Use: Big Data analytics
4. Graph Database
Data is stored as nodes and edges. It is useful for relationship-based data.
- Example: Neo4j
- Use: Social networks, recommendation systems
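The four types above can be compared by modelling the same student in each style. This is only a sketch using plain JavaScript structures, not the storage format of Redis, MongoDB, Cassandra or Neo4j; the key names (`session:101`, `student:101`), the `KNOWS` relationship and the `neighbours` helper are hypothetical.

```javascript
// 1. Key-value: an opaque value looked up by a single key.
const keyValue = new Map();
keyValue.set("session:101", JSON.stringify({ name: "Amit", loggedIn: true }));

// 2. Document: a self-describing, possibly nested record.
const documentStore = [
  { rollNo: 101, name: "Amit", skills: ["Hadoop", "MongoDB"] },
];

// 3. Column family: row key -> column family -> columns.
const columnFamily = {
  "student:101": {
    profile: { name: "Amit", branch: "CSE" },
    academics: { semester: 7 },
  },
};

// 4. Graph: nodes plus edges that name the relationship.
const nodes = ["Amit", "MongoDB", "Hadoop"];
const edges = [
  { from: "Amit", to: "MongoDB", type: "KNOWS" },
  { from: "Amit", to: "Hadoop", type: "KNOWS" },
];

// Follow outgoing edges from one node.
function neighbours(node) {
  return edges.filter((e) => e.from === node).map((e) => e.to);
}

console.log(JSON.parse(keyValue.get("session:101")).name); // Amit
console.log(documentStore[0].skills.length);               // 2
console.log(columnFamily["student:101"].profile.branch);   // CSE
console.log(neighbours("Amit"));                           // [ 'MongoDB', 'Hadoop' ]
```

Each model makes one access path cheap: lookup by key, reading a whole record, reading one group of columns, or following relationships.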
📊 Managing Big Data with NoSQL
NoSQL helps in managing Big Data because it supports distributed storage, flexible schema
and high-speed processing.
How Does NoSQL Manage Big Data?
- Data is distributed across multiple servers.
- Flexible schema allows different data formats.
- Replication improves availability and fault tolerance.
- Sharding partitions data into smaller parts (shards) that are stored on different servers.
- Horizontal scaling adds more servers when data increases.
- Fast read/write operations support real-time applications.
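Sharding and replication from the list above can be sketched in a few lines. This is a toy in-memory model under stated assumptions: the cluster size, the `hashKey` function and the `put`/`get` helpers are invented for illustration and do not reflect how any particular database routes data.

```javascript
// Hypothetical cluster: 3 shards, each kept as 2 replica copies.
const SHARD_COUNT = 3;
const REPLICAS = 2;
const shards = Array.from({ length: SHARD_COUNT }, () =>
  Array.from({ length: REPLICAS }, () => new Map())
);

// Simple deterministic string hash (illustrative only).
function hashKey(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the value in unsigned 32-bit range
  }
  return h;
}

// Sharding: the key alone decides which shard owns the record.
function shardFor(key) {
  return hashKey(key) % SHARD_COUNT;
}

// Replication: a write is copied to every replica of the owning shard.
function put(key, value) {
  for (const replica of shards[shardFor(key)]) {
    replica.set(key, value);
  }
}

// Fault tolerance: any replica that has not failed can answer a read.
function get(key, failedReplica = -1) {
  const replicas = shards[shardFor(key)];
  const healthy = replicas.find((_, i) => i !== failedReplica);
  return healthy.get(key);
}

put("student:101", { name: "Amit" });
console.log(get("student:101").name);    // Amit
console.log(get("student:101", 0).name); // Amit, served by the second replica
```

Horizontal scaling corresponds to raising `SHARD_COUNT`; real systems also have to move existing data when shards are added, which this sketch ignores.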
🍃 Introduction to MongoDB
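MongoDB is a popular document-oriented NoSQL database whose Community Server source code is publicly available (under the SSPL license). It stores data in document format using
BSON (Binary JSON), a binary encoding of JSON-like documents.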
Features of MongoDB
- Document-oriented database
- Schema-less design
- High scalability
- Indexing support
- Replication support
- Sharding support
- Fast read and write operations
- Suitable for Big Data applications
Important: In MongoDB, data is not stored in tables; it is stored in collections and documents.
📌 MongoDB Basic Terms
| MongoDB Term | Meaning | SQL Equivalent |
|---|---|---|
| Database | Collection of related data | Database |
| Collection | Group of documents | Table |
| Document | Single record in BSON/JSON format | Row |
| Field | Key-value pair inside document | Column |
💻 MongoDB Document Example
{
"rollNo": 101,
"name": "Amit",
"branch": "CSE",
"semester": 7,
"skills": ["Hadoop", "MongoDB", "Big Data"]
}
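Because a field such as `skills` holds an array, a query on it behaves differently from a query on a plain value: in MongoDB, an equality filter on an array field matches when any element equals the value. The sketch below imitates that rule in plain JavaScript. The `matches` and `find` helpers and the second student are hypothetical and are not part of MongoDB.

```javascript
const students = [
  { rollNo: 101, name: "Amit", branch: "CSE", semester: 7, skills: ["Hadoop", "MongoDB", "Big Data"] },
  // Hypothetical second document with no "skills" field at all.
  { rollNo: 102, name: "Riya", branch: "IT", semester: 7 },
];

// Equality match on one field; an array field matches when any element equals the value.
function matches(doc, field, value) {
  const actual = doc[field];
  return Array.isArray(actual) ? actual.includes(value) : actual === value;
}

function find(collection, field, value) {
  return collection.filter((doc) => matches(doc, field, value));
}

console.log(find(students, "skills", "MongoDB").map((d) => d.name)); // [ 'Amit' ]
console.log(find(students, "semester", 7).length);                   // 2
```

The document without a `skills` field is simply skipped, which is what a flexible schema allows.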
⚙️ MongoDB Commands
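// List the databases on the server
show dbs
// Switch to the "college" database (it is created on the first write)
use college
// Insert one document into the "students" collection
db.students.insertOne({name:"Amit", branch:"CSE", semester:7})
// Return all documents in the collection
db.students.find()
// Set semester to 8 on the first document whose name is "Amit"
db.students.updateOne({name:"Amit"}, {$set:{semester:8}})
// Delete the first document whose name is "Amit"
db.students.deleteOne({name:"Amit"})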
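To see what the insert, find, update and delete commands above do to the data, here is a toy in-memory collection in plain JavaScript. `ToyCollection` and `matchesFilter` are hypothetical names, not the MongoDB driver or shell API. The return values are simplified, and `_id` is a simple counter, whereas MongoDB generates an ObjectId.

```javascript
// Shallow equality filter: every key in the filter must equal the document's value.
function matchesFilter(doc, filter) {
  return Object.entries(filter).every(([key, value]) => doc[key] === value);
}

class ToyCollection {
  constructor() {
    this.docs = [];
    this.nextId = 1; // stand-in for MongoDB's generated _id
  }
  insertOne(doc) {
    const stored = { _id: this.nextId++, ...doc };
    this.docs.push(stored);
    return { insertedId: stored._id };
  }
  find(filter = {}) {
    return this.docs.filter((d) => matchesFilter(d, filter));
  }
  // Only the $set operator is modelled: it overwrites the listed fields.
  updateOne(filter, update) {
    const doc = this.docs.find((d) => matchesFilter(d, filter));
    if (!doc) return { matchedCount: 0 };
    Object.assign(doc, update.$set);
    return { matchedCount: 1 };
  }
  // Removes only the first matching document.
  deleteOne(filter) {
    const index = this.docs.findIndex((d) => matchesFilter(d, filter));
    if (index === -1) return { deletedCount: 0 };
    this.docs.splice(index, 1);
    return { deletedCount: 1 };
  }
}

const students = new ToyCollection();
students.insertOne({ name: "Amit", branch: "CSE", semester: 7 });
students.updateOne({ name: "Amit" }, { $set: { semester: 8 } });
console.log(students.find({ name: "Amit" })[0].semester);       // 8
console.log(students.deleteOne({ name: "Amit" }).deletedCount); // 1
console.log(students.find().length);                            // 0
```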
✅ Advantages of MongoDB
- Flexible schema
- Easy to store complex data
- Fast performance
- Horizontal scalability
- Supports replication and sharding
- Good for real-time applications
- Easy integration with web applications
⚠️ Limitations of MongoDB
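- Not well suited to complex joins; the $lookup aggregation stage exists but is less capable than SQL joins
- Can consume more memory and storage, since field names are repeated in every document
- Data duplication may occur, because related data is often embedded in documents instead of being joined
- Multi-document ACID transactions are supported (since version 4.0) but are more limited and costly than in a traditional RDBMS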
- Requires proper indexing for better performance
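Why indexing matters can be shown with a small comparison between scanning a collection and using a lookup structure. This sketch uses a JavaScript `Map` as a stand-in index (MongoDB's default indexes are B-tree based, so this is a swapped-in simplification); the 1000 generated documents and both helper functions are hypothetical.

```javascript
// Hypothetical collection of 1000 documents with rollNo 1..1000.
const docs = Array.from({ length: 1000 }, (_, i) => ({
  rollNo: i + 1,
  branch: i % 2 === 0 ? "CSE" : "IT",
}));

// Without an index: inspect documents one by one until a match is found.
function scanByRollNo(rollNo) {
  let inspected = 0;
  for (const doc of docs) {
    inspected++;
    if (doc.rollNo === rollNo) return { doc, inspected };
  }
  return { doc: null, inspected };
}

// With an index: a structure built once that maps the field value to its document.
const rollNoIndex = new Map(docs.map((d) => [d.rollNo, d]));
function indexedByRollNo(rollNo) {
  return { doc: rollNoIndex.get(rollNo) ?? null };
}

console.log(scanByRollNo(900).inspected);     // 900
console.log(indexedByRollNo(900).doc.rollNo); // 900, found with a single lookup
```

The index costs extra memory and must be updated on every write, which is the trade-off behind choosing indexes carefully.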
⭐ Important Questions
- What is NoSQL? Explain its features.
- Differentiate between SQL and NoSQL databases.
- Explain business drivers of NoSQL.
- Explain data architectural patterns.
- Explain NoSQL architectural patterns.
- Explain types of NoSQL databases with examples.
- How does NoSQL manage Big Data?
- What is MongoDB? Explain its features.
- Explain MongoDB terms: database, collection, document and field.
- Write advantages and limitations of MongoDB.
🔥 Last Minute Revision
- NoSQL = Not Only SQL.
- NoSQL is useful for unstructured and semi-structured Big Data.
- SQL uses tables, NoSQL uses flexible models.
- Main NoSQL types: Key-Value, Document, Column, Graph.
- MongoDB is a document-oriented NoSQL database.
- MongoDB stores data in BSON format.
- Collection = Table, Document = Row, Field = Column.
- Sharding and replication are important MongoDB concepts.