
mongodb

non sql database

Uploaded by

shyamsingh841442

MongoDB Overview

MongoDB is a popular, open-source, NoSQL database designed for flexibility and scalability. It follows
a document-oriented approach, meaning data is stored in documents, not rows and columns as in
traditional relational databases (RDBMS). The documents are structured in a JSON-like format called
BSON (Binary JSON), which allows for easy storage of nested and complex data types.

Features of MongoDB

1. Schema-less Structure: MongoDB is schema-flexible, meaning it doesn't require a predefined
schema. Each document in a collection can have its own structure, allowing developers to
store different kinds of data without altering the database schema.

2. Document-based Storage: MongoDB stores data in flexible, JSON-like documents encoded as
BSON, which can easily represent complex objects and nested data.

3. Horizontal Scalability (Sharding): MongoDB supports horizontal scaling by partitioning data
across multiple servers using a technique called sharding. This allows applications to handle
large datasets by distributing data across multiple nodes.

4. Replication for High Availability: MongoDB ensures high availability through replica sets. A
replica set consists of a primary node and multiple secondary nodes that replicate the primary's
data asynchronously. If the primary node fails, the remaining members elect a secondary as the
new primary.

5. Aggregation Framework: MongoDB has a powerful aggregation framework that allows you
to process data and perform operations like filtering, sorting, grouping, and transforming
documents within collections.

6. Indexing: MongoDB supports indexes on any field in a document, which improves query
performance. It also provides support for compound, geospatial, and text indexes.

7. ACID Transactions: MongoDB provides multi-document ACID (Atomicity, Consistency,
Isolation, Durability) transactions, making it suitable for more complex and mission-critical
use cases.

NoSQL Database

NoSQL databases are designed to handle large volumes of unstructured or semi-structured data and
are optimized for scalability, performance, and flexibility. Unlike relational databases, they don’t rely
on a fixed schema and can store data in various formats, such as:

• Document-based (e.g., MongoDB)

• Key-Value (e.g., Redis)

• Column-oriented (e.g., Cassandra)

• Graph-based (e.g., Neo4j)

NoSQL databases are often used for large-scale, distributed systems that require high throughput
and flexibility in data representation.

How MongoDB Stores Data

MongoDB stores data in BSON (Binary JSON) format, which is a binary representation of JSON-like
documents. BSON supports more data types than JSON, such as dates, binary data, and distinct
32-bit integer, 64-bit integer, and floating-point number types. It allows for the storage of nested
documents and arrays, which gives MongoDB the flexibility to handle complex data structures.

Key Concepts in MongoDB

1. Document: A document is the basic unit of data in MongoDB, represented as a JSON-like
object with key-value pairs. It can contain fields of various data types, including arrays and
embedded documents. For example:

json

{
  "name": "Alice",
  "age": 25,
  "address": { "city": "New York", "zip": "10001" },
  "hobbies": ["reading", "traveling"]
}

2. Collection: A collection in MongoDB is a group of documents. It's akin to a table in relational
databases but doesn’t enforce a fixed schema, allowing documents within the same
collection to have different structures. For example, a users collection could contain
documents with varying fields and types.

3. Database: A MongoDB database is a container for collections. A database in MongoDB is
similar to a traditional database in RDBMS and can hold multiple collections. Databases help
logically group collections based on different use cases or applications.

BSON

BSON (Binary JSON) is MongoDB’s storage format that extends JSON by adding support for data
types not present in JSON, such as dates, binary data, and distinct integer and floating-point
number types. BSON is designed to be fast to parse, compact, and efficient in terms of storage space.

Data Types Supported by MongoDB

MongoDB supports a wide range of data types, including:

• String: Textual data.

• Number: Numeric data, including integers, long integers, and floating-point values.

• Date: Dates, stored as milliseconds since the Unix epoch.

• Boolean: True/false values.

• Array: Ordered lists of values.

• Object/Document: Embedded documents (nested objects).

• Binary data: Binary-encoded data, such as images or files.

• Null: An explicit empty value for a field.

• Regular expressions: Pattern-matching expressions for querying text.

Creating a Database in MongoDB

In MongoDB, a database is created implicitly when you first insert data into a collection within the
database. To create one, switch to it with the shell's use command and insert a document:

javascript

use myDatabase
db.myCollection.insertOne({ name: "John", age: 30 })

This will create the myDatabase database and the myCollection collection if they don't already exist.

Inserting Documents into a Collection

You can insert documents using the insertOne() and insertMany() methods:

javascript

db.users.insertOne({ name: "Alice", age: 25 })

db.users.insertMany([
  { name: "Bob", age: 30 },
  { name: "Carol", age: 27 }
])

• insertOne() inserts a single document.

• insertMany() inserts multiple documents in one operation.

Primary Key in MongoDB

Every document in MongoDB contains an _id field, which acts as the primary key and uniquely
identifies the document. If you don’t specify an _id field, MongoDB automatically generates a unique
one using an ObjectId.
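
An ObjectId is 12 bytes long, and its first 4 bytes hold the creation time in seconds since the Unix
epoch. The sketch below reads that timestamp from a sample ObjectId string in plain JavaScript.
The helper name objectIdTimestamp is hypothetical and written only for illustration; in the shell
and the drivers, ObjectId values offer a getTimestamp() method for the same purpose.

```javascript
// Hypothetical helper (not part of any MongoDB driver): reads the creation
// time stored in the first 4 bytes (8 hex characters) of an ObjectId string.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('Expected a 24-character hex string');
  }
  // The leading 4 bytes are a big-endian count of seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Sample value used only for illustration.
const createdAt = objectIdTimestamp('60c72b2f9b1e8f060c7f2b40');
console.log(createdAt.toISOString());
```

Because the timestamp comes first, sorting documents by _id roughly orders them by creation time.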

Querying Data in MongoDB

MongoDB provides the find() and findOne() methods to query documents from a collection:

javascript

db.users.find()                      // Returns all documents in the collection
db.users.find({ age: 25 })           // Returns all documents where age is 25
db.users.findOne({ name: "Alice" })  // Returns the first document that matches the query

Updating Data in MongoDB

You can update documents using the updateOne(), updateMany(), and replaceOne() methods:

javascript

db.users.updateOne({ name: "Alice" }, { $set: { age: 26 } })
db.users.updateMany({ age: { $gt: 30 } }, { $set: { status: "senior" } })
db.users.replaceOne({ name: "Alice" }, { name: "Alice", age: 27, city: "Boston" })

Deleting Documents in MongoDB

Documents can be deleted using deleteOne() and deleteMany() methods:

javascript

db.users.deleteOne({ name: "Alice" })
db.users.deleteMany({ age: { $lt: 30 } })

Replica Set in MongoDB

A replica set in MongoDB is a group of servers that maintain the same data set, providing data
redundancy and high availability. A replica set has:

• One primary server that handles all write operations.

• One or more secondary servers that replicate the primary's data.

If the primary server goes down, one of the secondary servers is automatically elected as the new
primary, ensuring continuous availability.

Conclusion

MongoDB is a powerful NoSQL database known for its flexibility, scalability, and ease of use. Its
document-oriented data model, schema-less structure, and built-in features for replication and
sharding make it ideal for a wide range of use cases, especially those requiring horizontal scaling and
the ability to handle complex or rapidly changing data.

What is sharding in MongoDB?

Sharding is MongoDB's method for horizontal scaling. It divides a large dataset across multiple
servers, or "shards", ensuring that no single machine becomes overloaded with too much data or too
many queries. This is especially useful for handling large datasets and high-throughput operations.
MongoDB partitions data by a shard key and distributes it across the shards. A well-chosen shard
key keeps that distribution even, allowing the database to scale beyond the resources of a single server.
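
As a rough illustration of how a shard key decides where a document lives, the sketch below hashes
a key value and maps it to one of three shards. This is a simplified model written for this note:
the function pickShard and the three-shard layout are assumptions, and MongoDB itself assigns
ranges of shard key values (or hashed values) to chunks and balances those chunks across shards
rather than taking a simple modulo.

```javascript
const crypto = require('crypto');

// Simplified model, not MongoDB's algorithm: hash the shard key value and
// take the result modulo the number of shards to pick a target shard.
function pickShard(shardKeyValue, shardCount) {
  const digest = crypto.createHash('sha256').update(String(shardKeyValue)).digest();
  return digest.readUInt32BE(0) % shardCount;
}

// Route nine hypothetical user IDs to three shards.
const shards = [[], [], []];
for (let userId = 1; userId <= 9; userId++) {
  shards[pickShard(userId, shards.length)].push(userId);
}
console.log(shards);
```

The same key value always lands on the same shard, which is why a query that includes the shard
key can be sent to a single shard instead of all of them.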

What are indexes in MongoDB, and why are they important?

Indexes in MongoDB are special data structures that store a portion of the collection’s data in an
easy-to-traverse form. Indexes support efficient execution of queries by allowing MongoDB to quickly
locate the documents that match a query condition. Without indexes, MongoDB would have to scan
every document in a collection to find matches, which is slower. Indexes can be created on one or
multiple fields and significantly improve performance for read operations.

What is an aggregation pipeline in MongoDB?

The aggregation pipeline is a framework in MongoDB used to process data in stages, allowing for
transformation and computation on documents. Each stage of the pipeline transforms the
documents as they pass through, ultimately returning computed results. This is useful for tasks like
filtering data, performing calculations, and summarizing results. Here’s an example of an aggregation
pipeline:

javascript

db.orders.aggregate([
  { $match: { status: "shipped" } },
  { $group: { _id: "$customerId", totalAmount: { $sum: "$amount" } } }
])

This example filters orders with status "shipped" and groups them by customer ID, calculating the
total order amount for each customer.
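
To make the two stages concrete, here is a plain JavaScript sketch that applies the same logic to
an in-memory array. The sample orders are invented for illustration; on a real deployment the
server runs the pipeline, not application code.

```javascript
// Invented sample data; field names follow the pipeline above.
const orders = [
  { customerId: 'c1', status: 'shipped', amount: 40 },
  { customerId: 'c2', status: 'pending', amount: 15 },
  { customerId: 'c1', status: 'shipped', amount: 60 },
  { customerId: 'c2', status: 'shipped', amount: 25 },
];

// Equivalent of the $match stage: keep only shipped orders.
const shipped = orders.filter((order) => order.status === 'shipped');

// Equivalent of the $group stage: sum amount per customerId.
const totals = new Map();
for (const order of shipped) {
  totals.set(order.customerId, (totals.get(order.customerId) || 0) + order.amount);
}

// Shape the output like the documents $group would emit.
const result = [...totals].map(([_id, totalAmount]) => ({ _id, totalAmount }));
console.log(result);
```

Each stage consumes the output of the previous one, which is why filtering with $match first
reduces the work the $group stage has to do.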

What is the difference between find() and aggregate() in MongoDB?

• find() is used for simple querying of documents. It retrieves documents that match the query
criteria but has limited data transformation capabilities.

• aggregate() is used for complex data manipulation. The aggregation pipeline provides
advanced features like grouping, sorting, filtering, and transforming documents in multiple
stages. It's more powerful when performing data analysis and computation.

What is a capped collection in MongoDB?

A capped collection is a fixed-size collection in MongoDB that automatically overwrites the oldest
documents when the collection reaches its size limit. Capped collections are useful for scenarios like
logging and caching, where the most recent data is more important than older data. Capped
collections maintain insertion order and do not support document deletions or updates that change
the document size.
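
The overwrite behavior can be pictured with a small in-memory model. The class CappedList below
is hypothetical and limits the collection by document count only; a real capped collection is
created with db.createCollection() using the capped and size options (size is in bytes), with an
optional max document count.

```javascript
// Toy model of a capped collection limited by document count.
// Real capped collections are limited by size in bytes, optionally by count.
class CappedList {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    // Once the limit is exceeded, the oldest document is discarded.
    if (this.docs.length > this.maxDocs) {
      this.docs.shift();
    }
  }

  // Documents come back in insertion order.
  find() {
    return [...this.docs];
  }
}

const log = new CappedList(3);
for (let seq = 1; seq <= 5; seq++) {
  log.insert({ seq });
}
console.log(log.find());
```

After five inserts into a collection capped at three documents, only the three most recent remain.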

What is the $set operator in MongoDB?

The $set operator in MongoDB is used to update the value of a field in a document or to add a new
field if it does not exist. It allows you to modify specific fields without replacing the entire document.
Example:

javascript

db.users.updateOne({ name: "Alice" }, { $set: { age: 28 } })

This command updates Alice’s age to 28. If the field age didn’t exist, it would be added.

What are MongoDB transactions?

Transactions in MongoDB allow multiple read and write operations to be grouped together and
executed atomically. This means that either all the operations in the transaction are successfully
committed, or none of them are. Transactions provide ACID (Atomicity, Consistency, Isolation,
Durability) guarantees, which are especially important for critical applications where data integrity is
essential, such as banking or financial applications.

How do you enable and use transactions in MongoDB?

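Transactions can be used on replica sets or sharded clusters. To use a transaction, you need to start a
session and run the operations through that session. Here's a simple example for the shell (the
database name myDatabase is carried over from the earlier example):

javascript

const session = db.getMongo().startSession();
// Get the database from the session so the operations join the transaction.
const sessionDb = session.getDatabase("myDatabase");

session.startTransaction();
try {
  sessionDb.users.updateOne({ name: "Alice" }, { $set: { age: 28 } });
  sessionDb.orders.insertOne({ customerId: "Alice", total: 50 });
  session.commitTransaction();
} catch (e) {
  session.abortTransaction(); // Roll back every operation in the transaction
  throw e;
} finally {
  session.endSession();
}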

In this example, if any operation fails, the entire transaction is aborted.

What is the $lookup operator in MongoDB?

$lookup is an aggregation pipeline stage used to perform joins between two collections. It allows
you to combine documents from a "local" collection with related documents from a "foreign"
collection based on a matching condition, similar to an SQL join. The matching documents are
embedded as an array in the returned documents.
Example:

javascript

db.orders.aggregate([
  { $lookup: {
      from: "customers",
      localField: "customerId",
      foreignField: "_id",
      as: "customerInfo"
  }}
])

This example performs a left outer join between the orders and customers collections, embedding
the customer information in the order documents.
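
The left outer join semantics can be sketched in plain JavaScript. The sample customers and orders
are invented for illustration; the point is that every order is kept and customerInfo is always an
array, which is empty when no customer matches.

```javascript
// Invented sample data for illustration.
const customers = [
  { _id: 1, name: 'Alice' },
  { _id: 2, name: 'Bob' },
];
const orders = [
  { _id: 101, customerId: 1, amount: 50 },
  { _id: 102, customerId: 3, amount: 20 }, // no customer has _id 3
];

// Model of $lookup: for each order (local document), collect every customer
// (foreign document) whose _id equals the order's customerId.
const joined = orders.map((order) => ({
  ...order,
  customerInfo: customers.filter((customer) => customer._id === order.customerId),
}));

console.log(JSON.stringify(joined, null, 2));
```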

What is the difference between embedded documents and references in MongoDB?

• Embedded documents: Store related data directly within the parent document. This results
in denormalization, where all relevant data is retrieved in a single query, but the document
size may grow large.

• References: Use the _id of one document to reference related data stored in a different
document or collection. This approach uses normalization but may require multiple queries
to retrieve the related data.

Embedded documents are typically used when the related data is tightly coupled, while references
are useful for loosely coupled or frequently changing data.
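
A small sketch of the two shapes, using invented sample data and an in-memory Map standing in
for a users collection:

```javascript
// Embedded: the address lives inside the user document, so one read returns it.
const embeddedUser = {
  _id: 1,
  name: 'Alice',
  address: { city: 'New York', zip: '10001' },
};
const city = embeddedUser.address.city;

// Referenced: the order stores only the user's _id. Resolving the user takes
// a second lookup (a Map stands in for the users collection here).
const users = new Map([[1, { _id: 1, name: 'Alice' }]]);
const order = { _id: 500, userId: 1, total: 50 };
const orderOwner = users.get(order.userId);

console.log(city, orderOwner.name);
```

With the reference, renaming the user touches one document; with embedding, every copy of the
embedded data would need the update.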

What are the differences between save() and insert() methods in MongoDB?

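• insert() is used to add new documents to a collection. If a document with the same _id
exists, the operation will fail.

• save() can either insert a new document or update an existing document if it already exists
(based on its _id). Essentially, save() is a combination of an insert and update operation.

Both methods are deprecated in current versions of the shell. Use insertOne() or insertMany() to
insert, and replaceOne() with the upsert option for save()-style behavior.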

What is GridFS in MongoDB?

GridFS is a specification for storing and retrieving files that exceed the 16 MB BSON document
size limit. It splits a file into smaller chunks, 255 KB each by default, and stores each chunk as a separate
document. This allows MongoDB to handle large files efficiently by distributing them across different
shards or machines if needed. GridFS is commonly used for storing files such as images, audio, video,
and other large binary data.
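
The chunking arithmetic is easy to check. The sketch below assumes the default chunk size of
255 KB (255 × 1024 bytes); the helper chunkLayout is hypothetical and only computes how many
chunk documents a file of a given size would need.

```javascript
const DEFAULT_CHUNK_SIZE = 255 * 1024; // 261,120 bytes, the GridFS default

// Hypothetical helper: how many chunks a file needs and how large the last one is.
function chunkLayout(fileSizeBytes, chunkSize = DEFAULT_CHUNK_SIZE) {
  const chunkCount = Math.ceil(fileSizeBytes / chunkSize);
  const lastChunkBytes =
    chunkCount === 0 ? 0 : fileSizeBytes - (chunkCount - 1) * chunkSize;
  return { chunkCount, lastChunkBytes };
}

// A 16 MiB file needs 64 full chunks plus one partial chunk.
console.log(chunkLayout(16 * 1024 * 1024));
```

Only the last chunk is smaller than the chunk size; all others are full.
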

What is schema design in MongoDB, and how is it different from relational databases?

Schema design in MongoDB is flexible and dynamic, allowing documents in a collection to have
varying fields, data types, and structures. MongoDB uses BSON (a binary representation of JSON) to
store documents. This flexible schema design allows for storing nested fields, arrays, and more
complex data structures without the need for a predefined schema.

In contrast, relational databases (RDBMS) require a predefined schema with fixed table structures
where each row must follow the same structure, and columns have specific data types. Changes in
the structure (like adding a new column) require altering the schema.

How does MongoDB handle scaling?

MongoDB handles scaling through sharding, which distributes data across multiple servers or nodes.
By partitioning large datasets based on a shard key, MongoDB enables horizontal scaling, which
allows the database to scale out by adding more servers, rather than scaling vertically (by upgrading
a single server’s resources). This makes MongoDB suitable for handling large datasets and high
traffic.

What are the different types of replication in MongoDB?

MongoDB uses replica sets for replication. A replica set consists of:

• Primary node: The node that accepts all write operations and, by default, all read operations.

• Secondary nodes: Nodes that replicate data from the primary. They can take over if the
primary node fails (through automatic failover).

• Arbiter: A node that participates in elections for a new primary but doesn’t store data.

Replica sets ensure data redundancy and fault tolerance, providing availability and data integrity in
case of node failure.

How does MongoDB achieve high availability?

MongoDB achieves high availability through replica sets. If the primary node fails, one of the
secondary nodes automatically becomes the new primary through an election process, minimizing
downtime. This ensures continuous service availability even in the event of hardware failure,
network issues, or other disruptions.
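
An election succeeds only when a candidate receives votes from a majority of the voting members,
so the size of the replica set determines how many members can fail while a primary can still be
elected. A short sketch of that arithmetic (the function names are illustrative):

```javascript
// Votes needed to elect a primary: strictly more than half of the voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Members that can be unavailable while an election can still succeed.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const members of [3, 4, 5]) {
  console.log(
    `${members} members: majority ${majority(members)}, tolerates ${faultTolerance(members)} failure(s)`
  );
}
```

Going from three to four members does not improve fault tolerance, which is one reason replica sets
are usually deployed with an odd number of voting members.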

What is the difference between MongoDB and traditional RDBMS?

The key differences are:

• Schema: MongoDB is schema-less, allowing for flexible data models, while RDBMS require
predefined, fixed schemas.

• Data Storage: MongoDB stores data in documents (BSON/JSON format), whereas RDBMS
store data in tables with rows and columns.

• Scalability: MongoDB is designed for horizontal scaling using sharding, while RDBMS
typically use vertical scaling.

• Transactions: MongoDB supports multi-document transactions with ACID properties, while
RDBMS traditionally provide strong ACID guarantees across multiple tables.

• Joins: MongoDB typically avoids joins by embedding related data in a document, whereas
RDBMS rely heavily on joins between tables.

What is the use of MongoDB Atlas?

MongoDB Atlas is a fully managed cloud database service that handles tasks like provisioning,
monitoring, backups, scaling, and security for MongoDB deployments. It allows users to run
MongoDB clusters on cloud platforms like AWS, Azure, and Google Cloud, providing automated and
scalable database infrastructure.

How do you optimize query performance in MongoDB?

To optimize query performance in MongoDB:

1. Use indexes: Create indexes on fields frequently used in query filters or sort operations.

2. Limit fields using projections: Retrieve only the necessary fields using projections to reduce
data transfer.

3. Aggregation pipelines: Use aggregation pipelines for complex data processing and
transformation.

4. Avoid full collection scans: Ensure that queries can use an index, minimizing
collection-wide scans.

5. Analyze performance: Use the explain() method to see the query execution plan and
understand how MongoDB processes the query.

What is the difference between mongoimport and mongorestore?

• mongoimport: Used to import data from JSON, CSV, or TSV files into a MongoDB collection.
It’s commonly used for importing flat files or external data sources.

• mongorestore: Used to restore a MongoDB database from a binary database dump created
by the mongodump utility. It is typically used for backups and migrations.

What are change streams in MongoDB?

Change streams allow applications to subscribe to real-time changes in MongoDB collections,
databases, or the entire deployment. They provide a stream of updates, insertions, and deletions
happening in the database, which is useful for building reactive applications. Change streams are
often used for event-driven architectures, where real-time notifications or updates are needed.

How do you ensure ACID properties in MongoDB?

MongoDB ensures ACID properties through multi-document transactions. Transactions allow
multiple read and write operations to be executed in an all-or-nothing manner, providing atomicity,
consistency, isolation, and durability. This means that all changes within a transaction either succeed
or are rolled back if any operation fails.

To use transactions, you need to start a session and perform operations within that session.
Transactions are available on replica sets and sharded clusters. Here's an example of a transaction in
the MongoDB shell (the database name myDatabase is carried over from the earlier example):

javascript

const session = db.getMongo().startSession();
// Get the database from the session so the operations join the transaction.
const sessionDb = session.getDatabase("myDatabase");

session.startTransaction();
try {
  sessionDb.users.updateOne({ name: "Alice" }, { $set: { age: 28 } });
  sessionDb.orders.insertOne({ customerId: "Alice", total: 100 });
  session.commitTransaction();
} catch (e) {
  session.abortTransaction(); // Roll back every operation in the transaction
  throw e;
} finally {
  session.endSession();
}

For implementing CRUD (Create, Read, Update, Delete) operations in MongoDB, the following
structure provides a clean and scalable way to manage the database, collections, and methods.
Below is an example of how to structure the CRUD operations using Node.js and MongoDB.

Project Structure

text

mongodb-crud
├── config
│   └── database.js        # MongoDB connection setup
├── controllers
│   └── userController.js  # Business logic for user-related operations (CRUD)
├── models
│   └── userModel.js       # MongoDB schema/model for the User collection
├── routes
│   └── userRoutes.js      # API routes for handling user-related requests
├── app.js                 # Main application entry point (Express server)
└── package.json           # NPM dependencies and scripts

Step-by-Step Implementation

1. Database Configuration (config/database.js)

This file will handle the MongoDB connection.

javascript

const mongoose = require('mongoose');

// Opens the MongoDB connection; exits the process if the connection fails.
const connectDB = async () => {
  try {
    // Mongoose 6+ no longer needs the useNewUrlParser / useUnifiedTopology options.
    const conn = await mongoose.connect('mongodb://localhost:27017/crud-example');
    console.log(`MongoDB Connected: ${conn.connection.host}`);
  } catch (error) {
    console.error(`Error: ${error.message}`);
    process.exit(1);
  }
};

module.exports = connectDB;

2. User Model (models/userModel.js)

This file defines the schema and model for the User collection.

javascript

const mongoose = require('mongoose');

// Define the schema for a User
const userSchema = new mongoose.Schema({
  name: { type: String, required: true },
  email: { type: String, required: true, unique: true },
  age: { type: Number, required: true },
  createdAt: { type: Date, default: Date.now }
});

// Create the model from the schema
const User = mongoose.model('User', userSchema);

module.exports = User;

3. Controller with CRUD Operations (controllers/userController.js)

This file contains all the business logic for handling the CRUD operations.

javascript

const User = require('../models/userModel');

// Create a new user (INSERT)
exports.createUser = async (req, res) => {
  try {
    const { name, email, age } = req.body;
    const user = new User({ name, email, age });
    await user.save();
    res.status(201).json({ message: 'User created successfully', user });
  } catch (error) {
    res.status(400).json({ message: 'Error creating user', error });
  }
};

// Retrieve all users (READ)
exports.getAllUsers = async (req, res) => {
  try {
    const users = await User.find();
    res.status(200).json(users);
  } catch (error) {
    res.status(500).json({ message: 'Error retrieving users', error });
  }
};

// Retrieve a single user by ID (READ)
exports.getUserById = async (req, res) => {
  try {
    const user = await User.findById(req.params.id);
    if (!user) return res.status(404).json({ message: 'User not found' });
    res.status(200).json(user);
  } catch (error) {
    res.status(500).json({ message: 'Error retrieving user', error });
  }
};

// Update a user (UPDATE)
exports.updateUser = async (req, res) => {
  try {
    const user = await User.findByIdAndUpdate(
      req.params.id,
      req.body,
      { new: true, runValidators: true } // return the updated document and validate it
    );
    if (!user) return res.status(404).json({ message: 'User not found' });
    res.status(200).json({ message: 'User updated successfully', user });
  } catch (error) {
    res.status(400).json({ message: 'Error updating user', error });
  }
};

// Delete a user (DELETE)
exports.deleteUser = async (req, res) => {
  try {
    const user = await User.findByIdAndDelete(req.params.id);
    if (!user) return res.status(404).json({ message: 'User not found' });
    res.status(200).json({ message: 'User deleted successfully' });
  } catch (error) {
    res.status(500).json({ message: 'Error deleting user', error });
  }
};

4. Routes (routes/userRoutes.js)

This file defines the routes for the API and connects them to the controller methods.

javascript

const express = require('express');
const router = express.Router();
const userController = require('../controllers/userController');

// Routes for CRUD operations
router.post('/users', userController.createUser);        // Create a user
router.get('/users', userController.getAllUsers);        // Get all users
router.get('/users/:id', userController.getUserById);    // Get a user by ID
router.put('/users/:id', userController.updateUser);     // Update a user
router.delete('/users/:id', userController.deleteUser);  // Delete a user

module.exports = router;

5. Main Application (app.js)

This is the entry point of the application, where you set up the Express server, connect to MongoDB,
and use the routes.

javascript

const express = require('express');
const connectDB = require('./config/database');
const userRoutes = require('./routes/userRoutes');

const app = express();

// Connect to MongoDB
connectDB();

// Middleware to parse incoming JSON requests
app.use(express.json());

// Use the user routes
app.use('/api', userRoutes);

// Start the server
const PORT = process.env.PORT || 5000;
app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
});

Example User Document

json

{
  "_id": "60c72b2f9b1e8f060c7f2b40",
  "name": "John Doe",
  "email": "john@example.com",
  "age": 30,
  "createdAt": "2023-05-12T08:25:30.000Z"
}

Running the CRUD Application

1. Install the necessary dependencies using NPM:

bash

npm install express mongoose

2. Start the MongoDB server (if not already running):

bash

mongod

3. Start the Node.js server:

bash

node app.js

4. You can now interact with the CRUD API using tools like Postman or curl:

• Create User (POST):

POST /api/users

json

{
  "name": "Alice",
  "email": "alice@example.com",
  "age": 25
}

• Get All Users (GET):

GET /api/users

• Get User by ID (GET):

GET /api/users/:id

• Update User (PUT):

PUT /api/users/:id

json

{
  "name": "Alice Updated",
  "age": 26
}

• Delete User (DELETE):

DELETE /api/users/:id

Conclusion

This structure provides a scalable and organized way to implement CRUD operations in MongoDB
using Node.js. You can expand this with more advanced MongoDB features like transactions,
validation, indexing, and more based on your requirements.

Real-World Applications

1. E-Commerce Platform

   ◦ Description: Developed an e-commerce application using MongoDB for product
     catalog, user profiles, and order management. Node.js served as the backend to
     handle API requests and manage business logic.

   ◦ Challenges:

      • Data Structure: Initially faced challenges with data modeling, especially
        regarding product attributes that varied widely.

      • Solution: Used a flexible schema to accommodate various product types,
        leveraging embedded documents for related data (e.g., reviews, images).

      • Performance: As the application grew, we encountered performance issues
        with complex queries.

      • Solution: Implemented indexing on frequently queried fields and utilized
        aggregation pipelines for reporting.

2. Real-Time Chat Application

   ◦ Description: Created a chat application where users could send messages in real
     time. MongoDB was used to store messages, user information, and chat room data.

   ◦ Challenges:

      • Scalability: Faced challenges when scaling to support many concurrent
        users.

      • Solution: Implemented sharding in MongoDB to distribute data across
        multiple servers, ensuring that the system could handle increased load
        without performance degradation.

3. Social Media Analytics Tool

   ◦ Description: Developed a tool that analyzed social media data, collecting posts,
     comments, and user interactions for insights.

   ◦ Challenges:

      • Data Volume: The volume of incoming data was high, leading to difficulties
        in processing and storing information efficiently.

      • Solution: Used MongoDB’s change streams to capture real-time updates,
        allowing for responsive analytics and quick data processing.

CRUD Operations in Application Architecture

• RESTful APIs: In a typical RESTful API architecture, CRUD operations are mapped to HTTP
methods:

   ◦ Create: POST /api/items (to add a new item)

   ◦ Read: GET /api/items (to retrieve a list of items) or GET /api/items/:id (to retrieve a specific item)

   ◦ Update: PUT /api/items/:id (to update an item)

   ◦ Delete: DELETE /api/items/:id (to delete an item)

Each endpoint corresponds to a specific controller method that interacts with the MongoDB
database using a library like Mongoose for ODM (Object Data Modeling).

• Microservices: In a microservices architecture, each service can manage its own MongoDB
instance or collection:

   ◦ Service Communication: Services communicate through APIs (e.g., REST or
     GraphQL). Each service handles specific functionalities (e.g., user service, product
     service) and performs CRUD operations relevant to its domain.

   ◦ Database Design: Each service can use its own schema in MongoDB, allowing for
     independence and scalability.

Common Interview Questions

1. What are the benefits of using MongoDB?

   ◦ Flexible Schema: Allows for a dynamic schema that can evolve as application
     requirements change.

   ◦ High Availability: Supports replication through replica sets, ensuring data
     redundancy and fault tolerance.

   ◦ Scalability: Easily scales horizontally via sharding, making it suitable for large-scale
     applications.

   ◦ Rich Query Language: Supports a powerful aggregation framework and indexing
     capabilities for efficient data retrieval.

   ◦ Geospatial Queries: Provides support for geospatial data, making it ideal for
     location-based applications.

2. How does MongoDB handle relationships between documents?

   ◦ Embedded Documents: Related data can be stored within a single document,
     making data retrieval more efficient and reducing the need for joins.

   ◦ References: MongoDB can also use references, where documents in one collection
     contain references (e.g., ObjectIds) to documents in another collection, allowing for
     normalized data structures. Developers can choose the method based on use cases,
     balancing performance and data integrity.

3. Explain the difference between a document and a collection in MongoDB.

   ◦ Document: A document is a single record in MongoDB, stored in BSON format. It can
     contain various fields and data types, including arrays and nested documents. Each
     document has a unique _id field.

   ◦ Collection: A collection is a group of documents within a MongoDB database, similar
     to a table in a relational database. Collections do not enforce a schema, allowing
     different documents within the same collection to have varying structures.
