Skip to content

arnaumunozbarrera/Non-relational-Distributed-DBs-Project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Non relational & Distributed DBs for MongoDB with noSQLBooster

Author:

Arnau Muñoz Barrera

Introduction

In this project I worked with non-relational databases (MongoDB) and build a distributed system.I also create views and analyze query performance with execution plans. Specifically, I worked on replication, sharding and creation/deletion/manipulation of indexes and queries in mongoDB with a collection of geographic locations (geonames).

Prerequisites & tools

  • Free version of noSQLBooster, VS Code (IDE to code using Python).
  • insertGeonames.ipynb. That will allow you to insert the data on the DB Geonames.

📄 Project Report

You can see the full report generated by Deepwiki here:

View Deepwiki Report

Distributed Architecture

As shown in the following Figure, the distributed architecture in mongoDB will be made up of 3 elements: the shards, the configuration servers and the router. The shards contain a horizontal partition of the data and can be MongoDB servers running in standalone mode or replica sets. Configuration servers are MongoDB servers that contain distributed system metadata. In them is all the information related to the organization of the data (they are the ones who know in which shard each piece of data is). Finally, the router (mongos) is responsible for distributing the transactions between the different shards and collecting the response to return it to the client application that made the request. MongoDB allows us to have more than one server running on the same machine. We only need to specify a directory to store the data files (dbpath) and a port to listen to requests. We will take advantage of this feature to create a distributed system with 9 MongoDB servers on the 3 machines you have. The proposed architecture is as follows:

  • A mongos server that will run on the main machine, and will listen on port 27017.
  • 2 shards. The first shard, which we will call rs0, will be a replica set made up of 3 nodes that will listen on port 37017: one on main, one on mongo-1 and one on mongo-2. The second shard, which we will call rs1, will also consist of a replica set distributed on the 3 machines but will listen on port 47017.
  • The configuration server formed by a replica set of 3 nodes: main, mongo-1 and mongo-2. And it will listen on port 57017. Figure 1.Distributed System Architecture

About

A project exploring distributed, non-relational databases, replication and data partitioning.

Topics

Resources

Stars

Watchers

Forks