Tech C**P
13 subscribers
161 photos
9 videos
59 files
304 links
مدرس و برنامه نویس پایتون و لینوکس @alirezastack
Download Telegram
What is shard and replication in mongoDB? What is their differences?

MongoDB has 2 concepts that may lead even intermediate programmers to confusion! So let's break it down and explain both in depth.

1- Take a deep breath. :)

2- Replication: replicate means reproducing or making an exact copy of something. In MongoDB replication, mirror all data sets into another server. This is process is used for fault tolerance. If there are 4 mongo servers and your dataset is 1 terabyte, each node in replica-set will have 1 terabyte of data.
In replica-set there is ONE master (primary) node, and one or more slaves (secondary). Read performance can be improved by adding more and more slaves, but not writes! Adding more slaves does not affect writes, that's because all writes goes to master first and then will be propagated to other slaves.

3- Sharding: sharding on the other hand has completely a different concept. If you have a server with 1 terabyte of data and you have 4 servers, then each nore will have 250 gigabyte of data each. As you may have guessed it is not fault tolerant because each part of data resides in a separate server. Each read and write will be sent to the corresponding section. So if you add more shards, both read and write performance will be improved in the cluster. When one shard of the cluster goes down, any data on it is inaccessible. For that reason each member of the cluster should also be a replica-set, but not required to.

4- Take another deep breath, and let's get back to work.

#mongodb #mongo #shard #replica #replication #sharding #cluster
In order to connect to MongoDB replica set in Python you can give all server node addersses to MongoClient. Addresses passed to MongoClient() are called the seeds. As long as at least one of the seeds is online, MongoClient discovers all the members in the replica set, and determines which is the current primary and which are secondaries or arbiters.


Sample usages:

>>> MongoClient('localhost', replicaset='foo')
MongoClient(host=['localhost:27017'], replicaset='foo', ...)
>>> MongoClient('localhost:27018', replicaset='foo')
MongoClient(['localhost:27018'], replicaset='foo', ...)
>>> MongoClient('localhost', 27019, replicaset='foo')
MongoClient(['localhost:27019'], replicaset='foo', ...)
>>> MongoClient('mongodb://localhost:27017,localhost:27018/?replicaSet=foo')
MongoClient(['localhost:27017', 'localhost:27018'], replicaset='foo', ...)

Read full details here:

- https://api.mongodb.com/python/current/examples/high_availability.html#connecting-to-a-replica-set


#database #mongodb #mongo #replica_set #replication #pymongo #arbiter #master #primary #slave
Secondary Reads

By default an instance of MongoClient sends queries to the primary member of the replica set. To use secondaries for queries we have to change the read preference:

>>> client = MongoClient(
... 'localhost:27017',
... replicaSet='foo',
... readPreference='secondaryPreferred')
>>> client.read_preference
SecondaryPreferred(tag_sets=None)


Now all queries will be sent to the secondary members of the set. If there are no secondary members the primary will be used as a fallback. If you have queries you would prefer to never send to the primary you can specify that using the secondary read preference.

#mongodb #replica_set #replication #secondary #slave #pymongo
How to configure a Delayed Replica Set Member?

Let's assume that our member is third in the array of replica members:

cfg = rs.conf()
cfg.members[2].priority = 0
cfg.members[2].hidden = true
cfg.members[2].slaveDelay = 3600
rs.reconfig(cfg)

The priority is set to 0 (preventing to be elected as primary).

The hidden to true in order to hide the node from clients querying the database.

And finally slaveDelay to number of seconds that we want it to get behind of Primary Node.

The use case for this is to have a replication that is used for analytical purposes or used for backup and so on.

#mongodb #mongo #replica #replication #primary #delayed_replica_set #slaveDelay
In order to see how much time your mongoDB slave is behind the primary node:

rs0:SECONDARY> db.printSlaveReplicationInfo()
source: mongo.mongo.com:27017
syncedTo: Mon Nov 12 2018 06:33:40 GMT+0000 (UTC)
-4 secs (0 hrs) behind the primary

#mongodb #mongo #slave #printSlaveReplicationInfo #replica #replication