Table of contents
- Performing regex queries with PyMongo
- How to create a volume and launch a mongoDB container using that volume
- Transfer database from one server to another server
- Install mongodb in Centos7
- Aggregation pipeline
- update array using filter
- restart MongoDB
- projection in pyMongo
- Copy one collection from one database to another
- Remove one field in collection
- Rename the field in collection
- append an item to an array in mongoDB
Performing regex queries with PyMongo
import re

listTumorType = tumorType.split('/')
query = {}  # build a query over multiple fields
tumorType_str = '|'.join(listTumorType)
# group the alternatives so that \b applies to every tumor type, not only the first and last
patternTumorType = re.compile(rf"\b(?:{tumorType_str})\b")
query['tumorType'] = patternTumorType
patternGeneName = re.compile(rf"\b{geneName}\b", re.IGNORECASE)
query['geneName'] = patternGeneName
patternScope = re.compile(rf"\b{scope}\b", re.IGNORECASE)
query['scope'] = patternScope
# return only listComment and listReference, without _id
result = list(collectionComment.find(query, {'listComment': True, 'listReference': True, '_id': False}))
Note:
- r"": Defines a raw string, which prevents backslashes from being interpreted as escape sequences.
- \b: Matches a word boundary. E.g. r"\b(apple|banana)\b" ensures that only the whole words “apple” or “banana” are matched, not parts of other words (e.g., “pineapple”).
- |: Acts as an “OR” operator, matching either the pattern before or after it.
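The points in the note can be checked with Python's re module alone. The sketch below is only an illustration: the helper name build_word_pattern and the sample strings are made up, and re.escape is added on the assumption that the search terms are literal text rather than regex syntax.

```python
import re

def build_word_pattern(terms, ignore_case=False):
    """Compile a pattern that matches any of `terms` as a whole word."""
    # Escape each term so characters such as '+' or '.' are matched literally.
    alternatives = "|".join(re.escape(term) for term in terms)
    flags = re.IGNORECASE if ignore_case else 0
    # The (?:...) group makes both \b anchors apply to every alternative.
    return re.compile(rf"\b(?:{alternatives})\b", flags)

pattern = build_word_pattern(["apple", "banana"])
print(bool(pattern.search("one apple a day")))   # whole word: match
print(bool(pattern.search("pineapple juice")))   # part of a longer word: no match
```

Without the group, r"\bapple|banana\b" reads as "\bapple" OR "banana\b", so it would also match a string such as "applesauce".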
How to create a volume and launch a mongoDB container using that volume
- Create a volume that is bound to a local path (change the device path and the volume name to your own):
docker volume create --driver local \
  --opt type=none \
  --opt device=/path/to/your/local \
  --opt o=bind \
  volume_name
- List the volumes and inspect the new one; `docker volume inspect` shows the mount details:
docker volume ls
docker volume inspect volume_name
- Create the MongoDB container using the volume. Change `-it --rm` to `-d --restart always` to run the container in the background and restart it automatically. Change the container name, hostname and network to your own. If port 27017 is already in use on the host, map another one, e.g. `-p 27018:27017`:
docker run -it --rm \
  --name container-mongodb \
  --hostname my-mongodb \
  -v volume_name:/data/db \
  -p 27017:27017 \
  --network share_network_with_other_containers \
  mongo:latest
- Check the container
docker container ls | grep mongodb
- Test the mongodb
mongosh --port 27017 # change the port number to the number you used above
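If the container is launched from a script, the choices described above (interactive versus background, host port, network) can be made explicit as parameters. This is a rough sketch, not a definitive launcher: the function name docker_run_mongo_args and its parameters are hypothetical, and it only builds the argument list without running Docker.

```python
import shlex

def docker_run_mongo_args(volume, name, host_port=27017, network=None, detached=False):
    """Build the `docker run` argument list for a MongoDB container."""
    args = ["docker", "run"]
    # Background containers restart automatically; interactive ones are removed on exit.
    args += ["-d", "--restart", "always"] if detached else ["-it", "--rm"]
    args += ["--name", name, "-v", f"{volume}:/data/db"]
    # Only the host side of the mapping changes; the container side stays 27017.
    args += ["-p", f"{host_port}:27017"]
    if network:
        args += ["--network", network]
    args.append("mongo:latest")
    return args

args = docker_run_mongo_args("volume_name", "container-mongodb", host_port=27018, detached=True)
print(shlex.join(args))
```

The resulting list can be passed to subprocess.run(args, check=True) on a machine where Docker is installed.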
Transfer database from one server to another server
- Dump the database with mongodump:
mongodump --db db_name --archive=./mongodbBackup/db_name.dump --gzip
- Transfer the dump file with scp:
scp ./mongodbBackup/db_name.dump user_name@ip_address:path_to_save
- Log in to the other server.
- Restore the database:
mongorestore --gzip --archive=db_name.dump
mongorestore loads the data into the running mongod instance, so it ends up in that server's data directory. The directory is set by dbPath in /etc/mongod.conf.
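The three commands above can also be generated from one set of names, which keeps the archive file name consistent between the dump, the copy and the restore. A minimal sketch, assuming a hypothetical helper transfer_commands; it only builds the command lines and does not run them.

```python
import shlex

def transfer_commands(db_name, remote, remote_dir, backup_dir="./mongodbBackup"):
    """Return the dump, copy and restore commands as argument lists."""
    archive = f"{backup_dir}/{db_name}.dump"
    dump = ["mongodump", "--db", db_name, f"--archive={archive}", "--gzip"]
    copy = ["scp", archive, f"{remote}:{remote_dir}"]
    # The restore command is meant to be run on the destination server, inside remote_dir.
    restore = ["mongorestore", "--gzip", f"--archive={db_name}.dump"]
    return dump, copy, restore

for command in transfer_commands("db_name", "user_name@ip_address", "path_to_save"):
    print(shlex.join(command))
```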
Install mongodb in Centos7
- Create the repository file:
sudo vi /etc/yum.repos.d/mongodb-org.repo
- Enter the following content in /etc/yum.repos.d/mongodb-org.repo:
[mongodb-org-6.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/7/mongodb-org/6.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-6.0.asc
- Install the package:
sudo yum install mongodb-org
- Start the service:
sudo systemctl start mongod
- Check the status:
sudo systemctl status mongod
- If you change /etc/mongod.conf, for example to set a different dbPath, restart the MongoDB service so the change takes effect:
sudo service mongod restart
- If you change dbPath to a location that the MongoDB service has no permission to write to, the service fails to start and its status shows as failed. Either change dbPath back or fix the permissions of the folder.
Uninstall MongoDB:
- Stop the service:
sudo service mongod stop
- Remove the installed package:
sudo yum erase $(rpm -qa | grep mongodb-org)
- Remove data files:
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongo
Aggregation pipeline
Check the official documentation for the available pipeline stages.
You can use an aggregation pipeline to retrieve documents based on complex conditions.
from flask import jsonify, request

def searchFusions():
    # Flask view; collectionReport is a PyMongo collection defined elsewhere
    tumorType = request.form.get('tumorType', None)
    geneName1 = request.form.get('geneName1', None)
    geneName2 = request.form.get('geneName2', None)
    tier = request.form.get('tier', None)
    listAnd = []
    if geneName1:
        listAnd.append({"$regexMatch": {"input": "$$fusion.gene", "regex": geneName1, "options": "i"}})
        # "$$fusion" refers to the variable declared with "as" in $filter below, hence the double $
        # Do not wrap the pattern in slashes (f"/{geneName1}/") as you would in the shell
    if geneName2:
        listAnd.append({"$regexMatch": {"input": "$$fusion.gene", "regex": geneName2, "options": "i"}})
    if tier:
        listAnd.append({"$eq": ["$$fusion.tier", tier]})
    stageMatchTumorType = {}
    if tumorType:
        stageMatchTumorType = {
            '$match': {'dictCaseInfo.tumorType': {'$regex': tumorType, '$options': 'i'}}
        }
    # keep only the elements of listFusion that satisfy every condition
    stageProjectFilter = {
        "$project": {
            "listFusion": {
                "$filter": {
                    "input": "$listFusion",
                    "as": "fusion",
                    "cond": {"$and": listAnd},
                },
            },
            'dictCaseInfo': True
        }
    }
    stageProjectFields = {
        "$project": {
            '_id': True,
            'listFusion': True,
            'dictCaseInfo': True,
            "numberOfFusion": {"$size": "$listFusion"},
        }
    }
    stageMatchRemoveEmptyResult = {
        "$match": {
            "numberOfFusion": {"$gt": 0}
        }
    }
    # narrow down the documents and arrange the results
    pipeline = [stageMatchTumorType, stageProjectFilter, stageProjectFields, stageMatchRemoveEmptyResult]
    # drop the tumor type stage when no tumor type was submitted
    pipeline = [element for element in pipeline if len(element) > 0]
    listRecord = list(collectionReport.aggregate(pipeline))
    # The formatting step is not shown in these notes; as a minimal stand-in (an assumption),
    # convert _id to str, because ObjectId is not JSON serializable
    listFormatRecord = [{**record, '_id': str(record['_id'])} for record in listRecord]
    return jsonify(listFormatRecord)
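To try the pipeline outside the web handler, the stage-building logic can be separated from the request. The sketch below is one possible arrangement, not the original code: the function name build_fusion_pipeline and its parameters are hypothetical, and it returns plain dicts so it can be inspected without a database.

```python
def build_fusion_pipeline(tumor_type=None, gene_names=(), tier=None):
    """Assemble the aggregation pipeline from whichever filters are set."""
    # One case-insensitive regex condition per gene name that was supplied.
    conditions = [
        {"$regexMatch": {"input": "$$fusion.gene", "regex": name, "options": "i"}}
        for name in gene_names if name
    ]
    if tier:
        conditions.append({"$eq": ["$$fusion.tier", tier]})

    pipeline = []
    if tumor_type:
        pipeline.append(
            {"$match": {"dictCaseInfo.tumorType": {"$regex": tumor_type, "$options": "i"}}}
        )
    pipeline += [
        # Keep only the array elements that satisfy every condition.
        {"$project": {
            "listFusion": {"$filter": {
                "input": "$listFusion", "as": "fusion", "cond": {"$and": conditions}}},
            "dictCaseInfo": True,
        }},
        # Count the surviving elements, then drop documents with none left.
        {"$project": {
            "_id": True, "listFusion": True, "dictCaseInfo": True,
            "numberOfFusion": {"$size": "$listFusion"},
        }},
        {"$match": {"numberOfFusion": {"$gt": 0}}},
    ]
    return pipeline

print(len(build_fusion_pipeline()))                        # no tumor type: three stages
print(len(build_fusion_pipeline("lung", ["ALK"], "1")))    # with tumor type: four stages
```

The result can be passed directly to collectionReport.aggregate(...). Building the list conditionally also removes the need to filter out the empty placeholder stage afterwards.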
update array using filter
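To update specific elements of an array, combine the filtered positional operator $[element] with array_filters:
from bson import ObjectId

query = {'_id': ObjectId(ID)}
# replace every element of listArray that matches the array filter
update = {"$set": {'listArray.$[element]': updateStuff}}
# "element" must be the same identifier as in $[element]
arrayFilter = [{"element.ID2": ID2}]
result = collectionImportant.update_one(filter=query, update=update, array_filters=arrayFilter)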
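To see what the update above does to a document, the sketch below builds the same two arguments and imitates their effect on a plain dict. The helper names and the sample document are hypothetical, and apply_locally is only a pure-Python imitation for illustration, not how the server implements the operator.

```python
import copy

def build_array_element_update(array_field, match_key, match_value, new_element):
    """Return (update, array_filters) for replacing matching array elements."""
    update = {"$set": {f"{array_field}.$[element]": new_element}}
    array_filters = [{f"element.{match_key}": match_value}]
    return update, array_filters

def apply_locally(document, array_field, match_key, match_value, new_element):
    """Imitate the update on a plain dict: matching elements are replaced whole."""
    result = copy.deepcopy(document)
    result[array_field] = [
        new_element if item.get(match_key) == match_value else item
        for item in result[array_field]
    ]
    return result

doc = {"listArray": [{"ID2": "a", "value": 1}, {"ID2": "b", "value": 2}]}
new_element = {"ID2": "b", "value": 20}
update, array_filters = build_array_element_update("listArray", "ID2", "b", new_element)
print(update)
print(array_filters)
print(apply_locally(doc, "listArray", "ID2", "b", new_element))
```

Note that the whole matching element is replaced. To change a single field inside it, target a path such as 'listArray.$[element].value' instead.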
restart MongoDB
When the server has crashed and you cannot connect to MongoDB after typing mongosh, restart the MongoDB service as shown below.
sudo service mongod restart
projection in pyMongo
query = {'status':'active', 'testOrdered':testOrdered,'panel':panel,'specimen':specimen}
list_tag = list(collection_rule_extraction.find(query, projection={'_id':False,'tag':True}))
list_tag = [rule['tag'] for rule in list_tag]
Use True or False in the projection to specify which fields are returned.
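The True/False rule can be tried on a plain dict. In the sketch below, build_projection and project_locally are hypothetical helpers; the second one only imitates an inclusion projection on top-level fields and is not part of PyMongo.

```python
def build_projection(include, keep_id=False):
    """Return a projection that includes only the given fields."""
    projection = {field: True for field in include}
    # _id is returned by default, so it has to be switched off explicitly.
    if not keep_id:
        projection["_id"] = False
    return projection

def project_locally(document, projection):
    """Imitate an inclusion projection on a plain dict (top-level fields only)."""
    wanted = {field for field, flag in projection.items() if flag}
    if projection.get("_id", True):
        wanted.add("_id")
    return {key: value for key, value in document.items() if key in wanted}

rule = {"_id": 1, "tag": "EGFR", "status": "active", "panel": "solid"}
projection = build_projection(["tag"])
print(projection)
print(project_locally(rule, projection))
```

In an inclusion projection like this one, _id is the only field that may be set to False; other fields are simply left out.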
Copy one collection from one database to another
- Install the MongoDB Database Tools, which provide mongodump and mongorestore.
- In the shell, back up the collection: mongodump -d some_database -c some_collection (e.g. mongodump -d comment -c comment_ST)
- Restore the collection to the other database: mongorestore -d some_other_db -c some_or_other_collection dump/some_database/some_collection.bson (e.g. mongorestore -d report -c comment_ST dump/comment/comment_ST.bson)
- Done!
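When both databases are reachable from one Python process, the same copy can be sketched with PyMongo's find and insert_many instead of dump files. This is an alternative to the steps above, not what they do: the function name copy_collection and the batch size are assumptions, and it copies documents only, not indexes.

```python
def copy_collection(source, target, batch_size=1000):
    """Copy every document from `source` to `target` in batches; return the count."""
    copied = 0
    batch = []
    for document in source.find():
        batch.append(document)
        if len(batch) >= batch_size:
            target.insert_many(batch)
            copied += len(batch)
            batch = []
    # insert_many rejects an empty list, so only flush a non-empty remainder.
    if batch:
        target.insert_many(batch)
        copied += len(batch)
    return copied
```

With a MongoClient named client, a call could look like copy_collection(client["comment"]["comment_ST"], client["report"]["comment_ST"]).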
Remove one field in collection
db.example.update({}, {$unset: {words:1}} , {multi: true});
Add one field in collection:
db.example.update({}, {$set: {'status':'active'}} , {multi: true});
Rename the field in collection
Open the mongosh command-line interface and run:
db.collectionName.update({}, {
    $rename: {
        "old": "new"
    }
}, false, true);
//e.g.
db.foo.update({}, {
    $rename: {
        "name.additional": "name.last"
    }
}, false, true);
The false, true arguments in the calls above stand for { upsert: false, multi: true }. You need multi: true to update all your records.
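The three field operations above (remove, add, rename) can be written in PyMongo with update_many, which applies the change to every matching document, so no multi flag is needed. A minimal sketch; the helper names are hypothetical and `collection` is assumed to be a PyMongo collection.

```python
def remove_field(collection, field):
    """Remove `field` from every document in the collection."""
    # The value given to $unset is ignored; an empty string is the usual placeholder.
    return collection.update_many({}, {"$unset": {field: ""}})

def add_field(collection, field, value):
    """Set `field` to `value` on every document in the collection."""
    return collection.update_many({}, {"$set": {field: value}})

def rename_field(collection, old_name, new_name):
    """Rename `old_name` to `new_name`; dotted paths such as 'name.last' are allowed."""
    return collection.update_many({}, {"$rename": {old_name: new_name}})
```

For example, rename_field(db.foo, "name.additional", "name.last") corresponds to the second shell call above.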
append an item to an array in mongoDB
# Collection.update() was removed in PyMongo 4; use update_one (or update_many)
coll.update_one({'ref': ref}, {'$push': {'tags': new_tag}})
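Two common variations on appending are adding several items at once and skipping items that are already present. The sketch below builds the update document for both cases; the helper name build_push_update is hypothetical, while $push, $addToSet and $each are standard MongoDB update operators.

```python
def build_push_update(field, items, unique=False):
    """Build an update document that appends `items` to the array `field`."""
    # $addToSet skips values already in the array; $push always appends.
    operator = "$addToSet" if unique else "$push"
    # $each appends every item individually instead of one nested list.
    return {operator: {field: {"$each": list(items)}}}

print(build_push_update("tags", ["new_tag"]))
print(build_push_update("tags", ["a", "b"], unique=True))
```

The returned dict is used as the update argument, in place of the literal {'$push': ...} document.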