Mongodb Joining Sample ($Lookup)

The case is that a website, where users can post topics under a certain section, so it provides a kind of classification for the posts, helping in the grouping, searching, … etc

I did normalize this NoSQL database, by separating Posts and Sections into two collections, So when loading sections it’s simpler, they are not embedded in the “Posts” Collection, So I don’t need to load all posts and bringing “Section” attribute, by projecting other attributes, this could lead to performance issues, due to the duplication of the same”Section” value in more posts, so add that you need to [ Project Other attributes, and skip duplicated values ], So we are not going to do this.

So, we have :


Two Collections:
Posts, Sections


Purpose: 
Fetching all Posts(Objects) from “Posts” collection, according to how many Section(Text) in “Sections” collection, and then arranging them into an object of structure :

{ Section : "" Section Name "" , A: "" array of posts have the same section name in their objects "" }
db.getCollection('Sections').aggregate([
{$lookup:{from:"Posts", localField:"Section",
foreignField:"Section", as:"A"}},
{$unwind: "$A"},
{$limit:2},
{$project:{"_id":0}},
{$group:{_id: "$A._id",Section:{$first:"$A.Section"},
"A":{$push:{logo:"$A.logo",PostTitle:"$A.PostTitle",
forwardUrl:"$A.forwardUrl",Desc:"$A.Desc"}}}},
{$project:{"_id":0}}])

Explanation :

$lookup

is the keyword for joining two collections based on local and foreign keys.

$unwind

is used to separate the result came from the joining which is an “Array”, into objects, ( each array element, becomes an inner object of a duplicated parent object ) as the following :

From :{Section : "ABC", A:[ {Name:"Post1"} , {Name:"Post2"} ]
To :{Section : "ABC" , A:{Name:"Post1"}},
{Section : "ABC", A:{Name:"Post2"}}

$limit

is to limit the results of the inner array ( after it’s “unwind“ed ), as e.g.  i need two inner posts for this specific “Section”.

$project

is to remove unwanted “key:value” object from the document ( discarding the fetching of it ).

$group

is to re-collect the documents fetched from the $lookup, into a different document format, where the document structure differs from its original structure.   e.g.  down there in the sample,  you are going to notice that $group is reforming the structure of the documents, “_id” is mandatory.

Results:

[{"Section":"Sports","A":[{"logo":"logo-1550809687524.png",
"PostTitle":"Football",
"forwardUrl":"1t0cc2khciars8"}]},{"Section":"Sports",
"A":[{"logo":"logo-1550809687524.png","PostTitle":"Football",
"forwardUrl":"1t0cc2khciars8"}]},{"Section":"Games",
"A":[{"logo":"logo-1550769840632.jpeg","PostTitle":"TicTacToe",
"forwardUrl":"1t0221k8m4kk93",
"Desc":"Playing TicTacToe"}]},{"Section":"Sports",
"A":[{"logo":"logo-1550809687524.png","PostTitle":"Football",
"forwardUrl":"1t0cc2khciars8"}]},
{"Section":"Games",
"A":[{"logo":"logo-1550769840632.jpeg","PostTitle":"TicTacToe",
"forwardUrl":"1t0221k8m4kk93",
"Desc":"Playing TicTacToe"}]},{"Section":"Sports",
"A":[{"logo":"logo-1550809687524.png","PostTitle":"Football",
"forwardUrl":"1t0cc2khciars8"}]},
{"Section":"Sports",
"A":[{"logo":"logo-1550809687524.png","PostTitle":"Football",
"forwardUrl":"1t0cc2khciars8"}]},{"Section":"Sports",
"A":[{"logo":"logo-1550809687524.png","PostTitle":"Football",
"forwardUrl":"1t0cc2khciars8"}]},{"Section":"Sports",
"A":[{"logo":"logo-1550809687524.png","PostTitle":"Football",
"forwardUrl":"1t0cc2khciars8"}]},{"Section":"Sports",
"A":[{"logo":"logo-1550809687524.png","PostTitle":"Football",
"forwardUrl":"1t0cc2khciars8"}]},{"Section":"Sports",
"A":[{"logo":"logo-1550809687524.png","PostTitle":"Football",
"forwardUrl":"1t0cc2khciars8"}]}]

Additional:

Change (_id: “$A._id”) in the sample above (inside $group ) to (_id: “$_id”), the reason is that you want to make all the inner objects inside the _id which is the id of the object itself not the id of each inner object.
by saying $A._id this means to take multiple duplicated section object and assign the A._Id to each one.
you will not get an array under one _id (parent id) , but you are going to get duplicated object each one of the duplicated is assign to one index of the array, so no array ( multiple unwind inner objects to single duplicated parent object )


Note, joining in MongoDB in some cases could lead to consumption of a lot of machine resources, depending on the complexity of your case.