Using A Conditional Join To Flag Group Membership

Author: Steven Neiland
Published:

Warning: This blog entry was written two or more years ago. Therefore, it may contain broken links, out-dated or misleading content, or information that is just plain wrong. Please read on with caution.

If I've said it once I've said it a hundred times, only get the data you need from the database. Yet time and again I come across applications that load more data than is needed to accomplish a task, or go about parsing it in the most inefficient way possible.

This particular example is one I come across a lot.

Example: Group Membership

Lets take a common design problem. You have a table of users, a table of groups and you need to create a many-to-many relationship between the two. The database design is fairly simple.

Users
PKuserId
 username
Groups
PKgroupId
 groupname
GroupMembership
PKuserId
PKgroupId

Now in you application you wish to create an interface listing all possible groups that a user could be a member of, and flag those groups which the user actually is a member of. Lets say this is a form with a list of groups and checkboxes.





The Wrong Way

The way I normally see this solved is by using two queries, one to get all possible groups and another to get all groups the given user is a member of. Then the application loops over all the possible groups, and for each possible group iteration it does a nested loop over all member groups to look for matching groupId's like this.

Aside: This is written in cfml, but I've seen the pattern in just about every language I've come across.

Get All Possible Groups

SELECT
      g1.groupId
      , g1.groupName
FROM
      Groups AS g1

Get All Groups This User Is A Member Of

SELECT
      m1.groupId
FROM
      GroupMembership AS m1
WHERE
      userId = 1

Generate The Form Fields

<!--- Loop over all possible groups --->
<cfloop query="allGroups">
      <cfset isMember = false>
      
      <!--- Loop over all groups the user is a member of --->
      <cfloop query="usersGroups">
            <!--- Check the box if user is a member of this group --->
            <cfif allGroups.groupId EQ usersGroups.groupId>
                  <cfset isMember = true>
                  <cfbreak>
            </cfif>
      </cfloop>

      <label>#allGroups.groupName#</label>

      <input
            type="checkbox"
            name="memberGroupId"
            value="#allGroups.groupId#"
            <cfif isMember>
checked="checked"</cfif>><br/>
</cfloop>

So Whats Wrong With This?

Well on the face of it not too much. It generates the output we want and marks the correct checkbox's. It even passes my rule of only getting enough data to perform the action.

All in all its not that bad. In fact a few years ago I frequently used this pattern myself (before I knew better).

The problem with this solution has two parts. Firstly this solution requires two separate calls to the database to get related data in order to generate one list. Secondly but more importantly it requires nested loops.

Why Is This Bad

Continuing the users and group example. Say you have 50 possible groups, and a given user is a member of 10 of those groups. In order to generate the above output you could do as many as 500 comparisons between the two result sets. This is mitigated somewhat by having a break in the inner loop but even so as either group grows in size the total number of loop iterations multiplies.

The Right Way

The right way to solve this problem is actually very simple to implement and even requires less coding. What we need to do is generate a single result set with the {groupId,groupName} columns as before but with an additional third flag column showing if the given user is a member or not.

Step 1: Get All Possible Groups

To start lets get all the possible groups as before.

SELECT
      g1.groupId
      , g1.groupName
FROM
      Groups AS g1

Step 2: Join The GroupMembership Table

Next we join on the group membership table. Note that we use a "Left Join" so that we are guarenteed to always get all possible groups even if they are not currently in the groupMembership table.

SELECT
      g1.groupId
      , g1.groupName
FROM
      Groups AS g1
LEFT JOIN
      GroupMembership AS m1
            ON m1.groupId = g1.groupId

Step 3: Limit By User

Now that we have all groupMemberships joined we filter them down to our target user, in this case the user with id of 1.

SELECT
      g1.groupId
      , g1.groupName
FROM
      Groups AS g1
LEFT JOIN
      GroupMembership AS m1
            ON m1.groupId = g1.groupId
            AND m1.userId = 1

You will note that the userId condition is on the join statement and not a where clause. The significance of this is that if we put it on a where clause we would only get groups that the user is a member of, when in fact we want all possible groups.

Step 4: Create A Flag

Finally we need to create a flag to mark those groups which this user is a member of. To do this we use the fact that the value of m1.userId will be NULL when the user is not a member of a particular group. So we simply test for this condition with a CASE statement like this and output the result as a simple true/false flag.

SELECT
      g1.groupId
      , g1.groupName
      , CASE WHEN m1.userId IS NOT NULL
            THEN 1
            ELSE 0
       END AS isMember
FROM
      Groups AS g1
LEFT JOIN
      GroupMembership AS m1
            ON m1.groupId = g1.groupId
            AND m1.userId = 1

The Result

Using the resultset generated from this single query, our application code to generate the form is significantly simplified and requires less overall processing.

<!--- Loop over all possible groups --->
<cfloop query="groupMembership">
      <label>#groupMembership.groupName#</label>

      <input
            type="checkbox"
            name="memberGroupId"
            value="#groupMembership.groupId#"
            <cfif groupMembership.isMember>
checked="checked"</cfif>><br/>
</cfloop>

Granted this example is somewhat simplified but the theory can be applied in many similar situations to good effect. So please, lets cut down on the nested loops.

Related Blog Postings

Reader Comments

Patrick Spenceley's Gravatar
Patrick Spenceley
Tuesday, December 11, 2012 at 6:34:37 AM Coordinated Universal Time

You can also use the isNull function rather than the switch

isnull(m1.userId,0) AS isMember

Nice post (+1)

  • Please keep comments on-topic.
  • Please do not post unrelated questions or large chunks of code.
  • Please do not engage in flaming/abusive behaviour.
  • Comments that contain advertisments or appear to be created for the purpose of link building, will not be published.

Archives Blog Listing