Do you have lots of blog posts, possibly spanning a number of years? Do you suspect that, despite embedding search and tag clouds in your blog, your readers are still not finding content related to what they enjoy? Do you even have evidence of that very problem in your web analytics data, but don’t know what to do about it? Yes? So did I, so I created this Roller Weblogger code to generate a list of the most related blog entries based on tag and category relationships.
To show you what I mean, here are examples for a couple of blog entries:
Messaging Sub-Systems in the UK Government
and
My banana yellow Yamaha V-Max
The way this works is that, for any given blog post, it attempts to match both category and tags to find the entries that are genuinely most related: first it cycles through entries in the same category looking for tag matches, then through an aggregate of all categories looking for tag matches, and finally falls back to entries related by category alone.
So this has to be my favourite Roller Weblogger functionality to date, simply because it offers up the content most pertinent to what the reader is currently looking at, based upon tag affinity, without forcing them to search or click endlessly through tag clouds (which often contain a large number of posts with very little contextual relationship).
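If template syntax is hard to follow, the three passes described above can be sketched in plain Python. This is only an illustration of the idea, not the template itself: the `Entry` class and the `related_entries` function are hypothetical names, and the sketch leaves out the locale check and the limits on how many recent entries are scanned.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    # Hypothetical stand-in for a Roller weblog entry.
    title: str
    category: str
    tags: set = field(default_factory=set)


def related_entries(current, recent, limit=10):
    """Return up to `limit` (entry, score) pairs; `recent` is newest first."""
    others = [e for e in recent if e.title != current.title]
    # Passes 1 and 2: same-category candidates first, then everything else.
    same = [e for e in others if e.category == current.category]
    rest = [e for e in others if e.category != current.category]

    scored = []
    for e in same + rest:
        shared = len(current.tags & e.tags)
        if shared == 0:
            continue
        # A shared category counts as one extra relationship.
        bonus = 1 if e.category == current.category else 0
        scored.append((e, shared + bonus))

    # sorted() is stable, so entries with equal scores keep their input order.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]

    # Pass 3: pad with recent same-category entries if the list is short.
    chosen = {e.title for e, _ in ranked}
    for e in same:
        if len(ranked) >= limit:
            break
        if e.title not in chosen:
            ranked.append((e, 1))
            chosen.add(e.title)
    return ranked


if __name__ == "__main__":
    post = Entry("A", "tech", {"x", "y"})
    history = [
        post,
        Entry("B", "tech", {"x"}),
        Entry("C", "life", {"x", "y"}),
        Entry("D", "tech"),
    ]
    for entry, score in related_entries(post, history, limit=3):
        print(entry.title, score)
```

Scoring the same-category candidates first means that, on a tie, an entry from the reader's current category is listed ahead of one from elsewhere.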
Here’s the code:
## Related Entries
## Cycles through 100 articles of the same category - look for entries with tag matches - order by no. of matches
## Cycles through 20 articles of any category (ignoring the previous category) - look for entries with tag matches - order by no. of matches
## If still not enough items in the menu adds the most recent entries from the same category
#if ($model.permalink)
## number to display (in menu)
#set ($dispFlag = 10)
## Cycles through 100 articles of the same category
#set ($cycleEntriesTotal = 100)
## Go through same category first - should get most hits
#set ($cycleEntries = $model.weblog.getRecentWeblogEntries($model.weblogEntry.category.name, $cycleEntriesTotal))
## Set up other variables
#set ($theseTags = $entry.tags)
#set ($sawEntries = [] )
#set ($sawCount = [] )
#foreach ($cycleEntry in $cycleEntries)
#if (($entry.title != $cycleEntry.title) && ($model.weblogEntry.locale == $cycleEntry.locale))
#set ($matchBun = 1)
#set ($bunTags = $cycleEntry.tags)
#foreach ($thisTag in $theseTags)
#foreach ($bunTag in $bunTags)
#if ($thisTag.name == $bunTag.name)
#set ($matchBun = $matchBun + 1)
#end
#end
#end
#if ($matchBun >= 2)
#set ($bunEntry = $cycleEntry)
#set ($entryCounter = 0)
#set ($bunFlag = 0)
#foreach ($sawNum in $sawCount)
#if (($matchBun > $sawNum) || (($bunFlag == 1) && ($matchBun == $sawNum)))
#if ($bunFlag == 0)
#set ($bunFlag = 1)
#end
#set ($bunEntry = $sawEntries.set($entryCounter, $bunEntry))
#set ($matchBun = $sawCount.set($entryCounter, $matchBun))
#end
#set ($entryCounter = $entryCounter + 1)
#end
#if ($entryCounter <= $dispFlag)
#if ($sawEntries.add($bunEntry))
#end
#if ($sawCount.add($matchBun))
#end
#end
#end
#end
#end
## Cycles through 20 articles of any category (ignoring the previous category)
#set ($cycleEntriesTotal = 20)
## Go through all categories next - should get fewer hits
#set ($cycleEntries = $model.weblog.getRecentWeblogEntries("", $cycleEntriesTotal))
## Reset other variables
#set ($theseTags = $entry.tags)
#foreach ($cycleEntry in $cycleEntries)
#if (($entry.title != $cycleEntry.title) && ($entry.category.name != $cycleEntry.category.name) && ($model.weblogEntry.locale == $cycleEntry.locale))
#set ($matchBun = 0)
#set ($bunTags = $cycleEntry.tags)
#foreach ($thisTag in $theseTags)
#foreach ($bunTag in $bunTags)
#if ($thisTag.name == $bunTag.name)
#set ($matchBun = $matchBun + 1)
#end
#end
#end
#if ($matchBun >= 1)
#set ($bunEntry = $cycleEntry)
#set ($entryCounter = 0)
#set ($bunFlag = 0)
#foreach ($sawNum in $sawCount)
#if (($matchBun > $sawNum) || (($bunFlag == 1) && ($matchBun == $sawNum)))
#if ($bunFlag == 0)
#set ($bunFlag = 1)
#end
#set ($bunEntry = $sawEntries.set($entryCounter, $bunEntry))
#set ($matchBun = $sawCount.set($entryCounter, $matchBun))
#end
#set ($entryCounter = $entryCounter + 1)
#end
#if ($entryCounter <= $dispFlag)
#if ($sawEntries.add($bunEntry))
#end
#if ($sawCount.add($matchBun))
#end
#end
#end
#end
#end
<ul class="rEntriesList">
## Set up count variables
#set ($noneFlag = 0)
#set ($entryCounter = 0)
## Output related entries
#foreach ($sawEntry in $sawEntries)
#if ($noneFlag < $dispFlag)
#set ($noneFlag = $noneFlag + 1)
#set ($matchBun = $sawCount.get($entryCounter))
<li class="recentposts"><a href="$url.entry($sawEntry.anchor)" title="relationships: $matchBun" name="$utils.encode($sawEntry.anchor)" id="$utils.encode($sawEntry.anchor)">» $sawEntry.title</a></li>
#end
#set ($entryCounter = $entryCounter + 1)
#end
## If still not enough items in the menu adds the most recent entries from the same category
#if ($noneFlag < $dispFlag)
## Cycle through the number to be displayed (plus one including the 'calling' entry itself)
#set ($cycleEntriesTotal = ($dispFlag + 1))
## Reset variables
#set ($cycleEntries = $model.weblog.getRecentWeblogEntries($model.weblogEntry.category.name, $cycleEntriesTotal))
## Cycle through the number to be displayed (plus one including the 'calling' entry itself)
#foreach ($cycleEntry in $cycleEntries)
#set ($addFlag = 0)
#foreach ($sawEntry in $sawEntries)
#if ($sawEntry.title == $cycleEntry.title)
#set ($addFlag = 1)
#end
#end
#if ($addFlag == 0)
#if ($noneFlag < $dispFlag)
#if (($entry.title != $cycleEntry.title) && ($model.weblogEntry.locale == $cycleEntry.locale))
#set ($noneFlag = $noneFlag + 1)
<li class="recentposts"><a href="$url.entry($cycleEntry.anchor)" title="relationships: 1" name="$utils.encode($cycleEntry.anchor)" id="$utils.encode($cycleEntry.anchor)">» $cycleEntry.title</a></li>
#end
#end
#end
#end
#end
</ul>
#end
After writing this article I got a mention from Dave Johnson, whose comments inspired me to change the code so that it only analyses the last 100 entries of the same category and then the last 20 entries of the other categories. Obviously you can change this by editing #set ($cycleEntriesTotal = 100) to whatever you feel is appropriate.
I’ve also removed all redundant variables, fixed a bug caused by variables not being initialised, and changed the sort code so that it keeps date-and-time-based precedence (so entries with the same number of matches remain in date and time sequence).
Rather than post another article, I’ve retconned the code example into this one: the code displayed is the newer, healthier version. The old code is still there, but hidden in a non-displayed “div” block at the end (just after this text, in fact).
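To see why ties stay in date and time order, here is a rough Python rendering of the insert-and-shift loop the template uses to keep its two parallel lists sorted. The function name `insert_ranked` is made up for this sketch. The swaps mirror what `$sawEntries.set(...)` and `$sawCount.set(...)` do in the template, since `set` on a Java list hands back the element it replaced.

```python
def insert_ranked(entries, counts, entry, matches, limit=10):
    """Insert `entry` into the parallel lists, kept sorted by descending count.

    Hypothetical helper mirroring the template's loop: a newcomer only
    displaces items with a strictly lower count, so equal counts keep
    the order in which they were inserted.
    """
    carried_entry, carried_count = entry, matches
    shifting = False
    for i, seen in enumerate(counts):
        if carried_count > seen or (shifting and carried_count == seen):
            shifting = True
            # Drop the carried item here and pick up the one it displaced.
            entries[i], carried_entry = carried_entry, entries[i]
            counts[i], carried_count = carried_count, counts[i]
    # Whatever is still carried goes on the end, if there is room.
    if len(counts) <= limit:
        entries.append(carried_entry)
        counts.append(carried_count)


if __name__ == "__main__":
    entries, counts = [], []
    # Fed newest first, as the template receives them.
    for title, matches in [("a", 2), ("b", 3), ("c", 2), ("d", 3)]:
        insert_ranked(entries, counts, title, matches)
    print(entries, counts)
```

Because entries arrive newest first and a tie never pushes an earlier arrival down, two entries with the same number of matches come out with the newer one listed first.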
Links for this article: