Selecting related tags with SQL

Published on and tagged with sql

Recently I had to implement a “related tags” feature, i.e. you want to get all tags of items tagged with a specific tag. For example, if you have posts tagged with “sql” then you want to get all other tags of those “sql” posts, like “mysql”, “performance”, whatever.

Here the SQL I came up with for the aforementioned example:

SELECT 
    Tag.name, 
    COUNT(*) as postcount
FROM 
    tags AS t 
    INNER JOIN posts_tags AS pt ON (t.id = pt.tag_id) 
    INNER JOIN posts AS p ON (pt.post_id = p.id) 
    INNER JOIN posts_tags AS pota ON (p.id = pota.post_id) 
    INNER JOIN tags AS Tag ON (pota.tag_id = Tag.id) 
WHERE 
    t.name = 'sql'  
    AND t.id <> Tag.id
GROUP BY 
    Tag.name';

I hope this SQL snippet is useful for some of you!

3 comments baked

  • speedmax

    I used pretty much exactly the same Query for my own project, then i have builded a TaggableBehavior for CakePHP to add tagging feature for any model. loosely based on DHH’s acts_as_taggable

    http://gist.github.com/57906

    then i have also added following methods, u may find it useful

    findRelatedTags(),
    findRelatedTagged(),
    findPopularTag()
    findTaggedWith(),
    findTagCount()

  • majna

    Decomposing can help on larger datasets,
    Use decomposing to avoid slow query.

    Example:
    SELECT * FROM tag
    JOIN tag_post ON tag_post.tag_id=tag.id
    JOIN post ON tag_post.post_id=post.id
    WHERE tag.tag=’mysql’;

    Decomposed:
    SELECT * FROM tag WHERE tag=’mysql’;
    SELECT * FROM tag_post WHERE tag_id=1234;
    SELECT * FROM post WHERE post.id in (123,456,567,9098,8904);

  • cakebaker

    @speedmax, majna: Thanks for your comments!

    @speedmax: Thanks for sharing your TaggableBehavior! Your findRelatedTags() seems to be cleaner than the SQL I’m currently working on to retrieve the related tags when you provide multiple tags.

    @majna: Thanks for this tip!

Bake a comment




(for code please use <code>...</code> [no escaping necessary])

© daniel hofstetter. Licensed under a Creative Commons License