dot Stop testing, start deploying your AI apps. See how with MIT Technology Review’s latest research.

Download now

1.3.3 Grouping articles

back to home

1.3.3 Grouping articles

To offer groups requires two steps. The first step is to add information about which articles are in which groups, and the second is to actually fetch articles from a group. We’ll use a SET for each group, which stores the article IDs of all articles in that group. In listing 1.9, we see a function that allows us to add and remove articles from groups.

Listing 1.9
The add_remove_groups() function
def add_remove_groups(conn, article_id, to_add=[], to_remove=[]):
	article = 'article:' + article_id

Construct the article information like we did in post_article.

	for group in to_add:
		conn.sadd('group:' + group, article)

Add the article to groups that it should be a part of.

	for group in to_remove:
		conn.srem('group:' + group, article)

Remove the article from groups that it should be removed from.


At first glance, these SETs with article information may not seem that useful. So far, you’ve only seen the ability to check whether a SET has an item. But Redis has the capability to perform operations involving multiple SETs, and in some cases, Redis can perform operations between SETs and ZSETs.

When we’re browsing a specific group, we want to be able to see the scores of all of the articles in that group. Or, really, we want them to be in a ZSET so that we can have the scores already sorted and ready for paging over. Redis has a command called ZINTERSTORE, which, when provided with SETs and ZSETs, will find those entries that are in all of the SETs and ZSETs, combining their scores in a few different ways (items in SETs are considered to have scores equal to 1). In our case, we want the maximum score from each item (which will be either the article score or when the article was posted, depending on the sorting option chosen).

Figure 1.12 The newly created ZSET, score:programming, is an intersection of the SET and ZSET. Intersection will only keep members from SETs/ZSETs when the members exist in all of the input SETs/ ZSETs. When intersecting SETs and ZSETs, SETs act as though they have a score of 1, so when intersecting with an aggregate of MAX, we’re only using the scores from the score: input ZSET, because they’re all greater than 1.

To visualize what is going on, let’s look at figure 1.12. This figure shows an example ZINTERSTORE operation on a small group of articles stored as a SET with the much larger (but not completely shown) ZSET of scored articles. Notice how only those articles that are in both the SET and the ZSET make it into the result ZSET?

To calculate the scores of all of the items in a group, we only need to make a ZINTERSTORE call with the group and the scored or recent ZSETs. Because a group may be large, it may take some time to calculate, so we’ll keep the ZSET around for 60 seconds to reduce the amount of work that Redis is doing. If we’re careful (and we are), we can even use our existing get_articles() function to handle pagination and article data fetching so we don’t need to rewrite it. We can see the function for fetching a page of articles from a group in listing 1.10.

Listing 1.10
The get_group_articles() function
def get_group_articles(conn, group, page, order='score:'):
	key = order + group

Create a key for each group and each sort order.

	if not conn.exists(key):

If we haven’t sorted these articles recently, we should sort them.

		conn.zinterstore(key,
			['group:' + group, order],
			aggregate='max',

Actually sort the articles in the group based on score or recency.

		)
		conn.expire(key, 60)

Tell Redis to automatically expire the ZSET in 60 seconds.

	return get_articles(conn, page, key)

Call our earlier get_articles() function to handle pagination and article data fetching.


On some sites, articles are typically only in one or two groups at most (“all articles” and whatever group best matches the article). In that situation, it would make more sense to keep the group that the article is in as part of the article’s HASH, and add one more ZINCRBY call to the end of our article_vote() function. But in our case, we chose to allow articles to be a part of multiple groups at the same time (maybe a picture can be both cute and funny), so to update scores for articles in multiple groups, we’d need to increment all of those groups at the same time. For an article in many groups, that could be expensive, so we instead occasionally perform an intersection. How we choose to offer flexibility or limitations can change how we store and update our data in any database, and Redis is no exception.

Exercise: Down-voting

In our example, we only counted people who voted positively for an article. But on many sites, negative votes can offer useful feedback to everyone. Can you think of a way of adding down-voting support to article_vote() and post_article()? If possible, try to allow users to switch their votes. Hint: if you’re stuck on vote switching, check out SMOVE, which I introduce briefly in chapter 3.

Now that we can get articles, post articles, vote on articles, and even have the ability to group articles, we’ve built a back end for surfacing popular links or articles. Congratulations on getting this far! If you had any difficulty in following along, understanding the examples, or getting the solutions to work, keep reading to find out where you can get help.