Searching for subreddits
10 May 2015
reddit announced last week that they’re bringing back the reddit.com beta testing program, and one of the interesting new features is the improved subreddit search. As the admins themselves admit, searching for subreddits has always been a major pain point, and the new search vastly improves the quality of results. I had been working on a subreddit search feature for SNOOPSNOO as well, and right now seems like a good time to release it!
Let’s compare reddit’s old and new search algorithms — searching for “robots” using the old search gives us results with /r/DaftPunk and /r/plotholes pretty high up in the list, presumably because both these subreddits include the word “robots” in their descriptions. The new search for “robots” returns results that are a lot more relevant — /r/DaftPunk and /r/plotholes still appear, but they are preceded by subreddits that are actually about robots. Great!
Now, how can SNOOPSNOO improve search results? One advantage it has is that it knows /r/DaftPunk is about music and /r/plotholes is about movies, thanks to the categorization of subreddits that I had written about earlier. And this comes in handy when searching for subreddits.
Let’s search for “robots” on SNOOPSNOO. Thanks to stemming, the results also include matches for “robotics”. Yay! Of course, not-entirely-relevant subreddits such as /r/DaftPunk, /r/plotholes and /r/robotchicken are included (and they should be, because they do have the word “robots” in them somewhere), but the search lets you restrict your results to particular topics: search for “robots topic:technology”, and only subreddits that are classified under Technology are returned.
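To make the idea concrete, here is a minimal sketch of stemming plus a topic restriction. It is not SNOOPSNOO's actual code: the two-suffix stemmer is a toy stand-in for a real stemming algorithm, and the `SUBREDDITS` records, their descriptions, and the `search` function are made up for illustration.

```python
import re

# Toy suffix list, for illustration only; a real stemmer (e.g. Porter) has many more rules.
SUFFIXES = ("ics", "s")

def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stems(text):
    """Return the set of stemmed words in a piece of text."""
    return {stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())}

# Hypothetical records; the descriptions and topics are invented for this example.
SUBREDDITS = [
    {"name": "robotics", "description": "All about robotics", "topics": ["Technology"]},
    {"name": "DaftPunk", "description": "Robots making music", "topics": ["Music"]},
]

def search(query, topic=None):
    """Return names of subreddits whose text contains every stemmed query word."""
    wanted = stems(query)
    results = []
    for sub in SUBREDDITS:
        text = sub["name"] + " " + sub["description"]
        if not wanted <= stems(text):
            continue
        # Optional topic restriction, compared case-insensitively.
        if topic and topic.lower() not in [t.lower() for t in sub["topics"]]:
            continue
        results.append(sub["name"])
    return results
```

With this sample data, `search("robots")` matches both records because “robots” and “robotics” reduce to the same stem, while `search("robots", topic="technology")` keeps only the one tagged Technology.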
Let’s look at a few more interesting examples:
- Searching for “python” returns /r/pygame even though the word “python” does not appear in the subreddit’s title or description, because we know that it is classified under the Python topic. The search results also include /r/montypython and /r/ballpython, but add “topic:programming” to your search query and they’re gone. On the other hand, if you’re really looking for subreddits about the other python, simply search for “python topic:animals”.
- Searching for “universities in texas” returns /r/aggies, because we know that the subreddit is about “Universities and Colleges” and because “Texas” appears in its title/description.
- Searching for “india” returns /r/mumbai and /r/bangalore (although only on page 2), even though they don’t have the word “India” in their title or description.
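One way to get results like these is to treat a subreddit's topic labels as searchable text alongside its description. The sketch below shows that idea only; it is an assumption about the approach rather than the real implementation, and the stopword list, the sample descriptions, and the `matches` helper are all hypothetical.

```python
import re

# Small illustrative stopword list (an assumption, not the real one).
STOPWORDS = {"a", "and", "in", "of", "the"}

def tokens(text):
    """Lowercased words in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def matches(query, sub, use_topics=True):
    """True if every query word appears in the description or, optionally, the topics."""
    searchable = tokens(sub["description"])
    if use_topics:
        searchable |= tokens(" ".join(sub["topics"]))
    return tokens(query) <= searchable

# Hypothetical records; the descriptions are invented for this example.
pygame = {
    "description": "A set of modules for writing video games",
    "topics": ["Programming", "Python"],
}
aggies = {
    "description": "Texas A&M University",
    "topics": ["Universities and Colleges"],
}
```

Here `matches("python", pygame)` succeeds only because of the topic label, and `matches("universities in texas", aggies)` succeeds because one word comes from the topic and the other from the description.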
The search also supports a small number of filters and operators that I hope you find useful:
- “cats subscribers<5000” returns subreddits about cats that have fewer than 5000 subscribers, for when you are purposely looking for smaller subreddits.
- “music created>2013-05-10” returns subreddits about music that were created after 10 May 2013, that is, within the past two years as of this post.
- “hardcore over18:false” excludes 18+ subreddits from the results. Use “over18:true” if you want only 18+ subreddits returned — the search does not judge.
- Common search operators:
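For the curious, filters like the ones above can be pulled out of a query string with a small parser. This is a rough sketch under my own assumptions, not the code running on the site: the `parse_query` and `passes` names, the record fields, and the choice to treat “:” as an equality test are all illustrative.

```python
import operator
import re
from datetime import date

# Recognised filter fields followed by one of the operators ":", "<" or ">".
FILTER_RE = re.compile(r"^(topic|subscribers|created|over18)([:<>])(.+)$")
OPS = {"<": operator.lt, ">": operator.gt, ":": operator.eq}

def parse_query(query):
    """Split a query into plain search terms and (field, op, raw_value) filters."""
    terms, filters = [], []
    for token in query.split():
        m = FILTER_RE.match(token)
        if m:
            filters.append(m.groups())
        else:
            terms.append(token)
    return terms, filters

def passes(sub, filters):
    """Check a subreddit record (a dict) against every parsed filter."""
    for field, op, raw in filters:
        if field == "topic":
            if raw.lower() not in [t.lower() for t in sub["topics"]]:
                return False
            continue
        if field == "subscribers":
            value = int(raw)
        elif field == "created":
            value = date.fromisoformat(raw)  # expects YYYY-MM-DD
        else:  # over18
            value = raw.lower() == "true"
        if not OPS[op](sub[field], value):
            return False
    return True

# Hypothetical record for illustration.
small_cats = {
    "topics": ["Animals"],
    "subscribers": 1200,
    "created": date(2014, 1, 15),
    "over18": False,
}
```

For example, `parse_query("cats subscribers<5000")` separates the term “cats” from the subscriber filter, and `passes(small_cats, ...)` then checks the record against whatever filters were found.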
It’s exciting to release this new feature, but it does have its limitations — it only searches subreddit metadata, not content in posts. The index is also currently limited to the 30K subreddits that I have data for (UPDATE 05/13: The index has now been updated to include over 800K subreddits, thanks /u/GoldenSights!) but I’m working hard on adding more and more subreddits.
Thanks for reading, and I hope you enjoy the new search feature. Feedback and bug reports are welcome!