January 21, 2006

New local relevance algorithm on Topix city pages

by skrenta at 7:50 PM

We've been working hard to improve the relevance of our news channels, and this weekend deployed some algo changes to our local city and subject news pages. We're trying to promote bigger stories above the fold, rather than just chronologically sorting the news. The above-the-fold stories should now be a combination of recent, relevant stories. The goal is to have really good stories in the first few positions on the page.
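As a rough illustration of ranking by a combination of recency and relevance rather than pure chronology, here is a minimal sketch. It is not Topix's actual algorithm: the `Story` fields, the exponential half-life decay, and the 12-hour default are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    relevance: float   # hypothetical relevance score in [0, 1]
    age_hours: float   # hours since the story was published


def score(story: Story, half_life_hours: float = 12.0) -> float:
    """Combine relevance with a recency factor that halves every half-life.

    The multiplicative form and the half-life value are illustrative
    assumptions, not something described in the post.
    """
    recency = 0.5 ** (story.age_hours / half_life_hours)
    return story.relevance * recency


def rank(stories: list[Story], half_life_hours: float = 12.0) -> list[Story]:
    """Return stories ordered best-first by the combined score."""
    return sorted(stories, key=lambda s: score(s, half_life_hours), reverse=True)


stories = [
    Story("Minor council notice", relevance=0.2, age_hours=1),
    Story("Creek flooding, sandbags available", relevance=0.9, age_hours=6),
]
top = rank(stories)[0]
```

With these made-up numbers, the older but more relevant flooding story ranks first, whereas a purely chronological sort would lead with the council notice.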

There should also be fewer off-topic posts on our local pages. We've had a devil of a time, for example, with Silicon Valley tech business stories ending up on our Palo Alto page, since so many tech companies are located here. Technically we're getting the location of the subject of the stories right, but they're not local news. Local news is about sandbags to prevent San Francisquito Creek from coming in your front door, not Google's earnings. That's another channel. The same sort of thing happens in LA with celeb stories, DC with world news, NY with "wall street", etc. We can remove much (but not all, alas) of this off-topic material now.

This is an interim update, but I'm blogging it anyway. This update has been more about getting bad stories off of our pages (precision) than finding additional stories we might be missing (recall). We're going to work on precision first, then increase recall. It's not perfect; there are more things we know how to fix but haven't applied the programmer-time to yet. Overall, though, the quality should be a big improvement over a few weeks ago.
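For readers unfamiliar with the two terms, here is a small sketch of how precision and recall could be computed for a single city page. The story IDs and the idea of a hand-labeled set of "truly local" stories are hypothetical, used only to show the standard definitions.

```python
def precision_recall(shown: set[str], relevant: set[str]) -> tuple[float, float]:
    """Standard definitions over sets of story IDs.

    precision = fraction of shown stories that are actually relevant
    recall    = fraction of relevant stories that were actually shown
    """
    hits = shown & relevant
    precision = len(hits) / len(shown) if shown else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical page: four stories shown, three stories that are truly local.
shown = {"creek-sandbags", "school-board", "google-earnings", "chip-merger"}
relevant = {"creek-sandbags", "school-board", "library-hours"}

precision, recall = precision_recall(shown, relevant)
```

Removing the two off-topic business stories from `shown` would raise precision from 0.5 to 1.0 without changing recall; only adding the missing library story would raise recall.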

I'd be interested in any feedback on the quality... feel free to email me at rich-at-topix-dot-n-e-t if you have any comments.

