Tuesday, May 5, 2009

PCA

Some time ago I read a very interesting article about a competition run by netflix.com (read more), where the point is to write a program that can predict users' opinions of a new movie based on previous judgements of other movies.

The authors of the article noted that no matter which method was used for prediction, none of the competitors could get closer than 90%-95% (I think) accuracy in their estimates. The really interesting thing was that although very sophisticated mathematical methods yielded better results, they were only about 2%-3% better than simple methods.

Obviously this means you can spend a lot of time getting slightly better, but you should consider whether it is worth it. It might be if you want storm predictions or something like that. But for a "simple" social entertainment site, 90% is probably going to be as good for a start as 95%.

So I am opting for a "simple" method of deciding whether content is relevant for the user. More on this later - gotta feed myself :)
