January 17, 2007


Sophia Tu

Well-developed folksonomies contain valuable data on the standards of their community members. Notably, an HP study of del.icio.us (a social bookmarking service) showed that after roughly the first 100 users have tagged a website, the distribution of tags for that site stabilizes: the community has come to a consensus on what the site is about. Similar patterns probably hold for other folksonomies that attain critical mass, that is, ones that attract enough traffic to balance out outliers in their data.
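
To make the idea of a "stabilized" tag distribution concrete, here is a minimal Python sketch of one way it could be checked. It is not the method used in the HP study: the total variation distance, the `window` of 10 users, and the `threshold` of 0.05 are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from collections import Counter


def tag_distribution(counts):
    """Turn raw tag counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


def total_variation(p, q):
    """Total variation distance between two tag distributions (0 = identical, 1 = disjoint)."""
    tags = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in tags)


def users_until_stable(taggings, window=10, threshold=0.05):
    """Return how many users it took for the tag distribution to settle.

    taggings: one list of tags per user, in chronological order.
    The distribution counts as stable (an assumed criterion) once it has
    moved by less than `threshold` over the most recent `window` users.
    Returns None if that never happens.
    """
    counts = Counter()
    history = []
    for i, tags in enumerate(taggings, start=1):
        counts.update(tags)
        history.append(tag_distribution(counts))
        if i > window:
            drift = total_variation(history[-1], history[-1 - window])
            if drift < threshold:
                return i
    return None
```

Calling `users_until_stable` on the chronological tag lists for one bookmarked site would give a rough per-site estimate of when consensus was reached; a site with too few taggers simply returns `None`.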

Data from folksonomies could be valuable for improving filtering technology. Relying solely on the content provider to rate its own content raises issues of bias and individual interpretation of the rankings. Relying on a third-party organization is expensive, and no such organization can keep up with the massive amounts of data continuously being brought online. Drawing on folksonomy data, by contrast, is inexpensive and synthesizes many voices, which mitigates bias.

Most existing folksonomies, like del.icio.us, are predominantly content-based: their tags describe what an item is about. Collaborative ranking sites, like Digg, provide normative data, but they generally record only a single ranking per item. Folksonomies used for filtering would be more valuable if they recorded multiple normative rankings in different categories (e.g. degrees of violence, profanity, sexuality). The trick in designing such a folksonomy would be convincing potential users that they would personally benefit from participating.
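
As a rough illustration of what recording multiple normative rankings per item might look like, here is a small Python sketch. Everything in it is an assumption made for the example rather than a description of any existing service: the 0 to 4 scale, the `RatingStore` and `passes_filter` names, the use of the median to dampen outliers, and the minimum vote count before a consensus is reported.

```python
from collections import defaultdict
from statistics import median

# Categories taken from the examples in the text; the 0-4 scale is assumed.
CATEGORIES = ("violence", "profanity", "sexuality")
MAX_SCORE = 4


class RatingStore:
    """Hypothetical store keeping one rating per user, per item, per category."""

    def __init__(self, min_votes=3):
        self.min_votes = min_votes
        # item -> category -> user -> score
        self._votes = defaultdict(lambda: defaultdict(dict))

    def rate(self, user, item, category, score):
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        if not 0 <= score <= MAX_SCORE:
            raise ValueError(f"score must be between 0 and {MAX_SCORE}")
        # A repeat rating from the same user overwrites the earlier one.
        self._votes[item][category][user] = score

    def consensus(self, item):
        """Median score per category, or None where there are too few votes."""
        per_item = self._votes.get(item, {})
        result = {}
        for category in CATEGORIES:
            scores = list(per_item.get(category, {}).values())
            result[category] = median(scores) if len(scores) >= self.min_votes else None
        return result


def passes_filter(consensus, limits):
    """True only if every category is rated and within the given limit.

    Treating unrated categories as failing is a conservative assumption.
    """
    return all(
        consensus[c] is not None and consensus[c] <= limits.get(c, MAX_SCORE)
        for c in CATEGORIES
    )
```

A filter built this way could let each reader set a separate limit per category, which a single overall ranking cannot support.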

