Personal Rights versus Public Decency: what’s an archive to do?

by pete on November 26, 2008

I can’t get over the number of things that come up that I did not anticipate. Background: At iterasi we build a Web archive, which is a place to capture, save and, optionally, search for and share Web pages. So unlike a bookmark, which returns the current view of the Web page when clicked, iterasi saves the page in the exact form it had when you saw it. But you probably knew that.

So now the unforeseen issue du jour. One of the goals of our work here is to let people archive whatever they find interesting. We do not pretend to be the judges of what is interesting to users and I seriously doubt they care what we think in this regard. So the right of choice is paramount to our belief system, as it appears to be for most enlightened humans and for the Internet in general.

Sometimes it gets dicey, particularly with material that may be offensive in nature, such as porn. Here is the rub: on our homepage we show ‘What the iterasi community is archiving…’ followed by a set of thumbnails showing the most recent pages saved by users. This is a nice visualization of what we do and pretty popular. Since we launched our search site, a few folks have said ‘nice site but do you think it’s a good idea to show porn on your homepage?’

Oh great. Ruined my post-launch afterglow buzz completely. Now another problem: Unfortunately our Search capability is very powerful. So users searching our public site start to trip across material they find offensive.

So we have to do something to protect users from seeing offensive material while protecting the rights of those who wish to archive material for their personal use. Luckily, we happen to have the plumbing in place. All we have to do to keep material from ever appearing on our homepage or in search results is to mark it as Private. This should be easy, as there are a fair number of public domain lists around to help us filter out offensive material. We go to one of the large porn/blacklist aggregators and download their list of bad sites. So the task at hand sounds simple enough: just build a database query that checks new pages against the blacklist and marks those that match as Private.
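For the curious, the blacklist check might look roughly like the sketch below. This is not our actual code: we do this with a database query, and the in-memory version here only illustrates the matching logic. The names `BLACKLIST`, `is_blacklisted` and `mark_private`, the `visibility` field and the example domains are all made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist entries; a real list would be loaded from the
# file downloaded from the aggregator.
BLACKLIST = {"badsite.example", "worse.example"}


def host_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_blacklisted(url, blacklist=BLACKLIST):
    """True if the URL's host, or any parent domain of it, is on the list."""
    parts = host_of(url).split(".")
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))


def mark_private(pages, blacklist=BLACKLIST):
    """Set visibility to 'Private' on every page whose site is blacklisted."""
    for page in pages:
        if is_blacklisted(page["url"], blacklist):
            page["visibility"] = "Private"
    return pages


pages = [
    {"url": "http://www.badsite.example/gallery", "visibility": "Public"},
    {"url": "https://news.example/story", "visibility": "Public"},
]
mark_private(pages)
```

Note that this matches on the site, not the page, so one entry on the list hides everything ever archived from that domain.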

Here’s where things go to hell. The lists we have seem to include any site that ever had anything approaching bad material. That includes YouTube, blip.tv, popular game sites, any photo or art website, etc. I looked up one URL I didn’t recognize and it got me to a National Geographic video on an African Water Frog.

By now I want to scream. I start out not wanting to censor anything and then have to admit I have to. Now I’ve become a censor, someone Urban Dictionary describes as ‘…a moron who decides what is appropriate to list or not.’ At this point we hand edit the list and try to use some intelligence in determining what sites are porn and what sites may occasionally show things like naked frogs.
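One way to picture the hand editing: keep the downloaded list as it is and subtract a second, hand-reviewed list of sites we have decided are fine. The sketch below assumes that approach; it is an illustration rather than our implementation, and the names `RAW_BLACKLIST`, `REVIEWED_OK`, `effective_blacklist` and `visibility_for`, plus the `adult-site.example` domain, are invented for the example.

```python
from urllib.parse import urlsplit

# Hypothetical downloaded list: it lumps a general video site in with real porn.
RAW_BLACKLIST = {"youtube.com", "adult-site.example"}

# Hypothetical hand-reviewed exceptions: sites a human looked at and cleared.
REVIEWED_OK = {"youtube.com"}


def effective_blacklist(raw, reviewed_ok):
    """Drop every hand-cleared site from the downloaded list."""
    return raw - reviewed_ok


def visibility_for(url, blacklist):
    """Return 'Private' if the URL's host is blacklisted, else 'Public'."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return "Private" if host in blacklist else "Public"


blacklist = effective_blacklist(RAW_BLACKLIST, REVIEWED_OK)
```

Keeping the exceptions in their own list means a fresh download of the aggregator's list does not undo the manual review.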

So for you iterasi users who mysteriously find some of your pages marked as ‘Private’ and you can’t figure out why… it’s our fault. If you feel they were unjustly set to Private and want them Public again please let us know. We’ll get better at this over time but right now this is what we have to do.
