April 13th, 2009


regarding the amazon "glitch"

There seems to be a debate running about whether to attribute the sudden de-ranking of a huge number of LGBT books from Amazon (known as "#amazonfail") to malice or a dumb algorithm at Amazon.

ETA: this post contains wild speculation, some of which turns out to be wrong. See my following post about what we can learn despite my wrong guess.

Wikipedia's page on Hanlon's razor suggests that the original adage may have been a Heinlein quote (hah, now there's a hotbed of homophobia) with a form something like this:

Never attribute to malice that which can be adequately explained by stupidity, but don't rule out malice.

My take on this is to lean in the direction of malice -- some fairly well-organized darknet of Amazon rankers who raised objectionable content flags on huge numbers of queer and feminist-themed books, aided by stupidity, in that Amazon's own data-oriented review/moderation techniques are vulnerable to being swung by collaborating (or sock-puppeted) vocal single-issue factions. It might have been books about abortion, but for the relative lack thereof.

The Malice of Neighborhoods
When I say "fairly well-organized darknet" I don't mean some malicious crowd of hackers. I mean a mailing list of pissed-off, angry white sexist homophobes, defending their turf: shopping. They are throwing their own sort of Tea Party and fighting what they believe to be a culture war over the territory that matters to them -- in this case, the territory is commercial space.

In particular, I suspect that this "darknet" minority (and this is the word we should use; the outraged white sexist homophobe is, thank god, a minority in this country [0]) is feeling particularly hemmed-in, with the recent election of a black man as American President and several public affirmations of queer rights (Iowa, DC, Vermont), not to mention outrage and anger around Prop. 8. Their organization may not even be deliberate, but they are acting out in the way they feel they own -- they're going shopping, and they're going to keep Those People out of My Space; I mean Do They Really Have To Be So Public About It?

And they have done their organizing in relative privacy, but are acting in a relatively concerted way, using the "flag inappropriate comment" feature. They are agitating to have Queers Not Welcome, the same way they might organize among the PTA and neighborhood associations to have a Good Vibrations storefront run out of the local mall by filing every single piece of irritating paperwork, double-checking their tax records, asking mall security to "keep an eye on them, please", etc.

The Stupidity of Crowds
One reason that the Amazon business has scaled up as far as it has is that they are able to treat the purchase and browsing history of millions as clouds in massive data-aggregation [1]. This means treating many, many users as datapoints. I suspect that the stupidity on Amazon's part was two-part:

  1. deciding that the cost of presenting an "objectionable" book is extremely high
  2. trusting the count of "objectionable" reports to be relatively unbiased, rather than swung by a co-ordinated minority
My best guess is that Amazon did both of these. It's also possible, of course, that they put a new keyword list in [2], but this is far less likely than a weighting error (#1 above) or a borked independence assumption (#2).
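To make the two guesses concrete, here is a minimal sketch in Python. Everything in it is invented for illustration (the catalog, the user counts, the threshold, the function name `hidden_books`); it is not a description of how Amazon's moderation actually works. It only shows that a raw report count with a low threshold can be swung by a group that is well under 1% of the people reporting.

```python
from collections import Counter

def hidden_books(reports, threshold):
    """Return the set of books whose raw report count reaches `threshold`.

    `reports` is an iterable of (user_id, book_id) pairs. Every report
    counts equally, i.e. reports are treated as independent of one another.
    """
    counts = Counter(book for _user, book in reports)
    return {book for book, n in counts.items() if n >= threshold}

# Hypothetical catalog and users, made up for this sketch.
catalog = [f"book-{i}" for i in range(1000)]
targets = catalog[:20]  # the titles a coordinated group dislikes

# 5,000 ordinary users each flag one book, spread evenly over the catalog,
# so every book collects 5 reports.
ordinary = [(f"user-{u}", catalog[u % len(catalog)]) for u in range(5000)]

# 30 coordinated users each flag every target title.
coordinated = [(f"ring-{u}", book) for u in range(30) for book in targets]

reports = ordinary + coordinated

# Guess #1: showing an "objectionable" book is treated as very costly,
# so the threshold for hiding a book is set low.
# Guess #2: the count is trusted as unbiased, so nobody checks who reported.
print(sorted(hidden_books(ordinary, threshold=25)))         # background noise alone
print(hidden_books(reports, threshold=25) == set(targets))  # with the ring added
```

In this toy setup the ordinary reports never reach the threshold by themselves, while 30 accounts out of 5,030 are enough to hide exactly the targeted titles.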

All this is not to defend Amazon -- they screwed up, whether by malice or stupidity, and as yet they have not come clean on what happened or why. But if I'm correct, Amazon doesn't yet know why this happened: it's the interaction of a change in models and (possibly) an unexpectedly unified minority challenging Amazon's data-mining models.

[0] After some discussion in the comments, I want to clarify: pointing out that the 'darknet get-out-of-my-mall ban trolls' are a numerical minority is meant not to take the heat off the rest of us. We still hang out in this mall [Amazon, and other online commerce] and if mall policy makes it easy to be homophobic, homophobia will stay there. I am not asking anybody to calm down. I would like those of us thinking about homophobia, about sexism, about racism to (please!) rattle our cages when we see injustice -- even if the injustice is in our favor. It is just as bad as you have been led to believe, even if it was an accident. Our culture usually not only overlooks homophobia, but perpetuates it -- straight couples can hold hands anywhere, but gay couples have to look over their shoulder in most places. Amazon's failure this weekend -- even if it's a keyword list screwup, as Daisey's link suggests in note 2 -- shows just how close we are, culturally and algorithmically, to declaring gayness -- and sexuality -- something we must be protected from.

[1] a tiny point-and-laugh, for the other NLP/datamining readers: some computer people don't get statistical models at all.

[2] a new keyword list is exactly what Mike Daisey reports, so I may be wrong. The question still stands, of course, about how to cope with the malice and stupidity described above. Stupidity we can work with. Malice is harder.

oo, i'm not the only person to think this:


i was wrong, but what can the ontology bleed tell us?

My speculations in the previous post about the Amazon adult-categorization fail were incorrect. (But then, they were just that -- speculations. Hope nobody bet long on them.)

In fact, it was an ontology bleed failure, where somebody "accidentally" lumped together the terms "sexuality", "gay", "erotic" and probably "gender" with "porn" and "adult".
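For readers who haven't met the term, here is a rough Python sketch of what an ontology bleed can look like. The data layout (a child-to-parent dictionary), the function names and the sample book are all assumptions made up for illustration; Amazon's real category system is not public and surely looks different. The point is only that one bad edit to the category tree silently changes the fate of every book filed under the affected terms.

```python
def ancestors(category, parent_of):
    """Return `category` plus every category above it in the hierarchy."""
    chain = set()
    while category is not None and category not in chain:
        chain.add(category)
        category = parent_of.get(category)
    return chain

def is_ranked(book_categories, parent_of, excluded="adult"):
    """A book keeps its sales rank unless one of its categories falls under `excluded`."""
    return all(excluded not in ancestors(c, parent_of) for c in book_categories)

# Hypothetical intended ontology: only "porn" sits under "adult".
intended = {"porn": "adult"}

# The same ontology after the bleed: unrelated terms now share that parent.
bled = dict(intended)
bled.update({term: "adult" for term in ("sexuality", "gay", "erotic", "gender")})

# A made-up book record; nothing about the book itself changes.
memoir = ["gay", "memoir"]

print(is_ranked(memoir, intended))  # ranked under the intended ontology
print(is_ranked(memoir, bled))      # de-ranked once the categories bleed together
```

No filter code had to change for the memoir to disappear from the rankings; the only edit was to which terms count as neighbors of "adult".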

So yes -- not overt homophobic censorship on Amazon's part. But also no darknet, as I had speculated. Or none that Amazon will cop to, which is as good as the same for now.

However, I stand by the major concern: we are not, as a culture of readers and internet-users, very far from a cascaded door-slamming of homophobia: it's not an accident that this particular ontology bleed happened. It's not like somebody went into the database and accidentally decided that "entomology" belongs in "cooking"; that would have provoked (at most) a single boingboing article about the tastiness of fried locust and a global giggle about the chuckleheaded algorithm that got us that one.

No, this ontology bleed happened because a lot of people (including at least one ontology editor and the QA team standing between him [or her] and the door) find it reasonable to lump "gay" together with "porn". Sexuality, and especially gay sexuality, as I mentioned in the earlier post, is -- for these programmers, and for much of our culture -- something to be protected from, something to be quarantined and hidden from "normal" people, where "normal" is understood to be read as "straight", without even needing a wink and a nod.

Amazon has little to be proud of here -- they screwed up in letting the ontology bleed happen in the first place, they bobbled the response until late Monday to a "twitstorm" that began on Saturday, and AFAIK there has been no actual apology (the beginning of an explanation, yes, but no apology). As Daisey says: heads will probably not roll; "like any behemoth, there's little accountability outside the bubble."

For me, "#amazonfail" has been a reminder to take my head out of the narrowcast of my own friends-list and feed-reader, and to remember that I actually live in a culture that remains quite hostile to queer folk. My speculation about some co-ordinated action from this hostility was wrong -- but a useful reminder that co-ordinated action is not necessary when straight privilege blindly gives the same outcomes. To repeat myself from a comment on the previous post: "it's easier to spit on people when you're standing on the top of the hill."