Y'all who know me know I think this stuff is great. I love the "alternative writing systems" page.
My brother pointed me to this link, which came up in the linguists community later. The article describes Regina Barzilay's work at Cornell in text summarization. She was one of the people teaching tutorials at CLSP in Baltimore this summer, and she is competent and a good teacher.
My brother's comment:
[I] found this particularly interesting because of the learned bias of the software (taken by analyzing news media articles about the conflict in Palestine/Israel):
For example, the system learned incorrectly that "Palestinian suicide bomber" and "suicide bomber" were the same, and that "killing 20 people" is the same as "killing 20 Israelis", said Barzilay. These mistakes made by the system are "due to how reporters are reporting," she said. "In some sense... the teacher here is what the reporter writes," she said.
trombo2 also had the following comment:
Fascinating. I think that these observations are reminding us that we "learn" culture the same way--more or less--that we learn language. Things keep getting repeated in some elemental form, even though the details (exact vocabulary, specific rituals, menu items, etc.) may be different with each rendition. When we eventually perceive the elemental form, we have internalized the cultural message. That includes the biases, syntax, or whatever.
Barzilay is no fool, and this does lead to some interesting speculation about where (and how) one might begin to try to detect bias in media by some objective measure like a computational account. Of course, considering how successful Google-washing (http://www.google.com/search?q=miserable%20failure&btnI=I%27m+Feeling+Lucky) can be, we need to remember that many computational measures of relevance only work if they're not being gamed. But it leads to some interesting questions about how relevance and document selection work -- and it also touches one of the issues dear to me: media independence and media bias.
Perhaps, at the very least, a measure like this could be used to tell how closely all the media were parroting each other's stories, like the stable of parrots that currently inhabits the White House Press Corps. Maybe by the time I graduate, I'll be able to go work for FAIR and kill two birds with one stone -- write software for language and fight media hegemony.
A boy can dream, can't he?
The commander Forbin of janson, being at a repast with a celebrated Boileau, had undertaken to pun him upon her name:--"What name," told him, "carry you thither? Boileau: I would wish better to call me Drink wine." The poet was answered him in the same tune:--"And you, sir, what name have you choice? Janson: I should prefer to be named John-Meal. The meal don't is valuable better than the furfur?"
Didn't make sense? Well, don't worry. This is where it came from: an 1883 traveler's guide to English, written by two Portuguese translators who didn't speak English and apparently had only a Portuguese-French and a French-English translation dictionary to work with.