December 7th, 2003


Writing systems - website

This website provides a guide to over 160 different alphabets, syllabaries and other writing systems. It also contains details of many of the languages written with those writing systems and links to a wide range of language-related resources, such as fonts, online dictionaries and online language courses.

Y'all who know me know I think this stuff is great. I love the "alternative writing systems" page.
  • Current Music
    Rahzel - Steal my Soul

Text summarization and media bias

My brother pointed me to this link, which came up in the linguists community later. The article describes Regina Barzilay's work at Cornell on text summarization. She was one of the people teaching tutorials at CLSP in Baltimore this summer, and she is competent and a good teacher.

My brother's comment:

[I] found this particularly interesting because of the learned bias of the software (taken by analyzing news media articles about the conflict in palestine/israel):
For example, the system learned incorrectly that "Palestinian suicide bomber" and "suicide bomber" were the same, and that "killing 20 people" is the same as "killing 20 Israelis", said Barzilay. These mistakes made by the system are "due to how reporters are reporting," she said. "In some sense... the teacher here is what the reporter writes," she said.
trombo2 also had the following comment:
Fascinating. I think that these observations are reminding us that we "learn" culture the same way--more or less--that we learn language. Things keep getting repeated in some elemental form, even though the details (exact vocabulary, specific rituals, menu items, etc.) may be different with each rendition. When we eventually perceive the elemental form, we have internalized the cultural message. That includes the biases, syntax, or whatever.

Barzilay is no fool, and this does lead to some interesting speculation about where (and how) one might begin to try to detect bias in media by some objective measure like a computational account. Of course, considering how successful Google-washing can be, we need to remember that many computational measures of relevance only work if they're not being gamed. But it leads to some interesting questions about how relevance and document selection work -- and it also touches one of the issues dear to me: media independence and media bias.

Perhaps, at the very least, a measure like this could be used to tell how closely all the media were parroting each other's stories, like the stable of parrots that currently inhabits the White House Press Corps. Maybe by the time I graduate, I'll be able to go work for FAIR and kill two birds with one stone -- write software for language and fight media hegemony.

A boy can dream, can't he?


English as She is Spoke

The commander Forbin of janson, being at a repast with a celebrated Boileau, had undertaken to pun him upon her name:--"What name," told him, "carry you thither? Boileau: I would wish better to call me Drink wine." The poet was answered him in the same tune:--"And you, sir, what name have you choice? Janson: I should prefer to be named John-Meal. The meal don't is valuable better than the furfur?"
Didn't make sense? Well, don't worry. This is where it came from: an 1883 traveler's guide to English, written by two Portuguese translators who didn't speak English and apparently had only a Portuguese-French and a French-English translation dictionary to work with.

The Village Voice has an old review of this book. I find it absolutely hilarious. The funniest part, to me, is that state-of-the-art machine translation does little better!
  • Current Mood
    highly amused

Ad Aware rocks.

exterra's dad is in town; his computer's been having "lots of popups -- do you think you could look at it?"

So I did. His computer turned out to have the graps worm, the ignconnect mal-ware, the mal-ware, and the msiefr40.dll mal-ware. Possibly more.

I installed and ran Ad-aware, and it got rid of almost all the spyware. I also added the Google toolbar; the computer is now a lot easier to use. There may still be some lurking bit -- I'm going to recommend he download and pay for the expanded Ad-Aware -- but it's definitely a lot better.
  • Current Music