May 26th, 2004


How to tell gender from conversation

A lab colleague of mine has got his hands on a new batch of transcribed conversations. He writes:

I ran a standard feature selection algorithm for gender detection. Here are the top 3 features for females and for males. The last two columns give the number of occurrences of the word in female and male conversation sides.

word      female    male
husband   4523.0    248.0
dear       186.0     37.0
babies     352.0     68.0

dude        34.0    216.0
fuck        13.0     80.0
fucking     34.0    189.0

As another lab-mate said: "Oh, I get it. Action -- consequence!"
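
For the curious, here is a rough sketch of how counts like these can be turned into a ranking. The post does not say which feature selection algorithm was used, so the smoothed log-ratio below is only an assumed stand-in, and the names log_ratio and smoothing are made up for illustration. It also ignores how many words each gender spoke in total, since those totals are not given.

```python
import math

# Counts copied from the post: word -> (female sides, male sides).
counts = {
    "husband": (4523, 248),
    "dear": (186, 37),
    "babies": (352, 68),
    "dude": (34, 216),
    "fuck": (13, 80),
    "fucking": (34, 189),
}


def log_ratio(female, male, smoothing=1.0):
    """Smoothed log of the female/male count ratio (hypothetical scoring rule).

    Positive scores lean female, negative scores lean male. The smoothing
    term keeps the ratio finite if a word never occurs on one side.
    """
    return math.log((female + smoothing) / (male + smoothing))


# Score every word, then sort from most female-leaning to most male-leaning.
scores = {word: log_ratio(f, m) for word, (f, m) in counts.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for word in ranked:
    print(f"{word:10s} {scores[word]:+.2f}")
```

Words at the top of the printed list lean female and words at the bottom lean male. A real feature selector would normally also account for the total size of each side, for example with a chi-square or mutual-information score.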
