We achieved the best results, 95.5% correct assignment in a 5-fold cross-validation on our corpus, with Support Vector Regression on all token unigrams. Two other machine learning systems, Linguistic Profiling and TiMBL, come close to this result, at least when the input is first preprocessed with PCA.

Introduction

In the Netherlands, we have a rather unique resource in the form of the TwiNL data set: a daily updated collection that probably contains at least 30% of the Dutch public tweet production since 2011 (Tjong Kim Sang and van den Bosch 2013).

We also varied the recognition features provided to the techniques, using both character and token n-grams. For all techniques and features, we ran the same 5-fold cross-validation experiments in order to determine how well they could be used to distinguish between male and female authors of tweets. Then follow the results (Section 5), and Section 6 concludes the paper.

For whom we already know that they are an individual person rather than, say, a husband and wife couple or a board of editors for an official Twitter feed.

the identification of author traits like gender, age and geographical background. In this paper we restrict ourselves to gender recognition, and it is also this aspect we will discuss further in this section.
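To make the feature types and the evaluation protocol concrete, the following minimal sketch shows how character and token n-grams could be counted and how a corpus could be partitioned for 5-fold cross-validation. Whitespace tokenization, lowercasing, the round-robin fold assignment, and all function names are assumptions made for illustration; the text does not specify how tokenization or fold assignment was actually done.

```python
from collections import Counter


def token_ngrams(text, n):
    """Count token n-grams. Whitespace tokenization and lowercasing
    are assumptions of this sketch, not the procedure from the text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def char_ngrams(text, n):
    """Count character n-grams over the raw string, spaces included."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def k_fold_splits(items, k=5):
    """Yield (train, test) pairs; item i is assigned to fold i % k
    (a hypothetical assignment chosen for simplicity)."""
    for fold in range(k):
        test = [x for i, x in enumerate(items) if i % k == fold]
        train = [x for i, x in enumerate(items) if i % k != fold]
        yield train, test
```

Each author would appear in exactly one test fold, so every prediction is made by a model that has not seen that author during training.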
When using all user tweets, they reached an accuracy of 88.0%.
Another system that predicts the gender of Dutch Twitter users is TweetGenie, which one can provide with a Twitter user name, after which the gender and age are estimated based on the user's last 200 tweets.
The age component of the system is described in (Nguyen et al.). The authors apply logistic and linear regression on counts of token unigrams occurring at least 10 times in their corpus.
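The feature selection described here, keeping only token unigrams that occur at least 10 times in the corpus, can be sketched as follows. This is an illustrative reconstruction, not the cited authors' code: the whitespace tokenizer, the lowercasing, and the names `build_vocabulary` and `count_vector` are assumptions. The resulting count vectors would then serve as input to a regression model.

```python
from collections import Counter


def build_vocabulary(documents, min_count=10):
    """Return the sorted list of tokens whose total corpus frequency
    is at least min_count (whitespace tokenization is an assumption)."""
    totals = Counter()
    for doc in documents:
        totals.update(doc.lower().split())
    return sorted(tok for tok, c in totals.items() if c >= min_count)


def count_vector(document, vocabulary):
    """Map one document to its unigram counts over the vocabulary."""
    counts = Counter(document.lower().split())
    return [counts[tok] for tok in vocabulary]
```

Tokens below the threshold are simply dropped, so rare words do not contribute features at all.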
However, like any collection that is harvested automatically, its usability is reduced by a lack of reliable metadata.
In this case, the Twitter profiles of the authors are available, but these consist of freeform text rather than fixed information fields.