The aim of the project is to develop methods for extracting relevant features from documents for authorship analysis, using deep neural architectures that capture the lexical, syntactic, and semantic properties of texts.
The aim of the project is to collect dictionaries of slang words, contractions, abbreviations, and emoticons commonly used in social media. The dictionaries cover English, Spanish, Dutch, and Italian.
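As an illustration of how such dictionaries might be applied, the following minimal sketch expands slang and abbreviations in a short message. The `SLANG` entries and the `expand` function are hypothetical examples, not taken from the project's resources, and the project's actual file format is not assumed. Emoticons are not handled here, since the pattern only matches word characters.

```python
import re

# Hypothetical sample entries for illustration only; the real dictionaries
# and their storage format are not reproduced here.
SLANG = {"u": "you", "gr8": "great", "idk": "I do not know"}


def expand(text, lexicon):
    """Replace each word found in the lexicon (case-insensitive lookup)."""
    def repl(match):
        word = match.group(0)
        # Fall back to the original word when it is not in the lexicon.
        return lexicon.get(word.lower(), word)

    # \w+ matches runs of letters, digits, and underscores, so punctuation
    # and whitespace pass through unchanged.
    return re.sub(r"\w+", repl, text)


print(expand("idk, u are gr8!", SLANG))
```

A lookup per word keeps the sketch simple; multi-word expressions or emoticons would need a different matching strategy.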
The aim of the project is to compile news articles in Spanish from digital media sites and categorize them along three dimensions: variety of Spanish, author, and author's gender. The collection was carried out semi-automatically with a web crawler developed for this purpose.