A Textual Network Dashboard Twitter data-driven
Rocco Mazza, Nicola Canestrari, Emma Zavarrone, Maria Gabriella Grassia, Marina Marino
The paper focuses on the components of the discourse about hate speech against women within the Italian Twitter community and proposes a Shiny dashboard to present our infographic results. Hate-speech-based communication is increasing with the massive production of user-generated content on social networks. The literature defines hate speech as content that disparages a person or a group on the basis of some characteristic such as ethnicity, gender, or sexual orientation.
Starting from texts collected from Twitter through the API using R, we aim to mine the extracted content and study the lexical structure that links the principal discussion topics. We then develop a Shiny application to transform the unstructured textual data into useful and manageable information and to present the results.
Our approach builds on five phases: (1) content extraction and corpus preprocessing; (2) a descriptive study of the texts based on the most frequent words; (3) application of the topic model to extract and identify the latent topics within the collected content; (4) improved interpretation of the meaning of the topics through network analysis tools, in order to better identify the neighbourhood of each topic; and, finally, (5) a study of the semantic structure that links the emerged themes, with the intent of understanding the semantic relationships between the extracted topics and the documents' terms. All of these methodological phases are included in the Shiny app as analytical tools.
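Phases (1) and (2) can be illustrated with a minimal sketch. It is written in Python with the standard library only, purely for illustration; it is not the R code behind the dashboard. The stopword list, the cleaning rules, the example tweets, and the function names `preprocess` and `top_terms` are all assumptions.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only; a real analysis would
# use a full list for the language of the corpus.
STOPWORDS = {"the", "a", "of", "and", "to", "is", "in", "are"}


def preprocess(text):
    """Lowercase a tweet, strip URLs and mentions, and return content tokens."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"@\w+", " ", text)           # drop user mentions
    tokens = re.findall(r"[a-zàèéìòù]+", text)  # keep alphabetic tokens only
    # Drop stopwords and very short tokens (fewer than 3 characters).
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]


def top_terms(docs, n=5):
    """Return the n most frequent terms across all documents with their counts."""
    counts = Counter()
    for doc in docs:
        counts.update(preprocess(doc))
    return counts.most_common(n)


# Invented example tweets, not taken from the collected corpus.
tweets = [
    "Violence against women is a crime https://example.org @user",
    "New law on violence against women approved",
    "Women report online hate and violence",
]

print(top_terms(tweets, n=2))
```

The frequency table produced this way is the kind of descriptive summary a dashboard could show before any topic model is fitted.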
Latent Dirichlet Allocation (LDA) is the method used both to extract the latent topics and to construct the terms × topics matrix. The main methodological challenge lies in the construction of a two-mode matrix of reduced order able to represent the network of topics and terms extracted from content collected with specific search keywords from July 2018 to May 2019. A beta version of the Shiny web app is already online at this link: https://rccmazza.shinyapps.io/Donne4/. From the descriptive analysis we can schematically trace three semantic dimensions: the first relates to news cases commented on by users on the social network; the second to an institutional and regulatory dimension; the third to the ways in which violence against women can be carried out.
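One possible way to obtain a reduced two-mode terms × topics matrix is to keep only the top-k terms of each topic and to use the retained probabilities as edge weights of a bipartite network. The sketch below assumes this reduction rule, which the abstract does not specify. The per-topic probabilities in `beta`, the topic labels, and the function names are hypothetical and stand in for the output of a fitted LDA model.

```python
# Hypothetical per-topic term probabilities, standing in for the output of a
# fitted LDA model; the values are invented for illustration.
beta = {
    "topic_1": {"news": 0.30, "case": 0.25, "women": 0.22, "violence": 0.18, "law": 0.05},
    "topic_2": {"law": 0.40, "decree": 0.30, "women": 0.15, "violence": 0.10, "news": 0.05},
}


def reduced_two_mode(beta, k=3):
    """Keep the k most probable terms per topic (an assumed reduction rule)
    and return the reduced terms x topics matrix plus a weighted edge list."""
    top = {
        topic: sorted(probs, key=probs.get, reverse=True)[:k]
        for topic, probs in beta.items()
    }
    topics = sorted(beta)
    terms = sorted({term for kept in top.values() for term in kept})
    # Rows are terms, columns are topics; a cell is non-zero only if the term
    # is among the top-k terms of that topic.
    matrix = [
        [beta[topic][term] if term in top[topic] else 0.0 for topic in topics]
        for term in terms
    ]
    # Bipartite edge list: (term, topic, weight).
    edges = [(term, topic, beta[topic][term]) for topic in topics for term in top[topic]]
    return terms, topics, matrix, edges


def bridging_terms(terms, matrix):
    """Terms attached to more than one topic, i.e. links between themes."""
    return [
        term
        for term, row in zip(terms, matrix)
        if sum(1 for weight in row if weight > 0) > 1
    ]


terms, topics, matrix, edges = reduced_two_mode(beta, k=3)
print(terms)
print(bridging_terms(terms, matrix))
```

Terms returned by `bridging_terms` are those shared by several topics, which is one simple way to look at the semantic structure linking the emerged themes.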