Description:
The millennial generation plays a leading role in today's connected world, where numerous technologies and the internet converge across science, the economy and innovation. This study aimed to identify the key factors that characterize the millennial generation in online chatter on Twitter. To this end, we analyzed user-generated content (UGC) on the social networking site (SNS) Twitter using a three-step, knowledge-based method for information management. The data were collected by extracting tweets with the hashtags #Millennial, #Millennials and #MillennialGeneration (n = 35,401 tweets). First, we used latent Dirichlet allocation (LDA), a state-of-the-art topic-modeling technique implemented in Python, to identify topics in the dataset. Second, we applied sentiment analysis using a support vector machine (SVM), a machine-learning algorithm, to the LDA results, which classified the topics as carrying negative, positive or neutral sentiment. Third, text-mining techniques were used to obtain the final results. The negative factors that characterize the behavior of this generation are depression, loneliness and real-world relationships. The positive factors are body image, self-expression, travelers and digital life, and the neutral factors are self-identity and anxiety. The practical implications can be used by public actors, companies or policy makers that target the millennial generation. The study also has theoretical applications, as the topics discovered can be used to test quantitative models based on the findings and insights extracted from the UGC sample.
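As a rough illustration of the data-collection and text-mining steps described above, the following minimal Python sketch filters tweets by the three hashtags and counts the remaining terms. It is not the study's implementation: the function names (`has_target_hashtag`, `term_counts`), the case-insensitive matching, the tiny stop-word list and the sample tweets are all assumptions made for illustration, and the LDA and SVM steps are not shown.

```python
import re
from collections import Counter

# Hashtags named in the abstract; case-insensitive matching is an assumption.
HASHTAGS = {"#millennial", "#millennials", "#millennialgeneration"}
HASHTAG_RE = re.compile(r"#\w+")
TOKEN_RE = re.compile(r"[a-z']+")
# Tiny illustrative stop-word list (hypothetical, not the study's list).
STOPWORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "my", "i", "so"}


def has_target_hashtag(text):
    """Return True if the tweet carries at least one of the target hashtags."""
    tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
    return bool(tags & HASHTAGS)


def term_counts(tweets):
    """Count non-stop-word terms across tweets, ignoring all hashtags."""
    counts = Counter()
    for text in tweets:
        # Drop hashtags first so they do not dominate the counts, then lowercase.
        cleaned = HASHTAG_RE.sub(" ", text).lower()
        counts.update(w for w in TOKEN_RE.findall(cleaned) if w not in STOPWORDS)
    return counts


# Made-up sample tweets, for illustration only.
sample = [
    "Feeling lonely in the city #Millennials",
    "Travel is my self-expression #millennial #travel",
    "Stock tips for today #finance",
]
kept = [t for t in sample if has_target_hashtag(t)]
counts = term_counts(kept)
```

In a fuller pipeline, the filtered tweets would be passed to a topic model and a sentiment classifier before term counts are inspected per topic; the sketch only shows the filtering and counting around those steps.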