Published 31.12.2023
Keywords
- Twitter
- machine learning
- bot tweet
Copyright (c) 2023 Refik Söylemez; Ali Boyacı (Co-Author)
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Abstract
Twitter has changed significantly since its launch in 2006, evolving from a platform limited to 140-character messages into one used for everything from communication to marketing. Researchers have conducted numerous studies on Twitter data, covering topics from emotion and influence to political polarization and bot analysis. Much of this work, however, has focused on analyzing bot tweets to combat the spread of false information. The accuracy of such analyses varies with the training data selected to build the machine learning models. In this study, we investigate how different training data affect the accuracy of these models, specifically the effect of randomly selected training data on model performance. By examining this question, we hope to shed new light on the challenges and opportunities of using machine learning methods to analyze Twitter data.
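The question of how randomly selected training data affects accuracy can be illustrated with a small sketch. It is not the authors' experiment. The two-feature synthetic "bot"/"human" samples, the nearest-centroid classifier, and every function name and parameter value below are assumptions chosen only to keep the example self-contained. The sketch trains the same model on several random subsets of one data pool and reports how much test accuracy moves from one draw to the next.

```python
import random
import statistics


def make_dataset(n, rng):
    """Synthetic samples (an assumption, not Twitter data): label 1 = 'bot', 0 = 'human'."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label == 1 else -1.0
        features = (rng.gauss(center, 1.5), rng.gauss(center, 1.5))
        data.append((features, label))
    return data


def fit_centroids(train):
    """Nearest-centroid 'model': the mean feature vector of each class present in train."""
    centroids = {}
    for label in (0, 1):
        rows = [f for f, y in train if y == label]
        if rows:  # a small random sample could, in principle, miss a class
            centroids[label] = tuple(sum(col) / len(rows) for col in zip(*rows))
    return centroids


def predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, test):
    """Fraction of test samples classified correctly."""
    hits = sum(predict(centroids, f) == y for f, y in test)
    return hits / len(test)


def accuracy_over_random_training_sets(pool, test, train_size, n_runs, seed=0):
    """Train on n_runs random subsets of pool and score each on the same test set."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_runs):
        train = rng.sample(pool, train_size)  # random selection without replacement
        scores.append(accuracy(fit_centroids(train), test))
    return scores


data_rng = random.Random(42)
pool = make_dataset(1000, data_rng)
test = make_dataset(500, data_rng)
scores = accuracy_over_random_training_sets(pool, test, train_size=30, n_runs=20)
print(f"min={min(scores):.3f} max={max(scores):.3f} "
      f"mean={statistics.mean(scores):.3f} stdev={statistics.stdev(scores):.3f}")
```

The test set is held fixed across runs, so any spread between the minimum and maximum scores comes only from which training samples were drawn.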