Volume 2022 (30),
Article ID 4037722,
Methods and Applications in Artificial Intelligence and Machine Learning: 40AI 2022
Abstract
In this article, we investigate a variety of document-to-vector representation algorithms that may be used for text categorization and sentiment analysis. The first candidate we examined was the document-to-vector technique known as doc2vec, followed by term frequency (TF) and TF/IDF as baseline approaches, and finally the word2vec technique. Several techniques, combinations of these vectors, and other machine learning approaches were evaluated. Compared with state-of-the-art results on various Arabic sentiment analysis benchmarks, including those obtained with deep pre-trained models such as BERT, the scheme developed in this study improves accuracy by 2% to 3% on average.
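
To make the TF and TF/IDF baselines named in the abstract concrete, the following is a minimal sketch of how a tokenized document can be turned into such vectors. It is illustrative only and is not the article's implementation: the whitespace tokenization, the length-normalized TF, the smoothed IDF variant, the toy English corpus, and the function names (`tf_vector`, `idf_weights`, `tfidf_vectors`) are all assumptions introduced here.

```python
import math
from collections import Counter


def tf_vector(tokens, vocab):
    """Term frequency: count of each vocabulary word divided by document length."""
    counts = Counter(tokens)
    n = len(tokens) or 1  # guard against empty documents
    return [counts[w] / n for w in vocab]


def idf_weights(docs, vocab):
    """Smoothed inverse document frequency (assumed variant):
    log((1 + N) / (1 + df)) + 1, where df is the number of documents containing the word."""
    n_docs = len(docs)
    weights = []
    for w in vocab:
        df = sum(1 for d in docs if w in d)
        weights.append(math.log((1 + n_docs) / (1 + df)) + 1.0)
    return weights


def tfidf_vectors(docs):
    """Return the sorted vocabulary and one TF/IDF vector per document."""
    vocab = sorted({w for d in docs for w in d})
    idf = idf_weights(docs, vocab)
    vectors = [
        [tf * w for tf, w in zip(tf_vector(d, vocab), idf)]
        for d in docs
    ]
    return vocab, vectors


# Toy, made-up corpus (hypothetical; not drawn from the article's benchmarks).
docs = [
    "the service was great".split(),
    "the service was slow".split(),
    "great food".split(),
]
vocab, vectors = tfidf_vectors(docs)
for doc, vec in zip(docs, vectors):
    print(" ".join(doc), [round(x, 3) for x in vec])
```

In this sketch, words that occur in fewer documents receive a larger IDF weight, so a rarer word such as "slow" contributes more to its document's vector than a word shared across documents such as "the". The resulting fixed-length vectors are the kind of input a downstream classifier could consume.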