Leveraging Big Data Analytics and Machine Learning Techniques for Sentiment Analysis of Amazon Product Reviews in Business Insights

Chethan Sriharsha Moore, Purna Chandra Rao Chinta, Niharika Katnapally, Krishna Ja, Kishan Kumar Routhu, Vasu Velaga

Citation: Chethan Sriharsha Moore, Purna Chandra Rao Chinta, Niharika Katnapally, Krishna Ja, Kishan Kumar Routhu, Vasu Velaga, "Leveraging Big Data Analytics and Machine Learning Techniques for Sentiment Analysis of Amazon Product Reviews in Business Insights", Universal Library of Engineering Technology, Special Issue.

Copyright: This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Sentiment analysis is essential for understanding consumer feedback and improving the quality of goods and services. This study applies machine learning (ML) approaches to sentiment analysis of an Amazon product review dataset. Four ML methods are compared: Gradient Boosting (GB), Logistic Regression (LR), Naïve Bayes (NB), and Recursive Neural Network for Multiple Sentences (RNNMS). The approach begins with preprocessing the dataset by removing punctuation, filtering stop words, and tokenising the text, followed by feature extraction using techniques such as Bag of Words (BoW). The data is then split into training and testing sets, and the models are evaluated using accuracy, precision, recall, and F1-score. Among the models tested, Gradient Boosting outperforms the others, scoring 82% on all four metrics, which demonstrates its strong classification ability. Although GB provides the highest performance in this comparison, future work could explore more advanced models and techniques to further improve sentiment classification accuracy across diverse product categories.
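
The preprocessing, Bag of Words, and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the lowercasing step, the sample reviews, and all function names are assumptions made for illustration, and only the Python standard library is used.

```python
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the abstract does not name one).
STOP_WORDS = {"the", "a", "an", "is", "it", "this", "and", "to", "of", "i"}


def preprocess(text):
    """Lowercase (assumed step), remove punctuation, tokenise, drop stop words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def build_vocabulary(token_lists):
    """Map every distinct token to a column index, in alphabetical order."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    return {tok: idx for idx, tok in enumerate(vocab)}


def bag_of_words(tokens, vocab):
    """Turn one token list into a vector of per-term counts."""
    vector = [0] * len(vocab)
    for tok, count in Counter(tokens).items():
        if tok in vocab:  # tokens unseen when the vocabulary was built are ignored
            vector[vocab[tok]] = count
    return vector


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up reviews, used only to exercise the functions above.
    reviews = ["This is a great product, I love it!", "Terrible product. It broke."]
    tokens = [preprocess(r) for r in reviews]
    vocab = build_vocabulary(tokens)
    print([bag_of_words(t, vocab) for t in tokens])
    # Hypothetical true labels and predictions, not results from the study.
    print(classification_metrics([1, 0, 1, 1], [1, 0, 0, 1]))
```

In a fuller pipeline, the count vectors would be split into training and testing sets and passed to a classifier such as Gradient Boosting, with the metrics computed on the held-out test set.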


Keywords: Sentiment Analysis, Big Data, Business, Machine Learning, Amazon Product Review Dataset, BoW

DOI: https://doi.org/10.70315/uloap.ulete.2022.002