
REVIEW OF PAPERS

Elina



Previous Research on Text Analytics

Text Cleaning

The article outlines the text analytics methods applied to course reviews on the Coursera platform. The authors pre-processed the reviews by stemming them and removing stop words. To ascertain the general sentiment of the reviews, they carried out sentiment analysis, and to find the most pertinent subjects they applied topic modelling with Latent Dirichlet Allocation (LDA). Algorithms used: the Natural Language Toolkit (NLTK), the VADER sentiment analysis tool, and LDA for topic modelling are among the techniques the researchers used. Accuracy attained: the article reports results for both sentiment analysis and topic modelling. The authors' accuracy for sentiment analysis was 85.28%, and their average coherence score for topic modelling was 0.42. The publication also acknowledges the study's shortcomings and proposes areas for further investigation (Chan, Rajamohan, Gan, & Sam, 2021).

The research proposes a data cleaning technique called DeepClean that takes a question-asking approach: to find and fix data issues, DeepClean generates a set of queries about the data. Algorithms used: DeepClean generates these queries with a machine learning technique. Specifically, the researchers use a neural network to learn a mapping from data properties to natural language questions, and they also employ clustering and rule-based methods to find and fix data issues. Accuracy attained: the study discusses the accuracy and efficacy of the DeepClean strategy. The authors tested DeepClean on various real-world datasets and compared its performance with that of other data cleaning techniques. They claim that DeepClean was more accurate and effective than the other procedures examined (Zhang, Ji, Nguyen, & Wang, 2018).

The study outlines a technique for cleaning raw tweets to improve sentiment analysis. The method entails a number of pre-processing steps, including stop word removal, stemming, and detecting and correcting spelling mistakes. Additionally, the researchers employ a rule-based strategy to find and eliminate data noise such as hashtags, URLs, and mentions. Algorithms used: for the pre-processing steps, the authors combine several tools, including TextBlob for spelling correction and NLTK for stop word removal and stemming. To reduce data noise, they use regular expressions in the rule-based method. Accuracy attained: the article discusses the accuracy of sentiment analysis on both cleaned and uncleaned data (Kumar & Garg, 2020).
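The rule-based cleaning described above can be sketched with the standard `re` module. This is only an illustration under stated assumptions: the regular expressions, the tiny stop word list, and the function name `clean_tweet` are hypothetical and are not taken from the paper, which relies on NLTK and TextBlob; stemming and spelling correction are left out.

```python
import re

# Tiny illustrative stop word list (an assumption; the paper uses NLTK's list).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "it", "this"}

# Noise patterns: URLs first, so that '@' or '#' inside a URL is removed with it.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text: str) -> list[str]:
    """Return lowercase tokens with URLs, mentions, hashtags and stop words removed."""
    text = text.lower()
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE):
        text = pattern.sub(" ", text)
    # Drop digits and punctuation, keeping only letters and whitespace.
    text = NON_ALPHA_RE.sub(" ", text)
    return [token for token in text.split() if token not in STOP_WORDS]


print(clean_tweet("@bob This phone is GREAT!!! #happy https://t.co/xyz"))
```

Whether hashtags should be dropped entirely or kept as plain words is a design choice; this sketch drops them, following the description of hashtags as noise.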

The study offers a text mining method for digital forensics analytics in unclean, noisy, or scrambled datasets. The procedure comprises pre-processing, feature extraction, and classification. The authors pre-process the data with stop word removal, stemming, and the elimination of special characters and digits. For feature extraction, they combine Term Frequency-Inverse Document Frequency (TF-IDF) and Latent Dirichlet Allocation (LDA). For classification, they employ machine learning techniques such as Naive Bayes and Decision Trees. Algorithms used: TF-IDF and LDA for feature extraction, and Naive Bayes and decision trees for classification. Accuracy attained: the paper discusses the accuracy of the proposed text mining approach when working with unclean, noisy, or scrambled datasets (Xylogiannopoulos, Karampelas, & Alhajj, 2017).

The paper reports no accuracy findings because it was written in 2022 and the conference had not yet taken place. However, from the title and abstract, one can determine that the researchers propose a new approach to sentiment analysis that uses a Superior Expectation-Maximization Vector Neural Network to analyse large amounts of data from the travel and tourism sector. The techniques utilised were therefore big data analytics, sentiment analysis, and the Superior Expectation-Maximization Vector Neural Network, which is the novel algorithm the study proposes (Devi & Renuga Devi, 2022).

Feature Extraction

The investigation focused on identifying similarities between text documents in the Kazakh language using an extension of TF-IDF (Bakiyev, 2022). The improved technique outperforms the conventional method on the Cosine, Jaccard, and Dice similarity measures. In addition, a flowchart of the traditional method is provided, and the results of the data extraction are recorded in tabular form to ensure data accuracy. The software must consider the synonyms of Kazakh terms when calculating document similarity, because when the search words themselves do not match, their equivalents may. Thus, Kazakh text materials may be compared more accurately using word equivalents.
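The three similarity measures named above follow standard definitions and can be sketched as follows. The `normalise` helper and its synonym map are hypothetical stand-ins for the synonym handling the study describes, and the English tokens are used only for illustration.

```python
import math
from collections import Counter


def normalise(tokens: list[str], synonyms: dict[str, str]) -> list[str]:
    """Map each token to a canonical form so that synonyms count as a match."""
    return [synonyms.get(token, token) for token in tokens]


def cosine(a: list[str], b: list[str]) -> float:
    """Cosine similarity between term-count vectors."""
    counts_a, counts_b = Counter(a), Counter(b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a: list[str], b: list[str]) -> float:
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def dice(a: list[str], b: list[str]) -> float:
    """Twice the shared vocabulary size divided by the sum of both vocabulary sizes."""
    set_a, set_b = set(a), set(b)
    total = len(set_a) + len(set_b)
    return 2 * len(set_a & set_b) / total if total else 0.0


doc_a = ["stock", "price", "increase"]
doc_b = ["stock", "price", "rise"]
synonyms = {"increase": "rise"}  # hypothetical synonym dictionary
print(jaccard(doc_a, doc_b), jaccard(normalise(doc_a, synonyms), doc_b))
```

Mapping synonyms to one canonical term before comparison raises the similarity of documents that express the same idea with different words, which is the effect the study attributes to its extension.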

The investigation highlighted that, through feature extraction, a textual document can be converted into a list of features that text-categorization methods can process (Kadhim, 2019). Computing document feature values is an effective preparation step for data mining and text categorization. The study compares BM25 and TF-IDF, two methods for quantifying the significance of terms on Twitter. The results show that TF-IDF outperforms BM25 as a feature weighting, with a best F1 score of 89.77 against 89.16. Thus, both BM25 and TF-IDF are effective term-weighting methods for feature extraction.
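For reference, a common formulation of the BM25 term weight is sketched here. The parameter defaults `k1=1.5` and `b=0.75` and the smoothed IDF variant are conventional choices and assumptions of this sketch; the study's exact settings are not given in this review.

```python
import math


def bm25_weight(tf: int, df: int, n_docs: int, doc_len: int, avg_doc_len: float,
                k1: float = 1.5, b: float = 0.75) -> float:
    """BM25 weight of one term in one document.

    tf: occurrences of the term in the document
    df: number of documents containing the term
    """
    # Smoothed inverse document frequency (always non-negative in this variant).
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    # Longer-than-average documents are penalised; b controls how strongly.
    length_norm = 1.0 - b + b * doc_len / avg_doc_len
    # The term frequency saturates: repeated occurrences add less and less.
    return idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)


print(bm25_weight(tf=1, df=1, n_docs=10, doc_len=100, avg_doc_len=100))
print(bm25_weight(tf=10, df=1, n_docs=10, doc_len=100, avg_doc_len=100))
```

Unlike plain TF-IDF, the BM25 weight grows sub-linearly with term frequency and accounts for document length, which matters less for short, uniformly sized texts such as tweets.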

The investigation determined that TEL, or “technology-enhanced learning,” shares concerns with digitalization because it transforms learning and increases performance via technological aids (Rahmah, 2019). Multiple universities are working to incorporate TEL as it evolves. TEL performance metrics need to be defined based on this study's identification of the traits and variables that affect TEL. The study analysed 40 TEL publications from the IEEE Xplore database, all published after 2010. The solution uses a weighted combination of the TF and IDF values of words to identify the most important terms in a text. Comparing the 381 significant words with the 685 TF-IDF essential words, the study derives 23 crucial terms. The results show that the top cluster by TF-IDF weight contains several meaningful terms.
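The TF-IDF weighting referred to above can be sketched in a few lines. This uses the textbook form tf(t, d) × log(N / df(t)) with relative term frequency; the study's exact variant is not stated in this review, so treat the formula and the toy documents as assumptions.

```python
import math
from collections import Counter


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Weight each term in each tokenised document by tf(t, d) * log(N / df(t))."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    ["learning", "technology", "learning"],
    ["technology", "university"],
]
weights = tf_idf(docs)
# The highest-weighted term of the first document.
print(max(weights[0], key=weights[0].get))
```

A term that appears in every document receives a weight of zero, which is why TF-IDF surfaces terms that are frequent in one document but rare across the collection.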

The investigation highlighted that the first step of a retrieval process is the user entering a query into the system. Each page has to go through some pre-processing so that it can be easily consumed by search engines and their algorithms (Mishra, 2015). Every document goes through processing phases that include tokenization, stemming, normalization, and stop word removal. Using the FIRE dataset, a collection of articles from various publications, the study evaluates how well Vector-Space Models operate. All tests and assessments were conducted using the free search engine Terrier 3.5. The findings on the updated corpus reveal that TF-IDF models provide the greatest accuracy values, and when compared with those of the open-source search tool Terrier they were found to be superior.

The author proposes a novel algorithm, CDI TF-IDF, that uses the characteristics of channel-IDF values to fine-tune the IDF measure for a specific term (Xu, 2010). The media industry is segmented by its archive framework, and various channels cover different stories and topics. The effectiveness of the experiments is measured using Precision-Recall and the F-measure. The algorithm takes about as long as the standard algorithm to reach its precision and recall rates. According to experiments conducted on a manually labelled test set, CDI TF-IDF improves over standard TF-IDF in terms of Recall (by 2.71%), Precision (by 3.07%), and F0.5 (by 3.00%). The space-time cost of the CDI TF-IDF and channel-TF-IDF algorithms is N times that of the classic technique.
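The F0.5 figure reported above is an F-measure that weights precision more heavily than recall. A minimal sketch of the standard F-beta formula, assuming raw true-positive, false-positive, and false-negative counts (the function name and the example counts are illustrative and not taken from the paper):

```python
def precision_recall_fbeta(tp: int, fp: int, fn: int, beta: float = 1.0):
    """Return (precision, recall, F-beta) from confusion counts.

    beta < 1 favours precision, beta > 1 favours recall, beta = 1 gives F1.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    beta_sq = beta * beta
    f_beta = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
    return precision, recall, f_beta


# Hypothetical counts: a system that is precise but misses half the relevant items.
print(precision_recall_fbeta(tp=8, fp=2, fn=8, beta=0.5))
print(precision_recall_fbeta(tp=8, fp=2, fn=8, beta=1.0))
```

With these counts the F0.5 score is higher than F1, because the system's stronger precision is given more weight.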

Extracting textual features is a crucial part of data analysis and retrieval processes that use either the bag-of-words (BOW) or the TF-IDF model. The research uses the word2vec method to learn the word matrix of the corpus and so acquire its “semantic characteristics”. After eliminating words with a “low TF-IDF value”, it clusters the “remaining words” by their “word vector similarity” using a density-based clustering method, which improves the accuracy of the extracted text features (Liu, 2018). Related words are thereby grouped so that they can serve as a shared symbol. Certain “semantic principles” and “specific domain” data structures are required to extract these semantic aspects. Experimental results demonstrate that reusing the TF-IDF approach to build a vector space model (VSM) with these “clusters as feature units” may significantly enhance the precision of text feature extraction.

Previous Research on Sentiment Classification for Textual Data

The authors provide a technique for sentiment analysis that combines weighted text features and sentiment-specific word embedding. They combine clustering and logistic regression, which are unsupervised and supervised learning methods, respectively. Algorithms used: the research presents a novel technique that incorporates clustering, logistic regression, sentiment-specific word embedding, and weighted text features. Accuracy attained: on SemEval 2016 Task 4, the standard dataset for sentiment analysis of tweets, the researchers claim to have achieved an accuracy of 82.2% (Q. Li, S. Shah, R. Fang, A. Nourbakhsh, & X. Liu, 2016).

The research developed a technique for analysing the sentiment of Japanese tweets using advanced word embedding and auto-augmented sentiment polarity dictionaries. To increase the size of the initial training dataset and the model's accuracy, the authors additionally employed data augmentation methods. Algorithms used: advanced word embedding techniques for feature extraction and convolutional neural networks (CNNs) for sentiment classification. Accuracy attained: the authors stated that the 83.1% sentiment classification accuracy on their dataset was significantly greater than the accuracy levels found in other research on Japanese Twitter sentiment analysis (M. F. F. Khan, A. Kanemaru, & K. Sakamura, 2022).

Text mining of tweets is the technique used to classify sentiment and associate it with stock prices. Support vector machine (SVM) classification is the method used to categorise tweets into neutral, negative, and positive sentiment. To link the tone of tweets with the stock market values of the relevant businesses, the author also used a linear regression model. Neither the introduction of the paper nor its abstract states the accuracy that was attained (Urolagin, 2017).

For text classification of tweets about the flu, the article used supervised machine learning. The FastText algorithm was applied for classification, with sentiment and keyword features used as the extracted features. The accuracy achieved on the classification problem was not reported in the article (A. Alessa, M. Faezipour, & Z. Alhassan, 2018).

The publication by Tripathi et al. proposes a technique for analysing the sentiment of English tweets using the RapidMiner tool. The researchers first pre-processed a collection of tweets gathered throughout the 2014 FIFA World Cup by removing stop words and stemming. The Text Processing RapidMiner plugin, which uses machine learning techniques such as Naive Bayes and Support Vector Machines (SVMs) for classification, was used to carry out the sentiment analysis. The researchers classified the tweets as positive, negative, or neutral using both techniques. After comparing the results with a professionally annotated dataset, the authors reported an accuracy of 73.6% with Naive Bayes and 70.8% with SVMs.

By using a tool such as RapidMiner, which streamlines the classification process, the study proposes an intriguing method for sentiment analysis of tweets. Compared with other research, however, the accuracy obtained is rather poor, and the authors offer no explanation for why this is the case. A more thorough investigation of the findings and of the features employed in the classification method would have strengthened the paper. In addition, the generalizability of the conclusions is constrained by the authors' use of just one dataset. Despite these drawbacks, the publication contributes to the discipline of sentiment analysis of tweets and offers a foundation for further study (P. Tripathi, S. K. Vishwakarma, & A. Lala, 2015).

The study by Bolbol and Maghari presents a supervised machine learning approach to sentiment analysis of Arabic tweets. To develop and evaluate their algorithms, the authors used a dataset of 10,000 Arabic tweets that had been labelled as positive, negative, or neutral. They pre-processed the data and extracted features such as sentiment lexicons and bag-of-words using the data mining tool RapidMiner. The authors then used k-fold cross-validation to train and assess a number of machine learning algorithms, including Naive Bayes, Random Forest, Decision Tree, and Support Vector Machine (SVM) (N. K. Bolbol & A. Y. Maghari, 2020).

According to the findings, the SVM classifier achieved the highest accuracy, 82.7%, in classifying Arabic tweets as positive, negative, or neutral. Additionally, the authors provided precision, recall, and F1-score measures for each class, showing that the positive and neutral classes performed well while the negative class performed less well.

In general, the study offers insight into the difficulties of doing sentiment analysis on Arabic tweets and delivers a thorough examination of various machine learning methods for this purpose. Nevertheless, in order to further boost the precision of the sentiment analysis of Arabic tweets, the research might profit from employing a larger set of data and investigating additional attribute extraction techniques.
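The k-fold cross-validation mentioned for the Arabic tweet study can be illustrated with a small index-splitting sketch. The function name is hypothetical, the folds are contiguous and unshuffled for clarity, and the value of k used by the authors is not stated in this review.

```python
from typing import Iterator


def k_fold_indices(n_samples: int, k: int) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train_indices, test_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


for train, test in k_fold_indices(n_samples=10, k=3):
    print(len(train), test)
```

Each sample is used for testing exactly once, so averaging the per-fold scores gives a less optimistic estimate than a single train/test split. In practice the data would usually be shuffled, or stratified by class, before splitting.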

Previous Research on Stock Sentiment Detection

Sentiment Extraction and Analytics

Machine learning (ML) and Artificial Intelligence (AI) are two digitally advanced techniques used together with data mining to solve a wide range of real-world challenges. Owing to the effectiveness of these technologies, higher yields are obtained in the stock market even with minimal annual income. Based on this context, the study aimed to address how stock market prediction can be understood from social sentiment using machine learning algorithms. To do so, Python's Natural Language Toolkit is used to collect data from the Twitter API, and machine learning algorithms then analyse the data through sentiment analysis. The results showed that stock price movements could be predicted with this analytical method with an accuracy of 87.6% (Mankar, 2018).

In recent years, various financial institutions have applied ML to predict stock prices effectively. Classification and regression enable financial analysts to predict future stock prices from past data. Based on this context, the purpose of the paper was a preliminary investigation of how machine learning and sentiment analysis can be used for stock forecasting. To do so, an experiment is performed using “Classification Learner and Neural Network Pattern Recognition applications from MATLAB” (Seals, 2020). The algorithms used include support vector machines, regression, artificial neural networks, and others. Sentiment analysis was then performed using the VADER model, which yielded stock market prediction accuracies of 54.6% and 56.5% for AAPL and MSFT stocks, respectively.

With the advancement of digital technologies, predicting movements in the stock market has become a notable trend. Additionally, social media has become a natural place for displaying public sentiment and opinions regarding current events. Considering this context, the researchers aimed to address how the stock prices of an organisation rise or fall according to changes in public opinion as represented in tweets about the organisation. To do so, the researchers collected nearly 250,000 tweets on Microsoft from the Twitter API for the period from 31st Aug '15 to 25th Aug '16. Tokenization, stop word removal, and regex matching are then used for data pre-processing, and sentiment analysis is performed using Word2vec and N-gram representations. The algorithms used in this paper are the Random Forest algorithm, logistic regression, and SMO, which yielded accuracies from 62.42% to 70.18% for Word2vec and from 65.84% to 70.49% for N-gram (Pagolu, 2016).

Since stock market volatility is a key concern for investors, the paper aimed to understand how specific mechanisms of investor sentiment influence stock market volatility, using Pollet and Wilson's theory of volatility decomposition and a big data strategy. To do so, a range of sentiment indexes is collected across various platforms such as web search, social networks, and others. The researchers performed the Granger causality test and correlation analysis to assess the relationship between investor sentiment and financial market forecasts. The support vector machine is used as the primary ML algorithm for the paper, and it yielded 99% accuracy in capturing the impact of investors' sentiments on stock market prediction (Peng, 2019).
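The correlation analysis mentioned above can be illustrated with a plain Pearson correlation and a simple lagged variant. This is not a Granger causality test, which additionally compares regression models with and without the lagged predictor; the series values and function names below are made up for illustration.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(sentiment: list[float], volatility: list[float], lag: int) -> float:
    """Correlate sentiment at time t with volatility at time t + lag (lag >= 0)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(sentiment, volatility)
    return pearson(sentiment[:-lag], volatility[lag:])


# Hypothetical daily series in which volatility follows sentiment one day later.
sentiment = [0.1, 0.4, 0.2, 0.8, 0.5]
volatility = [9.0] + [2 * s for s in sentiment[:-1]]
print(lagged_correlation(sentiment, volatility, lag=1))
```

A high correlation at a positive lag only suggests that sentiment leads volatility; it does not by itself establish predictive causality.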

Stock tweet sentiment classification

Social media and networking sites are significant platforms for individuals to share opinions and information on various topics, including the stock market. As with personal opinions, social media is a powerful tool for sharing stock market information about an organisation. To leverage these strengths, the researchers studied how movements of stock prices are connected with the sentiments of users by text mining Twitter. To do so, the researchers used SVM and Naive Bayes algorithms to perform the sentiment analysis. Specifically, N-gram features are used to analyse the tweets, with accuracy rates of 91.51% and 80.042% for SVM and Naive Bayes classification, respectively.
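To make the combination of N-gram features and Naive Bayes concrete, here is a minimal multinomial Naive Bayes sketch with add-one smoothing over unigrams and bigrams. The class, the training sentences, and the labels are invented for illustration and do not reproduce the study's data or implementation.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens: list[str], n: int) -> list[str]:
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(text: str, max_n: int = 2) -> list[str]:
    """Unigram and bigram features of a whitespace-tokenised, lowercased text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        feats.extend(ngrams(tokens, n))
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts: list[str], labels: list[str]) -> "NaiveBayes":
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = features(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text: str) -> str:
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for feat in features(text):
                if feat in self.vocab:  # ignore features never seen in training
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = NaiveBayes().fit(
    ["shares surge on strong earnings", "great quarter stock up",
     "shares plunge on weak earnings", "terrible quarter stock down"],
    ["positive", "positive", "negative", "negative"],
)
print(model.predict("strong earnings surge"))
```

Bigrams let the classifier pick up short phrases such as "strong earnings" that carry more signal than the individual words, at the cost of a larger and sparser vocabulary.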

The study showed that predicting stock prices is a difficult task that depends on the demand for the stock as well as on new information surfacing on the web. Social media is one such powerful tool where individuals share opinions and information about the stocks of various organisations. Leveraging this opportunity, the researchers conducted a study on predicting the Indonesian stock market using sentiment analysis. Naïve Bayes and Random Forest algorithms are used to analyse the data, and a linear regression analysis is used to build the prediction model. The results revealed a strong connection between stock market volatility and the sentiments of social media users, with accuracies of 56.50% and 60.39% for the Naïve Bayes and Random Forest algorithms, respectively (Cakra, 2015).
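A prediction model of the kind described above can be sketched as a one-variable least-squares regression from a daily sentiment score to a price. The variable names and data points are hypothetical; the study's actual features and model specification are not given in this review.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0:
        raise ValueError("x must not be constant")
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: daily sentiment score and next-day closing price.
sentiment_scores = [0.0, 1.0, 2.0, 3.0]
next_day_prices = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(sentiment_scores, next_day_prices)
print(slope * 4.0 + intercept)  # predicted price for a sentiment score of 4.0
```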

Theoretically, sentiment classification is an instrument used to predict stock price movement. Twitter is one social networking platform where FinTech news is shared and where people give their opinions on it. Based on this context, the researchers studied how sentiment classification differs between two ML algorithms, the Support Vector Machine and the Naive Bayes classifier, when assigning neutral, positive, and negative sentiment to news and tweets about the Thai FinTech industry found on Twitter. Using a lexicon-based analysis method and applying regression analysis, accuracies of 58.51% and 65.18% are obtained for Naive Bayes and SVM, respectively (Sangsavate, 2019).
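A lexicon-based analysis of the kind mentioned above can be sketched by counting matches against positive and negative word lists. The two word sets here are tiny, made-up English examples; real lexicons are far larger and, for the Thai FinTech study, would be language-specific.

```python
import re

# Hypothetical miniature sentiment lexicons.
POSITIVE_WORDS = {"gain", "growth", "profit", "strong", "up"}
NEGATIVE_WORDS = {"loss", "decline", "risk", "weak", "down"}


def lexicon_sentiment(text: str) -> str:
    """Label a text positive, negative, or neutral by counting lexicon hits."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(t in POSITIVE_WORDS for t in tokens) - sum(t in NEGATIVE_WORDS for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_sentiment("Strong growth in mobile payments"))
print(lexicon_sentiment("Regulators warn of risk and loss"))
print(lexicon_sentiment("The bank opened a new branch"))
```

Such labels are often used as features or weak labels for a supervised classifier rather than as the final prediction, since simple counting ignores negation and context.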

The research notes that the stock market is volatile and dynamic. However, with the latest technologies, businesses can understand changes, predict movements, and make safer investment decisions. To understand the dynamics of market trends, three models are used in this paper: ARIMA, LSTM, and Linear Regression. Additionally, the researchers performed sentiment analysis of tweets about selected organisations. The classification results reveal how rises and falls in the market are associated with investors' sentiments and how investors can make safe moves to avoid financial risk, with an accuracy of 8.81% to 28.93% for all the selected organisations across the three algorithms (Mehta, 2021).

The volatility of the stock market is one of the key concerns for investors. However, volatility depends not only on market demand or market conditions but also on the biased opinions of investors. Based on this premise, the researchers studied how investor bias affects the volatility of stocks by analysing the tweets of investors with natural language processing algorithms. Sentiment analysis is performed on the collected tweets using the Microsoft Azure sentiment analyzer, and a lexicon-based analysis is used to analyse phrases and texts. The N-gram algorithm is also used to perform sentiment analysis, with 15% accuracy (Chatterjee, 2016).

Previous Research on Stock Purchase Recommendations using Sentiment Analytics

Just as the impact of stock market movements can be understood through sentiment, future recommendations can be derived by analysing tweets about an organisation's stock information. To understand how a recommender application for the stock market works on Twitter, the researchers conducted sentiment analysis to determine the subjectivity, polarity, and classification of tweets. A hybrid learning method of classification was applied to produce future recommendations, and a Bayesian probabilistic algorithm was used to build sentence label models from the training data. AWS was used to extract Twitter data, MySQL was used as the database, and Python scripts were used to implement the analyzer, which achieved 60% accuracy (Gandhe, 2018).

The current success of AI applications in the financial sector has led many organisations to rely on stochastic models to predict market behaviour. In this paper, to predict the stock market, sentiment analysis is performed using ensemble learning based on the Random Forest algorithm and the Support Vector Machine (Pasupulety, 2019). Using these two algorithms as an ensemble, the researchers analysed India's National Stock Exchange using hashtag posts on Twitter together with features ranging from basic market price information to well-known technical indicators. Additionally, the researchers performed sentiment analysis on public opinion about an organisation, using a trained Word2Vec model to achieve higher accuracies. It is observed that, in some cases, hybrid models perform better than non-hybrid models.
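One common way to combine two classifiers into an ensemble is soft voting, which averages the class probabilities each model outputs. The sketch below assumes each model already returns a probability per label; the probabilities are invented, and the paper's actual combination rule is not described in this review.

```python
def soft_vote(probabilities: list[dict[str, float]]) -> str:
    """Average per-class probabilities across models and return the top label."""
    if not probabilities:
        raise ValueError("need at least one model output")
    labels = probabilities[0].keys()
    averaged = {
        label: sum(p[label] for p in probabilities) / len(probabilities)
        for label in labels
    }
    return max(averaged, key=averaged.get)


# Hypothetical outputs of a Random Forest and an SVM for one trading day.
random_forest_output = {"up": 0.6, "down": 0.4}
svm_output = {"up": 0.3, "down": 0.7}
print(soft_vote([random_forest_output, svm_output]))
```

Here the more confident model decides the outcome, which is the practical difference between soft voting and a simple majority vote.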

Today, with advances in technology, especially in data intelligence platforms, organisations can analyse large volumes of online data that express emotions in the form of comments and tweets. The stock market is no exception, since information about it is shared on the web and on Twitter. Because internet users are strongly influenced by information presented online, the stock market is likely to be influenced by tweets about investment and stocks. Based on this context, the researchers studied how sentiments expressed in tweets affect movements of the share market. The researchers analysed tweets from the five-year period from 2015 to 2019 and classified them using the VADER model (Singh, 2022). The SVM algorithm is then used to predict how the behaviour of the share market is influenced by sentiment, with an accuracy of 50.62%.

Sentiment classification of Twitter data is used for prediction across a variety of domains. The stock market is one such field, where sentiment analysis can be performed on tweets related to investment and stock market volatility. The researchers compared the overall accuracy of two key ML algorithms, a neural network and logistic regression, in classifying stock-based tweets as neutral, positive, or negative. The two classifiers are compared using “Bigram term frequency” and “inverse document term frequency” features. The results showed that the two classifiers offer similar results, with 58% accuracy.

One common approach to stock market prediction combines ML with sentiment analysis of data from microblogging platforms. Twitter is one such platform from which data can be sourced, extracted, and analysed to perform the prediction. The researchers followed a similar process and performed sentiment analysis on data from StockTwits and Twitter to predict stock movement. Data collection focused on Microsoft stock. SVM and logistic regression algorithms were used to analyse the data. VADER and TextBlob were used for sentiment detection on the Twitter and StockTwits data, with accuracy ranging from 46.6% to 65.8% (Nousi, 2021).

Social media platforms offer a wide variety of information that can be leveraged to predict the volatility of the stock market. This study compares the performance of deep learning models, traditional classifiers, and “state-of-the-art” pre-trained transformer models in classifying texts based on stock marks from StockTwits covering five organisations over a period of six months. Random forest and logistic regression are used as algorithms to perform the sentiment analysis, along with transformer models such as RoBERTa, XLNet, BERT, and others (Bozanta, 2021). The results showed that the “state-of-the-art” classifier outperformed the deep learning algorithms and traditional classifiers, with an accuracy of 48.67%.

The large volume of social media data has motivated researchers to apply sentiment analysis to forecasting the movement of stock market financials. Based on this context, the study used stock price as the ground truth, where tweets are labelled as “Self-dependent” or “Buy” according to whether the stock rose or fell in a scheduled period of time. A Bayesian classifier is used as the algorithm to perform the sentiment analysis. It is observed that the best validation set accuracy is 61% for trained data, and the results indicate how the return rate can reach over 5% with 83% scope per annum (Birbeck, 2018).

References

Kumar, A. P., & Garg, K. (2020). Data Cleaning of Raw Tweets for Sentiment Analysis. 2020 Indo-Taiwan 2nd International Conference on Computing, Analytics and Networks (Indo-Taiwan ICAN), Rajpura, India, pp. 273-276.

Xylogiannopoulos, K., Karampelas, P., & Alhajj, R. (2017). Text Mining in Unclean, Noisy or Scrambled Datasets for Digital Forensics Analytics. 2017 European Intelligence and Security Informatics Conference (EISIC), Athens, Greece, pp. 76-83.

Zhang, X., Ji, Y., Nguyen, C., & Wang, T. (2018). DeepClean: Data Cleaning via Question Asking. 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), Turin, Italy, pp. 283-292.

A. Alessa, M. Faezipour, & Z. Alhassan. (2018). Text Classification of Flu-Related Tweets Using FastText with Sentiment and Keyword Features. 2018 IEEE International Conference on Healthcare Informatics (ICHI), New York, NY, USA, pp. 366-367.

Bakiyev, B. (2022). Method for Determining the Similarity of Text Documents for the Kazakh language, Taking Into Account Synonyms: Extension to TF-IDF. 

Birbeck, E. a. (2018, November). Using stock prices as ground truth in sentiment analysis to generate profitable trading signals. IEEE Symposium Series on Computational Intelligence (SSCI), 1868-1875.

Bozanta, A. A. (2021, December). Sentiment Analysis of StockTwits Using Transformer Models. 20th IEEE International Conference on Machine Learning and Applications (ICMLA), 1253-1258.

Cakra, Y. a. (2015, October). Stock price prediction using linear regression based on sentiment analysis. international conference on advanced computer science and information systems (ICACSIS), 147-154.

Chan, Y. H., Rajamohan, R., Gan, K. H., & Sam, N.-H. (2021). Text Analytics on Course Reviews from Coursera Platform. 2021 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), Kota Kinabalu, Malaysia, pp. 1.

Chatterjee, A. a. (2016, August). Investor classification and sentiment analysis. IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 1177-1180.

Devi, C. N., & Renuga Devi, R. (2022). Big Data Analytics Based Sentiment Analysis Using Superior Expectation-Maximization Vector Neural Network in Tourism. 2022 6th International Conference on Computing Methodologies and Communication (ICCMC), Erode, India, pp. 1708-1716.

Gandhe, K. V. (2018, November). Sentiment analysis of Twitter data with hybrid learning for recommender applications. 9th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), 57-63.

Kadhim, A. (2019). Term weighting for feature extraction on Twitter: A comparison between BM25 and TF-IDF. 

Liu, Q. W. (2018). Text features extraction based on TF-IDF associating semantic. 4th International Conference on Computer and Communications (ICCC).

M. F. F. Khan, A. Kanemaru, & K. Sakamura. (2022). Sentiment Analysis of Japanese Tweets Using Auto-Augmented Sentiment Polarity Dictionaries and Advanced Word Embedding. 2022 IEEE 11th Global Conference on Consumer Electronics (GCCE), Osaka, Japan, pp. 462-466.

Mankar, T. H. (2018, January). Stock market prediction based on social sentiments using machine learning. 2018 international conference on smart city and emerging technology (ICSCET), 1-3.

Mehta, Y. M. (2021, May). Stock price prediction using machine learning and sentiment analysis. 2nd International Conference for Emerging Technology (INCET), 1-4.

Mishra, A. a. (2015). Analysis of tf-idf model and its variant for document retrieval. 2015 International Conference on Computational Intelligence and Communication Networks (CICN).

N. K. Bolbol, & A. Y. Maghari. (2020). Sentiment Analysis of Arabic Tweets Using Supervised Machine Learning. 2020 International Conference on Promising Electronic Technologies (ICPET), Jerusalem, Palestine, pp. 89-93.

Nousi, C. a. (2021, September). A methodology for stock movement prediction using sentiment analysis on Twitter and stocktwits data. 6th South-East Europe Design Automation, Computer Engineering, Computer Networks and Social Media Conference, 1-7.

P. Tripathi, S. K. Vishwakarma, & A. Lala. (2015). Sentiment Analysis of English Tweets Using Rapid Miner. 2015 International Conference on Computational Intelligence and Communication Networks (CICN), Jabalpur, India, pp. 668-672.

Pagolu, V. R. (2016, October). Sentiment analysis of Twitter data for predicting stock market movements. international conference on signal processing, communication, power and embedded system (SCOPES), 1345-1350.

Pasupulety, U. A. (2019, June). Predicting stock prices using ensemble learning and sentiment analysis. IEEE Second International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), 215-222.

Peng, D. (2019, June). Analysis of investor sentiment and stock market volatility trend based on big data strategy. International Conference on Robots & Intelligent System (ICRIS), 269-272.

Q. Li, S. Shah, R. Fang, A. Nourbakhsh, & X. Liu. (2016). Tweet Sentiment Analysis by Incorporating Sentiment-Specific Word Embedding and Weighted Text Features. 2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI), Omaha, NE, USA, pp. 568.

Qasem, M. T. (2015, August). Twitter sentiment classification using machine learning techniques for stock markets. International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 834-840.

Rahmah, A. S. (2019). Exploring technology-enhanced learning key terms using TF-IDF weighting.

Sangsavate, S. T. (2019, November). Stock market sentiment classification from FinTech News. 17th International Conference on ICT and Knowledge Engineering (ICT&KE), 1-4.

Seals, E. a. (2020, March). Preliminary Investigation in the use of Sentiment Analysis in Prediction of Stock Forecasting using Machine Learning. SoutheastCon, 2, 1-2.

Singh, A. S. (2022, May). Impact Of Social Media On Stock Market-A Case Of Sentiment Analysis Using Vader. International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COM-IT-CON), 1, pp. 300-304.

Urolagin, S. (2017, September). Text mining of tweet for sentiment classification and association with stock prices. International Conference on Computer and Applications (ICCA), 384-388.

Xu, M. H. (2010). A refined TF-IDF algorithm based on channel distribution information for Web news feature extraction. International Workshop on Education Technology and Computer Science.