Sentiment Analysis of Twitter Feeds using Machine Learning, Effect of Feature Hash Bit Size

Silas Kwabla Gah, Nana Kwame Gyamfi, Ferdinard Katsriku. Published in Information Sciences.

Communications on Applied Electronics
Year of Publication: 2017
Publisher: Foundation of Computer Science (FCS), NY, USA
Authors: Silas Kwabla Gah, Nana Kwame Gyamfi, Ferdinard Katsriku
DOI: 10.5120/cae2017652544

Silas Kwabla Gah, Nana Kwame Gyamfi and Ferdinard Katsriku. Sentiment Analysis of Twitter Feeds using Machine Learning, Effect of Feature Hash Bit Size. Communications on Applied Electronics 6(9):16-21, April 2017.

BibTeX

@article{10.5120/cae2017652544,
	author = {Silas Kwabla Gah and Nana Kwame Gyamfi and Ferdinard Katsriku},
	title = {Sentiment Analysis of Twitter Feeds using Machine Learning, Effect of Feature Hash Bit Size},
	journal = {Communications on Applied Electronics},
	issue_date = {April 2017},
	volume = {6},
	number = {9},
	month = {Apr},
	year = {2017},
	issn = {2394-4714},
	pages = {16-21},
	numpages = {6},
	url = {http://www.caeaccess.org/archives/volume6/number9/717-2017652544},
	doi = {10.5120/cae2017652544},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

Sentiment analysis is the task of identifying and grouping the opinions or views expressed in a text. In an age when social media technologies generate vast amounts of data in the form of tweets, Facebook comments, blog posts, and Instagram comments, sentiment analysis of this user-generated data provides very useful feedback. Twitter sentiment analysis in particular has become an effective way of determining public sentiment about a given topic, product, or issue. Consequently, much research in recent years has aimed to build efficient models that improve the accuracy and precision of sentiment classification. In this work, we analyse Twitter data using the support vector machine algorithm to classify tweets into positive, negative, and neutral sentiments. The research investigates the relationship between the feature hash bit size and the accuracy and precision of the resulting model by measuring the effect of varying the feature hash bit size on both metrics. The results showed that, beyond a certain feature hash bit size, accuracy and precision began to decrease as the bit size increased further.
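
To illustrate what the feature hash bit size controls, here is a minimal sketch that assumes the standard hashing trick, in which a bit size of b maps every token into one of 2^b buckets. It is not the authors' implementation: the function names, the use of MD5 as the hash function, and the sample tweet are all assumptions made for illustration, and the abstract does not state which tool or hash function the study used.

```python
import hashlib


def bucket_index(token: str, bits: int) -> int:
    """Return the bucket for a token in a table of 2**bits buckets."""
    # MD5 is an illustrative choice; any stable hash function would do.
    digest = hashlib.md5(token.encode("utf-8")).digest()
    # Reduce a 64-bit slice of the digest modulo the table size.
    return int.from_bytes(digest[:8], "big") % (1 << bits)


def hash_features(tokens: list[str], bits: int) -> dict[int, int]:
    """Build a sparse count vector: bucket index -> number of tokens hashed there."""
    vector: dict[int, int] = {}
    for token in tokens:
        index = bucket_index(token, bits)
        vector[index] = vector.get(index, 0) + 1
    return vector


def count_collisions(vocabulary: set[str], bits: int) -> int:
    """Count distinct tokens that do not get a bucket of their own."""
    used_buckets = {bucket_index(token, bits) for token in vocabulary}
    return len(vocabulary) - len(used_buckets)


# Hypothetical tweet, tokenized on whitespace for simplicity.
tweet = "loving the new phone but the battery life is not good".split()
vocabulary = set(tweet)

for bits in (2, 4, 8, 16):
    vector = hash_features(tweet, bits)
    print(
        f"bits={bits:2d} dimensions={1 << bits:6d} "
        f"non-zero={len(vector):2d} "
        f"collisions={count_collisions(vocabulary, bits)}"
    )
```

With a small bit size, distinct words are forced into the same bucket, so the classifier cannot tell them apart. With a large bit size, collisions become rare but the feature vectors grow exponentially in dimension and become very sparse. The sketch only shows this trade-off in the representation; it does not reproduce the accuracy and precision measurements reported in the paper.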

Keywords

Sentiment Analysis; Machine Learning; Support Vector Machine; Feature Hashing