Research Article

A New Machine Learning based Approach for Text Spam Filtering Technique

by Dipmalya Sen, Chandan Das, Sarit Chakraborty
Communications on Applied Electronics
Foundation of Computer Science (FCS), NY, USA
Volume 6 - Number 10
Year of Publication: 2017

Electronic mail (e-mail) has become an essential part of daily life in recent years, and the volume of e-mail traffic has grown many-fold over the last couple of decades. Around 80% of these e-mails are unwanted messages, known as unsolicited bulk e-mail (UBE) or spam. With the sharp increase in e-mail use, the problem of dealing with spam has escalated as well. Despite the availability of many commercial text-based spam filters, users still suffer from spam that accumulates unnecessarily in their inboxes. In this work, we propose a spam detection algorithm based on a machine learning approach. We use the concept of a Cumulative Weighted Sum (CWS) to achieve higher accuracy in detecting spam, and we propose three different techniques for calculating the CWS value. Our method detects most spam and provides accurate, dynamic filtering of such mail. Experimental results on different benchmark datasets are significant and show much improved performance over the available text spam filters.

Index Terms

Computer Science
Information Sciences


E-mail, Spam, Ham, Machine learning, Naïve-Bayes, Cumulative-Weighted Sum