Article Abstract
The Design of a Junk Mail Classification System Based on Naïve Bayes
  
DOI:10.3969/j.issn.1671-5322.2008.02.012
Keywords: email  text classification  Naïve Bayes  machine learning
Funding:
Author and affiliation:
Xu Zhiguo (徐治国), Yancheng Civil Aviation Station, Yancheng, Jiangsu 224051
Abstract views: 4755
Full-text downloads: 4132
Abstract:
      Anti-spam research has long been an active topic in computer science. Addressing the specific requirements of a junk mail classification system, this paper introduces machine learning on top of traditional rule-based classification, presents the system architecture and the feature extraction algorithm, tests a method for computing the posterior probability of each class for a new email, and discusses in detail the design of a personalized junk mail classification system based on the Naïve Bayes method. Using the word segmentation algorithm, the TFIDF feature subset extraction algorithm, and the Naïve Bayes method, the system classifies emails with good accuracy; because the Naïve Bayes method classifies each new email as it arrives, the system also achieves good classification speed.
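      The abstract names two steps without giving their details: extracting a TFIDF feature subset and computing the posterior probability of each class for a new email. The following is a minimal sketch of how such a pipeline might look, not the paper's actual algorithm. Everything in it is an assumption made for illustration: whitespace tokenization (Chinese mail would need a real word segmenter), ranking terms by their peak TFIDF weight, a multinomial model with add-one smoothing, the function names, and the four-message toy corpus.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: whitespace tokens; Chinese mail would need a word segmenter.
    return text.lower().split()

def select_features(docs, k):
    """Keep the k terms with the highest peak TF-IDF weight (hypothetical criterion)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    best = {}
    for doc in docs:
        for term, count in Counter(doc).items():
            weight = (count / len(doc)) * math.log(n / df[term])
            best[term] = max(best.get(term, 0.0), weight)
    # Sort by descending weight, breaking ties alphabetically for determinism.
    ranked = sorted(best, key=lambda t: (-best[t], t))
    return set(ranked[:k])

def train(docs, labels, vocab):
    """Estimate log priors and add-one smoothed log likelihoods per class."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        counts[label].update(t for t in doc if t in vocab)
    likelihood = {}
    for c in classes:
        total = sum(counts[c].values()) + len(vocab)
        likelihood[c] = {t: math.log((counts[c][t] + 1) / total) for t in vocab}
    return prior, likelihood

def posterior(doc, prior, likelihood):
    """Return P(class | doc) for every class; terms outside the vocabulary are ignored."""
    log_joint = {
        c: prior[c] + sum(likelihood[c][t] for t in doc if t in likelihood[c])
        for c in prior
    }
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_joint.values())
    evidence = sum(math.exp(v - m) for v in log_joint.values())
    return {c: math.exp(v - m) / evidence for c, v in log_joint.items()}

# Toy training corpus (invented for this sketch).
raw = [
    ("win free prize money now", "spam"),
    ("free money offer click now", "spam"),
    ("meeting schedule for project review", "ham"),
    ("project report attached for review", "ham"),
]
docs = [tokenize(text) for text, _ in raw]
labels = [label for _, label in raw]

vocab = select_features(docs, k=8)
prior, likelihood = train(docs, labels, vocab)
probs = posterior(tokenize("free prize money"), prior, likelihood)
print(probs)
```

      Working in log space keeps the product of many small word probabilities from underflowing, and restricting the model to a small feature subset keeps per-message scoring cheap, which is in line with the abstract's point about classifying mail as it arrives.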