A Generalized Email Classification System for Workflow Analysis

Piyanuch Chaipornkaew, Takorn Prexawanprasut, Chia‐Lin Chang, M. McAleer
Journal of Management Information and Decision Sciences, published 2017-07-01
DOI: 10.26480/icemi.01.2017.429.436 (https://doi.org/10.26480/icemi.01.2017.429.436)

Abstract

Email is one of the most powerful internet communication channels. Because employees and their clients communicate primarily via email, much crucial business data is conveyed in email content. Businesses understandably need a sophisticated workflow management system to manage their transactions. Such a system should also be able to classify incoming emails into suitable categories. Previous research implemented a system that categorizes emails based on the words found in email messages. Two parameters affected the accuracy of that program: the number of words in a database that are compared with sample emails, and an acceptable percentage for classifying emails. As email has grown in volume and sophistication, this research classifies email messages into a larger number of categories and changes one of the parameters that affect the accuracy of the program. The first parameter, the number of words in a database compared with sample emails, remains unchanged, while the second parameter is changed from an acceptable percentage to the number of matching words. The empirical results suggest setting the number of database words compared with sample emails to 11 and the number of matching words required to categorize an email to 7. When these settings are applied to categorize 12,465 emails, the accuracy is approximately 65.3%. High accuracy levels are obtained when the number of database words lies between 11 and 13 and the number of matching words lies between 6 and 8.
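The two parameters described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the word database is built, how messages are tokenized, or how ties between categories are resolved, so everything below is an assumption made for illustration: the names `classify_email`, `keyword_db`, `n_words`, and `min_matches` are hypothetical, the regex tokenizer is a placeholder, and the sample keyword lists are invented. Only the default values 11 and 7 come from the reported results.

```python
import re
from typing import Dict, Optional, Sequence


def classify_email(
    text: str,
    keyword_db: Dict[str, Sequence[str]],
    n_words: int = 11,      # words per category compared with the email
    min_matches: int = 7,   # matching words required to assign a category
) -> Optional[str]:
    """Return the category with the most matching words, or None if no
    category reaches min_matches. Illustrative only, not the authors' code."""
    # Assumed tokenization: lowercase alphabetic tokens, deduplicated.
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    best_category, best_count = None, 0
    for category, words in keyword_db.items():
        # Compare only the first n_words keywords stored for this category.
        candidates = {w.lower() for w in list(words)[:n_words]}
        count = len(candidates & tokens)
        # Strict '>' means the earliest category wins a tie (an assumption).
        if count >= min_matches and count > best_count:
            best_category, best_count = category, count
    return best_category


# Invented sample database: 11 keywords per category.
keyword_db = {
    "invoice": ["invoice", "payment", "due", "amount", "bank", "transfer",
                "receipt", "tax", "total", "billing", "account"],
    "meeting": ["meeting", "schedule", "agenda", "room", "calendar", "invite",
                "attendees", "minutes", "reschedule", "call", "conference"],
}

invoice_email = ("Please find the invoice attached. The total amount due "
                 "includes tax; payment by bank transfer to our account.")
short_email = "Can we schedule a meeting tomorrow?"

print(classify_email(invoice_email, keyword_db))  # 9 keyword matches, so "invoice"
print(classify_email(short_email, keyword_db))    # 2 keyword matches, so None
```

Using an absolute match count rather than a percentage means short messages, such as the second example, stay unclassified unless `min_matches` is lowered. That is one plausible reason the reported accuracy is sensitive to both parameters.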