Computer Science (计算机科学), 2020, Vol. 47, Issue (1): 66-71. doi: 10.11896/jsjkx.181102110

• Computer Science Theory •

Chaotic Activity Filter Method for Business Process Based on Log Automaton

LI Juan, FANG Xian-wen, WANG Li-li, LIU Xiang-wei

  1. (College of Mathematics and Big Data, Anhui University of Science and Technology, Huainan, Anhui 232001, China)
  • Received: 2018-11-16 Published: 2020-01-19
  • Corresponding author: FANG Xian-wen ([email protected])
  • About author: LI Juan, born in 1992, postgraduate. Her main research interests include Petri nets and business process management; FANG Xian-wen, born in 1975, Ph.D., professor, Ph.D. supervisor, is a member of the China Computer Federation (CCF). His main research interests include Petri nets and trusted software.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61572035, 61272153, 61402011) and the Natural Science Foundation of Anhui Province, China (1508085MF111).

Abstract: Business process event logs sometimes contain chaotic activities, that is, activities that are independent of the process state, are not subject to process constraints, and may occur at any time and at any position. Chaotic activities can seriously degrade the quality of business process mining, so filtering them has become one of the key tasks in business process management. Existing filtering methods mainly remove infrequent behavior from the event log, and such methods, which give priority to high-frequency behavior, cannot effectively filter chaotic activities out of the log. To solve this problem, a method based on log automata and entropy is proposed to filter chaotic activities from logs. First, a set of suspicious chaotic activities with high entropy is obtained by calculating the direct preset rate and direct postset rate of each activity. Then, a log automaton is constructed from the event log; the set of activities on infrequent arcs is computed from the log automaton model and intersected with the set of high-entropy activities in the log to obtain the chaotic activity set. Finally, the conditional occurrence probability and the behavioral profile are used to determine the dependency between each chaotic activity and the other activities, so as to decide whether to delete the chaotic activity from the log completely or to keep it at its correct position in the log and delete its occurrences at other positions. A case analysis verifies the effectiveness of the method.

Key words: Behavioral profile, Chaotic activity, Conditional occurrence probability, Entropy, Log automaton, Petri net

CLC Number: 

  • TP391
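
The first two steps described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the exact definitions of the direct preset/postset rates, the log automaton, and an "infrequent arc" are not given on this page, so the code makes several assumptions. The log automaton is reduced to directly-follows arcs with counts, an activity's entropy is taken as the Shannon entropy of its predecessor distribution plus that of its successor distribution, and an arc counts as infrequent when it appears in fewer than `arc_threshold` of the traces on average. All function names, the `<start>`/`<end>` markers, and both thresholds are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2

START, END = "<start>", "<end>"  # hypothetical artificial trace boundaries


def directly_follows(log):
    """Count arcs (a, b) where b directly follows a in some trace."""
    arcs = Counter()
    for trace in log:
        padded = [START, *trace, END]
        arcs.update(zip(padded, padded[1:]))
    return arcs


def entropy(counts):
    """Shannon entropy (in bits) of a frequency table."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(c / total * log2(c / total) for c in counts.values() if c)


def entropy_scores(arcs):
    """Predecessor entropy plus successor entropy for every activity."""
    preds, succs = defaultdict(Counter), defaultdict(Counter)
    for (a, b), n in arcs.items():
        succs[a][b] += n
        preds[b][a] += n
    activities = (set(preds) | set(succs)) - {START, END}
    return {x: entropy(preds[x]) + entropy(succs[x]) for x in activities}


def chaotic_candidates(log, entropy_threshold, arc_threshold):
    """Intersect high-entropy activities with activities on infrequent arcs."""
    arcs = directly_follows(log)
    scores = entropy_scores(arcs)
    high_entropy = {x for x, s in scores.items() if s > entropy_threshold}
    # Assumption: an arc is infrequent if count / number of traces is small.
    on_infrequent_arcs = {
        x
        for arc, n in arcs.items()
        if n / len(log) < arc_threshold
        for x in arc
    } - {START, END}
    return high_entropy & on_infrequent_arcs


# Toy log: the process a -> b -> c -> d, with "x" occurring at arbitrary positions.
log = [
    ("a", "b", "c", "d"),
    ("a", "x", "b", "c", "d"),
    ("a", "b", "x", "c", "d"),
    ("a", "b", "c", "x", "d"),
    ("a", "b", "c", "d"),
    ("x", "a", "b", "c", "d"),
    ("a", "b", "c", "d", "x"),
    ("a", "b", "c", "d"),
]
print(chaotic_candidates(log, entropy_threshold=2.0, arc_threshold=0.25))
```

On this toy log every activity touches at least one infrequent arc, because `x` can appear next to any of them, so the infrequent-arc set alone would flag the whole process. Only `x` also has a high entropy, which is why the intersection isolates it. The third step (deciding via conditional occurrence probability and behavioral profile whether to delete `x` everywhere or only at some positions) is not covered by this sketch.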