缠丝猫 posted on 2024-8-17 17:55:59

Credit Card Fraud Detection (credit card fraud detection dataset)

Credit Card Fraud Detection

   Anonymized credit card transactions labeled as fraudulent or genuine

   About Dataset
   
   Context
   It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase.
   Content
   The dataset contains transactions made by credit cards in September 2013 by European cardholders.
   This dataset presents transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
   It contains only numerical input variables, which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features or more background information about the data. Features V1, V2, … V28 are the principal components obtained with PCA; the only features that have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. Feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature 'Class' is the response variable: it takes the value 1 in case of fraud and 0 otherwise.
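The layout and the imbalance figure described above can be checked with a few lines of Python. This is only a sketch: the `SAMPLE` rows are invented for illustration (they are not taken from the dataset and show only V1 and V2 instead of V1 to V28), and the helper names `fraud_fraction` and `load_rows` are hypothetical.

```python
import csv
import io

# Invented miniature sample in the column layout described above.
# The real file has V1..V28; these values are NOT real transactions.
SAMPLE = """Time,V1,V2,Amount,Class
0,0.10,-0.20,12.50,0
30,1.10,0.30,3.00,0
400,-2.00,1.50,99.90,1
7200,0.50,-0.10,20.00,0
"""

def fraud_fraction(n_fraud, n_total):
    """Share of transactions labeled as fraud (Class == 1)."""
    return n_fraud / n_total

def load_rows(text):
    """Parse CSV text into dicts of floats, with 'Class' as an int label."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        row = {key: float(value) for key, value in rec.items()}
        row["Class"] = int(row["Class"])
        rows.append(row)
    return rows

rows = load_rows(SAMPLE)
sample_fraction = fraud_fraction(sum(r["Class"] for r in rows), len(rows))

# 'Time' is seconds since the first transaction, so dividing by 3600 gives hours.
hours = [r["Time"] / 3600 for r in rows]

# Fraction implied by the counts quoted in the description (492 of 284,807).
print(f"{fraud_fraction(492, 284_807):.4%}")
```

With the counts from the description, the printed fraction is roughly 0.17%, in line with the figure quoted above.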
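   Given the class imbalance ratio, we recommend measuring performance using the Area Under the Precision-Recall Curve (AUPRC). Accuracy computed from the confusion matrix is not meaningful for unbalanced classification: a classifier that labels every transaction as genuine would already reach about 99.8% accuracy.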
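To make the recommendation concrete, here is a minimal pure-Python sketch that contrasts plain accuracy with a step-wise estimate of the area under the precision-recall curve (often called average precision). The function names and the toy scores are assumptions for illustration, not part of the dataset or of any library. In practice a library routine such as scikit-learn's `average_precision_score` would normally be used instead.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    Ranks examples by descending score and averages the precision
    measured at the rank of each positive example. Tied scores are
    kept in input order here; library implementations usually treat
    ties as a single threshold, so results can differ when ties exist.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos

# A "classifier" that calls everything genuine, using the counts quoted above.
labels = [1] * 492 + [0] * (284_807 - 492)
all_genuine = [0] * len(labels)
baseline_accuracy = accuracy(labels, all_genuine)

# Toy scores (invented) to show the ranking-based metric.
toy_labels = [0, 0, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.9]
toy_ap = average_precision(toy_labels, toy_scores)

print(f"baseline accuracy: {baseline_accuracy:.4f}")
print(f"toy average precision: {toy_ap:.2f}")
```

The all-genuine baseline detects no fraud at all yet scores very high on accuracy, which is why a ranking-based metric over the positive class is the more informative choice here.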
   Update (03/05/2021)
   A simulator for transaction data has been released as part of the practical handbook on Machine Learning for Credit Card Fraud Detection - https://fraud-detection-handbook.github.io/fraud-detection-handbook/Chapter_3_GettingStarted/SimulatedDataset.html. We invite all practitioners interested in fraud detection datasets to also check out this data simulator, and the methodologies for credit card fraud detection presented in the book.
   Acknowledgements
   The dataset has been collected and analysed during a research collaboration of Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles) on big data mining and fraud detection.
   More details on current and past projects on related topics are available at https://www.researchgate.net/project/Fraud-detection-5 and on the page of the DefeatFraud project.
   Please cite the following works:
   Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015
   Dal Pozzolo, Andrea; Caelen, Olivier; Le Borgne, Yann-Ael; Waterschoot, Serge; Bontempi, Gianluca. Learned lessons in credit card fraud detection from a practitioner perspective, Expert systems with applications, 41, 10, 4915-4928, 2014, Pergamon
   Dal Pozzolo, Andrea; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca. Credit card fraud detection: a realistic modeling and a novel learning strategy, IEEE transactions on neural networks and learning systems, 29, 8, 3784-3797, 2018, IEEE
   Dal Pozzolo, Andrea. Adaptive Machine learning for credit card fraud detection, ULB MLG PhD thesis (supervised by G. Bontempi)
   Carcillo, Fabrizio; Dal Pozzolo, Andrea; Le Borgne, Yann-Aël; Caelen, Olivier; Mazzer, Yannis; Bontempi, Gianluca. Scarff: a scalable framework for streaming credit card fraud detection with Spark, Information fusion, 41, 182-194, 2018, Elsevier
   Carcillo, Fabrizio; Le Borgne, Yann-Aël; Caelen, Olivier; Bontempi, Gianluca. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization, International Journal of Data Science and Analytics, 5, 4, 285-300, 2018, Springer International Publishing
   Bertrand Lebichot, Yann-Aël Le Borgne, Liyun He, Frederic Oblé, Gianluca Bontempi. Deep-Learning Domain Adaptation Techniques for Credit Cards Fraud Detection, INNSBDDL 2019: Recent Advances in Big Data and Deep Learning, pp 78-88, 2019
   Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Frederic Oblé, Gianluca Bontempi. Combining Unsupervised and Supervised Learning in Credit Card Fraud Detection, Information Sciences, 2019
   Yann-Aël Le Borgne, Gianluca Bontempi. Reproducible machine Learning for Credit Card Fraud Detection - Practical Handbook
   Bertrand Lebichot, Gianmarco Paldino, Wissam Siblini, Liyun He, Frederic Oblé, Gianluca Bontempi. Incremental learning strategies for credit cards fraud detection, International Journal of Data Science and Analytics
   
