首页 > 最新文献

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining最新文献

英文 中文
Deep Reinforcement Learning with Applications in Transportation 深度强化学习在交通运输中的应用
Zhiwei Qin, Jian Tang, Jieping Ye
This tutorial aims to provide the audience with a guided introduction to deep reinforcement learning (DRL) with specially curated application case studies in transportation. The tutorial covers both theory and practice, with more emphasis on the practical aspects of DRL that are pertinent to tackle transportation challenges. Some core examples include online ride order dispatching, fleet management, traffic signals control, route planning, and autonomous driving.
本教程旨在为读者提供深度强化学习(DRL)的引导性介绍,并特别策划了交通运输中的应用案例研究。本教程涵盖理论和实践,更强调与解决交通挑战相关的DRL的实践方面。一些核心的例子包括在线乘车调度、车队管理、交通信号控制、路线规划和自动驾驶。
{"title":"Deep Reinforcement Learning with Applications in Transportation","authors":"Zhiwei Qin, Jian Tang, Jieping Ye","doi":"10.1145/3292500.3332299","DOIUrl":"https://doi.org/10.1145/3292500.3332299","url":null,"abstract":"This tutorial aims to provide the audience with a guided introduction to deep reinforcement learning (DRL) with specially curated application case studies in transportation. The tutorial covers both theory and practice, with more emphasis on the practical aspects of DRL that are pertinent to tackle transportation challenges. Some core examples include online ride order dispatching, fleet management, traffic signals control, route planning, and autonomous driving.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127588757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
Individualized Indicator for All: Stock-wise Technical Indicator Optimization with Stock Embedding 个性化指标为所有:股票明智的技术指标优化与股票嵌入
Zhige Li, Derek Yang, Li Zhao, Jiang Bian, Tao Qin, Tie-Yan Liu
As one of the most important investing approaches, technical analysis attempts to forecast stock movement by interpreting the inner rules from historic price and volume data. To address the vital noisy nature of financial market, generic technical analysis develops technical trading indicators, as mathematical summarization of historic price and volume data, to form up the foundation for robust and profitable investment strategies. However, an observation reveals that stocks with different properties have different affinities over technical indicators, which discloses a big challenge for the indicator-oriented stock selection and investment. To address this problem, in this paper, we design a Technical Trading Indicator Optimization(TTIO) framework that manages to optimize the original technical indicator by leveraging stock-wise properties. To obtain effective representations of stock properties, we propose a Skip-gram architecture to learn stock embedding inspired by a valuable knowledge repository formed by fund manager's collective investment behaviors. Based on the learned stock representations, TTIO further learns a re-scaling network to optimize the indicator's performance. Extensive experiments on real-world stock market data demonstrate that our method can obtain the very stock representations that are invaluable for technical indicator optimization since the optimized indicators can result in strong investing signals than original ones.
作为最重要的投资方法之一,技术分析试图通过解释历史价格和成交量数据的内在规律来预测股票的运动。为了解决金融市场嘈杂的本质,通用技术分析开发了技术交易指标,作为历史价格和交易量数据的数学总结,为稳健和有利可图的投资策略奠定了基础。然而,通过观察发现,不同性质的股票对技术指标的亲和力不同,这对指标导向的选股和投资提出了很大的挑战。为了解决这个问题,在本文中,我们设计了一个技术交易指标优化(TTIO)框架,该框架通过利用股票属性来优化原始技术指标。为了获得股票属性的有效表示,我们提出了一种Skip-gram架构来学习股票嵌入,该架构的灵感来自于基金经理集体投资行为形成的有价值知识库。基于学习到的股票表示,TTIO进一步学习一个重尺度网络来优化指标的性能。对真实股市数据的大量实验表明,我们的方法可以获得对技术指标优化非常宝贵的股票表征,因为优化后的指标可以产生比原始指标更强的投资信号。
{"title":"Individualized Indicator for All: Stock-wise Technical Indicator Optimization with Stock Embedding","authors":"Zhige Li, Derek Yang, Li Zhao, Jiang Bian, Tao Qin, Tie-Yan Liu","doi":"10.1145/3292500.3330833","DOIUrl":"https://doi.org/10.1145/3292500.3330833","url":null,"abstract":"As one of the most important investing approaches, technical analysis attempts to forecast stock movement by interpreting the inner rules from historic price and volume data. To address the vital noisy nature of financial market, generic technical analysis develops technical trading indicators, as mathematical summarization of historic price and volume data, to form up the foundation for robust and profitable investment strategies. However, an observation reveals that stocks with different properties have different affinities over technical indicators, which discloses a big challenge for the indicator-oriented stock selection and investment. To address this problem, in this paper, we design a Technical Trading Indicator Optimization(TTIO) framework that manages to optimize the original technical indicator by leveraging stock-wise properties. To obtain effective representations of stock properties, we propose a Skip-gram architecture to learn stock embedding inspired by a valuable knowledge repository formed by fund manager's collective investment behaviors. Based on the learned stock representations, TTIO further learns a re-scaling network to optimize the indicator's performance. Extensive experiments on real-world stock market data demonstrate that our method can obtain the very stock representations that are invaluable for technical indicator optimization since the optimized indicators can result in strong investing signals than original ones.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125393858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 39
Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space 双曲空间中基于树连续表示的梯度分层聚类
Nicholas Monath, M. Zaheer, Daniel Silva, A. McCallum, Amr Ahmed
Hierarchical clustering is typically performed using algorithmic-based optimization searching over the discrete space of trees. While these optimization methods are often effective, their discreteness restricts them from many of the benefits of their continuous counterparts, such as scalable stochastic optimization and the joint optimization of multiple objectives or components of a model (e.g. end-to-end training). In this paper, we present an approach for hierarchical clustering that searches over continuous representations of trees in hyperbolic space by running gradient descent. We compactly represent uncertainty over tree structures with vectors in the Poincare ball. We show how the vectors can be optimized using an objective related to recently proposed cost functions for hierarchical clustering (Dasgupta, 2016; Wang and Wang, 2018). Using our method with a mini-batch stochastic gradient descent inference procedure, we are able to outperform prior work on clustering millions of ImageNet images by 15 points of dendrogram purity. Further, our continuous tree representation can be jointly optimized in multi-task learning applications offering a 9 point improvement over baseline methods.
分层聚类通常使用基于算法的优化搜索树的离散空间来执行。虽然这些优化方法通常是有效的,但它们的离散性限制了它们获得连续优化方法的许多好处,例如可扩展的随机优化和模型的多个目标或组件的联合优化(例如端到端训练)。在本文中,我们提出了一种通过运行梯度下降来搜索双曲空间中树的连续表示的分层聚类方法。我们用庞加莱球中的向量简洁地表示树结构上的不确定性。我们展示了如何使用与最近提出的分层聚类成本函数相关的目标来优化向量(Dasgupta, 2016;Wang and Wang, 2018)。使用我们的方法和一个小批量随机梯度下降推理过程,我们能够在数以百万计的ImageNet图像的聚类上比之前的工作高出15个点的树图纯度。此外,我们的连续树表示可以在多任务学习应用中共同优化,比基线方法提高了9个点。
{"title":"Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space","authors":"Nicholas Monath, M. Zaheer, Daniel Silva, A. McCallum, Amr Ahmed","doi":"10.1145/3292500.3330997","DOIUrl":"https://doi.org/10.1145/3292500.3330997","url":null,"abstract":"Hierarchical clustering is typically performed using algorithmic-based optimization searching over the discrete space of trees. While these optimization methods are often effective, their discreteness restricts them from many of the benefits of their continuous counterparts, such as scalable stochastic optimization and the joint optimization of multiple objectives or components of a model (e.g. end-to-end training). In this paper, we present an approach for hierarchical clustering that searches over continuous representations of trees in hyperbolic space by running gradient descent. We compactly represent uncertainty over tree structures with vectors in the Poincare ball. We show how the vectors can be optimized using an objective related to recently proposed cost functions for hierarchical clustering (Dasgupta, 2016; Wang and Wang, 2018). Using our method with a mini-batch stochastic gradient descent inference procedure, we are able to outperform prior work on clustering millions of ImageNet images by 15 points of dendrogram purity. Further, our continuous tree representation can be jointly optimized in multi-task learning applications offering a 9 point improvement over baseline methods.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125960383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 72
PerDREP PerDREP
Sanjoy Dey, Ping Zhang, D. Sow, Kenney Ng
In contrast to the one-size-fits-all approach to medicine, precision medicine will allow targeted prescriptions based on the specific profile of the patient thereby avoiding adverse reactions and ineffective but expensive treatments. Longitudinal observational data such as Electronic Health Records (EHRs) have become an emerging data source for personalized medicine. In this paper, we propose a unified computational framework, called PerDREP, to predict the unique response patterns of each individual patient from EHR data. PerDREP models individual responses of each patient to the drug exposure by introducing a linear system to account for patients' heterogeneity, and incorporates a patient similarity graph as a network regularization. We formulate PerDREP as a convex optimization problem and develop an iterative gradient descent method to solve it. In the experiments, we identify the effect of drugs on Glycated hemoglobin test results. The experimental results provide evidence that the proposed method is not only more accurate than state-of-the-art methods, but is also able to automatically cluster patients into multiple coherent groups, thus paving the way for personalized medicine.
{"title":"PerDREP","authors":"Sanjoy Dey, Ping Zhang, D. Sow, Kenney Ng","doi":"10.1145/3292500.3330928","DOIUrl":"https://doi.org/10.1145/3292500.3330928","url":null,"abstract":"In contrast to the one-size-fits-all approach to medicine, precision medicine will allow targeted prescriptions based on the specific profile of the patient thereby avoiding adverse reactions and ineffective but expensive treatments. Longitudinal observational data such as Electronic Health Records (EHRs) have become an emerging data source for personalized medicine. In this paper, we propose a unified computational framework, called PerDREP, to predict the unique response patterns of each individual patient from EHR data. PerDREP models individual responses of each patient to the drug exposure by introducing a linear system to account for patients' heterogeneity, and incorporates a patient similarity graph as a network regularization. We formulate PerDREP as a convex optimization problem and develop an iterative gradient descent method to solve it. In the experiments, we identify the effect of drugs on Glycated hemoglobin test results. The experimental results provide evidence that the proposed method is not only more accurate than state-of-the-art methods, but is also able to automatically cluster patients into multiple coherent groups, thus paving the way for personalized medicine.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126103787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
Community Detection on Large Complex Attribute Network 大型复杂属性网络的社区检测
Chen Zhe, Aixin Sun, Xiaokui Xiao
A large payment network contains millions of merchants and billions of transactions, and the merchants are described in a large number of attributes with incomplete values. Understanding its community structures is crucial to ensure its sustainable and long lasting. Knowing a merchant's community is also important from many applications - risk management, compliance, legal and marketing. To detect communities, an algorithm has to take advances from both attribute and topological information. Further, the method has to be able to handle incomplete and complex attributes. In this paper, we propose a framework named AGGMMR to effectively address the challenges come from scalability, mixed attributes, and incomplete value. We evaluate our proposed framework on four benchmark datasets against five strong baselines. More importantly, we provide a case study of running AGGMMR on a large network from PayPal which contains $100 million$ merchants with $1.5 billion$ transactions. The results demonstrate AGGMMR's effectiveness and practicability.
一个庞大的支付网络包含数百万商家和数十亿笔交易,而商家是用大量不完全值的属性来描述的。了解其社区结构是确保其可持续和持久的关键。了解商家的社区在许多应用中也很重要——风险管理、合规、法律和营销。为了检测社区,算法必须同时利用属性信息和拓扑信息。此外,该方法必须能够处理不完整和复杂的属性。在本文中,我们提出了一个名为AGGMMR的框架来有效地解决可扩展性、混合属性和不完整值带来的挑战。我们在四个基准数据集和五个强基线上评估了我们提出的框架。更重要的是,我们提供了一个在PayPal的大型网络上运行AGGMMR的案例研究,该网络包含价值1亿美元的商家和15亿美元的交易。结果证明了AGGMMR的有效性和实用性。
{"title":"Community Detection on Large Complex Attribute Network","authors":"Chen Zhe, Aixin Sun, Xiaokui Xiao","doi":"10.1145/3292500.3330721","DOIUrl":"https://doi.org/10.1145/3292500.3330721","url":null,"abstract":"A large payment network contains millions of merchants and billions of transactions, and the merchants are described in a large number of attributes with incomplete values. Understanding its community structures is crucial to ensure its sustainable and long lasting. Knowing a merchant's community is also important from many applications - risk management, compliance, legal and marketing. To detect communities, an algorithm has to take advances from both attribute and topological information. Further, the method has to be able to handle incomplete and complex attributes. In this paper, we propose a framework named AGGMMR to effectively address the challenges come from scalability, mixed attributes, and incomplete value. We evaluate our proposed framework on four benchmark datasets against five strong baselines. More importantly, we provide a case study of running AGGMMR on a large network from PayPal which contains $100 million$ merchants with $1.5 billion$ transactions. The results demonstrate AGGMMR's effectiveness and practicability.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128092369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 43
Scalable Hierarchical Clustering with Tree Grafting 基于树嫁接的可伸缩分层聚类
Nicholas Monath, Ari Kobren, A. Krishnamurthy, Michael R. Glass, A. McCallum
We introduce Grinch, a new algorithm for large-scale, non-greedy hierarchical clustering with general linkage functions that compute arbitrary similarity between two point sets. The key components of Grinch are its rotate and graft subroutines that efficiently reconfigure the hierarchy as new points arrive, supporting discovery of clusters with complex structure. Grinch is motivated by a new notion of separability for clustering with linkage functions: we prove that when the linkage function is consistent with a ground-truth clustering, Grinch is guaranteed to produce a cluster tree containing the ground-truth, independent of data arrival order. Our empirical results on benchmark and author coreference datasets (with standard and learned linkage functions) show that Grinch is more accurate than other scalable methods, and orders of magnitude faster than hierarchical agglomerative clustering.
本文介绍了一种新的Grinch算法,该算法用于计算两个点集之间的任意相似度,具有一般的链接函数。Grinch的关键组成部分是它的旋转和嫁接子程序,这些子程序可以在新点到达时有效地重新配置层次结构,支持发现具有复杂结构的簇。Grinch的动机是一个新的可分性概念,用于与链接函数聚类:我们证明了当链接函数与一个基础真聚类一致时,Grinch保证生成一个包含基础真聚类树,与数据到达顺序无关。我们在基准和作者共同参考数据集(具有标准和学习链接函数)上的实证结果表明,Grinch比其他可扩展方法更准确,并且比分层凝聚聚类快几个数量级。
{"title":"Scalable Hierarchical Clustering with Tree Grafting","authors":"Nicholas Monath, Ari Kobren, A. Krishnamurthy, Michael R. Glass, A. McCallum","doi":"10.1145/3292500.3330929","DOIUrl":"https://doi.org/10.1145/3292500.3330929","url":null,"abstract":"We introduce Grinch, a new algorithm for large-scale, non-greedy hierarchical clustering with general linkage functions that compute arbitrary similarity between two point sets. The key components of Grinch are its rotate and graft subroutines that efficiently reconfigure the hierarchy as new points arrive, supporting discovery of clusters with complex structure. Grinch is motivated by a new notion of separability for clustering with linkage functions: we prove that when the linkage function is consistent with a ground-truth clustering, Grinch is guaranteed to produce a cluster tree containing the ground-truth, independent of data arrival order. Our empirical results on benchmark and author coreference datasets (with standard and learned linkage functions) show that Grinch is more accurate than other scalable methods, and orders of magnitude faster than hierarchical agglomerative clustering.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125923592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 34
Learning Class-Conditional GANs with Active Sampling 学习类主动采样条件gan
Ming-Kun Xie, Sheng-Jun Huang
Class-conditional variants of Generative adversarial networks (GANs) have recently achieved a great success due to its ability of selectively generating samples for given classes, as well as improving generation quality. However, its training requires a large set of class-labeled data, which is often expensive and difficult to collect in practice. In this paper, we propose an active sampling method to reduce the labeling cost for effectively training the class-conditional GANs. On one hand, the most useful examples are selected for external human labeling to jointly reduce the difficulty of model learning and alleviate the missing of adversarial training; on the other hand, fake examples are actively sampled for internal model retraining to enhance the adversarial training between the discriminator and generator. By incorporating the two strategies into a unified framework, we provide a cost-effective approach to train class-conditional GANs, which achieves higher generation quality with less training examples. Experiments on multiple datasets, diverse GAN configurations and various metrics demonstrate the effectiveness of our approaches.
生成对抗网络(GANs)的类条件变体最近取得了巨大的成功,因为它能够为给定的类选择性地生成样本,并提高生成质量。然而,它的训练需要大量的类别标记数据,这些数据在实践中通常是昂贵且难以收集的。在本文中,我们提出了一种主动采样方法,以减少有效训练类条件gan的标记成本。一方面,选择最有用的例子进行外部人工标注,共同降低模型学习的难度,缓解对抗性训练的缺失;另一方面,主动采样假样本进行内部模型再训练,增强鉴别器和生成器之间的对抗性训练。通过将这两种策略整合到一个统一的框架中,我们提供了一种具有成本效益的训练类条件gan的方法,该方法可以用更少的训练样本获得更高的生成质量。在多个数据集、不同GAN配置和不同度量上的实验证明了我们方法的有效性。
{"title":"Learning Class-Conditional GANs with Active Sampling","authors":"Ming-Kun Xie, Sheng-Jun Huang","doi":"10.1145/3292500.3330883","DOIUrl":"https://doi.org/10.1145/3292500.3330883","url":null,"abstract":"Class-conditional variants of Generative adversarial networks (GANs) have recently achieved a great success due to its ability of selectively generating samples for given classes, as well as improving generation quality. However, its training requires a large set of class-labeled data, which is often expensive and difficult to collect in practice. In this paper, we propose an active sampling method to reduce the labeling cost for effectively training the class-conditional GANs. On one hand, the most useful examples are selected for external human labeling to jointly reduce the difficulty of model learning and alleviate the missing of adversarial training; on the other hand, fake examples are actively sampled for internal model retraining to enhance the adversarial training between the discriminator and generator. By incorporating the two strategies into a unified framework, we provide a cost-effective approach to train class-conditional GANs, which achieves higher generation quality with less training examples. Experiments on multiple datasets, diverse GAN configurations and various metrics demonstrate the effectiveness of our approaches.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124258918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
Seeker 寻的器
Ari Biswas, T. T. Pham, Michael Vogelsong, Benjamin Snyder, Houssam Nassif
Seeker®, our interactive application security testing solution, gives you unparalleled visibility into your web app security posture and identifies vulnerability trends against compliance standards (e.g., OWASP Top 10, PCI DSS, GDPR, and CWE/SANS Top 25). Seeker enables security teams to identify and track sensitive data to ensure that it is handled securely and not stored in log files or databases with weak or no encryption. Seeker’s seamless integration into CI/CD workflows enables fast IAST security testing at DevOps speed.
Seeker®是我们的交互式应用程序安全测试解决方案,可为您提供无与伦比的web应用程序安全状态可视性,并根据合规性标准(例如,OWASP十大,PCI DSS, GDPR和CWE/SANS 25强)识别漏洞趋势。Seeker使安全团队能够识别和跟踪敏感数据,以确保其安全处理,而不是存储在日志文件或数据库中,没有加密或没有加密。Seeker无缝集成到CI/CD工作流程中,可以以DevOps的速度快速进行IAST安全测试。
{"title":"Seeker","authors":"Ari Biswas, T. T. Pham, Michael Vogelsong, Benjamin Snyder, Houssam Nassif","doi":"10.1145/3292500.3330733","DOIUrl":"https://doi.org/10.1145/3292500.3330733","url":null,"abstract":"Seeker®, our interactive application security testing solution, gives you unparalleled visibility into your web app security posture and identifies vulnerability trends against compliance standards (e.g., OWASP Top 10, PCI DSS, GDPR, and CWE/SANS Top 25). Seeker enables security teams to identify and track sensitive data to ensure that it is handled securely and not stored in log files or databases with weak or no encryption. Seeker’s seamless integration into CI/CD workflows enables fast IAST security testing at DevOps speed.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"186 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123512740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
AccuAir
Zhipeng Luo, Jianqiang Huang, Ke Hu, Xue Li, Peng Zhang
Since air pollution seriously affects human heath and daily life, the air quality prediction has attracted increasing attention and become an active and important research topic. In this paper, we present AccuAir, our winning solution to the KDD Cup 2018 of Fresh Air, where the proposed solution has won the 1st place in two tracks, and the 2nd place in the other one. Our solution got the best accuracy on average in all the evaluation days. The task is to accurately predict the air quality (as indicated by the concentration of PM2.5, PM10 or O3) of the next 48 hours for each monitoring station in Beijing and London. Aiming at a cutting-edge solution, we first presents an analysis of the air quality data, identifying the fundamental challenges, such as the long-term but suddenly changing air quality, and complex spatial-temporal correlations in different stations. To address the challenges, we carefully design both global and local air quality features, and develop three prediction models including LightGBM, Gated-DNN and Seq2Seq, each with novel ingredients developed for better solving the problem. Specifically, a spatial-temporal gate is proposed in our Gated-DNN model, to effectively capture the spatial-temporal correlations as well as temporal relatedness, making the prediction more sensitive to spatial and temporal signals. In addition, the Seq2Seq model is adapted in such a way that the encoder summarizes useful historical features while the decoder concatenate weather forecast as input, which significantly improves prediction accuracy. Assembling all these components together, the ensemble of three models outperforms all competing methods in terms of the prediction accuracy of 31 days average, 10 days average and 24-48 hours.
{"title":"AccuAir","authors":"Zhipeng Luo, Jianqiang Huang, Ke Hu, Xue Li, Peng Zhang","doi":"10.1145/3292500.3330787","DOIUrl":"https://doi.org/10.1145/3292500.3330787","url":null,"abstract":"Since air pollution seriously affects human heath and daily life, the air quality prediction has attracted increasing attention and become an active and important research topic. In this paper, we present AccuAir, our winning solution to the KDD Cup 2018 of Fresh Air, where the proposed solution has won the 1st place in two tracks, and the 2nd place in the other one. Our solution got the best accuracy on average in all the evaluation days. The task is to accurately predict the air quality (as indicated by the concentration of PM2.5, PM10 or O3) of the next 48 hours for each monitoring station in Beijing and London. Aiming at a cutting-edge solution, we first presents an analysis of the air quality data, identifying the fundamental challenges, such as the long-term but suddenly changing air quality, and complex spatial-temporal correlations in different stations. To address the challenges, we carefully design both global and local air quality features, and develop three prediction models including LightGBM, Gated-DNN and Seq2Seq, each with novel ingredients developed for better solving the problem. Specifically, a spatial-temporal gate is proposed in our Gated-DNN model, to effectively capture the spatial-temporal correlations as well as temporal relatedness, making the prediction more sensitive to spatial and temporal signals. In addition, the Seq2Seq model is adapted in such a way that the encoder summarizes useful historical features while the decoder concatenate weather forecast as input, which significantly improves prediction accuracy. Assembling all these components together, the ensemble of three models outperforms all competing methods in terms of the prediction accuracy of 31 days average, 10 days average and 24-48 hours.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123713527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 34
4 Perspectives in Human-Centered Machine Learning 以人为中心的机器学习的4个视角
Carlos Guestrin
Machine learning (ML) has had a tremendous impact in across the world over the last decade. As we think about ML solving complex tasks, sometimes at super-human levels, it is easy to forget that there is no machine learning without humans in the loop. Humans define tasks and metrics, develop and program algorithms, collect and label data, debug and optimize systems, and are (usually) ultimately the users of the ML-based applications we are developing. In this talk, we will cover 4 human-centered perspectives in the ML development process, along with methods and systems, to empower humans to maximize the ultimate impact of their ML-based applications.
在过去的十年里,机器学习(ML)在世界各地产生了巨大的影响。当我们想到机器学习解决复杂的任务时,有时会达到超人的水平,我们很容易忘记,没有人类的参与就没有机器学习。人类定义任务和指标,开发和编程算法,收集和标记数据,调试和优化系统,并且(通常)是我们正在开发的基于ml的应用程序的最终用户。在本次演讲中,我们将介绍机器学习开发过程中的4个以人为中心的观点,以及方法和系统,以使人类能够最大限度地发挥基于机器学习的应用程序的最终影响。
{"title":"4 Perspectives in Human-Centered Machine Learning","authors":"Carlos Guestrin","doi":"10.1145/3292500.3340399","DOIUrl":"https://doi.org/10.1145/3292500.3340399","url":null,"abstract":"Machine learning (ML) has had a tremendous impact in across the world over the last decade. As we think about ML solving complex tasks, sometimes at super-human levels, it is easy to forget that there is no machine learning without humans in the loop. Humans define tasks and metrics, develop and program algorithms, collect and label data, debug and optimize systems, and are (usually) ultimately the users of the ML-based applications we are developing. In this talk, we will cover 4 human-centered perspectives in the ML development process, along with methods and systems, to empower humans to maximize the ultimate impact of their ML-based applications.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121935685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
期刊
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
全部 Acc. Chem. Res. ACS Applied Bio Materials ACS Appl. Electron. Mater. ACS Appl. Energy Mater. ACS Appl. Mater. Interfaces ACS Appl. Nano Mater. ACS Appl. Polym. Mater. ACS BIOMATER-SCI ENG ACS Catal. ACS Cent. Sci. ACS Chem. Biol. ACS Chemical Health & Safety ACS Chem. Neurosci. ACS Comb. Sci. ACS Earth Space Chem. ACS Energy Lett. ACS Infect. Dis. ACS Macro Lett. ACS Mater. Lett. ACS Med. Chem. Lett. ACS Nano ACS Omega ACS Photonics ACS Sens. ACS Sustainable Chem. Eng. ACS Synth. Biol. Anal. Chem. BIOCHEMISTRY-US Bioconjugate Chem. BIOMACROMOLECULES Chem. Res. Toxicol. Chem. Rev. Chem. Mater. CRYST GROWTH DES ENERG FUEL Environ. Sci. Technol. Environ. Sci. Technol. Lett. Eur. J. Inorg. Chem. IND ENG CHEM RES Inorg. Chem. J. Agric. Food. Chem. J. Chem. Eng. Data J. Chem. Educ. J. Chem. Inf. Model. J. Chem. Theory Comput. J. Med. Chem. J. Nat. Prod. J PROTEOME RES J. Am. Chem. Soc. LANGMUIR MACROMOLECULES Mol. Pharmaceutics Nano Lett. Org. Lett. ORG PROCESS RES DEV ORGANOMETALLICS J. Org. Chem. J. Phys. Chem. J. Phys. Chem. A J. Phys. Chem. B J. Phys. Chem. C J. Phys. Chem. Lett. Analyst Anal. Methods Biomater. Sci. Catal. Sci. Technol. Chem. Commun. Chem. Soc. Rev. CHEM EDUC RES PRACT CRYSTENGCOMM Dalton Trans. Energy Environ. Sci. ENVIRON SCI-NANO ENVIRON SCI-PROC IMP ENVIRON SCI-WAT RES Faraday Discuss. Food Funct. Green Chem. Inorg. Chem. Front. Integr. Biol. J. Anal. At. Spectrom. J. Mater. Chem. A J. Mater. Chem. B J. Mater. Chem. C Lab Chip Mater. Chem. Front. Mater. Horiz. MEDCHEMCOMM Metallomics Mol. Biosyst. Mol. Syst. Des. Eng. Nanoscale Nanoscale Horiz. Nat. Prod. Rep. New J. Chem. Org. Biomol. Chem. Org. Chem. Front. PHOTOCH PHOTOBIO SCI PCCP Polym. Chem.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1