Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis: Latest Articles

ATP-OIE: An Autonomous Open Information Extraction Method
J. M. Rodríguez, H. Merlino, Patricia Pesado
This paper describes an innovative Open Information Extraction method called ATP-OIE. It uses extraction patterns to find semantic relations. These patterns are generated automatically from examples, which gives the method greater autonomy than methods based on fixed rules. When ATP-OIE cannot find a valid semantic relation in a sentence, it can call on other methods, ReVerb and ClausIE, which improves its recall. In these cases it can also generate new extraction patterns online, which further improves its autonomy. It implements several mechanisms to prevent common errors in the extraction of semantic relations. Lastly, ATP-OIE was compared with other state-of-the-art methods on a well-known text collection, Reuters-21578, and obtained higher precision than the other methods.
Published: 2020-03-09. DOI: https://doi.org/10.1145/3388142.3388166
Citations: 0
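The fallback strategy described in the ATP-OIE abstract (try the method's own extraction patterns first, and call other extractors only when no relation is found) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the single regular-expression pattern, the `toy_fallback` function standing in for an external extractor such as ReVerb or ClausIE, and the triple format. It does not reproduce the authors' pattern representation, and the online pattern generation step is left out.

```python
import re
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (argument 1, relation, argument 2)

# Hypothetical toy pattern; the real method learns its patterns from examples.
PATTERNS = [
    re.compile(r"^(?P<arg1>[A-Z]\w*) (?P<rel>acquired|is based in) (?P<arg2>[A-Z]\w*)\.?$")
]


def extract_with_patterns(sentence: str) -> List[Triple]:
    """Return every triple matched by the known extraction patterns."""
    triples = []
    for pattern in PATTERNS:
        match = pattern.match(sentence)
        if match:
            triples.append((match["arg1"], match["rel"], match["arg2"]))
    return triples


def toy_fallback(sentence: str) -> List[Triple]:
    """Hypothetical stand-in for an external extractor (not ReVerb or ClausIE)."""
    words = sentence.rstrip(".").split()
    if len(words) == 3:
        return [(words[0], words[1], words[2])]
    return []


def extract(sentence: str, fallbacks: List[Callable[[str], List[Triple]]]) -> List[Triple]:
    """Use the pattern-based extractor first; try fallbacks only if it finds nothing."""
    triples = extract_with_patterns(sentence)
    if triples:
        return triples
    for fallback in fallbacks:
        triples = fallback(sentence)
        if triples:
            return triples
    return []
```

Calling `extract("Alice likes Bob.", [toy_fallback])` finds no pattern match and so returns whatever the fallback produces, which is the recall-improving behaviour the abstract describes.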
Adaptation of RF and CNN on Spark
Y. Kou, Zhi Hong, Yun Tian, S. Wang
Biological images are used in many applications, most of which are important in the medical field. For example, MRI and CT scans produce high-resolution images that are critical for diagnosing cancers and other organ malfunctions. High-resolution ultrasound images can now provide enough detail to examine blood vessel blockage. Another type of biological image shows mixed patterns of proteins in Human Protein Atlas microscope images. Because an enormous amount of image data is available even within a single medical organization, machine learning and deep learning have been used to assist in image data analysis. Spark is a computing framework that has been shown to speed up data analysis dramatically. However, Spark Scala doesn't fully support deep learning algorithms. In this paper, we present a case study of adapting Random Forest (RF) and Convolutional Neural Network (CNN) models to the Spark Scala framework. These algorithms were applied to multi-class, multi-label classification on a biological dataset from Kagglers. The experimental results show that both RF and CNN can be implemented with Spark Scala and achieve extremely high throughput.
Published: 2020-03-09. DOI: https://doi.org/10.1145/3388142.3388157
Citations: 0
Wellhead Compressor Failure Prediction Using Attention-based Bidirectional LSTMs with Data Reduction Techniques
Wirasak Chomphu, B. Kijsirikul
In the offshore oil and gas industry, petroleum in each well of a remote wellhead platform (WHP) flows naturally from the ground to the sales delivery point. However, when the oil pressure drops or the well is nearly depleted, the flow rate up to the WHP declines. Installing a Wellhead Compressor (WC) on the WHP is the solution [9]. The WC acts locally on the selected wells and reduces back pressure, thereby substantially enhancing the efficiency of oil and gas recovery [21]. The WC sensors transmit data back to the historian time series database, and intelligent alarm systems are used as a critical tool to minimize unscheduled downtime, which adversely affects production reliability, and to reduce the monitoring time and cost burden on operating engineers. In this paper, an Attention-Based Bidirectional Long Short-Term Memory (ABD-LSTM) model is presented for WC failure prediction. We also propose feature extraction and data reduction techniques as complementary methods to make training on a large-scale dataset more effective. We evaluate the model on real WC sensor data. Compared with other Machine Learning (ML) algorithms, our proposed methodology is more powerful and accurate. Our proposed ABD-LSTM achieved an optimal F1 score of 85.28%.
Published: 2020-03-09. DOI: https://doi.org/10.1145/3388142.3388154
Citations: 4
Ideology Detection of Personalized Political News Coverage: A New Dataset
Khudran Alzhrani
Word selection, writing style, cherry-picking of stories, and many other factors play a role in framing news articles to fit the targeted audience or to align with the authors' beliefs. Hence, reporting facts alone is not evidence of bias-free journalism. Since the 2016 United States presidential election, researchers have focused on the media's influence on election results. News media attention has shifted from political parties to candidates. The news media shapes public perception of political candidates through news personalization. Despite its criticality, we are not aware of any studies that have examined news personalization from the machine learning or deep neural network perspective. In addition, some candidates accuse the media of favoritism that jeopardizes their chances of winning elections. Multiple methods have been introduced to place news sources on one side of the political spectrum or the other, yet the mainstream media claims to be unbiased. Therefore, to avoid inaccurate assumptions, only news sources that have clearly stated their political affiliation are included in this research. In this paper, we constructed two datasets from news articles written about the last two U.S. presidents, labeled by the news websites' political affiliation. Multiple intelligent models were developed to automatically predict the political affiliation of an unseen personalized article. The main objective of these models is to detect the political ideology of personalized news articles. Although the newly constructed datasets are highly imbalanced, the performance of the intelligent models is reasonably good. The results of the intelligent models are reported with a comparative analysis.
Published: 2020-03-09. DOI: https://doi.org/10.1145/3388142.3388149
Citations: 2
Text mining for incoming tasks based on the urgency/importance factors and task classification using machine learning tools
Y. Alshehri
In workplaces, there is a massive amount of unstructured data from different sources. In this paper, we present a case study that explains how communications between employees can be used to prioritize task requests and so increase the efficiency of both technical and non-technical workers. This involves managing daily incoming tasks based on their level of urgency and importance. To allow all workers to use the urgency-importance matrix as a time-management tool, we need to automate it. The textual content of incoming tasks is analyzed, and metrics related to urgency and importance are extracted. A third factor (i.e., the response variable) is defined based on the two input variables (urgency and importance). Machine learning is then applied to the data to predict the class of each incoming task based on the desired outcome. We used ordinal regression, neural networks, and decision tree algorithms to predict the four levels of task priority. We measure the performance of all classifiers using recall, precision, and F-score. All classifiers score above 89% on all measures.
Published: 2020-03-09. DOI: https://doi.org/10.1145/3388142.3388153
Citations: 0
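The urgency-importance matrix in the task-classification abstract maps two inputs to one of four priority levels. A rough Python sketch of that mapping follows. The keyword lists, the `classify_task` helper, and the ordering of the two middle levels (important before urgent) are assumptions chosen for illustration; the paper extracts its own metrics from text and trains classifiers rather than applying fixed rules.

```python
import re

# Hypothetical keyword lists; the paper derives urgency and importance metrics from text.
URGENT_TERMS = {"asap", "today", "immediately", "deadline"}
IMPORTANT_TERMS = {"client", "contract", "outage", "payroll"}


def priority_level(urgent: bool, important: bool) -> int:
    """Map the two matrix inputs to a priority level, 1 (highest) to 4 (lowest)."""
    if urgent and important:
        return 1
    if important:
        return 2  # assumed ordering: important-only outranks urgent-only
    if urgent:
        return 3
    return 4


def classify_task(text: str) -> int:
    """Derive urgency and importance from keywords, then apply the matrix."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    urgent = bool(tokens & URGENT_TERMS)
    important = bool(tokens & IMPORTANT_TERMS)
    return priority_level(urgent, important)
```

In a learned version of this pipeline, the level returned by `priority_level` would serve as the response variable, and a classifier would be trained to predict it from the extracted text metrics.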
Automated Cyberbullying Detection in Social Media Using an SVM Activated Stacked Convolution LSTM Network
Thor Aleksander Buan, Raghavendra Ramachandra
Cyberbullying is becoming a huge problem on social media platforms. New statistics show that more than a fourth of Norwegian kids report that they have been cyberbullied once or more during the last year. In recent years, it has become popular to use Neural Networks to automate the detection of cyberbullying. These Neural Networks are often based on Long-Short-Term-Memory layers, used alone or in combination with other types of layers. In this thesis we present a new Neural Network design that can be used to detect traces of cyberbullying in textual media. The design is based on existing designs that combine the power of Convolutional layers with Long-Short-Term-Memory layers. In addition, our design features stacked core layers, which our research shows increase the performance of the Neural Network. The design also features a new kind of activation mechanism, referred to as "Support-Vector-Machine like activation". The "Support-Vector-Machine like activation" is achieved by applying L2 weight regularization and using a linear activation function in the activation layer together with a Hinge loss function. Our experiments show that both the stacking of the layers and the "Support-Vector-Machine like activation" increase the performance of the Neural Network over traditional State-Of-The-Art designs.
Published: 2020-03-09. DOI: https://doi.org/10.1145/3388142.3388147
Citations: 5
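The "Support-Vector-Machine like activation" in the cyberbullying abstract combines three ingredients: a linear output, a Hinge loss, and L2 weight regularization. The following Python sketch shows how those three pieces form one training objective for a single output unit. The function names, the `l2` coefficient, and the use of labels in {-1, +1} are assumptions for illustration; the convolutional and LSTM layers that would feed this unit are not shown.

```python
from typing import List, Sequence, Tuple


def linear_output(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Linear activation: the raw weighted sum, with no squashing function."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def hinge_loss(score: float, label: int) -> float:
    """Hinge loss for a label in {-1, +1}; zero once the margin is at least 1."""
    return max(0.0, 1.0 - label * score)


def svm_like_objective(
    weights: Sequence[float],
    bias: float,
    batch: List[Tuple[Sequence[float], int]],
    l2: float = 0.01,  # hypothetical regularization strength
) -> float:
    """Mean hinge loss over the batch plus an L2 penalty on the weights."""
    data_loss = sum(
        hinge_loss(linear_output(weights, bias, x), y) for x, y in batch
    ) / len(batch)
    penalty = l2 * sum(w * w for w in weights)
    return data_loss + penalty
```

Minimizing mean hinge loss plus an L2 penalty over a linear output is the soft-margin linear SVM objective, which is presumably why the abstract describes the output layer as SVM-like.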
Using Monte Carlo Simulation to Predict Captive Insurance Solvency
Lu Xiong, Don Hong
The solvency of captive insurance is the key financial metric captive managers care about. We built a solvency prediction model for a captive insurance fund using Monte Carlo simulation with the fund's historical losses, current financial data and setups. The model predicts the solvency score of the current captive fund, using the fund's survival probability as the measure of solvency. If the simulated future solvency ratios break the upper and lower bounds, we count the run as an insolvent case; otherwise, it is counted as a solvent (or survival) case. After large-scale simulation, we can approximate the future survival probability, i.e. the solvency score, of the current captive fund. The predicted income statements, balance sheets and financial ratios are also generated. We use a heat map to visualize the solvency score at each retention level, so that it can support captive insurance managers in making their decisions. The model is implemented in Excel VBA macros and MATLAB.
Published: 2020-03-09. DOI: https://doi.org/10.1145/3388142.3388171
Citations: 2
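The counting procedure in the captive insurance abstract (simulate many futures, mark a run insolvent when the solvency ratio leaves its bounds, and report the surviving fraction as the solvency score) can be sketched in Python. The loss model (normal losses truncated at zero), the definition of the ratio as surplus over annual premium, and all parameter names are assumptions made for illustration; the paper's model is built from the fund's own financial statements and is implemented in Excel VBA and MATLAB.

```python
import random


def simulate_path(rng: random.Random, surplus: float, premium: float, years: int,
                  mean_loss: float, sd_loss: float, lower: float, upper: float) -> bool:
    """Simulate one future; return True if the fund stays within bounds every year."""
    for _ in range(years):
        # Assumed loss model: normally distributed annual loss, truncated at zero.
        loss = max(0.0, rng.gauss(mean_loss, sd_loss))
        surplus += premium - loss
        ratio = surplus / premium  # hypothetical solvency ratio
        if ratio < lower or ratio > upper:
            return False  # ratio left the allowed band: count as insolvent
    return True


def solvency_score(n_paths: int, seed: int = 0, **params: float) -> float:
    """Estimate the survival probability as the fraction of surviving paths."""
    rng = random.Random(seed)
    survived = sum(simulate_path(rng, **params) for _ in range(n_paths))
    return survived / n_paths
```

A call such as `solvency_score(10_000, surplus=2.0, premium=1.0, years=5, mean_loss=0.9, sd_loss=0.4, lower=0.5, upper=5.0)` returns a value between 0 and 1. Repeating the call over a grid of retention levels would give the inputs for a heat map like the one the abstract mentions.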
A Hybrid Model of Clustering and Neural Network Using Weather Conditions for Energy Management in Buildings
Bishnu Nepal, M. Yamaha
To conserve energy in buildings, it is essential to understand the energy consumption pattern and to act on the analyzed result to reduce the energy load. In this research, we propose a method for forecasting the electricity load of university buildings using a hybrid model that combines a clustering technique with a neural network and uses weather conditions as input. The novel approach discussed in this paper clusters one whole year of data, including the forecasting day, using K-means clustering, and uses the result as an input parameter to a neural network that forecasts the electricity peak load of university buildings. The hybrid model proved to forecast better than a neural network alone. We also developed a graphical visualization platform for the analyzed results using an interactive web application built with Shiny. Forecasting the electricity peak load with appreciable accuracy several hours before peak hours, and presenting it through the Shiny application, can make the management authorities aware of the energy situation and gives them sufficient time to plan a strategy for peak load reduction. This method can also be used in demand response to reduce electricity bills by avoiding electricity usage during high-rate hours.
Published: 2020-03-09. DOI: https://doi.org/10.1145/3388142.3388172
Citations: 1
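The hybrid idea in the energy-management abstract (cluster the data with K-means, then give the cluster result to a neural network as an extra input) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it clusters a single weather variable (temperature) in one dimension with a deterministic initialization, and `build_features` is a hypothetical helper that only assembles the input vector. The neural network itself is not shown, and the paper's actual feature set is not given in the abstract.

```python
from typing import List, Sequence


def kmeans_1d(values: Sequence[float], k: int, iterations: int = 20) -> List[float]:
    """Plain K-means on one variable; returns the k cluster centroids."""
    ordered = sorted(values)
    # Deterministic initialization: spread the starting centroids across the sorted data.
    centroids = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        clusters: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)]
    return centroids


def assign(value: float, centroids: Sequence[float]) -> int:
    """Index of the centroid closest to the value."""
    return min(range(len(centroids)), key=lambda j: abs(value - centroids[j]))


def build_features(temperature: float, hour: int, centroids: Sequence[float]) -> List[float]:
    """Input vector for a downstream forecaster: raw inputs plus the cluster label."""
    return [temperature, float(hour), float(assign(temperature, centroids))]
```

Passing the cluster index as a single number is the simplest choice; a one-hot encoding of the label is a common alternative when the clusters have no natural order.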
Research on Automatic Generation Method of Scenario Based on Panosim
Zhang Lu, Zhibin Du, Xianglei Zhu
With the development of science and technology, L3 intelligent vehicles are gradually entering the mass production phase. Traditional testing tools and methods can hardly meet the requirements that self-driving vehicles impose in terms of multiple dimensions, high standards and big data. The scenario-based simulation test method has great technical advantages in test efficiency, verification cost and versatility, and is an important means of automatic driving test verification. However, it has shortcomings such as a long scenario construction period and highly repetitive work. This paper builds on secondary development of the automatic driving simulation software Panosim and presents automatic scenario input and rapid parameter adjustment through digital twin technology. In addition, the natural driving scenario database of the China Automotive Technology and Research Center is used for verification. The results show that this method can improve the efficiency and accuracy of scenario construction and greatly shorten the simulation test cycle.
DOI: https://doi.org/10.1145/3388142.3388165 · Published: 2020-03-09 · Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis
Citations: 1
How Different Genders Use Profanity on Twitter?
S. Wong, P. Teh, Chi-Bin Cheng
Social media is often the go-to place where people discuss their opinions and share their feelings. Because some platforms provide more anonymity than others, users have taken advantage of that privilege: from behind the screen, their use of profanity has created a toxic environment. Although not all profanity is used to offend, it is undeniable that anonymity has allowed social media users to express themselves more freely, increasing the likelihood of swearing. This study compiles the use of profanity by different gender classes. The findings show that different genders often employ swear words from different hate categories; for example, males tend to use more terms from the "disability" hate group. Classification models were developed to predict the gender of tweet authors, and the results show that profanity can be used to uncover the gender of anonymous users. This suggests that cyberbullies can be profiled by gender based on their profanity usage.
DOI: https://doi.org/10.1145/3388142.3388145 · Published: 2020-03-09 · Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis
Citations: 5