2021 13th International Conference on Machine Learning and Computing: Latest Papers

An Unsupervised Feature Learning Method for Enhancing the Generalization of Cancer Diagnosis
Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457720
Zhen Liu, Ruoyu Wang, Wen-bo Zhang, Deyu Tang
Machine learning techniques have been applied to gene expression profiling for cancer diagnosis. However, gene expression data suffer from the curse of dimensionality. Various feature selection methods have been proposed to reduce the number of features used to diagnose a specific cancer. Because samples of a particular tumor are difficult to obtain, the lack of training samples leads to overfitting. To handle these two problems, this paper proposes an unsupervised feature learning method. The method enhances the performance of unsupervised feature learning by leveraging unlabeled samples from other sources. Because it exploits knowledge shared among expression data from different sources, it can boost cancer classification performance. Experimental results on gene expression data show that our method improves the generalization of cancer diagnosis when the unlabeled data are used for unsupervised feature learning.
Citations: 1
K-CSRL: Knowledge Enhanced Conversational Semantic Role Labeling
Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457763
Boyu He, Han Wu, Congduan Li, Linqi Song, Weigang Chen
Semantic role labeling (SRL) is widely used to extract predicate-argument pairs from sentences. Traditional SRL methods perform well on single sentences but fail in dialogue scenarios, where ellipsis and anaphora occur frequently. Conversational Semantic Role Labeling (CSRL) has been proposed to solve this problem, but there is still considerable room for improvement. An error case study of a BERT-based CSRL model shows that the majority of errors occur in boundary matching, especially in entity mention detection. We believe the primary cause of this kind of error is a lack of external knowledge, so that the ill-informed model cannot correctly capture and correlate the entities. To this end, we propose to incorporate external knowledge into BERT using a visible masking strategy. We evaluate our proposed model on the DuConv dataset. Experimental results show that our model with knowledge enhancement outperforms the benchmarks. Further analysis also demonstrates that dialogue SRL can benefit from external knowledge.
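The abstract does not spell out the visible masking strategy. As a rough illustration only, the sketch below builds one common kind of visibility mask used when knowledge is injected into a BERT-style model: injected knowledge tokens are visible only to the entity token they are attached to and to each other. The function name, the token layout, and the `(anchor_index, span_length)` format are assumptions made for this example, not details from the paper.

```python
def build_visibility_mask(num_sentence_tokens, knowledge_spans):
    """Return an n x n 0/1 matrix; mask[i][j] == 1 means token i may attend to token j.

    knowledge_spans is a list of (anchor_index, span_length) pairs. Each span is a
    run of injected knowledge tokens appended after the sentence tokens and
    attached to the sentence token at anchor_index (hypothetical layout).
    """
    total = num_sentence_tokens + sum(length for _, length in knowledge_spans)
    mask = [[0] * total for _ in range(total)]

    # Sentence tokens all see each other, as in ordinary self-attention.
    for i in range(num_sentence_tokens):
        for j in range(num_sentence_tokens):
            mask[i][j] = 1

    start = num_sentence_tokens
    for anchor, length in knowledge_spans:
        branch = [anchor] + list(range(start, start + length))
        # The anchor entity and its knowledge tokens see each other,
        # but the knowledge tokens stay hidden from the rest of the sentence.
        for i in branch:
            for j in branch:
                mask[i][j] = 1
        start += length
    return mask
```

With three sentence tokens and two knowledge tokens attached to token 1, `build_visibility_mask(3, [(1, 2)])` lets token 1 see the knowledge tokens at positions 3 and 4, while tokens 0 and 2 cannot.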
Citations: 3
An Extended Factorial Hidden Markov Model for Non-Intrusive Load Monitoring Based on Density Peak Clustering
Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457712
Zhao Wu, Chao Wang, Ruiyou Li, Huaiqing Zhang
Non-Intrusive Load Monitoring (NILM) has received widespread attention as an energy-saving technology. Methods based on the Hidden Markov Model (HMM) are very popular in this domain because of their relatively small demand for computing resources. However, traditional HMM-based methods need additional information, such as the working states of appliances, to train the model. In this paper, we propose a non-parametric model (IC-FHMM) to alleviate the need for prior knowledge. Experiments are conducted on three open-access datasets, and the results indicate that the proposed model is superior to four state-of-the-art models on the Accuracy and F-measure metrics.
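For readers unfamiliar with factorial HMMs in NILM, the usual modeling idea is that each appliance has its own hidden state and the meter observes the sum of the per-appliance powers. The sketch below illustrates only that additive observation step for a single reading, by brute force; it ignores state transitions and the density peak clustering named in the title, and the function name and any power levels passed to it are hypothetical.

```python
from itertools import product

def decode_single_reading(aggregate_watts, appliance_state_watts):
    """Pick one state per appliance so the summed power is closest to the reading.

    appliance_state_watts maps an appliance name to the list of power levels
    (in watts) of its states. Returns (states, absolute_error).
    """
    names = list(appliance_state_watts)
    best_states, best_error = None, float("inf")
    # Enumerate every combination of per-appliance states.
    for combo in product(*(range(len(appliance_state_watts[n])) for n in names)):
        total = sum(appliance_state_watts[n][s] for n, s in zip(names, combo))
        error = abs(aggregate_watts - total)
        if error < best_error:
            best_states, best_error = dict(zip(names, combo)), error
    return best_states, best_error
```

The enumeration grows exponentially with the number of appliances, which is one reason practical FHMM decoders rely on approximate inference rather than exhaustive search.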
Citations: 0
Using the Naive Bayes as a discriminative model
Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457697
E. Azeraf, E. Monfrini, W. Pieczynski
For classification tasks, probabilistic graphical models are usually categorized into two disjoint classes: generative or discriminative. The distinction depends on how the posterior probability p(x|y) of the label x given the observation y is computed. On the one hand, generative models, like the Naive Bayes or the Hidden Markov Model (HMM), need to compute the joint probability p(x, y) before using the Bayes rule to compute p(x|y). On the other hand, discriminative models compute p(x|y) directly, regardless of the observations' law. They are used intensively nowadays, with models such as Logistic Regression or Conditional Random Fields (CRF). However, the recent Entropic Forward-Backward algorithm shows that the HMM, considered a generative model, can also match the definition of a discriminative one. This example raises the question of whether the same holds for other generative models. In this paper, we show that the Naive Bayes can also match the discriminative model definition, so it can be used in either a generative or a discriminative way. Moreover, this observation also bears on the notion of Generative-Discriminative pairs, linking, for example, Naive Bayes and Logistic Regression, or HMM and CRF. Related to this point, we show that the Logistic Regression can be viewed as a particular case of the Naive Bayes used in a discriminative way.
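The generative route described above can be made concrete with a small sketch: form the joint p(x, y) = p(x) ∏ p(y_i | x) for every label, then normalize by p(y) = Σ p(x', y) to obtain p(x|y). This is the textbook Naive Bayes computation, not the discriminative reformulation the paper derives; the function name and the dictionary layout of the parameters are assumptions chosen for readability.

```python
def naive_bayes_posterior(prior, likelihoods, observation):
    """Posterior p(x | y) for a discrete Naive Bayes model.

    prior[x] = p(x)
    likelihoods[x][i][v] = p(y_i = v | x)
    observation = (y_1, ..., y_n)
    """
    joint = {}
    for label, p_label in prior.items():
        p = p_label
        for i, value in enumerate(observation):
            p *= likelihoods[label][i][value]
        joint[label] = p  # p(x, y) = p(x) * prod_i p(y_i | x)
    evidence = sum(joint.values())  # p(y), summed over all labels
    return {label: p / evidence for label, p in joint.items()}
```

The posterior is obtained only after the joint has been built for every label, which is the step the generative definition requires and a discriminative model skips.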
Citations: 3
Spectral-wise Attention-based Residual Network for Hyperspectral Image Classification
Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457735
Yaxin Chen, Zhiqiang Guo, Jie Yang
Hyperspectral images (HSI) have abundant bands, capture more useful information, and have been widely used in military and civil applications. Traditional HSI classification algorithms fail to take full account of the relationship between spatial-wise and spectral-wise information. In this paper, we propose the Spectral-wise Attention-based Residual Network (SARN), which applies a double-branch structure to HSI classification. The model has two channels. In the first channel, a novel spectral attention block generates the attention map for the spectral-wise information. In the second channel, a spatial-wise residual unit extracts spatial features. The spectral attention map and the spatial features are then fused for classification. Experimental results on the Pavia University and Indian Pines datasets demonstrate that the proposed method performs better than the state-of-the-art method.
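The abstract does not describe the internals of the spectral attention block. To show what a per-band attention map can look like in general, the sketch below summarizes each band by its spatial mean, turns those summaries into weights with a softmax, and rescales each band by its weight. It has no learned parameters and is a generic squeeze-and-reweight illustration, not the SARN block; the function name and the nested-list cube layout are assumptions.

```python
import math

def spectral_attention(cube):
    """cube[b][r][c] is the value at band b, row r, column c.

    Returns (weights, reweighted_cube), with one attention weight per band.
    """
    # Squeeze: one descriptor per band (global average over the spatial patch).
    descriptors = [
        sum(sum(row) for row in band) / (len(band) * len(band[0])) for band in cube
    ]
    # Softmax turns the descriptors into attention weights that sum to 1.
    peak = max(descriptors)
    exps = [math.exp(d - peak) for d in descriptors]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Reweight: scale every pixel of a band by that band's weight.
    reweighted = [
        [[w * v for v in row] for row in band] for band, w in zip(cube, weights)
    ]
    return weights, reweighted
```

In a trained network the descriptors would typically pass through learned layers before the normalization, so that the weights reflect which bands help classification rather than which are brightest.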
Citations: 0
Identifying the Key Residues Regulating the Binding between Antibody Avelumab and PD-L1 VIA Molecular Dynamics Simulation
Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457767
Wenping Liu, Ting Chen, Shengsheng Lai, Gangping Zhang, Guangjian Liu, Haoyu Jin
Avelumab, approved by the US Food and Drug Administration (FDA) for the treatment of Merkel cell carcinoma in adults and paediatric patients in 2017, is an investigational fully human anti-PD-L1 IgG1 antibody that inhibits PD-1/PD-L1 interactions. Although the crystal structure of the avelumab/PD-L1 complex was reported in 2017 and provides interface information at the atomic level, information on the dynamics of the complex is missing, and some key residues could not be detected in that static crystal structure. Here, molecular dynamics simulations were performed on the avelumab/PD-L1 complex to map the epitope residues to the paratope residues. The results showed that the epitope residues located on the C strand (TYR56 and GLU58), CC' loop (GLU60, ASP61 and LYS62), C' strand (ASN63), and C'D loop (HIS69) of PD-L1 mainly form the interface with avelumab. The paratope residues on avelumab include TYR52H, SER54H, GLY102H, THR105H, TYR34L, ASP52L and ARG99L. The C' strand of PD-L1 is also a binding region for PD-1. Thus, the antibody avelumab blocks the PD-1/PD-L1 interaction through direct competitive binding to the C' strand of PD-L1.
Citations: 1
YOLO-Tight: an Efficient Dynamic Compression Method for YOLO Object Detection Networks
Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457740
Wei Yan, Ting Liu, Yuzhuo Fu
Deep learning algorithms perform well in the field of object detection. Object detection networks such as YOLO, SSD and Faster R-CNN have achieved excellent performance on public datasets such as VOC and COCO. However, deep learning models are difficult to deploy on edge computing platforms with limited computing resources because of their large numbers of parameters and heavy computation. In this paper, we propose an efficient dynamic sparsity method that helps the network quickly identify important parameters and then prunes the unimportant weight channels, which makes the network model more compact and reduces computation. At high sparsity, our method is more robust than L1 regularization and other forms of regularization, and achieves better sparsity and pruning results. With this method, we can prune the YOLOv3 network and the enhanced YOLOv3-SPP3 network by up to 90%. This allows the network to achieve a 5× reduction in FLOPs while keeping the accuracy loss below 1% on the BDD100k dataset.
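To make the baseline the abstract compares against more tangible, the sketch below shows plain magnitude-based channel pruning with an L1 sparsity penalty: a penalty pushes per-channel scale factors toward zero during training, and the channels with the smallest absolute scale are then removed. It assumes each channel has a single scalar importance value (for example a batch-normalization scale), which the abstract does not state, and it does not reproduce the paper's dynamic sparsity method. Both function names are hypothetical.

```python
def l1_penalty(scale_factors, strength):
    """L1 sparsity term added to the training loss to push scales toward zero."""
    return strength * sum(abs(s) for s in scale_factors)

def select_channels_to_keep(scale_factors, prune_ratio):
    """Return sorted indices of the channels kept after pruning.

    The prune_ratio fraction of channels with the smallest |scale| is dropped.
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    num_pruned = int(len(scale_factors) * prune_ratio)
    # Rank channels from least to most important by absolute scale.
    ranked = sorted(range(len(scale_factors)), key=lambda i: abs(scale_factors[i]))
    return sorted(ranked[num_pruned:])
```

For scale factors `[0.9, 0.01, -0.5, 0.02]` and a prune ratio of 0.5, the two near-zero channels are dropped and channels 0 and 2 are kept.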
Citations: 2
Multi-Perspective Reasoning Transformers
Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457759
Dagmawi Alemu Moges, Andre Niyongabo Rubungo, Hong Qu
Machine Reading Comprehension is defined as the ability of machines to read and understand unstructured text and answer questions about it. It is considered a challenging task with a wide range of enterprise applications. A wide range of natural language understanding and reasoning tasks are embedded within machine reading comprehension datasets. Answering complex questions therefore requires effective models with robust relational reasoning capabilities. Reasoning in natural language is a long-term machine learning goal and is critically needed for building intelligent agents. However, most papers depend heavily on underlying language modeling and thus pay little to no attention to creating effective reasoning models. This paper proposes a modified transformer architecture that effectively combines soft and hard attention to create a multi-perspective reasoning model capable of tackling a wide range of reasoning tasks. An attention mechanism that highlights the relational significance of input signals is considered as well. The results of this study show a performance gain over its counterpart, the transformer network, on the bAbI dataset, a set of natural language reasoning tasks.
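The difference between the two attention styles named above can be shown in a few lines. Soft attention returns a softmax-weighted average, so every input contributes; hard attention, in the deterministic argmax variant sketched here, selects a single input and ignores the rest. How the paper combines the two is not stated in the abstract, so this sketch covers only the two building blocks, with hypothetical function names.

```python
import math

def soft_attention(scores, values):
    """Softmax-weighted average of the value vectors: every input contributes."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

def hard_attention(scores, values):
    """Return the single highest-scoring value vector; all others are ignored."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return list(values[best])
```

Soft attention is differentiable end to end, whereas the argmax in hard attention is not, which is why hard attention is often trained with sampling-based or relaxation techniques.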
Citations: 0
DSP-PIGAN: A Precision-Consistency Machine Learning Algorithm for Solving Partial Differential Equations
Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457686
Yunzhuo Wang, Hao Sun, Guangzhong Sun
Partial differential equations (PDEs) are a ubiquitous tool for modeling problems in nature. In recent years, machine learning techniques have been adopted to solve PDEs. However, the prediction errors of existing machine learning methods vary widely across different subdomains of a PDE. How to achieve precision-consistency is a crucial and complex issue for machine learning methods that solve PDEs. To tackle this issue, we propose DSP, an adaptive framework for solving PDEs. DSP is composed of domain decomposition, searching for singular subdomains, and prediction. Furthermore, a novel generative model, the physics-informed generative adversarial network (PIGAN), is designed to solve PDEs. In addition, we introduce points with high-precision labels into the training process to improve model accuracy. We test the effectiveness of our approach on three real physical equations: the Poisson equation, the Helmholtz equation and the Eikonal equation. Through experiments, we show that the combination of DSP and PIGAN outperforms various state-of-the-art baselines.
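The idea that a candidate solution can be accurate in most of the domain and poor in one subdomain can be illustrated on a one-dimensional Poisson problem u''(x) = f(x). The sketch below measures the PDE residual of a candidate u with a central finite difference, splits the interval into equal subdomains, and reports the one with the largest mean absolute residual. It is an illustration under simplifying assumptions (1D, equal-width subdomains, residual as the error measure) and not the DSP search procedure or the PIGAN model; both function names are hypothetical.

```python
import math

def poisson_residual(u, f, x, h=1e-3):
    """Residual of u''(x) = f(x), with u'' approximated by a central difference."""
    second_derivative = (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)
    return second_derivative - f(x)

def worst_subdomain(u, f, left, right, num_subdomains, points_per_subdomain=20):
    """Split [left, right] into equal subdomains.

    Returns (index, mean_abs_residual) of the subdomain where u fits worst.
    """
    width = (right - left) / num_subdomains
    worst_index, worst_error = -1, -1.0
    for k in range(num_subdomains):
        start = left + k * width
        # Sample at cell midpoints so no sample sits on a subdomain boundary.
        samples = [
            start + (j + 0.5) * width / points_per_subdomain
            for j in range(points_per_subdomain)
        ]
        error = sum(abs(poisson_residual(u, f, x)) for x in samples) / len(samples)
        if error > worst_error:
            worst_index, worst_error = k, error
    return worst_index, worst_error
```

For example, with f(x) = -π² sin(πx) the exact solution sin(πx) has a near-zero residual everywhere, while a candidate that deviates only for x > 0.75 is flagged in the last of four subdomains on [0, 1].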
Citations: 2
Low Light Image Enhancement in USV Imaging System Via U-Net and Attention Mechanism
Pub Date : 2021-02-26 DOI: 10.1145/3457682.3457729
Sheng Zhang, Tianxiao Cai, Yihang Chen
Images captured by Unmanned Surface Vessels (USVs) have a wide range of applications, such as maritime object detection, remote sensing, and autonomous transportation. However, cameras often operate in low-light environments, which yields low-contrast, noisy, poor-quality images and leads to identification difficulties and machine decision errors. In recent years, convolutional neural networks have developed rapidly; they generalize well and can extract information at different levels, especially high-level information. Therefore, to preprocess low-light images before the advanced computer vision tasks of a USV, we propose a deep learning-based end-to-end convolutional network for low-light enhancement in USV imaging systems. Our model uses U-Net as the basic architecture to obtain multi-scale feature maps, with improvements including an attention mechanism and dense connections. In addition, we pay attention to edge information, given the images' edge loss. With this network structure, our model can effectively increase the brightness and contrast of dark aquatic images. Experiments on test images compare our proposed method with several recent imaging methods. The experimental results show its outstanding performance in both subjective and objective evaluation.
Citations: 1