2017 14th Web Information Systems and Applications Conference (WISA): Latest Articles


Statutes Recommendation Based on Text Similarity

2017 14th Web Information Systems and Applications Conference (WISA)

Pub Date : 2017-11-01 DOI: 10.1109/WISA.2017.52

Jin Zeng, Jidong Ge, Yemao Zhou, Yi Feng, Chuanyi Li, Zhongjin Li, B. Luo

The traditional approach to measuring text similarity uses the TF-IDF algorithm to obtain document vectors and then applies cosine similarity to compare them. However, this purely statistical method ignores the latent semantics of articles and words; in effect, it considers only the words themselves. Latent Semantic Analysis (LSA) adds a semantic space on top of the TF-IDF computation: through Singular Value Decomposition, each word and document is assigned a position in that space. This allows semantic analysis, document clustering, and the mapping between semantic classes and document classes to be carried out at the same time. Here, we summarize text similarity measures and gradually extend them to Latent Semantic Analysis. The experiment shows that the statutes predicted by LSA are more accurate than those predicted by TF-IDF alone.
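The TF-IDF and cosine baseline mentioned in this abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the toy corpus, the function names `tfidf_vectors` and `cosine`, and the particular TF and IDF weighting (relative term frequency times log(N/df)) are assumptions. The LSA step (Singular Value Decomposition of the term-document matrix) is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either vector is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus standing in for case descriptions and statutes.
docs = [
    "theft of property punished by fine".split(),
    "theft of vehicle punished by imprisonment".split(),
    "contract breach requires compensation".split(),
]
vecs = tfidf_vectors(docs)
sim_01 = cosine(vecs[0], vecs[1])  # documents sharing several terms
sim_02 = cosine(vecs[0], vecs[2])  # documents sharing no terms
```

Because the third document shares no word with the first, their similarity is exactly zero even if the texts were related in meaning, which is the limitation the abstract attributes to the purely word-based approach.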


Citations: 1

Mining Frequent Intra-Sequence and Inter-Sequence Patterns Using Bitmap with a Maximal Span

2017 14th Web Information Systems and Applications Conference (WISA)

Pub Date : 2017-11-01 DOI: 10.1109/WISA.2017.70

Wenzhe Liao, Qian Wang, Luqun Yang, Jiadong Ren, D. Davis, Changzhen Hu

Frequent intra-sequence pattern mining and inter-sequence pattern mining are both important forms of association rule mining, suited to different applications. However, most algorithms focus on just one of them, as attempting both is usually inefficient. To address this deficiency, FIIP-BM, a Frequent Intra-sequence and Inter-sequence Pattern mining algorithm that uses a bitmap with a maximal span (maxSpan), is proposed. FIIP-BM transforms each transaction into a bit vector, adjusts the maximal span according to the user's needs, and obtains the frequent sequences through logical AND operations. For candidate 2-pattern generation, the subscripts of the joining items are checked first; if the subscript is not 0, the bit vector of the joining item is left-shifted before the calculation. A left-alignment rule handles bit vectors of different lengths. FIIP-BM can thus mine both intra-sequence and inter-sequence patterns. Experiments demonstrate the computational speed and memory efficiency of the FIIP-BM algorithm.
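The shift-then-AND idea for candidate 2-patterns can be illustrated with Python integers used as bit vectors. This is only a sketch of the general technique and not the FIIP-BM algorithm itself: the encoding (leftmost bit is transaction 0), the helper names `to_bitmap` and `support`, and the sample transactions are all assumptions made for illustration.

```python
def to_bitmap(transactions, item):
    """Encode item presence as an int; the leftmost (highest) bit is transaction 0."""
    bits = 0
    for t in transactions:
        bits = (bits << 1) | (1 if item in t else 0)
    return bits


def support(transactions, first, second, span):
    """Count positions t where `first` is in transaction t and `second` in t + span.

    span == 0 gives an intra-sequence (same transaction) pattern;
    span > 0 gives an inter-sequence pattern across transactions.
    """
    n = len(transactions)
    mask = (1 << n) - 1  # keep only the n valid bit positions
    a = to_bitmap(transactions, first)
    b = to_bitmap(transactions, second)
    # Left-shifting aligns the bit for transaction t + span with transaction t.
    shifted = (b << span) & mask
    return bin(a & shifted).count("1")


# Hypothetical transaction sequence.
transactions = [{"a", "b"}, {"b"}, {"a"}, {"b", "c"}, {"a", "b"}]

same_transaction = support(transactions, "a", "b", 0)
one_step_later = support(transactions, "a", "b", 1)
two_steps_later = support(transactions, "a", "b", 2)
```

A single AND followed by a population count replaces a scan over all transactions, which is where bitmap-based miners typically gain their speed.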


Citations: 3

Efficient Time Series Classification via Sparse Linear Combination

2017 14th Web Information Systems and Applications Conference (WISA)

Pub Date : 2017-11-01 DOI: 10.1109/WISA.2017.37

Zhenguo Zhang, Peng Nie, Yanlong Wen

Time series classification presents a specific machine learning challenge due to the ordering of variables. Recent studies show that the simple nearest neighbor classifier with elastic distance measures is hard to beat, and many researchers focus on alternative distance measures. Unlike the nearest neighbor classifier, which tries to find the training sample with the minimum distance to the test instance, we use a reconstruction strategy to determine the label of a new time series. Concretely, we reconstruct each test time series using as few training samples as possible and then calculate the residuals between the test time series and the selected training samples of each class. The test time series is assigned to the class with the minimum residual. To select the required time series from the training set, we employ a sparsity constraint to discover the optimal combination of training samples while fitting the test time series. To handle scenarios where the time series dataset is not linearly separable, we extend our method with the kernel trick. Extensive experimental results show that the proposed method achieves significant improvements on commonly used time series datasets.
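The reconstruct-then-compare-residuals strategy can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: the sparse coefficients are obtained with a plain ISTA solver for the lasso objective (the abstract does not name a solver), the regularization weight `lam`, the iteration count, and the tiny three-point "series" are hypothetical, and the kernel extension is omitted.

```python
import math


def ista_lasso(atoms, x, lam=0.01, iters=500):
    """Approximately minimise 0.5*||x - sum_j w[j]*atoms[j]||^2 + lam*||w||_1."""
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the
    # gradient, so this step size is safe (if conservative).
    step = 1.0 / sum(v * v for a in atoms for v in a)
    w = [0.0] * len(atoms)
    for _ in range(iters):
        recon = [sum(w[j] * atoms[j][i] for j in range(len(atoms)))
                 for i in range(len(x))]
        resid = [xi - ri for xi, ri in zip(x, recon)]
        for j, a in enumerate(atoms):
            moved = w[j] + step * sum(ai * ri for ai, ri in zip(a, resid))
            # Soft-thresholding pushes small coefficients to exactly zero.
            w[j] = math.copysign(max(abs(moved) - step * lam, 0.0), moved)
    return w


def classify(train, labels, x, lam=0.01):
    """Return (predicted label, per-class residuals) for test series x."""
    w = ista_lasso(train, x, lam)
    residuals = {}
    for label in set(labels):
        # Reconstruct x from this class's samples only, with the shared coefficients.
        recon = [
            sum(w[j] * train[j][i] for j in range(len(train)) if labels[j] == label)
            for i in range(len(x))
        ]
        residuals[label] = math.sqrt(sum((xi - ri) ** 2 for xi, ri in zip(x, recon)))
    return min(residuals, key=residuals.get), residuals


# Hypothetical training set: two short series per class.
train = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
labels = ["A", "A", "B", "B"]
test_series = [1.0, 0.05, 0.0]

label, residuals = classify(train, labels, test_series)
```

The test series is dominated by its first component, which only class A samples can reproduce, so the class A reconstruction leaves the smaller residual.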


Citations: 1

Computing User Similarity by Combining SimRank++ and Cosine Similarities to Improve Collaborative Filtering

2017 14th Web Information Systems and Applications Conference (WISA)

Pub Date : 2017-11-01 DOI: 10.1109/WISA.2017.22

Xiuli Wang, Zhuoming Xu, Xiutao Xia, Chengwang Mao

This paper addresses the sparsity problem in collaborative filtering (CF) by developing an aggregated user-user similarity measure suitable for the user-based CF model. The aggregated similarity measure is a weighted aggregation of the SimRank++ similarity on the user-item bipartite graph and the cosine similarity of Linked Open Data (LOD)-based user profiles, which are derived from both the rating data and the items' descriptive attributes obtained from LOD resources. To validate the effectiveness of the aggregated similarity and evaluate the accuracy of rating predictions with the user-based CF method, comparative experiments on four similarity measures (the Pearson correlation coefficient, the SimRank++ similarity, the cosine similarity, and the aggregated similarity) were conducted on the MovieLens 100k dataset and DBpedia. The experimental results indicate that the proposed aggregated similarity measure overall outperforms the other three similarity measures in terms of both Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), especially with 30-100 nearest neighbors.
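A weighted aggregation of two similarity scores, followed by a user-based CF prediction, might look like the sketch below. Several things here are assumptions rather than details from the abstract: the weight `alpha`, the precomputed similarity values (SimRank++ itself is not computed), the mean-centred neighbour formula (a common user-based CF formulation, not confirmed as the one used in the paper), and all names and ratings.

```python
def aggregate(sim_graph, sim_profile, alpha=0.5):
    """Weighted aggregation of a graph-based and a profile-based similarity."""
    return alpha * sim_graph + (1.0 - alpha) * sim_profile


def mean_rating(ratings, user):
    return sum(ratings[user].values()) / len(ratings[user])


def predict(ratings, sims, user, item, k=2):
    """Mean-centred user-based CF prediction from the k most similar raters of item."""
    candidates = [v for v in sims[user] if item in ratings[v]]
    neighbours = sorted(candidates, key=lambda v: sims[user][v], reverse=True)[:k]
    denom = sum(abs(sims[user][v]) for v in neighbours)
    if denom == 0:
        return mean_rating(ratings, user)  # no usable neighbour: fall back to the mean
    num = sum(sims[user][v] * (ratings[v][item] - mean_rating(ratings, v))
              for v in neighbours)
    return mean_rating(ratings, user) + num / denom


# Hypothetical ratings and precomputed similarity components.
ratings = {
    "u1": {"m1": 4, "m2": 2},
    "u2": {"m1": 5, "m2": 2, "m3": 5},
    "u3": {"m1": 2, "m2": 3, "m3": 1},
}
sims = {
    "u1": {
        "u2": aggregate(sim_graph=0.8, sim_profile=0.6),
        "u3": aggregate(sim_graph=0.2, sim_profile=0.4),
    }
}
predicted = predict(ratings, sims, "u1", "m3")
```

Here u2 rates m3 one point above its own mean and u3 one point below; because u2 is the more similar neighbour, the prediction for u1 lands above u1's mean rating of 3.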


Citations: 9

A Survey on Visual Place Recognition for Mobile Robots Localization

2017 14th Web Information Systems and Applications Conference (WISA)

Pub Date : 2017-11-01 DOI: 10.1109/WISA.2017.7

Yutian Chen, Wenyan Gan, Lei Zhang, Chong Liu, Xianlei Wang

Visual place recognition, the ability to recognize a known place in the environment using vision as the main sensor modality, is an active research field in robotic navigation and localization. Despite significant progress in computer vision and machine learning techniques, challenges remain, especially in dynamic environments subject to illumination change, viewpoint change, and so on. This paper presents a survey and comparative study of existing approaches to visual place recognition, including place feature extraction methods, image similarity metrics, and searching algorithms, as well as some benchmark datasets and evaluation metrics. Experimental results show that methods combining feature extraction with convolutional neural networks and sequential image searching achieve higher precision in large-scale dynamic environments.


Citations: 10

The Optimization Mechanism Research of Distributed Unified Authentication Based on Cache

2017 14th Web Information Systems and Applications Conference (WISA)

Pub Date : 2017-11-01 DOI: 10.1109/WISA.2017.4

Dongju Yang, Kai Feng

Building a unified authentication center is a popular and effective way to implement single sign-on across many enterprise applications. When integrating multiple systems, the most important issue is how to handle highly concurrent, high-volume user requests while ensuring the stability and efficiency of the authentication service. To address problems of the authentication center such as overload, single points of failure, and slow response times, we put forward a distributed architecture with a cache to enable unified authentication. Authentication tickets can be shared among multiple nodes through the cache, and hot and important data can be prefetched into the cache to improve response time. A multi-factor cache replacement algorithm based on Hybird is also proposed, which takes complex and diverse user behavior into account to improve the effectiveness of data replacement. The experimental results show that the optimized distributed authentication architecture can guarantee the stability of the system, that the cache mechanism can improve response time, and that the multi-factor cache replacement algorithm based on Hybird can improve the cache hit ratio.
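The abstract does not say which factors the replacement algorithm combines, so the following is only a generic sketch of what a multi-factor eviction policy can look like. Everything in it is hypothetical: the two factors (hit count and recency), their weights, the logical clock, and the class name `MultiFactorCache`. It should not be read as the Hybird-based algorithm from the paper.

```python
class MultiFactorCache:
    """Evicts the entry with the lowest weighted score of hit count and recency."""

    def __init__(self, capacity, w_freq=0.5, w_recency=0.5):
        self.capacity = capacity
        self.w_freq = w_freq
        self.w_recency = w_recency
        self.clock = 0      # logical time, advanced on every get/put
        self.entries = {}   # key -> [value, hit_count, last_access_tick]

    def _score(self, key):
        _, hits, last = self.entries[key]
        recency = 1.0 / (1 + self.clock - last)  # 1.0 if touched on this tick
        return self.w_freq * hits + self.w_recency * recency

    def get(self, key):
        self.clock += 1
        entry = self.entries.get(key)
        if entry is None:
            return None  # cache miss
        entry[1] += 1
        entry[2] = self.clock
        return entry[0]

    def put(self, key, value):
        self.clock += 1
        if key not in self.entries and len(self.entries) >= self.capacity:
            # Evict the lowest-scoring entry to make room.
            victim = min(self.entries, key=self._score)
            del self.entries[victim]
        hits = self.entries[key][1] if key in self.entries else 0
        self.entries[key] = [value, hits, self.clock]


# Hypothetical authentication tickets shared through the cache.
cache = MultiFactorCache(capacity=2)
cache.put("t1", "ticket-1")
cache.put("t2", "ticket-2")
cache.get("t1")
cache.get("t1")
cache.put("t3", "ticket-3")  # cache is full: the lowest-scoring entry is evicted
```

In this run "t2" is evicted even though it was inserted after "t1", because "t1" has been read twice in the meantime; a pure FIFO policy would have dropped "t1" instead.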


Citations: 2
