Int. J. Knowl. Web Intell.: Latest Articles


An enterprise perspective of web content analysis research: a strategic road-map

Int. J. Knowl. Web Intell.

Pub Date : 1900-01-01 DOI: 10.1504/IJKWI.2017.10010794

R. Wadawadagi, V. Pagi

Participating in social networks to create and share opinion content has become a ubiquitous part of everyday life. Understanding social media content is at the top of the agenda for many firms today. Business analysts and quants are working to discover how enterprises can benefit from comprehending the content generated through social media such as Facebook, Wikipedia, blogs, YouTube and Twitter. This pioneering work may give business analysts and data scientists insights into ways to adapt the stable content analysis (CA) techniques to analyse web pages containing user-generated data. In this paper, we develop an integrated enterprise framework that defines web content analysis (WCA) as a comprehensive, functional layered architecture, so that the framework can be used at various levels of the decision-making process. Further, a four-dimensional comparative analysis of various WCA systems is presented. Based on a critical analysis of the surveyed literature, the study identifies many open and challenging issues for further research in this domain.


Citations: 5

WSOLINK: web structure outlier detection algorithm

Int. J. Knowl. Web Intell.

Pub Date : 1900-01-01 DOI: 10.1504/IJKWI.2016.10005796

Rachna Miglani

As every field becomes more specialised, data warehouses and web mining techniques are becoming specialised as well. Web mining techniques fall into three categories according to the data being mined: web usage mining, web content mining, and web structure mining. The Apriori algorithm, the FP-growth algorithm, and the average linear time algorithm are available to analyse general access patterns in web server logs, whereas WCOND-mine and the signed-with-weight technique are web content outlier mining algorithms. However, no comparable algorithm is available to check the authenticity and availability of hyperlinks in the result pages returned by web search engines. The present research work aims to detect outliers in the results of queries over web pages through web search engines.


Citations: 0

Determining the semantic orientation of opinion words using typed dependencies for opinion word senses and SentiWordNet scores from online product reviews

Int. J. Knowl. Web Intell.

Pub Date : 1900-01-01 DOI: 10.1504/IJKWI.2017.10010171

K. R. Kumar, D. T. Santosh, B. V. Vardhan

Opinion words express a user's likes and dislikes regarding target entities, such as products and product aspects, mentioned in online reviews. The polarised information collected from the reviews is analysed by calculating the orientation of the adjectives. The synonymy relation graph is one way to determine the orientation of the adjectives present in a product reviews dataset: it considers the minimum path length between the adjectives under analysis using WordNet synsets. However, the synonymy relation graph cannot determine the orientations of all the opinion words present in the dataset. To evaluate the opinion orientation of all the adjectives in the dataset, the synonymy relation graph of WordNet is replaced with the SentiWordNet scores of the opinion words. These scores are assigned to the opinion words by finding the contextual clues surrounding them in order to disambiguate their sense. The contextual clues are determined from typed-dependency grammatical relations. The distance between an opinion word and a context-insensitive seed term (good/bad) is computed as the difference between their scores. This paper discusses the advantages of using SentiWordNet scores, which improve the accuracy of the determined opinion word orientations.
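The score-difference step described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the sense labels, the `SENSE_SCORES` table and its values are hypothetical placeholders (they are not real SentiWordNet entries), and sense disambiguation via typed dependencies is assumed to have already happened.

```python
from typing import Dict, Tuple

# Hypothetical (positive, negative) scores per disambiguated word sense.
# Illustrative values only; they are NOT taken from SentiWordNet.
SENSE_SCORES: Dict[str, Tuple[float, float]] = {
    "good": (0.75, 0.0),            # positive seed term
    "bad": (0.0, 0.625),            # negative seed term
    "sharp#display": (0.5, 0.125),  # "sharp" describing a screen
    "sharp#pain": (0.125, 0.5),     # "sharp" describing a pain
}


def net_score(sense: str) -> float:
    """Collapse a (positive, negative) pair into one signed score."""
    pos, neg = SENSE_SCORES[sense]
    return pos - neg


def orientation(sense: str) -> str:
    """Label a sense by whichever seed term its score lies closer to."""
    score = net_score(sense)
    dist_good = abs(net_score("good") - score)
    dist_bad = abs(net_score("bad") - score)
    if dist_good == dist_bad:
        return "neutral"
    return "positive" if dist_good < dist_bad else "negative"


if __name__ == "__main__":
    for s in ("sharp#display", "sharp#pain"):
        print(s, orientation(s))
```

The same surface word receives different orientations once its sense is fixed, which is the motivation the abstract gives for disambiguating before scoring.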



Citations: 3

Clustering-based web page prediction

Int. J. Knowl. Web Intell.

Pub Date : 1900-01-01 DOI: 10.1504/IJKWI.2011.045163

R. Dutta, A. Kundu, Debajyoti Mukhopadhyay

Web page prediction plays an important role in reducing user latency by predicting and fetching the probable web page of the next request in advance. Users surf the internet by entering a URL, by searching for a topic, or by following links on the same topic. Clustering plays an important role in both searching and link prediction. Besides the topic, navigational behaviour is also taken into account. This paper proposes a web page prediction model that gives significant weight to the user's interest, captured with a clustering technique, and to the user's navigational behaviour, captured with a Markov model. The clustering technique is used to group similar web pages: similar pages of the same type reside in the same cluster, and the pages in a cluster are similar with respect to the topic of the session. The clustering algorithms considered are K-means and K-medoids, where K is determined by the HITS algorithm. Finally, the predicted web pages are stored in the form of cellular automata to make the system more memory efficient.
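As a rough sketch of how a Markov model and topic clusters can be combined for next-page prediction, consider the example below. It assumes a first-order Markov model and a precomputed page-to-cluster mapping; the `SESSIONS` and `CLUSTERS` data and the function names are hypothetical. It does not reproduce the paper's K-means/K-medoids clustering, the HITS-based choice of K, or the cellular automata storage.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional

# Hypothetical browsing sessions and topic clusters, for illustration only.
SESSIONS: List[List[str]] = [
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "shop"],
    ["news", "sports"],
]
CLUSTERS: Dict[str, str] = {
    "home": "portal",
    "news": "content",
    "sports": "content",
    "weather": "content",
    "shop": "commerce",
}


def build_transitions(sessions: List[List[str]]) -> Dict[str, Counter]:
    """Count first-order page-to-page transitions across all sessions."""
    transitions: Dict[str, Counter] = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            transitions[current][following] += 1
    return transitions


def predict_next(transitions: Dict[str, Counter],
                 clusters: Dict[str, str],
                 current: str) -> Optional[str]:
    """Return the most frequent successor, preferring the current page's cluster."""
    candidates = transitions.get(current)
    if not candidates:
        return None  # page never seen as a transition source
    same_cluster = {page: count for page, count in candidates.items()
                    if clusters.get(page) == clusters.get(current)}
    pool = same_cluster or candidates
    # Sorting first makes ties resolve alphabetically, so results are deterministic.
    return max(sorted(pool), key=lambda page: pool[page])


if __name__ == "__main__":
    model = build_transitions(SESSIONS)
    print(predict_next(model, CLUSTERS, "news"))
```

Preferring successors from the same cluster is one simple way to let topical interest bias the navigational model; falling back to all observed successors keeps a prediction available when no same-topic page has followed the current one.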


Citations: 11
