Pub Date: 2015-02-01 | DOI: 10.1109/ICOSC.2015.7050807
Yimin Yang, Daniel Lopez, Haiman Tian, Samira Pouyanfar, Fausto Fleites, Shu‐Ching Chen, S. Hamid
Home insurance is a critical issue in the state of Florida, considering that residential properties are exposed to hurricane risk each year. To assess hurricane risk and project insured losses, the Florida Public Hurricane Loss Model (FPHLM), funded by the state's insurance regulatory agency, was developed. The FPHLM is an open and public model that offers an integrated, complex computing framework that can be described in two phases: execution and validation. In the execution phase, all major components of the FPHLM (i.e., data pre-processing, Wind Speed Correction (WSC), and the Insurance Loss Model (ILM)) are seamlessly integrated and sequentially carried out by following a coordination workflow, where each component is modeled as an execution element governed by a centralized data-transfer element. In the validation phase, semantic rules provided by domain experts for each individual component are applied to verify the validity of the model output. This paper presents how the model efficiently incorporates the various components from multiple disciplines in an integrated execution framework to address the challenges that make the FPHLM unique.
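The execution/validation split described above can be illustrated with a small sketch: components run sequentially under a coordinator, and expert-supplied rules then check the output. The component bodies, rule set, and toy data below are invented stand-ins, not the actual FPHLM implementation.

```python
def preprocess(record):
    # Execution element 1: normalize the raw exposure record.
    return {**record, "wind_mph": float(record["wind_mph"])}

def wind_speed_correction(record):
    # Execution element 2: apply a (made-up) terrain correction factor.
    return {**record, "wind_mph": record["wind_mph"] * 1.1}

def insurance_loss_model(record):
    # Execution element 3: a stand-in damage function, not the real ILM.
    damage_ratio = min(1.0, max(0.0, (record["wind_mph"] - 50) / 100))
    return {**record, "loss": damage_ratio * record["insured_value"]}

# Validation phase: semantic rules a domain expert might supply.
VALIDATION_RULES = [
    ("loss is non-negative", lambda r: r["loss"] >= 0),
    ("loss cannot exceed insured value", lambda r: r["loss"] <= r["insured_value"]),
]

def run_workflow(record):
    # Coordinator: run execution elements sequentially, then validate.
    for component in (preprocess, wind_speed_correction, insurance_loss_model):
        record = component(record)
    failures = [name for name, rule in VALIDATION_RULES if not rule(record)]
    return record, failures

result, failures = run_workflow({"wind_mph": "120", "insured_value": 200000.0})
```

The coordinator owns the data hand-off between elements, so individual components stay independent and testable.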
Title: Integrated execution framework for catastrophe modeling
Published in: Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015)
Pub Date: 2015-02-01 | DOI: 10.1109/ICOSC.2015.7050788
Xijiang Ke, Hai Jin, Xia Xie, Jie Cao
Linear classification is useful in many applications, but training large-scale data remains an important research issue. Recent advances in linear classification have shown that distributed methods can be efficient in improving the training time. However, for most of the existing training methods, based on MPI or Hadoop, the communication between nodes is the bottleneck. To shorten the communication between nodes, we propose and analyze a method for distributed support vector machines and implement it on an iterative MapReduce framework. In our distributed method, the local SVMs are generic and can make use of state-of-the-art SVM solvers. Unlike previous attempts to parallelize SVMs, the algorithm does not make assumptions on the density of the support vectors; i.e., the efficiency of the algorithm also holds for the “difficult” cases where the number of support vectors is very high. The performance of our method is evaluated in an experimental environment. By partitioning the training dataset into smaller subsets and optimizing the partitioned subsets across a cluster of computers, we reduce the training time significantly while maintaining a high level of accuracy in both binary and multiclass classification.
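The partition/combine idea can be sketched in a few lines: each "map" task runs hinge-loss subgradient descent on its local subset, and the "reduce" step averages the local weight vectors across iterative rounds. This parameter-mixing toy is only an illustration of the general scheme, not the authors' actual algorithm or solver.

```python
def local_svm_train(data, w, lr=0.01, lam=0.01, epochs=20):
    # Map task: subgradient descent on regularized hinge loss over one partition.
    w = list(w)
    for _ in range(epochs):
        for x, y in data:
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            for i in range(len(w)):
                grad = lam * w[i] - (y * x[i] if margin < 1 else 0.0)
                w[i] -= lr * grad
    return w

def distributed_train(partitions, dim, rounds=5):
    # Iterative MapReduce: broadcast w, train locally, average the results.
    w = [0.0] * dim
    for _ in range(rounds):
        local = [local_svm_train(p, w) for p in partitions]          # map
        w = [sum(ws[i] for ws in local) / len(local)                  # reduce
             for i in range(dim)]
    return w

# Toy linearly separable data (bias feature first), split into two partitions.
part1 = [((1.0, 2.0), 1), ((1.0, -2.0), -1), ((1.0, 3.0), 1)]
part2 = [((1.0, -3.0), -1), ((1.0, 2.5), 1), ((1.0, -2.5), -1)]
w = distributed_train([part1, part2], dim=2)

def predict(x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
```

Because each round exchanges only one weight vector per partition, the communication cost is independent of the number of training examples.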
Title: A distributed SVM method based on the iterative MapReduce
Pub Date: 2015-02-01 | DOI: 10.1109/ICOSC.2015.7050851
Alan Meehan, Rob Brennan, D. O’Sullivan
The Linked Data (LD) Cloud consists of LD sources covering a wide variety of topics. These data sources use formal vocabularies to represent their data and, in many cases, they use heterogeneous vocabularies to represent data about the same topics. This data heterogeneity must be overcome to effectively integrate and consume data from the LD Cloud. Mappings overcome this heterogeneity by transforming heterogeneous source data to a common target vocabulary. As new data sources emerge and existing ones change over time, new mappings must be created and existing ones maintained. Management of these mappings is an important but often neglected issue: the lack of a mapping management method makes it harder to find mappings for sharing, reuse, and maintenance purposes. In this paper we present a method for the management of mappings between LD sources: SPARQL Based Mapping Management (SBMM). The SBMM method uses SPARQL queries to perform analysis and maintenance over an RDF-based mapping representation. We present results from an experiment that compared the analytical affordances of an RDF-based mapping representation we previously devised, called the SPARQL Centric Mapping (SCM) representation, with those of the R2R Mapping Language.
Title: SPARQL based mapping management
Pub Date: 2015-02-01 | DOI: 10.1109/ICOSC.2015.7050787
D. Terrana, A. Augello, G. Pilato
This work proposes a system for the analysis and comparison of user profiles in social networks. Posts are extracted and analyzed in order to detect similar content, such as topics, sentiments, and writing styles. A case study analyzing the authenticity of profiles of the Italian prime minister across different social networks is illustrated.
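One signal such a system could use is content similarity between the aggregated posts of two profiles; a minimal bag-of-words cosine similarity is sketched below. This is only an illustrative stand-in for one of the several signals mentioned (topics, sentiments, writing styles); the sample posts are invented.

```python
import math
import re
from collections import Counter

def profile_vector(posts):
    # Aggregate a profile's posts into a word-count vector.
    words = re.findall(r"[a-z']+", " ".join(posts).lower())
    return Counter(words)

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

official  = ["Proud to announce new reforms for schools and jobs"]
suspect   = ["New reforms for schools and jobs announced today"]
unrelated = ["Cats are great pets and sleep all day"]

sim_close = cosine(profile_vector(official), profile_vector(suspect))
sim_far   = cosine(profile_vector(official), profile_vector(unrelated))
```

A profile whose posts score consistently high against the official account's posts is a candidate for being authentic (or a close copy), while low scores across all signals suggest an unrelated or fake profile.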
Title: A system for analysis and comparison of social network profiles
Pub Date: 2015-02-01 | DOI: 10.1109/ICOSC.2015.7050810
Wei Zhu, Guang Zhou, I. Yen, San-Yih Hwang
As carbon emissions become a serious problem, much research now focuses on how to monitor and manage carbon footprints. One promising approach is to create a “carbon footprint aware” world that exposes people to the carbon footprints associated with the products they buy and the services they use. Carbon footprint labeling (CFL) of products enables consumers to choose their products based not only on quality and cost, but also on their carbon footprints. Similarly, the carbon footprints of common activities and services can also be labeled to enable informed choices. CFL can impact supply chain operations as well: with carbon footprint information, the carbon-footprint-optimal supply chain can be identified to model the supply chains with the least carbon emissions. Existing carbon footprint management systems mostly rely on databases to maintain carbon footprint data, but databases alone are not sufficient for carbon footprint labeling. In this paper, we develop an ontology model, the CFL-ontology, to specify how products are produced, the processes involved in activities and services, and the computation functions that derive the carbon footprints of products, activities, and services from the associated descriptions. With the CFL-ontology, reasoning can be performed to automatically derive the carbon footprint labels for individual products and services.
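The kind of reasoning such an ontology enables can be sketched with a toy knowledge base: each product is described by its production process and its components, and a label is derived by recursively summing their footprints. The products, structure, and kg CO2e values below are invented for illustration, not taken from the CFL-ontology itself.

```python
# Invented knowledge base: production-process emissions plus component lists.
KB = {
    "steel_frame": {"process_kgco2e": 5.0, "components": []},
    "battery":     {"process_kgco2e": 8.0, "components": []},
    "e_bike":      {"process_kgco2e": 2.0,
                    "components": ["steel_frame", "battery", "battery"]},
}

def carbon_label(product):
    # Derive a product's footprint from its description, recursively:
    # own process emissions plus the footprints of all components.
    entry = KB[product]
    return entry["process_kgco2e"] + sum(carbon_label(c) for c in entry["components"])

label = carbon_label("e_bike")
```

In the ontology-based setting, the same derivation would be performed by a reasoner over class descriptions and computation functions rather than over a hand-built dictionary.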
Title: A CFL-ontology model for carbon footprint reasoning
Pub Date: 2015-02-01 | DOI: 10.1109/ICOSC.2015.7050826
Qifeng Zhou, Hao Zhou, Yimin Zhu, Tao Li
Life cycle assessment (LCA), as a decision support tool for evaluating the environmental load of products, has been widely used in many fields. However, applying LCA in the building industry is expensive and time consuming, due to the complexity of building structures along with large amounts of high-dimensional, heterogeneous building data. So far, building environmental impact assessment (BEIA) is an important yet under-addressed issue. This paper gives a brief survey of BEIA and investigates the potential advantages of using data mining techniques to discover the relationships between building materials and environmental impacts. We formulate three important BEIA issues as a series of data mining problems and propose corresponding solution schemes. Specifically, first, a feature selection approach is proposed based on practical demand and construction characteristics to perform assessment analysis. Second, a unified framework for constraint-based clustering ensemble selection is proposed to extend the environmental impact assessment range from the building level to the regional level. Finally, a multiple disparate clustering method is presented to support the design of sustainable new buildings. We expect our proposal to shed light on data-driven approaches for environmental impact assessment.
Title: Data-driven solutions for building environmental impact assessment
Pub Date: 2015-02-01 | DOI: 10.1109/ICOSC.2015.7050808
Yuta Ohwatari, Takahiro Kawamura, Y. Sei, Yasuyuki Tahara, Akihiko Ohsuga
Many movies depict, in some form, the cultures, social conditions, and awareness of the issues of their times. Even when fantasy and science fiction are far removed from reality, their stories do mirror the real world. Therefore, we assume that the social conditions and cultures of the real world can be understood by analyzing movies. As a way to analyze a film, we estimate the interpersonal relationships between its characters. In this paper, we propose a method of estimating the interpersonal relationships of characters using a Markov Logic Network, applied to movie script databases on the Web. A Markov Logic Network is a probabilistic logic network that can describe relationships between characters that are not necessarily satisfied on every occasion. In experiments, we confirmed that our proposed method can estimate favor between the characters in a movie with a precision of 64.2%. Finally, by comparing the estimated relationships with social indicators, we discuss the relevance of the movies to the real world.
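The "not necessarily satisfied on every occasion" property can be illustrated for a single binary query: in Markov-Logic-style inference, each possible world gets probability proportional to exp(sum of weights of the weighted rules it satisfies), so a violated soft rule lowers probability rather than ruling the world out. The rules, weights, and script-derived evidence below are invented for illustration, not the paper's actual knowledge base.

```python
import math

# Evidence extracted from a (hypothetical) script for characters a and b.
evidence = {"compliments(a,b)": True, "insults(a,b)": False}

# Weighted soft rules over the query atom favors(a,b):
#   compliments(a,b) => favors(a,b)        (weight 1.5)
#   insults(a,b)     => not favors(a,b)    (weight 2.0)
rules = [
    (1.5, lambda ev, fav: (not ev["compliments(a,b)"]) or fav),
    (2.0, lambda ev, fav: (not ev["insults(a,b)"]) or not fav),
]

def world_score(favors):
    # Unnormalized weight of the world where favors(a,b) == favors.
    return math.exp(sum(w for w, rule in rules if rule(evidence, favors)))

# Normalize over the two possible worlds (favors true / false).
p_favors = world_score(True) / (world_score(True) + world_score(False))
```

With only two worlds the normalization is trivial; real MLN inference must sum (or sample) over exponentially many groundings.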
Title: Estimation of character diagram from open-movie databases for cultural understanding
Pub Date: 2015-02-01 | DOI: 10.1109/ICOSC.2015.7050811
Azze-eddine Maredj, Nourredine Tonkin
We propose an approach for the dynamic adaptation of multimedia documents modeled as an over-constrained constraint satisfaction problem (OCSP). In addition to the solutions it provides for determining the relations that do not comply with the user profile, and for the combinatorial explosion when searching for alternative relations, it ensures a certain quality of service for the presentation of the adapted document: (i) if the required constraints are not satisfied, no document is generated, unlike other approaches that generate a document even when its presentation is completely different from the initial one; (ii) the definition of a constraint hierarchy (strong constraints and medium constraints) maintains as much as possible of the initial document's relations in the adapted one. As a result, the adapted presentations are consistent and close to the initial ones.
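The constraint-hierarchy behavior can be sketched with a tiny brute-force solver: every strong constraint must hold or nothing is produced, and among the surviving assignments we keep one satisfying the most medium constraints. The variables and relations below are invented toy stand-ins for temporal relations in an adapted document.

```python
from itertools import product

def adapt(domains, strong, medium):
    # Enumerate assignments; require all strong constraints, then
    # maximize the number of satisfied medium constraints.
    best, best_score = None, -1
    names = list(domains)
    for values in product(*domains.values()):
        assign = dict(zip(names, values))
        if not all(c(assign) for c in strong):
            continue
        score = sum(c(assign) for c in medium)
        if score > best_score:
            best, best_score = assign, score
    return best  # None => strong constraints unsatisfiable: no document generated

# Candidate start times (seconds) for two media objects.
domains = {"video": [0, 5], "caption": [0, 5, 10]}
strong = [lambda a: a["caption"] >= a["video"]]       # caption never precedes video
medium = [lambda a: a["caption"] == a["video"] + 5]   # preferred initial relation

plan = adapt(domains, strong, medium)
```

Returning `None` when strong constraints fail mirrors point (i) above; the medium-constraint score mirrors point (ii), keeping the adapted document close to the initial one.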
Title: CSP-based adaptation of multimedia document composition
Pub Date: 2015-02-01 | DOI: 10.1109/ICOSC.2015.7050780
Xing Tan, G. Tayi
Time Petri Nets (TPNs) have been applied to model the basic event patterns that arise commonly in supply chains, and these TPN-specified patterns can be aggregated to create more complicated supply chain event systems. In previous work, we introduced SCOPE (Situation Calculus Ontology for PEtri nets), which semantically describes Petri nets using the Situation Calculus. In this paper, we show that TESCOPE, which extends SCOPE to incorporate the concept of time, can be naturally applied to supply chain event aggregation. That is, we show that supply-chain event patterns can be easily represented as TESCOPE-based Golog procedures, where Golog is a logic language built on top of the Situation Calculus; we further demonstrate by examples that these basic Golog procedures can be aggregated semantically and hierarchically into complex ones.
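The underlying Petri-net firing semantics can be sketched in a few lines (untimed, for brevity): a transition is enabled when its input places hold tokens, and firing it moves tokens to its output places. The two-step "ship then receive" chain is an invented supply-chain toy, not the TESCOPE/Golog encoding itself.

```python
def enabled(marking, transition):
    # A transition is enabled if every input place holds at least one token.
    return all(marking.get(p, 0) >= 1 for p in transition["in"])

def fire(marking, transition):
    # Firing consumes one token per input place, produces one per output place.
    assert enabled(marking, transition)
    m = dict(marking)
    for p in transition["in"]:
        m[p] -= 1
    for p in transition["out"]:
        m[p] = m.get(p, 0) + 1
    return m

# Toy supply-chain event pattern: a shipment moves warehouse -> transit -> delivered.
ship    = {"in": ["at_warehouse"], "out": ["in_transit"]}
receive = {"in": ["in_transit"],   "out": ["delivered"]}

m0 = {"at_warehouse": 1}
m1 = fire(m0, ship)       # basic pattern 1
m2 = fire(m1, receive)    # basic pattern 2, sequenced after the first
```

Aggregation in the paper's sense corresponds to composing such basic firing sequences into larger procedures, with the Situation Calculus supplying the semantics for each step.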
Title: An ontological and hierarchical approach for supply chain event aggregation
Pub Date: 2015-02-01 | DOI: 10.1109/ICOSC.2015.7050833
Fengjiao Wang, Guan Wang, Shuyang Lin, Philip S. Yu
In recent years, social networks have attracted much attention from research communities in data mining, social science, mobile computing, and related fields, since users create different types of information through different actions, and this information gives us opportunities to better understand people's social lives. Co-clustering is an important technique for detecting patterns and phenomena involving two types of closely related objects. For example, in a location-based social network, places can be clustered with regard to location and to category, and users can be clustered with regard to their locations and to their interests. Therefore, there are usually latent goals behind a co-clustering application. Traditionally, however, co-clustering methods are not specifically designed to handle multiple goals. That leaves a drawback: they cannot guarantee that objects satisfying each individual goal will be clustered into the same cluster. In many cases, however, clusters of objects meeting the same goal are required; e.g., a user may want to search for places within one category but in different locations. In this paper, we propose a goal-oriented co-clustering model that can generate co-clusterings with regard to different goals simultaneously. With this method, we obtain co-clusterings containing objects with the desired aspects of information from the original data source. Seed feature sets are pre-selected to represent the goals of the co-clusterings. By generating expanded feature sets from the seed feature sets, the proposed model concurrently co-clusters objects and assigns the remaining features to different feature clusters.
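The seed-feature idea can be sketched roughly as follows: each goal is represented by a seed feature set, remaining features join the goal whose seeds they co-occur with most, and objects are then grouped by their dominant feature cluster. The data, seed sets, and greedy co-occurrence rule below are invented for illustration and are far simpler than the paper's model.

```python
from collections import Counter

# Toy user-feature data and two goals represented by seed feature sets.
objects = {
    "alice": ["pizza", "pasta", "downtown", "thriller"],
    "bob":   ["sushi", "pasta", "pizza"],
    "carol": ["thriller", "romance", "cinema"],
}
seeds = {"food": {"pizza", "sushi"}, "movies": {"thriller", "romance"}}

def expand(objects, seeds):
    # Assign each non-seed feature to the goal whose seeds it co-occurs with most.
    clusters = {g: set(s) for g, s in seeds.items()}
    all_feats = {f for feats in objects.values() for f in feats}
    for f in sorted(all_feats - set().union(*seeds.values())):
        cooc = Counter()
        for feats in objects.values():
            if f in feats:
                for g, s in seeds.items():
                    cooc[g] += len(s.intersection(feats))
        clusters[cooc.most_common(1)[0][0]].add(f)
    return clusters

def assign_objects(objects, clusters):
    # Group each object under the feature cluster it overlaps most.
    return {o: max(clusters, key=lambda g: len(clusters[g].intersection(feats)))
            for o, feats in objects.items()}

clusters = expand(objects, seeds)
groups = assign_objects(objects, clusters)
```

Running both goals over the same data at once is what makes the grouping "concurrent": each expanded feature set yields its own clustering of the shared objects.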
Title: Concurrent goal-oriented co-clustering generation in social networks