Smart contracts are programs that are executed on the blockchain and can hold, manage and transfer assets in the form of cryptocurrencies. The contract's execution is performed on-chain and is subject to consensus, i.e. every node on the blockchain network has to run the function calls and keep track of their side-effects, including updates to balances and the contract's storage. Most programmable blockchains introduce the notion of gas, which prevents DoS attacks from malicious parties who might try to slow down the network by performing time-consuming and resource-heavy computations. While the gas idea has largely succeeded in its goal of avoiding DoS attacks, the resulting fees are extremely high. For example, in June-September 2022, on Ethereum alone, there was an average total gas usage of 2,706.8 ETH ≈ 3,938,749 USD per day. We propose a protocol that alleviates these costs by moving most of the computation off-chain, while preserving enough data on-chain to guarantee an implicit consensus about the contract state and ownership of funds even in the presence of dishonest parties. We perform extensive experiments over 3,330 real-world Solidity contracts that were involved in 327,132 transactions in June-September 2022 on Ethereum and show that our approach reduces their gas usage by 40.09 percent, which amounts to 442,651 USD.
{"title":"Alleviating High Gas Costs by Secure and Trustless Off-chain Execution of Smart Contracts","authors":"Soroush Farokhnia, Amir Kafshdar Goharshady","doi":"10.1145/3555776.3577833","DOIUrl":"https://doi.org/10.1145/3555776.3577833","url":null,"abstract":"Smart contracts are programs that are executed on the blockchain and can hold, manage and transfer assets in the form of cryptocurrencies. The contract's execution is then performed on-chain and is subject to consensus, i.e. every node on the blockchain network has to run the function calls and keep track of their side-effects including updates to the balances and contract's storage. The notion of gas is introduced in most programmable blockchains, which prevents DoS attacks from malicious parties who might try to slow down the network by performing time-consuming and resource-heavy computations. While the gas idea has largely succeeded in its goal of avoiding DoS attacks, the resulting fees are extremely high. For example, in June-September 2022, on Ethereum alone, there has been an average total gas usage of 2,706.8 ETH ≈ 3,938,749 USD per day. We propose a protocol for alleviating these costs by moving most of the computation off-chain while preserving enough data on-chain to guarantee an implicit consensus about the contract state and ownership of funds in case of dishonest parties. We perform extensive experiments over 3,330 real-world Solidity contracts that were involved in 327,132 transactions in June-September 2022 on Ethereum and show that our approach reduces their gas usage by 40.09 percent, which amounts to a whopping 442,651 USD.","PeriodicalId":42971,"journal":{"name":"Applied Computing Review","volume":"137 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77813297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conventional LSM-tree designs delete data by inserting a delete mark for the specified key, thus leaving several out-of-date values for that key on the LSM tree. As a result, the LSM tree faces a serious data-security issue due to these undeleted values when the need for data sanitization arises. Sanitization is a time-consuming process that involves completely removing sensitive data from storage devices. Flash-based SSDs are widely used in many systems, but they lack an in-place update feature, which makes it difficult for LSM trees to maintain both privacy and performance on these devices. This work proposes an efficient sanitizable LSM-tree design for LSM-based key-value stores over 3D NAND flash memories. Our design integrates key-value pair updating with the execution of sanitization by exploiting our proposed influence-conscious programming method. The capability of the proposed design is evaluated by a series of experiments, with very encouraging results.
{"title":"Efficient Sanitization Design for LSM-based Key-Value Store over 3D MLC NAND Flash","authors":"Liang-Chi Chen, Shu-Qi Yu, Chien-Chung Ho, Wei-Chen Wang, Yung-Chun Li","doi":"10.1145/3555776.3577780","DOIUrl":"https://doi.org/10.1145/3555776.3577780","url":null,"abstract":"Conventional LSM tree designs delete data by inserting a delete mark to the specified key, and they thus it leaves several out-of-date values to the specified key on the LSM tree. As a result, the LSM tree encounters a serious data security issue due to the undeleted values when there arises the need for data sanitization. Sanitization is a time-consuming process that involves completely removing sensitive data from storage devices. Flash-based SSDs are widely used in many systems, but they lack an in-place update feature, which makes it difficult for LSM trees to maintain both privacy and performance on these devices. This work proposes an efficient sanitizable LSM-tree design for LSM-based key-value store over 3D NAND flash memories. Our proposed efficient sanitizable LSM-tree design focuses on integrating the processes of key-value pair updating and the execution of sanitization by exploiting our proposed influence-conscious programming method. The capability of the proposed design is evaluated by a series of experiments, for which we have very encouraging results.","PeriodicalId":42971,"journal":{"name":"Applied Computing Review","volume":"11 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90538526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-domain sentiment classification trains a classifier using multiple domains and then tests the classifier on one of the domains. Importantly, no domain is assumed to have sufficient labeled data; instead, the goal is to leverage information between domains, making multi-domain sentiment classification a very realistic scenario. Typically, labeled data is costly because humans must classify it manually. In this context, we propose MUTUAL, an approach that learns general and domain-specific sentence embeddings that are also context-aware thanks to an attention mechanism. We use a stacked BiLSTM-based autoencoder with an attention mechanism to generate these two types of sentence embeddings. Then, using the Jensen-Shannon (JS) distance, the general sentence embeddings of the four domains most similar to the target domain are selected. The selected general sentence embeddings and the domain-specific embeddings are concatenated and fed into a dense layer for training. Evaluation results on public datasets with 16 different domains demonstrate the efficiency of our model. In addition, we propose an active learning algorithm that first applies the elliptic envelope for outlier removal to a pool of unlabeled data, which the MUTUAL model then classifies. Next, the most uncertain data points are selected for labeling based on the least-confidence metric. The experiments show that querying 38% of the original data yields higher accuracy than random sampling.
{"title":"MUTUAL: Multi-Domain Sentiment Classification via Uncertainty Sampling","authors":"K. Katsarou, Roxana Jeney, K. Stefanidis","doi":"10.1145/3555776.3577765","DOIUrl":"https://doi.org/10.1145/3555776.3577765","url":null,"abstract":"Multi-domain sentiment classification trains a classifier using multiple domains and then tests the classifier on one of the domains. Importantly, no domain is assumed to have sufficient labeled data; instead, the goal is leveraging information between domains, making multi-domain sentiment classification a very realistic scenario. Typically, labeled data is costly because humans must classify it manually. In this context, we propose the MUTUAL approach that learns general and domain-specific sentence embeddings that are also context-aware due to the attention mechanism. In this work, we propose using a stacked BiLSTM-based Autoencoder with an attention mechanism to generate the two above-mentioned types of sentence embeddings. Then, using the Jensen-Shannon (JS) distance, the general sentence embeddings of the four most similar domains to the target domain are selected. The selected general sentence embeddings and the domain-specific embeddings are concatenated and fed into a dense layer for training. Evaluation results on public datasets with 16 different domains demonstrate the efficiency of our model. In addition, we propose an active learning algorithm that first applies the elliptic envelope for outlier removal to a pool of unlabeled data that the MUTUAL model then classifies. Next, the most uncertain data points are selected to be labeled based on the least confidence metric. The experiments show higher accuracy for querying 38% of the original data than random sampling.","PeriodicalId":42971,"journal":{"name":"Applied Computing Review","volume":"1 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89737340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Oriented object detection in aerial images is a challenging task due to highly complex backgrounds and objects that are arbitrarily oriented and usually densely arranged. Existing oriented object detectors are CNN-based and can be divided into three types: two-stage, one-stage, and anchor-free methods. All of them require non-maximum suppression (NMS) to eliminate duplicated predictions. Recently, transformer-based object detectors have removed such hand-designed components by directly solving a set-prediction problem via bipartite matching, achieving state-of-the-art performance in general object detection. Motivated by this research, we propose a transformer-based oriented object detector named Rotated DETR, with oriented bounding box (OBB) labeling. We embed a scoring network to reduce the number of tokens corresponding to the background. In addition, we apply a proposal generator and an iterative proposal refinement module in order to provide proposals with angle information to the transformer decoder. Rotated DETR achieves state-of-the-art performance among single-stage and anchor-free oriented object detectors on the DOTA, UCAS-AOD, and DIOR-R datasets while using only 10% of the feature tokens. In our experiments, we show the effectiveness of the scoring network and the iterative proposal refinement module.
{"title":"Rotated-DETR: an End-to-End Transformer-based Oriented Object Detector for Aerial Images","authors":"Gil-beom Lee, Jinbeom Kim, Taejune Kim, Simon S. Woo","doi":"10.1145/3555776.3577745","DOIUrl":"https://doi.org/10.1145/3555776.3577745","url":null,"abstract":"Oriented object detection in aerial images is a challenging task due to the highly complex backgrounds and objects with arbitrary oriented and usually densely arranged. Existing oriented object detection methods adopt CNN-based methods, and they can be divided into three types: two-stage, one-stage, and anchor-free methods. All of them require non-maximum suppression (NMS) to eliminate the duplicated predictions. Recently, object detectors based on the transformer remove hand-designed components by directly solving set prediction problems via performing bipartite matching, and achieve state-of-the-art performances in general object detection. Motivated by this research, we propose a transformer-based oriented object detector named Rotated DETR with oriented bounding boxes (OBBs) labeling. We embed the scoring network to reduce the tokens corresponding to the background. In addition, we apply a proposal generator and iterative proposal refinement module in order to provide proposals with angle information to the transformer decoder. Rotated DETR achieves state-of-the-art performance on the single-stage and anchor-free oriented object detectors on DOTA, UCAS-AOD, and DIOR-R datasets with only 10% feature tokens. In the experiment, we show the effectiveness of the scoring network and iterative proposal refinement module.","PeriodicalId":42971,"journal":{"name":"Applied Computing Review","volume":"10 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89429724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Current artificial neural networks are trained with parameters encoded as floating-point numbers that occupy a lot of memory at inference time. Due to the increasing size of deep learning models, it is becoming very difficult to train and use artificial neural networks on edge devices such as smartphones. Binary neural networks promise to reduce the size of deep neural network models and to increase inference speed while decreasing energy consumption, thereby allowing the deployment of more powerful models on edge devices. However, binary neural networks remain difficult to train with the backpropagation-based gradient descent scheme. We propose to adapt to binary neural networks two training algorithms that are considered promising alternatives to backpropagation for continuous neural networks. We provide comparative experimental results for image classification, including a backpropagation baseline, on the MNIST, Fashion-MNIST and CIFAR-10 datasets in both continuous and binary settings. The results demonstrate that binary neural networks can not only be trained with alternatives to backpropagation but can also achieve better performance and a higher tolerance to the presence or absence of batch normalization layers.
{"title":"Are alternatives to backpropagation useful for training Binary Neural Networks? An experimental study in image classification","authors":"Ben Crulis, Barthélémy Serres, Cyril de Runz, G. Venturini","doi":"10.1145/3555776.3577674","DOIUrl":"https://doi.org/10.1145/3555776.3577674","url":null,"abstract":"Current artificial neural networks are trained with parameters encoded as floating point numbers that occupy lots of memory space at inference time. Due to the increase in size of deep learning models, it is becoming very difficult to consider training and using artificial neural networks on edge devices such as smartphones. Binary neural networks promise to reduce the size of deep neural network models as well as increasing inference speed while decreasing energy consumption and so allow the deployment of more powerful models on edge devices. However, binary neural networks are still proven to be difficult to train using the backpropagation based gradient descent scheme. We propose to adapt to binary neural networks two training algorithms considered as promising alternatives to backpropagation but for continuous neural networks. We provide experimental comparative results for image classification including the backpropagation baseline on the MNIST, Fashion MNIST and CIFAR-10 datasets in both continuous and binary settings. The results demonstrate that binary neural networks can not only be trained using alternative algorithms to backpropagation but can also be shown to lead better performance and a higher tolerance to the presence or absence of batch normalization layers.","PeriodicalId":42971,"journal":{"name":"Applied Computing Review","volume":"35 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77059636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Microscopy analysis of sputum images for bacilli screening is a common method for both diagnosis and therapy monitoring of tuberculosis (TB). Nonetheless, it is a challenging procedure, since sputum examination is time-consuming and needs highly competent personnel to provide the accurate results that are important for clinical decision-making. In addition, manual fluorescence microscopy examination of sputum samples for tuberculosis diagnosis and treatment monitoring is a subjective process. In this work, we automate the examination of fields of view (FOVs) of TB bacteria in order to determine their lipid content as well as bacterial length and width. We propose a modified version of the UNet model to rapidly localise potential bacteria inside a FOV. We introduce a novel method that uses Fourier descriptors to exclude contours that do not belong to the bacteria class, hence minimising the number of false positives. Finally, we propose a new feature as a means of extracting a representation that is fed into a support-vector multi-regressor in order to estimate the length and width of each bacterium. Using a real-world data corpus, the proposed method i) outperformed previous methods, and ii) estimated cell length and width with a root mean square error of less than 0.01%.
{"title":"Estimating Phenotypic Characteristics of Tuberculosis Bacteria","authors":"D. Sloan, E. Dombay, W. Sabiiti, B. Mtafya, Ognjen Arandelovic, Marios Zachariou","doi":"10.1145/3555776.3578609","DOIUrl":"https://doi.org/10.1145/3555776.3578609","url":null,"abstract":"Microscopy analysis of sputum images for bacilli screening is a common method used for both diagnosis and therapy monitoring of tuberculosis (TB). Nonetheless, it is a challenging procedure, since sputum examination is time-consuming and needs highly competent personnel to provide accurate results which are important for clinical decision-making. In addition, manual fluorescence microscopy examination of sputum samples for tuberculosis diagnosis and treatment monitoring is a subjective operation. In this work, we automate the process of examining fields of view (FOVs) of TB bacteria in order to determine the lipid content, and bacterial length and width. We propose a modified version of the UNet model to rapidly localise potential bacteria inside a FOV. We introduce a novel method that uses Fourier descriptors to exclude contours that do not belong to the class of bacteria, hence minimising the amount of false positives. Finally, we propose a new feature as a means of extracting a representation fed into a support vector multi-regressor in order to estimate the length and width of each bacterium. Using a real-world data corpus, the proposed method i) outperformed previous methods, and ii) estimated the cell length and width with a root mean square error of less than 0.01%.","PeriodicalId":42971,"journal":{"name":"Applied Computing Review","volume":"51 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85130465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Crystalline materials, such as metals and semiconductors, nearly always contain a special type of defect called a dislocation. This defect decisively determines many important material properties, e.g., strength, fracture toughness, or ductility. Over the past years, significant effort has been put into understanding dislocation behavior across different length scales via experimental characterization techniques and simulations. This paper introduces the dislocation ontology (DISO), which defines the concepts and relationships related to linear defects in crystalline materials. We developed DISO using a top-down approach, in which we start by defining the most general concepts in the dislocation domain and subsequently specialize them. DISO is published through a persistent URL, following W3C best practices for publishing Linked Data. Two potential use cases for DISO are presented to illustrate its usefulness in the dislocation dynamics domain. The ontology is evaluated in two directions: its success in modeling a real-world domain and its richness.
{"title":"DISO: A Domain Ontology for Modeling Dislocations in Crystalline Materials","authors":"Ahmad Zainul Ihsan, S. Fathalla, S. Sandfeld","doi":"10.1145/3555776.3578739","DOIUrl":"https://doi.org/10.1145/3555776.3578739","url":null,"abstract":"Crystalline materials, such as metals and semiconductors, nearly always contain a special defect type called dislocation. This defect decisively determines many important material properties, e.g., strength, fracture toughness, or ductility. Over the past years, significant effort has been put into understanding dislocation behavior across different length scales via experimental characterization techniques and simulations. This paper introduces the dislocation ontology (DISO), which defines the concepts and relationships related to linear defects in crystalline materials. We developed DISO using a top-down approach in which we start defining the most general concepts in the dislocation domain and subsequent specialization of them. DISO is published through a persistent URL following W3C best practices for publishing Linked Data. Two potential use cases for DISO are presented to illustrate its usefulness in the dislocation dynamics domain. The evaluation of the ontology is performed in two directions, evaluating the success of the ontology in modeling a real-world domain and the richness of the ontology.","PeriodicalId":42971,"journal":{"name":"Applied Computing Review","volume":"33 5 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85565649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Most existing face recognition applications using deep learning models have leveraged CNN-based architectures as the feature extractor. However, recent studies have shown that in computer vision tasks, vision transformer-based models often outperform CNN-based models. Therefore, in this work, we propose a Sparse Vision Transformer (S-ViT) based on the Vision Transformer (ViT) architecture to improve face recognition. After training, S-ViT tends to have a sparser distribution of weights than ViT, hence its name. Unlike the conventional ViT, our proposed S-ViT adopts the image Relative Positional Encoding (iRPE) method for positional encoding. Also, S-ViT is modified so that all token embeddings, not just the class token, participate in the decoding process. Through extensive experiments, we show that S-ViT achieves better closed-set performance than the other baseline models, including the baseline ViT-based models. For example, when using ArcFace as the loss function in the identification protocol, S-ViT achieved up to 3.27% higher accuracy than ResNet50. We also show that the ArcFace loss function yields greater performance gains in S-ViT than in the baseline models. In addition, S-ViT has an advantage in the cost-performance trade-off, because it tends to be more robust to pruning than the underlying ViT model. Therefore, S-ViT can be deployed more flexibly on target devices with limited resources.
{"title":"S-ViT: Sparse Vision Transformer for Accurate Face Recognition","authors":"Geunsu Kim, Gyudo Park, Soohyeok Kang, Simon S. Woo","doi":"10.1145/3555776.3577640","DOIUrl":"https://doi.org/10.1145/3555776.3577640","url":null,"abstract":"Most of the existing face recognition applications using deep learning models have leveraged CNN-based architectures as the feature extractor. However, recent studies have shown that in computer vision tasks, vision transformer-based models often outperform CNN-based models. Therefore, in this work, we propose a Sparse Vision Transformer (S-ViT) based on the Vision Transformer (ViT) architecture to improve the face recognition tasks. After the model is trained, S-ViT tends to have a sparse distribution of weights compared to ViT, so we named it according to these characteristics. Unlike the conventional ViT, our proposed S-ViT adopts image Relative Positional Encoding (iRPE) method for positional encoding. Also, S-ViT has been modified so that all token embeddings, not just class token, participate in the decoding process. Through extensive experiment, we showed that S-ViT achieves better performance in closed-set than the other baseline models, and showed better performance than the baseline ViT-based models. For example, when using ArcFace as the loss function in the identification protocol, S-ViT achieved up to 3.27% higher accuracy than ResNet50. We also show that the use of ArcFace loss functions yields greater performance gains in S-ViT than in baseline models. In addition, S-ViT has an advantage in cost-performance trade-off because it tends to be more robust to the pruning technique than the underlying model, ViT. Therefore, S-ViT offers the additional advantage, which can be applied more flexibly in the target devices with limited resources.","PeriodicalId":42971,"journal":{"name":"Applied Computing Review","volume":"18 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81609493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning to Rank is the task of learning a ranking function from a set of query-document pairs. Generally, a query is associated with thousands of documents, but not all of them are informative for the learning phase. Different strategies have been designed to select the most informative documents from the training set. However, most of them focus on reducing the size of the training set to speed up the learning phase, sacrificing effectiveness. A first attempt in this direction was Selective Gradient Boosting, a learning algorithm that uses a customisable sampling strategy to train effective ranking models. In this work, we propose a new sampling strategy called High_Low_Sampl for selecting negative examples, applicable to Selective Gradient Boosting without compromising model effectiveness. The proposed sampling strategy allows Selective Gradient Boosting to compose a new training set by selecting three document classes from the original one: the positive examples, high-ranked negative examples and low-ranked negative examples. The resulting dataset aims at minimising the mis-ranking risk, i.e., enhancing the discriminative power of the learned model while maintaining generalisation to unseen instances. Through an extensive experimental analysis on publicly available datasets, we demonstrate that the proposed selection algorithm makes the most of the negative examples in the training set and leads to models capable of statistically significant improvements in terms of NDCG over the state of the art.
{"title":"On the Effect of Low-Ranked Documents: A New Sampling Function for Selective Gradient Boosting","authors":"C. Lucchese, Federico Marcuzzi, S. Orlando","doi":"10.1145/3555776.3577597","DOIUrl":"https://doi.org/10.1145/3555776.3577597","url":null,"abstract":"Learning to Rank is the task of learning a ranking function from a set of query-documents pairs. Generally, documents within a query are thousands but not all documents are informative for the learning phase. Different strategies were designed to select the most informative documents from the training set. However, most of them focused on reducing the size of the training set to speed up the learning phase, sacrificing effectiveness. A first attempt in this direction was achieved by Selective Gradient Boosting a learning algorithm that makes use of customisable sampling strategy to train effective ranking models. In this work, we propose a new sampling strategy called High_Low_Sampl for selecting negative examples applicable to Selective Gradient Boosting, without compromising model effectiveness. The proposed sampling strategy allows Selective Gradient Boosting to compose a new training set by selecting from the original one three document classes: the positive examples, high-ranked negative examples and low-ranked negative examples. The resulting dataset aims at minimizing the mis-ranking risk, i.e., enhancing the discriminative power of the learned model and maintaining generalisation to unseen instances. We demonstrated through an extensive experimental analysis on publicly available datasets, that the proposed selection algorithm is able to make the most of the negative examples within the training set and leads to models capable of obtaining statistically significant improvements in terms of NDCG, compared to the state of the art.","PeriodicalId":42971,"journal":{"name":"Applied Computing Review","volume":"18 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81259953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Electronic voting (e-voting) is regularly used in many countries and organizations for legally binding elections. In order to conduct such elections securely, numerous e-voting systems have been proposed over the last few decades. Notably, some of these systems were designed to provide coercion-resistance. This property protects against potential adversaries trying to swing an election by coercing voters. Despite the multitude of existing coercion-resistant e-voting systems, to date, only a few of them can handle large-scale Internet elections efficiently. One of these systems, VoteAgain (USENIX Security 2020), was originally claimed to be secure under trust assumptions similar to those of state-of-the-art e-voting systems without coercion-resistance. In this work, we review VoteAgain's security properties. We discover that, unlike originally claimed, VoteAgain is no more secure than a trivial voting system with a completely trusted election authority. To mitigate this issue, we propose a variant of VoteAgain which effectively reduces the trust placed in the election authorities while preserving VoteAgain's usability and efficiency. Altogether, our findings bring the state of science one step closer to the goal of scalable coercion-resistant e-voting that is secure under reasonable trust assumptions.
{"title":"Scalable Coercion-Resistant E-Voting under Weaker Trust Assumptions","authors":"Thomas Haines, Johannes Müller, Iñigo Querejeta-Azurmendi","doi":"10.1145/3555776.3578730","DOIUrl":"https://doi.org/10.1145/3555776.3578730","url":null,"abstract":"Electronic voting (e-voting) is regularly used in many countries and organizations for legally binding elections. In order to conduct such elections securely, numerous e-voting systems have been proposed over the last few decades. Notably, some of these systems were designed to provide coercion-resistance. This property protects against potential adversaries trying to swing an election by coercing voters. Despite the multitude of existing coercion-resistant e-voting systems, to date, only few of them can handle large-scale Internet elections efficiently. One of these systems, VoteAgain (USENIX Security 2020), was originally claimed secure under similar trust assumptions to state-of-the-art e-voting systems without coercion-resistance. In this work, we review VoteAgain's security properties. We discover that, unlike originally claimed, VoteAgain is no more secure than a trivial voting system with a completely trusted election authority. In order to mitigate this issue, we propose a variant of VoteAgain which effectively mitigates trust on the election authorities and, at the same time, preserves VoteAgain's usability and efficiency. Altogether, our findings bring the state of science one step closer to the goal of scalable coercion-resistant e-voting being secure under reasonable trust assumptions.","PeriodicalId":42971,"journal":{"name":"Applied Computing Review","volume":"22 1","pages":""},"PeriodicalIF":1.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89267605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}