Understanding temporal commonsense concepts, such as times of occurrence and durations, is crucial for event-centric language understanding. Reasoning about such temporal concepts in a complex context requires reasoning over both the stated context and the world knowledge that underlies it. A recent study shows that massive pre-trained LMs still struggle with such temporal reasoning under complex contexts (e.g., dialog) because they only implicitly encode the relevant contexts and fail to explicitly uncover the underlying logical compositions for complex inference, and thus may not be robust enough. In this work, we propose to augment LMs with a temporal logic induction ability, which frames temporal reasoning through three modular components: a temporal dependency inducer, a temporal concept defuzzifier, and a logic validator. The first two components disentangle the explicit/implicit dependency between temporal concepts across context (before, after, ...) and the specific meaning of fuzzy temporal concepts, respectively, while the validator combines the intermediate reasoning clues for robust contextual reasoning about the temporal concepts. Extensive experimental results on TIMEDIAL, a challenging dataset for temporal reasoning over dialog, show that our method, Logic Induction Enhanced Contextualized TEmporal Reasoning (LECTER), can yield great improvements over the traditional language model for temporal reasoning.
Self-Supervised Logic Induction for Explainable Fuzzy Temporal Commonsense Reasoning. Bibo Cai, Xiao Ding, Zhouhao Sun, Bing Qin, Ting Liu, Baojun Wang, Lifeng Shang. Proceedings of the ... AAAI Conference on Artificial Intelligence, pp. 12580-12588. Pub Date: 2023-06-26. DOI: 10.1609/aaai.v37i11.26481 (https://doi.org/10.1609/aaai.v37i11.26481)
Pub Date: 2023-06-26, DOI: 10.1609/aaai.v37i12.26653
Avinandan Bose, Tracey Li, Arunesh Sinha, Tien Mai
Community health workers (CHWs) play a crucial role in the last-mile delivery of essential health services to underserved populations in low-income countries. Many nongovernmental organizations (NGOs) provide training and support to enable CHWs to deliver health services to their communities, at no charge to the recipients of the services. This includes monetary compensation for the work that CHWs perform, which is broken down into a series of well-defined tasks. In this work, we partner with an NGO, D-Tree International, to design a fair monetary compensation scheme for tasks performed by CHWs in the semi-autonomous region of Zanzibar in Tanzania, Africa. In consultation with stakeholders, we interpret fairness as the equal opportunity to earn, which means that each CHW has the opportunity to earn roughly the same total payment over a given T-month period, if the CHW reacts to the incentive scheme almost rationally. We model this problem as a reward design problem for a Markov Decision Process (MDP) formulation of the CHWs’ earnings. The mechanism needs to be simple so that the CHWs understand it, so we explore linear and piecewise linear rewards in the CHWs’ measured units of work. We solve this design problem via a novel policy-reward gradient result. Our experiments, which use two real-world parameters from the field, provide evidence that our scheme produces reasonable incentives.
A Fair Incentive Scheme for Community Health Workers. Avinandan Bose, Tracey Li, Arunesh Sinha, Tien Mai. Proceedings of the ... AAAI Conference on Artificial Intelligence, pp. 14127-14135. Pub Date: 2023-06-26. DOI: 10.1609/aaai.v37i12.26653 (https://doi.org/10.1609/aaai.v37i12.26653)
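To make the "piecewise linear rewards in the CHWs’ measured units of work" from the abstract above concrete, here is a minimal sketch. The function name, breakpoints, and rates are hypothetical illustrations, not values or code from the paper, and the sketch does not attempt the MDP or policy-reward gradient part.

```python
def piecewise_linear_reward(units, breakpoints, rates):
    """Total payment for `units` of completed work.

    rates[i] is the per-unit payment on the segment starting at
    breakpoints[i]. Breakpoints must start at 0 and be increasing.
    All names and numbers here are hypothetical.
    """
    if len(breakpoints) != len(rates) or breakpoints[0] != 0:
        raise ValueError("need one rate per segment and a first breakpoint of 0")
    total = 0.0
    for i, rate in enumerate(rates):
        start = breakpoints[i]
        # The last segment is open-ended.
        end = breakpoints[i + 1] if i + 1 < len(breakpoints) else float("inf")
        if units <= start:
            break
        # Pay only for the part of the work that falls inside this segment.
        total += rate * (min(units, end) - start)
    return total


# A single segment reduces to a plain linear reward.
linear = piecewise_linear_reward(7, [0], [3.0])
# Two segments: a higher rate for the first 10 units, a lower rate afterwards.
two_piece = piecewise_linear_reward(15, [0, 10], [2.0, 1.0])
```

A scheme like this stays easy to explain to workers because each segment is just "this much per unit of work", which matches the simplicity requirement stated in the abstract.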
Pub Date: 2023-06-26, DOI: 10.1609/aaai.v37i12.26744
Jianglin Lan, Yang Zheng, A. Lomuscio
We propose an enhanced semidefinite program (SDP) relaxation to enable tight and efficient verification of neural networks (NNs). The tightness improvement is achieved by adding a nonlinear constraint to existing SDP relaxations previously proposed for NN verification. The efficiency of the proposal stems from the iterative nature of the algorithm: it solves the resulting non-convex SDP by recursively solving auxiliary convex layer-based SDP problems. We show formally that the solution generated by our algorithm is tighter than state-of-the-art SDP-based solutions for the problem. We also show that the solution sequence converges to the optimal solution of the non-convex enhanced SDP relaxation. The experimental results on standard benchmarks in the area show that our algorithm achieves state-of-the-art performance whilst maintaining an acceptable computational cost.
Iteratively Enhanced Semidefinite Relaxations for Efficient Neural Network Verification. Jianglin Lan, Yang Zheng, A. Lomuscio. Proceedings of the ... AAAI Conference on Artificial Intelligence, pp. 14937-14945. Pub Date: 2023-06-26. DOI: 10.1609/aaai.v37i12.26744 (https://doi.org/10.1609/aaai.v37i12.26744)
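As background for the "existing SDP relaxations" that the abstract above builds on, the block below recalls the standard quadratic encoding of a ReLU layer and its semidefinite lifting. The symbols $W_i$, $b_i$, $x_i$, $v$, and $P$ are notation assumed here for illustration. The additional nonlinear constraint that the paper introduces is not specified in the abstract and is not reproduced.

```latex
% Standard quadratic encoding of one ReLU layer (background only).
\begin{align}
  x_{i+1} = \max(0,\, W_i x_i + b_i)
  \;\Longleftrightarrow\;
  \begin{cases}
    x_{i+1} \ge 0, \\
    x_{i+1} \ge W_i x_i + b_i, \\
    x_{i+1} \odot \bigl(x_{i+1} - W_i x_i - b_i\bigr) = 0 .
  \end{cases}
\end{align}
% Lifting: with v = [1, x_0^T, ..., x_L^T]^T, the constraints above are
% linear in the entries of P = v v^T. Dropping the rank-one requirement
% gives a convex SDP.
\begin{align}
  P = v v^{\top}
  \quad\text{is relaxed to}\quad
  P \succeq 0, \qquad P_{11} = 1 .
\end{align}
```

The relaxation is loose exactly because $P \succeq 0$ admits matrices that are not rank one, which is the gap a tightening constraint aims to reduce.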
Pub Date: 2023-06-26, DOI: 10.1609/aaai.v37i13.26973
Yi He, Wenxin Tai, Fan Zhou, Yi Yang
In financial economics, studies have shown that the textual content of an earnings conference call transcript has predictive power for a firm's future risk. However, a conference call transcript is very long and contains diverse irrelevant content, which poses challenges for text-based risk forecasting. This study investigates the structural dependency within a conference call transcript by explicitly modeling the dialogue between managers and analysts. Specifically, we use TextRank to extract information and exploit the semantic correlation within a discussion using hypergraph learning. This novel design can improve the transcript representation performance and reduce the risk of forecast errors. Experimental results on a large-scale dataset show that our approach can significantly improve prediction performance compared to state-of-the-art text-based models.
Exploring Hypergraph of Earnings Call for Risk Prediction (Student Abstract). Yi He, Wenxin Tai, Fan Zhou, Yi Yang. Proceedings of the ... AAAI Conference on Artificial Intelligence, pp. 16226-16227. Pub Date: 2023-06-26. DOI: 10.1609/aaai.v37i13.26973 (https://doi.org/10.1609/aaai.v37i13.26973)
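The abstract above names TextRank as the extraction step. The following is a minimal sketch of the general TextRank idea (PageRank-style scoring over a sentence similarity graph), not the authors' pipeline. The Jaccard word-overlap similarity, the function name, and the sample sentences are assumptions made for illustration, and the hypergraph learning stage is not covered.

```python
import re


def textrank(sentences, damping=0.85, iters=50):
    """Score sentences by weighted PageRank on a word-overlap graph."""
    n = len(sentences)
    words = [set(re.findall(r"[a-z]+", s.lower())) for s in sentences]
    # Edge weight: Jaccard overlap between word sets (a simplifying assumption).
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and words[i] and words[j]:
                w[i][j] = len(words[i] & words[j]) / len(words[i] | words[j])
    out = [sum(row) for row in w]  # total outgoing weight of each sentence
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n
            + damping * sum(
                w[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0
            )
            for i in range(n)
        ]
    return scores


# Hypothetical transcript snippets; the last one shares no words with the others.
sentences = [
    "Revenue grew this quarter.",
    "Analysts asked about revenue growth this quarter.",
    "Thank you operator.",
]
scores = textrank(sentences)
```

Sentences that share vocabulary with many others accumulate score, so keeping only the top-ranked ones is one way to shorten a long transcript before further modeling.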
Raven’s Progressive Matrices (RPMs) have been widely used to evaluate the visual reasoning ability of humans. To tackle the challenges of visual perception and logical reasoning on RPMs, we propose a Hierarchical ConViT with Attention-based Relational Reasoner (HCV-ARR). Traditional solution methods often apply relatively shallow convolutional networks to visually perceive shape patterns in RPM images, which may not fully model the long-range dependencies of complex pattern combinations in RPMs. The proposed ConViT consists of a convolutional block to capture the low-level attributes of visual patterns, and a transformer block to capture high-level image semantics such as pattern formations. Furthermore, the proposed hierarchical ConViT captures visual features from multiple receptive fields, where the shallow layers focus on fine image details while the deeper layers focus on image semantics. To better model the underlying reasoning rules embedded in RPM images, an Attention-based Relational Reasoner (ARR) is proposed to establish the underlying relations among images. The proposed ARR effectively exploits the hidden relations among question images through the developed element-wise attentive reasoner. Experimental results on three RPM datasets demonstrate that the proposed HCV-ARR achieves a significant performance gain compared with state-of-the-art models. The source code is available at: https://github.com/wentaoheunnc/HCV-ARR.
Hierarchical ConViT with Attention-Based Relational Reasoner for Visual Analogical Reasoning. Wentao He, Jialu Zhang, Jianfeng Ren, Ruibin Bai, Xudong Jiang. Proceedings of the ... AAAI Conference on Artificial Intelligence, pp. 22-30. Pub Date: 2023-06-26. DOI: 10.1609/aaai.v37i1.25072 (https://doi.org/10.1609/aaai.v37i1.25072)
Pub Date: 2023-06-26, DOI: 10.1609/aaai.v37i5.25720
Tien Mai, Arunesh Sinha
Vaccine delivery in under-resourced locations with security risks is not just challenging but also life-threatening. The COVID pandemic and the need to vaccinate added even more urgency to this issue. Motivated by this problem, we propose a general framework to set up a limited number of temporary (vaccination) centers that balance physical security and desired (vaccine) service coverage with limited resources. We set up the problem as a Stackelberg game between the center operator (defender) and an adversary, where the set of centers is not fixed a priori but is part of the decision output. This results in a mixed combinatorial and continuous optimization problem. As part of our scalable approximation solution, we provide a fundamental contribution by identifying general duality conditions for switching max and min when both discrete and continuous variables are involved. Via detailed experiments, we show that the proposed solution is scalable in practice.
Securing Lifelines: Safe Delivery of Critical Services in Areas with Volatile Security Situation via a Stackelberg Game Approach. Tien Mai, Arunesh Sinha. Proceedings of the ... AAAI Conference on Artificial Intelligence, pp. 5805-5813. Pub Date: 2023-06-26. DOI: 10.1609/aaai.v37i5.25720 (https://doi.org/10.1609/aaai.v37i5.25720)
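The abstract above refers to conditions under which max and min can be switched. As a toy illustration of why such conditions matter, the sketch below compares max-min and min-max on small payoff tables. It is purely discrete, with hypothetical payoffs and function names, and is not the paper's mixed discrete-continuous setting or its duality result.

```python
def max_min(payoff):
    """Defender commits to a row first; the adversary then picks the worst column."""
    return max(min(row) for row in payoff)


def min_max(payoff):
    """Adversary commits to a column first; the defender then picks the best row."""
    return min(max(col) for col in zip(*payoff))


# payoff[i][j]: defender utility for defender choice i and adversary choice j.
# Both tables are hypothetical.
with_saddle_point = [[3, 1], [2, 2]]
without_saddle_point = [[3, 0], [1, 2]]

gap_free = (max_min(with_saddle_point), min_max(with_saddle_point))
gapped = (max_min(without_saddle_point), min_max(without_saddle_point))
```

The inequality max-min ≤ min-max always holds. Equality, which is what allows the two operators to be swapped, needs extra structure, and identifying such structure for mixed variables is the contribution the abstract describes.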
Pub Date: 2023-06-26, DOI: 10.1609/aaai.v37i13.26896
M. Zong, Bhaskar Krishnamachari
Researchers have been interested in developing AI tools to help students learn various mathematical subjects. One challenging set of tasks for school students is learning to solve math word problems. We explore how recent advances in natural language processing, specifically the rise of powerful transformer-based models, can be applied to help math learners with such problems. Concretely, we evaluate the use of GPT-3, a 175B-parameter transformer model recently released by OpenAI, for three related challenges pertaining to math word problems corresponding to systems of two linear equations. The three challenges are classifying word problems, extracting equations from word problems, and generating word problems. For the first challenge, we define a set of problem classes and find that GPT-3 has generally very high accuracy in classifying word problems (80%-100%) for all but one of these classes. For the second challenge, we find that the accuracy of extracting equations improves with the number of examples provided to the model, ranging from 31% for zero-shot learning to about 69% using 3-shot learning, which is further improved to 80% with fine-tuning. For the third challenge, we find that GPT-3 is able to generate problems with accuracy ranging from 33% to 93%, depending on the problem type.
Solving Math Word Problems concerning Systems of Equations with GPT-3. M. Zong, Bhaskar Krishnamachari. Proceedings of the ... AAAI Conference on Artificial Intelligence, pp. 15972-15979. Pub Date: 2023-06-26. DOI: 10.1609/aaai.v37i13.26896 (https://doi.org/10.1609/aaai.v37i13.26896)
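For readers unfamiliar with the problem class in the abstract above, here is a small worked example of a word problem that maps to a system of two linear equations, solved with Cramer's rule. The word problem, the function name, and the coefficient layout are hypothetical. The sketch covers only the solving step that follows equation extraction and involves no language model.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 for integer coefficients."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("no unique solution: the determinant is zero")
    # Cramer's rule, kept exact with rational arithmetic.
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


# Hypothetical word problem: "Two numbers add up to 10 and differ by 2."
# Extracted equations: x + y = 10 and x - y = 2.
x, y = solve_2x2(1, 1, 10, 1, -1, 2)
```

Checking a model-extracted pair of equations against a solver like this is one simple way to score extraction accuracy when the expected answer is known.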
Pub Date: 2023-06-26, DOI: 10.1609/aaai.v37i4.25506
Erel Cohen, Omer Lev, R. Zivan
Belief propagation is a widely used incomplete optimization algorithm whose main theoretical properties hold only under the assumption that beliefs are not equal. Nevertheless, there is much evidence that equality between beliefs does occur. A method that uses unary function-nodes to overcome belief equality is commonly assumed to resolve the problem. We focus on Min-sum, the belief propagation version for solving constraint optimization problems. We prove that on a single cycle graph, belief equality can be avoided only when the algorithm converges to the optimal solution. In any other case, the unary function method will not prevent equality, which leaves some existing results in need of reassessment. We differentiate between belief equality, which includes equal beliefs in a single message, and assignment equality, which prevents a coherent selection of assignments to variables. We show the necessary and sufficient conditions for both.
Separate but Equal: Equality in Belief Propagation for Single Cycle Graphs. Erel Cohen, Omer Lev, R. Zivan. Proceedings of the ... AAAI Conference on Artificial Intelligence, pp. 3924-3931. Pub Date: 2023-06-26. DOI: 10.1609/aaai.v37i4.25506 (https://doi.org/10.1609/aaai.v37i4.25506)
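To show what belief equality looks like in practice, the sketch below runs a simplified Min-sum on a single cycle. It passes messages directly between neighbouring variables (a pairwise formulation rather than an explicit factor graph), and the function name, cost function, and parameters are assumptions made for illustration. It does not reproduce the paper's proofs or its treatment of unary function-nodes beyond accepting unary costs as input.

```python
def min_sum_cycle(n, domain, pair_cost, unary, iters=20):
    """Synchronous Min-sum on a cycle of n >= 3 variables.

    pair_cost(a, b) is the cost of neighbouring values a and b;
    unary[i][a] is the private cost of variable i taking value a.
    Returns beliefs[i][a], the estimated cost of each value.
    """
    if n < 3:
        raise ValueError("a single cycle needs at least 3 variables")
    nbrs = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
    # msgs[(i, j)][b]: message from i to j about j taking value b.
    msgs = {(i, j): [0.0] * domain for i in range(n) for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            k = next(x for x in nbrs[i] if x != j)  # the other neighbour of i
            out = [
                min(unary[i][a] + pair_cost(a, b) + msgs[(k, i)][a]
                    for a in range(domain))
                for b in range(domain)
            ]
            shift = min(out)  # normalise so messages stay bounded
            new[(i, j)] = [v - shift for v in out]
        msgs = new
    return [
        [unary[i][a] + sum(msgs[(k, i)][a] for k in nbrs[i])
         for a in range(domain)]
        for i in range(n)
    ]


def not_equal(a, b):
    """Cost 1 when neighbours take the same value, 0 otherwise."""
    return 1.0 if a == b else 0.0


# Fully symmetric instance: no unary preferences on a 3-variable cycle.
beliefs = min_sum_cycle(3, 2, not_equal, [[0.0, 0.0] for _ in range(3)])
```

In this symmetric instance every message stays at zero, so each variable ends with identical beliefs for both values and cannot pick an assignment without an arbitrary tie-break. Passing small non-zero `unary` costs is the kind of perturbation whose effectiveness the abstract calls into question.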
Pub Date: 2023-06-26, DOI: 10.1609/aaai.v37i13.26927
Saed Rezayi
Representation learning is at the core of machine learning and artificial intelligence, as it summarizes input data points into low-dimensional vectors. These low-dimensional vectors should be accurate portrayals of the input data. It is therefore crucial to find the most effective and robust representation possible for a given input, because the performance of the ML task depends on the resulting representations. In this summary, we discuss an approach to augmenting representation learning that relies on external knowledge. We briefly describe the shortcomings of existing techniques and describe how an auxiliary knowledge source could yield improved representations.
Learning Better Representations Using Auxiliary Knowledge. Saed Rezayi. Proceedings of the ... AAAI Conference on Artificial Intelligence, pp. 16133-16134. Pub Date: 2023-06-26. DOI: 10.1609/aaai.v37i13.26927 (https://doi.org/10.1609/aaai.v37i13.26927)
The spike camera, a new type of neuromorphic visual sensor that imitates the sampling mechanism of the primate fovea, can capture photons and output 40000 Hz binary spike streams. Benefiting from the asynchronous sampling mechanism, the spike camera can record fast-moving objects, and clear images can be recovered from the spike stream at any specified timestamp without motion blur. Despite these advantages, the dense time-sequence information of the discrete spike stream makes it hard to directly apply existing algorithms for traditional cameras to the spike camera. Therefore, it is necessary and interesting to explore a universally effective representation of dense spike streams to better fit various network architectures. In this paper, we propose to mine temporal-robust features of spikes in time-frequency space with wavelet transforms. We present a novel Wavelet-Guided Spike Enhancing (WGSE) paradigm consisting of three consecutive steps: a multi-level wavelet transform, a CNN-based learnable module, and an inverse wavelet transform. With the assistance of WGSE, a new streaming representation of spikes can be learned. We demonstrate the effectiveness of WGSE on two downstream tasks, achieving state-of-the-art performance on image reconstruction and good performance on semantic segmentation. Furthermore, we build a new spike-based synthesized dataset for semantic segmentation. Code and datasets are available at https://github.com/Leozhangjiyuan/WGSE-SpikeCamera.
Learning Temporal-Ordered Representation for Spike Streams Based on Discrete Wavelet Transforms. Jiyuan Zhang, Shanshan Jia, Zhaofei Yu, Tiejun Huang. Proceedings of the ... AAAI Conference on Artificial Intelligence, pp. 137-147. Pub Date: 2023-06-26. DOI: 10.1609/aaai.v37i1.25085 (https://doi.org/10.1609/aaai.v37i1.25085)
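The three WGSE steps named in the abstract above (wavelet transform, learnable module, inverse wavelet transform) can be sketched on a single pixel's spike train as follows. This is a simplified illustration under stated assumptions: it uses a single-level Haar transform rather than the multi-level transform in the abstract, the `enhance` function is an identity placeholder standing in for the CNN-based module, and all names and the sample spike train are hypothetical rather than taken from the released code.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)


def haar_dwt(signal):
    """One level of the orthonormal Haar transform along time."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * SCALE, (a - d) * SCALE])
    return out


def enhance(coeffs):
    """Identity placeholder for the learnable module (an assumption)."""
    return list(coeffs)


# Hypothetical binary spike train for one pixel.
spikes = [0, 1, 1, 0, 0, 0, 1, 1]
approx, detail = haar_dwt(spikes)
restored = haar_idwt(enhance(approx), enhance(detail))
```

With the identity placeholder the round trip returns the original spikes up to floating-point error, which confirms that any change in the output representation would come from the learned module operating on the wavelet coefficients.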