{"title":"Metamorphic Testing of Image Captioning Systems via Image-Level Reduction","authors":"Xiaoyuan Xie;Xingpeng Li;Songqiang Chen","doi":"10.1109/TSE.2024.3463747","DOIUrl":null,"url":null,"abstract":"The Image Captioning (IC) technique is widely used to describe images in natural language. However, even state-of-the-art IC systems can still produce incorrect captions and lead to misunderstandings. Recently, some IC system testing methods have been proposed. However, these methods still rely on pre-annotated information and hence cannot really alleviate the difficulty in identifying the test oracle. Furthermore, their methods artificially manipulate objects, which may generate unreal images as test cases and thus lead to less meaningful testing results. Thirdly, existing methods have various requirements on the eligibility of source test cases, and hence cannot fully utilize the given images to perform testing. To tackle these issues, in this paper, we propose \n<sc>ReIC</small>\n to perform metamorphic testing for the IC systems with some image-level reduction transformations like image cropping and stretching. Instead of relying on the pre-annotated information, \n<sc>ReIC</small>\n uses a localization method to align objects in the caption with corresponding objects in the image, and checks whether each object is correctly described or deleted in the caption after transformation. With the image-level reduction transformations, \n<sc>ReIC</small>\n does not artificially manipulate any objects and hence can avoid generating unreal follow-up images. Additionally, it eliminates the requirement on the eligibility of source test cases during the metamorphic transformation process, as well as decreases the ambiguity and boosts the diversity among the follow-up test cases, which consequently enables testing to be performed on any test image and reveals more distinct valid violations. We employ \n<sc>ReIC</small>\n to test five popular IC systems. The results demonstrate that \n<sc>ReIC</small>\n can sufficiently leverage the provided test images to generate follow-up cases of good realism, and effectively detect a great number of distinct violations, without the need for any pre-annotated information.","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"50 11","pages":"2962-2982"},"PeriodicalIF":6.5000,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10684067/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0
Abstract
The Image Captioning (IC) technique is widely used to describe images in natural language. However, even state-of-the-art IC systems can still produce incorrect captions and lead to misunderstandings. Recently, some IC system testing methods have been proposed. However, these methods still rely on pre-annotated information and hence cannot really alleviate the difficulty in identifying the test oracle. Furthermore, their methods artificially manipulate objects, which may generate unreal images as test cases and thus lead to less meaningful testing results. Thirdly, existing methods have various requirements on the eligibility of source test cases, and hence cannot fully utilize the given images to perform testing. To tackle these issues, in this paper, we propose
ReIC
to perform metamorphic testing for the IC systems with some image-level reduction transformations like image cropping and stretching. Instead of relying on the pre-annotated information,
ReIC
uses a localization method to align objects in the caption with corresponding objects in the image, and checks whether each object is correctly described or deleted in the caption after transformation. With the image-level reduction transformations,
ReIC
does not artificially manipulate any objects and hence can avoid generating unreal follow-up images. Additionally, it eliminates the requirement on the eligibility of source test cases during the metamorphic transformation process, as well as decreases the ambiguity and boosts the diversity among the follow-up test cases, which consequently enables testing to be performed on any test image and reveals more distinct valid violations. We employ
ReIC
to test five popular IC systems. The results demonstrate that
ReIC
can sufficiently leverage the provided test images to generate follow-up cases of good realism, and effectively detect a great number of distinct violations, without the need for any pre-annotated information.
期刊介绍:
IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:
a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.