On Simplifying Mixed Boolean-Arithmetic Expressions
Yu. V. Kosolapov
Automatic Control and Computer Sciences, vol. 58, no. 7, pp. 836-852. Published 2025-02-12. DOI: 10.3103/S0146411624700299
https://link.springer.com/article/10.3103/S0146411624700299
Mixed Boolean-arithmetic expressions (MBA expressions) with t integer n-bit variables are often used for program obfuscation. Obfuscation replaces short expressions with longer equivalent ones that are expected to take an analyst more time to examine. This paper shows that a technique similar to information set decoding of linear codes can be applied to simplify linear MBA expressions, that is, to reduce their number of terms. Based on this technique, two simplification algorithms are constructed: one that finds an expression of minimum length and one that reduces the length of an expression. The length reduction algorithm in turn yields an algorithm for estimating the resistance of an MBA expression to simplification. We experimentally estimate how the average number of terms in the linear MBA expression returned by the simplification algorithms depends on n, on the number of decoding iterations, and on the cardinality of the set of Boolean functions over which a linear combination with the minimum number of nonzero coefficients is sought. For all considered t and n, the experiments show that if the linear MBA expression contained r = 1, 2, 3 terms before obfuscation, then the developed algorithms, given its obfuscated version, find an equivalent expression with no more than r terms with probability close to one. This is the main difference between the information set decoding technique and the well-known techniques for simplifying linear MBA expressions, whose goal is to reduce the number of terms to no more than 2^t. We also found that for randomly generated linear MBA expressions, as n increases, the average number of terms in the returned expression tends to 2^t and does not differ from the average for known simplification algorithms. In particular, the results make it possible to determine t and n for which the simplified linear MBA expression will, on average, contain no fewer than a given number of terms.
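To make the setting concrete, the following is a minimal sketch in Python and not the paper's algorithm. It relies on the standard fact that a linear MBA expression over t variables is determined modulo 2^n by its values on the 2^t rows of the truth table, which is why known techniques can always reach at most 2^t terms. The helper names `signature`, `evaluate`, and `single_term` are hypothetical, and the trivial one-term check stands in for the information set decoding search, which is not reproduced here.

```python
from itertools import product

def signature(terms, t, n):
    """Value of sum(c * f) mod 2**n on each of the 2**t truth-table rows.

    terms is a list of (coefficient, bitwise_function) pairs; every
    function result is reduced to a single bit with `& 1`.
    """
    mask = (1 << n) - 1
    return [sum(c * (f(*bits) & 1) for c, f in terms) & mask
            for bits in product((0, 1), repeat=t)]

def evaluate(terms, args, n):
    """Evaluate the linear MBA expression on concrete n-bit arguments."""
    mask = (1 << n) - 1
    return sum(c * (f(*args) & mask) for c, f in terms) & mask

def single_term(sig):
    """Return (c, truth_table) if sig equals c * g for one Boolean function g.

    This only covers the r = 1 case: all nonzero signature entries must
    share one value c, and g is the indicator of those rows.
    Returns None otherwise (including for the all-zero signature).
    """
    nonzero = {v for v in sig if v}
    if len(nonzero) != 1:
        return None
    c = nonzero.pop()
    return c, [int(v == c) for v in sig]

N_BITS = 8  # assumed word size for this illustration

# Obfuscated form of the one-term expression x: (x | y) + (x & y) - y
obfuscated = [
    (1, lambda x, y: x | y),
    (1, lambda x, y: x & y),
    (-1, lambda x, y: y),
]

sig = signature(obfuscated, t=2, n=N_BITS)
recovered = single_term(sig)  # coefficient and truth table of the single term
```

Here the rows are ordered (x, y) = (0, 0), (0, 1), (1, 0), (1, 1), so the recovered truth table can be compared directly with that of x. Searching for combinations with r > 1 terms over a larger set of Boolean functions is the part the paper addresses with information set decoding.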
About the journal:
Automatic Control and Computer Sciences is a peer-reviewed journal that publishes articles on:
• Control systems, cyber-physical systems, real-time systems, robotics, smart sensors, embedded intelligence
• Network information technologies, information security, statistical methods of data processing, distributed artificial intelligence, complex systems modeling, knowledge representation, processing and management
• Signal and image processing, machine learning, machine perception, computer vision