Subsampling under distributional constraints

IF 2.1 | Zone 4 (Mathematics) | JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Statistical Analysis and Data Mining | Pub Date: 2024-02-09 | DOI: 10.1002/sam.11661
Florian Combes, Ricardo Fraiman, Badih Ghattas

Abstract

Complex models are frequently employed to describe physical and mechanical phenomena. In this setting, we have an input $X$ in a general space and an output $Y = f(X)$, where $f$ is a very complicated function whose evaluation at every new input is computationally very costly and may also be very expensive. We are given two sets of observations of $X$, $S_1$ and $S_2$, of different sizes, such that only $f(S_1)$ is available. We tackle the problem of selecting a subset $S_3 \subset S_2$ of smaller size on which to run the complex model $f$, such that the empirical distribution of $f(S_3)$ is close to that of $f(S_1)$. We suggest three algorithms to solve this problem and show their efficiency using simulated datasets and the Airfoil self-noise data set.
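The abstract does not describe the three proposed algorithms, so the sketch below is not one of them. It only illustrates the problem setup with a naive baseline, under several assumptions of ours: a k-nearest-neighbour surrogate trained on the known pairs (S1, f(S1)) stands in for the expensive model on the candidate pool S2, and the subset S3 is chosen greedily so that the surrogate outputs track the quantiles of f(S1). The names `select_subsample` and `ks_distance`, the toy function `f`, and all sizes are hypothetical.

```python
import numpy as np


def select_subsample(X1, y1, X2, m, k=5):
    """Return indices of m distinct rows of X2 (a candidate S3).

    Assumption: a k-NN average of the known outputs y1 = f(X1) is used as a
    cheap surrogate for f on X2. This is an illustrative baseline only.
    """
    if m > len(X2):
        raise ValueError("m must not exceed the size of the candidate pool")
    k = min(k, len(X1))
    # Pairwise Euclidean distances between candidates and known inputs.
    d = np.linalg.norm(X2[:, None, :] - X1[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    y2_hat = y1[nearest].mean(axis=1)  # surrogate outputs on the pool

    # Target values: m evenly spaced quantiles of the known outputs.
    targets = np.quantile(y1, (np.arange(m) + 0.5) / m)

    available = np.ones(len(X2), dtype=bool)
    chosen = []
    for t in targets:
        gap = np.abs(y2_hat - t)
        gap[~available] = np.inf  # never pick the same candidate twice
        j = int(np.argmin(gap))
        chosen.append(j)
        available[j] = False
    return np.array(chosen)


def ks_distance(a, b):
    """Kolmogorov-Smirnov distance between two empirical distributions."""
    grid = np.sort(np.concatenate([a, b]))
    Fa = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    Fb = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return float(np.max(np.abs(Fa - Fb)))


# Toy stand-in for the expensive model (hypothetical, for illustration only).
def f(X):
    return np.sin(X[:, 0]) + X[:, 1] ** 2


rng = np.random.default_rng(0)
X1 = rng.normal(size=(200, 2))            # S1: outputs are available
X2 = rng.uniform(-3, 3, size=(2000, 2))   # S2: outputs are not available
y1 = f(X1)

idx = select_subsample(X1, y1, X2, m=50)  # S3 as indices into S2
ks_selected = ks_distance(f(X2[idx]), y1)
ks_random = ks_distance(f(X2[rng.choice(len(X2), 50, replace=False)]), y1)
print("KS distance, selected subset:", ks_selected)
print("KS distance, random subset:  ", ks_random)
```

In a real application the expensive model would be run only on the selected rows `X2[idx]`; the toy `f` is evaluated here purely to measure how close the two empirical distributions end up. The Kolmogorov-Smirnov distance is one possible notion of closeness and is our choice, not necessarily the one used in the paper.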
Source journal
Statistical Analysis and Data Mining
Categories: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
CiteScore: 3.20
Self-citation rate: 7.70%
Articles published: 43
Journal description: Statistical Analysis and Data Mining addresses the broad area of data analysis, including statistical approaches, machine learning, data mining, and applications. Topics include statistical and computational approaches for analyzing massive and complex datasets, novel statistical and/or machine learning methods and theory, and state-of-the-art applications with high impact. Of special interest are articles that describe innovative analytical techniques and discuss their application to real problems in a way that is accessible and beneficial to domain experts across science, engineering, and commerce. The journal focuses on papers that satisfy one or more of the following criteria: (1) solve data analysis problems associated with massive, complex datasets; (2) develop innovative statistical approaches, machine learning algorithms, or methods integrating ideas across disciplines, e.g., statistics, computer science, electrical engineering, operations research; (3) formulate and solve high-impact real-world problems that challenge existing paradigms via new statistical and/or computational models; (4) provide surveys of prominent research topics.