Mating with Multi-Armed Bandits: Reinforcement Learning Models of Human Mate Search.
Author: Daniel Conroy-Beam
Journal: Open Mind, volume 8, pages 995-1011 (Journal Article, eCollection)
Publication date: 2024-08-15
DOI: https://doi.org/10.1162/opmi_a_00156
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11338293/pdf/
Citations: 0
Abstract
Mate choice requires navigating an exploration-exploitation trade-off. Successful mate choice requires choosing partners who have preferred qualities; but time spent determining one partner's qualities could have been spent exploring for potentially superior alternatives. Here I argue that this dilemma can be modeled in a reinforcement learning framework as a multi-armed bandit problem. Moreover, using agent-based models and a sample of k = 522 real-world romantic dyads, I show that a reciprocity-weighted Thompson sampling algorithm performs well both in guiding mate search in noisy search environments and in reproducing the mate choices of real-world participants. These results provide a formal model of the understudied psychology of human mate search. They additionally offer implications for our understanding of person perception and mate choice.
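As a rough illustration of the bandit framing, the sketch below implements textbook Beta-Bernoulli Thompson sampling in Python, with a hypothetical per-partner `reciprocity` weight that scales each posterior draw. The abstract does not define the reciprocity-weighted algorithm, so the weighting scheme, the binary "good interaction" reward, and every name here (`thompson_sample`, `run_search`, `true_rates`, `reciprocity`) are assumptions made for illustration, not the paper's model.

```python
import random


def thompson_sample(successes, failures, weights, rng):
    """Choose an arm: draw once from each arm's Beta posterior, scale the
    draw by that arm's weight, and return the index of the largest value."""
    draws = [
        w * rng.betavariate(s + 1, f + 1)  # Beta(1, 1) prior, i.e. uniform
        for s, f, w in zip(successes, failures, weights)
    ]
    return max(range(len(draws)), key=draws.__getitem__)


def run_search(true_rates, reciprocity, rounds=2000, seed=0):
    """Simulate a searcher sampling potential partners (arms).

    true_rates  -- assumed probability that an interaction with each
                   partner is rewarding (hidden from the searcher)
    reciprocity -- assumed weight in [0, 1] standing in for how much
                   each partner returns the searcher's interest
    Returns how many times each partner was chosen.
    """
    rng = random.Random(seed)
    n = len(true_rates)
    successes = [0] * n
    failures = [0] * n
    pulls = [0] * n
    for _ in range(rounds):
        arm = thompson_sample(successes, failures, reciprocity, rng)
        pulls[arm] += 1
        # Noisy binary feedback about the chosen partner's quality.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


if __name__ == "__main__":
    # Three hypothetical partners; the third is best and fully reciprocal.
    print(run_search([0.2, 0.5, 0.8], [1.0, 1.0, 1.0]))
    # Same partners, but the third shows no reciprocal interest.
    print(run_search([0.2, 0.5, 0.8], [1.0, 1.0, 0.0]))
```

Early on, the wide posteriors make every partner a plausible pick (exploration); as evidence accumulates, draws concentrate on the partner with the highest weighted estimate (exploitation). Multiplying the draw by a reciprocity weight is only one simple way to discount partners who are unlikely to return interest.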