{"title":"Who Got Lost in the Mall? Challenges in Counting and Classifying False Memories","authors":"Gillian Murphy, Ciara M. Greene","doi":"10.1002/acp.70044","DOIUrl":null,"url":null,"abstract":"<p>What is a memory? Can an outside observer really ascertain whether someone is remembering an event? How can they do so reliably? These are challenging questions that we face as memory researchers, particularly when we try to tease apart true and false memories and beliefs. In this issue, Andrews and Brewin (<span>2024</span>) reanalysed a portion of the data from our recent rich false memory study (Murphy, Dawson, et al. <span>2023</span>) and developed a novel coding scheme based on counting reported details. They also applied further, more stringent criteria to classify the false memories reported in our study and concluded that this method yields a different false memory rate from the scheme used in our original paper, and that this rate is different again from participants' own self-reported memories. These findings do not surprise us—in our experience, different coding schemes will always yield different rates—but we do disagree with both the methods used and the conclusions that Andrews and Brewin drew from these findings.</p><p>To provide some context, we first offer a brief overview of the replication study. This was conducted by a team of students as a collaborative project (Murphy and Greene <span>2023</span>) and closely adhered to the methods of the classic Lost in the Mall study (Loftus and Pickrell <span>1995</span>). Participants signed up for a study about how we remember our childhoods and their informant (usually their mother) completed an online survey telling us about some true childhood events as well as some information about shopping trips when the participant was a child. We then sent participants a survey in which they were shown three true memory descriptions (taken from their informant's account) and one false memory prompt that described the participant getting lost in a shopping mall as a child; this false event was created by slotting the informant-provided details into a pre-prepared narrative in which the participant was described as getting lost for a short period of time, becoming upset and then being found by an elderly woman before being reunited with their parent. Participants were then interviewed on two separate occasions, for 20–30 min, where they were encouraged to try to remember as much as they could about the event. The transcripts of these conversations were then coded for the presence of a memory using a pre-registered coding scheme. At the conclusion of the second interview, participants self-reported whether or not they remembered each of the events, before being debriefed. Participants and informants reported enjoying the study and largely did not object to the deception employed (Murphy, Maher, et al. <span>2023</span>). In a follow-up study, we confirmed that our debriefing methods were effective at retracting these false memories (Greene et al. <span>2024</span>).</p><p>It is important to first note that we welcome scrutiny and discussion of our results. Our participants (and their parents) generously volunteered a lot of time to complete our study and, as a research team, we exerted significant effort to make our anonymised data open and accessible to other researchers. It is gratifying to see our hard-won data informing the work of other researchers. 
So, while we do not agree with the methods or conclusions put forward by Andrews and Brewin, we do wholeheartedly support their careful inspection of our work. Many of the questions raised by Andrews and Brewin relate to foundational principles in false memory research, principles we considered at length in running this study and which we feel would benefit from further discussion. Our methodological choices (and our pre-registered hypotheses) reflect our conviction that all coding schemes are imperfect and there is no absolute rate of false memory formation that we could or should expect to observe in any one study—we therefore set out to record multiple different measures of memory that could be considered in the round. Here, we take the opportunity to unpack some of these thorny issues further and offer our perspective, which aligns in many ways with the commentary by Wade et al. (<span>2025</span>). We make four arguments related to the coding of false memories: 1. There is no one perfect false memory coding scheme, 2. There is no absolute false memory rate, 3. Memory distortion is an active process, not a passive ‘hacking’ of one's memory, and 4. Interviews are a noisy means of assessing memories.</p><p>We agree with Andrews and Brewin's (<span>2024</span>) basic finding that any given coding scheme is likely to give different results from another. As we reported in our original paper, at the conclusion of the study during the second interview, we observed a false memory rate of 35% when applying the Loftus and Pickrell (<span>1995</span>) coding scheme to the interview transcript, but we also observed a self-reported false memory rate of 14% (alongside an additional 52% of participants who self-reported believing that the event had occurred). Furthermore, when we showed excerpts from the transcripts to a mock jury and asked them whether the interviewee was remembering an event, they demonstrated only moderate agreement with the other methods of coding false memories (55% agreement with the coding scheme, 70% agreement with self-report), with mock jurors adopting more liberal thresholds for classifying memories.</p><p>These findings, while noteworthy, were very much in line with prior published work that has established that memory rates can vary hugely when different schemes are applied. For example, Wade et al. (<span>2018</span>) reanalysed a rich false memory study conducted by Shaw and Porter (<span>2015</span>), reporting a false memory rate of just 30%, in contrast to the originally reported 70%. These discrepancies are interesting, both theoretically and practically, and speak to the enormous challenge inherent in trying to classify a memory. There have been extensive debates in the memory literature regarding how best to code transcripts for the presence of a false memory, including arguments that we should rely on participant self-reports rather than researcher-produced coding schemes. These arguments have provoked nuanced discussions of the nature of (false) memory and how we should define it (Shaw <span>2018</span>).</p><p>The selection of these ‘key details’ is arbitrary, and Andrews and Brewin's analysis suggests that some were quite poorly chosen. For example, 0% of those classed as having a full memory in our analysis noted their age when recalling the event. This is not surprising to us, given that the false event was the third event being discussed in that interview, and all of the events were from around that period of childhood. 
As we will later discuss, the natural rhythms of conversation are such that many people do not mention the age they were when an event took place, even if it was a detail we provided them with to help them imagine when it might have taken place. Others ‘key details’, such as the detail about the elderly woman, are actually multiple details bundled together, whereby the participant was only coded as explicitly recalling that detail if they mentioned the person was 1. old, 2. female, and 3. performed a helpful act. It is arbitrary to declare certain details to be so central to the memory that a failure to mention one specific detail results in a memory being downgraded or discounted entirely.</p><p>At the start of the study, this participant stated that she did not remember this event at all, but by interview two here she gives a rich and detailed account of getting lost in a shopping mall, which was coded as a full memory using the Loftus and Pickrell scheme. When asked, she self-reported a clear memory for this event and said she would be extremely willing to testify that it happened (9/10 on a Likert scale of willingness). However, she does not mention her age, being upset, or the name of the shopping mall (though she does name the shop itself), so would only score a 3/6 on Andrews and Brewin's novel coding scheme. Indeed, they observed that not a single participant reported more than four of their six core details. Despite this, the participant offers rich sensory details and insists she has a very clear and trusted memory of the event. This example also highlights another problem with this counting approach, in that each detail is given equal weighting. Reporting the name of the shopping mall is equally important as remembering being lost.</p><p>This is a memory of a true event (and an important one at that), and yet the participant only explicitly reports two out of the six details (visiting her mother in the hospital and, with uncertainty, receiving a Dora doll). She does not note her age or the name of the hospital, or mention that her mother hugged her when she came in. She does mention noticing how small the baby was but does not specifically recall commenting on his ears or fingers. Note too that this participant reports a slightly different version of the event (stating it was her cousin and not her grandmother who accompanied her) and though she says she only remembers the detail about the doll because it was contained in the original prompt, she has seemingly fleshed out that image so that she now notes the doll was wrapped in a blanket rather than wrapped with wrapping paper. She does report additional details that were not in the prompt (e.g., the silver Toyota, the physical location of everyone in the room), but these do not form part of Andrews and Brewin's scheme, which seems to assume that, in order to be considered a rich and detailed memory, the prompt should be repeated back verbatim. Yet this participant reports remembering this event, and we are confident that if you asked a layperson (or a jury member) whether this participant was remembering this event, they would say yes. 
Counting the presence of ‘key’ details from a prompt is one way to assess the richness of false memories, but it is not the only way—and in fact, we would argue it is one of the least valuable in terms of understanding memory (re)construction.</p><p>Whether false memories occur in a given paradigm 5% of the time or 55% of the time does not change what these paradigms tell us about the nature of human memory, nor does it change the forensic implications. Even if we could settle on an agreed rate (a very difficult task given the variables involved), that would not tell an investigator or an expert witness whether a given memory is false (Smeets et al. <span>2017</span>). The Lost in the Mall study is so well known because it established that false memories <i>can</i> happen, but neither the original Loftus and Pickrell paper nor our replication study made any claims about the absolute rate at which this should be expected to occur. Other work has also clearly demonstrated that though around a quarter of participants typically form a false memory in a given study (Scoboria et al. <span>2017</span>), that does not mean that only a quarter of the population are susceptible to forming false memories (Murphy, Loftus, et al. <span>2023</span>; Patihis <span>2018</span>).</p><p>For the self-report question, these participants were explicitly asked if they <i>remembered being lost in a shopping mall</i>, <i>and they indicated that they did</i>. To then remove those participants for not mentioning being lost earlier in the interview is clearly a highly restrictive way to classify memories.</p><p>Andrews and Brewin also quite notably fail to mention the high rates of belief in these fabricated events. Altogether, the self-reported data suggested that 66% of participants remembered (14%) or believed (52%) that the event had occurred. Thus, the Loftus and Pickrell coding scheme provided higher estimates of false memories than self-report, but also failed to capture that the majority of participants came to believe the event happened and were willing to testify to that fact. As discussed by Scoboria and Mazzoni (<span>2017</span>), belief has been shown to be more than sufficient to cause changes in behaviour (Bernstein et al. <span>2015</span>) and so false beliefs are an important outcome from rich false memory studies.</p><p>Where we perhaps most strongly disagree with Andrews and Brewin is their assertion that ‘half the group described potentially true events’. The possibility that participants really did get lost in a shopping mall as children is of course a pertinent one in a study like this—hence why other studies have utilised less commonplace experiences (e.g., Hyman Jr. et al. <span>1995</span>). While we identified three participants in our original study that we believed could have been reporting a true event (based on their persistent reporting of the event, from the initial survey through to the post-debrief follow-up), Andrews and Brewin declared half of the false memory reports to be ‘potentially true’, marking perhaps the most significant step on their journey from our 35% estimate to their 4% estimate.</p><p>The rationale for these memories being potentially true raises an interesting theoretical point about the nature of memory. Andrews and Brewin note that these memories were likely real because, for example, the participant reported getting lost in a different shop from the one we prompted them with. 
However, mountains of evidence on the reconstructive nature of memory would predict exactly this, that participants would take our prompt and actively merge it with their own knowledge and experiences (Greene et al. <span>2022</span>; Lindsay and Hyman Jr <span>2017</span>; Loftus and Pickrell <span>1995</span>; Murphy et al. <span>2019</span>). The nature of the Lost in the Mall paradigm is particularly active—participants are explicitly encouraged to search their memories and have a discussion with the interviewer about what they can recall and what images they see in their mind. This is in contrast to the kind of process implied by Andrews and Brewin. The expectation that we would hand participants a prompt and they would then recite it back to us, verbatim, with no changes and all so-called ‘core details’ intact suggests a very passive process, almost a hacking of memory where a complete event is ‘uploaded’ to our participants' minds.</p><p>Andrews and Brewin argue that the events that they have classified as potentially true were recalled with greater certainty and detail and less closely matched the details provided in the prompt (i.e., a different shopping mall was named). They suggest that these were true events that really happened and thus were reported with more certainty. However, it may also be that <i>because</i> the participant actively connected the fake story to other, real events from their lives, these participants built more detailed and convincing false memories—indeed, extant research clearly indicates that people do integrate real personal experiences into false memories in just this manner (Shaw and Porter <span>2015</span>; Zaragoza et al. <span>2019</span>). We do not have the data here to answer this question with any certainty, but we would welcome an experimental assessment of this point in the future. Regardless, we would not predict that participants would ever passively accept every detail supplied to them and note that the real-world harms that may arise from, say, suggestive therapy practices, are not contingent on wholesale adoption of every presented detail either.</p><p>Perhaps our greatest lesson in carrying out this large-scale study was the fact that the interview transcripts are a product of natural conversation. When we devise coding schemes, we can sometimes fall into the logical trap of thinking we are applying the coding scheme to a participant's memory. As we cannot see inside their brain and scrutinise their recollections directly, we are in fact coding the way they <i>speak</i> about their memory. In a study like this, participants are not delivering a monologue; they are engaging in dialogue with an interviewer. Thus, their answers are contingent on the questions they are asked.</p><p>This distinction was particularly pronounced in our study, as we had six student interviewers conducting this project and there was variation in their styles. Though they were well trained and all followed the same interview schedule, they had different personalities and also varying levels of rapport with the participants. We saw considerable variation in how rates of false memories changed between the coded booklet survey (before any contact with the researcher), the coded interview transcript, and the participants' own self-reported memory declaration. For example, the participants assigned to one researcher had a 10% false memory rate in the booklet survey, which rose to a 50% rate by the second interview, but returned to a 10% rate for self-report. 
Another interviewer's participants had an 18% false memory rate in the booklet survey that actually dropped to 10% during the interviews, then dropped again to 0% for self-report. This may have been due to differences between interviewers, as it seems they varied in the follow-up questions they asked and what kind of information they encouraged the participant to say ‘on the record’, as it were. We also note that the associative nature of the recall process would predict that slightly different details are recalled during different attempts (Odinot et al. <span>2013</span>) – humans are not jukeboxes, and a similar prompt will not elicit an identical recollection on each occasion. In addition, participants' level of attention to the conversation and the recalled event is likely to wax and wane over the course of the conversation, and previous research suggests that the attentiveness of a listener can impact what details are recalled (Pasupathi and Oldroyd <span>2015</span>).</p><p>The role of the interviewer is particularly pertinent when applying a count-based scheme like that of Andrews and Brewin. They noted that very few of our participants recounted their age when discussing their false memory. Of course, this is not how conversation normally works. Imagine someone asking you about your first day of school, at the age of four. You would not typically begin your account by saying, ‘I was four years old when I started school’, unless that detail felt particularly pertinent to you (‘…so I was the youngest because everyone else was at least five’). Instead, you might talk about your memories of the classroom, the teacher, the other children etc. If age was to be considered an important detail a priori, it would be important to add a question to the interview schedule (‘and what age were you when this happened?’) to fairly judge whether participants recall that detail or not. Absence of evidence is not evidence of absence—we simply do not know whether participants came to remember that they were about five when this false event occurred, as we did not ask them and cannot draw conclusions from their failure to mention it. In our replication study, we saw the role of the interviewers as encouraging participants to talk and so they asked an array of open-ended questions (e.g., ‘and can you picture what it was like in the shop? Who would have been with you?’ etc.). They were facilitators of the conversation, not examiners of memory detail. Interviewers were also expected to maintain the study's ruse at this point; as participants were unaware we were studying false memories, it was important not to interrogate participants to the point where they might question if the event really happened.</p><p>Many participants may not have actually stated that they got lost, because it was implied by the question. We note that media training often encourages interviewees to include the question in their answer, so that when extracted out of context in a soundbite the quote is more detailed (i.e., when asked when you will launch a product, rather than saying ‘December’, you might be encouraged to say ‘We will launch this product in December’). Training is required to learn to speak like this precisely because we <i>do not</i> naturally speak so repetitively in natural conversation. 
When coding memory, it is therefore important to remain cognisant of the specific prompts offered to an interviewee as clearly that gives context to what they do and do not say in their narration.</p><p>It is useful for researchers to reflect on the role of interviewers in rich false memory studies and to consider in advance what their approach will be. Decisions about the interview style and coding scheme ought to be made in unison (and ideally, preregistered), as the interviewer has such an influence on what the participant is likely to speak about. As we have discussed, it is difficult to move the goalposts after the fact and employ a detail-based scheme when the interviews were not set up to assess the presence or absence of those specific details. In our study, we found it useful to combine the natural (imperfect) conversation between participant and interviewer with some standardised questions that they answered during and after the event (Do you remember this event? How vivid is your memory? etc.) and to consider the resulting data in a holistic manner.</p><p>In our Lost in the Mall replication, we reported a top-line false memory rate of 35%, which is in line with the rates reported across a range of similar studies (see Scoboria et al. <span>2017</span> for a mega analysis of false memory implantation studies). Our position is certainly not to argue that any particular false memory rate is ‘correct’; as noted in our replication paper and in the above discussion, we advocate for the use of multiple coding methods, including self-report where appropriate. Just as importantly, we argue that memory reports should be evaluated holistically, with consideration of the context in which the reports were obtained (here, via a naturalistic conversation). We do not consider the use of reductive and over-simplistic count schemes to be a useful measure of memory (true or false) and reject the idea that memory prompts should be repeated back without alteration in order for a participant's recollection to be considered a memory.</p><p>The clinical and forensic implications of the Lost in the Mall study (and our replication study) remain clear and important. We note that the implantation methods used in these studies are fairly light-touch. Though the studies are enormously burdensome to conduct, involving contact with parents and multiple interviews and online surveys per participant, the actual manipulation of memory is quite mild. Participants are presented with a very short summary of a supposed event from their childhood and are asked to reflect on whether they remember it. That is all. As Scoboria and Mazzoni (<span>2017</span>) noted, this pales in comparison to the kind of memory distortion that might occur over years of suggestive therapy. We therefore respectfully submit that to quibble over the precise rate of false memory in a given study is essentially to miss the point regarding the potential harms to therapeutic patients (Wade et al. <span>2025</span>).</p><p><b>Gillian Murphy:</b> writing – original draft, conceptualization. <b>Ciara M. 
Greene:</b> conceptualization, writing – review and editing.</p><p>The authors have nothing to report.</p><p>The authors declare no conflicts of interest.</p>","PeriodicalId":48281,"journal":{"name":"Applied Cognitive Psychology","volume":"39 2","pages":""},"PeriodicalIF":2.1000,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/acp.70044","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Cognitive Psychology","FirstCategoryId":"102","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/acp.70044","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
引用次数: 0
Abstract
What is a memory? Can an outside observer really ascertain whether someone is remembering an event? How can they do so reliably? These are challenging questions that we face as memory researchers, particularly when we try to tease apart true and false memories and beliefs. In this issue, Andrews and Brewin (2024) reanalysed a portion of the data from our recent rich false memory study (Murphy, Dawson, et al. 2023) and developed a novel coding scheme based on counting reported details. They also applied further, more stringent criteria to classify the false memories reported in our study and concluded that this method yields a different false memory rate from the scheme used in our original paper, and that this rate is different again from participants' own self-reported memories. These findings do not surprise us—in our experience, different coding schemes will always yield different rates—but we do disagree with both the methods used and the conclusions that Andrews and Brewin drew from these findings.
To provide some context, we first offer a brief overview of the replication study. This was conducted by a team of students as a collaborative project (Murphy and Greene 2023) and closely adhered to the methods of the classic Lost in the Mall study (Loftus and Pickrell 1995). Participants signed up for a study about how we remember our childhoods, and their informant (usually their mother) completed an online survey telling us about some true childhood events as well as some information about shopping trips when the participant was a child. We then sent participants a survey in which they were shown three true memory descriptions (taken from their informant's account) and one false memory prompt that described the participant getting lost in a shopping mall as a child; this false event was created by slotting the informant-provided details into a pre-prepared narrative in which the participant was described as getting lost for a short period of time, becoming upset and then being found by an elderly woman before being reunited with their parent. Participants were then interviewed on two separate occasions, for 20–30 min each, during which they were encouraged to try to remember as much as they could about the event. The transcripts of these conversations were then coded for the presence of a memory using a pre-registered coding scheme. At the conclusion of the second interview, participants self-reported whether or not they remembered each of the events, before being debriefed. Participants and informants reported enjoying the study and largely did not object to the deception employed (Murphy, Maher, et al. 2023). In a follow-up study, we confirmed that our debriefing methods were effective at retracting these false memories (Greene et al. 2024).
It is important to first note that we welcome scrutiny and discussion of our results. Our participants (and their parents) generously volunteered a lot of time to complete our study and, as a research team, we exerted significant effort to make our anonymised data open and accessible to other researchers. It is gratifying to see our hard-won data informing the work of other researchers. So, while we do not agree with the methods or conclusions put forward by Andrews and Brewin, we do wholeheartedly support their careful inspection of our work. Many of the questions raised by Andrews and Brewin relate to foundational principles in false memory research, principles we considered at length in running this study and which we feel would benefit from further discussion. Our methodological choices (and our pre-registered hypotheses) reflect our conviction that all coding schemes are imperfect and there is no absolute rate of false memory formation that we could or should expect to observe in any one study—we therefore set out to record multiple different measures of memory that could be considered in the round. Here, we take the opportunity to unpack some of these thorny issues further and offer our perspective, which aligns in many ways with the commentary by Wade et al. (2025). We make four arguments related to the coding of false memories: (1) there is no one perfect false memory coding scheme; (2) there is no absolute false memory rate; (3) memory distortion is an active process, not a passive ‘hacking’ of one's memory; and (4) interviews are a noisy means of assessing memories.
We agree with Andrews and Brewin's (2024) basic finding that any given coding scheme is likely to give different results from another. As we reported in our original paper, at the conclusion of the study during the second interview, we observed a false memory rate of 35% when applying the Loftus and Pickrell (1995) coding scheme to the interview transcript, but we also observed a self-reported false memory rate of 14% (alongside an additional 52% of participants who self-reported believing that the event had occurred). Furthermore, when we showed excerpts from the transcripts to a mock jury and asked them whether the interviewee was remembering an event, they demonstrated only moderate agreement with the other methods of coding false memories (55% agreement with the coding scheme, 70% agreement with self-report), with mock jurors adopting more liberal thresholds for classifying memories.
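To make these comparisons concrete, here is a minimal sketch of how a simple percent agreement figure between two binary classification methods can be computed. The vectors below are invented for illustration only; they are not our study data, and a real comparison would also report chance-corrected statistics such as Cohen's kappa.

```python
# Minimal sketch: percent agreement between two binary memory classifications.
# The example vectors are invented for illustration; they are NOT the study data.

def percent_agreement(method_a: list[bool], method_b: list[bool]) -> float:
    """Proportion of cases on which two classification methods agree."""
    assert len(method_a) == len(method_b), "methods must classify the same cases"
    matches = sum(a == b for a, b in zip(method_a, method_b))
    return matches / len(method_a)

# Hypothetical classifications of ten transcripts (True = coded as a false memory):
scheme_codes = [True, True, False, False, True, False, True, False, False, True]
jury_verdicts = [True, False, False, True, True, True, True, False, True, True]

print(f"Agreement: {percent_agreement(scheme_codes, jury_verdicts):.0%}")  # Agreement: 60%
```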
These findings, while noteworthy, were very much in line with prior published work that has established that memory rates can vary hugely when different schemes are applied. For example, Wade et al. (2018) reanalysed a rich false memory study conducted by Shaw and Porter (2015), reporting a false memory rate of just 30%, in contrast to the originally reported 70%. These discrepancies are interesting, both theoretically and practically, and speak to the enormous challenge inherent in trying to classify a memory. There have been extensive debates in the memory literature regarding how best to code transcripts for the presence of a false memory, including arguments that we should rely on participant self-reports rather than researcher-produced coding schemes. These arguments have provoked nuanced discussions of the nature of (false) memory and how we should define it (Shaw 2018).
The selection of these ‘key details’ is arbitrary, and Andrews and Brewin's analysis suggests that some were quite poorly chosen. For example, 0% of those classed as having a full memory in our analysis noted their age when recalling the event. This is not surprising to us, given that the false event was the third event being discussed in that interview, and all of the events were from around that period of childhood. As we will later discuss, the natural rhythms of conversation are such that many people do not mention the age they were when an event took place, even if it was a detail we provided them with to help them imagine when it might have taken place. Other ‘key details’, such as the detail about the elderly woman, are actually multiple details bundled together, whereby the participant was only coded as explicitly recalling that detail if they mentioned that the person was (1) old, (2) female, and (3) performed a helpful act. It is arbitrary to declare certain details to be so central to the memory that a failure to mention one specific detail results in a memory being downgraded or discounted entirely.
At the start of the study, this participant stated that she did not remember this event at all, but by the second interview she gives a rich and detailed account of getting lost in a shopping mall, which was coded as a full memory using the Loftus and Pickrell scheme. When asked, she self-reported a clear memory for this event and said she would be extremely willing to testify that it happened (9/10 on a Likert scale of willingness). However, she does not mention her age, being upset, or the name of the shopping mall (though she does name the shop itself), so she would only score 3/6 on Andrews and Brewin's novel coding scheme. Indeed, they observed that not a single participant reported more than four of their six core details. Despite this, the participant offers rich sensory details and insists she has a very clear and trusted memory of the event. This example also highlights another problem with this counting approach: each detail is given equal weighting, so reporting the name of the shopping mall counts for exactly as much as remembering being lost at all.
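To illustrate the mechanics we are critiquing, here is a minimal sketch of a count-based scheme in the spirit of Andrews and Brewin's approach. The detail list, the keyword matching and the mall name are our own hypothetical stand-ins, not their actual coding rules; real coders read full transcripts rather than matching strings. The point is structural: every item carries equal weight, and the ‘elderly woman’ item bundles three sub-details into one pass/fail check.

```python
# Hypothetical sketch of a count-based coding scheme; the details and keyword
# checks below are illustrative stand-ins, not Andrews and Brewin's actual rules.

KEY_DETAILS = {
    "age":           lambda t: "five" in t or "i was 5" in t,
    "lost":          lambda t: "lost" in t,
    "upset":         lambda t: "upset" in t or "crying" in t,
    "mall_name":     lambda t: "merchants quay" in t,  # hypothetical mall name
    "elderly_woman": lambda t: all(k in t for k in ("old", "woman", "help")),  # bundled sub-details
    "reunited":      lambda t: "reunited" in t or "found my mum" in t,
}

def score(transcript: str) -> int:
    """Count how many 'key details' are explicitly mentioned, each weighted equally."""
    t = transcript.lower()
    return sum(check(t) for check in KEY_DETAILS.values())

# A vivid, sensory account can still score low if the 'right' details are absent:
account = ("I remember being lost and crying near the toy shop; an old woman "
           "took my hand to help me, and we found my mum at the tills.")
print(f"{score(account)}/{len(KEY_DETAILS)}")  # prints 4/6 despite the rich narrative
```

Under such a scheme, the account above loses points for omitting an age and a mall name, details a conversational narrator has little reason to volunteer, while its sensory richness contributes nothing to the score.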
This is a memory of a true event (and an important one at that), and yet the participant only explicitly reports two out of the six details (visiting her mother in the hospital and, with uncertainty, receiving a Dora doll). She does not note her age or the name of the hospital, or mention that her mother hugged her when she came in. She does mention noticing how small the baby was but does not specifically recall commenting on his ears or fingers. Note too that this participant reports a slightly different version of the event (stating it was her cousin and not her grandmother who accompanied her) and, though she says she only remembers the detail about the doll because it was contained in the original prompt, she has seemingly fleshed out that image so that she now notes the doll was wrapped in a blanket rather than in wrapping paper. She does report additional details that were not in the prompt (e.g., the silver Toyota, the physical location of everyone in the room), but these do not form part of Andrews and Brewin's scheme, which seems to assume that, in order to be considered a rich and detailed memory, the prompt should be repeated back verbatim. Yet this participant reports remembering this event, and we are confident that if you asked a layperson (or a jury member) whether this participant was remembering this event, they would say yes. Counting the presence of ‘key’ details from a prompt is one way to assess the richness of false memories, but it is not the only way—and in fact, we would argue it is one of the least valuable in terms of understanding memory (re)construction.
Whether false memories occur in a given paradigm 5% of the time or 55% of the time does not change what these paradigms tell us about the nature of human memory, nor does it change the forensic implications. Even if we could settle on an agreed rate (a very difficult task given the variables involved), that would not tell an investigator or an expert witness whether a given memory is false (Smeets et al. 2017). The Lost in the Mall study is so well known because it established that false memories can happen, but neither the original Loftus and Pickrell paper nor our replication study made any claims about the absolute rate at which this should be expected to occur. Other work has also clearly demonstrated that though around a quarter of participants typically form a false memory in a given study (Scoboria et al. 2017), that does not mean that only a quarter of the population are susceptible to forming false memories (Murphy, Loftus, et al. 2023; Patihis 2018).
For the self-report question, these participants were explicitly asked if they remembered being lost in a shopping mall, and they indicated that they did. To then remove those participants for not mentioning being lost earlier in the interview is clearly a highly restrictive way to classify memories.
Andrews and Brewin also quite notably fail to mention the high rates of belief in these fabricated events. Altogether, the self-reported data suggested that 66% of participants remembered (14%) or believed (52%) that the event had occurred. Thus, the Loftus and Pickrell coding scheme provided higher estimates of false memories than self-report, but also failed to capture that the majority of participants came to believe the event happened and were willing to testify to that fact. As discussed by Scoboria and Mazzoni (2017), belief has been shown to be more than sufficient to cause changes in behaviour (Bernstein et al. 2015) and so false beliefs are an important outcome from rich false memory studies.
Where we perhaps most strongly disagree with Andrews and Brewin is their assertion that ‘half the group described potentially true events’. The possibility that participants really did get lost in a shopping mall as children is of course a pertinent one in a study like this, which is why other studies have utilised less commonplace experiences (e.g., Hyman Jr. et al. 1995). While we identified three participants in our original study whom we believed could have been reporting a true event (based on their persistent reporting of the event, from the initial survey through to the post-debrief follow-up), Andrews and Brewin declared half of the false memory reports to be ‘potentially true’, marking perhaps the most significant step on their journey from our 35% estimate to their 4% estimate.
The rationale for these memories being potentially true raises an interesting theoretical point about the nature of memory. Andrews and Brewin note that these memories were likely real because, for example, the participant reported getting lost in a different shop from the one we prompted them with. However, mountains of evidence on the reconstructive nature of memory would predict exactly this: that participants would take our prompt and actively merge it with their own knowledge and experiences (Greene et al. 2022; Lindsay and Hyman Jr 2017; Loftus and Pickrell 1995; Murphy et al. 2019). The nature of the Lost in the Mall paradigm is particularly active—participants are explicitly encouraged to search their memories and have a discussion with the interviewer about what they can recall and what images they see in their mind. This is in contrast to the kind of process implied by Andrews and Brewin. The expectation that we would hand participants a prompt and they would then recite it back to us, verbatim, with no changes and all so-called ‘core details’ intact suggests a very passive process, almost a hacking of memory where a complete event is ‘uploaded’ to our participants' minds.
Andrews and Brewin argue that the events that they have classified as potentially true were recalled with greater certainty and detail and less closely matched the details provided in the prompt (i.e., a different shopping mall was named). They suggest that these were true events that really happened and thus were reported with more certainty. However, it may also be that because the participant actively connected the fake story to other, real events from their lives, these participants built more detailed and convincing false memories—indeed, extant research clearly indicates that people do integrate real personal experiences into false memories in just this manner (Shaw and Porter 2015; Zaragoza et al. 2019). We do not have the data here to answer this question with any certainty, but we would welcome an experimental assessment of this point in the future. Regardless, we would not predict that participants would ever passively accept every detail supplied to them and note that the real-world harms that may arise from, say, suggestive therapy practices, are not contingent on wholesale adoption of every presented detail either.
Perhaps our greatest lesson in carrying out this large-scale study was the fact that the interview transcripts are a product of natural conversation. When we devise coding schemes, we can sometimes fall into the logical trap of thinking we are applying the coding scheme to a participant's memory. As we cannot see inside their brain and scrutinise their recollections directly, we are in fact coding the way they speak about their memory. In a study like this, participants are not delivering a monologue; they are engaging in dialogue with an interviewer. Thus, their answers are contingent on the questions they are asked.
This distinction was particularly pronounced in our study, as we had six student interviewers conducting this project and there was variation in their styles. Though they were well trained and all followed the same interview schedule, they had different personalities and also varying levels of rapport with the participants. We saw considerable variation in how rates of false memories changed between the coded booklet survey (before any contact with the researcher), the coded interview transcript, and the participants' own self-reported memory declaration. For example, the participants assigned to one researcher had a 10% false memory rate in the booklet survey, which rose to a 50% rate by the second interview, but returned to a 10% rate for self-report. Another interviewer's participants had an 18% false memory rate in the booklet survey that actually dropped to 10% during the interviews, then dropped again to 0% for self-report. This may have been due to differences between interviewers, as it seems they varied in the follow-up questions they asked and what kind of information they encouraged the participant to say ‘on the record’, as it were. We also note that the associative nature of the recall process would predict that slightly different details are recalled during different attempts (Odinot et al. 2013) – humans are not jukeboxes, and a similar prompt will not elicit an identical recollection on each occasion. In addition, participants' level of attention to the conversation and the recalled event is likely to wax and wane over the course of the conversation, and previous research suggests that the attentiveness of a listener can impact what details are recalled (Pasupathi and Oldroyd 2015).
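For readers who want to see the shape of this analysis, the sketch below tabulates false memory rates per interviewer at each assessment stage. The records are invented placeholders with the structure described above; they are not the study's data.

```python
# Minimal sketch: false memory rates per interviewer at each assessment stage.
# The records below are invented placeholders, not the study's data.
from collections import defaultdict

# One row per participant per stage: (interviewer, stage, coded_as_false_memory).
# Stage labels are number-prefixed so sorted() lists them chronologically.
records = [
    ("A", "1_booklet", False), ("A", "2_interview", True),  ("A", "3_self_report", False),
    ("B", "1_booklet", True),  ("B", "2_interview", False), ("B", "3_self_report", False),
]

tallies: defaultdict[tuple[str, str], list[bool]] = defaultdict(list)
for interviewer, stage, coded in records:
    tallies[(interviewer, stage)].append(coded)

for (interviewer, stage), codes in sorted(tallies.items()):
    print(f"Interviewer {interviewer}, {stage}: {sum(codes) / len(codes):.0%}")
```

Aggregated this way, systematic differences between interviewers become visible at a glance, which is how divergent trajectories like the 10%, 50%, 10% pattern described above surface.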
The role of the interviewer is particularly pertinent when applying a count-based scheme like that of Andrews and Brewin. They noted that very few of our participants recounted their age when discussing their false memory. Of course, this is not how conversation normally works. Imagine someone asking you about your first day of school, at the age of four. You would not typically begin your account by saying, ‘I was four years old when I started school’, unless that detail felt particularly pertinent to you (‘…so I was the youngest because everyone else was at least five’). Instead, you might talk about your memories of the classroom, the teacher, the other children, etc. If age were to be considered an important detail a priori, it would be important to add a question to the interview schedule (‘and what age were you when this happened?’) to fairly judge whether participants recall that detail or not. Absence of evidence is not evidence of absence—we simply do not know whether participants came to remember that they were about five when this false event occurred, as we did not ask them and cannot draw conclusions from their failure to mention it. In our replication study, we saw the role of the interviewers as encouraging participants to talk, and so they asked an array of open-ended questions (e.g., ‘and can you picture what it was like in the shop? Who would have been with you?’ etc.). They were facilitators of the conversation, not examiners of memory detail. Interviewers were also expected to maintain the study's ruse at this point; as participants were unaware we were studying false memories, it was important not to interrogate participants to the point where they might question if the event really happened.
Many participants may not have actually stated that they got lost, because it was implied by the question. We note that media training often encourages interviewees to include the question in their answer, so that when extracted out of context in a soundbite the quote is more detailed (e.g., when asked when you will launch a product, rather than saying ‘December’, you might be encouraged to say ‘We will launch this product in December’). Training is required to learn to speak like this precisely because we do not spontaneously repeat information in this way in natural conversation. When coding memory, it is therefore important to remain cognisant of the specific prompts offered to an interviewee, as these clearly give context to what they do and do not say in their narration.
It is useful for researchers to reflect on the role of interviewers in rich false memory studies and to consider in advance what their approach will be. Decisions about the interview style and coding scheme ought to be made in unison (and ideally, pre-registered), as the interviewer has such an influence on what the participant is likely to speak about. As we have discussed, it is difficult to move the goalposts after the fact and employ a detail-based scheme when the interviews were not set up to assess the presence or absence of those specific details. In our study, we found it useful to combine the natural (imperfect) conversation between participant and interviewer with some standardised questions that they answered during and after the event (Do you remember this event? How vivid is your memory? etc.) and to consider the resulting data in a holistic manner.
In our Lost in the Mall replication, we reported a top-line false memory rate of 35%, which is in line with the rates reported across a range of similar studies (see Scoboria et al. 2017 for a mega-analysis of false memory implantation studies). Our position is certainly not to argue that any particular false memory rate is ‘correct’; as noted in our replication paper and in the above discussion, we advocate for the use of multiple coding methods, including self-report where appropriate. Just as importantly, we argue that memory reports should be evaluated holistically, with consideration of the context in which the reports were obtained (here, via a naturalistic conversation). We do not consider the use of reductive and over-simplistic count schemes to be a useful measure of memory (true or false) and reject the idea that memory prompts should be repeated back without alteration in order for a participant's recollection to be considered a memory.
The clinical and forensic implications of the Lost in the Mall study (and our replication study) remain clear and important. We note that the implantation methods used in these studies are fairly light-touch. Though the studies are enormously burdensome to conduct, involving contact with parents and multiple interviews and online surveys per participant, the actual manipulation of memory is quite mild. Participants are presented with a very short summary of a supposed event from their childhood and are asked to reflect on whether they remember it. That is all. As Scoboria and Mazzoni (2017) noted, this pales in comparison to the kind of memory distortion that might occur over years of suggestive therapy. We therefore respectfully submit that to quibble over the precise rate of false memory in a given study is essentially to miss the point regarding the potential harms to therapeutic patients (Wade et al. 2025).
Gillian Murphy: writing – original draft, conceptualization. Ciara M. Greene: conceptualization, writing – review and editing.

The authors have nothing to report.

The authors declare no conflicts of interest.