Large language models (LLMs) have demonstrated excellent performance across a wide range of natural language tasks. In practical applications, however, LLMs frequently hallucinate, generating content that deviates from the given instructions or from facts, especially in complex reasoning tasks. Existing research enhances models' reasoning capabilities by simulating human collaborative behavior through multi-agent debate, voting, and review. However, such simple multi-agent systems do not progressively verify every reasoning step, and they leave the problems of unstable response quality and agents' lack of continuous learning unaddressed. In this work, we therefore propose a Multi-agent Collaborative Filtering framework (MCF) that organizes agents into a cross-examination process: each reasoning step is cross-verified while the highest-quality responses are filtered and selected from the response space. To further equip agents with continuous learning, we propose methods for the automated construction and efficient retrieval of an experience repository. Extensive experiments on ten reasoning datasets spanning three categories (Arithmetic, Commonsense, and Symbolic) show that MCF enhances the diversity of large language model outputs, overcomes hallucinations, and selects effective responses from a rich response space. The experiments also verify that the experience repository improves agents' reasoning capabilities. Compared with state-of-the-art methods, MCF achieves superior performance.
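
The sketch below illustrates the cross-examination and filtering loop described above in minimal form: several agents each propose a candidate reasoning step, peer agents cross-examine every candidate, and the highest-rated step is kept. The `LLM` placeholder, the prompts, and the 0-10 scoring scheme are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of cross-examination-based collaborative filtering.
# Prompts, roles, and scoring are assumptions made for illustration only.
from dataclasses import dataclass
from typing import Callable, List

LLM = Callable[[str], str]  # placeholder for any chat-completion backend


@dataclass
class Agent:
    name: str
    llm: LLM

    def propose_step(self, question: str, steps_so_far: List[str]) -> str:
        context = "\n".join(steps_so_far)
        return self.llm(
            f"Question: {question}\nSteps so far:\n{context}\nNext step:"
        )

    def examine(self, question: str, candidate_step: str) -> float:
        # Cross-examination: a peer agent rates the candidate step from 0 to 10.
        reply = self.llm(
            f"Question: {question}\nProposed step: {candidate_step}\n"
            f"Rate its correctness from 0 to 10. Answer with a number:"
        )
        try:
            return float(reply.strip())
        except ValueError:
            return 0.0


def collaborative_filter_step(question: str, steps_so_far: List[str],
                              agents: List[Agent]) -> str:
    """Each agent proposes a step; peers cross-examine; keep the best candidate."""
    candidates = [a.propose_step(question, steps_so_far) for a in agents]
    best, best_score = candidates[0], float("-inf")
    for i, cand in enumerate(candidates):
        peers = [a for j, a in enumerate(agents) if j != i]
        score = sum(p.examine(question, cand) for p in peers) / max(len(peers), 1)
        if score > best_score:
            best, best_score = cand, score
    return best


if __name__ == "__main__":
    # Dummy backend for demonstration: proposes a fixed step and a fixed score,
    # just to show the control flow end to end.
    dummy: LLM = lambda prompt: "7" if "Rate" in prompt else "Add 2 and 3 to get 5."
    agents = [Agent(f"agent_{k}", dummy) for k in range(3)]
    print(collaborative_filter_step("What is 2 + 3?", [], agents))
```

Repeating this step-level filtering over the whole reasoning chain yields the progressive verification the abstract refers to; the experience repository would additionally store verified solutions for retrieval on later problems, which is omitted from this sketch.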