Many representation learning methods have emerged to better exploit the properties of multi-view data. However, existing methods still leave room for improvement in two respects: 1) Most of them overlook the ex-ante interpretability of the model, which makes the model more complex and harder to understand; 2) They underutilize the potential of bi-topological spaces, which can supply additional structural information to the representation learning process. This shortcoming is detrimental when dealing with data that exhibits topological properties or complex geometric relationships between views. To address these challenges, we propose an optimization-oriented multi-view representation learning framework in implicit bi-topological spaces. On the one hand, we construct an intrinsically interpretable, end-to-end white-box model that directly performs representation learning while improving model transparency. On the other hand, integrating bi-topological space information into the network via manifold learning enables comprehensive exploitation of the data, ultimately enhancing the learned representations and improving performance on downstream tasks. Extensive experimental results demonstrate that the proposed method achieves promising performance and is effective for downstream tasks.
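Since the abstract only sketches the mechanism, the snippet below is a minimal, hypothetical illustration of one way view-specific topological structure could be injected via manifold (graph Laplacian) regularization in a two-view setting; the k-NN graph construction, the linear two-view autoencoder, and the weights `gamma1`/`gamma2` are assumptions for illustration, not the paper's actual model.

```python
# Illustrative sketch (not the proposed method): a two-view linear autoencoder whose
# shared code Z is regularized by graph Laplacians built separately from each view,
# approximating the idea of coupling two topological structures via manifold learning.
import torch
import torch.nn as nn


def knn_laplacian(x, k=10):
    """Unnormalized Laplacian L = D - W of a symmetric k-NN affinity graph (assumed construction)."""
    dist = torch.cdist(x, x)                               # pairwise Euclidean distances
    knn = dist.topk(k + 1, largest=False).indices[:, 1:]   # nearest neighbors, excluding self
    n = x.shape[0]
    w = torch.zeros(n, n)
    w.scatter_(1, knn, 1.0)                                # binary affinities to k neighbors
    w = torch.maximum(w, w.t())                            # symmetrize the graph
    return torch.diag(w.sum(1)) - w


class TwoViewAE(nn.Module):
    """Linear per-view encoders fused by averaging, with per-view decoders (placeholder model)."""
    def __init__(self, d1, d2, d_latent):
        super().__init__()
        self.enc1, self.enc2 = nn.Linear(d1, d_latent), nn.Linear(d2, d_latent)
        self.dec1, self.dec2 = nn.Linear(d_latent, d1), nn.Linear(d_latent, d2)

    def forward(self, x1, x2):
        z = 0.5 * (self.enc1(x1) + self.enc2(x2))          # simple fusion of view-specific codes
        return z, self.dec1(z), self.dec2(z)


def manifold_penalty(z, L1, L2, gamma1=0.1, gamma2=0.1):
    """tr(Z^T L Z) penalizes codes that vary sharply across edges of each view's graph."""
    return gamma1 * torch.trace(z.t() @ L1 @ z) + gamma2 * torch.trace(z.t() @ L2 @ z)


# Toy usage with random data (hypothetical dimensions); the reconstruction term keeps the
# code informative, while the manifold penalty enforces smoothness over both graphs.
x1, x2 = torch.randn(128, 20), torch.randn(128, 30)
L1, L2 = knn_laplacian(x1), knn_laplacian(x2)
model = TwoViewAE(20, 30, d_latent=16)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
mse = nn.MSELoss()
for _ in range(100):
    z, r1, r2 = model(x1, x2)
    loss = mse(r1, x1) + mse(r2, x2) + manifold_penalty(z, L1, L2)
    opt.zero_grad()
    loss.backward()
    opt.step()
```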