Rolling machinery is ubiquitous in power transmission and transformation equipment, but it is prone to severe faults during long-term operation. Automatic fault diagnosis therefore plays an important role in the production safety of power equipment. This paper proposes a novel cross-domain co-attention network (CDCAN) for fault diagnosis of rolling machinery. Multiscale features are extracted separately from the time and frequency domains of the raw vibration signal and then fused with a co-attention mechanism. Because the architecture fuses layer-wise activations, CDCAN can learn a shared representation that is consistent across the time and frequency domains. This property helps CDCAN provide more faithful diagnoses than state-of-the-art methods. Experiments on bearing and gearbox datasets are conducted to evaluate its fault-diagnosis performance. Extensive experimental results and comprehensive analysis demonstrate the superiority of the proposed CDCAN in terms of diagnosis correctness and adaptability.
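To make the fusion idea concrete, the following is a minimal, parameter-free sketch of co-attention between multiscale time-domain and frequency-domain features. It is not the CDCAN architecture itself: the function names (`magnitude_spectrum`, `multiscale_features`, `co_attention`, `fuse`), the moving-average scales, the bin count, and the scaled dot-product affinity are all illustrative assumptions, and a real network would use learned convolutional layers and apply the fusion at several depths.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """One-sided magnitude spectrum via a naive O(n^2) DFT (stdlib only)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal))) / n
        for k in range(n // 2)
    ]


def pool_to_bins(seq, bins):
    """Average-pool a sequence into `bins` roughly equal segments."""
    n = len(seq)
    out = []
    for b in range(bins):
        lo, hi = b * n // bins, (b + 1) * n // bins
        out.append(sum(seq[lo:hi]) / (hi - lo))
    return out


def multiscale_features(seq, window_sizes=(2, 4, 8), bins=8):
    """Assumed stand-in for multiscale extraction: one smoothed,
    pooled feature vector per moving-average window size."""
    feats = []
    for w in window_sizes:
        smoothed = [sum(seq[i:i + w]) / w for i in range(len(seq) - w + 1)]
        feats.append(pool_to_bins(smoothed, bins))
    return feats


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def co_attention(time_feats, freq_feats):
    """Each domain attends to the other through a shared affinity matrix."""
    d = len(time_feats[0])
    affinity = [
        [sum(a * b for a, b in zip(t, f)) / math.sqrt(d) for f in freq_feats]
        for t in time_feats
    ]
    t2f = [softmax(row) for row in affinity]             # time attends to frequency
    f2t = [softmax(list(col)) for col in zip(*affinity)]  # frequency attends to time
    time_ctx = [[sum(w * f[k] for w, f in zip(ws, freq_feats)) for k in range(d)]
                for ws in t2f]
    freq_ctx = [[sum(w * t[k] for w, t in zip(ws, time_feats)) for k in range(d)]
                for ws in f2t]
    return t2f, f2t, time_ctx, freq_ctx


def fuse(signal, bins=8):
    """Concatenate the co-attended summaries of both domains."""
    time_feats = multiscale_features(signal, bins=bins)
    freq_feats = multiscale_features(magnitude_spectrum(signal), bins=bins)
    _, _, time_ctx, freq_ctx = co_attention(time_feats, freq_feats)

    def summarize(feats, ctx):
        # Residual sum of own features and cross-domain context,
        # then the mean over scales.
        rows = [[a + b for a, b in zip(f, c)] for f, c in zip(feats, ctx)]
        return [sum(col) / len(rows) for col in zip(*rows)]

    return summarize(time_feats, time_ctx) + summarize(freq_feats, freq_ctx)


# Synthetic 64-sample sine standing in for a raw vibration signal.
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
fused = fuse(signal)
print(len(fused))  # 16: eight time-domain bins plus eight frequency-domain bins
```

In this sketch the fused vector would be passed to a classifier. The two softmax directions are what make the attention "co": the same affinity matrix weights frequency features for each time scale and time features for each frequency scale.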