Achieving complete coverage in complex areas is a critical objective for tasks such as cleaning, painting, maintenance, and inspection. However, existing robots in the market, with their fixed morphologies, face limitations when accessing confined spaces. Reconfigurable tiling robots provide a feasible solution to this challenge: by shapeshifting among their available morphologies to adapt to the varying conditions of complex environments, these robots can improve the efficiency of complete coverage. However, the ability to change shape is constrained by energy usage. Hence, an optimal strategy is needed to generate a trajectory that covers confined areas with minimal reconfiguration actions while accounting for the finite set of possible shapes. This paper proposes a complete coverage planning (CCP) framework for a reconfigurable tiling robot called hTetrakis, which consists of three polyiamond blocks. The CCP framework leverages Deep Reinforcement Learning (DRL) to derive an optimal action policy within a polyiamond shape-based workspace. By maximizing cumulative rewards to optimize the overall kinetic energy-based cost weight, the proposed DRL model plans the hTetrakis shapes and trajectories simultaneously. To this end, the DRL model combines Convolutional Neural Networks (CNNs) with a Long Short-Term Memory (LSTM) network and adopts the Actor–Critic with Experience Replay (ACER) approach for off-policy decision-making. By producing trajectories with reduced cost and time, the proposed CCP framework surpasses conventional heuristic optimization methods that rely on tiling strategies, such as Particle Swarm Optimization (PSO), Differential Evolution (DE), the Genetic Algorithm (GA), and Ant Colony Optimization (ACO).
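To make the reward structure concrete, the following is a minimal illustrative sketch, not the paper's actual model: a per-step reward that trades newly covered cells against an energy-based cost, with a higher cost for reconfiguration actions than for plain moves. All names and cost constants here are assumptions chosen for illustration.

```python
# Hypothetical energy-weighted coverage reward; constants and names are
# illustrative assumptions, not values from the hTetrakis paper.

MOVE_COST = 1.0       # assumed energy cost of translating the robot
RECONF_COST = 5.0     # assumed (higher) energy cost of a shape change
COVER_REWARD = 2.0    # reward per newly covered workspace cell

def step_reward(newly_covered: int, action: str) -> float:
    """Coverage gain minus the kinetic-energy-based cost of the action
    ('move' or 'reconfigure')."""
    cost = RECONF_COST if action == "reconfigure" else MOVE_COST
    return COVER_REWARD * newly_covered - cost

# A trajectory covering the same cells with fewer reconfigurations
# accumulates a higher return, which is what the DRL agent maximizes.
traj_a = [("move", 1), ("reconfigure", 2), ("move", 1)]        # one shape change
traj_b = [("reconfigure", 1), ("reconfigure", 2), ("reconfigure", 1)]
ret_a = sum(step_reward(n, a) for a, n in traj_a)   # 1.0
ret_b = sum(step_reward(n, a) for a, n in traj_b)   # -7.0
```

Under these assumed weights, maximizing cumulative reward naturally favors trajectories with fewer reconfiguration actions, mirroring the minimal-reconfiguration objective described above.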