Disease mapping models estimate the spatio-temporal variation in population-level disease risks or rates across a set of areal units and time periods, aiming to identify temporal trends and spatial hotspots. This variation is commonly estimated with highly parameterised Bayesian hierarchical models containing a large number of random effects, which are assigned autoregressive and conditional autoregressive prior distributions. These models work well when there are tens of thousands of data points, but are likely to be computationally burdensome when this rises to hundreds of thousands or more. This paper proposes a computationally efficient alternative that can fit a range of spatio-temporal disease trends almost as well as existing highly parameterised models, but takes only around 5% to 40% of the time to fit. It achieves this by modelling the average spatial and temporal trends in the data with autoregressive-type random effects, augmented by an observation-driven process that uses functions of earlier data as additional covariates in the model. The efficacy of this methodology is tested by simulation, before being applied to the motivating study, which estimates the spatio-temporal trends in the prevalence of asthma, cancer, coronary heart disease and chronic obstructive pulmonary disease for small areas in England over multiple years.
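
As a rough illustration of what "functions of earlier data as additional covariates" could look like in practice, the sketch below builds two lagged covariates from a matrix of observed counts: each area's own log standardised ratio in the previous period, and the average of that quantity over its neighbours. This is not the specification used in the paper. The function name `lagged_covariates`, the 0.5 continuity correction and the choice of a neighbour average are all assumptions made for illustration only.

```python
import numpy as np


def lagged_covariates(counts, expected, adjacency):
    """Build illustrative observation-driven covariates (hypothetical helper).

    counts, expected: arrays of shape (K, N) for K areas and N time periods.
    adjacency: binary (K, K) neighbourhood matrix with a zero diagonal.
    Returns two (K, N - 1) arrays whose column j holds period-j values
    (0-based) and acts as a covariate for period j + 1.
    """
    counts = np.asarray(counts, dtype=float)
    expected = np.asarray(expected, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)

    # Log standardised ratio; the 0.5 offset (an assumption) avoids log(0).
    log_ratio = np.log((counts + 0.5) / (expected + 0.5))

    # Each area's own value from the previous time period.
    own_lag = log_ratio[:, :-1]

    # Mean of the neighbours' values from the previous time period.
    n_neighbours = adjacency.sum(axis=1, keepdims=True)
    neighbour_mean = (adjacency @ log_ratio) / np.maximum(n_neighbours, 1.0)
    neighbour_lag = neighbour_mean[:, :-1]

    return own_lag, neighbour_lag
```

Because these covariates are computed from data that are already observed, they can be treated as fixed when modelling each period. They add no random effects, which is consistent with the lower computational cost described above.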