For multivariate nonparametric regression, doubly penalized ANOVA modeling (DPAM) has recently been proposed, using hierarchical total variations (HTVs) and empirical norms as penalties on the component functions such as main effects and multi-way interactions in a functional ANOVA decomposition of the underlying regression function. The two penalties play complementary roles: the HTV penalty promotes sparsity in the selection of basis functions within each component function, whereas the empirical-norm penalty promotes sparsity in the selection of component functions. To facilitate large-scale training of DPAM using backfitting or block minimization, two suitable primal-dual algorithms are developed, including both batch and stochastic versions, for updating each component function in single-block optimization. Existing applications of primal-dual algorithms are intractable for DPAM with both HTV and empirical-norm penalties. The validity and advantage of the stochastic primal-dual algorithms are demonstrated through extensive numerical experiments, compared with their batch versions and a previous active-set algorithm, in large-scale scenarios.