Differentially private learning algorithms inject noise into the learning process. The most common such algorithm, DP-SGD, adds independent Gaussian noise in each iteration. Motivated by practical considerations in federated learning, recent work on matrix factorization mechanisms has shown empirically that introducing correlations in the noise can greatly improve utility. We characterize the asymptotic objective suboptimality for any choice of the correlation function, giving precise analytical bounds for linear regression. Using these bounds, we show how correlated noise provably improves upon vanilla DP-SGD as a function of problem parameters such as the effective dimension and condition number. Moreover, our analytical expression for the near-optimal correlation function circumvents the cubic complexity of the semi-definite program used to optimize the noise correlation in prior work. We validate these theoretical results with experiments on private deep learning in both centralized and federated settings. Our work matches or outperforms prior work while being efficient in both computation and memory.
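
To make the distinction between independent and correlated noise concrete, the following is a minimal sketch rather than the method analyzed in this work. It assumes the noise added at iteration t is a weighted sum of the current and past i.i.d. Gaussian draws, with weights `beta` standing in for a correlation function. The names `correlated_noise`, `noisy_sgd`, `beta`, `lr`, and `sigma` are hypothetical, the weight values are illustrative only, and gradient clipping and privacy calibration of the noise scale are deliberately omitted.

```python
import numpy as np


def correlated_noise(z, beta):
    """Return w with w[t] = sum_k beta[k] * z[t - k], for k = 0..min(t, len(beta) - 1).

    z:    array of shape (T, d), i.i.d. Gaussian draws, one row per iteration.
    beta: 1-D sequence of correlation weights (hypothetical values for illustration).
    """
    z = np.asarray(z, dtype=float)
    beta = np.asarray(beta, dtype=float)
    w = np.zeros_like(z)
    for k, b in enumerate(beta[: len(z)]):
        # Shift the draw sequence by k steps and accumulate with weight beta[k].
        w[k:] += b * z[: len(z) - k]
    return w


def noisy_sgd(X, y, beta, lr=0.1, sigma=1.0, seed=0):
    """One pass of noisy SGD on least squares (no clipping, no privacy accounting)."""
    rng = np.random.default_rng(seed)
    T, d = X.shape
    noise = correlated_noise(sigma * rng.standard_normal((T, d)), beta)
    theta = np.zeros(d)
    for t in range(T):
        # Per-example gradient of 0.5 * (x^T theta - y)^2.
        grad = (X[t] @ theta - y[t]) * X[t]
        theta = theta - lr * (grad + noise[t])
    return theta


# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
y = X @ np.ones(5)

theta_independent = noisy_sgd(X, y, beta=[1.0])       # independent noise per iteration
theta_correlated = noisy_sgd(X, y, beta=[1.0, -0.5])  # anti-correlated with the previous draw
```

Setting `beta=[1.0]` recovers independent per-iteration noise, while any longer `beta` correlates the noise across iterations. Which weights actually help, and by how much, is the question the analysis addresses; this sketch makes no claim about that.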