Models and Methods IV: Fit PSD Model by SVI Algorithm
Jonathon Chow
2022-10-04
Source: vignettes/08_theory_svi.Rmd
Introduction
We use the stochastic variational inference (SVI) algorithm to fit the PSD model (Gopalan et al. 2016).
PSD Model
Suppose we have \(I\) diploid individuals genotyped at \(J\) biallelic loci. Let \((g_{ij}^1,g_{ij}^2)\) represent the genotype at marker \(j\) of individual \(i\), where \(g_{ij}^a\) is the observed number of copies of allele 1 at position \(a\). Thus \((g_{ij}^1,g_{ij}^2)\) equals \((1,1)\), \((1,0)\), \((0,1)\), or \((0,0)\) when \(i\) has genotype 1/1, 1/2, 2/1, or 2/2 at marker \(j\), respectively. Let \(g_{ij}=g_{ij}^1+g_{ij}^2\). These individuals are drawn from an admixed population with contributions from \(K\) postulated ancestral populations. Population \(k\) contributes a fraction \(p_{ik}\) of individual \(i\)’s genome, with \(\sum_{k=1}^Kp_{ik}=1\) and \(p_{ik}\geq 0\). Allele 1 at SNP \(j\) has frequency \(f_{kj}\) in population \(k\), with \(0\leq f_{kj}\leq 1\). As in the EM algorithm, we introduce \((z_{ij}^1,z_{ij}^2)\), where \(z_{ij}^a\in\{1,\ldots,K\}\) denotes the population from which the allele copy at position \(a\) of marker \(j\) in individual \(i\) actually originates. Let \(z_{ijk}^a=\textbf{1}(z_{ij}^a=k)\); clearly \(z_{ijk}^a\in\{0,1\}\) and \(\sum_{k=1}^Kz_{ijk}^a=1\).
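Marginalizing over the latent origins \(z\), each allele copy at SNP \(j\) in individual \(i\) carries allele 1 with probability \(\sum_{k=1}^K p_{ik}f_{kj}\), so \(g_{ij}\sim\mathrm{Binomial}\!\left(2,\sum_k p_{ik}f_{kj}\right)\). The R sketch below, with hypothetical names rather than the package’s implementation, evaluates this marginal log-likelihood for a genotype matrix given \(P\) and \(F\):

```r
# Minimal sketch: marginal PSD log-likelihood of a genotype matrix G
# (I x J, entries 0/1/2) given admixture proportions P (I x K) and
# population allele frequencies F (K x J). Names are illustrative.
psd_loglik <- function(G, P, F) {
  Pi <- P %*% F  # I x J matrix; Pi[i, j] = sum_k p_ik * f_kj
  sum(dbinom(G, size = 2, prob = Pi, log = TRUE))
}
```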
Unlike the EM and SQP algorithms, variational inference does not maximize the likelihood \(\mathcal{L}(G|P,F)\) but instead approximates the posterior \(P(Z,P,F|G)\), which also reflects the difference between the frequentist and Bayesian schools. To be specific, under the previous assumptions the observed variable is \(G\), while the unobserved variables comprise the latent variable \(Z\) and the parameters \(P\) and \(F\). We treat all unobserved variables as variational objects, all of which are involved in the estimation. At the same time, we treat the model parameter \(K\), which is not involved in the estimation, as a hyperparameter.
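For concreteness, a standard mean-field choice factorizes the approximating distribution over all unobserved variables,
\[q(Z,P,F)=\prod_{i=1}^{I}q(p_i)\;\prod_{k=1}^{K}\prod_{j=1}^{J}q(f_{kj})\;\prod_{i,j,a}q(z_{ij}^a),\]
and maximizes the evidence lower bound (ELBO)
\[\mathbb{E}_q\big[\log P(G,Z,P,F)\big]-\mathbb{E}_q\big[\log q(Z,P,F)\big]\]
in place of the intractable posterior; the specific family used for the PSD model is the one introduced with the VI algorithm.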
Stochastic Variational Inference Algorithm
Modern applications of probability models often require analyzing massive data sets. However, most posterior inference algorithms do not scale easily, and coordinate ascent variational inference (CAVI) is no exception: the coordinate ascent structure of the algorithm requires iterating through the entire data set at each iteration, so as the data set grows, each iteration becomes more computationally expensive (Blei, Kucukelbir, and McAuliffe 2017).
An alternative to coordinate ascent is gradient-based optimization, which climbs the ELBO by computing and following its gradient at each iteration. This perspective is the key to scaling up variational inference via stochastic variational inference (SVI) (Hoffman et al. 2013), a method that combines natural gradients (Amari 1998) with stochastic optimization. It repeatedly (a) subsamples a data point from the full data set; (b) uses the current global parameters to compute the optimal local parameters for the subsampled data point; and (c) adjusts the current global parameters accordingly.
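The R sketch below illustrates one iteration of this generic scheme. The function names (`sample_datapoint`, `optimal_local`, `global_update`) are placeholders rather than the package’s API; the essential structure is the convex combination of the old global parameters with a noisy target computed from a single subsampled data point.

```r
# Schematic SVI iteration (illustrative names, not the package's API).
# 'global' holds the global variational parameters, 'data' the full data
# set of N points, and 't' the iteration counter.
svi_step <- function(global, data, N, t, tau0 = 1, kappa = 0.5) {
  rho    <- (tau0 + t)^(-kappa)             # decaying step size rho_t
  point  <- sample_datapoint(data)          # (a) subsample one data point
  local  <- optimal_local(global, point)    # (b) optimal local parameters
  target <- global_update(local, point, N)  # noisy target, as if the data
                                            # set were N copies of 'point'
  (1 - rho) * global + rho * target         # (c) adjust global parameters
}
```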
The convergence of SVI to a local optimum of the ELBO is established in Hoffman et al. (2013) using results from stochastic optimization, provided the step sizes decay appropriately.
Fit PSD Model by SVI Algorithm
For our model, the choice of variational family is the same as in the VI algorithm. Our goal is to update the global variable \(P\) iteratively. In each iteration, we first sample a SNP location \(j\) together with all observations \(g_{ij}\) at that location. Then, on the sampled data, we update \(F\) with \(P\) fixed, in the same way as in the VI algorithm, until convergence. Next, we update the global variable \[\tilde{p}_{ik}^{(t+1)}=(1-\rho_t)\tilde{p}_{ik}^{(t)}+\rho_t\big[\alpha_k+J(\tilde{z}_{ijk}^1+\tilde{z}_{ijk}^2)\big],\] where the step size is \(\rho_t=(\tau_0+t)^{-\kappa}\). We set \(\tau_0\) to 1 and \(\kappa\) to 0.5. We hold out a validation set that does not participate in the training, compute the log-likelihood on it after each iteration, and stop when its change is small enough.
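The global update above can be written compactly in matrix form. The sketch below is a minimal illustration with hypothetical names, not the package’s internals: `P_tilde` is the \(I\times K\) matrix of variational parameters for \(P\), `Z1` and `Z2` are \(I\times K\) matrices holding the expectations \(\tilde{z}_{ijk}^1\) and \(\tilde{z}_{ijk}^2\) at the sampled SNP \(j\), and `alpha` is the length-\(K\) Dirichlet prior.

```r
# Minimal sketch of the SVI global update for P (illustrative names).
# P_tilde: I x K variational parameters; Z1, Z2: I x K expectations of the
# latent assignments at the sampled SNP; alpha: length-K Dirichlet prior;
# J: number of SNPs; t: iteration counter.
update_p_tilde <- function(P_tilde, Z1, Z2, alpha, J, t, tau0 = 1, kappa = 0.5) {
  rho    <- (tau0 + t)^(-kappa)                    # step size rho_t
  target <- sweep(J * (Z1 + Z2), 2, alpha, "+")    # alpha_k + J * (z1 + z2)
  (1 - rho) * P_tilde + rho * target
}
```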