By Lowell Thompson and Ashley Hou
This is a collaborative tutorial aimed at simplifying a common machine learning method known as singular value decomposition. Learn how this technique is applied in computational neuroscience research as well!
Singular value decomposition (SVD) is a method to factorize an arbitrary \(m\ \times\ n\) matrix, \(A\), into two orthogonal matrices, \(U\) and \(V\), and a diagonal matrix \(\Sigma\), so that \(A = U\Sigma V^T\). The diagonal entries of \(\Sigma\), called singular values, are non-negative and arranged in decreasing order. The columns of \(U\) and \(V\) are the left and right singular vectors, respectively. Therefore, we can express \(U\Sigma V^T\) as a weighted sum of outer products of the corresponding left and right singular vectors, \(\sum_i \sigma_i u_i v_i^T\).
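The factorization above can be checked numerically. The following is a minimal sketch using NumPy's `np.linalg.svd`; the 5 x 3 random matrix and the variable names are illustrative assumptions, not part of the original tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))  # arbitrary example matrix (assumption)

# full_matrices=False returns the "thin" SVD: U is 5 x 3, Vt is 3 x 3.
# s holds the singular values (the diagonal of Sigma) as a 1-D array.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# The singular values are returned in decreasing order.
is_sorted = bool(np.all(np.diff(s) <= 0))

# Rebuild A as the weighted sum of outer products sigma_i * u_i * v_i^T.
A_sum = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
reconstruction_ok = bool(np.allclose(A, A_sum))

print(is_sorted, reconstruction_ok)
```

Note that `np.linalg.svd` returns \(V^T\) rather than \(V\), so the right singular vectors are the rows of `Vt`.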
In the neuroscience application considered here, we have a matrix \(R\) of the firing rates of a given neuron, where the first dimension represents different frontoparallel motion directions and the second represents disparity (a measure that helps determine the depth of an object). Relationships between preferences for direction and depth could serve as a potential mechanism underlying 3D motion perception. SVD can be used to determine whether a neuron’s joint tuning function for these properties is separable or inseparable. Separability entails a constant relationship between the two properties: a particular direction preference will be maintained across all disparity levels, and vice versa. If this were the case, then every row (and every column) of the firing rate matrix would be a scaled copy of a single vector, or tuning function. Such a matrix is known as a rank 1 matrix.
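To make the idea of a separable tuning function concrete, here is a small sketch that builds a firing rate matrix as the outer product of one direction tuning curve and one disparity tuning curve. The curve shapes, peak locations, and the 40 spikes/s scale are hypothetical choices for illustration, not values from recorded neurons.

```python
import numpy as np

# Hypothetical stimulus grid: 8 motion directions and 9 disparity levels.
directions = np.deg2rad(np.arange(0, 360, 45))
disparities = np.linspace(-1.6, 1.6, 9)

# Hypothetical tuning curves: a von Mises-like direction curve peaking at
# 90 degrees and a Gaussian disparity curve peaking at -0.4 (near disparity).
direction_tuning = np.exp(2.0 * (np.cos(directions - np.deg2rad(90)) - 1))
disparity_tuning = np.exp(-0.5 * ((disparities + 0.4) / 0.5) ** 2)

# A separable neuron: every row is a scaled copy of the disparity curve.
R = 40.0 * np.outer(direction_tuning, disparity_tuning)

rank = np.linalg.matrix_rank(R)

# Index of the preferred disparity for each direction (same for all rows).
preferred_index = np.argmax(R, axis=1)

print(rank, preferred_index)
```

Because each row differs from the others only by a gain factor, the preferred disparity is identical at every motion direction and the matrix has rank 1.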
Using SVD, we can approximate \(R\) by \(\sigma_1 u_1 v_1^T\), which is obtained by truncating the sum after the first term. This is a rank 1 approximation of \(R\). If \(R\) is fully separable in direction and disparity, only the first singular value will be non-zero, indicating the matrix is of rank 1, and \(\sigma_1 u_1 v_1^T\) will reproduce \(R\) exactly. In general, the closer \(R\) is to being separable, the more dominant the first singular value \(\sigma_1\) will be over the other singular values, and the closer the approximation \(\sigma_1 u_1 v_1^T\) will be to the original matrix \(R\).
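The rank 1 approximation can be sketched as follows. The helper names (`rank1_approximation`, `relative_error`, `tuning_matrix`) and the synthetic neuron model, in which a hypothetical `shift` parameter makes the preferred disparity drift with motion direction, are assumptions made for illustration only.

```python
import numpy as np

directions = np.deg2rad(np.arange(0, 360, 45))  # 8 motion directions
disparities = np.linspace(-1.6, 1.6, 9)          # 9 disparity levels

def tuning_matrix(shift):
    """Synthetic firing rates; shift=0 gives a separable neuron."""
    # Preferred disparity for each direction (constant when shift == 0).
    centers = -0.4 + shift * np.cos(directions)
    # Direction-dependent gain, always positive.
    gain = 1.5 + np.cos(directions - np.pi / 2)
    bumps = np.exp(-0.5 * ((disparities[None, :] - centers[:, None]) / 0.5) ** 2)
    return 40.0 * gain[:, None] * bumps

def rank1_approximation(R):
    """Keep only the first term of the SVD sum: sigma_1 * u_1 * v_1^T."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return s[0] * np.outer(U[:, 0], Vt[0, :])

def relative_error(R):
    """Frobenius-norm error of the rank 1 approximation, relative to R."""
    return np.linalg.norm(R - rank1_approximation(R)) / np.linalg.norm(R)

error_separable = relative_error(tuning_matrix(0.0))
error_inseparable = relative_error(tuning_matrix(0.8))

print(error_separable, error_inseparable)
```

In this sketch the separable neuron is recovered up to floating-point error, while the neuron whose disparity preference changes with direction leaves a substantial residual.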
Below is an interactive example that can help you visualize this concept. On the left is the representation of a neuron’s joint tuning function, given by a matrix whose rows and columns are defined by the “Direction” and “Disparity” properties. The values within each cell of the matrix are an example neuron’s firing rate for the given combination of these properties. On the right, we have altered the representation of the matrix by plotting the firing rate of the neuron across different disparity levels for each direction of lateral motion. The disparity that elicits the peak firing rate is deemed the neuron’s disparity preference at that particular frontoparallel motion direction. Notice that regardless of motion direction, this neuron maintains a similar, slightly negative disparity preference (preferring objects near the observer). These cell types are predominantly found in the middle temporal (MT) cortex of rhesus macaques, an area of the brain that seems to be specialized for both 2D and 3D motion processing (Smolyanskaya, Ruff, & Born, 2013; Sanada & DeAngelis, 2014).
Using the slider at the bottom of the graph, you can manipulate the example neuron’s separability. As you move the slider to the far right side, representing the largest degree of inseparability for this example, the disparity tuning curves develop a peculiar pattern. That is, for directions of motion that are nearly opposite to one another (~180 degrees apart), the disparity preference of the neuron is flipped. These types of neurons are predominantly found in an area that lies just above MT in the cortical hierarchy, the medial superior temporal area (MST). Cells of this type have been termed “direction-dependent disparity selectivity” (DDD) neurons, and are potentially useful in differentiating self-motion from object-motion, although this hypothesis has not been critically evaluated (Roy et al., 1992b; Roy & Wurtz, 1990; Yang et al., 2011).
Another plot is displayed below that illustrates how the singular values of a matrix change depending on the cell’s separability. Notice that as the cell becomes less separable, the magnitude of the first singular value decreases, and the contribution of the other singular values begins to increase. The inset plot illustrates this using a common metric for evaluating separability, known as the degree of inseparability. This is simply the ratio of the first singular value to the sum of all the singular values, \(\sigma_1 / \sum_i \sigma_i\), which equals 1 for a perfectly separable cell and falls as the other singular values grow.
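The ratio described above is straightforward to compute from the singular values alone. This is a minimal sketch; the function name `first_singular_value_ratio` and the two test matrices are hypothetical, and the function assumes the input matrix is not all zeros.

```python
import numpy as np

def first_singular_value_ratio(R):
    """Ratio of the first singular value to the sum of all singular values."""
    # compute_uv=False returns only the singular values, in decreasing order.
    s = np.linalg.svd(R, compute_uv=False)
    return s[0] / s.sum()

# A rank 1 (fully separable) matrix: all weight sits on the first singular value.
separable = np.outer([1.0, 2.0, 3.0], [0.5, 1.0, 0.25])
ratio_separable = first_singular_value_ratio(separable)

# The 3 x 3 identity has three equal singular values, so the ratio is 1/3.
ratio_identity = first_singular_value_ratio(np.eye(3))

print(ratio_separable, ratio_identity)
```

For a matrix with \(k\) singular values the ratio ranges from \(1/k\), when all singular values are equal, up to 1, when only the first is non-zero.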
Lastly, we’ve provided another interactive graph where the left portion is the same example neuron from the previous graph. On the right is the prediction generated from \(\sigma_1 u_1 v_1^T\). As you move the slider to the right, increasing the degree of inseparability, you’ll notice the prediction becomes increasingly dissimilar to the actual firing matrix.