Independent Component Analysis, or ICA, is a concept closely related to PCA. The components are extracted with a mathematical process called SVD, singular value decomposition. SVD is ubiquitous and widely applied in the real world, so it is worth understanding deeply.
Given a massive set of chaotic data points, we can identify properties such as the mean, standard deviation, variance, kurtosis, etc. along one dimension. Now, if we know there are two dimensions, we can compute the “variance” relative to an axis at angle theta:
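One way to write this, assuming the data are mean-subtracted and writing the two coordinates of the j-th point as x_1(j) and x_2(j) (this notation is an assumption, as is dropping any constant normalization factor), is to sum the squared projections of the points onto the unit direction (cos theta, sin theta):

```latex
\mathrm{Var}(\theta) = \sum_{j=1}^{N} \left[ x_1(j)\cos\theta + x_2(j)\sin\theta \right]^2
```

Each bracket is the dot product of one data point with the direction of the axis, so Var(theta) measures how spread out the data are along that axis.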

Taking the derivative of Var(theta) with respect to theta and setting it to zero to find the max/min value, the equation simplifies to
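As a sketch of that step, assume zero-mean data with coordinates x_1(j), x_2(j) and take Var(theta) to be the sum of squared projections onto the direction (cos theta, sin theta); these symbols are assumptions about the notation, not something fixed by the text. The stationarity condition then works out roughly as:

```latex
\begin{aligned}
\frac{d}{d\theta}\mathrm{Var}(\theta)
  &= 2\sum_{j}\left[x_1(j)\cos\theta + x_2(j)\sin\theta\right]
             \left[-x_1(j)\sin\theta + x_2(j)\cos\theta\right] \\
  &= \sin 2\theta \sum_{j}\left[x_2(j)^2 - x_1(j)^2\right]
     + 2\cos 2\theta \sum_{j} x_1(j)\,x_2(j) = 0 \\[4pt]
\Rightarrow\quad
\theta_0 &= \frac{1}{2}\tan^{-1}\!\left[
  \frac{-2\sum_{j} x_1(j)\,x_2(j)}{\sum_{j}\left[x_2(j)^2 - x_1(j)^2\right]}
\right]
\end{aligned}
```

The condition is satisfied by two angles 90 degrees apart: one is the direction of maximum variance and the other the direction of minimum variance.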

Big question here: why square? The dot product itself already measures how each data point projects onto the axis we are trying to find, so why not take the derivative of the plain dot product to find the max/min value?
Further, the fourth power is then used to compute another angle (axis) of rotation.
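On the "why square?" question, one observation that may help: for mean-subtracted data, the plain (signed) projections sum to zero along every axis, so there is nothing to maximize until the projections are squared. The NumPy sketch below checks this on synthetic data and also computes the angle of maximum variance in closed form. The data, the 30-degree rotation, and the function names `projected_sums` and `max_variance_angle` are all hypothetical choices for illustration, not taken from the lecture.

```python
import numpy as np


def projected_sums(x1, x2, theta):
    """Return (sum of signed projections, sum of squared projections)
    of the points (x1, x2) onto the direction (cos theta, sin theta)."""
    proj = x1 * np.cos(theta) + x2 * np.sin(theta)
    return proj.sum(), (proj ** 2).sum()


def max_variance_angle(x1, x2):
    """Angle that maximizes the sum of squared projections.

    It satisfies tan(2*theta) = 2*sum(x1*x2) / sum(x1**2 - x2**2);
    arctan2 picks the branch that is the maximum, not the minimum.
    Assumes x1 and x2 are already mean-subtracted.
    """
    a = np.sum(x1 ** 2)
    b = np.sum(x2 ** 2)
    c = np.sum(x1 * x2)
    return 0.5 * np.arctan2(2.0 * c, a - b)


# Synthetic example: an elongated Gaussian cloud rotated by 30 degrees.
rng = np.random.default_rng(0)
true_angle = np.pi / 6
stretched = rng.normal(size=(2, 2000)) * np.array([[3.0], [0.5]])
rotation = np.array([
    [np.cos(true_angle), -np.sin(true_angle)],
    [np.sin(true_angle), np.cos(true_angle)],
])
x = rotation @ stretched
x = x - x.mean(axis=1, keepdims=True)  # subtract the mean of each coordinate
x1, x2 = x

theta0 = max_variance_angle(x1, x2)
signed_sum, squared_sum = projected_sums(x1, x2, theta0)

print("recovered angle (degrees):", np.degrees(theta0))
print("sum of signed projections:", signed_sum)
print("sum of squared projections:", squared_sum)
```

The signed sum comes out at floating-point zero whichever angle is tried, while the squared sum peaks near the direction along which the cloud was stretched. The fourth-power step is not sketched here.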

The steps are detailed in Professor Nathan Kutz’s video. He is the only teacher I have found who fully decomposes ICA in this manner; however, the explanation for my two questions above is missing.