Python OpenCV PCACompute Eigenvalues
Using Python 2.7.5 and OpenCV on OSX, I run PCA on a sequence of images (cols are pixels, rows are frames, as per this answer).

How do I get the eigenvalues corresponding to the eigenvectors? It looks like they are a property of the PCA object in C++, but the Python equivalent, PCACompute(), is a simple function. It seems strange to omit such a key part of PCA.

matmul.cpp confirms that PCA::operator() is what's being used by PCACompute(), and that the eigenvalues are discarded. So I did this:
    # The following mimics the PCA::operator() implementation from OpenCV's
    # matmul.cpp, which is wrapped by Python's cv2.PCACompute(). We can't
    # use PCACompute() itself, though, as it discards the eigenvalues.

    # Scrambled is faster when nVariables >> nObservations. The bitmask 0 is
    # therefore default / redundant, but included to abide by the online docs.
    covar, mean = cv2.calcCovarMatrix(PCAInput, cv2.cv.CV_COVAR_SCALE |
                                                cv2.cv.CV_COVAR_ROWS |
                                                cv2.cv.CV_COVAR_SCRAMBLED)

    eVal, eVec = cv2.eigen(covar, computeEigenvectors=True)[1:]

    # Conversion + normalisation required due to 'scrambled' mode
    eVec = cv2.gemm(eVec, PCAInput - mean, 1, None, 0)
    # apply_along_axis() slices 1D rows, but normalize() returns 4x1 vectors
    eVec = numpy.apply_along_axis(lambda n: cv2.normalize(n).flat, 1, eVec)
(Simplifying assumptions: rows = observations, cols = variables; and there are many more variables than observations. Both are true in my case.)
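The "scrambled" covariance trick that the code above relies on can be sanity-checked in pure NumPy, without cv2. This is a minimal sketch on hypothetical toy data (the names `X`, `small`, `full` are mine, not from the code above): the nonzero eigenvalues of the small n-by-n matrix agree with those of the full covariance, and its eigenvectors map back to variable space by multiplying through the centred data, mirroring the gemm + normalize steps.

```python
import numpy as np

# Hypothetical toy data: 5 observations (rows) x 50 variables (cols),
# i.e. many more variables than observations, as assumed above.
rng = np.random.RandomState(0)
X = rng.rand(5, 50)
Xc = X - X.mean(axis=0)            # centre each variable
n = X.shape[0]

# 'Scrambled' (snapshot) covariance: n x n instead of 50 x 50.
small = Xc.dot(Xc.T) / n
w, v = np.linalg.eigh(small)       # eigh since small is symmetric

# Map the small eigenvectors back to variable space and normalise,
# analogous to the gemm + normalize steps in the cv2 code.
eVec = v.T.dot(Xc)
eVec /= np.linalg.norm(eVec, axis=1, keepdims=True)

# Full covariance for comparison (what the non-scrambled mode computes).
full = Xc.T.dot(Xc) / n
wf, vf = np.linalg.eigh(full)

# The nonzero eigenvalues (rank is n-1 = 4 after centring) coincide.
print(np.allclose(np.sort(w)[-4:], np.sort(wf)[-4:]))
```

The centred 5x50 matrix has rank 4, so only the top four eigenvalues are meaningful; the remaining ones are numerical zeros in both decompositions.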
This pretty much works. In the following, old_eVec is the result of cv2.PCACompute():
    In [101]: eVec
    Out[101]:
    array([[  3.69396088e-05,   1.66745325e-05,   4.97117583e-05, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [ -7.23531536e-06,  -3.07411122e-06,  -9.58259793e-06, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [  1.01496237e-05,   4.60048715e-06,   1.33919606e-05, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           ...,
           [ -1.42024751e-04,   5.21386198e-05,   3.59923394e-04, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [ -5.28685812e-05,   8.50139472e-05,  -3.13278542e-04, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [  2.96546917e-04,   1.23437674e-04,   4.98598461e-04, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00]])

    In [102]: old_eVec
    Out[102]:
    array([[  3.69395821e-05,   1.66745194e-05,   4.97117981e-05, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [ -7.23533140e-06,  -3.07411415e-06,  -9.58260534e-06, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [  1.01496662e-05,   4.60050160e-06,   1.33920075e-05, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           ...,
           [ -1.42029530e-04,   5.21366564e-05,   3.60067672e-04, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [ -5.29163444e-05,   8.50261567e-05,  -3.13150231e-04, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
           [ -7.13724992e-04,  -8.52700090e-04,   1.57953508e-03, ...,
              0.00000000e+00,   0.00000000e+00,   0.00000000e+00]], dtype=float32)
There's some kind of loss of precision, most visible toward the end of the outputs (though a quick plot of the absolute differences reveals no pattern to the imprecision). 57% of the elements have a nonzero absolute difference. Of those, 95% differ by less than 2e-16, and the mean a.d. is 5.3e-4 - however, the a.d. can be as high as 0.059, which is a lot when you consider that all the eigenvector values lie between -0.048 and 0.045.
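For reference, statistics of that sort can be computed with a few NumPy one-liners. A hedged sketch: `a` and `b` below are hypothetical stand-ins (a float32 round-trip of random data) for the real eVec and old_eVec, which aren't reproduced here.

```python
import numpy as np

# Hypothetical stand-ins for the two eigenvector matrices; in practice
# these would be eVec (float64) and old_eVec (float32) from above.
rng = np.random.RandomState(1)
a = rng.rand(6, 10)
b = a.astype(np.float32)                 # emulate a float32 round-trip

ad = np.abs(a - b.astype(np.float64))    # elementwise absolute difference
nz = ad > 0

print("%.0f%% of elements have a nonzero a.d." % (100.0 * nz.mean()))
print("mean a.d. %.2g, max a.d. %.2g" % (ad[nz].mean(), ad.max()))
```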
There is code in PCA::operator() that converts to the biggest ctype; on the other hand, old_eVec comes back as float32, whereas my own code produces float64. It's also worth mentioning that I got precision-related errors when compiling numpy.

Overall, the loss of precision seems confined to the low-eigenvalue eigenvectors, which again points toward rounding error etc. The above implementation produces results similar to PCACompute()'s, so I take it that I have duplicated its behaviour.
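One plausible reading of the numbers (an assumption on my part, not something verified against the OpenCV source): the ~2e-16 differences sit at float64 machine epsilon, while PCACompute()'s float32 output has a much coarser epsilon, so any intermediate step done in single precision can account for far larger discrepancies. A quick illustration:

```python
import numpy as np

# Machine epsilons: roughly 2.2e-16 for float64 vs 1.2e-7 for float32.
print(np.finfo(np.float64).eps)
print(np.finfo(np.float32).eps)

# Rounding cost of pushing a typical eigenvector magnitude (~0.045,
# the range quoted above) through float32 and back:
x = np.float64(0.045)
print(abs(np.float64(np.float32(x)) - x))
```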