I took a look at PCA (principal component analysis). I'm not sure whether this is implemented somewhere else, but a quick review of my college notes (reference needed) led me to the code below. The data is (reference needed):
x y
2.5 2.4
0.5 0.7
2.2 2.9
1.9 2.2
3.1 3.0
2.3 2.7
2 1.6
1 1.1
1.5 1.6
1.1 0.9
'''
*@author beck
*@date Sep 14, 2012
*PCA with Python
*bekoc.blogspot.com
'''
import numpy as np
import matplotlib.pyplot as plt

# Load the two data columns, skipping the "x y" header row.
xs = np.loadtxt("pcaData", delimiter=" ", skiprows=1, usecols=(0, 1))  # numpy array - similar to C array notation.

# get mean
meanx = np.average(xs[:, 0])
meany = np.average(xs[:, 1])
correctedX = [value - meanx for value in xs[:, 0]]  # X data with the means subtracted
correctedY = [value - meany for value in xs[:, 1]]  # Y data with the means subtracted
data = np.array([correctedX, correctedY])
print(data.shape)

covData = np.cov(data)  # calculate covariance matrix
eigenvalues, eigenvectors = np.linalg.eig(covData)
print(eigenvectors)
print(eigenvectors[0][0])  # eigenvectors are both unit eigenvectors
print(eigenvectors[1][0])

# Each column of eigenvectors is one eigenvector; draw it as a line through the origin.
x = [n for n in range(-2, 3)]
y = [eigenvectors[1][0] * i / eigenvectors[0][0] for i in x]
y1 = [eigenvectors[1][1] * i / eigenvectors[0][1] for i in x]
print(x)
print(y)

plt.plot(x, y, linestyle='--', label='eigenvector1')
plt.plot(x, y1, linestyle='--', label='eigenvector2')
plt.plot(data[0, :], data[1, :], marker='+', linestyle=' ', label="Normalized data")
#plt.plot(xs[:,0],xs[:,1],marker='+',linestyle=' ')
plt.ylim([-2, 2])
plt.xlim([-2, 2])
plt.title('PCA example')
plt.legend()
plt.show()
The code covers steps 1 to 3 of the summary below; it plots the eigenvectors but does not project the data (step 5).
PCA summary:
1- Given a dataset with n dimensions (features), calculate the normalized (mean-subtracted) data.
2- Calculate the covariance matrix of the normalized data.
3- Calculate the eigenvalues and eigenvectors of the covariance matrix.
4- The eigenvector with the largest eigenvalue is the principal component.
5- Choose the p eigenvectors with the largest eigenvalues and multiply them with your normalized data.
6- Now your data has p dimensions.
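The steps above, including the projection in steps 5 and 6, can be sketched as follows. This is only a minimal sketch under a few assumptions: the helper name `pca_project` and its parameter `p` are my own names for illustration, the data from the table is typed inline instead of being read from a file, and `np.linalg.eigh` (meant for symmetric matrices such as a covariance matrix) is swapped in for `np.linalg.eig`.

```python
import numpy as np

# The ten (x, y) samples from the table above, one row per sample.
samples = np.array([
    [2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
    [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9],
])


def pca_project(samples, p):
    """Project samples (rows) onto the p eigenvectors with the largest eigenvalues.

    Hypothetical helper written for this post; it is not a NumPy function.
    """
    # Step 1: subtract the mean of each column (feature).
    centered = samples - samples.mean(axis=0)
    # Step 2: covariance matrix; rowvar=False means each column is a variable.
    cov = np.cov(centered, rowvar=False)
    # Step 3: eigh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Step 4: reorder so the largest eigenvalue (principal component) comes first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Step 5: keep the first p eigenvectors (columns) and multiply with the data.
    projected = np.dot(centered, eigenvectors[:, :p])
    # Step 6: projected now has shape (number of samples, p).
    return projected, eigenvalues, eigenvectors


projected, eigenvalues, eigenvectors = pca_project(samples, p=1)
print(projected.shape)
print(eigenvalues)
```

With p=1 the two-dimensional data is reduced to a single column, one value per sample. The sign of each eigenvector is arbitrary, so the projected values may come out negated between runs on different platforms.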
The green dashed line in the plot is the eigenvector that shows the most significant relation between the dimensions.
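One way to put a number on "most significant" is to divide each eigenvalue by the sum of all eigenvalues, which gives the share of the total variance that lies along that eigenvector. The snippet below is a small sketch of that calculation on the same data; the variable name `variance_share` is my own, and the data is typed inline as an assumption instead of being loaded from the file.

```python
import numpy as np

# The ten (x, y) samples from the table above, one row per sample.
samples = np.array([
    [2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
    [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9],
])

# Covariance of the two columns (np.cov subtracts the means itself).
cov = np.cov(samples, rowvar=False)

# eigvalsh returns only the eigenvalues of a symmetric matrix, in ascending order.
eigenvalues = np.linalg.eigvalsh(cov)[::-1]  # reversed: largest first

# Share of the total variance along each eigenvector.
variance_share = eigenvalues / eigenvalues.sum()
print(variance_share)
```

For this data the first share is far larger than the second, which is why a single eigenvector is enough to describe most of the spread of the points.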
Please refer to the simple and concise tutorial on the georgemdallas blog.