correlation coefficient function?
Posted: Thu Nov 11, 2010 3:29 pm
Hi: I need a correlation coefficient function. After some searching I found these Spearman functions, but they don't seem to return sensible values. For two identical columns, the wtSpearman function returns 1, as it should, but for a test where one column is twice the other (which should still be perfectly correlated), it returns -522.
When comparing two columns of random numbers (rand(1000)), it returns 64; when I change the range of the random numbers (rand(1000) and rand(100)), I get a Spearman value of 2??
function spearman vectorA, vectorB -- returns Spearman correlation coefficient for 2 vectors (TODO: optimize this like wtSpearman2)
   put the number of lines in vectorA into vectorASize
   put the number of lines in vectorB into vectorBSize
   if vectorBSize <> vectorASize then
      answer "ERROR in function spearman: vectors not same size"
      exit to MetaCard
   end if
   put empty into summationVar -- initialize the list of squared differences
   repeat with z = 1 to vectorASize
      put line z of vectorA into r
      put line z of vectorB into s
      put (r - s)^2 & comma after summationVar -- squared difference for pair z
   end repeat
   put 6 * sum(summationVar) into numeratorVar
   put vectorASize * (vectorASize^2 - 1) into denominatorVar
   if denominatorVar = 0 then put 0.000001 into denominatorVar -- do I really need this to prevent divide by zero?
   return 1 - (numeratorVar/denominatorVar) -- unweighted Spearman rank correlation coefficient
end spearman
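A note on the function above: the shortcut formula 1 - 6*sum(d^2)/(n*(n^2 - 1)) is defined on ranks (1..n), not on raw values, and it is exact only when there are no ties. The following is a minimal sketch in Python (not MetaCard script; the names average_ranks, shortcut_formula and spearman here are illustrative, not from the thread) that ranks each vector first and then applies the same formula.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def shortcut_formula(r, s):
    """1 - 6*sum(d^2) / (n*(n^2 - 1)), applied to whatever it is given."""
    n = len(r)
    d2 = sum((x - y) ** 2 for x, y in zip(r, s))
    return 1 - 6 * d2 / (n * (n * n - 1))


def spearman(a, b):
    """Rank both vectors, then apply the shortcut formula to the ranks."""
    if len(a) != len(b):
        raise ValueError("vectors not same size")
    if len(a) < 2:
        raise ValueError("need at least two observations")
    return shortcut_formula(average_ranks(a), average_ranks(b))


print(spearman([3, 1, 2], [6, 2, 4]))          # ranked: 1.0
print(shortcut_formula([1, 2, 3], [2, 4, 6]))  # raw values: -2.5
```

Fed raw values, the formula gives -2.5 for a column and its double, the same kind of out-of-range result described in the post; fed ranks, it gives 1.0. The denominator n*(n^2 - 1) is zero only when n is 0 or 1, so an explicit size check like the one in this sketch may be a cleaner guard than substituting 0.000001.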
function wtSpearman vectorA, vectorB -- returns weighted Spearman correlation coefficient for 2 vectors
   put the number of lines in vectorA into vectorASize
   put the number of lines in vectorB into vectorBSize
   if vectorBSize <> vectorASize then
      answer "ERROR in function wtSpearman: vectors not same size"
      exit to MetaCard
   end if
   put 0 into lineCounter -- initialize variable
   put empty into summationVar -- initialize the list of harmonic means
   repeat for each line r in vectorA
      add 1 to lineCounter
      put line lineCounter of vectorB into s
      put 2/(r^-1 + s^-1) & comma after summationVar -- harmonic mean of r and s
   end repeat
   put (1/vectorASize)*(sum(summationVar)) into hBar -- mean of the harmonic means
   return (1/(vectorASize - 1)) * ((12*hBar) - (5*vectorASize) - 7) -- weighted Spearman rank correlation coefficient
end wtSpearman
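A note on wtSpearman: if r and s are identical ranks 1..n, each harmonic mean 2/(1/r + 1/s) equals the rank itself, so hBar = (n + 1)/2 and the returned expression reduces to (n - 1)/(n - 1) = 1. That matches the result for identical columns, and it suggests this formula also expects ranks rather than raw values. The sketch below, in Python rather than MetaCard script, applies the formula exactly as written in the post after ranking both vectors; the names ranks and wt_spearman are illustrative, and whether this formula is the intended weighted statistic is an assumption taken from the post.

```python
def ranks(values):
    """Return 1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Find the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def wt_spearman(a, b):
    """The post's weighted formula, applied to ranks instead of raw values."""
    if len(a) != len(b):
        raise ValueError("vectors not same size")
    n = len(a)
    if n < 2:
        raise ValueError("need at least two observations")
    ra, rb = ranks(a), ranks(b)
    # Mean of the pairwise harmonic means of the two ranks.
    h_bar = sum(2 / (1 / r + 1 / s) for r, s in zip(ra, rb)) / n
    return (12 * h_bar - 5 * n - 7) / (n - 1)


# A column and its double share the same ranks, so the result is 1
# (up to floating-point rounding).
print(wt_spearman([4, 9, 1, 7], [8, 18, 2, 14]))
```

Because ranks start at 1, the reciprocals 1/r and 1/s are always defined here, which is not guaranteed when the formula is applied to raw data that may contain zeros.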