Hi: I need a correlation coefficient function. I did some searching and found these Spearman functions, but they don't seem to return sensible values. For two identical columns, the weighted function (wtSpearman) returns 1 as it should, but for a test where the second column is 2 times the first (which should still be perfectly correlated), it returns -522.
When comparing two columns of random numbers (rand(1000)), it returns 64, and when I change the range of the random numbers (rand(1000) and rand(100)), I get a Spearman coefficient of 2??
function spearman vectorA, vectorB -- returns Spearman correlation coefficient for 2 vectors (YOU NEED TO OPTIMIZE THIS LIKE wtSpearman2)
   put the number of lines in vectorA into vectorASize
   put the number of lines in vectorB into vectorBSize
   if vectorBSize <> vectorASize then
      answer "ERROR in function spearman: vectors not same size"
      exit to MetaCard
   end if
   repeat with z = 1 to vectorASize
      put line z of vectorA into r
      put line z of vectorB into s
      put (r - s)^2 & comma after summationVar
   end repeat
   put 6 * sum(summationVar) into numeratorVar
   put vectorASize * (vectorASize^2 - 1) into denominatorVar
   if denominatorVar = 0 then put 0.000001 into denominatorVar --DO I REALLY NEED THIS TO PREVENT DIVIDE BY ZERO???
   return 1 - (numeratorVar/denominatorVar) -- unweighted Spearman Rank correlation coefficient
end spearman
function wtSpearman vectorA, vectorB -- returns weighted Spearman correlation coefficient for 2 vectors
   put the number of lines in vectorA into vectorASize
   put the number of lines in vectorB into vectorBSize
   if vectorBSize <> vectorASize then
      answer "ERROR in function wtSpearman: vectors not same size"
      exit to MetaCard
   end if
   put 0 into lineCounter -- initialize variable
   repeat for each line r in vectorA
      add 1 to lineCounter
      put line lineCounter of vectorB into s
      put 2/(r^-1 + s^-1) & comma after summationVar
   end repeat
   put (1/vectorASize)*(sum(summationVar)) into hBar
   return (1/(vectorASize - 1)) * ((12*hBar) - (5*vectorASize) - 7) -- Weighted Spearman Rank correlation coefficient
end wtSpearman
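A likely cause of the out-of-range results: the classic formula 1 - 6*sum(d^2) / (n*(n^2 - 1)) is defined on ranks (1..n within each column), not on the raw values, and the spearman function above subtracts the raw values directly. The sketch below is in Python rather than LiveCode so it can serve as an independent cross-check. It ranks each column first (tied values share an average rank) and then applies the same formula. The names `average_ranks` and `spearman_rho` are hypothetical, and the shortcut formula is only exact when there are no ties.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(a, b):
    """Spearman's rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)), computed on ranks."""
    if len(a) != len(b):
        raise ValueError("vectors not same size")
    n = len(a)
    if n < 2:
        # The denominator n*(n^2 - 1) is zero only for n = 0 or n = 1.
        raise ValueError("need at least two pairs")
    ranks_a, ranks_b = average_ranks(a), average_ranks(b)
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


col = [12, 7, 30, 18, 3]
print(spearman_rho(col, [2 * v for v in col]))  # 1.0: doubling keeps the rank order
print(spearman_rho(col, [-v for v in col]))     # -1.0: negating reverses it
```

On the divide-by-zero question in the script comment: the denominator can only be zero when there are fewer than two lines, so rejecting that case up front is enough, and substituting 0.000001 would only hide it. The weighted variant also appears to expect ranks, though that is an inference from its formula rather than something stated in the thread.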
correlation coefficient function?
Re: correlation coefficient function?
I found the formula for the Pearson product-moment correlation coefficient and coded it in LiveCode for anyone who needs it. The commented-out lines that write to the "consolex" field are for debugging along the way:
function pearson vectorA, vectorB
   -- comment sample is A: 1,2,3 B: 2,5,6
   --put "" into field "consolex"
   put the number of lines of vectorA into Asize
   put the number of lines of vectorB into Bsize
   if Asize <> Bsize then
      put "columns need to be the same size"
      exit to hypercard
   end if
   --doing EXY 1*2 + 2*5 + 3*6 = 30
   put 1 into x
   put "" into EXY
   repeat for Asize
      put line x of VectorA * line x of VectorB & "," after EXY
      put x+1 into x
   end repeat
   put sum(EXY) into EXY
   --put "EXY = " && EXY & return after field "consolex"
   --doing EX 1+2+3 = 6
   put 1 into x
   put "" into EX
   repeat for Asize
      put line x of VectorA & "," after EX
      put x+1 into x
   end repeat
   put sum(EX) into EX
   --put "EX = " && EX & return after field "consolex"
   --doing EX2 1^2+2^2+3^2 = 14
   put 1 into x
   put "" into EX2
   repeat for Asize
      put (line x of VectorA) * (line x of VectorA) & "," after EX2
      put x+1 into x
   end repeat
   put sum(EX2) into EX2
   --put "EX2 = " && EX2 & return after field "consolex"
   --doing EY 2+5+6 = 13
   put 1 into x
   put "" into EY
   repeat for Asize
      put line x of VectorB & "," after EY
      put x+1 into x
   end repeat
   put sum(EY) into EY
   --put "EY = " && EY & return after field "consolex"
   --doing EY2 2^2+5^2+6^2 = 65
   put 1 into x
   put "" into EY2
   repeat for Asize
      put (line x of VectorB) * (line x of VectorB) & "," after EY2
      put x+1 into x
   end repeat
   put sum(EY2) into EY2
   --put "EY2 = " && EY2 & return after field "consolex"
   --doing N = 3
   put Asize into N
   --put return & "N = " & Asize after field "consolex"
   --doing c : EXY - EX*EY / N = 30-6*13/3 = 4
   put EXY-(EY*EX)/N into c
   --put return & "c = " & C after field "consolex"
   --doing d
   put EX2 - (EX*EX)/N into d
   --put return & "d = " & d after field "consolex"
   --doing e
   put EY2 - (EY*EY)/N into e
   --put return & "e = " & e after field "consolex"
   --doing r
   put c / sqrt(d*e) into r
   return r
end pearson
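For anyone who wants to check the handler's output against an independent calculation, here is a minimal sketch of the same sums-based formula in Python (not LiveCode). The name `pearson_r` is hypothetical, and the guards for empty or constant columns are my assumptions, since the handler above does not handle those cases.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from raw sums, mirroring the c, d and e steps of the handler."""
    if len(xs) != len(ys):
        raise ValueError("columns need to be the same size")
    n = len(xs)
    if n == 0:
        raise ValueError("columns are empty")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    c = sum_xy - sum_x * sum_y / n   # sample data: 30 - 6*13/3 = 4
    d = sum_x2 - sum_x * sum_x / n   # sample data: 14 - 36/3 = 2
    e = sum_y2 - sum_y * sum_y / n   # sample data: 65 - 169/3 = 8.666...
    if d * e <= 0:
        # Assumption: a constant column makes r undefined, so report it
        # instead of dividing by zero. Float round-off can slip past this check.
        raise ValueError("correlation undefined for a constant column")
    return c / sqrt(d * e)


print(round(pearson_r([1, 2, 3], [2, 5, 6]), 4))  # 0.9608
```

With the sample from the script comment (A: 1,2,3 and B: 2,5,6) this gives c = 4, d = 2 and r of about 0.9608, which is the value the LiveCode function should return for the same input.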
Re: correlation coefficient function?
Hi adventuresofgreg,
Here is another way to compute the correlation coefficient, useful if you don't know in advance how many data values there are or what they are.
The data must be stored in a field, but the lines need not be contiguous.
(Sorry, the variable names and comments in the original script are in French.)
Regards
on mouseUp
   if fld "Data" = "" then
      exit mouseUp
   end if
   ask "Correlate which lines? (two line numbers separated by a comma):" with "1,2"
   if it is not "" then
      put item 1 of it & tab & item 2 of it into Thelines
   else
      exit mouseUp
   end if
   set itemdelimiter to tab
   put the number of items in line 1 of fld "Data" into nbitems
   put fld "Data" into Thedata
   repeat with i = 1 to the number of items in Thelines
      if the number of items in line (item i of Thelines) of fld "Data" is not nbitems then
         answer "Each line must contain the same number of items"
         exit mouseUp
      end if
   end repeat
   lock screen
   set cursor to watch
   --- now start computing the various X and Y values
   put "0" into moylignX --- average of X
   repeat with i = 1 to nbitems
      add item i of line (item 1 of Thelines) of Thedata to moylignX
   end repeat
   put moylignX/nbitems into moylignX
   put "0" into moylignY --- average of Y
   repeat with i = 1 to nbitems
      add item i of line (item 2 of Thelines) of Thedata to moylignY
   end repeat
   put moylignY/nbitems into moylignY
   put "0" into moyXauCar --- average of X squared
   repeat with i = 1 to nbitems -- sum of the squared X values
      add (item i of line (item 1 of Thelines) of Thedata)^2 to moyXauCar
   end repeat
   put moyXauCar/nbitems into moyXauCar
   put "0" into moyYauCar --- average of Y squared
   repeat with i = 1 to nbitems -- sum of the squared Y values
      add (item i of line (item 2 of Thelines) of Thedata)^2 to moyYauCar
   end repeat
   put moyYauCar/nbitems into moyYauCar
   put "0" into moyXY --- average of X*Y
   repeat with i = 1 to nbitems -- sum of the products X*Y
      add (item i of line (item 1 of Thelines) of Thedata)*(item i of line (item 2 of Thelines) of Thedata) to moyXY
   end repeat
   put moyXY/nbitems into moyXY
   put moyXauCar-moylignX^2 into ecartypeX -- variance of X (the square root is taken below)
   put moyYauCar-moylignY^2 into ecartypeY -- variance of Y
   put moyXY-(moylignX*moylignY) into coVarXY -- covariance of X and Y
   put coVarXY/(sqrt(ecartypeX*ecartypeY)) into lacor
   put "correlation coefficient = " & lacor into fld "CoefCor"
end mouseUp
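The handler above works from averages rather than raw sums: r = (mean(xy) - mean(x)*mean(y)) / sqrt(var(x)*var(y)), where each variance is mean(x^2) - mean(x)^2. The following Python sketch (not LiveCode) shows the same steps on tab-delimited text so the result can be cross-checked. The names `row_values` and `correlation_from_means` and the sample `data` string are hypothetical, and the guard for a constant line is an assumption that the handler itself does not make.

```python
from math import sqrt


def row_values(data, line_number):
    """Return the tab-separated numbers on a 1-based line of `data`."""
    return [float(item) for item in data.split("\n")[line_number - 1].split("\t")]


def correlation_from_means(xs, ys):
    """r = (E[xy] - E[x]E[y]) / sqrt((E[x^2] - E[x]^2) * (E[y^2] - E[y]^2))."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("both lines must hold the same, non-zero number of items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum(x * x for x in xs) / n - mean_x ** 2   # population variance of X
    var_y = sum(y * y for y in ys) / n - mean_y ** 2   # population variance of Y
    cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    if var_x * var_y <= 0:
        # Assumption: report a constant line instead of dividing by zero.
        raise ValueError("correlation undefined for a constant line")
    return cov_xy / sqrt(var_x * var_y)


# Lines 1 and 3 hold the two series; line 2 is skipped, so the lines
# need not be contiguous.
data = "1\t2\t3\nnot used\n2\t5\t6"
r = correlation_from_means(row_values(data, 1), row_values(data, 3))
print(round(r, 4))  # 0.9608
```

Dividing every sum by n does not change the ratio, so this form gives the same coefficient as the sums-based Pearson formula posted earlier in the thread.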