correlation coefficient function?


Post by adventuresofgreg » Thu Nov 11, 2010 3:29 pm

Hi: I need a correlation coefficient function. I did some searching and found these Spearman functions, but they don't seem to return sensible values. For two identical columns, the wtSpearman function returns 1 as it should, but for a test where one column is 2 times the other (which should still be perfectly correlated), it returns -522.

When comparing two columns of random numbers (rand(1000)), it returns 64; when I change the range of the random numbers (rand(1000) and rand(100)), I get a Spearman of 2??


function spearman vectorA, vectorB -- returns Spearman correlation coefficient for 2 vectors (YOU NEED TO OPTIMIZE THIS LIKE wtSpearman2)
  put the number of lines in vectorA into vectorASize
  put the number of lines in vectorB into vectorBSize
  if vectorBSize <> vectorASize then
    answer "ERROR in function spearman: vectors not same size"
    exit to top
  end if
  repeat with z = 1 to vectorASize
    put line z of vectorA into r
    put line z of vectorB into s
    put (r - s)^2 & comma after summationVar
  end repeat
  put 6 * sum(summationVar) into numeratorVar
  put vectorASize * (vectorASize^2 - 1) into denominatorVar
  if denominatorVar = 0 then put 0.000001 into denominatorVar --DO I REALLY NEED THIS TO PREVENT DIVIDE BY ZERO???
  return 1 - (numeratorVar/denominatorVar) -- unweighted Spearman Rank correlation coefficient
end spearman
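
For reference, the unweighted formula that `spearman` computes can be written as below, where d_i is the difference between the two entries on line i (`r - s` in the code) and n is `vectorASize`. The formula is defined for ranks (the integers 1..n, without ties), so passing raw values can give results outside the range -1 to 1. The second line works this through for the raw columns 1,2,3 and 2,4,6, which are perfectly correlated but are not ranks:

```latex
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^{2}}{n\,(n^{2} - 1)}

\text{raw values } (1,2,3),\ (2,4,6):\quad
1 - \frac{6\,(1 + 4 + 9)}{3\,(3^{2} - 1)} = 1 - \frac{84}{24} = -2.5
```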

function wtSpearman vectorA, vectorB -- returns weighted Spearman correlation coefficient for 2 vectors
  put the number of lines in vectorA into vectorASize
  put the number of lines in vectorB into vectorBSize
  if vectorBSize <> vectorASize then
    answer "ERROR in function wtSpearman: vectors not same size"
    exit to top
  end if
  put 0 into lineCounter -- initialize variable
  repeat for each line r in vectorA
    add 1 to lineCounter
    put line lineCounter of vectorB into s
    put 2/(r^-1 + s^-1) & comma after summationVar
  end repeat
  put (1/vectorASize)*(sum(summationVar)) into hBar
  return (1/(vectorASize - 1)) * ((12*hBar) - (5*vectorASize) - 7) -- Weighted Spearman Rank correlation coefficient
end wtSpearman
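
Both functions above seem to assume that each line already holds a rank (1 for the smallest value, 2 for the next, and so on) rather than a raw value, which may explain the out-of-range results. A minimal sketch of a ranking step follows; `rankColumn` and its variable names are hypothetical and not part of the original scripts, and tied values are not averaged, so treat it as a starting point only:

```livecode
function rankColumn pColumn
  local tPairs, tRanks, tResult, tIndex, tRank
  -- Pair each value with its original line number: "value,lineNumber"
  put 0 into tIndex
  repeat for each line tValue in pColumn
    add 1 to tIndex
    put tValue & comma & tIndex & return after tPairs
  end repeat
  delete char -1 of tPairs -- drop the trailing return
  -- Sort by value; the position in the sorted list is the rank
  sort lines of tPairs ascending numeric by item 1 of each
  put 0 into tRank
  repeat for each line tPair in tPairs
    add 1 to tRank
    put tRank into tRanks[item 2 of tPair]
  end repeat
  -- Write the ranks back out in the original line order
  repeat with i = 1 to tIndex
    put tRanks[i] & return after tResult
  end repeat
  delete char -1 of tResult
  return tResult
end rankColumn
```

With this helper, a call such as `spearman(rankColumn(colA), rankColumn(colB))` compares ranks instead of raw values. For the columns 1,2,3 and 2,4,6 both rank lists are 1,2,3, so every difference is 0 and the unweighted formula gives 1.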

Re: correlation coefficient function?

Post by adventuresofgreg » Thu Nov 11, 2010 4:29 pm

I found the formula for the Pearson product-moment correlation coefficient and LiveCoded it for anyone who needs it. The commented-out lines that write to the "consolex" field are for debugging along the way:

function pearson vectorA, vectorB
  -- sample used in the comments: A = 1,2,3 and B = 2,5,6
  --put "" into field "consolex"

  put the number of lines of vectorA into Asize
  put the number of lines of vectorB into Bsize
  if Asize <> Bsize then
    put "columns need to be the same size"
    exit to top
  end if

  --doing EXY 1*2 + 2*5 + 3*6 = 30
  put 1 into x
  put "" into EXY
  repeat for Asize
    put line x of VectorA * line x of VectorB & "," after EXY
    put x+1 into x
  end repeat
  put sum(EXY) into EXY
  --put "EXY = " && EXY & return after field "consolex"

  --doing EX 1+2+3 = 6
  put 1 into x
  put "" into EX
  repeat for Asize
    put line x of VectorA & "," after EX
    put x+1 into x
  end repeat
  put sum(EX) into EX
  --put "EX = " && EX & return after field "consolex"

  --doing EX2 1^2+2^2+3^2 = 14
  put 1 into x
  put "" into EX2
  repeat for Asize
    put (line x of VectorA) * (line x of VectorA) & "," after EX2
    put x+1 into x
  end repeat
  put sum(EX2) into EX2
  --put "EX2 = " && EX2 & return after field "consolex"

  --doing EY 2+5+6 = 13
  put 1 into x
  put "" into EY
  repeat for Asize
    put line x of VectorB & "," after EY
    put x+1 into x
  end repeat
  put sum(EY) into EY
  --put "EY = " && EY & return after field "consolex"

  --doing EY2 2^2+5^2+6^2 = 65
  put 1 into x
  put "" into EY2
  repeat for Asize
    put (line x of VectorB) * (line x of VectorB) & "," after EY2
    put x+1 into x
  end repeat
  put sum(EY2) into EY2
  --put "EY2 = " && EY2 & return after field "consolex"

  --doing N = 3
  put Asize into N
  --put return & "N = " & Asize after field "consolex"

  --doing c : EXY - EX*EY / N = 30-6*13/3 = 4
  put EXY-(EY*EX)/N into c
  --put return & "c = " & C after field "consolex"

  --doing d : EX2 - EX*EX / N
  put EX2 - (EX*EX)/N into d
  --put return & "d = " & d after field "consolex"

  --doing e : EY2 - EY*EY / N
  put EY2 - (EY*EY)/N into e
  --put return & "e = " & e after field "consolex"

  --doing r
  put c / sqrt(d*e) into r
  return r

end pearson
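
For reference, the quantities `c`, `d` and `e` in the function above correspond to the computational form of the Pearson formula shown below, followed by the sample from the comments (A = 1,2,3 and B = 2,5,6) worked through to the end. Note that if either column is constant, `d` or `e` is 0 and the final division fails, so a caller may want to check for that case first:

```latex
r = \frac{\sum XY - \dfrac{\sum X \sum Y}{N}}
         {\sqrt{\left(\sum X^{2} - \dfrac{(\sum X)^{2}}{N}\right)
                \left(\sum Y^{2} - \dfrac{(\sum Y)^{2}}{N}\right)}}

c = 30 - \frac{6 \cdot 13}{3} = 4, \qquad
d = 14 - \frac{6^{2}}{3} = 2, \qquad
e = 65 - \frac{13^{2}}{3} = \frac{26}{3}

r = \frac{4}{\sqrt{2 \cdot \frac{26}{3}}} = \frac{4}{\sqrt{52/3}} \approx 0.9608
```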

Re: correlation coefficient function?

Post by NoN' » Mon Nov 22, 2010 1:49 am

Hi adventuresofgreg,

Here is another way to compute the correlation coefficient, useful if you don't know in advance how many data values there are or what they are.
The data must be stored in a field, one tab-delimited series per line, but the two lines need not be adjacent.
(Sorry, the variable names and comments in the original script are in French.)

on mouseUp
  if fld "Data" = "" then
     exit mouseUp
  end if
  
  ask "Correlation on lines no. (two line numbers separated by a comma):" with "1,2"
  if it is not "" then 
    put item 1 of it & tab & item 2 of it into Thelines
  else
    exit mouseUp
  end if
  
  set itemdelimiter to tab
  
  put the number of items in line 1 of fld "Data" into nbitems
  put fld "Data" into Thedata
  
  repeat with i = 1 to the number of items in Thelines
    if the number of items in line (item i of Thelines) of fld "Data" is not nbitems then
      answer "The lines must all have the same number of items"
      exit mouseUp
    end if
  end repeat
  
  lock screen
  set cursor to watch
   
  --- now compute the various values for X and Y
   
  put "0" into moylignX  --- average of X
  repeat with i = 1 to nbitems
    add item i of line (item 1 of Thelines) of Thedata to moylignX
  end repeat
  put moylignX/nbitems into moylignX
   
  put "0" into moylignY  --- average of Y
  repeat with i = 1 to nbitems
    add item i of line (item 2 of Thelines) of Thedata to moylignY
  end repeat
  put moylignY/nbitems into moylignY
   
  put "0" into moyXauCar
  repeat with i = 1 to nbitems -- compute the squares of the X values
    add (item i of line (item 1 of Thelines) of Thedata)^2 to moyXauCar
  end repeat
  put moyXauCar/nbitems into moyXauCar
   
   
  put "0" into moyYauCar
  repeat with i = 1 to nbitems -- compute the squares of the Y values
    add (item i of line (item 2 of Thelines) of Thedata)^2 to moyYauCar
  end repeat
  put moyYauCar/nbitems into moyYauCar
   
   
  put "0" into moyXY
  repeat with i = 1 to nbitems -- compute the products X*Y
    add (item i of line (item 1 of Thelines) of Thedata)*(item i of line (item 2 of Thelines) of Thedata) to moyXY
  end repeat
  put moyXY/nbitems into moyXY
   
  put moyXauCar-moylignX^2 into ecartypeX
  put moyYauCar-moylignY^2 into ecartypeY
   
  put moyXY-(moylignX*moylignY) into coVarXY
  put coVarXY/(sqrt(ecartypeX*ecartypeY)) into lacor
   
  put "correlation coefficient = " & lacor into fld "CoefCor"
   
end mouseUp
Regards
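
For reference, the script above works with means instead of raw sums. Written out, with a bar denoting the mean over the `nbitems` values, it computes the expression below. This is the same Pearson coefficient as the sum-based form, with numerator and denominator each divided by N. Despite their names, `ecartypeX` and `ecartypeY` (écart-type means standard deviation) hold the two variances, and the square root of their product is the product of the standard deviations. As with any Pearson calculation, a constant line makes one variance 0 and the division fails:

```latex
r = \frac{\overline{XY} - \bar{X}\,\bar{Y}}
         {\sqrt{\left(\overline{X^{2}} - \bar{X}^{2}\right)
                \left(\overline{Y^{2}} - \bar{Y}^{2}\right)}}
```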
