fastest way to do a count

adventuresofgreg
Posts: 349
Joined: Tue Oct 28, 2008 1:23 am

fastest way to do a count

Post by adventuresofgreg » Fri Dec 09, 2011 8:37 pm

Hi. I have a very large list of numbers, and for each number I would like to count how many times it occurs in the list.

I am currently using lineOffset(theNumber, theList), then deleting the matching line from the list and repeating until lineOffset returns 0.

But this can be quite slow over tens of thousands of lines.
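
For reference, the approach described above might look roughly like the sketch below. The function name countByLineOffset and its parameters are made up for illustration, and this is only a guess at the original script. Every call to lineOffset scans the list again from the top, which is one likely reason the method slows down on long lists.

```livecode
-- Hypothetical helper: counts the lines of pList that equal pNumber
-- by finding a match, deleting it, and searching again.
function countByLineOffset pNumber, pList
   local tCount, tLineNo
   -- match whole lines only, so that 1 does not match 10 or 21
   set the wholeMatches to true
   put 0 into tCount
   repeat
      put lineOffset(pNumber, pList) into tLineNo
      if tLineNo is 0 then exit repeat
      -- pList is a local copy, so the caller's list is not changed
      delete line tLineNo of pList
      add 1 to tCount
   end repeat
   return tCount
end countByLineOffset
```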

bn
VIP Livecode Opensource Backer
Posts: 4003
Joined: Sun Jan 07, 2007 9:12 pm
Location: Bochum, Germany

Re: fastest way to do a count

Post by bn » Fri Dec 09, 2011 9:36 pm

Hi Greg,

Try this on a field that contains your numbers, assuming each line holds one number.
Make a second field for the result. It will display each number, a tab, and the number of occurrences of that number.

Code:

on mouseUp
   local tData, tArray, aLine
   put field 1 into tData
   if tData is empty then exit mouseUp
   -- tally each line; the line's text is the array key
   repeat for each line aLine in tData
      add 1 to tArray[aLine]
   end repeat
   -- flatten the array to "number <tab> count" lines
   combine tArray by return and tab
   set the itemDelimiter to tab
   -- sort by the count column, numerically
   sort lines of tArray numeric by item 2 of each
   put tArray into field 2
end mouseUp

This only works if each line consists of a single number. It could be changed for a different data format, but you would have to say so.

I attach a tiny stack that creates 20000 lines of random numbers in the range 1 to 30 and puts them into field 1. You can then count the number of occurrences of each number in field 1.

countOccurencesOfNumbers.livecode.zip
(1.44 KiB) Downloaded 249 times
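
The attached stack is not reproduced in this thread. As a rough idea of how such test data could be generated, here is a minimal sketch; the function name makeRandomNumberList and its parameters are assumptions, not taken from the stack.

```livecode
-- Hypothetical helper: builds pCount lines, each holding
-- a random integer in the range 1 to pMax.
function makeRandomNumberList pCount, pMax
   local tList
   repeat pCount times
      put random(pMax) & return after tList
   end repeat
   -- drop the trailing return
   delete last char of tList
   return tList
end makeRandomNumberList
```

It could be called as: put makeRandomNumberList(20000, 30) into field 1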
Kind regards

Bernd

adventuresofgreg
Posts: 349
Joined: Tue Oct 28, 2008 1:23 am

Re: fastest way to do a count

Post by adventuresofgreg » Fri Dec 09, 2011 9:49 pm

Thanks Bernd: That looks like it would work, but it is a bit more complicated than I specified:

The number is actually a list of comma-delimited numbers, and each one is followed by a second word that needs to be averaged.

I.e., the list would look like:

4,3,5,6,8,1,2,9 .085
3,1,4,6,1,1,8,9 .07
5,7,3,6,4,5,2,9 -.623
2,9,5,6,5,6,7,4 .543
3,3,9,4,8,1,2,1 -.023

For each line, I need to count the occurrences of the first word and then, for all matching first words, calculate an average of the second words.

bn
VIP Livecode Opensource Backer
Posts: 4003
Joined: Sun Jan 07, 2007 9:12 pm
Location: Bochum, Germany

Re: fastest way to do a count

Post by bn » Fri Dec 09, 2011 9:55 pm

Hi Greg,

I am not sure I get what you mean.

Could you give an example not only of the data structure but also of the averaging bit? What do you mean by first word: the first item = the first number?

Kind regards

bernd

adventuresofgreg
Posts: 349
Joined: Tue Oct 28, 2008 1:23 am

Re: fastest way to do a count

Post by adventuresofgreg » Fri Dec 09, 2011 10:17 pm

Hi Bernd:

Here is a sample list. Each line consists of 2 words. Word 1 is a comma-delimited group of numbers, and word 2 is a number.

word 1 word 2
4,3,5,6,8,1,2,9 .085
3,1,4,6,1,1,8,9 .07
4,3,5,6,8,1,2,9 -.623
4,3,5,6,8,1,2,9 .543
3,3,9,4,8,1,2,1 -.023

So, for the first line, I want to count the number of times word 1 appears in the entire list. The answer = 3 in this case. And, I want to calculate an average for the second words for all matching 1st words - like for this example: average(.085,-.623,.543)

bn
VIP Livecode Opensource Backer
Posts: 4003
Joined: Sun Jan 07, 2007 9:12 pm
Location: Bochum, Germany

Re: fastest way to do a count

Post by bn » Sat Dec 10, 2011 12:04 am

Hi Greg,

Try the stack I attach.

It still uses arrays and computes an arithmetic mean (the sum of word 2 divided by the number of occurrences). In my testing it worked. It does the calculation for every word 1, even one that occurs only once. You could exclude those in the code.
countOccOfNumbersAndAverages.livecode.zip
(1.94 KiB) Downloaded 262 times
Tell me how it goes and how fast it is.
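
The stack's script is not shown in the thread. A minimal sketch of the array technique described above, assuming each line is a comma-delimited word 1 followed by a numeric word 2, could look like this; the function name countAndAverage and the output layout are assumptions.

```livecode
-- Hypothetical function: for each distinct word 1, returns a line of
-- word 1 <tab> count <tab> arithmetic mean of the matching word 2 values.
function countAndAverage pData
   local tLine, tKey, tCount, tSum, tResult
   repeat for each line tLine in pData
      if tLine is empty then next repeat
      put word 1 of tLine into tKey
      add 1 to tCount[tKey]
      add (word 2 of tLine) to tSum[tKey]
   end repeat
   repeat for each key tKey in tCount
      put tKey & tab & tCount[tKey] & tab & (tSum[tKey] / tCount[tKey]) & return after tResult
   end repeat
   -- drop the trailing return
   delete last char of tResult
   return tResult
end countAndAverage
```

With the five sample lines from the earlier post, 4,3,5,6,8,1,2,9 would get a count of 3 and a sum of .005, so an average of roughly .0017. The list is read only once, and the order of the output lines is not guaranteed.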

Kind regards

Bernd

adventuresofgreg
Posts: 349
Joined: Tue Oct 28, 2008 1:23 am

Re: fastest way to do a count

Post by adventuresofgreg » Sat Dec 10, 2011 1:00 am

Hi Bernd: Yes - that looks really good. Thanks a ton! I'll run a test on my 100,000 line file and time it. It should be much faster than my script. One problem: before counting the occurrences of word 1, we need to delete the current line from the list so that it doesn't count itself. I could just subtract 1 from the count, but this line's word 2 must also be left out of the average. I'm not sure how to do that aside from deleting the line from the list before running the count script.

bn
VIP Livecode Opensource Backer
Posts: 4003
Joined: Sun Jan 07, 2007 9:12 pm
Location: Bochum, Germany

Re: fastest way to do a count

Post by bn » Sat Dec 10, 2011 1:15 am

Hi Greg,

In the example you gave, you do count the first occurrence:
word 1 word 2
4,3,5,6,8,1,2,9 .085
3,1,4,6,1,1,8,9 .07
4,3,5,6,8,1,2,9 -.623
4,3,5,6,8,1,2,9 .543
3,3,9,4,8,1,2,1 -.023

So, for the first line, I want to count the number of times word 1 appears in the entire list. The answer = 3 in this case. And, I want to calculate an average for the second words for all matching 1st words - like for this example: average(.085,-.623,.543)
I am a little confused. Do you want to exclude word 2 of every first occurrence of word 1? In your example you did use all 3 word 2 values for the average:

Code:

average(.085,-.623,.543)
So you actually took into account the first occurrence of 4,3,5,6,8,1,2,9.

Kind regards

Bernd

adventuresofgreg
Posts: 349
Joined: Tue Oct 28, 2008 1:23 am

Re: fastest way to do a count

Post by adventuresofgreg » Sat Dec 10, 2011 1:16 am

Correct. Sorry - In my example, I forgot to delete it from the list before counting and averaging.

bn
VIP Livecode Opensource Backer
Posts: 4003
Joined: Sun Jan 07, 2007 9:12 pm
Location: Bochum, Germany

Re: fastest way to do a count

Post by bn » Sat Dec 10, 2011 1:52 am

Hi Greg,

I gave it a try.

Now the count will be 0 if a word 1 occurred only once, 1 if it occurred twice, etc.
The averages are based on word 2 of the second through nth occurrence, divided by (occurrences - 1).
If a word 1 shows up only once, the average will be its word 2 (you could change that).
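
The three rules above could be sketched as follows. This is an illustration under the same data assumptions as before, not the script from the attached stack; the function name countAndAverageSkippingFirst is made up.

```livecode
-- Hypothetical function: like a plain count and average, but the first
-- occurrence of each word 1 is left out of both the count and the mean.
function countAndAverageSkippingFirst pData
   local tLine, tKey, tFirst, tCount, tSum, tResult
   repeat for each line tLine in pData
      if tLine is empty then next repeat
      put word 1 of tLine into tKey
      if tCount[tKey] is empty then
         -- first occurrence: remember its word 2, start the totals at 0
         put word 2 of tLine into tFirst[tKey]
         put 0 into tCount[tKey]
         put 0 into tSum[tKey]
      else
         add 1 to tCount[tKey]
         add (word 2 of tLine) to tSum[tKey]
      end if
   end repeat
   repeat for each key tKey in tCount
      if tCount[tKey] is 0 then
         -- word 1 occurred only once: report its own word 2 as the average
         put tKey & tab & 0 & tab & tFirst[tKey] & return after tResult
      else
         put tKey & tab & tCount[tKey] & tab & (tSum[tKey] / tCount[tKey]) & return after tResult
      end if
   end repeat
   delete last char of tResult
   return tResult
end countAndAverageSkippingFirst
```

With the sample list from the earlier posts, 4,3,5,6,8,1,2,9 would get a count of 2 and an average of -.04, the mean of -.623 and .543.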

Please test extensively before using in "production". It has gotten a bit more complicated.

Edit: I cleaned up the attachment and tested it, and it seems to work all right.
countOccOfNumbersAndAveragesIIII.livecode.zip
(3.5 KiB) Downloaded 253 times
Kind regards

Bernd
Last edited by bn on Sat Dec 10, 2011 5:03 pm, edited 1 time in total.

adventuresofgreg
Posts: 349
Joined: Tue Oct 28, 2008 1:23 am

Re: fastest way to do a count

Post by adventuresofgreg » Sat Dec 10, 2011 5:02 pm

Thanks Bernd. I'll take a look. I think a slightly less complicated way would be to just include the word 1 and its average, then subtract it out from the final sum before calculating the average. I'll play around with it. Thanks!
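
One possible reading of this idea is a two-pass approach: total everything first, then, for each line, subtract that line's own word 2 and 1 from the totals before dividing. The sketch below assumes that interpretation; the function name leaveOneOutAverages and the output layout are hypothetical.

```livecode
-- Hypothetical function: for each line, reports how many OTHER lines share
-- its word 1 and the mean of their word 2 values.
function leaveOneOutAverages pData
   local tLine, tKey, tCount, tSum, tOthers, tResult
   -- pass 1: totals per word 1, including every line
   repeat for each line tLine in pData
      if tLine is empty then next repeat
      put word 1 of tLine into tKey
      add 1 to tCount[tKey]
      add (word 2 of tLine) to tSum[tKey]
   end repeat
   -- pass 2: remove each line's own contribution before averaging
   repeat for each line tLine in pData
      if tLine is empty then next repeat
      put word 1 of tLine into tKey
      put tCount[tKey] - 1 into tOthers
      if tOthers > 0 then
         put tLine & tab & tOthers & tab & ((tSum[tKey] - (word 2 of tLine)) / tOthers) & return after tResult
      else
         -- no other line matches: count 0, average left empty
         put tLine & tab & 0 & tab & return after tResult
      end if
   end repeat
   delete last char of tResult
   return tResult
end leaveOneOutAverages
```

This keeps the work at two passes over the list, however many lines share the same word 1.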

bn
VIP Livecode Opensource Backer
Posts: 4003
Joined: Sun Jan 07, 2007 9:12 pm
Location: Bochum, Germany

Re: fastest way to do a count

Post by bn » Sat Dec 10, 2011 5:04 pm

Hi Greg,

I just edited my post and uploaded a cleaned-up version of the stack.
You may want to have a look.

Kind regards

Bernd

bn
VIP Livecode Opensource Backer
Posts: 4003
Joined: Sun Jan 07, 2007 9:12 pm
Location: Bochum, Germany

Re: fastest way to do a count

Post by bn » Sat Dec 10, 2011 5:13 pm

Hi Greg,

Apparently we were online at the same time. I just wanted to point you to the cleaned-up version, which I recommend (countOccOfNumbersAndAveragesIIII.livecode.zip).

Since you did not really describe your use case, I had to guess at what you wanted. I think you can easily change the code to suit your needs. If not, just describe exactly what you want to achieve and what you want changed, and I will see what I can do.

Kind regards

Bernd

adventuresofgreg
Posts: 349
Joined: Tue Oct 28, 2008 1:23 am

Re: fastest way to do a count

Post by adventuresofgreg » Sat Dec 10, 2011 5:25 pm

Hi Bernd: I incorporated your new version and ran it - BLINDINGLY fast! I compared the results to my script and they match. Nice work. Thanks again.

bn
VIP Livecode Opensource Backer
Posts: 4003
Joined: Sun Jan 07, 2007 9:12 pm
Location: Bochum, Germany

Re: fastest way to do a count

Post by bn » Sat Dec 10, 2011 5:39 pm

Hi Greg,

Glad the results are the same ;)

Would you care to estimate or measure how long your version of the script takes and how long the new version takes on your data?

I know that my version takes around a second for 100,000 lines.
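
One simple way to take such a measurement is to read the milliseconds before and after the loop. The handler below is a sketch that assumes the data sits in field 1 and times only a plain per-key count; it reports the result in the message box.

```livecode
on mouseUp
   local tData, tLine, tCount, tStart, tElapsed
   put field 1 into tData
   put the milliseconds into tStart
   -- the work being timed: one pass that tallies word 1 of every line
   repeat for each line tLine in tData
      if tLine is empty then next repeat
      add 1 to tCount[word 1 of tLine]
   end repeat
   put the milliseconds - tStart into tElapsed
   put tElapsed && "ms for" && (the number of lines of tData) && "lines"
end mouseUp
```

Reading the field before starting the clock keeps field access out of the measurement, so the figure reflects the counting loop alone.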

Kind regards

Bernd
