array calculations slower after combine/split

Wil
Posts: 3
Joined: Mon Apr 06, 2020 12:42 pm

Post by Wil » Mon Apr 06, 2020 4:24 pm

Take a look at the script below:

Code:

on mouseUp
   -- Fill the array with 500,000 random numbers
   repeat with i = 1 to 500000
      put random (32000) into myArray [i]
   end repeat
   -- Time average() on the original array
   put the milliSeconds into ttt
   put average (myArray) into mean1
   put the milliSeconds - ttt into time1
   
   -- Round-trip the array through text
   combine myArray by return
   split myArray by return
   -- Time average() again on the round-tripped array
   put the milliSeconds into ttt
   put average (myArray) into mean2
   put the milliSeconds - ttt into time2
   
   put time1 && time2 & cr & mean1 && mean2
end mouseUp

On my system time1 is about 100 milliseconds, whereas time2 is about 150 milliseconds. The content of the array is exactly the same before and after the combine/split, as confirmed by mean1 = mean2. On LC version 6 the difference is even more pronounced: 8 versus 32 milliseconds. I haven't the faintest idea what causes this difference. Am I overlooking something, perhaps something trivial?
Wil Dijkstra

LCMark
Livecode Staff Member
Posts: 1233
Joined: Thu Apr 11, 2013 11:27 am

Re: array calculations slower after combine/split

Post by LCMark » Mon Apr 06, 2020 5:06 pm

@Wil: It isn't *quite* the same...

In the first case, the values in the array are numbers already (random returns a number).

In the second case, the values in the array are strings (split doesn't know what is on each line, after all) so average has to first convert the string to a number in order to do its job.

So the difference is that the second 'average' is also doing 500000 string to number conversions, as well as computing the average.
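
A minimal sketch of how one might check this from script, assuming LiveCode 8 or later, where the `is strictly` operator is available. The variable names (tArray, tBefore, tAfter) are illustrative and not from the original post. If the explanation above holds, the first test should report that the value is not a string and the second that it is.

```livecode
on mouseUp
   -- A value straight from random(): stored as a number
   put random(32000) into tArray[1]
   put tArray[1] is strictly a string into tBefore
   
   -- Round-trip the array through text, as in the original script
   combine tArray by return
   split tArray by return
   put tArray[1] is strictly a string into tAfter
   
   -- Show both results side by side
   put tBefore && tAfter
end mouseUp
```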

Wil
Posts: 3
Joined: Mon Apr 06, 2020 12:42 pm

Re: array calculations slower after combine/split

Post by Wil » Mon Apr 06, 2020 5:20 pm

Ah, of course. I already thought there would be an easy explanation.
Thanks, Wil

Wil
Posts: 3
Joined: Mon Apr 06, 2020 12:42 pm

Re: array calculations slower after combine/split

Post by Wil » Mon Apr 06, 2020 5:48 pm

Nevertheless, something mysterious remains. I added "put myArray + 0 into myArray" after the split:

Code:

on mouseUp
   repeat with i = 1 to 500000
      put random (32000) into myArray [i]
   end repeat
   put the milliSeconds into ttt
   put average (myArray) into mean1
   put the milliSeconds - ttt into time1
   
   combine myArray by return
   split myArray by return
   -- Add 0 to every element of the array
   put myArray + 0 into myArray
   put the milliSeconds into ttt
   put average (myArray) into mean2
   put the milliSeconds - ttt into time2
   
   put time1 && time2 & cr & mean1 && mean2
end mouseUp

Apparently I convinced the array even more that it consists of numbers. Now time2 is less than time1: about 60 milliseconds (versus 100 for time1)! So what is now the difference between the array before the combine/split and the array after the combine/split and adding zero?
Wil Dijkstra

LCMark
Livecode Staff Member
Posts: 1233
Joined: Thu Apr 11, 2013 11:27 am

Re: array calculations slower after combine/split

Post by LCMark » Mon Apr 06, 2020 6:11 pm

Cache locality, I think...

Average iterates over the elements of an array in hash-order (as this is the fastest way to iterate through an array and order of processing doesn't matter in this case).

The loop which constructs the values results in the values in memory being in numeric order (of i).

When you apply `+ 0` to an array, the engine iterates through the array in hash-order (again because the order of processing doesn't matter), which means that the values in the newly formed array will be in hash-order in memory.

This means that when average processes the new array, the order of values of the elements in memory is the same as the order of processing of those values.

Put another way, the average(myArray) gives rise to lots of cache misses, average(myArray + 0) gives rise to very few cache misses.

So, it isn't the combine/split which is having this effect - it's the + 0, which is (helpfully?) essentially reordering the values in memory so they match the order average wants them in.
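
One way to test this explanation is to drop the combine/split entirely and keep only the `+ 0` step. The sketch below is a variation on the original script with illustrative variable names (tArray, tStart and so on); it is an experiment to run, not a measured result. If the memory-ordering explanation is right, one would expect the second timing to come out lower than the first even though the values were numbers all along.

```livecode
on mouseUp
   -- Build the array in numeric key order, as in the original script
   repeat with i = 1 to 500000
      put random(32000) into tArray[i]
   end repeat
   
   -- Time average() on the array as constructed
   put the milliseconds into tStart
   put average(tArray) into tMean1
   put the milliseconds - tStart into tTime1
   
   -- No combine/split here: only rebuild the array via + 0
   put tArray + 0 into tArray
   
   -- Time average() on the rebuilt array
   put the milliseconds into tStart
   put average(tArray) into tMean2
   put the milliseconds - tStart into tTime2
   
   put tTime1 && tTime2 & cr & tMean1 && tMean2
end mouseUp
```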

FourthWorld
VIP Livecode Opensource Backer
Posts: 10057
Joined: Sat Apr 08, 2006 7:05 am

Re: array calculations slower after combine/split

Post by FourthWorld » Mon Apr 06, 2020 6:45 pm

LCMark wrote (Mon Apr 06, 2020 6:11 pm):
Put another way, the average(myArray) gives rise to lots of cache misses, average(myArray + 0) gives rise to very few cache misses.

I'm pre-coffee. How do we add 0 to an array?
Richard Gaskin

LCMark
Livecode Staff Member
Posts: 1233
Joined: Thu Apr 11, 2013 11:27 am

Re: array calculations slower after combine/split

Post by LCMark » Mon Apr 06, 2020 7:03 pm

@FourthWorld: The arithmetic operators have always been overloaded - you can do Number op Number, Array op Number or Array op Array.
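
A short sketch of the Array op Number and Array op Array forms, using small hypothetical arrays (tA, tB) that are not from the thread. It assumes both arrays share the same keys, so that the array-with-array operation can pair up elements key by key.

```livecode
on mouseUp
   -- Two small arrays with matching keys
   put 1 into tA[1]
   put 2 into tA[2]
   put 10 into tB[1]
   put 20 into tB[2]
   
   -- Array op Number: the number is applied to every element
   put tA + 5 into tShifted
   
   -- Array op Array: elements are combined key by key
   put tA + tB into tSummed
   
   -- Show the elements of each result
   put tShifted[1] && tShifted[2] & cr & tSummed[1] && tSummed[2]
end mouseUp
```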
