Why Array over list

Post by monki » Sun Oct 23, 2016 6:08 pm

OK, I'm trying to understand when it's better to use an array over an itemized list. The list seems easier to deal with, not needing to find keys and so on. What do arrays have going for them? Is there already a good discussion about this that someone could point me towards?

thanks.
Last edited by monki on Sun Oct 23, 2016 7:10 pm, edited 1 time in total.

Post by jmburnod » Sun Oct 23, 2016 7:02 pm

Hi,
I always use itemized lists because they are more readable (for me).
I guess arrays are faster and useful for very large data sets.
Best regards
Jean-Marc
https://alternatic.ch

Post by dunbarx » Mon Oct 24, 2016 3:43 am

Arrays contain additional degrees of freedom, one might say, additional dimensions, derived from associations between data. Consider this classic example of array use:

You want to count the frequency of words in a body of text. The array way, with a field "display" so you can see the text, and most of the handler simply to set the thing up:

Code: Select all

on mouseUp
   put "" into fld "display"
   repeat 100
      put any word of "cat dog snail fly hippo" & space after temp
   end repeat
   put temp into fld "display"
   repeat for each word tWord in temp
      add 1 to tWordCount[tWord] -- add 1 to the element whose key is this word
   end repeat
   answer tWordCount["fly"]
end mouseup
Now do this with any handler of your choice, without using an array. No problem, but not in a single "working" line either.

The structure of an array automatically does what you would have to manage explicitly in that handler that I assume you are now writing.

Craig Newman

EDIT:

Note, as you finish that handler of yours, that ALL the word counts are contained within the above array, not just the count for "fly".
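
To make that point concrete, here is a minimal sketch that walks the keys of the array and lists every count. The function name wordCountReport and its variable names are made up for illustration; they are not part of the handler above.

```livecode
-- Hypothetical helper: returns one "word<tab>count" line per distinct word.
function wordCountReport pText
   local tWordCount, tReport, tWord, tKey
   repeat for each word tWord in pText
      add 1 to tWordCount[tWord]   -- one element per distinct word, created on demand
   end repeat
   repeat for each key tKey in tWordCount
      put tKey & tab & tWordCount[tKey] & cr after tReport
   end repeat
   delete last char of tReport     -- drop the trailing line break
   sort lines of tReport           -- the order of keys in an array is not guaranteed
   return tReport
end wordCountReport
```

Called as wordCountReport("cat dog cat fly"), this should give three lines: cat with 2, dog with 1 and fly with 1.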

Post by AxWald » Mon Oct 24, 2016 3:20 pm

Hi,

as far as I (!) understand this:

Lists & items are more understandable for those of us *1) that don't come with much theoretical knowledge in informatics or mathematics, maybe also for those not yet spoiled by the cryptic syntax of the traditional lingo (C, RegExp and such atrocities ...).

Arrays, on the other hand, come with brackets and are stored in a form that is not human-readable - exactly what the above-mentioned, more formally educated people love & rejoice in.

*1) "for the rest of us", as Apple claimed, before they started to "think different" ...

Joke aside.
I'd expect arrays to become more suitable than lists when it's about speed and complexity. For complexity, I have yet to find the barrier when working with lists.
For speed, I tried Craig's example:

At first, I had to make a small change to store enough data to get some measurements. A field is just too small. So I load a custom property first:

Code: Select all

   repeat 10000000
      put any word of "cat dog snail fly hippo" & space after temp
   end repeat
   set the waste_bin of this stack to temp
Now Craig's code:

Code: Select all

   put the millisecs into t1
   put the waste_bin of this stack into temp
   
   repeat for each word tWord in temp
      add 1 to tWordCount[tWord]
   end repeat

   put "ARR " & tWordCount["fly"] & " / " & the millisecs - t1 & CR after fld "display"
He had challenged us:
dunbarx wrote: Now do this with any handler of your choice, without using an array. No problem, but not in a single "working" line either.
Hmmm. Let's see if this is possible:

Code: Select all

   put the millisecs into t1
   put the waste_bin of this stack into temp
   
   repeat for each word MyWord in temp
      if myWord <> "fly" then next repeat else add 1 to MyCounter
   end repeat
   
   put "LST " & MyCounter & " / " & the millisecs - t1 & CR after fld "display"
For sure, I only count the flies. But why count hippos if I don't need to ;-)

Result in fld "display" (LC 6.7.10, "obsolete"):

Code: Select all

ARR 2000958 / 2066
ARR 2000958 / 2069
ARR 2000958 / 2029
ARR 2000958 / 2051
ARR 2000958 / 2095
LST 2000958 / 1961
LST 2000958 / 1977
LST 2000958 / 1938
LST 2000958 / 1985
LST 2000958 / 1977
Interesting! The list version is even a tiny bit faster!

The result in LC 9.0 dp1:

Code: Select all

ARR 2000958 / 15955
ARR 2000958 / 15938
ARR 2000958 / 15966
LST 2000958 / 19301
LST 2000958 / 19351
LST 2000958 / 19293
Ouch!
LC 8.1 isn't much better ...

Code: Select all

ARR 2000958 / 13994
ARR 2000958 / 13942
ARR 2000958 / 13990
LST 2000958 / 15057
LST 2000958 / 15011
LST 2000958 / 15197
I then "resaved" the stack in native 8.1 format and created a new "waste_bin". Took ages! The results were even worse.

Summary: Working with arrays actually is a tiny bit faster in the new LC versions.

I have to admit that I tried to recreate Craig's example (counting all animals) in 1 line, but didn't succeed in the given time & at the desired speed:

Code: Select all

do "add 1 to My" & MyWord
needs pre-initialized variables (else the "do" doesn't work), and is painfully slow (only slightly faster than the 9.0 dp1 results ...).

Have fun!
All code published by me here was created with Community Editions of LC (thus is GPLv3).
If you use it in closed source projects, or for the Apple AppStore, or with XCode
you'll violate some license terms - read your relevant EULAs & Licenses!

Post by dunbarx » Mon Oct 24, 2016 5:04 pm

@AxWald.

Fun indeed.

I added an edit this morning, pointing out that the array method contains all the counts. I bet that the "list" version would take a bit longer if all were delineated. The list has to use a line of code for each counter. The array simply acts like a post office sorter, (I see the alien in "Men in Black 2") the distribution being managed internally. Anyway, at least the code is shorter.

But is it more readable? The expanded code in the list version has, at least, however verbose, one line of code for each step in the process. The fact that array variables are not visible in the clear, as I like to call it, except during stepwise debugging in the script editor, throws many users. I have to combine those variables now and then just to see what is going on, if debugging viewing is inconvenient.

Speed is another issue, as you mentioned. What if there were a hundred words in a large body of text? Long handler in the list...
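
For comparison, a rough sketch of what the "one line of code for each counter" version might look like without an array. The function name countAllWithoutArray and the counter names are hypothetical, and the five words have to be known in advance:

```livecode
-- Hypothetical helper: counts five known words with one explicit counter each.
function countAllWithoutArray pText
   local tCat, tDog, tSnail, tFly, tHippo, tWord
   put 0 into tCat
   put 0 into tDog
   put 0 into tSnail
   put 0 into tFly
   put 0 into tHippo
   repeat for each word tWord in pText
      switch tWord
         case "cat"
            add 1 to tCat
            break
         case "dog"
            add 1 to tDog
            break
         case "snail"
            add 1 to tSnail
            break
         case "fly"
            add 1 to tFly
            break
         case "hippo"
            add 1 to tHippo
            break
      end switch
   end repeat
   -- returned in the fixed order cat, dog, snail, fly, hippo
   return tCat & comma & tDog & comma & tSnail & comma & tFly & comma & tHippo
end countAllWithoutArray
```

Every new word needs another counter, another "put 0" and another case, which is exactly the bookkeeping the array does on its own.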

Craig

Post by monki » Mon Oct 24, 2016 9:19 pm

dunbarx wrote: Arrays contain additional degrees of freedom, one might say, additional dimensions, derived from associations between data. Consider this classic example of array use:
...
Now do this with any handler of your choice, without using an array. No problem, but not in a single "working" line either.
So, the creation of associated variables on the "fly" as it were. OK, I can see where that could come in handy.
Thanks

Post by monki » Mon Oct 24, 2016 9:38 pm

AxWald wrote: Lists & items are more understandable for those of us *1) that don't come with much theoretical knowledge in informatics or mathematics, maybe also for those not yet spoiled by the cryptic syntax of the traditional lingo (C, RegExp and such atrocities ...).

Arrays, on the other hand, come with brackets and are stored in a not-human-readable form - exactly what above mentioned, more formally educated people, love & rejoice.
That's my suspicion as well, at least part of the time :wink:

But dunbarx's post does point towards some useful ideas for creating variables on the fly, when the number, and name, of those variables will change depending on the text I'm throwing at it. I can see how using an array in this context would make things a lot easier: organizing a list of characters from a book by the first letter of their first name, for example. Which is what I'm going to use this to do :D
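
A minimal sketch of that grouping idea, assuming the names arrive one per line. The function name groupByInitial and the field name "characters" in the usage comment are hypothetical:

```livecode
-- Hypothetical helper: groups names (one per line) by the first letter of each name.
-- Returns an array keyed by the uppercase initial; each element is a
-- return-delimited list of the names that start with that letter.
function groupByInitial pNames
   local tGroups, tName, tInitial
   repeat for each line tName in pNames
      if tName is empty then next repeat   -- skip blank lines
      put toUpper(char 1 of tName) into tInitial
      put tName & cr after tGroups[tInitial]   -- element is created on first use
   end repeat
   return tGroups
end groupByInitial

-- Possible usage, assuming a field "characters" with one name per line:
--    put groupByInitial(fld "characters") into tGroups
--    answer tGroups["A"]
```

Nothing has to be known about the text beforehand: a new initial simply becomes a new key.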

Post by AxWald » Tue Oct 25, 2016 8:52 am

Craig,

you're right. Here we actually see a point of complexity & speed where the array notation has its advantages.

Btw., for "counting all" I wrote:

Code: Select all

   put the millisecs into t1
   put the waste_bin of this stack into temp
   
   repeat for each word MyWord in "cat dog snail fly hippo"
      do "put 0 into My" & MyWord --  initialize 'em variables
   end repeat

   repeat for each word MyWord in temp
      do "add 1 to My" & MyWord
   end repeat
   
   put "LST " & Myfly & " / " & the millisecs - t1 & CR after fld "display"
The extra loop for initializing wouldn't hurt this much IMHO, but the "do" construction is costly - especially when done 10000000 times ;-)

With 6.7.10 it looks like this then:

Code: Select all

ARR 2000958 / 2020
ARR 2000958 / 2022
LST 2000958 / 18468
LST 2000958 / 18724
I don't even want to time this with an "official" version ...

Well, learned something new once more :) Guess I have to keep this in the back of my head, and have to swallow the toad of accepting the array syntax ...

It would be less of a hassle if we could at least use some SQL on an array. This would solve so many problems for me (working with data from different databases)!
Hmmm. I wonder if above problem (counting all) couldn't be solved via a temp SQLite table. Must try this some day ;-)

Have fun!

Post by [-hh] » Tue Oct 25, 2016 12:10 pm

Some of the aspects again, more specialized (we are in the beginners forum):

Suppose you have a string myList with a lot of items, say N items, with N > 1000. Now do

Code: Select all

-- have both, myList and myArray, for comparing
put myList into myArray
split myArray by comma
Now myArray has N elements, myArray[1] to myArray[N], and
item x of myList = myArray[x].

Then frequent accesses to a single item x of myList:
     get item x of myList
are *much* slower than frequent accesses to element x of myArray:
     get myArray[x]
[See stack "Speed" by Sarah https://github.com/trozware/rev_stacks ]
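
For anyone who wants to measure this on their own machine, here is a small timing sketch. It is not taken from Sarah's stack; the handler, the variable names and the list size of 20000 are just assumptions for illustration, and the numbers will vary with machine and LC version.

```livecode
on mouseUp
   local tList, tArray, tStart, tSum, tReport, i
   -- build a list of 20000 numeric items and an array copy of it
   repeat with i = 1 to 20000
      put i & comma after tList
   end repeat
   delete last char of tList
   put tList into tArray
   split tArray by comma   -- elements are now keyed 1 to 20000

   -- time repeated access by item number
   put 0 into tSum
   put the milliseconds into tStart
   repeat with i = 1 to 20000
      add item i of tList to tSum
   end repeat
   put "items: " & (the milliseconds - tStart) & " ms" & cr into tReport

   -- time repeated access by array key
   put 0 into tSum
   put the milliseconds into tStart
   repeat with i = 1 to 20000
      add tArray[i] to tSum
   end repeat
   put "array: " & (the milliseconds - tStart) & " ms" after tReport

   answer tReport
end mouseUp
```

The "repeat with" form is used on purpose in both loops, so that only the access method differs.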

Especially (I use this often),
if the items/elements are numbers
and f(i) is a number that may depend on i

then

Code: Select all

repeat with i=1 to N
  add f(i) to item i of myList
end repeat
is *much* slower than

Code: Select all

repeat for each key i in myArray
  add f(i) to myArray[i]
end repeat
In the special case that f(i) is a constant, say f(i)=42, the latter is even more comfortable to get:

Code: Select all

  add 42 to myArray -- adds to each element of myArray
  -- similar for subtract/multiply/divide
shiftLock happens

Post by dunbarx » Tue Oct 25, 2016 1:59 pm

Aha!

It was not lost on me that Hermann used 42 in his post.

Craig

Post by [-hh] » Tue Oct 25, 2016 3:08 pm

Hi Craig.
Of course the 42 was for you, exclusively.
[Using an array for building such a 'counted set' as the examples of you and AxWald above was, by the way, your first help for me when I started here, in 2013.]

I used the constant case today in a list field of > 30000 lines:
There are possibly non-contiguous highlighted lines (for example, highlighted by a search or ready for reordering), maybe several thousand. Now shift the highlights (not the lines) one line down, maybe repeatedly.

The items-approach is not too slow:

Code: Select all

put the hilitedLines of fld f into hL
repeat with i=1 to the num of items of hL
   add 1 to item i of hL
end repeat
set hilitedLines of fld f to hL
--- can be improved by a repeat-for-each
put the hilitedLines of fld f into hL
put empty into hL2
repeat for each item i in hL
   put comma & (1+i) after hL2
end repeat
set hilitedLines of fld f to item 2 to -1 of hL2
But the array approach is crazy fast:

Code: Select all

put the hilitedLines of fld f into hL
split hL by comma
add 1 to hL
combine hL with comma
-- sort items of hL numeric -- not necessary here
set hilitedLines of fld f to hL
Note for beginners: the disadvantage of such a split-and-combine method is that the order of the items may change, so one has to sort after combining if the order matters.
This is not necessary when setting the hilitedLines, where we are allowed to use a numerically unordered list (and always get an ordered list back).
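
When the order does matter, the sort can be folded into a small helper. A minimal sketch, with the function name shiftItems and its parameters made up for illustration:

```livecode
-- Hypothetical helper: adds pDelta to every item of a comma-delimited list of numbers
-- and returns the result in ascending numeric order.
function shiftItems pList, pDelta
   split pList by comma       -- pList is now an array keyed 1 to N
   add pDelta to pList        -- adds pDelta to every element at once
   combine pList with comma   -- back to a list; item order is not guaranteed
   sort items of pList ascending numeric
   return pList
end shiftItems
```

For example, shiftItems("3,7,12", 1) should return 4,8,13.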

Hermann

[Edit. Actually I use this for shifting the highlites correspondingly, after reordering the list, not just for fun, as above.]
Last edited by [-hh] on Tue Oct 25, 2016 3:55 pm, edited 1 time in total.

Post by dunbarx » Tue Oct 25, 2016 3:42 pm

@Hermann. :wink:

@All. Hermann's latest example bears careful examination. It is very short, and well worth it. Make a small list field and fill it with a dozen lines of short text. Set the nonContiguousHilites and multipleHilites to "true". Select a couple of non-contiguous lines.

Now step through his handler, and watch what happens in the debugger (you must expand the array variable when it appears after the "split" command) when that "1" is added to the array.

Array variables have within them a compactness that dwarfs the functionality of "ordinary" variable usage, when appropriate. It isn't that they cannot be replaced by ordinary variable techniques; it is that they offer great power when used properly.

When appropriate. It just takes practice...

Craig

Post by [-hh] » Tue Oct 25, 2016 3:45 pm

Code: Select all

add 1 to hL
This simple line, adding 1 to several thousand elements of hL shows for me the full beauty of this language.

I hope so much that such a construct will come in LCB too, even more general, as mapping a function to each element of a list of numbers or points or n-tuples.

Code: Select all

map f to hL
Let us dream ...

Post by AxWald » Tue Oct 25, 2016 4:17 pm

FOUL!!! ;-)

"repeat with i [...]"
is nearly always *much* slower than
"repeat for each [...]"
;-)

Besides, I'm too stupid to get this array thingie. I tried to build something based on Hermann's post. First I want a nice array of numbers; this seems to work:

Code: Select all

   repeat 1000000
      put random(999) & tab after temp
   end repeat
   delete last char of temp
   set the list_bin of this stack to temp
   split temp by tab
   set the array_bin of this stack to temp
Inspecting the custom props shows "array_bin" that cannot be displayed etc. Checking:

Code: Select all

the array_bin of this stack[1]
fails - bummer!
But:

Code: Select all

put the array_bin of this stack into myArray
put myArray[2]
works at least.
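
The copy-then-index pattern can be wrapped in a pair of small handlers so it reads almost like direct access. A sketch only: the handler names arrayBinElement and setArrayBinElement are invented here, and the property name array_bin is the one from the code above.

```livecode
-- Hypothetical getter: returns one element of the array stored in the
-- custom property "array_bin" of this stack.
function arrayBinElement pKey
   local tArray
   put the array_bin of this stack into tArray   -- copy the whole array out first
   return tArray[pKey]
end arrayBinElement

-- Hypothetical setter: changes one element and stores the array back.
command setArrayBinElement pKey, pValue
   local tArray
   put the array_bin of this stack into tArray
   put pValue into tArray[pKey]
   set the array_bin of this stack to tArray
end setArrayBinElement
```

Each call copies the whole array, so inside a tight loop it is probably better to copy once into a variable and work on that.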

Now I set up a test; let's try to add random(3) to each element of the list/array. List first:

Code: Select all

   put the millisecs into t1
   put 0*1 into myCount
   set itemdel to tab
   repeat for each item i in the list_bin of this stack
      put  i + random(3) & tab after myTemp     -- increment each item of the list, and create a temp_list
      add 1 to myCount     -- remember how many, for control
   end repeat
   delete last char of myTemp
   set the list_bin of this stack to myTemp    -- update the original list
   put "LST " & myCount & " / " & item 1 of MyTemp & " / " & the millisecs - t1 & CR after fld "display"
I need to set the itemdel, but can work directly on the custom prop. But I need a temp list, cannot change the list directly in this way - here the array should have a big advantage!
At the end I show item 1 to be sure it has incremented. OK.

Now this with array:

Code: Select all

   put the millisecs into t1
   put 0*1 into mySum
   put 0*1 into myCount
   put the array_bin of this stack into myArray
   repeat for each key i in myArray
      -- put i +random(3) into i  --  doesn't add
      -- put i +random(3) into MyArray[i]  --  only adds random(3)
      add 1 to myCount
   end repeat
   set the array_bin of this stack to myArray     -- update the original list
   put "ARR " & myCount & " / " & myArray[1] & " / " & the millisecs - t1 & CR after fld "display"
Hmmmmm.
What's wrong? The first one (put i +random(3) into i) doesn't add at all, the second one (put i +random(3) into MyArray) seems to add only the random(3) :/

And since this is a beginner mistake for sure, I'm not even OT here \o/
Curious what fun we'll have if we come to more complex arrays ;-)

Have fun!

Post by dunbarx » Tue Oct 25, 2016 5:39 pm

the array_bin of this stack[1]
I see what you are trying to get, but this is not valid syntax. You cannot get the value of a custom property and an array element at the same time. And arrays are not visible anywhere but during examination in the debugger. They have to be combined first.

As for your array handler, it all works fine for me in v.6.7.9. Are you sure you are not seeing what you want to see? Try this and see if there is something I am missing:

Code: Select all

on mouseUp
   put 0 into myCount
   put 1 into myArray[1]
   put 2 into myArray[2]
   repeat for each key xx in myArray
      put xx +10 into xx
      put xx +10 into MyArray[xx]
      add 1 to myCount
   end repeat
end mouseUp
You did not expect changes in xx to increase the number of loops, did you? That is locked at the beginning of the control structure, and changes to the variable xx have nothing to do with the number of keys at the start.
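
One possible reading of the "only adds random(3)" symptom, offered as a guess rather than a diagnosis: in "repeat for each key", the loop variable holds the key (1, 2, 3 ...), not the value stored under it. So "put i + random(3) into myArray[i]" stores key plus random(3), which for key 1 looks like a bare random(3) added to 1. A minimal sketch of adding to the stored value instead, with made-up variable names and starting values:

```livecode
on mouseUp
   local tArray, tKey
   put 500 into tArray[1]
   put 600 into tArray[2]
   repeat for each key tKey in tArray
      -- tKey is the key (1 or 2), not the stored value (500 or 600)
      add random(3) to tArray[tKey]
      -- long form of the same thing:
      --    put tArray[tKey] + random(3) into tArray[tKey]
   end repeat
   -- each element has grown by 1, 2 or 3
   answer tArray[1] & comma & tArray[2]
end mouseUp
```

Changing the elements inside the loop is fine here, since the set of keys itself does not change.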

Craig
