Why Array over list
OK, I'm trying to understand when it's better to use an array over an itemized list. The list seems easier to deal with, not needing to find keys and so on. What do arrays have going for them? Is there already a good discussion about this that someone could point me towards?
thanks.
Last edited by monki on Sun Oct 23, 2016 7:10 pm, edited 1 time in total.
Re: Why Array over list
Hi,
I always use itemized lists because they are more readable (for me).
I guess arrays are faster and useful for very large data.
Best regards
Jean-Marc
https://alternatic.ch
Re: Why Array over list
Arrays contain additional degrees of freedom, one might say additional dimensions, derived from associations between data. Consider this classic example of array use:
You want to count the frequency of words in a body of text. Here is the array way, with a field "display" so you can see the text; most of the handler simply sets things up:
Code:
    on mouseUp
        put "" into fld "display"
        repeat 100
            put any word of "cat dog snail fly hippo" & space after temp
        end repeat
        put temp into fld "display"
        repeat for each word tWord in temp
            add 1 to tWordCount[tWord] -- add 1 to the element keyed by this word
        end repeat
        answer tWordCount["fly"]
    end mouseUp
Now do this with any handler of your choice, without using an array. No problem, but not in a single "working" line either.
The structure of an array automatically does what you would have to manage explicitly in that handler that I assume you are now writing.
Craig Newman
EDIT:
Note that in that handler you are likely just about finished with, ALL the wordCounts are contained within the above array, not just "fly".
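To see that all the counts really are in the array, one option is to combine it into plain text before displaying it. This is a minimal sketch, assuming the same five sample words as the handler above; the variable names tText and tWordCount are only illustrative.

```livecode
on mouseUp
   -- build sample text from the same five words (illustrative data)
   repeat 100
      put any word of "cat dog snail fly hippo" & space after tText
   end repeat
   -- one array element per distinct word, created on first use
   repeat for each word tWord in tText
      add 1 to tWordCount[tWord]
   end repeat
   -- turn the array into readable text: one "word<tab>count" line per key
   combine tWordCount using return and tab
   sort lines of tWordCount
   answer tWordCount
end mouseUp
```

After the combine, tWordCount is an ordinary string again, so it can be put into a field or inspected like any other list.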
Re: Why Array over list
Hi,
as far as I (!) understand this:
Lists & items are more understandable for those of us *1) who don't come with much theoretical knowledge in informatics or mathematics, maybe also for those not yet spoiled by the cryptic syntax of the traditional lingo (C, RegExp and such atrocities ...).
Arrays, on the other hand, come with brackets and are stored in a not-human-readable form - exactly what the above-mentioned, more formally educated people love & rejoice in.
*1) "for the rest of us", as Apple claimed, before they started to "think different" ...
Joke aside. I'd expect arrays to become more suitable compared to lists when it's about speed and complexity. For complexity, I have yet to find the barrier when working with lists.
For speed, I tried Craig's example:
At first, I had to make a small change to store enough data to get some measurements. A field is just too small. So I load a custom property first:
Code:
    repeat 10000000
        put any word of "cat dog snail fly hippo" & space after temp
    end repeat
    set the waste_bin of this stack to temp
Now Craig's code:
Code:
    put the millisecs into t1
    put the waste_bin of this stack into temp
    repeat for each word tWord in temp
        add 1 to tWordCount[tWord]
    end repeat
    put "ARR " & tWordCount["fly"] & " / " & the millisecs - t1 & CR after fld "display"
He had challenged us:
dunbarx wrote:
> Now do this with any handler of your choice, without using an array. No problem, but not in a single "working" line either.
Hmmm. Let's see if this is possible:
Code:
    put the millisecs into t1
    put the waste_bin of this stack into temp
    repeat for each word MyWord in temp
        if myWord <> "fly" then next repeat else add 1 to MyCounter
    end repeat
    put "LST " & MyCounter & " / " & the millisecs - t1 & CR after fld "display"
For sure, I only count the flies. But why count hippos if I don't need to ;-)
Result in fld "display" (LC 6.7.10, "obsolete"):
Code:
    ARR 2000958 / 2066
    ARR 2000958 / 2069
    ARR 2000958 / 2029
    ARR 2000958 / 2051
    ARR 2000958 / 2095
    LST 2000958 / 1961
    LST 2000958 / 1977
    LST 2000958 / 1938
    LST 2000958 / 1985
    LST 2000958 / 1977
Interesting! The list version is even a tiny bit faster!
The result in LC 9.0 dp1:
Code:
    ARR 2000958 / 15955
    ARR 2000958 / 15938
    ARR 2000958 / 15966
    LST 2000958 / 19301
    LST 2000958 / 19351
    LST 2000958 / 19293
Ouch!
LC 8.1 isn't much better ...
Code:
    ARR 2000958 / 13994
    ARR 2000958 / 13942
    ARR 2000958 / 13990
    LST 2000958 / 15057
    LST 2000958 / 15011
    LST 2000958 / 15197
I then "resaved" the stack in native 8.1 format and created a new "waste_bin". Took ages! Results were even worse.
Summary: Working with arrays actually is a tiny bit faster in the new LC versions.
I have to admit that I tried to recreate Craig's example (counting all animals) in 1 line, but didn't succeed in a given time & at a desired speed:
Code:
    do "add 1 to My" & MyWord
This needs pre-initialized variables (else the "do" doesn't work), and is boringly slow (only slightly faster than the 9.0 dp1 results ...).
Have fun!
All code published by me here was created with Community Editions of LC (thus is GPLv3).
If you use it in closed source projects, or for the Apple AppStore, or with XCode
you'll violate some license terms - read your relevant EULAs & Licenses!
Re: Why Array over list
@AxWald.
Fun indeed.
I added an edit this morning, pointing out that the array method contains all the counts. I bet that the "list" version would take a bit longer if all were delineated. The list has to use a line of code for each counter. The array simply acts like a post office sorter, (I see the alien in "Men in Black 2") the distribution being managed internally. Anyway, at least the code is shorter.
But is it more readable? The expanded code in the list version has, at least, however verbose, one line of code for each step in the process. The fact that array variables are not visible in the clear, as I like to call it, except during stepwise debugging in the script editor, throws many users. I have to combine those variables now and then just to see what is going on, if viewing them in the debugger is inconvenient.
Speed is another issue, as you mentioned. What if there were a hundred words in a large body of text? Long handler in the list...
Craig
Re: Why Array over list
dunbarx wrote:
> Arrays contain additional degrees of freedom, one might say, additional dimensions, derived from associations between data. Consider this classic example of array use>
> ...
> Now do this with any handler of your choice, without using an array. No problem, but not in a single "working" line either.
So, the creation of associated variables on the "fly" as it were. OK, I can see where that could come in handy.
Thanks
Re: Why Array over list
AxWald wrote:
> Lists & items are more understandable for those of us *1) that don't come with much theoretical knowledge in informatics or mathematics, maybe also for those not yet spoiled by the cryptic syntax of the traditional lingo (C, RegExp and such atrocities ...).
> Arrays, on the other hand, come with brackets and are stored in a not-human-readable form - exactly what above mentioned, more formally educated people, love & rejoice.
That's my suspicion as well, at least part of the time.

But dunbarx's post does point towards some useful ideas for creating variables on the fly, when the number and names of those variables will change depending on the text I'm throwing at it. I can see how using an array in this context would make things a lot easier: organizing a list of characters from a book by the first letter of their first name, for example. Which is what I'm going to use this for.

Re: Why Array over list
Craig,
you're right. Here we actually see a point of complexity & speed where the array notation has its advantages.
Btw., for "counting all" I wrote:
Code:
    put the millisecs into t1
    put the waste_bin of this stack into temp
    repeat for each word MyWord in "cat dog snail fly hippo"
        do "put 0 into My" & MyWord -- initialize 'em variables
    end repeat
    repeat for each word MyWord in temp
        do "add 1 to My" & MyWord
    end repeat
    put "LST " & Myfly & " / " & the millisecs - t1 & CR after fld "display"
The extra loop for initializing wouldn't hurt this much IMHO, but the "do" construction is costly - especially when done 10000000 times ;-)
With 6.7.10 it looks like this then:
Code:
    ARR 2000958 / 2020
    ARR 2000958 / 2022
    LST 2000958 / 18468
    LST 2000958 / 18724
I don't even want to time this with an "official" version ...
Well, learned something new once more :) Guess I have to keep this in the back of my head, and have to swallow the toad of accepting the array syntax ...
It would be less of a hassle if we could at least use some SQL on an array. This would solve many problems for me (working with data from different databases)!
Hmmm. I wonder if the above problem (counting all) couldn't be solved via a temp SQLite table. Must try this some day ;-)
Have fun!
Re: Why Array over list
Some of the aspects again, more specialized (we are in the beginners forum):
Suppose you have a string myList of a lot of items, say N items, with N > 1000. Now do
Code:
    -- have both, myList and myArray, for comparing
    put myList into myArray
    split myArray by comma
Now myArray has N elements, myArray[1] to myArray[N], and
item x of myList = myArray[x].
Then frequent accesses to a single item x of myList:
get item x of myList
are *much* slower than frequent accesses to element x of myArray:
get myArray[x]
[See stack "Speed" by Sarah https://github.com/trozware/rev_stacks ]
Especially (I use this often),
if the items/elements are numbers
and f(i) is a number that may depend on i
then
Code:
    repeat with i=1 to N
        add f(i) to item i of myList
    end repeat
is *much* slower than
Code:
    repeat for each key i in myArray
        add f(i) to myArray[i]
    end repeat
In the special case that f(i) is a constant, say f(i)=42, the latter is even more comfortable to get:
Code:
    add 42 to myArray -- adds to each element of myArray
    -- similar for subtract/multiply/divide
shiftLock happens
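The access-time claim above can be checked with a small timing harness. This is only a sketch: the list size of 20000 and the variable names (tCount, tList, tArray) are assumptions, and the actual numbers will depend on the LiveCode version and the machine.

```livecode
on mouseUp
   put 20000 into tCount -- assumed size, adjust as needed
   repeat with i = 1 to tCount
      put i & comma after tList
   end repeat
   delete last char of tList
   -- same data as an array keyed 1..tCount
   put tList into tArray
   split tArray by comma
   -- time access by item number
   put the millisecs into t1
   repeat with i = 1 to tCount
      get item i of tList
   end repeat
   put the millisecs - t1 into tListTime
   -- time access by array key
   put the millisecs into t1
   repeat with i = 1 to tCount
      get tArray[i]
   end repeat
   put the millisecs - t1 into tArrayTime
   answer "items:" && tListTime && "ms" & return & "array:" && tArrayTime && "ms"
end mouseUp
```

The item loop has to count delimiters from the start of the string on every access, while the array loop looks each element up by its key, which is where the difference described above would come from.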
Re: Why Array over list
Aha!
It was not lost on me that Hermann used 42 in his post.
Craig
Re: Why Array over list
Hi Craig.
Of course the 42 was for you, exclusively.
[Using an array for building such a 'counted set', as in your and AxWald's examples above, was, by the way, your first help for me when I started here, in 2013.]
Using the constant case, I had this today in a list field of > 30000 lines:
There are possibly non-contiguous highlighted lines (for example highlighted from a search or ready for reordering), maybe several thousand. Now shift the highlights (not the lines) one line down, maybe repeatedly.
The items-approach is not too slow:
Code:
    put the hilitedLines of fld f into hL
    repeat with i=1 to the num of items of hL
        add 1 to item i of hL
    end repeat
    set hilitedLines of fld f to hL
    --- can be improved by a repeat-for-each
    put the hilitedLines of fld f into hL
    put empty into hL2
    repeat for each item i in hL
        put comma & (1+i) after hL2
    end repeat
    set hilitedLines of fld f to item 2 to -1 of hL2
But the array approach is crazy fast:
Code:
    put the hilitedLines of fld f into hL
    split hL by comma
    add 1 to hL
    combine hL with comma
    -- sort items of hL numeric -- not necessary here
    set hilitedLines of fld f to hL
Note for beginners. The disadvantage of such a splitAndCombine-method is that the order of the items may change, and one has to sort after combining if the order is needed.
This is not necessary for setting the hilitedLines, where we are allowed to use a numerically unordered list (and always get an ordered list).
Hermann
[Edit. Actually I use this for shifting the highlights correspondingly, after reordering the list, not just for fun, as above.]
Last edited by [-hh] on Tue Oct 25, 2016 3:55 pm, edited 1 time in total.
Re: Why Array over list
@Hermann.
@All. Hermann's latest example bears careful examination. It is very short, and well worth it. Make a small list field and fill it with a dozen lines of short text. Set its multipleHilites and nonContiguousHilites properties to "true". Select a couple of non-contiguous lines.
Now step through his handler, and watch what happens in the debugger (you must expand the array variable when it appears after the "split" command) when that "1" is added to the array.
Array variables have within them a compactness that dwarfs the functionality in "ordinary" variable usage, when appropriate. It isn't that they cannot be substituted by ordinary variable techniques, it is that they offer great power when used properly.
When appropriate. It just takes practice...
Craig

Re: Why Array over list
Code:
    add 1 to hL
I hope so much that such a construct will come in LCB too, even more general, as mapping a function to each element of a list of numbers or points or n-tuples.
Code:
    map f to hL
Re: Why Array over list
FOUL!!! ;-)
"repeat with i [...]"
is nearly always *much* slower than
"repeat for each [...]"
;-)
Besides, I'm too stupid to get this array thingie. I tried to build something based on Hermann's post - at first I want a nice array of numbers; this seems to work:
Code:
    repeat 1000000
        put random(999) & tab after temp
    end repeat
    delete last char of temp
    set the list_bin of this stack to temp
    split temp by tab
    set the array_bin of this stack to temp
Inspecting the custom props shows "array_bin" that cannot be displayed etc. Checking:
Code:
    the array_bin of this stack[1]
fails - bummer!
But:
Code:
    put the array_bin of this stack into myArray
    put myArray[2]
works at least.
Now I set up a test; let's try to add random(3) to each element of the list/array. List first:
Code:
    put the millisecs into t1
    put 0*1 into myCount
    set itemdel to tab
    repeat for each item i in the list_bin of this stack
        put i + random(3) & tab after myTemp -- increment each item of the list, and create a temp_list
        add 1 to myCount -- remember how many, for control
    end repeat
    delete last char of myTemp
    set the list_bin of this stack to myTemp -- update the original list
    put "LST " & myCount & " / " & item 1 of MyTemp & " / " & the millisecs - t1 & CR after fld "display"
I need to set the itemdel, but can work directly on the custom prop. But I need a temp list; I cannot change the list directly in this way - here the array should have a big advantage!
At the end I show item 1 to be sure it has incremented. OK.
Now this with array:
Code:
    put the millisecs into t1
    put 0*1 into mySum
    put 0*1 into myCount
    put the array_bin of this stack into myArray
    repeat for each key i in myArray
        -- put i +random(3) into i -- doesn't add
        -- put i +random(3) into MyArray[i] -- only adds random(3)
        add 1 to myCount
    end repeat
    set the array_bin of this stack to myArray -- update the original list
    put "ARR " & myCount & " / " & myArray[1] & " / " & the millisecs - t1 & CR after fld "display"
Hmmmmm.
What's wrong? The first one (put i +random(3) into i) doesn't add at all, the second one (put i +random(3) into MyArray[i]) seems to add only the random(3) :/
And since this is a beginner mistake for sure, I'm not even OT here \o/
Curious what fun we'll have if we come to more complex arrays ;-)
Have fun!
Re: Why Array over list
AxWald wrote:
> the array_bin of this stack[1]
I see what you are trying to get, but this is not valid syntax. You cannot get the value of a custom property and an array element at the same time. And arrays are not visible anywhere but during examination in the debugger. They have to be combined first.
As for your array handler, it all works fine for me in v.6.7.9. Are you sure you are not seeing what you want to see? Try this and see if there is something I am missing:
Code:
    on mouseUp
        put 0 into myCount
        put 1 into myArray[1]
        put 2 into myArray[2]
        repeat for each key xx in myArray
            put xx +10 into xx
            put xx +10 into MyArray[xx]
            add 1 to myCount
        end repeat
    end mouseUp
Craig
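One possible reading of AxWald's result, offered as a sketch rather than a definitive diagnosis: in "repeat for each key", the loop variable holds the key (here 1, 2, ...), not the value stored under it, so "i + random(3)" adds to the key. Assuming the goal is to add random(3) to every stored value, the element has to be addressed through the key. The three-element array below is made-up sample data.

```livecode
on mouseUp
   -- small illustrative array: keys 1..3, values 100, 200, 300
   put "100,200,300" into myArray
   split myArray by comma
   repeat for each key i in myArray
      -- i is the key; myArray[i] is the value stored under it
      add random(3) to myArray[i]
   end repeat
   -- combine to make the result visible outside the debugger
   combine myArray using comma
   answer myArray
end mouseUp
```

As noted earlier in the thread, the order of the combined items is not guaranteed, so a sort may be needed afterwards if the order matters.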