Manipulating word chunks
Posted: Thu Aug 06, 2009 12:39 am
by phaworth
I have a need to extract the first word of all the lines in a variable.
I have been trying:
Code: Select all
get word 1 of all lines of myvariable
and
Code: Select all
get word 1 of line 1 to (the number of lines in myvariable) of myvariable
The first try resulted in a compilation error and the second returned zero, which is the value of word 1 of the first line but no other values.
What is the correct way to achieve this? Do I have to use a repeat loop to go through all the lines and extract the first word from each one or is there a better way?
Thanks,
Pete
Posted: Thu Aug 06, 2009 2:59 am
by Mark
Hi Pete,
Yes, a "repeat for each" loop is probably the simplest and fastest solution.
Best,
Mark
Posted: Thu Aug 06, 2009 4:57 am
by shadowslash
Try this small bit of code I made:
Code: Select all
on mouseUp
   repeat with x=1 to the num of lines in myVariable
      get the first word of line x of myvariable
   end repeat
end mouseUp
Posted: Thu Aug 06, 2009 5:37 am
by Janschenkel
The script as proposed by shadowslash will definitely work, but can be optimized. This should be faster for longer texts:
Code: Select all
on mouseUp
   repeat for each line theLine in myVariable
      put the first word of theLine & return after theFirstWordList
   end repeat
   -- strip the trailing return
   delete the last char of theFirstWordList
end mouseUp
When you use a "repeat for each line" loop, the engine will go through the data one line at a time, and can remember where it's at in the variable.
If you use a "repeat with theIndex = 1 to the number of lines" loop, the engine has to start counting lines again from the very beginning of the text, in order to find "line theIndex" of the data.
Also, when you use a "repeat with" loop, it's faster to evaluate the end value just once - otherwise the engine may have to recount every time, just to make sure it hasn't gone too far. So it's faster to:
Code: Select all
put the number of lines in theVariable into theCount
repeat with theIndex = 1 to theCount
   ...
end repeat
Anyway, using "repeat for each" can improve performance dramatically, and it's best to pick up on these peculiarities of the language as quickly as possible.
Jan Schenkel.
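To reuse the "repeat for each" approach described above from several scripts, the loop could be wrapped in a function along these lines. This is only a sketch: the names firstWordsOf, pText, tLine and tResult are made up for illustration and do not come from the thread.

```livecode
-- Hypothetical helper: returns the first word of every line of pText,
-- one word per line, with no trailing return.
function firstWordsOf pText
   local tResult
   repeat for each line tLine in pText
      put word 1 of tLine & return after tResult
   end repeat
   -- strip the trailing return added by the last pass through the loop
   delete the last char of tResult
   return tResult
end firstWordsOf
```

It could then be called as "put firstWordsOf(myVariable) into theFirstWordList". A line that is empty contributes an empty line to the result, so the output keeps the same number of lines as the input.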
Posted: Thu Aug 06, 2009 5:55 am
by shadowslash
I modified Janschenkel's script a bit and I came up with this:
Code: Select all
on mouseUp
   repeat for each line theLine in myVariable
      get the first word of theLine
      if theFirstWordList is empty then
         put it into theFirstWordList
      else
         put it & return after theFirstWordList
      end if
   end repeat
   -- strip the trailing return
   delete the last char of theFirstWordList
end mouseUp
This ensures that there isn't an empty line at the end of the theFirstWordList variable. Alternatively, you can also use:
Code: Select all
delete the last line of theFirstWordList
to make your code smaller but then it's really up to you.

Posted: Thu Aug 06, 2009 8:45 am
by Klaus
You could also use an array!
...
set columndelimiter to SPACE
split myVariable by column
put myVariable[1] into list_of_first_words
...
Done
Best
Klaus
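A slightly fuller sketch of the "split by column" idea above, wrapped in a function so that the original variable is left untouched. The names firstWordsBySplit and pText are illustrative assumptions, not part of Klaus's snippet.

```livecode
-- Hypothetical helper: uses "split ... by column" to pull out the first
-- space-delimited column of every line of pText.
function firstWordsBySplit pText
   -- treat each space as a column separator (the default is tab)
   set the columnDelimiter to space
   -- pText is a copy of the caller's data, so splitting it here
   -- does not change the caller's variable
   split pText by column
   -- element 1 of the resulting array holds column 1 of every line
   return pText[1]
end firstWordsBySplit
```

One difference to keep in mind: this splits on every single space, so a line with leading or doubled spaces may give a different result from "word 1", which treats a run of whitespace as one separator.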
Posted: Thu Aug 06, 2009 8:54 am
by Mark
Dear shadowslash and Klaus,
Don't use additional if statements only to delete one character at the end. It slows down your script.
Don't use arrays in a repeat loop either. This slows down your script significantly.
Best,
Mark
Posted: Thu Aug 06, 2009 9:04 am
by Klaus
Dag Mark,
I NEVER did, do or will do this!
Best
Klaus
Posted: Fri Aug 07, 2009 11:54 am
by shadowslash
Klaus wrote:You could also use an array!
...
set columndelimiter to SPACE
split myVariable by column
put myVariable[1] into list_of_first_words
...
Done
Best
Klaus
Lol arrays are like ancient hieroglyphics to me...

Even though I am already doing a lot of stuff through Revolution, I haven't even DARED try using an array statement yet.

Posted: Fri Aug 07, 2009 5:21 pm
by !Jerry!
shadowslash wrote:I modified Janschenkel's script a bit and I came up with this:
Code: Select all
on mouseUp
   repeat for each line theLine in myVariable
      get the first word of theLine
      if theFirstWordList is empty then
         put it into theFirstWordList
      else
         put it & return after theFirstWordList
      end if
   end repeat
   -- strip the trailing return
   delete the last char of theFirstWordList
end mouseUp
Hi shadowslash,
are you sure that your modification is right?
It seems to me you would get two unseparated words in the first line of "theFirstWordList" ...
regards,
Jerry
Posted: Fri Aug 07, 2009 10:33 pm
by phaworth
Thanks for all the ideas. I haven't been getting emails about the posts so have only just seen them all. The info about the efficiency considerations in repeat loops is great, thanks for that.
I have a repeat loop at the moment that is doing the job for me, but the simplicity of using an array is appealing. It would be interesting to measure the efficiency of the array approach vs. a repeat loop, although I don't expect to have enough data for it to make an appreciable difference (a few hundred lines at most).
If anyone from RunRev is watching, I can't help feeling that there should be a way to do this with something like "get word 1 of all lines of myVariable".
Pete
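One possible way to make the comparison mentioned above is to time both approaches with "the milliseconds". The following is a rough sketch only: the test data is invented, the variable names are illustrative, and the count of 500 lines is an assumption based on the "few hundred lines" figure.

```livecode
on mouseUp
   local tData, tCopy, tFirstWords, tStart, tLoopTime, tSplitTime, i
   -- build some made-up test data, one multi-word line per pass
   repeat with i = 1 to 500
      put "word" & i && "rest of line" & return after tData
   end repeat
   delete the last char of tData

   -- time the "repeat for each" approach
   put the milliseconds into tStart
   repeat for each line tLine in tData
      put word 1 of tLine & return after tFirstWords
   end repeat
   delete the last char of tFirstWords
   put the milliseconds - tStart into tLoopTime

   -- time the "split by column" approach on a copy of the data
   put tData into tCopy
   put the milliseconds into tStart
   set the columnDelimiter to space
   split tCopy by column
   put the milliseconds - tStart into tSplitTime

   answer "repeat for each:" && tLoopTime && "ms" & return & \
         "split by column:" && tSplitTime && "ms"
end mouseUp
```

With only a few hundred lines both timings are likely to be too small to tell apart, so the line count would need to be raised considerably before the numbers mean much.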
Posted: Sat Aug 08, 2009 12:27 am
by shadowslash
!Jerry! wrote:Hi shadowslash,
are you sure that your modification is right?
It seems to me you would get two unseparated words in the first line of "theFirstWordList" ...
regards,
Jerry
Hi Jerry,
Thanks for reminding me! I didn't see that one.
Code: Select all
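on mouseUp
   repeat for each line theLine in myVariable
      get the first word of theLine
      if theFirstWordList is empty then
         put it into theFirstWordList
      else
         -- put the return before each later word, so words stay on
         -- separate lines and no trailing return is left over
         put return & it after theFirstWordList
      end if
   end repeat
end mouseUp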