Multiple itemDelimiters


dunbarx
VIP Livecode Opensource Backer
Posts: 5276
Joined: Wed May 06, 2009 2:28 pm
Location: New York, NY

Multiple itemDelimiters

Post by dunbarx » Wed Sep 24, 2014 11:36 pm

I have wanted this since forever. In the beginning, there was the comma. Then came any character. A wonderful improvement. But I wish there were, like custom properties, the ability to define as many "item" delimiters as one wanted. In many scripts I write, and especially in loops, I set and reset the itemDelimiter, back and forth, parsing data in different portions of the processed text. Consider data sets like:

aaa,bbb,ccc#ddd,eee,fff -- note the "," and "#" separating different portions of each string
ggg,hhh,jjj#kkk,xxx,yyy

Not sure what to call them, ( "item1.item2...")? So you could (pseudo):

Code: Select all

get theAboveText
set the itemDel to comma
set the item2Del to "#"

repeat with x = 1 to the number of lines of it
  repeat with y = 1 to the number of items of line x of it
    repeat with z = 1 to the number of item2 of item y of line x of it
      put item2 z of item y of line x of it & return after temp
    end repeat
  end repeat
end repeat
That sort of thing...

I admit the naming scheme ("item2") is awful, but the concept seems like a winner. It would be as if not only could you delimit by items, but also by "buckets", "packages", "parcels" and "trainLoads":

Code: Select all

answer parcel 2 of item 3 of bucket 4 of yourString
Forget having as many as you want; perhaps only a small fixed number of new "item" keywords would be necessary. I bet adding the four above would cover most bases.

Craig Newman
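
For contrast with the proposal, the set-and-reset pattern the post describes can be sketched in current LiveCode roughly as follows. This is only an illustration: the handler name flattenSample and its variable names are made up here, and it follows the post's nesting order (comma first, then "#").

```livecode
function flattenSample pData
   local tOut, tLine, tItem, y, z
   repeat for each line tLine in pData
      -- outer level: items separated by comma
      set the itemDelimiter to comma
      repeat with y = 1 to the number of items of tLine
         put item y of tLine into tItem
         -- inner level: switch to "#" for the sub-items
         set the itemDelimiter to "#"
         repeat with z = 1 to the number of items of tItem
            put item z of tItem & return after tOut
         end repeat
         -- switch back before the next outer item is read
         set the itemDelimiter to comma
      end repeat
   end repeat
   return tOut
end flattenSample
```

Called with the two sample lines, this should return each three-letter token on its own line. The two extra "set the itemDelimiter" statements inside the loop are exactly the bookkeeping that additional delimiter keywords would remove.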

WaltBrown
Posts: 466
Joined: Mon May 11, 2009 9:12 pm
Location: NE USA

Re: Multiple itemDelimiters

Post by WaltBrown » Thu Sep 25, 2014 11:37 am

I had seen a request like that a while ago. Terms like wordDelim, phraseDelim, sentenceDelim and chunkDelim might be other arbitrary names to hold multiple delimiters.
Walt Brown
Omnis traductor traditor

phaworth
Posts: 592
Joined: Thu Jun 11, 2009 9:51 pm

Re: Multiple itemDelimiters

Post by phaworth » Thu Sep 25, 2014 7:37 pm

I like that idea. An alternative might be to have a "delimited by" clause with the default being comma. So your code would become

Code: Select all

get theAboveText
repeat with x = 1 to the number of lines of it
  repeat with y = 1 to the number of items of line x of it
    repeat with z = 1 to the number of items delimited by "#" of item y of line x of it
      put item z of item y of line x of it & return after temp
    end repeat
  end repeat
end repeat

dunbarx
VIP Livecode Opensource Backer
Posts: 5276
Joined: Wed May 06, 2009 2:28 pm
Location: New York, NY

Re: Multiple itemDelimiters

Post by dunbarx » Fri Sep 26, 2014 12:51 am

Well, that is a great way to eliminate the silly names of the new "item" keywords. And it reads in a very grown-up way.

It is not as simple or neat as an explicit "newItem" keyword (like "trainload"), though. But I bet it would be easier to sell that way, and has the advantage of having virtually no limits as to the number of new keywords. Maybe "trainLoad" is a bit much, after all.

Note that there is another thread where it has been discovered that in LC 7, both the itemDelimiter and the "split" command (so far) may now contain any number of characters:

Code: Select all

set the itemDelimiter to "XYZ"
answer item 2 of "aaaXYZbbb" --gives "bbb"
I guess this is a good thing. It would allow the new syntax you suggested to be very descriptive:

Code: Select all

...the number of items delimited by "colorChoice" of... 
or whatever.

Craig

paul@researchware.com
VIP Livecode Opensource Backer
Posts: 71
Joined: Wed Aug 26, 2009 7:42 pm
Location: Randolph, MA USA

Re: Multiple itemDelimiters

Post by paul@researchware.com » Fri Sep 26, 2014 1:45 pm

I really like the 'item delimited by' convention. However, I think that to be reasonably parsable by a programming-language parser, the clause would also need to appear in the 'put' statement. As in:

put item z delimited by "#" of item y [delimited by ","] of line x of it & return after temp

Otherwise, the interpreter has to somehow remember the delimiter associated with each variable to differentiate between "item z" and "item y".
Paul Dupuis
ResearchWare, Inc.

dunbarx
VIP Livecode Opensource Backer
Posts: 5276
Joined: Wed May 06, 2009 2:28 pm
Location: New York, NY

Re: Multiple itemDelimiters

Post by dunbarx » Sun Sep 28, 2014 1:24 am

Good point.

Unless the parser always knows that if no "delimited by" phrase is seen, then either the default delimiter is to be used (comma), or the most recent explicit itemdelimiter declaration. I think this should be readily implemented.

Craig

paul@researchware.com
VIP Livecode Opensource Backer
Posts: 71
Joined: Wed Aug 26, 2009 7:42 pm
Location: Randolph, MA USA

Re: Multiple itemDelimiters

Post by paul@researchware.com » Sun Sep 28, 2014 1:45 am

Craig,

Exactly my point. Under a parser rule that says "if no 'delimited by' phrase is seen, use either the default delimiter (comma) or the most recent explicit itemDelimiter declaration", the original sample code (below) would not be parsed as intended.

Code: Select all

repeat with x = 1 to the number of lines of it
  repeat with y = 1 to the number of items of line x of it
    repeat with z = 1 to the number of items delimited by "#" of item y of line x of it
      put item z of item y of line x of it & return after temp
    end repeat
  end repeat
end repeat
The line "put item z of item y of line x of it & return after temp" would see that no "delimited by" clause was present, so it would fall back either to comma or to the most recent "delimited by" value ("#", from the repeat above). Either way, item z and item y would both be split on the same delimiter, which is not what was intended.

I mean, you can write a parser that binds a variable (z or y) so that, whenever the variable is preceded by "item", it remembers the previously bound item delimiter. But from an interpreter-coding perspective, that is much more complicated than using the "delimited by" syntax in the put statement itself.
Paul Dupuis
ResearchWare, Inc.

phaworth
Posts: 592
Joined: Thu Jun 11, 2009 9:51 pm

Re: Multiple itemDelimiters

Post by phaworth » Sun Sep 28, 2014 4:21 am

Apologies, I just left the "delimited by" clause out of the put by oversight.
It would make coding simpler to use the "repeat for each" form:

Repeat for each item ritem delimited by "#" of item x of line y of tText
put ritem.....

I'm now thinking this syntax could be used to extend LC to include generic chunks. We'd need a new keyword, perhaps "chunk", and then we could:

put chunk 4 delimited by " $" of chunk 1 delimited by "*".....

Pete

dunbarx
VIP Livecode Opensource Backer
Posts: 5276
Joined: Wed May 06, 2009 2:28 pm
Location: New York, NY

Re: Multiple itemDelimiters

Post by dunbarx » Mon Sep 29, 2014 4:42 pm

Paul.

Phaworth's typo aside, are we all now on the same page? I think all the new ideas, especially a new keyword (and why not "chunk"? It is not currently a native word) would allow exquisite control over text parsing.

What office do we protest outside of? I will make placards.

Craig

FourthWorld
VIP Livecode Opensource Backer
Posts: 5892
Joined: Sat Apr 08, 2006 7:05 am
Location: Los Angeles

Re: Multiple itemDelimiters

Post by FourthWorld » Mon Sep 29, 2014 4:57 pm

You could submit it to the request queue:
http://quality.runrev.com/

If there's enough interest in this you may be able to find someone in the community to implement it.
Richard Gaskin
Community volunteer LiveCode Community Liaison

LiveCode development, training, and consulting services: Fourth World Systems: http://FourthWorld.com
LiveCode User Group on Facebook : http://FaceBook.com/groups/LiveCodeUsers/

dunbarx
VIP Livecode Opensource Backer
Posts: 5276
Joined: Wed May 06, 2009 2:28 pm
Location: New York, NY

Re: Multiple itemDelimiters

Post by dunbarx » Tue Sep 30, 2014 4:18 pm

Enhancement logged. Mark replied:
Hi Craig,

I wonder if a slightly different approach might be better here.

What you are essentially doing is saying you have a string which is a sequence
of nested lists. At the first level, the list is delimited by ',', and at the
second level the list is delimited by '#'.

I wonder if extending the split command to handle arbitrarily deep nestings
might be more appropriate. For example:

split it by "," then "#"

Would give you an array:
it[1][1]="ggg"
it[2][1]="hhh"
it[3][1]="jjj"
it[3][2]="kkk"
it[4][1]="xxx"
it[5][1]="yyy"

Essentially, the command would have an identical effect as follows:
split it by ","
repeat with i = 1 to the number of elements of it
  split it[i] by "#"
end repeat

The loop would then become:

repeat for each line tLine in it
  split tLine by "," then "#"
  repeat for each element tItem1 in tLine
    repeat for each element tItem2 in tItem1
      put tItem2 & return after temp
    end repeat
  end repeat
end repeat

It provides a similar feature to the one you suggest, except that it could perhaps
be a bit more efficient (only one pass is ever done over the input string) and
avoids having to manage multiple delimiters simultaneously.

Warmest Regards,

Mark.


Apparently there is interest in Scotland. I replied:
Mark.

That is very nice indeed for the "split" command, to create a multi-level array in one pass. But what if I want to parse my data in the clear? In that case, as the lower portion of that thread delves into, it is necessary to declare a few explicit delimiters and place them as required in an implicit "nesting".

Both enhancements would be fantastic, but only upping the "split" command would leave much on the table, don't you agree?

Anyway, thank you for getting back to me. You are the Dan Winkler of the 21st century.

Craig
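
Until something like Mark's suggested split ... by "," then "#" exists, its two-level effect can be approximated with the existing split command. A minimal sketch, assuming split accepts an array element as its container; splitNested is a hypothetical helper name, not part of LiveCode:

```livecode
function splitNested pText, pOuterDelim, pInnerDelim
   local tKeys, tKey
   -- first pass: turn the string into a numerically keyed array
   split pText by pOuterDelim
   -- take a snapshot of the keys, then split each element in place
   put the keys of pText into tKeys
   repeat for each line tKey in tKeys
      split pText[tKey] by pInnerDelim
   end repeat
   return pText
end splitNested
```

With "ggg,hhh,jjj#kkk,xxx,yyy" as pText, comma as pOuterDelim and "#" as pInnerDelim, the returned array should hold "kkk" in element [3][2]. Unlike the single pass Mark describes, this sketch makes one pass per nesting level, so it illustrates the result rather than the efficiency gain.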

paul@researchware.com
VIP Livecode Opensource Backer
Posts: 71
Joined: Wed Aug 26, 2009 7:42 pm
Location: Randolph, MA USA

Re: Multiple itemDelimiters

Post by paul@researchware.com » Tue Sep 30, 2014 4:30 pm

I concur with Craig: both an enhanced split command for data with more than two dimensions and a 'chunk x delimited by y' form would be very helpful additions for people dealing with large amounts of data.

Sometimes you have a big blob of data you want to parse all at once (split), and sometimes you want a specific item from a specific line (chunk 3 delimited by "#" of chunk 2 delimited by "," of chunk 6 delimited by tab of tData).
Paul Dupuis
ResearchWare, Inc.
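
The single-lookup case can be roughly emulated today with a small wrapper function. This is a sketch only: chunkDelimitedBy is a made-up name, and it relies on the itemDelimiter being local to the handler that sets it, so the caller's delimiter is left alone.

```livecode
function chunkDelimitedBy pIndex, pDelim, pText
   -- the itemDelimiter set here applies only inside this handler
   set the itemDelimiter to pDelim
   return item pIndex of pText
end chunkDelimitedBy
```

The nested expression from the post would then be written inside-out:

```livecode
put chunkDelimitedBy(3, "#", chunkDelimitedBy(2, comma, chunkDelimitedBy(6, tab, tData))) into tValue
```

Each call rescans its text from the start, so this suits one-off lookups better than tight loops over large data.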

[-hh]
VIP Livecode Opensource Backer
Posts: 1680
Joined: Thu Feb 28, 2013 11:52 pm
Location: Göttingen, DE

Re: Multiple itemDelimiters

Post by [-hh] » Sat Oct 18, 2014 3:34 am

Better give it up.

Originally I intended to support your request. It's very creative and kind of trying to introduce multiplication after one has addition. But then I saw today arguments in the use-list: ...

Edit. If you can't see these posts, they can't be there.
Last edited by [-hh] on Sun Oct 19, 2014 4:56 am, edited 1 time in total.
shiftLock happens

dunbarx
VIP Livecode Opensource Backer
Posts: 5276
Joined: Wed May 06, 2009 2:28 pm
Location: New York, NY

Re: Multiple itemDelimiters

Post by dunbarx » Sun Oct 19, 2014 3:06 am

Hermann.

I did not see the posts on the use-list you refer to.

Oh, and never give up.

Craig

dunbarx
VIP Livecode Opensource Backer
Posts: 5276
Joined: Wed May 06, 2009 2:28 pm
Location: New York, NY

super itemDel redux

Post by dunbarx » Tue May 15, 2018 6:16 pm

I am re-opening this issue (See "Multiple itemDelimiters" in the forum). I had asked that the language be enhanced with additional chunkDelimiters, "parcels", "servings", whatever.

So one could have item 3 of parcel 4 of serving 5 of myText, all those gadgets defined beforehand.

Phaworth cleverly suggested a variant on this theme, created on the fly:

Code: Select all

repeat with z = 1 to the number of items delimited by "#" of line x of it
Both methods seem incredibly powerful and useful. I am wondering if this might yet gain traction.

Craig Newman
