Various ways of matching a line

Got a LiveCode personal license? Are you a beginner, hobbyist or educator that's new to LiveCode? This forum is the place to go for help getting started. Welcome!


MichaelBluejay
Posts: 155
Joined: Thu Jul 01, 2010 11:50 am

Various ways of matching a line

Post by MichaelBluejay » Thu Oct 10, 2019 5:57 pm

It's easy to accidentally match the wrong line. Here I'll offer some solutions, and invite alternatives. (Note: I'm adding good alternatives that were suggested in subsequent posts to this initial post, for one-stop shopping, since the whole point is to make this post a resource for those trying to solve this problem.)

Let's say we have a variable "tData" that contains some lines, with each line being a tab-delimited list, with the first column being an ID.
(The "t" that starts the variable name is standard LiveCode convention to identify temporary variables. You don't have to follow that convention, but you'll see it a lot on the forum and in the documentation.)

Code: Select all

1  heather founder
42 mary    engineer
3  john    intern
2  pat     CEO
5  bob     accountant
If we want to retrieve the line of data for ID 5, this would work, but only because no earlier line happens to contain a 5: put line lineOffset(5, tData) of tData into theLine -- finds line 5

But if we try to get the line for ID 2, we get the wrong line, because lineOffset finds the 2 in "42" first: put line lineOffset(2, tData) of tData into theLine -- finds line 2 (ID 42), not line 4

Similarly, if any of the other columns contain the string we're searching for, we'll accidentally match the first of those lines as well.
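
To see the mismatch concretely, here is a minimal sketch that rebuilds the sample data with real tab characters and runs both lookups. The handler name demoLineOffset and the variable tOut are made up for illustration; they are not from the original post.

```livecode
on demoLineOffset
   local tData, tOut
   -- Rebuild the sample data with real tab characters between columns
   put "1" & tab & "heather" & tab & "founder" & cr & \
         "42" & tab & "mary" & tab & "engineer" & cr & \
         "3" & tab & "john" & tab & "intern" & cr & \
         "2" & tab & "pat" & tab & "CEO" & cr & \
         "5" & tab & "bob" & tab & "accountant" into tData
   
   -- "5" first appears on line 5, so this happens to find bob's record
   put line lineOffset("5", tData) of tData into tOut
   -- "2" first appears inside "42" on line 2, so this finds mary, not pat
   put cr & line lineOffset("2", tData) of tData after tOut
   answer tOut
end demoLineOffset
```

The second lookup is the failure case: the search string is found as a substring anywhere in the line, not as a whole column value.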

There are various solutions to this problem.

(1) Prepend zeroes. When putting the data in the variable, pad every ID with leading zeroes to a fixed width, e.g. 001, 002, 042. Then match with: lineOffset("002" & tab, tData)

This won't accidentally match 042, because we're searching for 002 followed by a tab. Choose a width that fits your largest ID, or a longer ID such as 1002 would also match. A value in another column that ends in 002 and is followed by a tab would match too.
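
If the data already exists without padding, something like the following could add it. This is only a sketch: the function name padIds and the width of three digits are assumptions for illustration, and it relies on LiveCode's format() function accepting the C-style "%03d" specifier.

```livecode
-- Hypothetical helper: zero-pads the ID in column 1 of every line to 3 digits
function padIds pData
   local tOut
   set the itemDelimiter to tab -- columns are tab-delimited
   repeat for each line tLine in pData
      -- Rebuild the line: padded ID, then the remaining columns unchanged
      put format("%03d", item 1 of tLine) & tab & item 2 to -1 of tLine & cr after tOut
   end repeat
   delete char -1 of tOut -- drop the trailing return
   return tOut
end padIds
```

A lookup would then search the padded copy, for example: put lineOffset("002" & tab, padIds(tData)) into tLineNum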

(2) Match on the cr and a tab. The return character in LC is referenced by either "cr" or "return". So we could try: lineOffset(cr & 2 & tab, cr & tData). The leading cr ensures that we don't match the line that starts with 42, and the trailing tab keeps us from matching lines that start with 21, 25, 28, etc. Prepending "cr &" to the text being searched does two jobs. First, it lets us match the first line, which otherwise has no return character before it. Second, it corrects the line number: the match begins at the return character that ends the previous line, so the result would otherwise be the line *before* the one we want, and the extra leading line shifts it by 1. (Thanks to Klaus for the "cr &" trick.)
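
Wrapped up as a function, the trick might look like this. The name lineOfId and its parameters are hypothetical, and the sketch assumes lineOffset behaves with a leading return as described in this thread.

```livecode
-- Hypothetical wrapper around the "cr &" trick
function lineOfId pId, pData
   -- cr & pData gives line 1 a leading return to match against and,
   -- per the explanation in this thread, lines the result up with pData
   return lineOffset(cr & pId & tab, cr & pData)
end lineOfId
```

Usage: put line lineOfId(2, tData) of tData into theLine. Since lineOffset returns 0 when nothing is found, it is worth checking for 0 before using the result as a line number.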

(3) Use LC's "filter" command. Filter keeps only the lines (or items) of a container that match a pattern. Unless you send the result "into" another container, the original container is modified and the non-matching data is lost.

Code: Select all

filter lines of fld "data" matching "cat*" -- the field will be modified, with non-matching lines erased
filter items of tData matching "cat" into results -- original variable not modified; item must match "cat" exactly since we didn't use *
We can use filter to get the data for our original problem:

Code: Select all

filter lines of tdata matching ("2" &tab& "*") into tResults
or

Code: Select all

filter lines of tData with regex pattern "^2\t" into tResults -- ^ anchors the match to the start of the line, so "42" isn't matched
(Thanks to bwmilby for pointing out the filter command.)
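
As a reusable lookup, the regex form could be wrapped like this. It is a sketch under assumptions: the function name recordForId is made up, and the ID is assumed to be plain digits, since any regex metacharacters in it would be interpreted as part of the pattern.

```livecode
-- Hypothetical helper: returns the first line whose first column is pId
function recordForId pId, pData
   local tMatches
   -- "^" anchors the pattern to the start of the line; "\t" is a tab in the regex
   filter lines of pData with regex pattern ("^" & pId & "\t") into tMatches
   return line 1 of tMatches -- empty if no line starts with that ID
end recordForId
```

Because the result goes "into" tMatches, the data passed in is left untouched. Usage: put recordForId(2, tData) into theLine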

(4) Loop through the lines. By looping through the lines, we can test for an exact match:

Code: Select all

put tExampleStringAbove into tContainer
put findLineOfId(2,tContainer) into lineNumber

function findLineOfId idNum,container
   repeat with loopLineNum = 1 to the number of lines in container
      if line loopLineNum of container begins with idNum & tab then
         return loopLineNum
      end if
   end repeat
   return 0 -- no line starts with this ID
end findLineOfId
For a function that will match *any* field (not just the first one), see AxWald's post below.

(5) Put the data into an associative array. An associative array is an array where the keys are strings rather than integers. For example:

put "blue" into myArray["color"]

We could write a function to loop through the data, storing it in an associative array for future use.

Code: Select all

set the itemDelimiter to tab -- default is comma
repeat for each line theLine in tData
   put item 2 of theLine into myArray[item 1 of theLine]["name"]
   put item 3 of theLine into myArray[item 1 of theLine]["job"]
end repeat
Given an ID, we can then access the data like so:

put myArray[5]["name"] into tName
put myArray[42]["job"] into tJob
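
Looking up an ID that isn't in the array simply yields empty, so a small guard can help. The following is a sketch only; the function name describeId is an assumption, and it expects an array built with the "name" and "job" keys used in the loop above.

```livecode
-- Hypothetical helper: returns "name, job" for an ID, or empty if the ID is unknown
function describeId pId, pArray
   -- A known ID holds a nested array; an unknown ID holds nothing
   if pArray[pId] is not an array then return empty
   return pArray[pId]["name"] & comma & space & pArray[pId]["job"]
end describeId
```

Usage: put describeId(5, myArray) into tSummary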

There are probably other ways to do this.
Last edited by MichaelBluejay on Sat Oct 12, 2019 5:03 pm, edited 14 times in total.

Klaus
Posts: 11262
Joined: Sat Apr 08, 2006 8:41 am
Location: Germany
Contact:

Re: Various ways of matching a line

Post by Klaus » Thu Oct 10, 2019 6:14 pm

(2) Match on the cr also. This method isn't ideal because it comes with more problems. The return character in LC is referenced by either "cr" or "return". So we could try: lineoffset(cr & 2, tData). That ensures that we don't match the line that starts with 42. However, we would accidentally match lines that started with 21, 25, 28, etc. Also, we could never match the first line, because there's no <cr> before it. We'd have to have a blank line at the start. Finally, when matching (cr & something), the line number returned is the line *before* the one that has the data we want, so we'd have to add 1 to the result.
The trick to avoid the addition is:

Code: Select all

...
put lineoffset(cr & 2, CR & tData) into tLineNumber
...

bn
VIP Livecode Opensource Backer
Posts: 3321
Joined: Sun Jan 07, 2007 9:12 pm
Location: Bochum, Germany

Re: Various ways of matching a line

Post by bn » Thu Oct 10, 2019 11:35 pm

Hi Michael,

here is an example to turn your data into an array.
It assumes that your data is in a field. I take the data as it is, treating each line as words separated by spaces.

Code: Select all

1  heather founder
42 mary    engineer
3  john    intern
2  pat     CEO
5  bob     accountant

Code: Select all

on mouseUp
   local tData, tArray
   put field 1 into tData
   
   put tData into tArray
   
   split tArray by return and space
   
   repeat for each line aLine in tData
      put word 2 of aLine into tArray[word 1 of aLine]["name"]
      put word 3 of aLine into tArray[word 1 of aLine]["job"]
   end repeat
   
   answer tArray[2]["name"] & cr & tArray[2]["job"]
end mouseUp
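
For tab-delimited data, the split command on its own already gives a lookup keyed by ID, with the rest of the line as the value. A minimal sketch, with a made-up handler name (demoSplit) and a two-line sample:

```livecode
on demoSplit
   local tData, tArray
   put "3" & tab & "john" & tab & "intern" & cr & \
         "2" & tab & "pat" & tab & "CEO" into tData
   put tData into tArray
   -- Each line becomes one element; the text before the first tab is its key
   split tArray by return and tab
   -- tArray[2] now holds the remaining columns of that line, still tab-delimited
   set the itemDelimiter to tab
   answer item 1 of tArray[2] & cr & item 2 of tArray[2] -- pat, then CEO
end demoSplit
```

This avoids the loop, at the cost of pulling the columns out with item expressions instead of named keys.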
Kind regards
Bernd

bwmilby
Posts: 278
Joined: Wed Jun 07, 2017 5:37 am
Location: Henrico, VA
Contact:

Re: Various ways of matching a line

Post by bwmilby » Fri Oct 11, 2019 2:36 am

I would also check out the filter command. Converting to an array would be good if you needed to repeatedly go into your list and pull out values. If it is more of a one shot type of thing, filter may be faster. You can use a regex to specify a number at the beginning of a line. It can either be destructive (change the initial variable) or put the result into a new container.
Brian Milby

Script Tracker https://github.com/bwmilby/scriptTracker

Thierry
VIP Livecode Opensource Backer
Posts: 660
Joined: Wed Nov 22, 2006 3:42 pm
Location: France
Contact:

Re: Various ways of matching a line

Post by Thierry » Fri Oct 11, 2019 9:01 am

Hi Michael,

Here is a naive but functional example,
using the matchText() function without transforming your input data
and putting the column values into their respective variables...

Code: Select all

/*
1  heather founder
42 mary    engineer
3  john    intern
2  pat     CEO
5  bob     accountant
*/

on test_CSV_MichaelBluejay
   local T
   put line 2 to 6 of the script of me into T
   answer search( T, "id", 2)
   answer search( T, "name", "mary")
   answer search( T, "function", "accountant")
end test_CSV_MichaelBluejay

function search @T, labelColumn, whichOne
   local C1,C2,C3 -- columns
   switch labelColumn
      case "id"
         if matchText( T, "(?m)^" & whichOne & "\s+([^\s]+)\s+(.*?)$", C2, C3) then \
               return format("for id %d get:\nname: %s\nfunction: %s",whichOne, C2,C3)
         break
      case "name"
         if matchText( T, "(?m)^(\d+)\s+" & whichOne & "\s+(.*?)$", C1, C3) then \
               return format("for name %s get:\nid: %d\nfunction: %s",whichOne, C1,C3)
         break
      case "function"
         if matchText( T, "(?m)^(\d+)\s+([^\s]+)\s+" & whichOne & "$", C1, C2) then \
               return format("for function %s get:\nid: %d\nname: %s",whichOne, C1,C2)
         break
      default
         return "Don't understand: " & labelColumn
   end switch
   return "for " &labelColumn &": "& whichOne & " found nothing!"
end search
Regards,

Thierry
Thierry Douez - https://sunny-tdz.com
Why so many notes when it's enough to play the most beautiful ones... [Barbara]

AxWald
Posts: 368
Joined: Thu Mar 06, 2014 2:57 pm

Re: Various ways of matching a line

Post by AxWald » Fri Oct 11, 2019 10:32 am

Hi,
MichaelBluejay wrote:
Thu Oct 10, 2019 5:57 pm

Code: Select all

put line lineoffset(5, tData) into theLine -- returns 5
Nope. Not at all. Didn't you test this? ;-))
  1. First, "put line [aNumber] into theLine" will throw "compilation error (Chunk: missing chunk)" - line x of what?
  2. Second, "line lineoffset(5, tData) of tData" would return "5 bob accountant" - not "5".
MichaelBluejay wrote:
Thu Oct 10, 2019 5:57 pm
There are probably other ways to do this.
Yep. When working with tabular data for some weeks you'll soon realize that "lineOffset()" has only limited use.
As you found out, it returns the number of the first line containing your search string anywhere in it - which is quite useless with short & ubiquitous search strings.
What you'll rather want is a function that finds you a complete match in a certain "field" of your data, like this:

Code: Select all

function findLine what, theStrg, theField, theDelim
   if theField is empty then put 1 into theField     -- field number, default: 1
   if theDelim is empty then put tab into theDelim   -- itemdelimiter, default: tab
   set itemdel to theDelim
   put 0 into myCnt
   repeat for each line myLine in theStrg            -- find first occurrence
      add 1 to myCnt
      if (item theField of myLine) = what then
         return myCnt
      end if                                         -- and return the line number
   end repeat
   return empty                                      -- or report "Nada!"
end findLine
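
A few calls against the sample data from the first post show how the parameters fit together. This is a usage sketch: the handler name demoFindLine is made up, and it assumes the findLine function above is in the same script.

```livecode
on demoFindLine
   local tData
   -- The sample data, with real tab characters between columns
   put "1" & tab & "heather" & tab & "founder" & cr & \
         "42" & tab & "mary" & tab & "engineer" & cr & \
         "3" & tab & "john" & tab & "intern" & cr & \
         "2" & tab & "pat" & tab & "CEO" & cr & \
         "5" & tab & "bob" & tab & "accountant" into tData
   
   answer findLine(2, tData)            -- 4: column 1 of line 4 is exactly 2
   answer findLine("pat", tData, 2)     -- 4: search column 2 instead
   answer findLine("intern", tData, 3)  -- 3: search column 3
   answer findLine(4, tData) is empty   -- true: there is no ID 4
end demoFindLine
```

Because the whole column value is compared, the ID 42 on line 2 no longer gets in the way of a search for 2.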
Have fun!
Livecode programming until the cat hits the fan ...

dunbarx
VIP Livecode Opensource Backer
Posts: 6226
Joined: Wed May 06, 2009 2:28 pm
Location: New York, NY

Re: Various ways of matching a line

Post by dunbarx » Fri Oct 11, 2019 4:05 pm

Michael.

This is not the first time you have posted thoughtful musings on the innards of LC. Keep it up!

Craig

MichaelBluejay
Posts: 155
Joined: Thu Jul 01, 2010 11:50 am

Re: Various ways of matching a line

Post by MichaelBluejay » Fri Oct 11, 2019 5:22 pm

Yes, AxWald, I absolutely should have tested the code before I posted it. Usually I do, but I forgot in this case. My bad; I'll try to remember in future.

Thank you everyone for the thoughtful ideas. I incorporated them into the original post, to make it more like an article.
