Processing name, address, phone & URLs - text (pattern) manipulation libraries for Livecode?
Hi everyone,
I'm wondering if knowledgeable people here are aware of any good text processing libraries for Livecode.
The first problem I'm interested in is processing address information: a user may have copied name, address and URL information, and I want to break this up (as reliably as possible) into its constituent elements.
Examples include recognising cities, countries and phone numbers, rules for breaking up name strings, etc. I'd also be interested in the ability to identify and extract URIs from a text (which might be in HTML format or plain text): website URLs, email addresses, Twitter handles - that sort of thing.
A lot of this is string pattern matching, and I can think of lots of ways of doing it, but it occurs to me that it's a pretty standard problem, so it's likely there is an existing solution.
Perhaps there is a more general text processing library that allows one to specify a set of rules and actions (e.g. processing a set of rules and building up results into a property array).
I can think of ways of doing all of this, but before I start rolling my own solution I thought it worth checking.
~ Rodney
Re: Processing name, address, phone & URLs - text (pattern) manipulation libraries for Livecode?
Hello,
This code extracts all email addresses from a field:
Code: Select all
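on mouseUp
   -- Collect every email-like string found in field 1, one per line
   put field 1 into testo
   put empty into listaEmail
   repeat forever
      -- matchChunk fills inizio/fine with the start and end offsets of the first capture group
      if matchChunk(testo, "((\w|\.)+@(\w|\.)+)", inizio, fine) then
         put char inizio to fine of testo & return after listaEmail
         -- Drop everything up to and including this match, then search the remainder
         delete char 1 to fine of testo
      else
         exit repeat
      end if
   end repeat
   put listaEmail
end mouseUp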
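The same loop can be generalised so that the pattern is passed in, which would also cover the URLs mentioned in the original question. This is only a sketch: the handler name extractMatches and the URL pattern in the usage line are assumptions made for illustration, and the pattern must wrap the whole match in its first capture group.

```livecode
-- Hypothetical helper: return every match of pPattern in pText, one per line.
-- Assumes the first capture group of pPattern wraps the whole match.
function extractMatches pText, pPattern
   local tStart, tEnd, tList
   repeat while matchChunk(pText, pPattern, tStart, tEnd)
      -- Guard against empty matches, which would otherwise loop forever
      if tEnd < tStart then exit repeat
      put char tStart to tEnd of pText & return after tList
      -- Continue searching after the end of this match
      delete char 1 to tEnd of pText
   end repeat
   delete the last char of tList -- remove the trailing return
   return tList
end extractMatches
```

For example, `put extractMatches(field 1, "(https?://[^\s<>]+)")` should list anything that looks like an http or https URL. The pattern is deliberately loose and will pick up trailing punctuation.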
Livecode Wiki: http://livecode.wikia.com
My blog: https://livecode-blogger.blogspot.com
To post code use this: http://tinyurl.com/ogp6d5w
Re: Processing name, address, phone & URLs - text (pattern) manipulation libraries for Livecode?
I don't think you need any libraries, after all:
1. email addresses always have an ampersand (@) in them.
2. URLs always have "www." in them.
3. Telephone numbers usually contain multiple digit numbers.
4. Addresses almost always contain "street"/"avenue"/"boulevard"/"square"/"plaza"/"place"
or their abbreviations.
So . . . if you have, say, comma-delimited text strings containing these things in random order, running each line through a SWITCH statement and then reordering those items in a list field should not be unduly difficult.
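A rough sketch of the SWITCH idea, for illustration only. The handler names guessItemKind and sortItemsByKind and all four patterns are assumptions rather than anything from an existing library, and the heuristics are deliberately naive (the URL test, for instance, also accepts an http:// or https:// prefix and not just "www.").

```livecode
-- Hypothetical helper: guess what kind of data a single item holds.
function guessItemKind pItem
   put word 1 to -1 of pItem into pItem -- trim surrounding whitespace
   switch
      case matchText(pItem, "^[^@\s]+@[^@\s]+\.[^@\s]+$")
         return "email"
      case matchText(pItem, "(?i)^(https?://|www\.)\S+$")
         return "url"
      case matchText(pItem, "^\+?[0-9 ().-]{7,}$")
         return "phone"
      case matchText(pItem, "(?i)\b(street|st|avenue|ave|boulevard|blvd|square|plaza|place)\b")
         return "address"
      default
         return "unknown"
   end switch
end guessItemKind

-- Hypothetical helper: group the comma-delimited items of one line by kind.
function sortItemsByKind pLine
   local tKinds
   repeat for each item tItem in pLine
      put word 1 to -1 of tItem & return after tKinds[guessItemKind(tItem)]
   end repeat
   return tKinds -- array keyed by "email", "url", "phone", "address", "unknown"
end sortItemsByKind
```

The order of the cases matters, because the first test that succeeds wins. Anything the patterns miss ends up under "unknown" rather than being guessed at.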
Re: Processing name, address, phone & URLs - text (pattern) manipulation libraries for Livecode?
richmond62 wrote: ↑Thu Aug 12, 2021 10:37 am
I don't think you need any libraries, after all:
1. email addresses always have an ampersand (@) in them.
2. URLs always have "www." in them.
3. Telephone numbers usually contain multiple digit numbers.
4. Addresses almost always contain "street"/"avenue"/"boulevard"/"square"/"plaza"/"place"
or their abbreviations.
So . . . if you have, say, comma-delimited text strings containing these things in random order
running each line through a SWITCH statement and then reordering those items in a list field should not be
unduly difficult.
erm... ampersand = "&", not "@"
Re: emails - all emails contain the '@' but not all '@' signify an email.
You probably want not only to detect the "@" but also to assess the validity of the email format (for example, stam@gmail is not a valid email, and sometimes people will address someone with an @ handle, for example @Richmond, which is not a valid email either). I've 'borrowed' the algorithm generously provided with the liveCloud starter solutions, which works well:
Code: Select all
function isValidEmailFormat pEmail
   # PURPOSE : returns a boolean describing the validity of the email provided
   return matchText(pEmail,"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$")
end isValidEmailFormat
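As a possible usage sketch, the check can be run over each word of some pasted text to keep only the well-formed addresses. The handler name validEmailsIn is made up for this example, and it assumes isValidEmailFormat is available in the same script.

```livecode
-- Hypothetical helper built on isValidEmailFormat above.
function validEmailsIn pText
   local tList
   repeat for each word tWord in pText
      -- Keep only whitespace-separated words that pass the format check
      if isValidEmailFormat(tWord) then put tWord & return after tList
   end repeat
   delete the last char of tList -- remove the trailing return
   return tList
end validEmailsIn
```

With this approach, words such as stam@gmail or @Richmond are skipped because the pattern is anchored to the whole word. For the same reason an address followed directly by a comma or full stop is rejected as well, so the text may need punctuation stripped first.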
And addresses can vary significantly and not include any of the keywords you mention, so that's not reliable either (for example my street address only consists of the name of a hill in London with no other qualifiers).
Not so straightforward once you dig into the detail...
And besides, I think the OP was asking whether there was a ready-made solution or whether he should roll his own...
Re: Processing name, address, phone & URLs - text (pattern) manipulation libraries for Livecode?
Quote: erm... ampersand = "&", not "@"
Erm, Yes: the 'at' sign; at least in Bulgarian it has a name: кломба.
Re: Processing name, address, phone & URLs - text (pattern) manipulation libraries for Livecode?
That's all Hungarian to me...
There's a word for it in Greek as well: Παπάκι, which means duckling - don't ask me why it's the name of the 'at' sign...