Regex to remove multiple return characters


stam
Posts: 2741
Joined: Sun Jun 04, 2006 9:39 pm
Location: London, UK

Re: Regex to remove multiple return characters

Post by stam » Thu Apr 06, 2023 7:16 pm

Hi Craig,
I'm writing this at work, waiting for a case to arrive, so really not able to test.

If you want to ensure all whitespace chars are considered an 'empty line' if there is no text, then the regex above is fairly foolproof on that (having said that, that too is untested ;) )
I'll have to check at some point, but I'm sure you're right on the 'empty' thing.

S.

Cairoo
Posts: 107
Joined: Wed Dec 05, 2012 5:54 pm

Re: Regex to remove multiple return characters

Post by Cairoo » Thu Apr 06, 2023 7:26 pm

@stam,
I agree with you that the regex "\R" is the correct one.
I should have said, it seems like LiveCode's replaceText function wrongly interprets the regex "\n" as "\r", and doesn't correctly interpret the regex "\r". Sorry about swapping the two.
Here's what convinced me of that:
This code:

Code: Select all

local tText
put cr & cr & cr & "test" & cr & cr & cr into tText
put replaceText(replaceText(replaceText(tText,"^\n",""),"\n\n",""),"\n$","")
removes the empty lines delimited by the CR characters, whereas it should not, because the regex "\n" should not match CR.

And this code:

Code: Select all

local tText
put cr & cr & cr & "test" & cr & cr & cr into tText
put replaceText(replaceText(replaceText(tText,"^\r",""),"\r\r",""),"\r$","")
should remove the CR characters, but doesn't.

Gerrie

PS: Yes, I'm with you on the "filter lines ...", it does work perfectly. I'm only trying to point out what seems to be a bug in the replaceText function.

Cairoo
Posts: 107
Joined: Wed Dec 05, 2012 5:54 pm

Re: Regex to remove multiple return characters

Post by Cairoo » Thu Apr 06, 2023 9:09 pm

Sorry everyone, I found the reason for what seemed to be a bug in the replaceText function, but isn't.

My confusion came as a result of CR and LF both being considered by LiveCode as control character 10. I expected CR to mean control character 13, but according to the LC dictionary, return, CR and LF are synonyms, and they all mean control character 10. It's confusing because one would expect CR to mean "Carriage Return", which is control character 13.

So sorry about confusing you all. With the above in mind, the replaceText function does interpret the "\n" and "\r" regexes correctly. It was me who didn't realize that CR and LF in LiveCode mean the same thing. I wish they didn't, but it's just one of those things that can't be changed.

Gerrie
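
The behaviour described above can be reproduced outside LiveCode. What follows is a minimal sketch in Python rather than LiveCode, assuming (as the post says) that LiveCode's cr constant is character 10, so the test string is modelled with chr(10). The helper name strip_breaks is hypothetical and simply mirrors the three nested replaceText calls from the earlier post.

```python
import re

# Assumption: LiveCode's `cr` constant is character 10 (line feed), per the post above,
# so the test string is modelled here with chr(10).
cr = chr(10)
text = cr * 3 + "test" + cr * 3

def strip_breaks(escape, s):
    """Apply the three nested replacements from the earlier post with one escape."""
    s = re.sub("^" + escape, "", s)        # one break at the very start
    s = re.sub(escape + escape, "", s)     # every adjacent pair of breaks
    s = re.sub(escape + "$", "", s)        # one break at the very end
    return s

with_n = strip_breaks(r"\n", text)   # \n matches character 10
with_r = strip_breaks(r"\r", text)   # \r matches character 13, which the text lacks

print(repr(with_n))
print(repr(with_r))
```

With the text built from character 10, the "\n" version strips every break and the "\r" version leaves the string untouched, which is what the two replaceText experiments showed.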

jameshale
VIP Livecode Opensource Backer
Posts: 474
Joined: Thu Sep 04, 2008 6:23 am
Location: Melbourne Australia

Re: Regex to remove multiple return characters

Post by jameshale » Fri Apr 07, 2023 3:01 am

Goodness I didn't realise there would be so much discussion.
I tried a few of the suggestions and some worked some of the time.
Interestingly, replaceText is meant to do a "replace all" but only does this sometimes.
Anyway, after sleeping on it I realised there was a much simpler solution in my case.
You see, it is I that add all these returns.
I guess it is my age showing, as I tend to program incrementally: after my initial burst I try things out (as one does in LC) and fix any errors. However, these fixes are far too often band-aids. In this particular case I was updating my app to use the responsive layout to replace all the code I had written to check the screen sizes and orientations that could play havoc with my app's appearance. Needless to say, I had many band-aids in place. One such band-aid was to centre a block of text vertically by adding empty lines (return characters) at the beginning of the text block. This works beautifully, but in moving to the responsive layout I inadvertently called this routine twice in one place.
The first solution I thought of was to get rid of the result of the first pass, hence my original question.
On waking today, I realised I only need to reinstate the initial text block for the second pass. So just a one-liner placed appropriately.

LC's regex is a bit funny. I usually test out my regex in BBEdit as it can visually display the "found" chars. A find/replace all of this grep string "^\r" did what I wanted. Unfortunately LC didn't want to play ball. Hence my original question.

Thank you all for your discussion.

stam
Posts: 2741
Joined: Sun Jun 04, 2006 9:39 pm
Location: London, UK

Re: Regex to remove multiple return characters

Post by stam » Fri Apr 07, 2023 5:35 pm

jameshale wrote:
Fri Apr 07, 2023 3:01 am
LC's regex is a bit funny. I usually test out my regex in BBEdit as it can visually display the "found" chars. A find/replace all of this grep string "^\r" did what I wanted. Unfortunately LC didn't want to play ball. Hence my original question.
While we all love BBEdit, for regex I cannot recommend https://regex101.com enough. It's free, it has amazing support for learning and testing regex (including a regex debugger), and it lets you store your various regex code snippets privately or publicly. It can also show you the captured text as well as the captured groups (the groups are what is really needed for matchText). And, importantly for coding in various languages, it supports multiple regex flavours.

jameshale
VIP Livecode Opensource Backer
Posts: 474
Joined: Thu Sep 04, 2008 6:23 am
Location: Melbourne Australia

Re: Regex to remove multiple return characters

Post by jameshale » Sat Apr 08, 2023 1:11 am

@stam, yes it does seem a good tool.
^\r was not a good choice here but…^\R was.
Thanks.
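
To illustrate why "^\r" fails where "^\R" succeeds, here is a rough sketch in Python rather than LiveCode. Python's re module has no \R escape, so the hypothetical ANY_BREAK pattern below stands in for it and covers only the three common line endings (PCRE's \R also matches a few rarer break characters). The function names are made up for the example.

```python
import re

# Hypothetical stand-in for PCRE's \R, limited to the three common line endings.
ANY_BREAK = r"(?:\r\n|\n|\r)"

def drop_leading_breaks(s):
    """Remove every line break at the start of the string, whatever its kind."""
    return re.sub("^" + ANY_BREAK + "+", "", s)

def drop_leading_cr_only(s):
    """Same idea using \\r alone, which matches character 13 only."""
    return re.sub(r"^\r+", "", s)

lf_text = "\n\n\ntext block"        # breaks are character 10, as with LiveCode's return
crlf_text = "\r\n\r\ntext block"    # Windows-style breaks

print(repr(drop_leading_breaks(lf_text)))
print(repr(drop_leading_breaks(crlf_text)))
print(repr(drop_leading_cr_only(lf_text)))
```

The "any break" pattern strips the padding from both strings, while the "\r"-only version leaves the character-10 padding in place, matching what was seen with "^\r" in LiveCode.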
