Delete anything between "<" and ">"

Posted: Sat Nov 02, 2013 5:51 pm
by ARAS
Hello all,
How can I delete everything between the "<" and ">" characters, along with the characters themselves?

For example

Code:

<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table> 
Output
Header 1
Header 2
row 1, cell 1
row 1, cell 2
row 2, cell 1
row 2, cell 2
Thank you

Re: Delete anything between "<" and ">"

Posted: Sat Nov 02, 2013 6:40 pm
by Thierry
Hi,

Code:

on mouseUp
   get replaceText( inputText, "(?m)<[^>]*>", empty)
   filter IT without empty
   put IT into outputText
end mouseUp
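For readers who want to see what each line contributes, here is a commented sketch of the same idea. It assumes the card has two fields named "inputText" and "outputText" (the handler above does not say whether those names are fields or variables), and it drops the (?m) flag because the pattern has no ^ or $ anchors.

```livecode
on mouseUp
   local tText
   -- assumption: two fields named "inputText" and "outputText" exist on the card
   put field "inputText" into tText

   -- step 1: delete every "<...>" run; lines that held only tags become empty
   put replaceText(tText, "<[^>]*>", empty) into tText

   -- step 2: drop the empty lines left behind by step 1
   filter tText without empty

   put tText into field "outputText"
end mouseUp
```

Without the filter step, the text lines would still be correct, but blank lines would remain where the tag-only lines used to be.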
Regards,

Thierry

Re: Delete anything between "<" and ">"

Posted: Sat Nov 02, 2013 8:21 pm
by ARAS
Thank you so much Thierry. It works!!!

Could you explain the code below?

Code:

get replaceText( inputText, "(?m)<[^>]*>", empty)
How did it remove the text between those symbols?

ARAS

Re: Delete anything between "<" and ">"

Posted: Sat Nov 02, 2013 8:37 pm
by Thierry
That's all the magic of regular expressions, which in fact have nothing magical about them :)

OK, let me split the regex:

Code:

(?m)   -> multi-line mode (changes how ^ and $ match; this pattern has no anchors, so it is optional here)
1st <  -> match this character
[^>]   -> match any single character except > (line breaks included)
[^>]*  -> the same, zero or more times
last > -> match this character
Does that make sense?
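To check the breakdown on something small, here is a minimal sketch that runs the pattern against a single line. The sample string and the variable names are made up for illustration; replaceText is the same function used above.

```livecode
on mouseUp
   local tLine, tResult
   -- sample line: one opening tag, some text, one closing tag
   put "<td>row 1, cell 1</td>" into tLine

   -- "<" starts a match, [^>]* consumes "td" and stops before the first ">",
   -- and ">" ends the match; the same happens again for "</td>"
   put replaceText(tLine, "<[^>]*>", empty) into tResult

   answer tResult   -- shows: row 1, cell 1
end mouseUp
```

Because [^>]* can never cross a ">", each match stops at the end of its own tag and the text between tags is left alone.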

HTH,

Thierry

Re: Delete anything between "<" and ">"

Posted: Sat Nov 02, 2013 9:29 pm
by ARAS
Thanks Thierry,

You are a code magician :o

I ran some experiments.

It makes sense now.

I've written up the details of my experiments below. They might be helpful to somebody.

InputText

Code:

<table border="1">
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>

My Input Text
Solution

Code:

get replaceText( inputText, "(?m)<[^>]*>", empty)
Result

Code:

Header 1
Header 2
row 1, cell 1
row 1, cell 2
row 2, cell 1
row 2, cell 2
My Input Text

Experiments
Code 1

Code:

get replaceText( inputText, "(?m)<", empty)
Result

Code:

table border="1">
tr>
th>Header 1/th>
th>Header 2/th>
/tr>
tr>
td>row 1, cell 1/td>
td>row 1, cell 2/td>
/tr>
tr>
td>row 2, cell 1/td>
td>row 2, cell 2/td>
/tr>
/table>
My Input Text

Code 2

Code:

get replaceText( inputText, "[^>]>", empty)
Result

Code:

<table border="1
<t
<tHeader 1</t
<tHeader 2</t
</t
<t
<trow 1, cell 1</t
<trow 1, cell 2</t
</t
<t
<trow 2, cell 1</t
<trow 2, cell 2</t
</t
</tabl
My Input Text
Code 3

Code:

get replaceText( inputText, "[^>]*>", empty)
Result

Code:

My Input Text
Code 4

Code:

get replaceText( inputText, "(?m)<[^>]*>", "A")
Result

Code:

A
A
AHeader 1A
AHeader 2A
A
A
Arow 1, cell 1A
Arow 1, cell 2A
A
A
Arow 2, cell 1A
Arow 2, cell 2A
A
A
My Input Text
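A possible reading of the Code 3 result, for anyone wondering why the cell text disappeared as well: without the leading "<", the [^>]* part is free to start anywhere, so it also swallows the text between one tag and the next. The sketch below compares the two patterns on a short made-up line; the variable names are only for illustration.

```livecode
on mouseUp
   local tText, tWithBracket, tWithoutBracket
   -- made-up one-line sample, shorter than the table above
   put "<th>Header 1</th>" into tText

   -- pattern starts with "<": only "<th>" and "</th>" are matched and removed
   put replaceText(tText, "<[^>]*>", empty) into tWithBracket

   -- no leading "<": the first match is "<th>", the second is "Header 1</th>",
   -- so the visible text is deleted together with the tags
   put replaceText(tText, "[^>]*>", empty) into tWithoutBracket

   answer "with <: [" & tWithBracket & "]" & return & "without <: [" & tWithoutBracket & "]"
end mouseUp
```

The dialog shows "Header 1" inside the first pair of brackets and nothing inside the second. In the full experiment, "My Input Text" survived only because no ">" follows it.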
Best wishes,
ARAS