Parsing scripts for comments
Moderators: FourthWorld, heatherlaine, Klaus, kevinmiller
I'm doing a stack that analyses text files of scripts from other stacks.
What is the simplest way to detect a comment?
I don't want to have to compare for "#" or "//" or "--" or "/*" separately, so I'm sure that there must be a better way.
Any hints?
I'm sure Thierry could make a regex - but I'm not sure I could understand it
Kelly
-
- VIP Livecode Opensource Backer
- Posts: 9663
- Joined: Wed May 06, 2009 2:28 pm
- Location: New York, NY
Re: Parsing scripts for comments
Hi.
> I don't want to have to compare for "#" or "//" or "--" or "/*" separately,
Why not? I assume you know that something like this is required (pseudo):
Code: Select all
repeat with y = 1 to the number of lines of yourScript
   if "--" or "#" or "/" are in the first few non-space chars in line y of yourScript then
      put y & return after accum
Regex can do anything. I never use it.
But regex notwithstanding, what turns you off the above sort of ordinary LC gadgetry?
Craig
Re: Parsing scripts for comments
> repeat with y = 1 to the number of lines of yourScript
> if "--" or "#" or "/" are in the first few non-space chars in line y of yourScript then
> put y & return after accum
I don't think that sort of pseudo code would work. What I don't like the look of is
Code: Select all
if ("--" is in line1) OR \
   ("#" is in line1) OR \
   et. ad nauseam ...
Re: Parsing scripts for comments
Still not sure what the issue is. In a field 1 with:
aasd
-- abcdefg
kjkjhj
// typo
xxxx
# liveCode
dgdg
And this in a button script:
Code: Select all
on mouseUp
   -- copy the field text into the local variable "it"
   get fld 1
   repeat with y = 1 to the number of lines of it
      -- flag the line if its first character is "/", "#" or "-"
      if char 1 of line y of it is in "/#-" then put y & return after accum
   end repeat
   answer accum
end mouseUp
One gets 2, 4 and 6, the lines that are comments.
In a sense, this is very mundane, not nearly as sexy as some regex outrage. And there may be some tweaking needed, for example if spaces precede the comment chars. Also, in any LiveCode session, one must choose a single comment character string, so the fact that all three are present in the example above is overkill.
But I find this simple and ordinary, if unexciting, and ask again what disturbs you about doing something similar to this.
Craig
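The tweak mentioned in this post (comment characters preceded by spaces) could be handled roughly as follows. This is only a sketch: the handler name commentLineNumbers and its parameter pScript are made up for illustration, and it still looks only at the start of each line, so it ignores end-of-line and /* */ comments.

```livecode
-- Hypothetical helper: returns the numbers of the lines whose first
-- non-blank text starts with "--", "#" or "//", one number per line.
function commentLineNumbers pScript
   local tList, tLineNum, tFirstWord
   put 0 into tLineNum
   repeat for each line tLine in pScript
      add 1 to tLineNum
      -- "word 1" skips any leading spaces or tabs
      put word 1 of tLine into tFirstWord
      if tFirstWord begins with "--" or tFirstWord begins with "#" \
            or tFirstWord begins with "//" then
         put tLineNum & return after tList
      end if
   end repeat
   delete the last char of tList -- drop the trailing return
   return tList
end function
```

Called as `answer commentLineNumbers(fld 1)`, it should report the same lines as the button script for the sample field, while also accepting indented comments. Testing for the two-character markers "--" and "//" rather than a single "-" or "/" avoids flagging lines that merely start with a minus sign or a slash.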
Re: Parsing scripts for comments
Or just use regex;)
Literally this was discussed a couple of days ago here: viewtopic.php?f=7&t=35679&sid=3e7c85626 ... a61#p20405
The limitation in Craig’s solution is that you can’t detect multi-line comments delimited by /* and */, keeping in mind these can be used to also comment in the middle of a line. So just parsing the start of a line won’t do. Also, even for single line comments I tend to add these to the end of a line...
You can of course do all of this with “normal” LC script, but that rapidly produces a more complex and lengthy script where regex would be a lot simpler. In the example in the link above, your delim1 would be “//”, “#”, “--” or “/*”, and your delim2 would be return or “*/” respectively.
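To give a feel for what the “normal” LC script route involves, here is a rough sketch of a character-by-character scanner that collects end-of-line comments and /* */ blocks wherever they occur. Everything in it is an assumption made for illustration: the handler name listComments, the mode names, and the choice to skip anything inside double quotes so that comment markers in string literals are not picked up. It also assumes the script text uses return as its line ending.

```livecode
-- Hypothetical helper: returns an array of the comments found in pScript,
-- keyed 1..n in order of appearance.
function listComments pScript
   local tMode, tChar, tPair, tBuffer, tStart, tCount, tComments, i
   put "code" into tMode
   put 0 into tCount
   repeat with i = 1 to the number of chars of pScript
      put char i of pScript into tChar
      put tChar & char (i + 1) of pScript into tPair
      switch tMode
         case "code"
            if tChar is quote then
               put "string" into tMode
            else if tPair is "/*" then
               put "block" into tMode
               put i into tStart
               put tChar into tBuffer
            else if tChar is "#" or tPair is "--" or tPair is "//" then
               put "line" into tMode
               put tChar into tBuffer
            end if
            break
         case "string"
            -- a string literal ends at the next quote or at the end of the line
            if tChar is quote or tChar is return then put "code" into tMode
            break
         case "line"
            if tChar is return then
               add 1 to tCount
               put tBuffer into tComments[tCount]
               put "code" into tMode
            else
               put tChar after tBuffer
            end if
            break
         case "block"
            put tChar after tBuffer
            -- the closing "*/" must not reuse the "*" of the opening "/*"
            if i >= tStart + 3 and char (i - 1) to i of pScript is "*/" then
               add 1 to tCount
               put tBuffer into tComments[tCount]
               put "code" into tMode
            end if
            break
      end switch
   end repeat
   -- keep a comment that runs to the end of the text
   if tMode is "line" or tMode is "block" then
      add 1 to tCount
      put tBuffer into tComments[tCount]
   end if
   return tComments
end function
```

The four modes correspond to the delim1/delim2 pairs described in the post: a line comment runs from its marker to return, and a block comment runs from “/*” to “*/”. The length of even this simplified version illustrates the point about script-only solutions growing quickly.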
Re: Parsing scripts for comments
> I'm sure Thierry could make a regex - but I'm not sure I could understand it
Then, what's the use of making one?
> Regex can do anything. I never use it.
Almost my friend, almost...
Happy Easter,
Thierry
SUNNY-TDZ.COM doesn't belong to me since 2021.
To contact me, use the Private messages. Merci.
Re: Parsing scripts for comments
What Stam says about limitations is all true, as is the extra work needed to cover all such variants of "ordinary" commenting.
But I never knew, or forgot, that the tags /* and */ can be used inside a working line of code. I would never do this, but cool to know...
Craig
Re: Parsing scripts for comments
The script collection I am working with uses all four variants, with and without a space between the delimiter and the text, as well as extensive multiline comments, ALL within ONE script. The script is 25K, so there are lots of things to comment on, but the stylistic variation really shows how flexible the LiveCode parser is. Conceptually, it may be simple, Craig, but the devil is in the details, as always!
In the end, I lifted some code from the GXL2 editor and made it work for my situation, and it seems to be good enough for what I need.
Thanks for the suggestions.
PS Thierry - don't give up. On another project I actually used a regex wildcard on a filter and it worked a treat in less than 10 characters. I just have to keep practicing so that I don't forget it before I need it again.
Re: Parsing scripts for comments
Here is one way with regex.
The regex is at the top, the text to parse is on the left, and the result is on the right (black background).
Code: Select all
if sunnYmatchAll( T, rex, A, N, "both") then
   repeat for each key K in A
      if A[ K][ 1][ 2] is not empty then
         get colorString
      else if A[ K][ 2][ 2] is not empty then
         get colorComment
      end if
      put A[ K][ 0] into Z
      set the forecolor of char Z[ 0] to Z[ 1] of fld "fOUT" to IT
   end repeat
end if
Thierry
PS: there are techniques for writing regexes so that they are easier to read...
Last edited by Thierry on Fri Apr 09, 2021 6:16 am, edited 1 time in total.
Re: Parsing scripts for comments
Dear Thierry, that is one very fine and mind-bending piece of regex.
-------------------
edit: This works really well and picks out all comments, but it also picks up any text within quotation marks. Was this the intent?
Re: Parsing scripts for comments
Hi Stam,
Yes, that's how it works without using my sunnYrex library.
This can be avoided, but only if you can manage back references
in the replacement text.
The reason behind this is that I need to parse strings to avoid false positives;
e.g. " xxxxx -- # // not a comment "
In my demo, I use this to colorize the strings and the comments with 2 different colors;
thus the proof that we know whether it is a comment or a string.
It's possible to filter the resulting array for comments only.
I'll see if I can find some free time to make a variant of this code
to achieve what you would like to have.
And finally, this was an interesting exercise
and at the same time a response to Kelly's OP.
Be well,
Thierry
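The idea of matching strings as well as comments, then keeping only the comments, can be sketched with LiveCode's built-in matchChunk function. This is not Thierry's regex (which is not reproduced in the thread) and it does not use his library; the pattern, the handler name regexComments and the variable names are all assumptions made for illustration.

```livecode
-- Hypothetical helper: returns an array of comments, keyed 1..n.
-- Quoted strings are matched too, so that markers inside them are
-- consumed as part of the string, and are then discarded.
function regexComments pScript
   local tRegex, tFrom, tStart, tEnd, tMatch, tCount, tComments
   -- one capture group: a quoted string OR one of the four comment forms
   put "(" & quote & "[^" & quote & "\n]*" & quote & "?" & \
         "|#[^\n]*|--[^\n]*|//[^\n]*|/\*[\s\S]*?\*/)" into tRegex
   put 1 into tFrom
   put 0 into tCount
   repeat while matchChunk(char tFrom to -1 of pScript, tRegex, tStart, tEnd)
      -- tStart and tEnd are relative to the substring that was searched
      put char (tFrom + tStart - 1) to (tFrom + tEnd - 1) of pScript into tMatch
      if char 1 of tMatch is not quote then
         add 1 to tCount
         put tMatch into tComments[tCount]
      end if
      -- continue searching just after this match
      put tFrom + tEnd into tFrom
   end repeat
   return tComments
end function
```

Because the search always takes the leftmost match, a line such as " xxxxx -- # // not a comment " is consumed whole as a string and never reaches the comment alternatives. Filtering on the first character of each match is the simple stand-in used here for "filter the resulting array for comments only".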
-
- VIP Livecode Opensource Backer
- Posts: 9837
- Joined: Sat Apr 08, 2006 7:05 am
- Location: Los Angeles
Re: Parsing scripts for comments
An intriguing challenge, Kelly. What are you going to do with the output?
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
Re: Parsing scripts for comments
I'm trying to make charts like this to understand the program flow after parsing the text files generated by Brian Milby's invaluable ScriptTracker:
Do you know of any algorithmic way of sorting out the lines? Right now I manually adjust the nodes and then save their positions so that they will be "pretty" when the script is reopened subsequently.
Re: Parsing scripts for comments
Hi.
When you say "sorting out the lines", what do you mean? I posted a small stack last week in another thread that allows one to move the various nodes and keep the connecting lines intact:
But I am not sure this is what you meant.
Craig