Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 5:53 pm
by Simon Knight
Hi,

I wish to change some file names from camel case to delimited words, e.g.
MyGreatLivecodeProject.livecode changes to My-Great-Livecode-Project.livecode
The only method I can see, having looked in the dictionary, is to examine the ASCII character values: a value greater than 96 (decimal) means lower case.

However, before I start writing code I thought I would ask if there is a more elegant way of inserting the delimiter character.

best wishes

S

Re: Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 6:06 pm
by richmond62
Chop the "long thing" up into individual words and pop them in a list, then run through the list bunging
a hyphen before each one.

I'll try and have a go at it after supper.

Re: Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 6:39 pm
by dunbarx
Hi.

I am assuming you already have the camel case string. Otherwise, Richmond is right, head it off at the pass.

You have to loop through each char. If you find a capital letter, you then have to find the next one, or there will be no way to parse the word the first capital letter belonged to. With these start/end pairs, you can then add your delimiter.

Craig

Re: Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 7:14 pm
by richmond62
Of course, being a sadistic bast*rd I wonder how you'll deal with

АзСъмЕдинШотландицОтПловдив :D

For those of you who care (cough, cough) that says, in Bulgarian, "I'm a Scotsman from Plovdiv"

Re: Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 7:26 pm
by Simon Knight
OK I surrender.....

Yes the filenames exist and use camel case so the problem is finding the word boundaries. I was hoping for a command that could be used in a statement like "if char n is a capital then..." but fear I will have to resort to ascii values.
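
For what it's worth, a test along the lines of "if char n is a capital" can be sketched without ASCII values by comparing a character with its own upper- and lower-case forms. This assumes LiveCode 7 or later, where toUpper and toLower are expected to handle Unicode text; isCapital is a hypothetical helper name, not a built-in:

```livecode
-- Hypothetical helper (not a built-in): true if pChar is an upper-case letter.
function isCapital pChar
   -- caseSensitive is local to this handler, so "A" and "a" compare as different here
   set the caseSensitive to true
   -- A capital equals its upper-case form but differs from its lower-case form.
   -- Digits and punctuation equal both forms, so they are rejected.
   return (pChar is toUpper(pChar)) and (pChar is not toLower(pChar))
end isCapital
```

It would be called as "if isCapital(char n of tName) then ...", where tName is whatever variable holds the file name.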

And yes my file names are mostly in English.

best wishes

S

Re: Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 7:46 pm
by richmond62
[Attachment: Screenshot 2020-11-23 at 20.41.16.png]
I had a very good supper.

Re: Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 8:20 pm
by dunbarx
Simon.

Are you dealing with existing camel case strings, or building them yourself?

Richmond is all about building them yourself. I am all about parsing existing strings.

Craig

Re: Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 8:33 pm
by richmond62
dunbarx wrote:
Richmond is all about building them yourself.
I have been building myself for years: that's one of the reasons I look like a concrete monstrosity. 8)

My steam-punk stack will "do-the-do" with any camel case string you throw at it.

OUCH!

No it won't . . . ONLY those using the BASIC LATIN set:

https://www.unicode.org/charts/PDF/U0000.pdf

That's ASCII to living fossils like myself.

Re: Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 10:10 pm
by Simon Knight
Great stuff - so I'm in a new gang - the living fossils!

I love your variable names - have you ever worked in cryptology - fDICEDCARROTS, LYNE ?!!!

Thanks for the stack and reminding me of codepoints - bloody UTF almost as bad as XML and CSV. (There, that will start a fight).

Craig - yes, I already have the camel case text, as I used it in many file names, and I have decided to rename the files with a dash delimiting the key words. The key words act as universal tags that are visible in any OS that allows long file names, unlike extended attributes (I'm on a Mac at the moment).

best wishes

S

Re: Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 11:33 pm
by SparkOut
Simon Knight wrote:
Mon Nov 23, 2020 10:10 pm
bloody UTF almost as bad as XML and CSV. (There, that will start a fight).
or at least, elicit comment. UTF should have been implemented at the start, and been the de facto "ASCII" without the constant reworking to become an emoji repository. There. Now let the punches fly.

Re: Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 11:37 pm
by dunbarx
Simon.

Have I lost track of the original question?

Didn't you want to parse words out of a string, based on the positions of Capital letters within that string? And reformat it with delimiter characters between those words? This seems like a ten-line handler, but I am not sure of the direction anymore.

Craig

Re: Identifying words in camel case phrases

Posted: Mon Nov 23, 2020 11:46 pm
by SparkOut
I think the simple upper/lower case word delimiter based on 7 bit ASCII encoding was the original surmise, but got subverted by questions about non-English or other extended character set considerations.

Re: Identifying words in camel case phrases

Posted: Tue Nov 24, 2020 12:29 am
by dunbarx
Aha.

Just as I thought.

Simon, you can turn this into a function if you want, but if you have your camelCase string in field 1 and this in a button script:

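Code:

on mouseUp
   get fld 1
   -- start at 1 so the first pass tests char 2; a leading capital gets no hyphen
   put 1 into y
   repeat the number of chars of it
      add 1 to y
      -- ASCII 65 to 90 is "A" to "Z"
      if charToNum(char y of it) >= 65 and charToNum(char y of it) <= 90 then
         put "-" before char y of it
         -- step past the inserted hyphen so y points at the capital again
         add 1 to y
      end if
   end repeat
   answer it
end mouseUp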
Craig
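
The handler above can be wrapped as a function, as suggested. A minimal sketch, assuming the same A to Z (ASCII 65 to 90) test; the function name camelToDelimited and the optional pDelim parameter are made up for illustration:

```livecode
-- Hypothetical function: puts pDelim before every capital A-Z except a leading one.
function camelToDelimited pText, pDelim
   local tResult, tChar, tCode
   -- default to a hyphen when no delimiter is passed
   if pDelim is empty then put "-" into pDelim
   repeat for each char tChar in pText
      put charToNum(tChar) into tCode
      -- ASCII 65 to 90 is "A" to "Z", the same range the handler above tests
      if (tResult is not empty) and (tCode >= 65) and (tCode <= 90) then
         put pDelim after tResult
      end if
      put tChar after tResult
   end repeat
   return tResult
end camelToDelimited
```

Calling camelToDelimited("MyGreatLivecodeProject.livecode") should give My-Great-Livecode-Project.livecode. Building a new string avoids adjusting an index after each insertion. Like the handler above, it only recognises Basic Latin capitals, and a run of capitals such as "XMLFile" would be split letter by letter.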

Re: Identifying words in camel case phrases

Posted: Tue Nov 24, 2020 7:43 am
by richmond62
SparkOut wrote:
got subverted
I'm yer man. :D