I am writing a script to search through a body of text divided into paragraphs.
It searches for each character of the alphabet and returns a value that states the character, the paragraph number and the character number (e.g. a=1,2 states that a is the second character of the first paragraph). I have scripted it to repeat the search a chosen number of times for each character and list the results by character, e.g.:
a=1,2 a=2,5 a=3,7
b=2,8 b=2,50 b=3,1
etc.
All this is working well. Box ticked.
However, for the less common characters (q, z, etc.) I am getting repeated answers, such as q=1,200 q=1,200 q=1,200.
What I want is that where a character appears only once, only one entry is output.
In my method below, I find the character, record its index, and replace it with a "?" before repeating. On the next pass the character is no longer there, so it can't be found again, but the placeholder keeps the character counts correct for the other letters. In theory, if the script fails to find a character, it should just exit the loop rather than repeat the output. It currently outputs nothing for z in my sample text, because z doesn't occur in it. The repeated output only happens for characters that do appear somewhere in the text.
Hope that makes sense...
What is happening? Where did I go wrong?
Thanks in advance team.
Code:
on getNumbers
   --clear out the output field--
   put empty into fld "Table"
   --put the sample text into a container--
   put fld "InputText" into tText
   --delete all common punctuation--
   replace space with empty in tText
   replace "." with empty in tText
   replace "," with empty in tText
   replace ";" with empty in tText
   replace "?" with empty in tText
   replace "-" with empty in tText
   replace "!" with empty in tText
   replace quote with empty in tText
   replace "'" with empty in tText
   --establish alphabet to use... is there a better way of doing this?--
   put "abcdefghijklmnopqrstuvwxyz" into tAlphabet
   --work out how many entries to output--
   put field "NoOfTimes" into z
   --go through alphabet one letter at a time--
   repeat for each char y in tAlphabet
      --do the search for amount of times of the entries required--
      repeat z times
         --work out what the character number is--
         get offset(y,tText)
         --if there isn't an instance of the character, exit--
         if it = 0 then exit repeat
         --hold the character index number for use in output--
         put it into tCharOffset
         --repeat the above process to find the paragraph index--
         get paragraphOffset(y,tText)
         if it = 0 then exit repeat
         put it into tParaOffset
         --remove the character so it doesn't get searched again in the next repeat and replace with a placeholder so it gets counted--
         replace y with "?" in character tCharOffset of paragraph tParaOffset in tText
         --output the character, paragraph number and character number--
         put space & y & "=" & tParaOffset & comma & tCharOffset after tOutput
      end repeat
      --put a return after the series of output for each character--
      put cr after tOutput
   end repeat
   --delete rogue character at start--
   delete character 1 of tOutput
   --show all of the output in user-readable form--
   put tOutput into fld "Table"
end getNumbers
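
For comparison, here is a minimal sketch of the behaviour I am after that does not modify the text at all. Instead of replacing each hit with a placeholder, it uses the optional third parameter of offset (the number of characters to skip) to carry on searching after the previous hit. The function name listPositions and its parameter names are hypothetical and not part of the script above. It assumes paragraphs are separated by return characters (so it loops over lines), and it reports the character position within each paragraph rather than within the whole text. It also does not strip punctuation first.

```livecode
-- Hypothetical helper (not part of the original script):
-- returns up to pMax entries of the form char=paragraph,charInParagraph
-- for pChar in pText, without changing pText.
function listPositions pChar, pText, pMax
   local tOutput, tParaNum, tSkip, tFound, tCount
   put 0 into tParaNum
   put 0 into tCount
   -- assumption: one paragraph per line (return-delimited)
   repeat for each line tPara in pText
      add 1 to tParaNum
      put 0 into tSkip
      repeat while tCount < pMax
         -- offset counts from the character after the skipped ones
         put offset(pChar, tPara, tSkip) into tFound
         -- no further match in this paragraph: move on to the next one
         if tFound = 0 then exit repeat
         -- tSkip now holds the absolute position within the paragraph
         add tFound to tSkip
         add 1 to tCount
         put space & pChar & "=" & tParaNum & comma & tSkip after tOutput
      end repeat
      if tCount >= pMax then exit repeat
   end repeat
   -- drop the leading space
   delete char 1 of tOutput
   return tOutput
end listPositions
```

As a usage example, listPositions("q", "quiet" & cr & "aqua", 3) should give q=1,1 q=2,2, i.e. two entries even though three were requested, because q only occurs twice. Because nothing is replaced in the text, a character that occurs once can only be listed once.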