code writing standards

Post by dhobbs » Sat Mar 05, 2011 5:45 pm

Hi,
I've been working on a short program that calculates the formula weight of a given molecular formula. Conceptually it is very easy, and I've done the same thing using Java and Python, but as practice I'm trying to use LiveCode to do the same thing. Input is a formula like: "C6H5Cl1". In the past I've parsed the input into linked lists, such as: (C,H,Cl) and (6,5,1). I then use the position in the linked lists to eventually calculate the formula weight after searching another set of lists with the periodic table in them.

What I have done so far works, and it is fast enough for my uses. What I am interested in knowing is if there is a more efficient way of doing the same thing, and if there are standard code writing protocols that I'm not observing that would make the code better. For example, I could use an array instead of two lists, but I don't know if speed or readability would improve.
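For reference, the array alternative mentioned here could look something like the sketch below. The function name atomicWeight and the variable tWeights are made up for illustration; the weights are the ones used in this thread. An array keyed by symbol removes the need to keep two lists in step, which mostly helps readability. For seven elements any speed difference is unlikely to matter.

```livecode
-- Sketch only: atomicWeight and tWeights are hypothetical names.
function atomicWeight pSymbol
   local tWeights
   -- one array keyed by element symbol replaces the two parallel lists
   put 12.01115 into tWeights["C"]
   put 1.00794 into tWeights["H"]
   put 14.0067 into tWeights["N"]
   put 15.9994 into tWeights["O"]
   put 35.453 into tWeights["Cl"]
   put 79.904 into tWeights["Br"]
   put 32.064 into tWeights["S"]
   -- a symbol that has no entry comes back as empty
   return tWeights[pSymbol]
end atomicWeight
```

A call such as atomicWeight("Cl") looks up the chlorine entry directly. Because an unknown symbol returns empty, it is worth checking the result before multiplying.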

Code:

on mouseUp
   put ("C","H","N","O","Cl","Br","S") into gElemDict  --list of element symbols
   put (12.01115,1.00794,14.0067,15.9994,35.453,79.904,32.064) into gNumDict  --list of atomic weights of elements
   put toUpper (text of fld "formula") into tMyVar  --simplifies formatting
   put "" into tElemList
   put "" into tNumList
   put "letter" into tCharType
   repeat with i=1 to (the length of tMyVar) 
      put char 1 of tMyVar into tElem
      delete char 1 of tMyVar
      if matchText (tElem, "[A-Z]") is true then 
         if tCharType is "number" then put comma after tNumList
         put tElem after tElemList
         put "letter" into tCharType
      else if isNumber (tElem) is true then 
         if tCharType is "letter" then put comma after tElemList
         put "number" into tCharType
         put tElem after tNumList
      end if
   end repeat
   delete the last char of tElemList
   repeat with i=1 to (number of items in tElemList)
      get item i of tElemList
      if (the length of it) = 2 then    --element symbols can only be two letters, the second one is lower case by convention
         put toLower (char 2 of it) into char 2 of item i of tElemList
      end if
   end repeat
   put tElemList into fld "Field1"
   put tNumList into fld "Field2"
   
   set the wholeMatches to true  --itemOffset must match whole items, so "C" cannot match inside "Cl"
   put 0 into tTemp
   repeat with i=1 to (number of items in tElemList)
      put item i of tElemList into tElem
      put item (itemOffset (tElem, gElemDict)) of gNumDict into tAtomicMass
      put item i of tNumList into tNum
      put (tNum * tAtomicMass) + tTemp into tTemp
   end repeat
   put tTemp into fld "Field3"
end mouseUp
The output of this code is a single number, the sum of all the element weights. Next I will calculate the percent composition of each element, but that will be easy from here.
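The percent composition step could be sketched roughly as follows. Each element's share is 100 × (count × atomic weight) / formula weight. The name percentComposition and its parameters are hypothetical; the function assumes it is handed the same kind of comma-separated lists the handler above builds (symbols, counts, and the two lookup lists).

```livecode
-- Sketch only: percentComposition and its parameter names are hypothetical.
function percentComposition pElemList, pNumList, pElemDict, pNumDict
   local tMasses, tTotal, tMass, tResult, i
   set the wholeMatches to true  -- so "C" cannot match inside "Cl"
   put 0 into tTotal
   -- first pass: mass contributed by each element, plus the grand total
   repeat with i = 1 to the number of items in pElemList
      put item (itemOffset(item i of pElemList, pElemDict)) of pNumDict into tMass
      put tMass * (item i of pNumList) into item i of tMasses
      add (item i of tMasses) to tTotal
   end repeat
   if tTotal = 0 then return empty  -- nothing to divide by
   -- second pass: one "symbol,percent" line per element
   repeat with i = 1 to the number of items in pElemList
      put (item i of pElemList) & comma & round(100 * (item i of tMasses) / tTotal, 2) & cr after tResult
   end repeat
   delete the last char of tResult  -- drop the trailing line break
   return tResult
end percentComposition
```

Inside the handler above it could be called as percentComposition(tElemList, tNumList, gElemDict, gNumDict), with the result put into another field.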

Any comments? I made it work (mostly by trial and error), but would appreciate any feedback to help me program better.

Thanks,

--Doug

Post by dunbarx » Sat Mar 05, 2011 9:43 pm

Parsing exercises are always fun, and frequently exasperating. It is sometimes a matter of style where verbosity and readability are in either conflict or cahoots. I bet if you read your code, you could find ways to shorten it without losing anything. You obviously are capable.

Anyway, I wrote this quickly:

Code:

on mouseup
   put ("C","H","N","O","Cl","Br","S") into elemList  --list of element symbols
   put (12.01115,1.00794,14.0067,15.9994,35.453,79.904,32.064) into weightList  --matching atomic weights
   put fld "formula" into formula --source for chemical formula in this field
   repeat with y = 1 to the length of formula
      if char y of formula is not a number then
         put char y of formula after temp
         put comma after subscripts
      else
         put comma after temp
         put char y of formula after subscripts
      end if
   end repeat
   
   replace ",," with comma in temp
   replace ",," with comma in subscripts
   delete char 1 of subscripts 
   
   repeat with y = 1 to the number of items of temp
      add (item itemOffset(item y of temp,elemList) of weightList) * (item y of subscripts) to accum
   end repeat
   answer accum
end mouseup
The housekeeping of all those commas might be seen as cleaning up sloppy coding. Maybe. The single line in the last repeat loop might be seen as overly compact. Maybe.

I have a chemistry background. The parsing is difficult if you do not have a subscript after each element, which is not ordinary practice. CO2 could be read as "cobalt2". The above script requires CO2 to be written as C1O2. So did yours. You could test for a lower case letter in the second position, and hope that the format is rock solid. Thus "CO2" would be distinguished from "Co2". I will fool with it some more...
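A minimal sketch of that lower-case test, with a missing subscript treated as 1, might look like this. The name parseFormula is hypothetical, and the sketch assumes the formula is typed with conventional capitalization (it must not be run through toUpper first). A capital letter starts a new element, a lower-case letter extends the current symbol, and digits build the subscript.

```livecode
-- Sketch only: parseFormula is a hypothetical name.
-- Returns two lines: the symbols on line 1, the subscripts on line 2.
function parseFormula pFormula
   local tSymbols, tCounts, tChar, tSymbol, tCount
   set the caseSensitive to true
   put empty into tSymbol
   put empty into tCount
   repeat for each char tChar in pFormula
      if matchText(tChar, "[A-Z]") then
         -- a capital letter starts a new element; store the previous one first
         if tSymbol is not empty then
            if tCount is empty then put 1 into tCount  -- no subscript means 1
            put tSymbol & comma after tSymbols
            put tCount & comma after tCounts
         end if
         put tChar into tSymbol
         put empty into tCount
      else if matchText(tChar, "[a-z]") then
         put tChar after tSymbol  -- second letter of a symbol such as "Cl"
      else if matchText(tChar, "[0-9]") then
         put tChar after tCount  -- subscripts may have several digits
      end if
   end repeat
   -- store the final element
   if tSymbol is not empty then
      if tCount is empty then put 1 into tCount
      put tSymbol after tSymbols
      put tCount after tCounts
   end if
   return tSymbols & cr & tCounts
end parseFormula
```

With this approach "CO2" splits into C and O with subscripts 1 and 2, while "Co2" stays a single symbol with subscript 2. Any other character is silently skipped, so real input checking would still be needed.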

Craig Newman

Post by dhobbs » Sat Mar 05, 2011 10:50 pm

Craig,
Thanks for the snippet. I wonder if we are pretty much doing the same thing. Of course, I have to enforce some restrictions on input (must start with a letter, every element must only contain two letters but integers can have any number). I'm not worried about the error catching code just yet. One item I particularly liked about your example is that it catches the double comma which arises when two characters of the same type are sequential. I solved this issue using tCharType, which would switch from one category to the other after each transition, but then had to delete an extra trailing comma from the element list.

But then I got confused and wondered if I should put the whole thing in a repeating switch statement, focussing on letter/number transitions.

Just curious, what sort of chemistry is in your background? I am a medicinal chemist and I am using this as the first step in a program that will compute masses (and exact masses) for use in Mass Spectroscopy interpretation.

--Doug

Post by dunbarx » Sun Mar 06, 2011 1:09 am

Doug

If every element had two letters, the parsing would be trivial. I know you did not mean that.

I bet the comma thing could be made more elegant with better loops. Not sure if this is worth the effort.

I think the lower case thing ought to work; nobody who is anybody writes cobalt as "CO", and anyway there are not that many two letter combinations that could be confused in that way. Sulfur iodide mistaken for silicon? You could even have an alert if one of the relatively few combinations that might be ambiguous were entered.
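One way such an alert might be sketched: a two-letter combination is ambiguous when it is a known two-letter symbol and both of its letters are also one-letter symbols. The name isAmbiguousPair is hypothetical, and pSymbols is assumed to be a comma-separated list of whatever symbols the program knows about.

```livecode
-- Sketch only: isAmbiguousPair is a hypothetical name.
-- pPair is two letters whose capitalization cannot be trusted, e.g. "SI".
function isAmbiguousPair pPair, pSymbols
   local tTwoLetter, tFirst, tSecond, tIsElement, tIsPair
   set the wholeMatches to true   -- compare whole items only
   set the caseSensitive to true  -- "Co" and "CO" must not be treated alike
   put toUpper(char 1 of pPair) & toLower(char 2 of pPair) into tTwoLetter
   put toUpper(char 1 of pPair) into tFirst
   put toUpper(char 2 of pPair) into tSecond
   -- reading 1: a single two-letter element such as "Si"
   put (itemOffset(tTwoLetter, pSymbols) > 0) into tIsElement
   -- reading 2: two one-letter elements such as "S" followed by "I"
   put (itemOffset(tFirst, pSymbols) > 0) and (itemOffset(tSecond, pSymbols) > 0) into tIsPair
   return tIsElement and tIsPair
end isAmbiguousPair
```

With a symbol list of "C,O,Co,S,I,Si", both "CO" and "SI" come back true, and the calling handler could then use answer to ask the user which reading was meant.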

My chemistry background is secret. However, I do own one of the world's top element collections, displayed in a wooden periodic table with physical samples in front of their symbols. 87 elements. It glows in the dark.

Craig Newman
