LC > 5.5.1 performance is really disappointing!

adventuresofgreg
Posts: 349
Joined: Tue Oct 28, 2008 1:23 am

Post by adventuresofgreg » Sat Mar 06, 2021 4:21 pm

I use LC on my 2014 Mac Pro running Sierra to process MASSIVE amounts of data (machine learning for algorithmic trading systems), including importing large data files and processing very large arrays. I have never EVER been able to come close to matching the performance of LC version 5.5.1 with the newer builds (v7 and later). This is why I have not upgraded LC and am still using the old version.

I recently upgraded my machine to a new 12-core Mac Pro running Catalina. Old versions of LiveCode (earlier than v6) won't run on this new OS. Therefore, I upgraded LC to 9.6, and I'm really upset that my processing times have now more than DOUBLED! I have some processes that can take half a day to complete, and now I'm looking at an entire day. And the slowness is across the board: array processing is twice as slow, and dumping data out to fields is sometimes 10x slower!

A possible solution is to downgrade my OS on the new machine to Sierra (not even sure that's possible), but I'm not sure if the old version of LC (5.5.1) is incompatible with the new machine itself (the hardware), or just the new OS. Can anyone offer insight on this??

bogs
Posts: 5435
Joined: Sat Feb 25, 2017 10:45 pm

Re: LC > 5.5.1 performance is really disappointing!

Post by bogs » Sat Mar 06, 2021 4:34 pm

Hardware in and of itself generally doesn't matter, unless it is a completely different type of hardware from what your OS was running on previously.

By "completely different type", I am not talking about 3 cores vs. 12 cores; think 'architecture' instead, i.e. you're moving from, say, x86 to a Raspberry Pi.

Ultimately, however, the best way to test it is one of two methods:

Real hardware -
Partition a small amount of space (30 to 50 GB, let's say), install the older OS that 5.5 works on, then install 5.5 and see if it works. The downsides of this method mainly center on the security of running a much older OS.

Fake hardware -
Download and install virtual machine software (VirtualBox is free and I believe runs on nearly anything; VMware Player is also free), install the older OS on that, and test the software.

The second is my preferred method, as it eliminates hardware problems completely, minimizes security issues, and (on your current box) should run at least as fast as your old system did on real hardware (of the time).

FourthWorld
VIP Livecode Opensource Backer
Posts: 9823
Joined: Sat Apr 08, 2006 7:05 am
Location: Los Angeles

Post by FourthWorld » Sat Mar 06, 2021 6:29 pm

adventuresofgreg wrote:
Sat Mar 06, 2021 4:21 pm
I use LC on my 2014 Mac Pro running Sierra to process MASSIVE amounts of data (machine learning for algorithmic trading systems), including importing large data files and processing very large arrays. I have never EVER been able to come close to matching the performance of LC version 5.5.1 with the newer builds (v7 and later). This is why I have not upgraded LC and am still using the old version.
Good to see you back here, Greg, though it implies a break from your travels, which is too bad for me because you always post the best travel pics on FB. :)

I have a sense of the volume of data you're working with from our last phone conversation some time ago; not surprising that it's grown since.

Where speed is the sole criterion driving our choices, any scripting language may not be the best one. LC's general performance is roughly on par with most others, like Ruby, Python, etc. Among scripting languages, only PHP 7 and later break away from the cluster of performance metrics in any notable way.

So one option might be to consider offloading the heavy lifting to PHP 7 (or, even better, v8, which is slightly faster still).

But even better might be to take advantage of LC's ability to integrate compiled machine code via the externals interface. That would give you performance far beyond any scripting language in the one area where it's most critical, while still allowing you to enjoy all the benefits of LC for everything else. Several of our members here are quite adept at externals writing and can work affordably on a contract basis.

While you ponder that, there's also a third option: revising the script for optimization. You've been doing this a while so perhaps the code is already pretty tight as it is. But one thing I've learned since the Unicode and bug fixes slowed down the performance from older versions (you'd be surprised how much of the older performance happened because of insufficient internal error checking), is that with fresh eyeballs on it we can almost always revise the code to at least regain the speed enjoyed earlier. And many times I've seen improvements that exceed original performance.

Feel free to call if you want to brainstorm any of this. I'm curious to hear what you've been up to since we last spoke, and I always enjoy the ambitiousness of your work.
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

PaulDaMacMan
Posts: 626
Joined: Wed Apr 24, 2013 4:53 pm

Post by PaulDaMacMan » Sun Mar 07, 2021 5:29 am

Have you tried 6.7.11, or whatever the last version of 6 was? I believe that was the last version with significantly faster text processing for me. I mostly stuck with it until LiveCode Builder (which I really wanted to get into) came to fruition in v8. Building a LiveCode Builder extension, perhaps tapping into OS APIs or wrapping an existing fast foreign code library that does the processing you need, may be another option besides PHP via shell/open process or building an external (although that's a similar sort of thing).
My GitHub Repos: https://github.com/PaulMcClernan/
Related YouTube Videos: PlayList

Post by adventuresofgreg » Tue Mar 09, 2021 3:08 pm

FourthWorld wrote:
Sat Mar 06, 2021 6:29 pm
While you ponder that, there's also a third option: revising the script for optimization.
Richard my man! Nice to hear from you. I reviewed an old post of mine from years (and years) ago where you submitted a function speed comparison between pre-7 and post-7, which was very enlightening. Perhaps you and/or others in the forum could assist in optimizing the functions that I'm generally using. I'll start with this basic file loading which is currently taking about 55 minutes with LC v9.6 (30 minutes with LC v5):

Code: Select all

put field "symbolList" into symbolList

-- Open the files and parse (10,000 files) 
repeat for each line symb in symbolList 
   put "file:" & the defaultfolder & "/" & symb into thefile
   put url theFile into thedata
   
   repeat for each line myline in thedata --(6000 lines)
       put convertDate(item 1 of myline) into thedate
       put item 2 of myline into myOpen
       put item 3 of myline into myClose
       put abs(myOpen - myClose) & "," after STDEVlist
       
          -- As I work my way from line to line through the date-ascending data, I am 
          -- collecting a group of 200 values and then performing some math on them like stdev, or an average, etc.
          -- Trim the list of values (not the stdev result) so the window stays at 200 items.
       if the number of items of STDEVlist > 200 then delete item 1 of STDEVlist
       put standardDeviation(STDEVlist) into theSTdev
       
       put symb & myClose & tab into mainData[symb][thedate]
   
   end repeat
end repeat


function convertdate adate
   set the itemdelimiter to "-"
   put item 1 of adate into theyear
   put item 2 of adate into themonth
   put item 3 of adate into theday
   put themonth & "/" & theday & "/" & theyear into newDate
   convert newDate to date 
   return newDate
end convertdate
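
A minimal sketch of one way to keep a 200-value window without rebuilding and rescanning a comma-delimited list on every line: store the values in an array used as a ring buffer and maintain running sums. The handler names rollingAdd and rollingStdDev and the state keys ("count", "sum", "sumSq", "buf") are hypothetical, not part of LiveCode or of the script above. It computes a sample standard deviation (n - 1 divisor), which I believe matches standardDeviation(), but the sum-of-squares formula can lose precision when values are large relative to their spread, so it is worth checking against standardDeviation() on real data first.

```livecode
-- Hypothetical helper: add one value to a rolling window of pWindow values.
-- xState is an array passed by reference that carries the running totals.
command rollingAdd pValue, pWindow, @xState
   add 1 to xState["count"]
   put ((xState["count"] - 1) mod pWindow) + 1 into tSlot
   if xState["count"] > pWindow then
      -- the slot being overwritten holds the value leaving the window
      put xState["buf"][tSlot] into tOld
      subtract tOld from xState["sum"]
      subtract tOld * tOld from xState["sumSq"]
   end if
   put pValue into xState["buf"][tSlot]
   add pValue to xState["sum"]
   add pValue * pValue to xState["sumSq"]
end rollingAdd

-- Hypothetical helper: sample standard deviation of the current window.
-- Returns 0 when fewer than two values have been added (a choice made
-- for this sketch, not necessarily what standardDeviation() does).
function rollingStdDev pWindow, pState
   put pState["count"] into tN
   if tN is empty or tN < 2 then return 0
   if tN > pWindow then put pWindow into tN
   put pState["sum"] / tN into tMean
   put (pState["sumSq"] - tN * tMean * tMean) / (tN - 1) into tVar
   -- guard against a tiny negative value caused by rounding
   return sqrt(max(tVar, 0))
end rollingStdDev
```

In the inner loop this might be used as `rollingAdd abs(myOpen - myClose), 200, tState` followed by `put rollingStdDev(200, tState) into theSTdev`, with `put empty into tState` beforehand (and again before each new symbol, if the window should not carry over between files).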

Post by PaulDaMacMan » Tue Mar 09, 2021 3:31 pm

Code: Select all

   put "file:" & the defaultfolder & "/" & symb into thefile
   put url theFile into thedata
Not sure if it's still a good idea to copy the whole file into memory in the age of solid state drives? Maybe if it's on a network share.

Post by adventuresofgreg » Tue Mar 09, 2021 3:39 pm

PaulDaMacMan wrote:
Tue Mar 09, 2021 3:31 pm

Code: Select all

   put "file:" & the defaultfolder & "/" & symb into thefile
   put url theFile into thedata
Not sure if it's still a good idea to copy the whole file into memory in the age of solid state drives? Maybe if it's on a network share.
I'm pulling values out of each file and organizing them into an array by symbol and date, mainData[symbol][date], because there is a lot more processing to be done, such as charting, more indicators, etc. I can't see how one could do that while leaving all of the data in a file. Interesting idea, though.

Post by bogs » Tue Mar 09, 2021 3:51 pm

adventuresofgreg wrote:
Tue Mar 09, 2021 3:39 pm
I can't see how one could do that while leaving all of the data in a file.
I am pretty sure you're well ahead of me on the bell curve, and there may well be reasons why *not* to do this, but wouldn't 'read until...' allow you to grab chunks of the file and process them, then move on to the next chunk?

Alternatively, line counting the file (assuming it is made up of straight lines of information) would do the same? Either road only puts the information directly from the file through whatever process you're doing.

Post by adventuresofgreg » Tue Mar 09, 2021 3:56 pm

bogs wrote:
Tue Mar 09, 2021 3:51 pm
I am pretty sure you're well ahead of me on the bell curve, and there may well be reasons why *not* to do this, but wouldn't 'read until...' allow you to grab chunks of the file and process them, then move on to the next chunk?

Alternatively, line counting the file (assuming it is made up of straight lines of information) would do the same? Either road only puts the information directly from the file through whatever process you're doing.
If you are suggesting reading in a row of the file at a time, I would assume that although we are not loading the entire file into memory, and not consuming that memory, the work of reading one line at a time, and repeating until the file is done would be far more CPU intensive - no?

Post by PaulDaMacMan » Tue Mar 09, 2021 4:07 pm

adventuresofgreg wrote:
Tue Mar 09, 2021 3:56 pm
If you are suggesting reading in a row of the file at a time, I would assume that although we are not loading the entire file into memory, and not consuming that memory, the work of reading one line at a time, and repeating until the file is done would be far more CPU intensive - no?
What I was suggesting was that you treat the file itself as the container for the file's data, instead of copying the entire file's data into a variable in memory before you even start processing it, as in something like this:

Code: Select all

put word one of line 1 of URL tMyFile into myArray["Whatever"]
No open/read from/close involved (although I'm not sure if it's significantly faster than that combination; probably worth testing if it hasn't been, which I'm pretty sure it has).

Post by bogs » Tue Mar 09, 2021 4:22 pm

adventuresofgreg wrote:
Tue Mar 09, 2021 3:56 pm
If you are suggesting reading in a row of the file at a time, I would assume that although we are not loading the entire file into memory, and not consuming that memory, the work of reading one line at a time, and repeating until the file is done would be far more CPU intensive - no?
I am suggesting what Paul said (a little better than I did): process the read or lines of the file directly in your working statements, instead of reading the file into memory and then processing it (probably close to) the exact same way. Dealing with the smaller chunks directly *might* be faster (I don't know this for a fact; I don't have any extremely large data hanging around at the moment).

I don't believe it would be more CPU intensive; after all, you're processing smaller chunks. But it would be more I/O intensive (disk drive). However, since you're moving (relatively) minor amounts of data from the file to do the work, it may be worth the trade-off.

As I said before though, I may well be missing something. After all, I'm still having issues with printing haha.

dunbarx
VIP Livecode Opensource Backer
Posts: 9643
Joined: Wed May 06, 2009 2:28 pm
Location: New York, NY

Post by dunbarx » Tue Mar 09, 2021 4:38 pm

I never do anything like this, so this post is just my uninformed opinion.

Isn't it much faster to read everything into a variable in one shot, and work thereafter wholly inside LC? I think (thought) that I/O stuff is always slower than internal stuff.

Craig

Post by FourthWorld » Tue Mar 09, 2021 4:55 pm

Two questions, Greg:

1. How confident are you that the routines you've shared here (thank you for that; too many posts here have no code and thereby limit what we can do to help) are the main bottlenecks in your system?

2. Do you have data you can share? 1 file would be nice, but a collection of a hundred or so would allow a representative sample useful for good benchmarking.

Bonus question: do you have any control over the format of the data you receive (I'd guess not but it never hurts to ask)?

As for "read until", it can help with some things, but not many in terms of speed. With very large files it can be beneficial in reducing the memory shuffling needed for large contiguous blocks. But its main benefit is conceptual convenience, and like all conveniences that favor the programmer, it usually means the machine is working harder. The cost is not just the extra system calls to the storage driver (as Paul says, with SSDs those are nearly inconsequential), but mostly the character-by-character comparison of everything coming in from the read buffer as it looks for CR.

In short, if the files are reasonably small (< a few MBs) I wouldn't worry about that part.
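
A rough way to check this on real files is to time both approaches side by side. The sketch below assumes pPath is the full path to one of the data files; the function names timeWholeFile and timeReadUntil are made up for illustration, and the numbers will vary with file size, disk cache state, and LiveCode version, so treat it as a starting point rather than a proper benchmark.

```livecode
-- Hypothetical benchmark: load the whole file, then walk its lines in memory.
function timeWholeFile pPath
   put the milliseconds into tStart
   put url ("file:" & pPath) into tData
   put 0 into tCount
   repeat for each line tLine in tData
      add 1 to tCount
   end repeat
   return (the milliseconds - tStart) && "ms," && tCount && "lines"
end timeWholeFile

-- Hypothetical benchmark: read the same file one line at a time.
function timeReadUntil pPath
   put the milliseconds into tStart
   put 0 into tCount
   open file pPath for text read
   if the result is not empty then return "error:" && the result
   repeat forever
      read from file pPath until return
      -- capture the result right away; it is "eof" at the end of the file
      put the result into tResult
      if it is not empty then add 1 to tCount
      if tResult is not empty then exit repeat
   end repeat
   close file pPath
   return (the milliseconds - tStart) && "ms," && tCount && "lines"
end timeReadUntil
```

Running each function a few times on the same file, and comparing the line counts as a sanity check, should show whether the per-line reads cost more than the single load for files of this size.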

Post by adventuresofgreg » Tue Mar 09, 2021 5:12 pm

dunbarx wrote:
Tue Mar 09, 2021 4:38 pm
I never do anything like this, so this post is just my uninformed opinion.

Isn't it much faster to read everything into a variable in one shot, and work thereafter wholly inside LC? I think (thought) that I/O stuff is always slower than internal stuff.

Craig
I would think so as well, but this is worth a test

Post by adventuresofgreg » Tue Mar 09, 2021 6:43 pm

FourthWorld wrote:
Tue Mar 09, 2021 4:55 pm
1. How confident are you that the routines you've shared here (thank you for that; too many posts here have no code and thereby limit what we can do to help) are the main bottlenecks in your system?

2. Do you have data you can share? 1 file would be nice, but a collection of a hundred or so would allow a representative sample useful for good benchmarking.
I've just shown one of many routines to come, and I haven't a clue where the bottlenecks are. Any alternative approaches are pretty easy for me to test, so these suggestions are great.
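
One low-tech way to find out where the time goes is to accumulate elapsed milliseconds per labelled phase. This is only a sketch: startTimer, stopTimer, timingReport and the script-local sTimings are hypothetical names, not built-in LiveCode features.

```livecode
-- Script-local array holding a start time and a running total per label.
local sTimings

command startTimer pLabel
   put the milliseconds into sTimings[pLabel]["start"]
end startTimer

command stopTimer pLabel
   -- add the time elapsed since the matching startTimer to the label's total
   add the milliseconds - sTimings[pLabel]["start"] to sTimings[pLabel]["total"]
end stopTimer

-- Returns one line per label: label, tab, total milliseconds.
function timingReport
   put empty into tReport
   repeat for each key tLabel in sTimings
      put tLabel & tab & sTimings[tLabel]["total"] && "ms" & return after tReport
   end repeat
   return tReport
end timingReport
```

Wrapping coarse phases (for example `startTimer "load"` before the `put url` line and `stopTimer "load"` after it, then the same around the inner loop) keeps the overhead of the timer calls themselves small. Calling them once per data line, millions of times, would distort the totals.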

To start with, I'm going to take a look at reading 1 line at a time from the file to see if there are any improvements.

After thinking about it, I do believe that I could do more processing while I'm writing the original data to disc after fetching it from the data feed - some savings there for sure I think.
