LC > 5.5.1 performance is really disappointing!
-
- Posts: 349
- Joined: Tue Oct 28, 2008 1:23 am
- Contact:
I use LC on my 2014 MacPro running Sierra to process MASSIVE amounts of data (machine learning for algorithmic trading systems) including importing large data files, and processing very large arrays. I have never EVER been able to even come close to matching the performance of LC version 5.5.1 with the newer > v7 build. This is why I have not upgraded LC and I am still using the old version.
I recently upgraded my machine to a new 12 core MacPro running Catalina. Old versions of Livecode < v 6 won't run on this new OS. Therefore, I upgraded LC to 9.6 and I'm really upset that my processing times are now over DOUBLE!!!! I have some processes that can take half a day to complete, and now I'm looking at an entire day. And the slowness is across the board - array processing is twice as slow, dumping data out to fields is sometimes 10 x slower!
A possible solution is to downgrade my OS on the new machine to Sierra (not even sure that's possible), but I'm not sure if the old version of LC (5.5.1) is incompatible with the new machine itself (the hardware), or just the new OS. Can anyone offer insight on this??
Re: LC > 5.5.1 performance is really disappointing!
Hardware in and of itself (generally) doesn't matter, unless it is a completely different type of hardware from what your OS was running on previously.
By "completely different type", I am not talking about 3 cores vs. 12 cores, but 'architecture', e.g. you're moving from, say, x86 to a Raspberry Pi.
Ultimately, however, the best way to test it is 1 of 2 methods -
Real hardware -
Partition a small amount of space (30 to 50 gigs, let's say) and install the older OS that 5.5 works on, then install 5.5 and see if it works. The downsides to this method mainly center on the security of running a much older OS.
Fake hardware -
Download and install virtual machine software (VirtualBox is free and I believe runs on nearly anything, VMware Player is also free, etc.), install the older OS on that and test the software.
The second is my preferred method as it eliminates problems like hardware completely, minimizes issues with security, and (on your current box) should run at least as fast as your old system did on real hardware (of the time).
-
- VIP Livecode Opensource Backer
- Posts: 9833
- Joined: Sat Apr 08, 2006 7:05 am
- Location: Los Angeles
- Contact:
Re: LC > 5.5.1 performance is really disappointing!
adventuresofgreg wrote: ↑Sat Mar 06, 2021 4:21 pm
I use LC on my 2014 MacPro running Sierra to process MASSIVE amounts of data (machine learning for algorithmic trading systems) including importing large data files, and processing very large arrays. I have never EVER been able to even come close to matching the performance of LC version 5.5.1 with the newer > v7 build. This is why I have not upgraded LC and I am still using the old version.
Good to see you back here, Greg, though it implies a break from your travels, which is too bad for me because you always post the best travel pics on FB.
I have a sense of the volume of data you're working with from our last phone conversation some time ago; not surprising that it's grown since.
Where speed is the sole criterion driving our choices, any scripting language may not be the best one. LC's general performance is roughly on par with most others, like Ruby, Python, etc. Among scripting languages, only PHP 7 and later break away from the cluster of performance metrics in any notable way.
So one option might be to consider offloading the heavy lifting to PHP 7 (or even better v8, which is slightly faster still).
But even better might be to take advantage of LC's ability to integrate compiled machine code via the externals interface. That would give you performance far beyond any scripting language in the one area where it's most critical, while still allowing you to enjoy all the benefits of LC for everything else. Several of our members here are quite adept at externals writing and can work affordably on a contract basis.
While you ponder that, there's also a third option: revising the script for optimization. You've been doing this a while so perhaps the code is already pretty tight as it is. But one thing I've learned since the Unicode and bug fixes slowed down the performance from older versions (you'd be surprised how much of the older performance happened because of insufficient internal error checking), is that with fresh eyeballs on it we can almost always revise the code to at least regain the speed enjoyed earlier. And many times I've seen improvements that exceed original performance.
Feel free to call if you want to brainstorm any of this. I'm curious to hear what you've been up to since we last spoke, and I always enjoy the ambitiousness of your work.
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
-
- Posts: 627
- Joined: Wed Apr 24, 2013 4:53 pm
- Contact:
Re: LC > 5.5.1 performance is really disappointing!
Have you tried 6.7.11, or whatever the last version of 6 was? I believe that was the last version where I had significantly faster text processing. I mostly stuck with it until LiveCode Builder (which I really wanted to get into) came to fruition in v8. Building a LiveCode Builder extension, perhaps tapping into OS APIs or wrapping some existing speedy foreign code library that does the processing you need, may be another option besides PHP via shell/open process or building an external (although that's sort of a similar thing).
-
- Posts: 349
- Joined: Tue Oct 28, 2008 1:23 am
- Contact:
Re: LC > 5.5.1 performance is really disappointing!
FourthWorld wrote: ↑Sat Mar 06, 2021 6:29 pm
While you ponder that, there's also a third option: revising the script for optimization.
Richard my man! Nice to hear from you. I reviewed an old post of mine from years (and years) ago where you submitted a function speed comparison between pre-7 and post-7, which was very enlightening. Perhaps you and/or others in the forum could assist in optimizing the functions that I'm generally using. I'll start with this basic file loading which is currently taking about 55 minutes with LC v9.6 (30 minutes with LC v5):
Code: Select all
put field "symbolList" into symbolList
-- Open the files and parse (10,000 files)
repeat for each line symb in symbolList
   put "file:" & the defaultfolder & "/" & symb into thefile
   put url theFile into thedata
   repeat for each line myline in thedata --(6000 lines)
      put convertDate(item 1 of myline) into thedate
      put item 2 of myline into myOpen
      put item 3 of myline into myClose
      put abs(myopen-myclose) & "," after STDEVlist
      put standarddeviation(STDEVlist) into theSTdev
      -- As I work my way from line to line in the ascending date thedata data, I am
      -- collecting a group of 200 values and then performing some math on them like stdev, or a average, etc.
      if the number of items of theStDev > 200 then delete item 1 of theStdev
      put symb & myClose & tab into mainData[symb][thedate]
   end repeat
end repeat
function convertdate adate
   set the itemdelimiter to "-"
   put item 1 of adate into theyear
   put item 2 of adate into themonth
   put item 3 of adate into theday
   put themonth & "/" & theday & "/" & theyear into newDate
   convert newDate to date
   return newDate
end convertdate
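A note on the inner loop as posted: the window check tests theSTdev (the single number returned by standarddeviation) rather than STDEVlist, so the list never appears to be trimmed and every standarddeviation call has to work through an ever-growing list. If the intent is the 200-value rolling window described in the comments, a minimal sketch might look like the following. The handler name rollingStDev and its parameters are hypothetical, not from the original script, and returning 0 for fewer than two values is an assumption about what is wanted at the start of the series.

```livecode
-- Hypothetical helper (not from the original script).
-- Appends pValue to the comma-delimited list in pWindow (passed by
-- reference), keeps only the newest pSize values, and returns the
-- standard deviation of what remains.
function rollingStDev @pWindow, pValue, pSize
   if pWindow is empty then
      put pValue into pWindow
   else
      put comma & pValue after pWindow
   end if
   -- Trim the list itself, not the computed result
   if the number of items of pWindow > pSize then delete item 1 of pWindow
   -- Assumption: report 0 until there are at least two values
   if the number of items of pWindow < 2 then return 0
   return standardDeviation(pWindow)
end rollingStDev
```

Inside the loop this would replace the three STDEVlist lines with something like `put rollingStDev(STDEVlist, abs(myOpen - myClose), 200) into theSTdev`. Whether this accounts for much of the 55 minutes would need to be measured.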
-
- Posts: 627
- Joined: Wed Apr 24, 2013 4:53 pm
- Contact:
Re: LC > 5.5.1 performance is really disappointing!
Not sure if it's still a good idea to copy the whole file into memory in the age of solid state drives? Maybe if it's on a network share.
Code: Select all
put "file:" & the defaultfolder & "/" & symb into thefile
put url theFile into thedata
-
- Posts: 349
- Joined: Tue Oct 28, 2008 1:23 am
- Contact:
Re: LC > 5.5.1 performance is really disappointing!
PaulDaMacMan wrote: ↑Tue Mar 09, 2021 3:31 pm
Not sure if it's still a good idea to copy the whole file into memory in the age of solid state drives? Maybe if it's on a network share.
Code: Select all
put "file:" & the defaultfolder & "/" & symb into thefile
put url theFile into thedata
I'm pulling values out of each file and organizing them into an array by date and symbol, mainData[symbol][date], because there is a lot more processing to be done such as charting, more indicators, etc. I can't see how one could do that while leaving all of the data in a file. Interesting idea though.
Re: LC > 5.5.1 performance is really disappointing!
adventuresofgreg wrote: ↑Tue Mar 09, 2021 3:39 pm
I can't see how one could do that while leaving all of the data in a file.
I am pretty sure you're well ahead of me on the bell curve, and there may well be reasons why *not* to do this, but wouldn't 'read until...' allow you to grab chunks of the file and process it, then move on to the next chunk?
Alternately, line counting the file (assuming it is made up of straight lines of information) would do the same? Either road only puts the information directly from the file through whatever process you're doing.
-
- Posts: 349
- Joined: Tue Oct 28, 2008 1:23 am
- Contact:
Re: LC > 5.5.1 performance is really disappointing!
bogs wrote: ↑Tue Mar 09, 2021 3:51 pm
adventuresofgreg wrote: ↑Tue Mar 09, 2021 3:39 pm
I can't see how one could do that while leaving all of the data in a file.
I am pretty sure you're well ahead of me on the bell curve, and there may well be reasons why *not* to do this, but wouldn't 'read until...' allow you to grab chunks of the file and process it, then move on to the next chunk?
Alternately, line counting the file (assuming it is made up of straight lines of information) would do the same? Either road only puts the information directly from the file through whatever process you're doing.
If you are suggesting reading in a row of the file at a time, I would assume that although we are not loading the entire file into memory, and not consuming that memory, the work of reading one line at a time, and repeating until the file is done would be far more CPU intensive - no?
-
- Posts: 627
- Joined: Wed Apr 24, 2013 4:53 pm
- Contact:
Re: LC > 5.5.1 performance is really disappointing!
adventuresofgreg wrote: ↑Tue Mar 09, 2021 3:56 pm
bogs wrote: ↑Tue Mar 09, 2021 3:51 pm
adventuresofgreg wrote: ↑Tue Mar 09, 2021 3:39 pm
I can't see how one could do that while leaving all of the data in a file.
I am pretty sure you're well ahead of me on the bell curve, and there may well be reasons why *not* to do this, but wouldn't 'read until...' allow you to grab chunks of the file and process it, then move on to the next chunk?
Alternately, line counting the file (assuming it is made up of straight lines of information) would do the same? Either road only puts the information directly from the file through whatever process you're doing.
If you are suggesting reading in a row of the file at a time, I would assume that although we are not loading the entire file into memory, and not consuming that memory, the work of reading one line at a time, and repeating until the file is done would be far more CPU intensive - no?
What I was suggesting was that you treat the file itself as the container for the file's data, instead of copying the entire file's data into a variable in memory before you even start processing the data, as in something like this:
Code: Select all
put word one of line 1 of URL tMyFile into myArray["Whatever"]
Re: LC > 5.5.1 performance is really disappointing!
adventuresofgreg wrote: ↑Tue Mar 09, 2021 3:56 pm
bogs wrote: ↑Tue Mar 09, 2021 3:51 pm
adventuresofgreg wrote: ↑Tue Mar 09, 2021 3:39 pm
I can't see how one could do that while leaving all of the data in a file.
I am pretty sure you're well ahead of me on the bell curve, and there may well be reasons why *not* to do this, but wouldn't 'read until...' allow you to grab chunks of the file and process it, then move on to the next chunk?
Alternately, line counting the file (assuming it is made up of straight lines of information) would do the same? Either road only puts the information directly from the file through whatever process you're doing.
If you are suggesting reading in a row of the file at a time, I would assume that although we are not loading the entire file into memory, and not consuming that memory, the work of reading one line at a time, and repeating until the file is done would be far more CPU intensive - no?
I am suggesting what Paul said a little better than I did: process the read or lines of the file directly in your working statements instead of reading the file into memory, then processing it (probably close to) the exact same way. Dealing with the smaller chunks directly *might* be faster (I don't know this for a fact, I don't have any extremely large data hanging around at the moment).
I don't believe it would be more CPU intensive, after all, you're processing smaller chunks, but it would be more I/O intensive (disk drive). However, since you're moving (relatively) minor amounts of data from the file to do the work, it may be worth the trade off.
As I said before though, I may well be missing something. After all, I'm still having issues with printing haha.
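For anyone wanting to try the 'read until' idea, here is a minimal sketch of reading a file one line at a time instead of loading it with "put URL". The handler names processFileByLine and handleLine are hypothetical, and handleLine stands in for whatever per-line work is needed. Whether this is actually faster than loading the whole file is untested here.

```livecode
-- Hypothetical handler: walks the file at pPath one line at a time.
on processFileByLine pPath
   open file pPath for text read
   if the result is not empty then
      return "Could not open" && pPath & ":" && the result
   end if
   repeat forever
      read from file pPath until return
      put it into tLine
      -- Capture the status right away: "eof" at end of file,
      -- or an error string if the read failed
      put the result into tStatus
      -- The delimiter is included in what was read, so strip it
      if the last char of tLine is return then delete the last char of tLine
      if tLine is not empty then
         handleLine tLine -- hypothetical: parse the items, update the array, etc.
      end if
      if tStatus is not empty then exit repeat
   end repeat
   close file pPath
end processFileByLine
```

Each pass through the loop is a separate read call, so this trades one large read for thousands of small ones per file.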
-
- VIP Livecode Opensource Backer
- Posts: 9658
- Joined: Wed May 06, 2009 2:28 pm
- Location: New York, NY
Re: LC > 5.5.1 performance is really disappointing!
I never do anything like this, so this post is just my uninformed opinion.
Isn't it much faster to read everything into a variable in one shot, and work thereafter wholly inside LC? I think (thought) that I/O stuff is always slower than internal stuff.
Craig
-
- VIP Livecode Opensource Backer
- Posts: 9833
- Joined: Sat Apr 08, 2006 7:05 am
- Location: Los Angeles
- Contact:
Re: LC > 5.5.1 performance is really disappointing!
Two questions, Greg:
1. How confident are you that the routines you've shared here (thank you for that; too many posts here have no code and thereby limit what we can do to help) are the main bottlenecks in your system?
2. Do you have data you can share? 1 file would be nice, but a collection of a hundred or so would allow a representative sample useful for good benchmarking.
Bonus question: do you have any control over the format of the data you receive (I'd guess not but it never hurts to ask)?
As for "read until", it can help with some things but not many in terms of speed. With very large files it can be beneficial in reducing the memory shuffling needed for large contiguous blocks. But its main benefit is conceptual convenience, and like all conveniences that favor the programmer it usually means the machine is working harder. It's not just the extra system calls to the storage driver (as Paul says, with SSDs those are nearly inconsequential), but mostly in introducing a character-by-character comparison of everything coming in from the read buffer as it looks for CR.
In short, if the files are reasonably small (< a few MBs) I wouldn't worry about that part.
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
-
- Posts: 349
- Joined: Tue Oct 28, 2008 1:23 am
- Contact:
Re: LC > 5.5.1 performance is really disappointing!
I would think so as well, but this is worth a test
-
- Posts: 349
- Joined: Tue Oct 28, 2008 1:23 am
- Contact:
Re: LC > 5.5.1 performance is really disappointing!
FourthWorld wrote: ↑Tue Mar 09, 2021 4:55 pm
1. How confident are you that the routines you've shared here (thank you for that; too many posts here have no code and thereby limit what we can do to help) are the main bottlenecks in your system?
2. Do you have data you can share? 1 file would be nice, but a collection of a hundred or so would allow a representative sample useful for good benchmarking.
I've just shown one of many routines to come, and I haven't a clue where the bottlenecks are. Any alternative approaches are pretty easy for me to test, so these suggestions are great.
To start with, I'm going to take a look at reading 1 line at a time from the file to see if there are any improvements.
After thinking about it, I do believe that I could do more processing while I'm writing the original data to disc after fetching it from the data feed - some savings there for sure I think.
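Since the bottlenecks are still unknown, a rough way to compare the two file-reading approaches is to time each with the milliseconds. This is only a sketch: the handler name compareReadTimes is hypothetical, and it just counts lines rather than doing the real per-line work, so the numbers only reflect the reading cost.

```livecode
-- Hypothetical timing harness: reads the same file two ways and
-- reports the elapsed time for each in the message box.
on compareReadTimes pPath
   -- Approach 1: load the whole file, then loop over its lines
   put the milliseconds into tStart
   put URL ("file:" & pPath) into tData
   put 0 into tCountA
   repeat for each line tLine in tData
      add 1 to tCountA
   end repeat
   put the milliseconds - tStart into tWholeMs

   -- Approach 2: one read call per line
   put the milliseconds into tStart
   put 0 into tCountB
   open file pPath for text read
   repeat forever
      read from file pPath until return
      put the result into tStatus -- "eof" or an error string when done
      if it is not empty then add 1 to tCountB
      if tStatus is not empty then exit repeat
   end repeat
   close file pPath
   put the milliseconds - tStart into tLineMs

   put "whole file:" && tWholeMs && "ms," && tCountA && "lines" & return & \
         "line by line:" && tLineMs && "ms," && tCountB && "lines"
end compareReadTimes
```

Running it over a representative batch of files rather than a single one should give steadier numbers, since the first read of a file may include disk caching effects.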