Garbage collection e-mail thread is giving my nightmares
Hi everyone,
There is a thread on the email list (to which I cannot post for some reason, so I am posting it here)
http://lists.runrev.com/pipermail/use-l ... 29935.html
it is literally giving me nightmares.
I have been working under the assumption that LiveCode can handle arrays as big as free memory will allow on either 32- or 64-bit (Linux) systems.
It turns out that LiveCode has trouble handling large arrays ......
This is very problematic for me, since my application has to do a lot of data processing of arrays which are in turn displayed in data grids throughout the application.
I don't know what to do, I don't know how to test it...... quite frankly it's hurt my trust in the LC engine.
My application is a communication platform with no limit on the number of users or the number of pieces of content that each user can add. I could easily have 5,000 or 10,000 users' data being loaded into a global array in memory.
How can I have peace of mind on this subject?.....
PS: I am also running on 7.3 .... the last time I tried version 8 it was still full of things that drove me crazy, such as disappearing windows. Also, my app did not display correctly, so I have not put in the time to switch to v8.
Last edited by makeshyft on Tue Aug 23, 2016 4:06 am, edited 1 time in total.
Founder & Developer @ MakeShyft R.D.A - https://www.makeshyft.com
Build Software with AppStarterStack for Livecode - https://www.AppStarterStack.com
Save Time with The Time Saver's Toolbox - https://www.TimeSaversToolbox.com
Re: Garbage collection e-mail thread is giving my nightmares
It's very helpful to read the entire thread. A few posts later LC's CTO Mark Waddingham offered this reply:
http://lists.runrev.com/pipermail/use-l ... 29987.html
There he includes this link to an article from Microsoft themselves about the limits of memory management on Win32:
https://blogs.technet.microsoft.com/ask ... fying-3gb/
Beyond the OS limits themselves, associative arrays in any language are likely to have performance degradation beyond a certain number of elements, given how the hashing works to assign buckets and the need to resolve collisions as the number of elements in a bucket grows. In some respects this is similar to extents in file systems, and like FS extents the performance drop may not be significant, depending on how frequently you're accessing the data.
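To make the bucket point concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: it models a table whose bucket count never grows, which is not a description of how the LiveCode engine (or Python's own dict) is implemented. The names chain_lengths and average_chain are made up for this example.

```python
from collections import Counter

def chain_lengths(keys, num_buckets):
    """Count how many keys land in each bucket of a fixed-size table."""
    return Counter(hash(k) % num_buckets for k in keys)

def average_chain(num_keys, num_buckets):
    """Average number of keys per bucket when integer keys 0..num_keys-1 are stored."""
    lengths = chain_lengths(range(num_keys), num_buckets)
    return sum(lengths.values()) / num_buckets

# With the bucket count held at 8, each lookup has to scan a longer chain
# as the number of keys grows.
for n in (8, 80, 800):
    print(n, average_chain(n, 8))
```

Real hash tables usually add buckets as they fill, so the slowdown in practice tends to show up as occasional resize pauses and extra memory use rather than ever-longer chains.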
What exactly are you building, and can you point us to how this is handled in other languages so that we may be able to propose an enhancement to the engine or better yet find a good solution with what we have in hand today?
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
Re: Garbage collection e-mail thread is giving my nightmares
thanks for the link I will be reading the 3GB article in all its glorious detail.....
So I am building the U.M.P..... Universal Mediation Program....a communication platform for conflict resolution.
Currently each participant submits their content (statements, responses, agreements, etc.), and each time every participant submits their file, all the files are compiled together into one database so that the data can be tabulated and analyzed. This is known as a session file. This happens in "progress rounds", so each round has its own session file. Once it is created it never changes, until an entirely new one is generated in the next progress round...... which is why.....
Currently I load the session file (MySQL DB or SQLite) into memory and organize the data in various arrays .....
So I think the UMP can handle quite a few thousand records and participants...... but what freaks me out is an unexpected memory limit....
I am happy to hear the problems arise only on Windows machines, and for huge mediations we will be recommending 64-bit builds for sure.
It is hard to simulate large mediations without writing a "simulator" that can simulate the many kinds of content and interactions that can happen during a mediation.
I am not surprised by some kind of performance degradation when there is a truly large number of elements or keys..... BUT....
instability is what is not acceptable.
So if we are talking about a slowdown, people will know that large mediations will require more powerful systems, so that's OK. Crashes and unresponsive processes are what I must avoid at all costs.
I guess I have no choice but to build a simulator sooner than I had hoped.
Thanks for all your help
Founder & Developer @ MakeShyft R.D.A - https://www.makeshyft.com
Build Software with AppStarterStack for Livecode - https://www.AppStarterStack.com
Save Time with The Time Saver's Toolbox - https://www.TimeSaversToolbox.com
Re: Garbage collection e-mail thread is giving my nightmares
makeshyft wrote: ... but what freaks me out is an unexpected memory limit....
Get ready to be freaked out often. All systems have limits, often far below what the CPU can address. For example, the Home Basic edition of Windows 7 is 64-bit but only supports a max of 8 GB RAM.
Complicating things is how memory allocation works under the hood. Malloc usually needs contiguous address space, so even if a system has more free memory than the amount being requested, an allocation request may fail if that memory is not contiguous.
And then of course there's the plethora of laptops sold with just 4 GB RAM, and many with just 2 GB.
If this is a server system you'll have control over the system configuration, but RAM is expensive with hosting so expect an expensive managed service.
If this is for a local system you'll need to find a way to work with paged data, perhaps some form of MapReduce, or using SQL to handle as much of the analysis as is practical. After all, you can't count on your customers having large amounts of RAM.
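As a rough sketch of what working with paged data could look like, the Python example below reads a SQLite table a fixed number of rows at a time instead of loading everything into one array. The table name statements, its columns, and the helpers iter_pages and build_demo_db are all hypothetical, chosen only for illustration; the same SELECT could be issued from any language with a SQLite driver.

```python
import sqlite3

def build_demo_db(num_rows):
    """Create an in-memory database with a hypothetical 'statements' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statements ("
        "id INTEGER PRIMARY KEY, participant TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO statements (participant, body) VALUES (?, ?)",
        [(f"user{i % 5}", f"statement {i}") for i in range(num_rows)],
    )
    return conn

def iter_pages(conn, page_size):
    """Yield lists of rows, at most page_size rows per list, ordered by id."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, participant, body FROM statements "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        # Remember where this page ended so the next query starts after it.
        last_id = rows[-1][0]

conn = build_demo_db(25)
page_sizes = [len(page) for page in iter_pages(conn, 10)]
print(page_sizes)  # [10, 10, 5]
```

Only one page of rows is held in memory at a time, so peak memory depends on the page size rather than on the size of the whole session file.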
Working out the tradeoffs of moving stuff between disk and RAM is the essence of computer science. We can learn how large scale systems solve this, and apply the relevant portions to your task. The specifics of how that works will depend on the specifics of your application and its current algos, but one thing to expect is that it's very rare that any single algorithm will scale automatically. At a certain point, all algos designed for one level of scale will require revision as scale increases.
Since your concerns began with a discussion on the use-livecode list, and your questions could benefit from the core dev team members already involved in that discussion, have you considered replying to that thread with your questions?
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
Re: Garbage collection e-mail thread is giving my nightmares
Hi Richard,
Thanks for sharing your thoughts... I am having issues with the mailing list, which I have to investigate. But this was weighing on my mind too much to wait for that.... I think I will contact the team once I have more of an idea of how this affects the UMP.
Just to be clear, the desktop version I built was never intended to scale past a certain point...... so there is a "cloud" integration in the pipeline where we don't use MySQL as a server DB, but one of the distributed databases instead.... so the tabulation and the loading of data occur by querying the distributed database instead.
This first version is only supposed to be able to handle a limited number of participants..... the data is mostly string-based, so 3 GB of data is A LOT of data, which is why I have not really bothered to investigate these scaling issues yet.
The thread really just surprised me .... and it seemed that everyone else seemed surprised as well....lol
I think what I'm going to do is build a simulator of course to see just how much data can be handled before it all hits the fan.
Thanks again. I feel much better about it for now.
Founder & Developer @ MakeShyft R.D.A - https://www.makeshyft.com
Build Software with AppStarterStack for Livecode - https://www.AppStarterStack.com
Save Time with The Time Saver's Toolbox - https://www.TimeSaversToolbox.com
Re: Garbage collection e-mail thread is giving my nightmares
This application might be useful as a pseudo-workaround for this problem.
https://www.techpowerup.com/forums/thre ... re.112556/
Not tested with livecode yet
Founder & Developer @ MakeShyft R.D.A - https://www.makeshyft.com
Build Software with AppStarterStack for Livecode - https://www.AppStarterStack.com
Save Time with The Time Saver's Toolbox - https://www.TimeSaversToolbox.com
Re: Garbage collection e-mail thread is giving my nightmares
If you use a database, use the database to do all the organizing and filtering; databases are built to work beyond OS memory limits. Use LiveCode just to give a nice GUI to interact with the database.
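A small Python sketch of that idea, assuming a hypothetical statements table with participant and body columns (neither comes from the UMP itself): the counting is done by SQLite, and only the short summary comes back to the application for display.

```python
import sqlite3

def statements_per_participant(conn):
    """Let the database do the grouping and return (participant, count) rows."""
    return conn.execute(
        "SELECT participant, COUNT(*) FROM statements "
        "GROUP BY participant ORDER BY participant"
    ).fetchall()

# Hypothetical sample data standing in for a session file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE statements (participant TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO statements VALUES (?, ?)",
    [("alice", "a1"), ("bob", "b1"), ("alice", "a2")],
)

summary = statements_per_participant(conn)
print(summary)  # [('alice', 2), ('bob', 1)]
```

The result has one row per participant no matter how many statements exist, which is the kind of small data set a grid can display without a large in-memory array behind it.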
Livecode Wiki: http://livecode.wikia.com
My blog: https://livecode-blogger.blogspot.com
To post code use this: http://tinyurl.com/ogp6d5w
Re: Garbage collection e-mail thread is giving my nightmares
That's a good point, will leverage that when I can. Thanks.
Founder & Developer @ MakeShyft R.D.A - https://www.makeshyft.com
Build Software with AppStarterStack for Livecode - https://www.AppStarterStack.com
Save Time with The Time Saver's Toolbox - https://www.TimeSaversToolbox.com