
Manipulating large text file stalling
Re: Manipulating large text file stalling
They ARE being used and accessed in your repeat loop! 

Re: Manipulating large text file stalling
Hmmmmmm, I'm going to have to take a look on a bigger screen, I must be code blind!
Re: Manipulating large text file stalling
Make sure that within the loop there is a line
wait 0 milliseconds with messages
Without that line the loop never gives the engine a chance to process pending events, which is why you are getting the "not responding" issue. Adding it will not optimise or speed up the processing, but it will stop the user thinking the program has crashed.
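As a rough sketch of where such a line could sit (the counter name and the interval of 1000 are only assumptions for illustration, not taken from the original script), yielding every so often rather than on every pass keeps the overhead small:

```livecode
on mouseUp
   local tSource, tLine, tCounter
   put field 1 into tSource
   put 0 into tCounter
   repeat for each line tLine in tSource
      add 1 to tCounter
      -- ... process tLine here ...
      if tCounter mod 1000 = 0 then
         -- let the engine handle pending events so the window stays responsive
         wait 0 milliseconds with messages
      end if
   end repeat
end mouseUp
```

The interval is a trade-off: a smaller number keeps the interface more responsive, a larger one spends less time outside the actual processing.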
Re: Manipulating large text file stalling
Andycal wrote:
Right, finally generated some dummy data with dupes. Taken hours, but helped me with the debugging process a lot. I discovered I hadn't converted all my fields to variables.
Anyway, the data dump is here: https://www.dropbox.com/s/83eoycq83hjw3 ... a.csv?dl=0
That's generated with some online generator, de-duped and then I forced some duplicates in.
The latest code doing the work is here:
Code: Select all
on mouseUp
   put zero into field "lblProc"
   put zero into field "lblDupes"
   # transfer everything to vars
   put field 1 into tSource
   put empty into tDestination
   put fld "lblProc" into tCounter
   Put empty into tCounter
   set the lockscreen to true
   repeat for each line tLine in tSource
      put tCounter+1 into tCounter
      if tCounter Mod 50 = 0 then
         set the lockscreen to false
         put tDestination into field 2
         put tCounter into fld "lblProc"
         set the lockscreen to true
      end if
      #put the value of field "lblProc" + 1 into field "lblProc"
      put item 11 of tLine into tEmail
      # If no email, store it in another field
      if tEmail is empty then
         put tLine & CR after fld "fldNoEmail"
      end if
      # put tLine into fld "lblCurrent"
      put lineoffset(tEmail, tDestination) into wLine
      if wLine is not 0 then
         put the value of field "lblDupes" + 1 into field "lblDupes"
         put line wLine of tDestination into tProc
         set the lockscreen to false
         put line wLine of tDestination & CR after field "fldDupetxt"
         set the lockscreen to true
         # Loyalty cards
         put item seven of tLine into tCardOne
         put item seven of tProc into tCardTwo
         put tCardOne + tCardTwo into tTotal
         put tTotal into item seven of tProc
         # End Loyalty cards
         # Browsin
         put item 9 of tLine into tBrowOne
         put item 9 of tProc into tBrowTwo
         put tBrowOne + tBrowTwo into tBrowTotal
         put tBrowTotal into item 9 of tProc
         # end Browsin
         # Back to wow
         put item 13 of tLine into tBackOne
         put item 13 of tProc into tBackTwo
         put tBackOne + tBackTwo into tBackTotal
         put tBackTotal into item 13 of tProc
         # end back to wow
         # Eyebrow Tint
         put item 14 of tLine into tTintOne
         put item 14 of tProc into tTintTwo
         put tTintOne + tTintTwo into tTintTotal
         put tTintTotal into item 14 of tProc
         # end eyebrow Tint
         # EyeLash Tint
         put item 15 of tLine into tLashOne
         put item 15 of tProc into tLashTwo
         put tLashOne + tLashTwo into tLashTotal
         put tLashTotal into item 15 of tProc
         # end eyelash tint
         # FLC
         put item 16 of tLine into tFLCOne
         put item 16 of tProc into tFLCTwo
         put tFLCOne + tFLCTwo into tFLCTotal
         put tFLCTotal into item 16 of tProc
         # end FLC
         # EyeDo
         put item 18 of tLine into tEyeOne
         put item 18 of tProc into tEyeTwo
         put tEyeOne + tEyeTwo into tEyeTotal
         put tEyeTotal into item 18 of tProc
         # end eyedo
         # WaxLash
         put item 22 of tLine into tWLashOne
         put item 22 of tProc into tWLashTwo
         put tWLashOne + tWLashTwo into tWLashTotal
         put tWLashTotal into item 22 of tProc
         # end WaxLash
         # wax Treat
         put item 23 of tLine into tWaxTreatOne
         put item 23 of tProc into tWaxTreatTwo
         put tWaxTreatOne + tWaxTreatTwo into tWaxTreatTotal
         put tWaxTreatTotal into item 23 of tProc
         # end wax treat
         # Wow Brow
         put item 24 of tLine into tWowOne
         put item 24 of tProc into tWowTwo
         put tWowOne + tWowTwo into tWowTotal
         put tWowTotal into item 24 of tProc
         # end Wow Brow
         put tProc into line wLine of tDestination
      else put tLine & CR after tDestination
   end repeat
   set the lockscreen to false
   put tDestination into field 2
end mouseUp
This is proving to be a very good debugging lesson, thanks y'all!
Your script requires a lot of fields with specific names. I'm happy to help optimize it, but rather than going through the script to find all the field references and create those fields from scratch, it would be very helpful if you could post a stack that already has the required objects in it.
Thanks -
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
Re: Manipulating large text file stalling
I think it may be solved!
Thierry dropped me a PM and took a look, and he only blummin' well got it down to 30 seconds!!!
Rather than me just pile in here with the source, I'm going to look at it, digest it and understand it. It's also highlighted some other areas where I need to be altering my data handling. I'll report back with a fully optimised and working script.
This really has been a fantastic exercise and I thank you all, especially Thierry who provided that pivotal "Ah-ha!" moment!
Re: Manipulating large text file stalling
You're in good hands. Thierry's been a consistently generous contributor to these forums, and does excellent work.
Richard Gaskin
Re: Manipulating large text file stalling
Agree with that. Here's the bit of genius that I really liked:
Code: Select all
repeat for each item N in "7,9,13,14,15,16,18,23,24"
   put (item N of tLine) + (item N of tProc) into item N of tProc
end repeat
So instead of my clunky checking every little bit, do it in one swoop.
There's a lot of arrays going on as well. I never got on with those, so I'm having fun working out what they're doing.
Re: Manipulating large text file stalling
Arrays are well worth learning in any language, and in LiveCode they have so many uses you'll find the time an excellent investment.
Richard Gaskin
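To make the array idea a little more concrete, here is a minimal sketch of an array keyed by email address. The function name countEmails and its parameter are hypothetical, and it assumes, as in the scripts in this thread, that the email is item 11 of each comma-delimited line. Looking a key up in an array avoids rescanning the whole destination text with lineOffset for every line, which is most likely where the original script lost its time.

```livecode
-- Hypothetical helper: returns an array whose keys are the emails found
-- in item 11 of each line of pData and whose values are occurrence counts
function countEmails pData
   local tCountA, tEmail, tLine
   repeat for each line tLine in pData
      put item 11 of tLine into tEmail
      if tEmail is empty then next repeat
      -- an element that does not exist yet counts as 0 when added to
      add 1 to tCountA[tEmail]
   end repeat
   return tCountA
end countEmails
```

It could be called with something like `put countEmails(field 1) into tCountA`; any key whose value is greater than 1 is then a duplicated email.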
Re: Manipulating large text file stalling
Hi Andy,
What do you want to do with Arsenio,Hampton? He sneaks into your sample data 6 times.
He lives in lines
18727,18735,18736,18737,18738,18739 of your sample data (not counting the header line):
ipsum.porta.elit@mollis.co.uk Arsenio,Hampton
Does your real data contain entries that occur more than twice? Or should I delete those from my copy of your sample data?
Anyway, with Arsenio Hampton reduced to 2 occurrences, the code below handles 49777 lines of data with 20 duplicates in about a second of computation; loading the fields with the result takes about an additional 9 seconds.
All times are with LC 7.1 RC1 on a MacBook Pro Intel i5 2.53 GHz.
I used Thierry's code for updating the items.
Code: Select all
on mouseUp
   put the milliseconds into t
   lock screen
   put field "source" into tSource
   put "7,9,13,14,15,16,18,23,24" into tItems
   split tSource by return
   repeat for each key aKey in tSource
      put item 11 of tSource[aKey] into tEmail
      put aKey & comma after tEmailArray[tEmail]["theLines"]
   end repeat
   repeat for each key aKey in tEmailArray
      put tEmailArray[aKey]["theLines"] into tLines
      delete last char of tLines -- a comma
      if the number of items of tLines > 1 then
         sort items of tLines ascending numeric
         repeat for each item N in tItems
            put item N of tSource[item 1 of tLines] + item N of tSource[item 2 of tLines] into item N of tSource[item 2 of tLines]
         end repeat
         put tSource[item 1 of tLines] & cr after tCollectDupes
         add 1 to tCountDupes
         put tSource[item 2 of tLines] & cr after tCollectDest
         add 1 to tCountProc
      else
         put tSource[item 1 of tLines] & cr after tCollectDest
         add 1 to tCountProc
      end if
   end repeat
   put "processing ms " & the milliseconds - t
   put the milliseconds into t
   put tCountDupes into field "lblDupes"
   put tCountProc into field "lblProc"
   put tCollectDupes into field "fldDupeTxt"
   put tCollectDest into field "Destination"
   put " loading fields ms " & the milliseconds - t after msg
   unlock screen
end mouseUp
Bernd
Re: Manipulating large text file stalling
Hi,
Finally got some time to clean up the script a bit.
Here are the timings on my mini-mac:
Andy, I'll send you the whole stack.
Code: Select all
-- change the filenames to your need
local filenameOutput = "/Users/U/Desktop/dummydataOut.txt"
local filenameInput = "/Users/U/Desktop/dummydata.csv"
-- only for debugging (when it works, drop it + line where the test is )
local Maxi4testing = 1001
-- vars shared between handlers
local fldNoEmail, lblDupes, fldDupetxt, lblProc
local destination, Timers, N

on mouseUp
   put zero into field "lblProc"
   put zero into field "lblDupes"
   put empty into fld 1
   put empty into field 2
   put empty into destination
   put empty into fldNoEmail
   put empty into lblDupes
   put empty into fldDupetxt
   put empty into lblProc
   startChrono 1
   put URL ("file:" & filenameInput ) into tSource
   startChrono 2
   processCSVfile tSource
   -- refresh our view
   put fldNoEmail into fld "fldNoEmail"
   put lblDupes into field "lblDupes"
   put fldDupetxt into field "fldDupetxt"
   put lblProc into fld "lblProc"
   put N into fld "lblProc"
   startChrono 3
   -- Transform our array back to text ( one line per value)
   combine destination by return
   startChrono 4
   -- well, updating the field text takes 10 seconds!!!!
   -- put ArrayDestination into field 2
   -- instead, save it as a text file:
   put destination into URL ( "file:" & filenameOutput)
   put "Check results in file: " & filenameOutput into field 2
   startChrono 5
   showTimers Timers
end mouseUp

local kMail = 11

on processCSVfile @CSVtext
   put 0 into N
   repeat for each line aLine in CSVtext
      add 1 to N
      -- drop it when script works:
      -- if N > Maxi4testing then exit repeat
      -- view progress: ( comment all only for speed measurement)
      -- if N Mod 1000 = 0 then
      --    put N into fld "lblProc"
      --    -- give the engine some time to breathe:
      --    wait 1 milliseconds with messages
      -- end if
      put item kMail of aLine into tEmail
      if tEmail is empty then -- If no email, store it apart
         put aLine & cr after fldNoEmail
         next repeat
      end if
      if destination[ tEmail] is empty then
         put aLine into destination[ tEmail]
         next repeat
      end if
      add 1 to lblDupes
      put destination[ tEmail] into tProc
      put tProc & cr after fldDupetxt
      repeat for each item X in "7,9,13,14,15,16,18,23,24"
         add (item X of aLine) to item X of tProc
      end repeat
      put tProc into destination[ tEmail]
   end repeat
end processCSVfile

on showTimers T
   local s
   put "loading CSV file:" &tab& T[ 2] - T[ 1] &cr into s
   put "repeat loop:" &tab& T[ 3] - T[ 2] &cr after s
   put "combine:" &tab& T[ 4] - T[ 3] &cr after s
   put "fill up field:" &tab& T[ 5] - T[ 4] &cr after s
   put "whole script:" &tab& T[ 5] - T[ 1] &cr after s
   answer s
end showTimers

on startChrono n
   put the milliseconds into Timers[ n]
end startChrono
Kind regards,
Thierry
SUNNY-TDZ.COM doesn't belong to me since 2021.
To contact me, use the Private messages. Merci.
Re: Manipulating large text file stalling
Hi,
After a few interesting exchanges with Bernd,
here is a way to gain some CPU cycles.
So, instead of:
Code: Select all
combine destination by return
do this:
Code: Select all
repeat for each element aRecord in destination
   put aRecord & cr after tCollect
end repeat
and here are the new results, which you can compare with the previous ones.
Tested with LC 7.1 (rc 2).
Kind regards,
Thierry
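One consequence worth spelling out: with the repeat version the array is left untouched and the text ends up in tCollect, so whatever wrote the combined array to disk would need to write tCollect instead. A minimal sketch under that assumption follows; the command name saveRecords and its parameters are hypothetical, not part of the posted stack.

```livecode
-- Hypothetical helper: writes every element of pRecordsA to pFilename,
-- one record per line
command saveRecords pRecordsA, pFilename
   local tCollect, tRecord
   repeat for each element tRecord in pRecordsA
      put tRecord & cr after tCollect
   end repeat
   -- drop the trailing return left by the loop
   delete the last char of tCollect
   put tCollect into URL ("file:" & pFilename)
end saveRecords
```

In the posted script this might be called as `saveRecords destination, filenameOutput`. As far as I know the order in which the elements come out is not guaranteed, so sort the result afterwards if the line order matters.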
Re: Manipulating large text file stalling
Hi,
Just downloaded LC 8.0 dp 4 on my Mac and tried to run the same stack.
It works as expected. Great!
Here are the results:
Seems a bit slower?
Thierry