How to sort out this
Re: How to sort out this
Maybe you can post 50 lines or so here for testing. Plain text please.
Jacqueline Landman Gay | jacque at hyperactivesw dot com
HyperActive Software | http://www.hyperactivesw.com
Re: How to sort out this
I've uploaded the file to a different host, in zip format. Let me know if you have any issues downloading it:
Code: Select all
http://www.mediafire.com/file/c00yhr2z79r3uy6/sample.zip
Re: How to sort out this
Well, I don't think the lineoffset method is viable for the real file. I let it run for >10 min and it still didn't finish.
The below code took between 18 and 23 sec for the actual data (first file - .7z)
Code: Select all
on mouseUp
   local tList, tLine, tKey, tLines, tCounts, tStart
   set the itemDel to ":"
   put the millisecs into tStart
   put url ("file:" & field "FileName") into tList
   -- first pass: store item 2 under its key and count how often each key occurs
   repeat for each line tLine in tList
      put item 1 of tLine into tKey
      put item 2 of tLine into tLines[tKey]
      add 1 to tCounts[tKey]
   end repeat
   -- second pass: drop every key that occurred more than once
   repeat for each key tKey in tLines
      if tCounts[tKey] > 1 then
         delete variable tLines[tKey]
      end if
   end repeat
   -- rebuild "key:value" lines from the surviving entries
   combine tLines using return and ":"
   set the text of field "Destination" to tLines
   answer the millisecs - tStart
end mouseUp
Brian Milby
Script Tracker https://github.com/bwmilby/scriptTracker
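For anyone following along, here is a minimal sketch of the same two-pass idea wrapped in a function, so it can be tried on a few lines before running it on the full file. The handler name singleOccurrenceLines and the three sample lines are made up for illustration; they are not taken from the posted stack or the sample file.

```livecode
-- Hypothetical helper (name is an assumption): returns "key<delim>value" lines
-- for every key (item 1) that occurs exactly once in pText.
function singleOccurrenceLines pText, pDelimiter
   local tLine, tKey, tValues, tCounts
   set the itemDelimiter to pDelimiter
   -- first pass: remember item 2 per key and count the occurrences of each key
   repeat for each line tLine in pText
      put item 1 of tLine into tKey
      put item 2 of tLine into tValues[tKey]
      add 1 to tCounts[tKey]
   end repeat
   -- second pass: remove every key that was seen more than once
   repeat for each key tKey in tCounts
      if tCounts[tKey] > 1 then delete variable tValues[tKey]
   end repeat
   combine tValues using return and pDelimiter
   return tValues
end singleOccurrenceLines

on mouseUp
   local tSample
   -- made-up sample data: "apple" occurs twice, so both apple lines are dropped
   put "apple:1" & return & "pear:2" & return & "apple:3" into tSample
   answer singleOccurrenceLines(tSample, ":") -- shows pear:2
end mouseUp
```

Like the script above, this keeps only item 2 as the value, so anything after a second colon on a line is lost.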
Re: How to sort out this
bwmilby wrote: ↑Sun Jul 22, 2018 8:51 pm
Well, I don't think the lineoffset method is viable for the real file. I let it run for >10 min and it still didn't finish.
The below code took between 18 and 23 sec for the actual data (first file - .7z)
HowToSortOutThis.livecode.zip
[quoted script identical to the one in bwmilby's post above]
I've already tried your latest code and it's working like a charm. From what I've checked so far, I'm getting the expected result. A BIG SALUTE to you!
Cheers
Re: How to sort out this
Glad to help out. Of course my solution would not have been possible without the 2 previous suggestions on using arrays.
Also, looking at this sample data, item 2 is item 1 + what looks to be a 2 or 4 digit year (for the duplicates). The second pass could be eliminated if you tested for that instead.
Brian Milby
Script Tracker https://github.com/bwmilby/scriptTracker
Re: How to sort out this
Dear bwmilby,
You don't know how much you helped me! My work with those text files has been pending for many months, ever since I started this thread. At last, you made it happen.
I also give my thanks and gratitude to everyone who helped in getting the proper script, either by providing code or ideas. You are all such nice guys. Love you all.
Re: How to sort out this
Hi, Alemrantareq.
Here's a version which finds 457999 singleton keys in your file of 1373240 lines in under 6 seconds here.
Does it work for you?
Code: Select all
function singletonKeys pLines, pItemDelimiter
   local tAllKeys, tPriorKey, tDuplicatedKeys, tSingletonKeys
   sort pLines
   set the itemDelimiter to pItemDelimiter
   repeat for each line tLine in pLines
      get item 1 of tLine
      if it is tPriorKey then
         put "true" into tDuplicatedKeys[ it ]
      else
         put item 2 of tLine into tAllKeys[ it ]
         put it into tPriorKey
      end if
   end repeat
   difference tAllKeys with tDuplicatedKeys into tSingletonKeys
   combine tSingletonKeys with return and pItemDelimiter
   return tSingletonKeys
end singletonKeys
-- Dick
Re: How to sort out this
That is pretty slick, rkriesel, pretty slick indeed.
Re: How to sort out this
Yes, I find Brian's thoughts and code constantly fascinating, but that bit you put up shouldn't be sold short. Good is good after all
Re: How to sort out this
Hi Dick, I didn't expect you to come up with another, faster way, so your reply gave me a big surprise.
By the way, I've tried your function this way:
Code: Select all
on mouseUp
   answer file "Select the File:" with type "Text File|txt"
   put url ("file:" & it) into Temp
   set the text of fld "f1" to singletonKeys(Temp)
end mouseUp
function singletonKeys pLines, pItemDelimiter
   local tAllKeys, tPriorKey, tDuplicatedKeys, tSingletonKeys
   sort pLines
   set the itemDelimiter to pItemDelimiter
   repeat for each line tLine in pLines
      get item 1 of tLine
      if it is tPriorKey then
         put "true" into tDuplicatedKeys[ it ]
      else
         put item 2 of tLine into tAllKeys[ it ]
         put it into tPriorKey
      end if
   end repeat
   difference tAllKeys with tDuplicatedKeys into tSingletonKeys
   combine tSingletonKeys with return and pItemDelimiter
   return tSingletonKeys
end singletonKeys
Unfortunately, LiveCode crashes when I try to filter the sample file provided here. Could you please help me find where I'm going wrong? Thanks in advance.
Re: How to sort out this
singletonKeys(Temp) should be singletonKeys(Temp, “:”)
But no smart quotes.
Brian Milby
Script Tracker https://github.com/bwmilby/scriptTracker
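A parameter that the caller leaves out arrives as empty in LiveCode, so the one-argument call hands singletonKeys no usable delimiter. One way to guard against that, sketched here as an assumption rather than as part of Dick's function, is a small wrapper that falls back to ":" (the name singletonKeysWithDefault is made up for this example):

```livecode
-- Hypothetical wrapper: uses ":" when the caller does not pass a delimiter,
-- then calls the singletonKeys function posted earlier in this thread.
function singletonKeysWithDefault pLines, pItemDelimiter
   if pItemDelimiter is empty then put ":" into pItemDelimiter
   return singletonKeys(pLines, pItemDelimiter)
end singletonKeysWithDefault
```

With this in the same script, both singletonKeysWithDefault(Temp) and singletonKeysWithDefault(Temp, ":") split each line on the colon.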
Re: How to sort out this
Hi bwmilby, if you are talking about this:
Code: Select all
set the text of fld "f1" to singletonKeys(Temp, ":")
then there's still no success; my LC crashed while filtering.
Re: How to sort out this
alemrantareq wrote: ↑Tue Jul 24, 2018 6:57 pm
hi bwmilby, if you are talking about this:
Code: Select all
set the text of fld "f1" to singletonKeys(Temp, ":")
still no success, got my LC crashed while filtering
What does "got my LC crashed" mean? Do you have any evidence to share?
Re: How to sort out this
Hi rkriesel,
Well, in the middle of processing, LC showed "(Not Responding)" in the title bar, so I force-closed LC, thinking it had crashed. This time I didn't; I left it alone, and finally I got the output.
I added a millisecond count to see how long the process takes, and here's the result: your script took 100319 milliseconds to process the sample file, while bwmilby's script took 43428 milliseconds to process the same file. So bwmilby's script wins the race.
Despite that, you and bwmilby both won my heart.
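The "(Not Responding)" title usually just means a long handler is running without giving the engine a chance to process window events. A rough sketch of one common workaround is below: time the loop with the milliseconds and yield every so often with "wait 0 milliseconds with messages". The 50000-line interval is an arbitrary assumption, and the field name "FileName" is borrowed from bwmilby's script.

```livecode
-- Sketch only: a timed line loop that yields to the engine periodically.
on mouseUp
   local tList, tLine, tCount, tStart
   put the milliseconds into tStart
   put url ("file:" & field "FileName") into tList
   repeat for each line tLine in tList
      add 1 to tCount
      -- per-line work (for example the key counting) would go here
      if tCount mod 50000 is 0 then
         -- let pending events be handled so the window keeps updating
         wait 0 milliseconds with messages
      end if
   end repeat
   answer "Elapsed:" && (the milliseconds - tStart) && "ms"
end mouseUp
```

Two caveats: because messages are handled during the wait, a second click on the button can start the handler again, and a single long command such as sorting a very large variable cannot be interrupted this way.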