Performance considerations when working with a large directory
Posted: Mon Apr 29, 2013 9:27 pm
Good afternoon everyone,
I am in the process of writing a file parsing program that will be processing files from a directory that has ~100,000 files. My general strategy is as follows:
1. Let the user select which directory they wish to process files from
2. Grab a file, one at a time, process it, append to an output file, and then grab the next file for processing.
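To make step 2 concrete, here is a minimal sketch of the per-file part. The handler name appendResult and its two parameters are made up for illustration, and the actual parsing is left as a placeholder.

```livecode
-- Hypothetical helper: reads one file and appends its (eventually parsed)
-- contents to the output file. Both paths are assumed to be full paths.
command appendResult pFilePath, pOutputPath
   local tData
   put URL ("file:" & pFilePath) into tData   -- read the whole file as text
   -- the real parsing of tData would go here
   open file pOutputPath for append
   write tData & return to file pOutputPath
   close file pOutputPath
end appendResult
```

Opening and closing the output file once per input file keeps the sketch simple. With ~100,000 files it would probably be better to open it once before the loop and close it at the end.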
Before I start grabbing files, however, I need a list of the files in the directory, along with some file details such as the modified and created dates. I am currently getting this with the detailed files, as shown in the code below.
The problem I am finding is that the detailed files call takes a long while on a directory this size (understandably so). With this many files, I don't think it is acceptable for the program to be unresponsive with no way for the user to cancel and no status bar or other sign of progress.
Code:
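answer folder "Select the folder of event files you wish to search:"
if it is empty then exit to top -- user cancelled the dialog
set the directory to it -- the detailed files lists the current directory
put it into field "fld_Folder"
put the detailed files into fileDetails -- one comma-delimited line per file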
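For reference, this is roughly how I understand one line of fileDetails can be picked apart. It assumes the layout given in the dictionary entry for the detailed files: item 1 is the URL-encoded name, item 4 the creation date and item 5 the modification date, both in seconds. The function name parseFileLine is made up for illustration.

```livecode
-- Hypothetical helper: pulls the name and dates out of one line of
-- "the detailed files" and returns them tab-separated.
function parseFileLine pLine
   local tName, tCreated, tModified
   put urlDecode(item 1 of pLine) into tName   -- names come back URL-encoded
   put item 4 of pLine into tCreated           -- creation date in seconds (may be empty on some platforms)
   put item 5 of pLine into tModified          -- modification date in seconds
   convert tCreated from seconds to short date and short time
   convert tModified from seconds to short date and short time
   return tName & tab & tCreated & tab & tModified
end parseFileLine
```

This relies on the itemDelimiter being left at its default of comma.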
Is there a way to instead grab a smaller number of files from the directory (say, 1,000), process them, show the user where the program is in its processing, and then grab the next 1,000 files in the directory, ad nauseam? I look forward to hearing back from you guys!
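To show the kind of batching I have in mind, here is a rough sketch. It still makes one blocking call to get the listing, since I don't know of a way to page that call itself, and only splits the processing into batches of 1,000 using send ... in time so the interface can respond between batches. The handler names, the script-local variables and the field "fld_Status" are all made up for illustration.

```livecode
-- Script-local state shared by the handlers below (all names hypothetical).
local sFileList, sNextLine, sCancelled

command startProcessing
   put the detailed files into sFileList   -- still a single blocking call
   put 1 into sNextLine
   put false into sCancelled
   send "processBatch" to me in 0 milliseconds
end startProcessing

command processBatch
   local tLast, tTotal
   if sCancelled then exit processBatch
   put the number of lines of sFileList into tTotal
   put min(sNextLine + 999, tTotal) into tLast   -- batch of up to 1,000 lines
   repeat for each line tLine in line sNextLine to tLast of sFileList
      processOneFile tLine
   end repeat
   put "Processed" && tLast && "of" && tTotal && "files" into field "fld_Status"
   put tLast + 1 into sNextLine
   if sNextLine <= tTotal then
      -- queue the next batch; pending events (such as a click on a
      -- Cancel button) are handled before it runs
      send "processBatch" to me in 0 milliseconds
   end if
end processBatch

command cancelProcessing
   put true into sCancelled   -- call this from a Cancel button
end cancelProcessing

command processOneFile pLine
   -- placeholder: parse the file named in item 1 of pLine and append the output
end processOneFile
```

A Cancel button would just call cancelProcessing, and the next queued batch would then exit without doing any work.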
Thanks,
Tom