Using CSV as reference file

Got a LiveCode personal license? Are you a beginner, hobbyist or educator that's new to LiveCode? This forum is the place to go for help getting started. Welcome!

Moderators: Klaus, FourthWorld, heatherlaine, kevinmiller

wlaughto
Posts: 19
Joined: Fri Apr 07, 2017 5:06 am

Using CSV as reference file

Post by wlaughto » Tue May 26, 2020 7:15 am

Hi,

I need to be able to import a CSV file to use as reference data and 'link' to this data via a given input key. Not sure the best, most efficient way to attack this particular problem, so here goes.

I have an external CSV file which is developed and updated periodically by a 3rd party and is access via HTML - e.g. put url "https://www.matthewproctor.com/Content/ ... tcodes.csv" into tPostcodeData.

It contains Post Code data. It has 14 columns, of which only 4 are of interest to me. The 'key' (that I need) to the CSV file is in the 2nd field (column). The CSV file is delimited by comma and each field is enclosed in quotes e.g. "222","2001","Sydney GPO","Sydney Metro Area","NSW". The first row of the CSV contains the field names e.g. "id","postcode","name","area","state"


What I need to understand:-

1. The best structure to import this data into
2. Make field 2 (postcode) the key of data structure
3. Easiest way to get rid of column 1 (id) - if necessary
4. Most efficient way to lookup values in this file by key

The file contains up to 18,000 records, so looping thru this file to find values is sub-optimal.

Any advice would be most welcomed.

Regards

richmond62
Livecode Opensource Backer
Livecode Opensource Backer
Posts: 4685
Joined: Fri Feb 19, 2010 10:17 am
Location: Bulgaria

Re: Using CSV as reference file

Post by richmond62 » Tue May 26, 2020 7:45 am

I am a pain in the bum because I tend to write in these reply boxes as I try things out.

1.
It has 14 columns, of which only 4 are of interest to me.
Which 4? The first, the last, 4 distributed in some odd fashion through each row?

2. Import the CSV file as a text file and strip out the double quotes.
-
Screenshot 2020-05-26 at 9.52.15.png
-
Rude, Crude, and Inefficient.

Got to get that code away from fields.

IMPORT button:

Code: Select all

on mouseUp
   answer file "Choose a CSV file to import"
   if the result = "cancel" 
   then exit mouseUp
   else
      set the text of fld "prePROC" to URL ("file:" & it)
   end if
   put "&&&" after fld "prePROC"
   put the number of chars in fld "prePROC" into fld "KOUNT"
end mouseUp
REMOVE QUOTES button:

Code: Select all

on mouseUp
   put empty into fld "PROC1"
   put empty into fld "KOUNT2"
   put fld "prePROC" into rawDDD
   repeat until char 1 of rawDDD is "&"
      add 1 to fld "KOUNT2"
      if char 1 of rawDDD is quote then
         delete char 1 of rawDDD
      else
         put char 1 of rawDDD after fld "PROC1"
         delete char 1 of rawDDD
      end if
   end repeat
end mouseUp
After 45 minutes the REMOVE button had chewed its way through about a third of the imported text.

richmond62
Livecode Opensource Backer
Livecode Opensource Backer
Posts: 4685
Joined: Fri Feb 19, 2010 10:17 am
Location: Bulgaria

Re: Using CSV as reference file

Post by richmond62 » Tue May 26, 2020 8:48 am

Introduced a COMBO button:

Code: Select all

on mouseUp
   answer file "Choose a CSV file to import"
   if the result = "cancel" 
   then exit mouseUp
   else
      put URL ("file:" & it) into rawDDD
      put "&&&" after rawDDD
   end if
   -------------
   put empty into fld "KOUNT2"
   put empty into fld "PROC1"
   repeat until char 1 of rawDDD is "&"
      add 1 to fld "KOUNT2"
      if char 1 of rawDDD is quote then
         delete char 1 of rawDDD
      else
         put char 1 of rawDDD after procDDD
         delete char 1 of rawDDD
      end if
   end repeat
   put rawDDD into fld "PROC!"
end mouseUp
Still pretty slow.

*
AND, for some "funny" reason does NOT strip out the quotes.

richmond62
Livecode Opensource Backer
Livecode Opensource Backer
Posts: 4685
Joined: Fri Feb 19, 2010 10:17 am
Location: Bulgaria

Re: Using CSV as reference file

Post by richmond62 » Tue May 26, 2020 9:06 am

At the risk of being offensive I have cut that enormously long CVS file down to 10 lines:
australian_postcodes_cutshort.csv.zip
(1.04 KiB) Downloaded 8 times
This leaves us with something manageable for development.

richmond62
Livecode Opensource Backer
Livecode Opensource Backer
Posts: 4685
Joined: Fri Feb 19, 2010 10:17 am
Location: Bulgaria

Re: Using CSV as reference file

Post by richmond62 » Tue May 26, 2020 9:12 am

The 'key' (that I need) to the CSV file is in the 2nd field (column).
I don't think I understand that: as my Granny said, "A picture is worth a thousand words." :D
-
Screenshot 2020-05-26 at 11.08.49.png
-
If you could download that picture and mark the key I think things would be easier.

Do you mean by "key" the factor that is to be used to order the data?

richmond62
Livecode Opensource Backer
Livecode Opensource Backer
Posts: 4685
Joined: Fri Feb 19, 2010 10:17 am
Location: Bulgaria

Re: Using CSV as reference file

Post by richmond62 » Tue May 26, 2020 9:42 am

If that is what you mean . . .
-
Screenshot 2020-05-26 at 11.40.42.png
-
button REMOVE QUOTES & FIRST LINE & REORDER

Code: Select all

on mouseUp
   put empty into fld "PROC1"
   put empty into fld "KOUNT2"
   put fld "prePROC" into rawDDD
   repeat until char 1 of rawDDD is "&"
      add 1 to fld "KOUNT2"
      if char 1 of rawDDD is quote then
         delete char 1 of rawDDD
      else
         put char 1 of rawDDD after fld "PROC1"
         delete char 1 of rawDDD
      end if
   end repeat
   delete line 1 of fld "PROC1"
   ---
   set the itemDelimiter to ","
   put 1 into LYNE
   repeat until line LYNE of fld "PROC1" is empty
      put empty into TAIL
      put item 1 of line LYNE of fld "PROC1" into TWOO
      put item 2 of line LYNE of fld "PROC1" into WUN
      put 3 into ITEMM
      repeat until item ITEMM of line LYNE of fld "PROC1" is empty
         put (item ITEMM of line LYNE of fld "PROC1") & "," after TAIL
         add 1 to ITEMM
      end repeat
      put WUN & "," & TWOO & "," & TAIL into line LYNE of fld "PROC1" 
      add 1 to LYNE
   end repeat
end mouseUp
Last edited by richmond62 on Tue May 26, 2020 10:26 am, edited 1 time in total.

richmond62
Livecode Opensource Backer
Livecode Opensource Backer
Posts: 4685
Joined: Fri Feb 19, 2010 10:17 am
Location: Bulgaria

Re: Using CSV as reference file

Post by richmond62 » Tue May 26, 2020 10:13 am

The file contains up to 18,000 records, so looping thru this file to find values is sub-optimal.
You're telling me! 8)

I don't think you need to LOOP through the file, but you do need to READ through the comma-delimited items in the file
after it has been processed in the way I have demonstrated.

The key (and that does not mean your key) to the whole problem (which comes down to speed) lies with the word item.
-
Screenshot 2020-05-26 at 12.25.03.png
-

button LOOKUP:

Code: Select all

on mouseUp
   set the itemDelimiter to ","
   put empty into fld "PROC2"
   ask "Key value? "
   put it into KEYY
   put it into fld "KEYYY"
   put 1 into LYNE
   put 1 into LYNE2
   repeat until line LYNE of fld "PROC1" is empty
      if  item 1 of line LYNE of fld "PROC1" contains KEYY then
         put line LYNE of fld "PROC1" into line LYNE2 of fld "PROC2"
         add 1 to LYNE2
      else
         --do nix
      end if
      add 1 to LYNE
   end repeat
end mouseUp
Attachments
OZ_PC.livecode.zip
Here's the stack.
(2.36 KiB) Downloaded 7 times

mrcoollion
Posts: 562
Joined: Thu Sep 11, 2014 1:49 pm
Location: The Netherlands

Re: Using CSV as reference file

Post by mrcoollion » Tue May 26, 2020 11:30 am

Made a little example on how I would import the file into an array and then show it in a TreeView.
takes about 4 seconds.
ImportCSV01.zip
Import CSV into Array en TreeView
(1.1 KiB) Downloaded 20 times
Below is the code in the button and you need a TreeView named 'TreeViewData'.

Code: Select all

on mouseUp
   set the arraydata of widget "TreeViewData" to empty
   answer file "Choose a CSV file to import"
   if the result = "cancel" 
   then exit mouseUp
   else
      put URL ("file:" & it) into gCSVData
   end if
   ----------------------------------------------------------
   replace quote with "" in gCSVData
   replace comma with tab in gCSVData
   put line 1 of gCSVData into tColumnDescr
   set the itemdelimiter to tab
   put the number of items of tColumnDescr into tNbrOfColumns
   delete line 1 of gCSVData -- do not want description row
   put 0 into tCounter
   repeat for each line tLineData in gCSVData
      add 1 to tCounter
      repeat with tColumnNbr = 1 to  tNbrOfColumns
         put item tColumnNbr of tColumnDescr into tColumnName
         put item tColumnNbr of tLineData into tColumnData
         put tColumnData into aArrayData[tCounter][tColumnName]
      end repeat
   end repeat
   set the arraydata of widget "TreeViewData" to aArrayData
end mouseUp
Hope this helps some?

regards,

Paul

wlaughto
Posts: 19
Joined: Fri Apr 07, 2017 5:06 am

Re: Using CSV as reference file

Post by wlaughto » Tue May 26, 2020 1:51 pm

Wow - richmond62, thanks for your efforts here.

The 'key' I want from the data is in the 'postcode' field (column 2) in the spreadsheet.

The other fields I need are in columns 3,4 & 11 (locality, state, sa3name).

Is it possible to get rid of the quotes with something like:-

put url "https://www.matthewproctor.com/Content/ ... tcodes.csv" into tPostcodeData
replace quote with empty in tPostcodeData

mrcoollion, thanks will test this tomorrow.

wlaughto
Posts: 19
Joined: Fri Apr 07, 2017 5:06 am

Re: Using CSV as reference file

Post by wlaughto » Tue May 26, 2020 2:00 pm

mrcoollion wrote:
Tue May 26, 2020 11:30 am
Made a little example on how I would import the file into an array and then show it in a TreeView.
takes about 4 seconds.

ImportCSV01.zip

Below is the code in the button and you need a TreeView named 'TreeViewData'.

Code: Select all

on mouseUp
   set the arraydata of widget "TreeViewData" to empty
   answer file "Choose a CSV file to import"
   if the result = "cancel" 
   then exit mouseUp
   else
      put URL ("file:" & it) into gCSVData
   end if
   ----------------------------------------------------------
   replace quote with "" in gCSVData
   replace comma with tab in gCSVData
   put line 1 of gCSVData into tColumnDescr
   set the itemdelimiter to tab
   put the number of items of tColumnDescr into tNbrOfColumns
   delete line 1 of gCSVData -- do not want description row
   put 0 into tCounter
   repeat for each line tLineData in gCSVData
      add 1 to tCounter
      repeat with tColumnNbr = 1 to  tNbrOfColumns
         put item tColumnNbr of tColumnDescr into tColumnName
         put item tColumnNbr of tLineData into tColumnData
         put tColumnData into aArrayData[tCounter][tColumnName]
      end repeat
   end repeat
   set the arraydata of widget "TreeViewData" to aArrayData
end mouseUp
Hope this helps some?

regards,

Paul
My real issue then is being able to prompt for postcode input and display the postcode detail from the table, specifically sa3name, locality & state. The imported postcode file is a reference file - I don't need to display it. What is most efficient way to link input postcode to the above table?

In SQL terms:-
select sa3name, locality, state from aArrayData where postcode = inputpostcode

ghettocottage
Livecode Opensource Backer
Livecode Opensource Backer
Posts: 343
Joined: Tue Apr 10, 2012 9:18 am

Re: Using CSV as reference file

Post by ghettocottage » Tue May 26, 2020 2:41 pm

I have an external CSV file which is developed and updated periodically by a 3rd party and is access via HTML - e.g. put url "https://www.matthewproctor.com/Content/ ... tcodes.csv" into tPostcodeData.
how often is this csv file updated? where does this file live normally? who puts it on the server?

I ask, because a slightly different approach could be to have a simple script on your server that imports/updates a mysql database every night ( or more often).

then you work with the database rather than having to churn through a csv file. The server would do the heavy work for you so your app could just request a small bit of info at a time.

richmond62
Livecode Opensource Backer
Livecode Opensource Backer
Posts: 4685
Joined: Fri Feb 19, 2010 10:17 am
Location: Bulgaria

Re: Using CSV as reference file

Post by richmond62 » Tue May 26, 2020 3:45 pm

Wow - richmond62, thanks for your efforts here.

The 'key' I want from the data is in the 'postcode' field (column 2) in the spreadsheet.

The other fields I need are in columns 3,4 & 11 (locality, state, sa3name).
Knowing 'that' the whole procedure can be shortened, and a list field can be produced that contains ONLY columns 2,3,4 & 11.

mrcoollion's idea with the array is very clever, but I am not sure how you get at the data you need at the end of it.

At present I have a 5 minute break in the middle of online teaching, so I will take a quite a bit until I get round to that. :D

AxWald
Posts: 435
Joined: Thu Mar 06, 2014 2:57 pm

Re: Using CSV as reference file

Post by AxWald » Tue May 26, 2020 4:11 pm

Hi,
wlaughto wrote:
Tue May 26, 2020 7:15 am
The file contains up to 18,000 records, so looping thru this file to find values is sub-optimal.
Really? A button to read 'em all:

Code: Select all

on mouseUp
   answer file "What file?"
   if it is empty then exit mouseUp
   put URL ("file:" & it) into myVar
   put the millisecs into t1                          --  we got data, start timer!
   delete line 1 of myVar                             --  kill header line
   repeat for each line L in myVar                    --  extract used data   
      put clearQuote(item 2 of L) & tab & \
            clearQuote(item 3 of L) & tab & \
            clearQuote(item 4 of L) & tab & \
            clearQuote(item 11 of L) & CR after myData
   end repeat
   delete char -1 of myData
   set the c_myData of this stack to myData           --  save for later
   put the millisecs - t1 into myTime
   answer "Processed" && the number of lines of myData && "lines in" && myTime && "millisecs"
end mouseUp

function clearQuote what
   replace quote with empty in what
   return what
end clearQuote
Result: "Processed 18272 lines in 154 millisecs" - now this didn't take this long.
Your sanitized data are now saved in a custom property to wait for further processing:

Code: Select all

0200	ANU	ACT	
0200	Australian National University	ACT	
0800	DARWIN	NT	Darwin City
0801	DARWIN	NT	Darwin City
...

Another button, to get the entries matching a certain postcode (in fld "postcode_fld"):

Code: Select all

on mouseUp
   put the c_myData of this stack into myVar        --  load data
   put fld "postcode_fld" into myPC
   if myPC is not in myVar then                     --  just to be sure ...
      answer error "Your postcode doesn't occur!"
      exit mouseUp
   end if
   put the millisecs into t1                         --  start timer
   set itemdel to tab
   repeat for each line L in myVar
      if item 1 of L = myPC then                     --  filter matching lines
         put L & CR after myData
      end if
   end repeat
   delete char -1 of myData
   put myData into fld "result_fld"                   --  display results
   put the millisecs - t1 into myTime
   answer "Filtered" && the number of lines of myData && "lines from total" \
   && the number of lines in myVar && "in" && myTime && "millisecs"
end mouseUp
Searching for postcode "4816": "Filtered 41 lines from total 18272 in 20 millisecs" - no matter what, it's always ~20 millisecs here.

Have fun!
Livecode programming until the cat hits the fan ...

mrcoollion
Posts: 562
Joined: Thu Sep 11, 2014 1:49 pm
Location: The Netherlands

Re: Using CSV as reference file

Post by mrcoollion » Tue May 26, 2020 7:23 pm

Could not resist.
Here is a version that imports the CSV, strips all necessary columns, and has a filter field in which you can type the postal code to filter the list.
It is pretty fast, maybe someone can make it faster 8)
ImportCSV02.zip
Import CSV and filter
(1.47 KiB) Downloaded 19 times
ImportCSV02.zip
Import CSV and filter
(1.47 KiB) Downloaded 19 times
Regards,

Paul (MrCoolLion)

richmond62
Livecode Opensource Backer
Livecode Opensource Backer
Posts: 4685
Joined: Fri Feb 19, 2010 10:17 am
Location: Bulgaria

Re: Using CSV as reference file

Post by richmond62 » Tue May 26, 2020 7:35 pm

Could not resist.
Fantastic! :D

Post Reply

Return to “Getting Started with LiveCode - Complete Beginners”