Optimizing SQLite Insert Statements

Creating desktop or client-server database solutions?

Moderators: FourthWorld, heatherlaine, Klaus, kevinmiller, robinmiller

Post Reply
Not a lot of thought
Posts: 77
Joined: Thu May 21, 2015 2:41 am

Optimizing SQLite Insert Statements

Post by Not a lot of thought » Sat May 07, 2016 7:57 pm

Does anyone have any suggestions or good links for optimizing a bulk SQLITE insert statement from a .csv? I've not really found anything overly helpful. What I have works but I want it to go faster.

Currently I'm using something like the following with a loop to loop through each entry:

Put "INSERT INTO tableOne (vOne, vTwo, vThree) VALUES(:1, :2, :3);" into tSQL
revExecuteSQL gDatabase, tSQL

FourthWorld
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 9837
Joined: Sat Apr 08, 2006 7:05 am
Location: Los Angeles
Contact:

Re: Optimizing SQLite Insert Statements

Post by FourthWorld » Sat May 07, 2016 8:14 pm

Use transactions - wrap the INSERTs between BEGIN TRANSACTION and END TRANSACTION.
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn

Not a lot of thought
Posts: 77
Joined: Thu May 21, 2015 2:41 am

Re: Optimizing SQLite Insert Statements

Post by Not a lot of thought » Sat May 07, 2016 10:12 pm

Would that look like:

loop
BEGIN TRANSACTION;
INSERT;
END TRANSACTION;
end loop

or
BEGIN TRANSACTION;
loop
INSERT;
end loop
END TRANSACTION;

quailcreek
Posts: 746
Joined: Sun Feb 04, 2007 11:01 pm
Location: McKenna, WA

Re: Optimizing SQLite Insert Statements

Post by quailcreek » Thu May 12, 2016 3:24 am

This should get you started.

Code: Select all

on mouseUp
   local theCSVPath, tFileContents
   answer file "Please choose a CSV file to import" with type "Comma Separated Values|csv|CSV"
       
   if the result is not "cancel" then
      put it into theCSVPath
      ## Get the contents of the file
      put URL ("file:" & theCSVPath) into tFileContents
      delete line 1 of tFileContents -- this is because the first line is the header data
      put CSVToTab4(tFileContents) into tFileContents
      
      get executeSQL("BEGIN")
      if it is not an integer then
         -- error handling code
         answer "An error occured BEGIN"
         exit mouseUp
      end if
      
      put "INSERT INTO Distributors(distributorName,contactName,contactPhone,contactEmail,distAddress,distCity,distState,distZip,accountNum) VALUES (:1,:2,:3,:4,:5,:6,:7,:8,:9)" into tSQL
      
      repeat for each line tLine in tFileContents
         split tLine by tab
         revExecuteSQL the uDatabaseID of this stack,tSQL,"tLine"
      end repeat #for each line tLine in tFileContents
      
      if the result is not an integer then
         get executeSQL("ROLLBACK")
         -- error handling here
         answer "An error occured ROLLBACK Update"
         exit mouseUp
      end if
      
      get executeSQL("COMMIT")
      if the result is an integer then
         popManageDistributorsDG
         --         answer information "Distributors data was successfully inserted."
      else
         answer "Sorry, there was an error inserting the Distributors data."
      end if
   end if
end mouseUp

Code: Select all

function CSVToTab4 pData, pOldLineDelim, pOldItemDelim, pNewCR, pNewTAB
   local tNuData -- contains tabbed copy of data
   local tStatus, theInsideStringSoFar
   
   -- fill in defaults
   if pOldLineDelim is empty then put CR into pOldLineDelim
   if pOldItemDelim is empty then put COMMA into pOldItemDelim
   if pNewCR is empty then put numtochar(11) into pNewCR  -- Use <VT> for quoted CRs
   if pNewTAB is empty then put numtochar(29) into pNewTAB -- Use <GS> (group separator) for quoted TABs
   
   -- Normalize line endings: REMOVED
   -- Will normaly be correct already, only binfile: or similar chould make this necessary
   -- and that exceptional case should be the caller's responsibility
   
   put "outside" into tStatus
   set the itemdel to quote
   repeat for each item k in pData
      -- put tStatus && k & CR after msg
      switch tStatus
         
         case "inside"
            put k after theInsideStringSoFar
            put "passedquote" into tStatus
            next repeat
            
         case "passedquote"
            -- decide if it was a duplicated escapedQuote or a closing
            quote
            if k is empty then   -- it's a duplicated quote
               put quote after theInsideStringSoFar
               put "inside" into tStatus
               next repeat
            end if
            -- not empty - so we remain inside the cell, though we have left the quoted section
            -- NB this allows for quoted sub-strings within the cell content !!
            replace pOldLineDelim with pNewCR in theInsideStringSoFar
            replace TAB with pNewTAB in theInsideStringSoFar
            put theInsideStringSoFar after tNuData
            
         case "outside"
            replace pOldItemDelim with TAB in k
            -- and deal with the "empty trailing item" issue in Livecode
            replace (pNewTAB & pOldLineDelim) with pNewTAB & pNewTAB & CR in k
            put k after tNuData
            put "inside" into tStatus
            put empty into theInsideStringSoFar
            next repeat
         default
            put "defaulted"
            break
      end switch
   end repeat
   
   -- and finally deal with the trailing item isse in input data
   -- i.e. the very last char is a quote, so there is no trigger to flush the last item
   if the last char of pData = quote then
      put theInsideStringSoFar after tNuData
   end if
   
   return tNuData
end CSVToTab4
Tom
MacBook Pro OS Mojave 10.14

AxWald
Posts: 578
Joined: Thu Mar 06, 2014 2:57 pm

Re: Optimizing SQLite Insert Statements

Post by AxWald » Thu May 12, 2016 8:10 pm

Hi,
Not a lot of thought wrote:Would that look like:

loop
BEGIN TRANSACTION;
INSERT;
END TRANSACTION;
end loop

or
BEGIN TRANSACTION;
loop
INSERT;
end loop
END TRANSACTION;
The second should be much faster. Read this.

I'd do it in another way, without any looping. Provided I have data like this:

Code: Select all

ANum: AChar: AStr:
321;  "A";  "A.A"
654;  "B";  "B.B"
987;  "C";  "C.C"
On we go:

Code: Select all

replace ";" with "," in MyCSVData  -- or whatever needed
replace quote with "'" in MyCSVData  -- again, depends of your format
replace return with ")," & return & "(" in MyCSVData  -- voodoo!
  -- now our data are like #1, below.
put "INSERT INTO 't_Table' ('r_Rec1', 'r_Rec2', 'r_Rec3') Values " & return & "(" into StrSQL
put MyCSVData & ");" after StrSQL
  -- see the SQL string in #2, below
get revdatafromquery( , , MyDBID, StrSQL)
  --Ahem, see #3 ...
Explanation:
#1: This is "sanitized" data, I usually have a function to do this *1)

Code: Select all

321,'A','A.A'),
(654,'B','B.B'),
(987,'C','C.C'
#2: Look, a nice, working SQL string :)

Code: Select all

INSERT INTO 't_book' ('bk_Num', 'bk_1', 'bk_2') Values 
(321,'A','A.A'),
(654,'B','B.B'),
(987,'C','C.C');
#3:
Yes, it's "revDataFromQuery" for an action query - while looking for code to post here I found this in a production software of me, and it really works! LiveCode is full of miracles and wonders!

Well, the transaction - you could always start your insert with a "begin":

Code: Select all

put "BEGIN;" & return & "INSERT INTO 't_book' ('bk_Num', 'bk_1', 'bk_2') Values " & return & "(" into StrSQL
and after checking the result, dropping a "COMMIT;" or a "ROLLBACK;" ...

I know this is not the canonical way the LC books show you, but I'm used to do it this way, and it works great, usually ;-) And I hate the "insert into mytable values(:1,:2,:1)" notation ...
Besides, it's much more entertaining this way once you're accustomed to do some chunk juggling & SQL rhyming!

Have fun, hope I could help.

*1) There's a nice CSV converter above, I know - just I have given up trying to use "kill 'em all" approaches. "Character Separated Values" files can come in such many flavors, you'll always bite the carpet sooner or later. But usually they come repeatedly, so it's (for me) no problem to analyze once & quickly hack me a customized converter. After all, most often it's not more than a few "replace"s ...
All code published by me here was created with Community Editions of LC (thus is GPLv3).
If you use it in closed source projects, or for the Apple AppStore, or with XCode
you'll violate some license terms - read your relevant EULAs & Licenses!

Post Reply

Return to “Databases”