Page 1 of 1

Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:55 pm
by trevordevore
[This post is an archive from the old SQL Yoga forums]

I have created a virtual FTS table in my sqlite db and wish to populate it.

The creation command - CREATE VIRTUAL TABLE line_text USING fts4(content TEXT) created a table "line_text" with one field, "content" as well as all the associated tables required to index this table. (line_text_content, line_text_docsize, line_text_segdir, line_text_segments, line_text_stat)

To populate the table I can use the INSERT command.

There doesn't seem to be an SQLYoga insert command.

Am I missing something?

Will I need to populate this virtual table outside of SQLYoga using the revdatabase functions?



James

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:55 pm
by trevordevore
sqlquery_create will insert a record into the database.

http://www.bluemangolearning.com/revolu ... create.htm

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:55 pm
by trevordevore
Unfortunately this doesn't work. The database object for the Virtual table is the block here.

The table object created for the virtual table only has a single virtual field "content" and the Insert I want to use was to include the alias "docid" which corresponds to the "rowid" of this virtual table. The database object created by SQLYoga doesn't seem to know this.

It is OK with:

Code: Select all

sqlrecord_set theRecordC, "content", aLine_text[tkey]
but barfs when I try to run:

Code: Select all

sqlrecord_set theRecordC, "docid", tkey
Have you ever used SQLYoga in this context?

in the code above "content" is the text in line "tkey" of a variable. Storing "tkey" as the "docid" (same as the "rowid") means the "docid"'s returned from any search directly corresponds the the line numbers of the text I want. Unless I specify the "docid" on insert I have no control over what it is.

I was trying to avoid redefining the virtual table to have two fields; "tkey" and "content" and then having to look up the appropriate "tkey" from the returned "docid"s

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:56 pm
by trevordevore
I haven't used SQL Yoga with FTS tables yet. What happens if you use "rowid" instead of "docid"? Does that work?

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:56 pm
by trevordevore
Same thing. I initially tried "rowid". As I mentioned, the db object created by SQLYoga knows nothing of "rowid" or "docid" with respect to the virtual table. According to the SQLYoga plugin, there is only the "content" field.

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:56 pm
by trevordevore
I am thinking that the problem is only that the DB object created by SQLYoga doesn't have the "rowid" in the virtual table's schema.

Would it be possible to amend the schema for this table, i.e. tell it to to SQLYoga add the "rowid" field to the db object?

I am assuming here that SQLYoga is just assembling an executeSQL type command and is using the schema of its DB object to provide placeholders for the in/out values. In this case as the virtual table definition is really a dummy definition (its creation actually makes five real tables that are not directly accessed) and is treated by the FTS module of SQLite as if it had actually defined the "rowid" then adding this field to the db object's schema would allow the "rowid" to be addressed.

If this is possible then one would just be required to run an extra command to make this modification to any databases that use the FTS module after the normal db object creation.

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:56 pm
by trevordevore
James - you are correct that SQL Yoga does not know about the rowid property. The SQLite table_info query doesn't return any information about it so SQL yoga doesn't add it to the schema.

However, I just tested sqlquery_create with a "rowid" key and it works.

Code: Select all

put "8000" into theA["rowid"]
sqlquery_create "articles", theA
In your example you are using a SQL Record object which is more strict. Did you try sqlquery_create?

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:56 pm
by trevordevore
Trevor,

hmm. OK I can see how that works.

an example value set in my case might be

Code: Select all

put "23" into theA["rowid"]

put the_text[23] into theA["content"]

sqlquery_create "line_text", theA
This will create a row (rowid=23) in the "line_text" virtual table with the content being the text in "the_text[23]".

This is doing this one row at a time (I was committing mutiple records with the sqlrecord_create.)

so rather than get tied up in multiple file I/O I would need to wrap this in a transaction for all my lines, yes?

BTW there are 15,000 of them!

The docs show a command

Code: Select all

dbconn_beginTransaction pParamsA, pConnectionA, pDBKey
but doesn't tell me what the arguments are nor give an example.

I have already set the default connection and database to the appropriate file and, within this handler, have already created records in other tables that exist within that file so can I assume that pConnection and pDBKey will already have the appropriate values?

I have no idea of what "pParamsA" is in this context.

James

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:57 pm
by trevordevore
Yes, you want to wrap all of the calls in a transaction (so only one journal file is created).

Just call dbconn_beginTransaction without any parameters. Then use dbconn_commitTransaction (if no errors occur) or dbconn_rollbackTransaction if an error occurs.

I have written more detailed docs for the dbconn_beginTransaction entry but it doesn't look like they are on the web yet.

Code: Select all

* \brief Begins a transaction in the database connection.
*
* \param pParamsA Not implemented. Included for possible future use.
* \param [pConnName] The name of a Database Connection. The default is the "default connection" for the Database Object.
* \param [pDBKey] Database Object connection is associated with. Default is name returned by sqlyoga_getDefaultDatabase().
*
* After calling this command no changes you make to the database will be saved until you call dbconn_commitTransaction. You can 
* cancel any changes you made to the database by calling dbconn_rollbackTransaction rather than dbconn_commitTransaction.
*
* ## Start a transaction in the database.
* dbconn_beginTransaction
*
* ## Modify database, checking for errors along the way.
*
* ## If nothing went wrong then commit the changes you made.
* if theError is empty then
 * dbconn_commitTransaction
 * else
* ## Something went wrong, rollback to state database was in before calling dbconn_beginTransaction.
* dbconn_rollbackTransaction
* end if
*
* \package Database Connection
*
* \return Error message
*
* \seealso dbconn_commitTransaction, dbconn_rollbackTransaction
*/

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:57 pm
by trevordevore
Thanks Trevor,

Took a while, but that was to be expected given the number of insertions.

15,000 plus records, approx 500,000 words. around 20 secs.

Tested sql queries using the FTS functions and sqlquery_set and all but one worked flawlessly. The one exception returned a blob and I am not sure how to decode it as receiving array seems to think it is only a single character. However I don't actually need this function so I am now good to go.

BTW the searching using FTS is blindingly fast.

example: entering three search terms and search.

-> give number of paragraphs containing each term for each term (so three in this case)

-> give number of paragraphs containing all three terms (boolean AND)

-> list all paragraphs satisfying the boolean AND, giving their content, line number (rowid) and a snippet of the paragraph showing the searched for terms hilited and in context

total time 0.13 secs

While I could do all this using livescript only, it would have taken much longer, especially constructing the snippets to display the hits in context.

very happy.

James

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:58 pm
by trevordevore
Here is a working example using the two working functions..

Code: Select all

sqlquery_set theQueryA,"select clause", "rowid,snippet(line_text),offsets(line_text)"

   put "'your brother' function only" into search_terms

   sqlquery_set theQueryA, "conditions","line_text.content MATCH ':1'", search_terms

   sqlquery_retrieveAsArray theQueryA, theGRecordsA
This particular query gave me the last item in my timing example above. Perhaps of interest is that the search is doing a AND on a phrase (your brother) signified by enclosing in single quotes and two words (function) and (only)

The function that didn't work was "macthinfo" function.

It is meant to return a blob but, as I said, it only appeared to be a single character when I looked at the returned value.

Using the matchinfo function with the above search terms should return a

Code: Select all

single blob value 80 bytes in size (20 32-bit integers)
as described in the FTS docs. Given it returned for me only a single character I figured something got lost somewhere.

Given the other two functions worked an all three are a piece, so to speak, I imagine sqlite is trying to return the correct result but....



In the example above each element in the array "theGRecordsA" has a dimension corresponding to the sqlquery_set clause....rowid,snippet(line_text),offsets(line_text)

Interestingly to read these I needed to put the function keys into a variable else I got an error, i.e. I couldn't write

Code: Select all

put theGRecordsA[1][offsets(line_text)] into temp
I had to write

Code: Select all

put "offsets(line_text)" into offindex

put theGRecordsA[1][offindex] into temp
without this "indirection" livecode coughs up a syntax function error!

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:58 pm
by trevordevore
Addendum:

The time it took to insert data into the FTS table was troubling me.

I wrote an alternative handler to use the RevDB function directly to do the same thing and compared the time.

This particular handler does three things, two of which create a lot of records. Specifically 947 NAV records and 15,000 plus text records.

Using SQLYoga function the times were: NAV insertions - 3.76 secs, FTS insertions - 44.55 secs

Using direct calls vis RevDB functions: NAV insertions - 0.38 secs, FTS insertions - 1.85 secs

The NAV insertion time is ok, by itself, but the FTS insertion time is quite noticable.

Seems the overhead on simple insertions is a factor of 10 whereas the overhead on FTS insertions is over 20 to 1.

So, while I will use SQLYoga for the queries and other table insertions I need to do (which are invariably a single row at a time) I will call the RevDB functions for this module.

So when you come to try FTS for your self you might want to go direct if you are inserting a lot of records to avoid the SQLYoga overhead.

Re: Using the FTS module in sqLite

Posted: Fri Oct 11, 2013 2:58 pm
by trevordevore
Yes, the FTS inserts were wrapped in the begin and commit transaction commands. I simply modified the code example from the doc entry you provided above. The other insertion loop (947 NAV insertions) were done differently. I was able in this case to use the SQLYoga create record command and modified an example (I think you gave to another poster on this forum) of creating an array of record arrays and using a single SQLYoga command to act on that array.