Creating hunspell library?
-
- VIP Livecode Opensource Backer
- Posts: 1005
- Joined: Sat Apr 08, 2006 3:06 pm
- Location: Overland Park, Kansas
- Contact:
Creating hunspell library?
Do we have everything in place that we would need to wrap hunspell up as a library? If so, this might be an interesting project for the community to work on. I have a hunspell external that Monte created for me a while ago, which I use in my projects. It has a wrapper that makes it easy to use the flaggedRanges to mark words as misspelled. If someone could set up an initial project that imports hunspell and has an example or two, I could work on fleshing out the API.
Trevor DeVore
ScreenSteps - https://www.screensteps.com
LiveCode Repos - https://github.com/search?q=user%3Atrevordevore+topic:livecode
LiveCode Builder Repos - https://github.com/search?q=user%3Atrevordevore+topic:livecode-builder
Re: Creating hunspell library?
Hmm... you should be able to wrap the C API in hunspell.h, but I'm not sure how we would handle char *** for suggestions etc. Perhaps ZStringUTF8Array (List?) would need to be implemented first.
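To make the char *** concern concrete, here is a minimal C sketch of the calling pattern a binding would have to reproduce. It assumes nothing about the engine: mock_suggest and mock_free_list are hypothetical stand-ins shaped like Hunspell_suggest and Hunspell_free_list from hunspell.h (with the handle parameter left out), and suggestions_as_csv is an illustrative helper, so the block compiles without linking Hunspell.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in shaped like Hunspell_free_list (handle omitted):
 * frees each string, then the array, and clears the caller's pointer. */
static void mock_free_list(char ***slst, int n)
{
    if (slst == NULL || *slst == NULL) return;
    for (int i = 0; i < n; i++) free((*slst)[i]);
    free(*slst);
    *slst = NULL;
}

/* Hypothetical stand-in shaped like Hunspell_suggest (handle omitted):
 * allocates an array of C strings, stores it through slst,
 * and returns the number of entries. */
static int mock_suggest(char ***slst, const char *word)
{
    static const char *const canned[] = { "hello", "help", "held" };
    const int n = 3;
    (void)word; /* a real checker would derive suggestions from the word */

    char **list = calloc((size_t)n, sizeof *list);
    *slst = list;
    if (list == NULL) return 0;
    for (int i = 0; i < n; i++) {
        size_t len = strlen(canned[i]) + 1;
        list[i] = malloc(len);
        if (list[i] == NULL) { mock_free_list(slst, i); return 0; }
        memcpy(list[i], canned[i], len);
    }
    return n;
}

/* The "unpacking" step: copy the native strings somewhere safe
 * (here, a comma-separated buffer) before handing the list back
 * to the library to free. Returns the suggestion count. */
static int suggestions_as_csv(const char *word, char *out, size_t out_size)
{
    char **list = NULL; /* receives the library-owned array */
    int n = mock_suggest(&list, word);
    size_t used = 0;

    if (out_size > 0) out[0] = '\0';
    for (int i = 0; i < n; i++) {
        int written = snprintf(out + used, out_size - used, "%s%s",
                               i ? "," : "", list[i]);
        if (written < 0 || (size_t)written >= out_size - used) break; /* truncated */
        used += (size_t)written;
    }
    mock_free_list(&list, n);
    return n;
}
```

The point of the sketch is that the element count only arrives as the return value, so whatever does the unpacking needs both the pointer and the count before it can walk the array.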
LiveCode User Group on Facebook : http://FaceBook.com/groups/LiveCodeUsers/
Re: Creating hunspell library?
Thanks for looking into that, Monte. If we do need the API updated in order to make it happen, then it looks like it will have to wait a while unless a community member is willing to update the engine.
My opinion (for the time being at least) is that it is best to wait until the builder syntax is a little more polished before venturing down the road of creating an extension or widget for the general public to use.
Last edited by trevordevore on Mon Jan 04, 2016 3:46 pm, edited 1 time in total.
Trevor DeVore
ScreenSteps - https://www.screensteps.com
LiveCode Repos - https://github.com/search?q=user%3Atrevordevore+topic:livecode
LiveCode Builder Repos - https://github.com/search?q=user%3Atrevordevore+topic:livecode-builder
Re: Creating hunspell library?
I think you may be able to use some low-level libfoundation functions to do pointer arithmetic, so as to handle the char *** unpacking. I've got some similarly horrible code in my poll(2) wrapper in undergrowth (I haven't looked at or touched that code for months though).
LiveCode Open Source Team — @PeterTBBrett — peter.brett@livecode.com
Re: Creating hunspell library?
I've looked at your struct packing and unpacking in undergrowth and ran away screaming... In the end, if we need to do stuff like that, it is much simpler just to write an external, which luckily we already have for hunspell.
LiveCode User Group on Facebook : http://FaceBook.com/groups/LiveCodeUsers/
Re: Creating hunspell library?
Looking at the Hunspell API, the crucial piece currently missing from engine APIs is the ability to unpack / pack a pointer to/from a native type. If we had that then wrapping Hunspell is actually quite trivial (certainly as easy as writing bindings for it in other high-level languages, and you would avoid writing any C).
So, we do need to do a little more work to make this kind of thing possible, but not perhaps as much as it appears (at first sight at least - famous last words!).
Here is a simple (not running / not finished) example of a way to bind to it assuming the 'notyetpossible' module existed:
Code: Select all
library module org.runrev.hunspell

type Hunhandle as optional Pointer
type Hunslist as optional Pointer

-- Bindings to the C API declared in hunspell.h
foreign handler Hunspell_create(in pAffPath as UTF8CString, in pDPath as UTF8CString) returns Hunhandle
foreign handler Hunspell_create_key(in pAffPath as UTF8CString, in pDPath as UTF8CString, in pKey as UTF8CString) returns Hunhandle
foreign handler Hunspell_destroy(in pHunspell as Hunhandle)
foreign handler Hunspell_spell(in pHunspell as Hunhandle, in pWord as UTF8CString) returns CInt
foreign handler Hunspell_get_dic_encoding(in pHunspell as Hunhandle) returns UTF8CString
foreign handler Hunspell_suggest(in pHunspell as Hunhandle, out rSLst as Hunslist, in pWord as UTF8CString) returns CInt
foreign handler Hunspell_analyze(in pHunspell as Hunhandle, out rSLst as Hunslist, in pWord as UTF8CString) returns CInt
foreign handler Hunspell_stem(in pHunspell as Hunhandle, out rSLst as Hunslist, in pWord as UTF8CString) returns CInt
foreign handler Hunspell_stem2(in pHunspell as Hunhandle, out rSLst as Hunslist, in pDesc as Pointer, in pDescCount as CInt) returns CInt
foreign handler Hunspell_generate(in pHunspell as Hunhandle, out rSLst as Hunslist, in pWord as UTF8CString, in pWord2 as UTF8CString) returns CInt
foreign handler Hunspell_generate2(in pHunspell as Hunhandle, out rSLst as Hunslist, in pWord as UTF8CString, in pDesc as Hunslist, in pDescCount as CInt) returns CInt
foreign handler Hunspell_free_list(in pHunspell as Hunhandle, inout rSLst as Hunslist, in pSlstCount as CInt) returns nothing

--------

-- The single Hunspell instance owned by this library
private variable sHandle as optional Hunhandle

public handler hunspellInitialize(in pAffPath as String, in pDPath as String) returns nothing
   if sHandle is not nothing then
      throw "Hunspell already initialised"
   end if
   put Hunspell_create(pAffPath, pDPath) into sHandle
end handler

public handler hunspellFinalize() returns nothing
   if sHandle is nothing then
      return
   end if
   Hunspell_destroy(sHandle)
   put nothing into sHandle
end handler

-- Assumed helper (called but not defined in the original sketch):
-- throws if hunspellInitialize has not been called yet
private handler __hunspellEnsure() returns nothing
   if sHandle is nothing then
      throw "Hunspell not initialised"
   end if
end handler

-- Returns true if the word is spelled correctly
public handler hunspellSpell(in pWord as String) returns Boolean
   __hunspellEnsure()
   return Hunspell_spell(sHandle, pWord) is not 0
end handler

public handler hunspellSuggest(in pWord as String) returns List
   __hunspellEnsure()
   variable tSuggestions as optional Hunslist
   variable tSuggestionCount as CInt
   put Hunspell_suggest(sHandle, tSuggestions, pWord) into tSuggestionCount
   variable tList as List
   -- 'notyetpossible' is the hypothetical module providing pointer unpacking
   put notyetpossible.UnpackPointerAsArrayOfString(tSuggestions, tSuggestionCount) into tList
   Hunspell_free_list(sHandle, tSuggestions, tSuggestionCount)
   return tList
end handler

end module
Re: Creating hunspell library?
Would be neater if we could just add an optional List after the foreign type in the foreign handler declaration and it would automatically pack and unpack for us.
Code: Select all
foreign handler Hunspell_suggest(in pHunspell as Hunhandle, out rSLst as UTF8CString List, in pWord as UTF8CString) returns CInt
LiveCode User Group on Facebook : http://FaceBook.com/groups/LiveCodeUsers/
Re: Creating hunspell library?
@monte: The problem with that is that it doesn't say how many elements are in the returned list. C APIs tend to return the number of elements in a native array either as the return value or through another out parameter (i.e. a pointer to a slot) - this would need to be encoded in the declaration somehow. Perhaps something like:
Code: Select all
-- The Hunspell API as it is
foreign handler Hunspell_suggest(in pHunspell as Hunhandle, out rSLst as UTF8CString[result], in pWord as UTF8CString) returns CInt
-- A modified version which returns the number of elements as an out parameter
foreign handler Hunspell_suggest(in pHunspell as Hunhandle, out rSLst as UTF8CString[rSLstCount], out rSLstCount as CInt, in pWord as UTF8CString) returns nothing
The idea here is that the '[]' annotation would indicate it was a 'native' C array of the given type - which would bridge to a List. Indeed, in this case, it would be nice to be able to indicate that the rSLstCount parameter was 'silent' in the LCB binding.
Of course, things get more complicated when you have to start considering exceptions / error returns. Imagine an API which returns true if it succeeds, or false if it fails - here, if 'false' is returned, the out parameters (which map to ptr-to-type) are untouched:
Code: Select all
foreign handler GetMyStrings(out rStrings as UTF8CString[rStringCount], out rStringCount as CInt) returns CBool
In this case there needs to be a way to express the relationship between a return value of false and an exception, to ensure that out parameters which aren't well defined are not touched.
I've actually been pondering whether the foreign handler declaration needs to be multi-line to allow greater richness in specifying 'safe' bindings - not just for array counts, but also ownership / lifetime annotations.
Of course, the advantage of finding a way to specify foreign handler bindings in a high-level fashion (without having to use Pointer and friends) is that it is 'safer', in the sense that greater type and range checking can be done at runtime; however, in lieu of that, I think we can get a fair way by using small 'wrapper' functions along with some Pointer manipulation functions.
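For anyone following along on the C side, the error-return convention described above might look like the sketch below. Everything here is hypothetical: get_my_strings only imitates the shape of the imagined GetMyStrings API (the should_fail flag exists purely to force the failure path), and string_at is an illustrative 'wrapper' of the kind mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical API in the style of GetMyStrings: returns false on
 * failure and leaves both out parameters untouched in that case. */
static bool get_my_strings(bool should_fail,
                           const char *const **out_strings,
                           int *out_count)
{
    static const char *const strings[] = { "alpha", "beta" };
    if (should_fail) return false; /* outputs deliberately not written */
    *out_strings = strings;
    *out_count = 2;
    return true;
}

/* Small wrapper: the outputs are only read after the call reports
 * success, and the index is range-checked against the returned count.
 * Returns NULL on failure or when the index is out of range. */
static const char *string_at(bool should_fail, int index)
{
    const char *const *strings = NULL;
    int count = 0;

    if (!get_my_strings(should_fail, &strings, &count)) return NULL;
    if (index < 0 || index >= count) return NULL;
    return strings[index];
}
```

A high-level binding would need to encode the same two facts the wrapper relies on: which return value means failure, and which out parameter carries the length of which array.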
Re: Creating hunspell library?
Ah, yes, sorry, too early in January for me to be thinking straight...
By 'silent' in the binding, do you mean there's no need to include it in the parameter list because it's all just handled for you? Not sure how common it would be, but how would you handle the array size coming from an extra parameter? Maybe:
Code: Select all
foreign handler GetMyStrings(out rStrings as UTF8CString[in pStringCount as CInt])
variable tStrings as Array
variable tSize as Integer
put 5 into tSize
GetMyStrings(tStrings[tSize])
In cases where out params are untouched they should still be nil though, right? I expect you are already checking for nil before trying to set the LCB variable, which should leave exception handling up to the calling code.
Multi-line foreign handler declarations would reduce the number of very long lines, I suppose, and perhaps look a bit like lcidl...
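For reference, the caller-supplied-size case usually looks something like this on the C side. This is only a sketch under assumptions: fill_my_strings is a made-up function, not part of Hunspell or the engine, and it stands for any API where the caller owns the array and passes its capacity in.

```c
#include <stddef.h>

/* Hypothetical API: the caller owns the 'out' array and says how many
 * slots it has; the callee fills at most that many and returns the
 * number of slots actually written. */
static int fill_my_strings(const char **out, int capacity)
{
    static const char *const source[] = {
        "one", "two", "three", "four", "five", "six", "seven"
    };
    const int available = (int)(sizeof source / sizeof source[0]);
    int n = capacity < available ? capacity : available;

    if (n < 0) n = 0; /* treat a negative capacity as empty */
    for (int i = 0; i < n; i++) out[i] = source[i];
    return n;
}
```

Here the size flows in rather than out, so a binding annotation would have to mark the count as an input tied to the array, while the return value still reports how many elements are valid.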
LiveCode User Group on Facebook : http://FaceBook.com/groups/LiveCodeUsers/