Creating hunspell library?

LiveCode Builder is a language for extending LiveCode's capabilities, creating new object types as Widgets, and libraries that access lower-level APIs in OSes, applications, and DLLs.

Moderators: LCMark, LCfraser

trevordevore
VIP Livecode Opensource Backer
Posts: 1005
Joined: Sat Apr 08, 2006 3:06 pm
Location: Overland Park, Kansas

Creating hunspell library?

Post by trevordevore » Sat Jan 02, 2016 7:12 pm

Do we have everything in place that we would need to wrap hunspell up as a library? If so, this might be an interesting project for the community to work on. I have a hunspell external that Monte created for me a while ago, which I use in my projects. It has a wrapper which makes it easy to use the flaggedRanges to mark words as misspelled. If someone could set up an initial project that imports hunspell and has an example or two, I could work on fleshing out the API.
Trevor DeVore
ScreenSteps - https://www.screensteps.com

LiveCode Repos - https://github.com/search?q=user%3Atrevordevore+topic:livecode
LiveCode Builder Repos - https://github.com/search?q=user%3Atrevordevore+topic:livecode-builder

monte
VIP Livecode Opensource Backer
Posts: 1564
Joined: Fri Jan 13, 2012 1:47 am

Re: Creating hunspell library?

Post by monte » Sun Jan 03, 2016 12:42 am

Hmm... you should be able to wrap the C API in hunspell.h, but I'm not sure how we would handle the char *** parameters used for suggestions etc. Perhaps ZStringUTF8Array (List?) would need to be implemented first.
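
For readers less familiar with the C side, here is a minimal sketch of the char *** shape being discussed. It does not call Hunspell: `fake_suggest` and `fake_free_list` are hypothetical stand-ins, written only to show a callee that allocates an array of C strings, hands its address back through an out parameter, and reports the element count as its return value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Hunspell_suggest-style call (not the real API).
   The callee allocates the list, stores it through `slst`, returns the count. */
static int fake_suggest(char ***slst, const char *word)
{
    static const char *const canned[] = { "hello", "help", "held" };
    const int n = 3;
    char **list;

    assert(slst != NULL);
    (void)word; /* a real speller would derive suggestions from the word */

    list = malloc(n * sizeof *list);
    if (list == NULL) {
        *slst = NULL;
        return 0;
    }
    for (int i = 0; i < n; i++) {
        size_t len = strlen(canned[i]) + 1;
        list[i] = malloc(len);
        if (list[i] == NULL) {
            /* Undo the partial allocation and report an empty list. */
            while (i-- > 0)
                free(list[i]);
            free(list);
            *slst = NULL;
            return 0;
        }
        memcpy(list[i], canned[i], len);
    }
    *slst = list;
    return n;
}

/* Hypothetical matching release, in the spirit of a free_list call. */
static void fake_free_list(char ***slst, int n)
{
    assert(slst != NULL);
    if (*slst == NULL)
        return;
    for (int i = 0; i < n; i++)
        free((*slst)[i]);
    free(*slst);
    *slst = NULL;
}
```

The caller declares a `char **`, passes its address, and must pass the same count back when releasing the list. That pairing of pointer and count is the part a binding has to model.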
LiveCode User Group on Facebook : http://FaceBook.com/groups/LiveCodeUsers/

Re: Creating hunspell library?

Post by trevordevore » Mon Jan 04, 2016 3:44 pm

Thanks for looking into that, Monte. If we do need the API updated in order to make it happen, then it looks like it will have to wait a while unless a community member is willing to update the engine.

My opinion (for the time being at least) is that it is best to wait until the builder syntax is a little more polished before venturing down the road of creating an extension or widget for the general public to use.
Last edited by trevordevore on Mon Jan 04, 2016 3:46 pm, edited 1 time in total.

peter-b
Posts: 182
Joined: Thu Nov 20, 2014 2:14 pm
Location: LiveCode Ltd.

Re: Creating hunspell library?

Post by peter-b » Mon Jan 04, 2016 3:46 pm

I think you may be able to use some low-level libfoundation functions to do pointer arithmetic, so as to handle the char *** unpacking. I've got some similarly horrible code in my poll(2) wrapper in undergrowth (I haven't looked at or touched that code for months though).
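
As an illustration of what that pointer arithmetic amounts to, here is a minimal sketch in C. It assumes the list is known only as an opaque address plus a count, which is roughly the position a binding is in when all it has is a Pointer. The helper names `string_at` and `total_length` are made up for this example and are not libfoundation functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read the index-th string from a native array of C strings that is
   only known as an opaque base address. */
static const char *string_at(const void *base, size_t index)
{
    const unsigned char *slot;
    const char *str;

    assert(base != NULL);
    /* Each slot holds one pointer, so step by sizeof(char *) bytes. */
    slot = (const unsigned char *)base + index * sizeof(char *);
    /* Copy the stored pointer value out of the slot. */
    memcpy(&str, slot, sizeof str);
    return str;
}

/* Walk `count` slots and sum the string lengths, as a binding would do
   when sizing or copying the unpacked list. */
static size_t total_length(const void *base, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += strlen(string_at(base, i));
    return total;
}
```

The arithmetic itself is a single multiply and add per element. Most of the awkwardness comes from having to spell it out by hand in a language without native pointer types.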
LiveCode Open Source Team — @PeterTBBrett — peter.brett@livecode.com

Re: Creating hunspell library?

Post by monte » Mon Jan 04, 2016 10:16 pm

I've looked at your struct packing and unpacking in undergrowth and ran away screaming... In the end, if we need to do stuff like that, it is much simpler just to write an external, which luckily we already have for hunspell ;-)

LCMark
Livecode Staff Member
Posts: 1209
Joined: Thu Apr 11, 2013 11:27 am

Re: Creating hunspell library?

Post by LCMark » Tue Jan 05, 2016 8:35 pm

Looking at the Hunspell API, the crucial piece currently missing from engine APIs is the ability to unpack / pack a pointer to/from a native type. If we had that then wrapping Hunspell is actually quite trivial (certainly as easy as writing bindings for it in other high-level languages, and you would avoid writing any C).

Here is a simple (not running / not finished) example of a way to bind to it assuming the 'notyetpossible' module existed:

Code:

library module org.runrev.hunspell

type Hunhandle as optional Pointer
type Hunslist as optional Pointer

foreign handler Hunspell_create(in pAffPath as UTF8CString, in pDPath as UTF8CString) returns Hunhandle
foreign handler Hunspell_create_key(in pAffPath as UTF8CString, in pDPath as UTF8CString, in pKey as UTF8CString) returns Hunhandle
foreign handler Hunspell_destroy(in pHunspell as Hunhandle)
foreign handler Hunspell_spell(in pHunspell as Hunhandle, in pWord as UTF8CString) returns CInt
foreign handler Hunspell_get_dic_encoding(in pHunspell as Hunhandle) returns UTF8CString
foreign handler Hunspell_suggest(in pHunspell as Hunhandle, out rSLst as Hunslist, in pWord as UTF8CString) returns CInt
foreign handler Hunspell_analyze(in pHunspell as Hunhandle, out rSLst as Hunslist, in pWord as UTF8CString) returns CInt
foreign handler Hunspell_stem(in pHunspell as Hunhandle, out rSLst as Hunslist, in pWord as UTF8CString) returns CInt
foreign handler Hunspell_stem2(in pHunspell as Hunhandle, out rSLst as Hunslist, in pDesc as Pointer, in pDescCount as CInt) returns CInt
foreign handler Hunspell_generate(in pHunspell as Hunhandle, out rSLst as Hunslist, in pWord as UTF8CString, in pWord2 as UTF8CString) returns CInt
foreign handler Hunspell_generate2(in pHunspell as Hunhandle, out rSLst as Hunslist, in pWord as UTF8CString, in pDesc as Hunslist, in pDescCount as CInt) returns CInt
foreign handler Hunspell_free_list(in pHunspell as Hunhandle, inout rSLst as Hunslist, in pSlstCount as CInt) returns nothing

--------

private variable sHandle as optional Hunhandle

public handler hunspellInitialize(in pAffPath as String, in pDPath as String) returns nothing
	if sHandle is not nothing then
		throw "Hunspell already initialised"
	end if
	put Hunspell_create(pAffPath, pDPath) into sHandle
end handler

public handler hunspellFinalize() returns nothing
	if sHandle is nothing then
		return
	end if
	Hunspell_destroy(sHandle)
	put nothing into sHandle
end handler

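-- Guard called by the public handlers below. It was not defined in the
-- original sketch, so this body is an assumed implementation.
private handler __hunspellEnsure() returns nothing
	if sHandle is nothing then
		throw "Hunspell not initialised"
	end if
end handler

public handler hunspellSpell(in pWord as String) returns Boolean
	__hunspellEnsure()
	return Hunspell_spell(sHandle, pWord) is not 0
end handler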

public handler hunspellSuggest(in pWord as String) returns List
	__hunspellEnsure()
	
	variable tSuggestions as optional Hunslist
	variable tSuggestionCount as CInt
	put Hunspell_suggest(sHandle, tSuggestions, pWord) into tSuggestionCount

	variable tList as List
	put notyetpossible.UnpackPointerAsArrayOfString(tSuggestions, tSuggestionCount) into tList

	Hunspell_free_list(sHandle, tSuggestions, tSuggestionCount)

	return tList
end handler

end module

So, we do need to do a little more work to make this kind of thing possible, but not perhaps as much as it appears (at first sight at least - famous last words!).

Re: Creating hunspell library?

Post by monte » Tue Jan 05, 2016 9:50 pm

Would be neater if we could just add an optional List after the foreign type in the foreign handler declaration and it would automatically pack and unpack for us.

Code:

foreign handler Hunspell_suggest(in pHunspell as Hunhandle, out rSLst as UTF8CString List, in pWord as UTF8CString) returns CInt

Re: Creating hunspell library?

Post by LCMark » Tue Jan 05, 2016 10:26 pm

@monte: The problem with that is that it doesn't say how many elements are in the returned list. C APIs tend to return the number of elements in a native array as either the return value, or as another out parameter (i.e. pointer to slot) - this would need to be encoded in the declaration somehow. Perhaps something like:

Code:

-- The Hunspell API as it is
foreign handler Hunspell_suggest(in pHunspell as Hunhandle, out rSLst as UTF8CString[result], in pWord as UTF8CString) returns CInt
-- A modified version which returns the number of elements as an out parameter
foreign handler Hunspell_suggest(in pHunspell as Hunhandle, out rSLst as UTF8CString[rSLstCount], out rSLstCount as CInt, in pWord as UTF8CString) returns nothing

The idea here is that the '[]' annotation would indicate it was a 'native' C array of the given type - which would bridge to a List. Indeed, in this case, it would be nice to be able to indicate that the rSLstCount parameter was 'silent' in the LCB binding.
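
Seen from the C side, the two conventions differ only in where the count travels. The sketch below is illustrative and assumes made-up function names and a fixed word list (neither is Hunspell's API): the first function returns the count, and the second delivers it through an extra out parameter and is written as a thin wrapper over the first.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed data so the example needs no allocation (illustrative only). */
static const char *kWords[] = { "colour", "color" };

/* Convention 1: the element count is the return value. */
static int suggest_count_returned(const char ***out_list)
{
    assert(out_list != NULL);
    *out_list = kWords;
    return 2;
}

/* Convention 2: the element count is a second out parameter. */
static void suggest_count_out(const char ***out_list, int *out_count)
{
    assert(out_count != NULL);
    *out_count = suggest_count_returned(out_list);
}
```

Either way the list and its length arrive separately, which is why an array annotation in the declaration would have to name the place the length comes from.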

Of course, things get more complicated when you have to start considering exceptions / error return. Imagine an API which returns true if it succeeds, or false if it fails - here if 'false' is returned the out parameters (which map to ptr-to-type) are untouched:

Code:

foreign handler GetMyStrings(out rStrings as UTF8CString[rStringCount], out rStringCount as CInt) returns CBool

In this case there needs to be a way to express the relationship between a return value of false and an exception to ensure that out parameters which aren't well defined are not touched.
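
A small C sketch of that hazard and one way a wrapper can guard against it. Everything here is hypothetical: `get_my_strings` imitates the GetMyStrings shape with an extra `should_fail` switch added purely so the failure path can be exercised, and `checked_get` is an assumed wrapper, not an engine function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical API in the style of GetMyStrings: on failure it returns
   false and leaves both out parameters untouched. */
static bool get_my_strings(bool should_fail, const char ***out_strings, int *out_count)
{
    static const char *strings[] = { "alpha", "beta" };
    if (should_fail)
        return false;
    *out_strings = strings;
    *out_count = 2;
    return true;
}

/* Assumed wrapper: initialises its locals first, and only publishes the
   results after checking the return value. Returns the count, or -1 on
   failure with *out_strings set to NULL. */
static int checked_get(bool should_fail, const char ***out_strings)
{
    const char **strings = NULL;
    int count = 0;

    assert(out_strings != NULL);
    if (!get_my_strings(should_fail, &strings, &count)) {
        *out_strings = NULL;
        return -1;
    }
    *out_strings = strings;
    return count;
}
```

The wrapper turns "false means the out parameters are meaningless" into a single checked result, which is the relationship a richer foreign handler declaration would need to state.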

I've actually been pondering whether the foreign handler declaration needs to be multi-line to allow greater richness in specifying 'safe' bindings - not just for array counts, but also ownership / lifetime annotations.

Of course the advantage of finding a way to specify foreign handler bindings in a high-level fashion (without having to use Pointer and friends) is that it is 'safer' in the sense that greater type and range checking can be done at runtime; however, in lieu of that, we can get a fair way by using small 'wrapper' functions along with some Pointer manipulation functions I think.

Re: Creating hunspell library?

Post by monte » Wed Jan 06, 2016 12:56 am

Ah, yes, sorry, too early in January for me to be thinking straight...

By silent in the binding do you mean there's no need to include it in the parameter list because it's all just handled for you? Not sure how common it would be but how would you handle the array size coming from an extra parameter? Maybe:

Code:

foreign handler GetMyStrings(out rStrings as UTF8CString[in pStringCount as CInt])

variable tStrings as Array
variable tSize as Integer
put 5 into tSize
GetMyStrings(tStrings[tSize])

In cases where out params are untouched they should still be nil though, right? I expect you are already checking for nil before trying to set the LCB variable, which should leave exception handling up to the calling code.

Multi-line foreign handler declarations would reduce the number of very long lines I suppose and perhaps look a bit like lcidl...
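
For the case raised above where the array size comes in as an extra parameter, the C side usually looks like a caller-owned buffer plus a capacity. A minimal sketch, assuming a made-up `fill_strings` function and fixed source data:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical API: the caller owns `buffer` and passes its capacity in.
   Returns how many slots were actually filled. */
static int fill_strings(const char **buffer, int capacity)
{
    static const char *const source[] = { "one", "two", "three" };
    const int available = 3;
    int n = capacity < available ? capacity : available;

    assert(buffer != NULL || capacity <= 0);
    if (n < 0)
        n = 0; /* treat a negative capacity as an empty buffer */
    for (int i = 0; i < n; i++)
        buffer[i] = source[i];
    return n;
}
```

Slots beyond the returned count are left exactly as the caller initialised them, so a caller that starts with NULL entries can still tell filled slots from untouched ones.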
