Text-to-speech TTS for macOS 10.6+

LiveCode Builder is a language for extending LiveCode's capabilities, creating new object types as Widgets, and libraries that access lower-level APIs in OSes, applications, and DLLs.

PaulDaMacMan
Posts: 259
Joined: Wed Apr 24, 2013 4:53 pm

Text-to-speech TTS for macOS 10.6+

Post by PaulDaMacMan » Thu Feb 20, 2020 9:47 pm

Inspired by trevix & Simon Knight, who are working on an LCB library to wrap Apple's AVSpeech APIs for text-to-speech on iOS (at least some of which should also work on newer versions of macOS like 10.14 Mojave/10.15 Catalina), I started messing around with the older Apple API for text-to-speech, NSSpeechSynthesizer/NSSpeechRecognizer, which has been part of Mac OS X since 10.3 (obviously a 32bit/Intel LC Standalone won't run on a System going that far back). This all might seem as pointless as I thought it would be (I started it purely as an exercise), since the RevSpeech extension has already handled this for a long time (and also works on Windows). But there are features of NSSpeech that RevSpeech doesn't tap into, like NSSpeechRecognizer, or the ability to output the synthesized speech to an AIFF sound file (which could then be loaded as a Musical Sampled Instrument by my LCB_AppleAVAudioSampler, FUN!), just to name two examples.

So far it's pretty basic but it already includes the AIFF sound file output function that RevSpeech doesn't have. Here's a link to it:
https://github.com/PaulMcClernan/LCB_NSSpeechLib
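
For anyone who just wants to try the speech-to-AIFF idea outside of LCB, here is a minimal Python sketch. It does not use this library or call NSSpeechSynthesizer directly; it shells out to the `say` command-line tool that ships with macOS, using its -v (voice) and -o (output file) options. The function names are made up for illustration.

```python
import platform
import shutil
import subprocess


def build_say_command(text, out_path, voice=None):
    """Build the argument list for macOS `say` that writes speech to a file.

    Hypothetical helper, not part of any library discussed in this thread.
    Caveat: text that starts with "-" may be parsed as an option by `say`.
    """
    cmd = ["say"]
    if voice:
        cmd += ["-v", voice]        # e.g. "Alex"
    cmd += ["-o", out_path, text]   # an .aiff extension selects AIFF output
    return cmd


def speak_to_aiff(text, out_path, voice=None):
    """Run `say` to render text into out_path; only works on macOS."""
    if platform.system() != "Darwin" or shutil.which("say") is None:
        raise RuntimeError("the `say` tool is only available on macOS")
    subprocess.run(build_say_command(text, out_path, voice), check=True)
```

On a Mac, speak_to_aiff("Hello from LiveCode", "hello.aiff", voice="Alex") should leave an AIFF file next to the script, assuming the "Alex" voice is installed.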
https://github.com/PaulMcClernan

Re: Text-to-speech TTS for macOS 10.6+

Post by PaulDaMacMan » Sun Feb 23, 2020 5:29 pm

I just added support for setting the pitchMod property on a SpeechSynth voice to this lib. You can make the "Alex" voice sound like Mickey Mouse and then have Mickey say inappropriate things! hahah!
:P Too much fun!

https://www.youtube.com/watch?v=qvI8_y4 ... MEPTdGafgA

trevix
Posts: 529
Joined: Sat Feb 24, 2007 11:25 pm
Location: Italy

Re: Text-to-speech TTS for macOS 10.6+

Post by trevix » Tue Feb 25, 2020 10:46 pm

Hi Paul.
I must say that Simon did practically all the work for the LCB TTS on iOS (AVSpeechSynthesizer).
Thanks to him, I can now do some sort of implementation that was impossible before.

There are still a few things missing for optimal use, both on iOS and on OSX/Win.
Since you seem proficient in LCB, I invite you to work on the following points, which in my opinion are fundamental:

1 - End-of-speech callback for AVSpeechSynthesizer (iOS). I tried, but I just can't get the code right.

2 - Full list of available voices for NSSpeechSynthesizer (OSX). In my opinion, the language/country code (e.g. "en-US") of each voice is needed in order to make sense of a voice choice (it is a strange omission from RevSpeech, isn't it?).

3 - A way to embed special "pause" characters in the text on iOS, the way "[[slnc 500]]" works on OSX.

Best of all, of course, would be a single extension that works on all platforms. But I doubt we will ever see it.

These points came to me after messing around for several days on both platforms, trying to make a stack that could compile for both.
For example (point 1), trying to integrate "wait until revIsSpeaking() is false" on OSX with the "missing" equivalent on iOS, while having to deal with "send in time" in order to give the user the chance to do something while the OS is speaking, is really complicated.
Or (point 3) the fuzzy way in which both OSes deal with punctuation: I discovered that "Hello.HowAre you?" doesn't put a pause at the dot, while "Hello. HowAre you?" (extra space) does. And "returns" are ignored as far as pauses in the speech are concerned.
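
One possible workaround for the punctuation quirks in point 3 is to clean up the text before handing it to the synthesizer. The Python sketch below is only an illustration under assumptions: the helper name and the 500 ms default are invented, and the "[[slnc N]]" command is the OSX one mentioned above, so on iOS a different pause mechanism would be needed.

```python
import re


def prepare_for_speech(text, pause_ms=500):
    """Hypothetical pre-processor for text sent to a macOS speech synth.

    1. Adds the missing space after '.', '!' or '?' when a letter follows
       directly, so "Hello.HowAre" becomes "Hello. HowAre".
    2. Replaces each run of line breaks with an embedded silence command,
       since returns alone do not produce a pause.
    Note: step 1 would also split things like "example.com".
    """
    text = re.sub(r"([.!?])(?=[A-Za-z])", r"\1 ", text)
    text = re.sub(r"\s*[\r\n]+\s*", f" [[slnc {pause_ms}]] ", text)
    return text.strip()
```

Decimal numbers such as "3.14" are left alone because the first rule only fires when a letter follows the punctuation mark.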
Trevix
OSX 10.14.6 LC 9.6.0 Dp2 iOs 9.3>

Re: Text-to-speech TTS for macOS 10.6+

Post by PaulDaMacMan » Wed Mar 11, 2020 4:23 pm

AVSpeechSynthesizer is a different animal that seems to be available on the Mac only as of macOS 10.14+. I'm still running nothing higher than 10.12.6, and I don't have an iOS device to test on. BUT...

Maybe take a look at my NSSpeech wrapper lib & demo; it may be helpful in working things out for AVSpeech, as I have a feeling AVSpeech is mostly based on NSSpeech. The speech boundary constants are probably the same, for example (in NSSpeech these are integer constants, 0, 1, ..., that mean stop speaking immediately, after the current word, or after the current sentence).
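
To make the boundary values concrete, here is a small Python sketch of the NSSpeech constants as Apple documents them (NSSpeechImmediateBoundary, NSSpeechWordBoundary, NSSpeechSentenceBoundary). The class and helper names are invented for illustration. Whether AVSpeech really shares these numbers is only the guess made above; as far as I know AVSpeechBoundary documents just the immediate and word cases.

```python
from enum import IntEnum


class SpeechBoundary(IntEnum):
    """Integer values of the NSSpeechBoundary constants (names are ours)."""
    IMMEDIATE = 0  # NSSpeechImmediateBoundary: stop right away
    WORD = 1       # NSSpeechWordBoundary: stop after the current word
    SENTENCE = 2   # NSSpeechSentenceBoundary: stop after the current sentence


def describe_boundary(value):
    """Hypothetical helper: map a raw boundary integer to readable text."""
    descriptions = {
        SpeechBoundary.IMMEDIATE: "stop immediately",
        SpeechBoundary.WORD: "stop after the current word",
        SpeechBoundary.SENTENCE: "stop after the current sentence",
    }
    # SpeechBoundary(value) raises ValueError for integers outside 0..2.
    return descriptions[SpeechBoundary(value)]
```

A wrapper could pass int(SpeechBoundary.WORD) wherever the native call expects the raw integer.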

As for callbacks, if it uses a block (denoted by the ^ symbol), I still haven't quite figured that out, but I believe others (Ali, Monte, Trevor) have had some luck with that. My NSSpeech LCB wrapper uses a Delegate for the "Finished Speaking" callback, but unfortunately it doesn't seem to fire consistently and I haven't figured out why.

Re: Text-to-speech TTS for macOS 10.6+

Post by PaulDaMacMan » Mon Mar 23, 2020 8:00 am

I had no idea that the LC Team was working on a new cross-platform speech lib for LC's commercial editions (Indy+), but LC 9.6dp3 adds exactly that! So, since my NSSpeech lib (Community+) was fairly complete and just needed some documentation, I thought I'd rename some of the handlers so that they're closer in name to the corresponding handlers in the official LC speech lib, where applicable (I took a slightly different approach, like returning line-delimited text lists instead of arrays). For example, my NSSpeechSynthGetAvailableVoices() is now NSspeechGetVoices(), which corresponds to LC's speechGetVoices(). I might adjust the returned data to match LC's lib as well at some point, but I probably won't spend much time on this any time soon.

Re: Text-to-speech TTS for macOS 10.6+

Post by trevix » Mon Mar 23, 2020 10:12 am

I would be interested in your comment on my last post, regarding the new speech library introduced with LC9.6.0Dp3.
https://forums.livecode.com/viewtopic.p ... 5&start=15

Re: Text-to-speech TTS for macOS 10.6+

Post by PaulDaMacMan » Wed Mar 25, 2020 7:22 am

trevix wrote:
Mon Mar 23, 2020 10:12 am
I would be interested in your comment on my last post, regarding the new speech library introduced with LC9.6.0Dp3.
https://forums.livecode.com/viewtopic.p ... 5&start=15

My comment would be similar to LC Mark's reply: if you have an Indy or better LC edition, I would try moving a copy of the 9.6DP3 speech lib into the LC 9.5.1 bundle and see if that works for building standalones to run on iOS 9. I can't think of any reason that wouldn't work. If it doesn't, you could always take a stab at building your own library for AVSpeechSynth in LiveCode Builder. I looked at the APIs on Apple's Dev site; it looks like it has been available from iOS 7 on, and it doesn't look all that much more complicated than NSSpeechSynth in macOS (most of which I wrapped in a couple of days).

Re: Text-to-speech TTS for macOS 10.6+

Post by PaulDaMacMan » Wed Mar 25, 2020 8:51 am

PaulDaMacMan wrote:
Wed Mar 25, 2020 7:22 am
you could always take a stab at building your own library for AVSpeechSynth in LiveCode Builder.

Anyway, Simon and you pretty much did that in that other forum thread.

Sometimes I've found that I don't really need to build a whole FFI library, just wrap the parts I need for whatever it is I'm trying to do.

Re: Text-to-speech TTS for macOS 10.6+

Post by trevix » Wed Mar 25, 2020 10:09 am

I wasn't referring so much to the lib and OS version as to the fact that the "LC way" should not be betrayed.
If I have 2 TTS libraries, one for OSX/Win and one that I am building for iOS/Android, why not unify them with the same command and function names? How can a newbie make sense of it?
And, at the very least, I would not give the two libraries the same name, as I read:
...Important: The revSpeak command is part of the Speech library....

Re: Text-to-speech TTS for macOS 10.6+

Post by PaulDaMacMan » Wed Mar 25, 2020 5:37 pm

trevix wrote:
Wed Mar 25, 2020 10:09 am
I wasn't referring so much to the lib and OS version as to the fact that the "LC way" should not be betrayed.
If I have 2 TTS libraries, one for OSX/Win and one that I am building for iOS/Android, why not unify them with the same command and function names? How can a newbie make sense of it?
And, at the very least, I would not give the two libraries the same name, as I read:
...Important: The revSpeak command is part of the Speech library....

OK, I see what you're saying. My thinking about it is very different from a "newbie's" because I'm an old-head who arrived at LC as a long-time HyperCard/SuperCard/AppleScript X-talk user... One of the great things about all of those, IMO, is that they were/are all very extensible via easy-to-use add-ons (XCMDs/XFCNs, AS Scripting Addition OSAXen, etc.). I actually prefer to have MORE than one option floating around for hooking into OS things, APIs, Services, etc. The more the merrier! Many of the things that you needed an Xternal add-on for back in the day are already incorporated into the LC engine now, but I used to have a choice of various add-ons just for selecting files, for example. I can't see there being a good way to control all developers making add-ons, forcing them into some large set of arbitrary rules about function naming, or about which datatypes may be returned, just to make things consistent and simple for newbies. I don't think that would be good for devs, users, or newbs. Newbs have a bunch of stuff to understand before they should be getting too far into the goodies anyway.

The Speech libs that are now available:

The very old RevSpeech, which is a pre-LiveCode Builder External (plug-in) built in Xcode probably a long time ago. It has pretty much just the basic features that were common between the big platforms at the time (built pre-Smart-Phones, so Desktop only). This still seems to work just fine (I've used it recently on both macOS & Win), and is a good choice for newbs just running in the IDE on Desktop. If they changed the functions in this to match the new speechLib it would break a lot of older stacks that use it. I believe revSpeech has been around so long that it supported MacOS 9 Classic/Carbon.

The newly released native speechLib, built in LiveCode Builder, is a bit more complicated to use but supports Mac, iOS, and Android, with (as I understand from the comments) Windows support planned for a future update. They can't name the handlers in this exactly the same as revSpeech's because they don't do the same things; revSpeech is much more straightforward to use. speechGetVoices (or whatever) in this lib gets an Array of Voices that includes all the Voice Attributes, which isn't close to the same as the Voice Names list returned by revSpeech's getVoice (or whatever). My only complaint here is that this doesn't seem to me like it should be a premium feature for Commercial Editions only, since it's also an accessibility thing, e.g. for the visually impaired, but I do understand the LC people have bills to pay, and there is still revSpeech available for the Community edition.

Then there's my recent NSspeech Lib, which is free / open source and built with LiveCode Builder. It's macOS only, but it should work on every OS version from 10.6 on, and it has at least one feature that LC's new speechLib doesn't have (probably because it's a speech API feature that's specific to macOS/Mac OS X): the ability to synthesize speech to an AIFF sound file.

If you only need to support iOS, but also need to support older iOS devices, you could use the AVSpeech lib that youse guys were putting together in the other thread. Change the handlers so they're closer to the ones in LC's speechLib if you like; if you search for "speech" in the dictionary they should all show up (as soon as I finish adding docs to my lib, anyway).

If anyone would like to do the work, one could take my NSspeechLib and combine it with the AVspeech lib in the other thread to make a single lib that supports speech on old versions of macOS & old versions of iOS in one library available to the community. Don't look to me to do it as I intend to concentrate on other things right now.

Then someone might come along and support some alternative open source TTS engine, like MaryTTS or eSpeak for example, that aren't a built-in part of an operating system. I imagine someone making an LCB lib that uses one of those would want to name their handlers something like eSpeakGetVoices() for example.
