PDF search and highlighting

SteveFI
Posts: 30
Joined: Tue Mar 16, 2021 6:15 pm

PDF search and highlighting

Post by SteveFI » Tue Mar 16, 2021 6:28 pm

Hello all,

After quite a long time away (I originally wrote the DataTree treeview many moons ago, if anyone remembers that?), I'm back working on a little LiveCode project for work.

I'm quite impressed with what LiveCode has become and a lot came back to me pretty quickly. One thing we didn't have back in the day was the ability to render PDFs but it's something I need for this project. I've easily got some to display in a browser widget but can't do much else with them. Ideally, I would like to be able to search for text terms using a field and highlight text using some sort of highlight tool.

I doubt it's possible with the browser widget as it stands, although I gather it's based on Chromium, and the PDF renderer is in turn based on PDFium. Is it possible to send JavaScript to the browser widget to get it to do things? Could search and highlighting be achieved by loading a web page that drives the PDF using JavaScript? It's a bit beyond my skills, but if it were possible, it's something I could get some help with from web developer friends, I'm sure.

The business edition has a PDF widget. Again, this looks read-only. Is it worth the licence fee?

Thanks,


Steve

liveme
Posts: 240
Joined: Thu Aug 27, 2015 5:22 pm
Location: down under

Re: PDF search and highlighting

Post by liveme » Tue Mar 16, 2021 7:03 pm

Hi Steve...
Using an online tool such as https://pdftotext.com/
you can convert a small PDF to a fairly clean, text-only downloadable format, then do whatever you want with it in LC.
Maybe there are similar libraries for a conversion tool that an LC stack could then use offline :idea:
I haven't had the time to search for that yet!
:wink:
Last edited by liveme on Wed Mar 17, 2021 5:45 pm, edited 1 time in total.

SteveFI
Posts: 30
Joined: Tue Mar 16, 2021 6:15 pm

Re: PDF search and highlighting

Post by SteveFI » Wed Mar 17, 2021 11:15 am

Thanks for this.

Years ago, I wrote something that took some XML and constructed a 'page' using various fields and other layout constructs I'd created. If I did this again, I could recreate the content of the PDFs and then do whatever I wanted with them... highlighting, searches and so on.

pdftotext would then be useful in extracting the text from the PDF.
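
As a rough sketch of the search step once the text has been extracted: the function below (the name `find_hits` is made up for illustration) returns the page number and character range of each match. It assumes the extracted text separates pages with form-feed characters, which may not hold for every converter, so treat the page splitting as an assumption to check against your own output.

```python
import re


def find_hits(text, term):
    """Return (page, start, end) for each case-insensitive match of term.

    Pages are 1-based; start and end are 0-based offsets within the page,
    with end exclusive (the usual Python slice convention).
    """
    if not term:
        return []
    hits = []
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    # Assumption: pages in the extracted text are separated by form feeds.
    for page_no, page in enumerate(text.split("\f"), start=1):
        for match in pattern.finditer(page):
            hits.append((page_no, match.start(), match.end()))
    return hits


sample = "LiveCode renders PDFs.\fSearch the PDF text here."
print(find_hits(sample, "pdf"))
```

LiveCode counts characters from 1, so a hit at offsets (start, end) would correspond to char start+1 to end of the field holding that page's text.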

I think there must surely be some mileage in a library that improves PDF functionality still further, beyond what the browser and PDF widgets can do. If only I had time these days...

Steve

SteveFI
Posts: 30
Joined: Tue Mar 16, 2021 6:15 pm

Re: PDF search and highlighting

Post by SteveFI » Mon Mar 22, 2021 5:56 pm

I looked a bit further into this over the weekend.

Turns out we don't need to highlight content on the PDF. However, a search function is on the list.

Knowing that I can't interact with the browser if it natively loads a PDF, I thought 'how about hooking in a JavaScript renderer, as mentioned in a thread elsewhere?'. And thus, I found pdf dot js and it does the job. I would link but it seems the forum isn't keen on this (I can't even type the name of the library!).


The prebuilt (for older browsers) package would appear to work quite well. I have much to learn but I think I have a solution now.

Steve

liveme
Posts: 240
Joined: Thu Aug 27, 2015 5:22 pm
Location: down under

Re: PDF search and highlighting

Post by liveme » Mon Mar 22, 2021 7:25 pm

Cool, I've no real knowledge of JS, let alone a renderer :lol:
all I can offer..some "confined","virtual" support !
...and beta testing under "LinLin"
(which BTW has no web browser working feature,
in case you'd decide "not" to go that way... :lol: )

kdjanz
Posts: 300
Joined: Fri Dec 09, 2011 12:12 pm
Location: Fort Saskatchewan, AB Canada

Re: PDF search and highlighting

Post by kdjanz » Sun Mar 28, 2021 7:02 pm

Hi SteveFI

To combat spam, the forum doesn’t give users privileges until they reach 8 posts. Keep posting and soon you will be able to put up links etc.

Good Luck & keep posting

FiNN-6001
Posts: 8
Joined: Mon Feb 15, 2021 11:46 pm

Re: PDF search and highlighting

Post by FiNN-6001 » Mon Mar 29, 2021 12:32 am

One option, which requires a bit of knowledge of Python but not too much, is to create a simple server (you can host it on replit for free if it's small scale, because it doesn't have the best response times) and use the pdfx module to convert the PDF to JSON in Python, which you then send back to the client. Not the cleanest, but it can work...
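
A minimal sketch of what such a server could look like, using only Python's standard library. The extraction step is a hypothetical placeholder (`extract_text` just returns canned text) rather than a real pdfx call, and the JSON layout, handler name and port are all assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def extract_text(pdf_path):
    """Hypothetical placeholder: swap in a real PDF extraction library here."""
    return "Page one text.\fPage two text."


def pdf_to_json(pdf_path):
    """Build the JSON document the client would receive."""
    # Assumption: the extractor separates pages with form feeds.
    pages = extract_text(pdf_path).split("\f")
    return json.dumps({"file": pdf_path, "pages": pages})


class PdfHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Assumption: the request path names a PDF the server already holds.
        body = pdf_to_json(self.path.lstrip("/")).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8000):
    """Create (but do not start) a server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), PdfHandler)
```

Calling `make_server().serve_forever()` would start it locally, and a LiveCode stack could then fetch the JSON with an ordinary `get URL` request.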

SteveFI
Posts: 30
Joined: Tue Mar 16, 2021 6:15 pm

Re: PDF search and highlighting

Post by SteveFI » Fri Apr 09, 2021 9:39 pm

kdjanz wrote:
Sun Mar 28, 2021 7:02 pm
Hi SteveFI

To combat spam, the forum doesn’t give users privileges until they reach 8 posts. Keep posting and soon you will be able to put up links etc.

Good Luck & keep posting
Thanks! I'm making really good progress on this on-and-off project of mine!

I've got really far with pdf dot js and the only wall I've hit is getting it to display a target PDF by default. I've had to ask a web/JS developer friend of mine but I hope we can crack it.
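
One avenue that may be worth checking: as far as I know, the prebuilt viewer page (viewer.html) reads a `file` query parameter naming the PDF to open and a `search` hash parameter holding a term to find, though this should be verified against the version downloaded. Under that assumption, the sketch below builds such a URL; the function name and the example paths are hypothetical.

```python
from urllib.parse import quote


def viewer_url(viewer, pdf, search=None):
    """Build a viewer URL that opens pdf and optionally searches for a term.

    Assumes the viewer honours ?file=... and #search=... parameters.
    """
    url = f"{viewer}?file={quote(pdf, safe='')}"
    if search:
        url += f"#search={quote(search, safe='')}"
    return url


print(viewer_url("web/viewer.html", "docs/manual.pdf", "invoice total"))
```

The resulting string could then be set as the URL of the browser widget, so the target PDF loads without any extra clicks.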

Steve

SteveFI
Posts: 30
Joined: Tue Mar 16, 2021 6:15 pm

Re: PDF search and highlighting

Post by SteveFI » Fri Apr 09, 2021 9:42 pm

FiNN-6001 wrote:
Mon Mar 29, 2021 12:32 am
One option, which requires a bit of knowledge of Python but not too much, is to create a simple server (you can host it on replit for free if it's small scale, because it doesn't have the best response times) and use the pdfx module to convert the PDF to JSON in Python, which you then send back to the client. Not the cleanest, but it can work...
Thanks. I'm afraid I don't really know Python and the prospect of starting servers and so on sounds a bit scary!

Steve

liveme
Posts: 240
Joined: Thu Aug 27, 2015 5:22 pm
Location: down under

Re: PDF search and highlighting

Post by liveme » Sat Apr 10, 2021 12:57 am

...interesting, FiNN. I'd be more afraid of JSON itself than of *easy* Python
:wink:
