PDF search and highlighting
Hello all,
After quite a long time away (I originally wrote the DataTree treeview many moons ago, if anyone remembers that?), I'm back working on a little LiveCode project for work.
I'm quite impressed with what LiveCode has become and a lot came back to me pretty quickly. One thing we didn't have back in the day was the ability to render PDFs but it's something I need for this project. I've easily got some to display in a browser widget but can't do much else with them. Ideally, I would like to be able to search for text terms using a field and highlight text using some sort of highlight tool.
I doubt it's possible with the browser widget as it stands, although I gather it's based on Chromium, and the PDF renderer is in turn based on PDFium. Is it possible to send JavaScript to the browser widget to get it to do things? Could search and highlighting be achieved by loading in a web page that drives the PDF using JavaScript? It's a bit beyond my skills, but if it were possible, it's something I could get some help with from web developer friends, I'm sure.
The business edition has a PDF widget. Again, this looks read-only. Is it worth the licence fee?
Thanks,
Steve
Re: PDF search and highlighting
Hi Steve...
Using an online tool such as https://pdftotext.com/
one can convert a small PDF to a fairly clean, text-only downloadable format... then do whatever one wants with it in LC.
Maybe there are similar libraries available for a conversion tool that could then be used offline by an LC stack.
I haven't had the time to search for that yet!
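Once the text is out of the PDF, the search side is fairly simple. Here is a minimal sketch in Python, assuming the converted text is already available as a plain string. The names `find_term` and `sample` are made up for illustration and are not part of any tool mentioned in this thread. It returns a line number and character offsets for each match, which is the kind of information a highlight routine would need.

```python
import re
from typing import List, Tuple


def find_term(text: str, term: str) -> List[Tuple[int, int, int]]:
    """Return (line_number, start_column, end_column) for each case-insensitive match.

    Line numbers are 1-based; columns are 0-based offsets within the line.
    """
    if not term:
        return []
    # Escape the term so characters like "." are matched literally.
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((line_number, match.start(), match.end()))
    return hits


# Hypothetical text, standing in for the output of a PDF-to-text converter.
sample = "Invoice total: 42\nPlease pay the invoice by Friday.\nThank you."
print(find_term(sample, "invoice"))
```

The same idea should carry over to a LiveCode field, since the offsets map onto character ranges within a line, though the chunk expressions would need adapting.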
Last edited by liveme on Wed Mar 17, 2021 5:45 pm, edited 1 time in total.
Re: PDF search and highlighting
Thanks for this.
Years ago, I wrote something that took some XML and constructed a 'page' using various fields and other layout constructs I'd created. If I did this again, I could recreate the content of the PDFs and then do whatever I wanted with them... highlighting, searches and so on.
pdftotext would then be useful in extracting the text from the PDF.
I think there must surely be some mileage in a library that improves PDF functionality still further, beyond what the browser and PDF widgets can do. If only I had time these days...
Steve
Re: PDF search and highlighting
I looked a bit further into this over the weekend.
Turns out we don't need to highlight content on the PDF. However, a search function is on the list.
Knowing that I can't interact with the browser if it natively loads in a PDF, I thought 'how about hooking in a JavaScript renderer, as mentioned in a thread elsewhere?'. And thus, I found pdf dot js and it does the job. I would link but it seems the forum isn't keen on this (I can't even type the name of the library!).
The prebuilt (for older browsers) package would appear to work quite well. I have much to learn but I think I have a solution now.
Steve
Re: PDF search and highlighting
Cool. I've no real knowledge of JS, let alone a renderer.
All I can offer is some "confined", "virtual" support!
...and beta testing under "LinLin"
(which, BTW, has no working web browser feature,
in case you decide "not" to go that way...)
Re: PDF search and highlighting
Hi SteveFI
To combat spam, the forum doesn’t give users privileges until they reach 8 posts. Keep posting and soon you will be able to put up links etc.
Good Luck & keep posting
Re: PDF search and highlighting
One option, which requires a bit of knowledge of Python but not too much, is to create a simple server (you can host it on replit for free if it's small scale, because it doesn't have the best response times) and use the pdfx module to convert the PDF to JSON in Python, which you then send back to the client. Not the cleanest, but it can work...
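To give a rough idea of the shape of such a server, here is a minimal sketch using only the Python standard library. It does not call pdfx: the page text is a hard-coded placeholder (`EXAMPLE_PAGES`), and the names `pages_to_json`, `PdfJsonHandler` and `make_server`, as well as the JSON layout, are assumptions made for illustration. A real version would fill the page list from whichever PDF text extractor is used.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List


def pages_to_json(pages: List[str]) -> str:
    """Serialise per-page text as JSON with a page count and numbered pages."""
    payload = {
        "page_count": len(pages),
        "pages": [
            {"number": number, "text": text}
            for number, text in enumerate(pages, start=1)
        ],
    }
    return json.dumps(payload)


# Placeholder data: a real server would get this from a PDF text extractor.
EXAMPLE_PAGES = ["First page text.", "Second page text."]


class PdfJsonHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Every GET request receives the same JSON document in this sketch.
        body = pages_to_json(EXAMPLE_PAGES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8000) -> HTTPServer:
    """Build (but do not start) a local server bound to 127.0.0.1."""
    return HTTPServer(("127.0.0.1", port), PdfJsonHandler)
```

Calling `make_server().serve_forever()` would start it locally, and the client (a LiveCode stack, say) would then fetch the JSON over HTTP and search the page text itself.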
Re: PDF search and highlighting
Thanks! I'm making really good progress on this on-and-off project of mine!
I've got really far with pdf dot js and the only wall I've hit is being able to display a target pdf by default. I've had to ask a web/js developer friend of mine but I hope we can crack it.
Steve
Re: PDF search and highlighting
FiNN-6001 wrote: Mon Mar 29, 2021 12:32 am
One option, which requires a bit of knowledge of Python but not too much, is to create a simple server (you can host it on replit for free if it's small scale, because it doesn't have the best response times) and use the pdfx module to convert the PDF to JSON in Python, which you then send back to the client. Not the cleanest, but it can work...
Thanks. I'm afraid I don't really know Python and the prospect of starting servers and so on sounds a bit scary!
Steve
Re: PDF search and highlighting
...interesting, FiNN. I'd be more afraid of JSON itself than of *easy* Python.