Oh no another speed test : read thumbnail from raw
Hi,
I have been experimenting with reading a JPEG thumbnail from camera raw files. There are hundreds of raw file formats, so I have only tested on a few (Panasonic RW2, Olympus ORF and Nikon NEF), but there is a good chance it will work on any raw file that is based on the TIFF file format. These files contain the raw image data, Exif metadata and JPEG thumbnails or previews.
The code in the attached stack searches the data in the file for a 4-byte header that indicates the start of JPEG image data, then reads that data and displays the JPEG. The code is crude and has no error checking; it also stops searching when it finds the first block of JPEG data. The stack offers three methods of reading the data.
1) Read the whole raw file into memory and then process it. On a Mac the first read takes longer than subsequent reads, indicating that Mac OS buffers files in the background for a period. I see first-run times between 80 and 350 ms.
2 and 3) These methods read in small blocks of data, which are then searched. You may enter any value for block size. I find that a size of 256-512 gives the fastest times, typically 80 ms.
My raw files are between 16 and 24 MB in size. I suspect that larger files will require a different block size and that the optimum for all files depends on how the OS reads data from disk (it might be the page size), meaning that the code could be tuned for different OSes and disk formats. However, I can live with read times of 300 ms or better.
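For anyone who wants to try the block-size question outside LiveCode, here is a minimal Python sketch. It is not the code from the attached stack. It only times reading a whole file in fixed-size blocks; the function name time_block_reads and the 4 MiB random stand-in file are assumptions made for illustration, and OS file caching will affect repeat runs in the same way as described for method 1.

```python
import os
import tempfile
import time


def time_block_reads(path, block_sizes):
    """Return {block_size: seconds} for reading the whole file block by block."""
    timings = {}
    for size in block_sizes:
        start = time.perf_counter()
        # buffering=0 gives an unbuffered file, so each read() asks the OS
        # for at most `size` bytes instead of being served from Python's buffer.
        with open(path, "rb", buffering=0) as f:
            while f.read(size):
                pass
        timings[size] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Stand-in for a raw file: 4 MiB of random bytes in a temporary file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(4 * 1024 * 1024))
    try:
        results = time_block_reads(tmp.name, [256, 512, 4096, 65536])
        for size, secs in results.items():
            print(f"{size:>6} bytes/block: {secs * 1000:.1f} ms")
    finally:
        os.remove(tmp.name)
```

Pointing time_block_reads at a real raw file instead of the temporary one should give numbers that are closer to the ones discussed here, but they will still depend on the disk, the OS and whether the file is already cached.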
Please have a try on your system and report back any results or improvements in the code.
best wishes
Simon
- Attachments
- ReadThumbnailSpeedTest.livecode.zip (7.71 KiB)
- Archive of jpeg thumbnail reader / speed test
Re: Oh no another speed test : read thumbnail from raw
Ooops,
I have just remembered that the stack uses a shell command to determine the size of the selected file, so I suspect that this command is OS specific.
From the mouseUp handlers in buttons 2 and 3:
Code: Select all
-- macOS/BSD syntax; GNU stat on Linux uses "stat -c%s" instead
put "stat -f%z " & quote & tFileName & quote into tCommand
put word 1 of shell (tCommand) into tBytesInFile
The first button should work across all desktops, and while I could look up versions of the shell script for Windows and Linux, I have no way of testing them, sorry.
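One way to sidestep an OS-specific shell call is to ask the runtime for the file size directly. A minimal Python sketch of that idea follows; it is only an illustration in another language, not what the stack does, and the helper name file_size is hypothetical.

```python
import os
import tempfile


def file_size(path):
    """Size of the file in bytes, as reported by the operating system."""
    return os.path.getsize(path)


if __name__ == "__main__":
    # Write a small temporary file and ask for its size without any shell call.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\x00" * 1234)
    try:
        print(file_size(tmp.name))
    finally:
        os.remove(tmp.name)
```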
S
Re: Oh no another speed test : read thumbnail from raw
Here is an updated stack that uses a handler posted by Jan Schenkel here https://forums.livecode.com/viewtopic.php?t=5397
I believe that the stack will now work across all desktop operating systems but there is no error checking or user protection in the stack.
S
- Attachments
- ReadThumbnailSpeedTest.livecode.zip (5.89 KiB)
- Updated stack file - now multi platform
Re: Oh no another speed test : read thumbnail from raw
Hi Simon,
I tried using byteOffset in handler ReadJpegFromRaw of button "Search1" of your second stack. I commented-out the unused code part.
I could only test on JPEGs but it should also work on Raw.
Code: Select all
Function ReadJpegFromRaw @pImage
--put empty into field "debug"
--put ReadImageOrientation (pImage) into tOrintation
--local tFilePointer = 0, tFileLength
local tJpegStart= -1, tJpegEnd = -1
local tJpegData
Constant FF = 255
Constant D8 = 216
Constant DB = 219
Constant D9 = 217
-- repeat with n = 1 to length(pImage)
-- if byteToNum(Byte n of pImage) is FF then
-- --test following byte
-- if byteToNum(Byte n+1 of pImage) is D8 then
-- --test following byte
-- if byteToNum(Byte n+2 of pImage) is FF then
-- --test following byte
-- if byteToNum(Byte n+3 of pImage) is DB then
-- --put "Found FFD8 FFDB at address : " & n & cr after fld "debug"
-- put n into tJpegStart
-- exit repeat
-- end if
-- end if
-- end if
-- end if
-- end repeat
put numToByte(FF) & numToByte(D8) & numToByte(FF) & numToByte(DB) into tStartJPG
put byteOffset(tStartJPG, pImage) into tJpegStart
--put "JpegStart is set to " & tJpegStart & cr after fld "debug"
if tJpegStart is 0 then return "-1"
-- Start code found so now locate the end
-- repeat with n = tJpegStart+3 to length(pImage)
-- if byteToNum(Byte n of pImage) is FF then
-- --test following byte
-- if byteToNum(Byte n+1 of pImage) is D9 then
-- put n into tJpegEnd
-- --put "Found FFD9 at address : " & tJpegEnd & cr after fld "debug"
-- exit repeat
-- end if
-- end if
-- end repeat
put numToByte(FF) & numToByte(D9) into tEndJPG
-- skip past the start marker; tJpegStart itself must stay on the first FF
put tJpegStart + 3 into tSkip
put byteOffset(tEndJPG, pImage, tSkip) into tJpegEnd
if tJpegEnd > 0 then
add tSkip to tJpegEnd
else
put -1 into tJpegEnd
end if
If tJpegEnd is -1 then return "-1"
put Byte tJpegStart to tJpegEnd+1 of pImage into tJpegData
return tJpegData
end ReadJpegFromRaw
Bernd
Re: Oh no another speed test : read thumbnail from raw
Simon,
I just noticed that I fell afoul of image caching. The first time around it is slow because the whole image is read; the second time around it is fast because the image is cached.
Kind regards
Bernd
Re: Oh no another speed test : read thumbnail from raw
Simon,
How about this
Code: Select all
on mouseUp pMouseButton
put empty into image "Thumbnail"
put the milliseconds into tStart
put field "FileName" into tNameOfImageFile
Constant FF = 255
Constant D8 = 216
Constant DB = 219
Constant D9 = 217
put numToByte(FF) & numToByte(D9) into tEndJPG
Open File tNameOfImageFile for binary read
read from file tNameOfImageFile until tEndJPG
put it into tImageData
Close File tNameOfImageFile
put numToByte(FF) & numToByte(D8) & numToByte(FF) & numToByte(DB) into tStartJPG
put byteOffset(tStartJPG,tImageData) into tJPGStart
if tJPGStart = 0 then
put empty into image "ThumbNail"
else
put byte tJPGStart to - 1 of tImageData into tJpeg
put tJpeg into image "ThumbNail"
end if
put the milliseconds into tEnd
put tEnd - tStart into tProcessTime
put "Read and extract Jpeg - time to beat : " & tProcessTime && "milliseconds" & cr after field "Results"
end mouseUp
Bernd
Re: Oh no another speed test : read thumbnail from raw
Hi Bernd,
Your code blows mine out of the water! With my test images on an SD card, your code reads in 4 ms, which is 20 times faster than mine, or 80 times faster than my original code. So I think you win!
best wishes
Simon
Re: Oh no another speed test : read thumbnail from raw
Hi Simon,
glad it works for you.
Additionally, reading in chunks of e.g. 256, or any other size, has a slight chance of cutting through your target string/byte sequence.
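A common way around that when reading in blocks is to carry the last few bytes of each block over into the next search. The Python sketch below illustrates the idea under stated assumptions: the name find_marker and the 256-byte default are hypothetical, and this is not the code used in either stack.

```python
import io


def find_marker(stream, marker, block_size=256):
    """Return the absolute offset of the first `marker` in `stream`, or -1."""
    overlap = len(marker) - 1
    tail = b""      # last `overlap` bytes carried over from earlier blocks
    consumed = 0    # number of bytes read before the current block
    while True:
        block = stream.read(block_size)
        if not block:
            return -1
        window = tail + block
        pos = window.find(marker)
        if pos != -1:
            # `window` starts len(tail) bytes before the current block.
            return consumed - len(tail) + pos
        tail = window[-overlap:] if overlap else b""
        consumed += len(block)


if __name__ == "__main__":
    marker = b"\xff\xd8\xff\xdb"
    # The marker starts at offset 254, so it straddles the 256-byte boundary.
    data = b"\x00" * 254 + marker + b"\x00" * 100
    blocks = [data[i:i + 256] for i in range(0, len(data), 256)]
    print(any(marker in b for b in blocks))    # False: no single block holds it
    print(find_marker(io.BytesIO(data), marker))  # 254
```

Keeping only len(marker) - 1 bytes is enough, because any marker that spans a block boundary must start within that many bytes of the end of the previous block.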
Kind regards
Bernd
Re: Oh no another speed test : read thumbnail from raw
Quote: "The code in the attached stack searches the data in the file for a 4 byte header that indicates the start of jpeg image data, it reads this and the jpeg is displayed. The code is crude and has no error checking, it also stops searching when it finds the first block of jpeg data. The stack offers three methods of reading the data."
I wrote two functions (one for embedded JPEGs and one for PNGs) similar to that some time ago while trying to fix problems with Mark Smith's (R.I.P.) ID3 tag lib. I haven't really done any more work on it or any speed optimizations, and the repo is a mess, but you might want to give it a look to compare.
See here:
https://github.com/PaulMcClernan/id3lib
Re: Oh no another speed test : read thumbnail from raw
Hi Bernd and Paul,
Thanks for your posts. Both of your methods are very quick, probably because they let the Disk File System do the heavy lifting.
Bernd : Searching for the End of Jpeg Data bytes first is a very elegant solution, but I worry that the two bytes, 0xFFD8, may not be unique within a raw file. This issue was highlighted by Paul's code, downloaded from GitHub.
Paul : In its original form your handler crashes because it only searches for a two-byte Start of Jpeg Data marker, 0xFFD8, which it finds, but which is not the start of jpeg data. It seems that these two bytes also indicate the start of EXIF data within "some" raw files. This is why my code uses 0xFFD8 0xFFDB to find the start of jpeg data. Once the correct start point is found, the next End of Jpeg Data marker, 0xFFD9, has to be correct by definition.
The most robust method is to search for the four-byte Start of Jpeg Data marker first, using Paul's two read from file commands. Although this requires two blocks of data to be passed to Livecode, it does seem to run at the same speed as Bernd's method.
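As a rough illustration of the "start marker first, then end marker" order, here is a Python sketch. It is an analogue, not the LiveCode in the attached stack: it memory-maps the file so the OS pages data in as needed, the names extract_jpeg, START and EOI are hypothetical, and it assumes a non-empty file (mapping an empty file raises an error).

```python
import mmap
import os
import tempfile

START = b"\xff\xd8\xff\xdb"  # four-byte start pattern discussed in this thread
EOI = b"\xff\xd9"            # End of Jpeg Data marker


def extract_jpeg(path):
    """Return the first embedded JPEG as bytes, or None if no match is found."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        start = m.find(START)
        if start == -1:
            return None
        # Only look for the end marker after the start pattern.
        end = m.find(EOI, start + len(START))
        if end == -1:
            return None
        return m[start:end + len(EOI)]


if __name__ == "__main__":
    jpeg = START + b"payload" + EOI
    # Fake "raw" file: a two-byte 0xFFD8 false start and a stray 0xFFD9
    # both appear before the real thumbnail.
    blob = (b"II*\x00" + b"\xff\xd8\xff\xe1" + b"\x00" * 10
            + b"\xff\xd9" + b"\x00" * 10 + jpeg + b"\x00" * 10)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(blob)
    try:
        print(extract_jpeg(tmp.name) == jpeg)
    finally:
        os.remove(tmp.name)
```

In this synthetic file neither the early 0xFFD8 nor the early 0xFFD9 disturbs the result, because the search for the end marker only begins after the four-byte start pattern has been located.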
I have found this a very interesting exercise - thank you.
best wishes
Simon
- Attachments
- ReadThumbnailSpeedTest.livecode.zip (10.63 KiB)
- Stack that uses various methods of reading jpeg image data from a binary file.
Re: Oh no another speed test : read thumbnail from raw
Cool, I'll give it a try when I get a chance. (Damn it Jim!!!...) I'm a graphic artist not a photographer so I don't have any big RAW files around but I think I might be able to get some big ones off the net (maybe from NASA) to test. I know the speed of a function like this is important when you have a ton of files to go through (like my mess of an mp3 collection).
Re: Oh no another speed test : read thumbnail from raw
On second thought, I should just be able to test this on any large file with an embedded JPEG. I'm a prog-rock fan, so I have tons of mp3s that are larger than 16 MB, LOL!
Re: Oh no another speed test : read thumbnail from raw
9.1 MB mp3 file:
Apparently a lot of my mp3 files have embedded PNG files instead of JPEG, so I'm wasting lots of disc storage space! More reason to get back to working on a better ID3 lib someday!
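A quick way to check which kind of image a file carries is to look for the file signatures. The Python sketch below is only an assumption-laden illustration, not the id3lib code: the function name embedded_image_type is hypothetical, it uses the fixed 8-byte PNG signature and the first three bytes of a JPEG stream, and a proper ID3 parser would read the MIME type from the APIC frame rather than scanning bytes.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG file signature
JPEG_START = b"\xff\xd8\xff"          # Start of Jpeg Data plus first byte of the next marker


def embedded_image_type(data):
    """Return 'png', 'jpeg' or None, depending on which signature appears first."""
    hits = {}
    for name, signature in (("png", PNG_SIGNATURE), ("jpeg", JPEG_START)):
        pos = data.find(signature)
        if pos != -1:
            hits[name] = pos
    # Pick the signature with the smallest offset, if any was found.
    return min(hits, key=hits.get) if hits else None


if __name__ == "__main__":
    fake_mp3 = b"ID3" + b"\x00" * 20 + PNG_SIGNATURE + b"rest of file"
    print(embedded_image_type(fake_mp3))
```

Byte scanning like this can be fooled by the same short patterns turning up elsewhere in the file, which is the false-positive problem already seen in this thread with two-byte markers.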
Re: Oh no another speed test : read thumbnail from raw
Well, if you want some raw files to play with, I have over a hundred thousand of them and you are welcome to some.
This exercise in optimisation now means I'm planning to update my file renaming app to list all the images with thumbnails, which will be good. The bad side is that I will probably end up using a datagrid, and they tend to lead me into "issues"!
Quote: "I'm a graphic artist not a photographer"
Well, I'm a photographer because I have few artistic skills, which is quite annoying as my late Uncle was a commercial artist doing work for Airfix and Revell as well as book covers, and my daughter is currently reading Art and Design at Leeds University. The closest I have come to graphic art is having to pay out for the Adobe suite of applications for her to use despite already buying her the much cheaper and in my opinion better Affinity collection.
best wishes
Simon
Re: Oh no another speed test : read thumbnail from raw
Quote: "The closest I have come to graphic art is having to pay out for the Adobe suite of applications for her to use despite already buying her the much cheaper and in my opinion better Affinity collection."
I've been hearing a lot about these Affinity apps lately, and they have iPad versions, which is nice! I really like the drawing app Procreate (only $10) with the iPad/iPencil that my son turned me on to! I think I've tried just about every graphics app for the last 30 years. Lots of now defunct software like MultiAd Creator, Deneba Canvas, Aldus FreeHand, etc.
I used to do lots of AppleScript (which is like the spiritual cousin of LiveCode's Script) automation with QuarkXpress and later InDesign for offset printing (paper), but now I'm involved with a lot of packaging work which is mostly flexographic, screenprint, & gravure (extremely expensive cylinders that last forever) for printing on plastics that stretch, or curved surfaces that require distortions to look right, and use glitters, varnishes, foils, tactiles, holo, hotstamps, along with opaque & translucent inks. We use niche software by Esko called ArtPro ($$$$) along with a processing server for automation, in addition to Adobe CC. We still do lots of manual trapping (think a bazillion vector paths), so I doubt that Affinity would make the cut for that sort of work after only 5 years of development, but I'd really LOVE for Adobe to have some real competition again!
I just looked at the specs page for their Designer app, and it sounds pretty good... CMYK, Pantone, Color Profiles, etc.
"Overprint controls (for desktop only)" <- that's a good pro feature, but if it doesn't also have an overprint preview, that's a deal breaker.