Checksum JPEG data (not the whole file)

Are there end-of-exif / end-of-xmp / end-of-iptc / start-of-data markers that I could use to get a checksum of just the data part of a jpg / jpeg (and other image formats)?

Best answer

MediaTags has checksum support for JPEG, MP3, M4A, etc.

Answers

I think this question is related to this one: "Compute hash of only the core image data (excluding metadata) for an image". https://stackoverflow.com/a/10075170/890106 gives part of an answer if you're looking for code.

It might not work with all JPG variants though: some of them can embed multiple images (MPF / CIPA Multi-Picture Format, more information at http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/MPF.html) and you might still have some metadata. Also, some software puts a UID in the form of --[0-9A-F]+-- at the end of the file, and it shouldn't be read. The safest solution is probably to checksum the pixels (though orientation, color profile, etc. can still influence the result).

One easy way to get a hash of just the pixel data would be to convert the JPEG into a 32-bit BMP, or alternatively into a PNG, and to calculate a hash from that. This strips all the associated information from the JPEGs and would even match JPEGs with different encodings that lead to the same pixel data. You could of course also use the in-memory pixel data of the resulting BMPs directly if you have it (e.g. Windows has several API functions to get it from any supported image type).

Yes for JPEG and EXIF; I don't know about the others.

The JPEG spec that I have is called JFIF (JPEG File Interchange Format). It comes from Annex B of ISO 10918-1 and, like all ISO specs, it takes careful reading to figure out how to translate the spec into data structures. I think this is much easier to follow.

The EXIF format parses much like the TIFF format. Each chunk has a type and a size, so you just walk the chunks until you get to the image data chunk. It has a pointer to the image data (actually pointers to strips, but I'm pretty sure you can assume that everything from the first strip of image data to the end of the file is image data).

The EXIF format has its own website.

You'll have to look at each format. For JPEG, it looks like the structure lets you walk the file segment by segment: the application segments that carry metadata start with FFEn (e.g. 0xFFE1), and the length follows the marker as 2 bytes in big-endian format, so you can use it to step over those sections and checksum the remaining bytes. For more details, see here.
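The segment walk described above could be sketched roughly as follows. This is a minimal illustration, not a complete JPEG parser: the function names are made up for this example, and it assumes you want to drop every APPn segment (0xFFE0 to 0xFFEF) plus comment segments (0xFFFE) and to ignore anything after the first EOI marker. Note that this also drops the ICC profile (usually in APP2) and the Adobe segment (APP14), which can affect how the pixels are interpreted, and it does not handle multi-image MPF files.

```python
import hashlib
import struct

EOI, SOS = 0xD9, 0xDA


def _end_of_scan(data: bytes, pos: int) -> int:
    """Return the offset of the next real marker after entropy-coded data."""
    n = len(data)
    while True:
        i = data.find(b"\xff", pos)
        if i < 0 or i + 1 >= n:
            return n  # truncated file: treat the rest as scan data
        nxt = data[i + 1]
        # FF00 is a stuffed byte and FFD0..FFD7 are restart markers;
        # both belong to the scan, so keep searching.
        if nxt == 0x00 or 0xD0 <= nxt <= 0xD7:
            pos = i + 2
            continue
        return i


def jpeg_data_digest(data: bytes, algo: str = "sha256") -> str:
    """Hash a JPEG byte string, skipping APPn and COM segments (assumption)."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    h = hashlib.new(algo)
    pos, n = 2, len(data)
    while pos < n:
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        # Markers may be preceded by extra 0xFF fill bytes.
        while pos + 1 < n and data[pos + 1] == 0xFF:
            pos += 1
        if pos + 1 >= n:
            raise ValueError("truncated marker")
        marker = data[pos + 1]
        pos += 2
        if marker == EOI:
            break  # ignore any trailer after the end of the image
        if 0xD0 <= marker <= 0xD7 or marker == 0x01:
            continue  # standalone markers carry no length field
        if pos + 2 > n:
            raise ValueError("truncated segment length")
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        if length < 2 or pos + length > n:
            raise ValueError("bad segment length")
        segment = data[pos - 2:pos + length]
        pos += length
        is_metadata = 0xE0 <= marker <= 0xEF or marker == 0xFE
        if not is_metadata:
            h.update(segment)
        if marker == SOS:
            # Compressed image data follows the SOS header directly.
            end = _end_of_scan(data, pos)
            h.update(data[pos:end])
            pos = end
    return h.hexdigest()
```

Two files that differ only in their EXIF/XMP/IPTC blocks, comments, or bytes appended after EOI should then give the same digest, while any change to the quantization tables, Huffman tables, or compressed scan data changes it.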

Since you want to do this for various image formats, you should just use a general-purpose image decompression library and run your checksum on the uncompressed data. This will allow you to match identical images even if they are encoded differently on disk.
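As a rough sketch of hashing the uncompressed data: the function below assumes some imaging library has already decoded the file and handed you the raw pixel bytes together with the image dimensions. The function name, the `mode` labels and the bytes-per-pixel table are illustrative assumptions, not part of any particular library. The dimensions and mode are mixed into the hash so that, for example, a 2x1 and a 1x2 image with the same bytes do not collide.

```python
import hashlib
import struct

# Hypothetical mode labels and their sizes; adapt to what your decoder returns.
BYTES_PER_PIXEL = {"L": 1, "RGB": 3, "RGBA": 4}


def pixel_digest(width: int, height: int, mode: str, pixels: bytes,
                 algo: str = "sha256") -> str:
    """Hash decoded pixel data along with its shape and pixel layout."""
    expected = width * height * BYTES_PER_PIXEL[mode]
    if len(pixels) != expected:
        raise ValueError(f"expected {expected} pixel bytes, got {len(pixels)}")
    h = hashlib.new(algo)
    h.update(struct.pack(">II", width, height))  # image dimensions
    h.update(mode.encode("ascii") + b"\x00")     # pixel layout label
    h.update(pixels)                             # the decoded pixel bytes
    return h.hexdigest()
```

Different JPEG decoders may round slightly differently, so digests are probably only comparable when the same decoder and settings are used throughout.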

If you want to limit yourself to JPEG, you can checksum the data between SOI and EOI. This answer can be slightly adapted to do what you need.




