Extract and analyze images with DynaPDF

Did you know that you can extract and analyze images from a PDF file with DynaPDF? You can use the function DynaPDF.GetImage for this. In the parameters you specify which piece of information you want to get for an image. You can choose between the following values:

BufSize The size of the image buffer in bytes.
Buffer The image data as JPEG or FILE.
Picture The image as a picture container. Either JPEG or TIFF.
Filter The format of the image: the required decode filter if the image is compressed. Possible values are dfDCTDecode (JPEG), dfJPXDecode (JPEG2000), and dfJBIG2Decode. Other filters are already removed by DynaPDF since a conversion to a native file format is then always required.
OrgFilter The image was compressed with this filter in the PDF file. This info is useful to determine which compression filter should be used when creating a new image file from the image buffer.
BitsPerPixel Bit depth of the image buffer. Possible values are 1, 2, 4, 8, 24, 32, and 64.
ColorSpace The color space refers either to the image buffer or to the color table if set.
NumComponents The number of components stored in the image buffer.
MinIsWhite If 1, the colors of 1 bit images are reversed.
ColorCount The number of colors in the color table.
Width Image width in pixels.
Height Image height in pixels.
ScanLineLength The length of a scanline in bytes.
InlineImage If 1, the image is an inline image.
Interpolate If 1, image interpolation should be performed.
Transparent The meaning is different depending on the bit depth and whether a color table is available. If the image is a 1 bit image and if no color table is available, black pixels must be drawn with the current fill color. If the image contains a color table, ColorMask contains the range of indexes in the form min/max index which should appear transparent. If no color table is present, ColorMask contains the transparent ranges in the form min/max for every color component.
Intent The rendering intent. Default is none.
MetadataSize Length of Metadata in bytes.
ResolutionX Image resolution on the x-axis.
ResolutionY Image resolution on the y-axis.
Metadata Optional XML Metadata stream as text.
ICCProfile ICC Color Profile of the colorspace (can be empty).
MaskImage If set, a 1 bit image is used as a transparency mask. Returns index of that image.
SoftMask If set, a grayscale image is used as alpha channel. Returns index of that image.
FillColor The current fill color. An image mask is drawn with the current fill color.
FillColorSpace The color space in which FillColor is defined.

For example, the size query would look like this:

Set Field [ Images::Width ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Width" ) ]
Set Field [ Images::Height; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Height") ]

In the parameters we first specify the PDF working environment in which the file with the analyzed images is located, then the index of the image to determine which image should be analyzed and finally the type of information that is requested.

As already mentioned, we can not only analyze images, but also extract them from the file. In the end, the image itself is just another piece of information that we query, which means we can use the same function again.

Set Field [ Images::image ; MBS("DynaPDF.GetImage"; $PDF; $i; "Picture"; $i&"_imag.png"; "PNG") ]

Here we pass two additional parameters: the file name and the format we want for the image.

If we want to extract all images of a document, we can use the functions in a loop. The loop will then run as many times as there are images. We determine the number of images with the function DynaPDF.GetImageCount. The index of the addressed images starts at 0, and ends at DynaPDF.GetImageCount-1.
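
As a rough sketch, such a loop could look like the following. It assumes that $PDF already refers to a DynaPDF working environment with the document loaded, that DynaPDF.GetImageCount takes that reference as its only parameter, and that each image gets its own record in the Images table used above. The variable $count and the New Record/Request step are our own additions for illustration, so adapt them to your solution.

# Assumption: $PDF already holds the PDF working environment with the document loaded
Set Variable [ $count ; Value: MBS( "DynaPDF.GetImageCount"; $PDF ) ]
Set Variable [ $i ; Value: 0 ]
Loop
    # Indexes run from 0 to $count - 1
    Exit Loop If [ $i ≥ $count ]
    # Assumption: one record per image in the Images table
    New Record/Request
    Set Field [ Images::Width ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Width" ) ]
    Set Field [ Images::Height ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Height" ) ]
    Set Field [ Images::image ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Picture"; $i & "_imag.png"; "PNG" ) ]
    Set Variable [ $i ; Value: $i + 1 ]
End Loop

Because the exit condition is checked at the top of the loop, a document without images creates no records at all.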

If you are interested in this topic, have a look at our new example Extract and analyze images.fmp12, which is included with the next version.

To use these functions you need a DynaPDF Lite license. The new example file Extract and analyze images.fmp12 is part of the 11.5 release.