Identify data content

We have seen various cases where it would be useful to check the content of a file in FileMaker, whether the file comes as a hex or base64 encoded string, a container value or a file on disk. Why would we need this? Well, let's think about a few cases:

  • You got a file without a file extension and you need to know what it is.
  • You got a file named as a JPEG, which actually is a PNG or HEIF image.
  • A user wanted to edit an image before uploading, dropped it into a Word document and later renamed that document to carry the extension expected for import.
  • You would like to check whether an attachment is an executable, so you don't pass on malware.

To solve these problems, we had a few options in the past. For example, we could try to load a picture in GraphicsMagick to see whether it would load and thus likely be a real image, or ask FileMaker to import a file and see whether that succeeds. But to help you, we now have three new functions in MBS FileMaker Plugin.

Whether you have a file to import, a container value from a field or a function, or some text-encoded data from a web service, you can use one of these functions to look into the data. They return one of the following texts:

PDF     PDF document
JPEG    JPEG image
GIF     GIF image
BMP     Windows BMP image
WEBP    WebP image
PNG     PNG image
TIFF    TIFF image
SVG     SVG image
ZIP     ZIP file, possibly an Office file
HEIF    HEIF image file
FMP     FileMaker database
EXE     Windows executable
MachO   macOS executable
ELF     Linux executable

You may combine them with other functions. For example, if Container.IdentifyData returns "HEIF" for an image and you need a PNG, you would run it through Container.ReadImage to convert it. The same applies to email attachments, where you could run the function and reject the attachment if it is an EXE, ELF or MachO file.
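To illustrate the general idea of identifying data by its leading bytes, here is a minimal Python sketch. It is not the plugin's implementation: the function names `identify` and `is_executable` are made up for this example, the signature table covers only some of the types listed above (SVG and FMP are left out), and the HEIF brand list is an assumption that is certainly incomplete.

```python
from typing import Optional

# Well-known signatures found at offset 0. The two-byte ones (MZ, BM)
# are weak and can match unrelated data, so they are checked last.
# Fat Mach-O binaries (CA FE BA BE) are left out on purpose because
# Java class files start with the same bytes.
_PREFIXES = [
    (b"%PDF", "PDF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"II*\x00", "TIFF"),      # little-endian TIFF
    (b"MM\x00*", "TIFF"),      # big-endian TIFF
    (b"PK\x03\x04", "ZIP"),
    (b"\x7fELF", "ELF"),
    (b"\xfe\xed\xfa\xce", "MachO"),
    (b"\xfe\xed\xfa\xcf", "MachO"),
    (b"\xce\xfa\xed\xfe", "MachO"),
    (b"\xcf\xfa\xed\xfe", "MachO"),
    (b"MZ", "EXE"),
    (b"BM", "BMP"),
]

# Assumed subset of ISO base media "ftyp" brands used by HEIF files.
_HEIF_BRANDS = {b"heic", b"heix", b"mif1", b"msf1"}


def identify(data: bytes) -> Optional[str]:
    """Return a type label for the data, or None if nothing matches."""
    for prefix, label in _PREFIXES:
        if data.startswith(prefix):
            return label
    # WebP: RIFF container with "WEBP" at offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "WEBP"
    # HEIF: "ftyp" box at offset 4 followed by a major brand.
    if data[4:8] == b"ftyp" and data[8:12] in _HEIF_BRANDS:
        return "HEIF"
    return None


def is_executable(data: bytes) -> bool:
    """True if the data looks like a Windows, Linux or macOS executable."""
    return identify(data) in {"EXE", "ELF", "MachO"}
```

A mail-handling script could call `is_executable` on each attachment's bytes and skip the ones where it returns true. A matching signature only says what the data claims to be; it does not prove the rest of the file is valid.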

Please note that docx and xlsx files from Word and Excel are internally ZIP files, so they get reported as ZIP.
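If you need to tell Office files apart from plain archives after getting "ZIP", one option outside the plugin is to look at the entry names inside the archive. The Python sketch below is a heuristic only: the function name `classify_zip` and its return labels are made up for this example, and it assumes the conventional part names `word/document.xml` and `xl/workbook.xml`, which typical Word and Excel files use but which are not guaranteed.

```python
import io
import zipfile


def classify_zip(data: bytes) -> str:
    """Return 'DOCX', 'XLSX', 'ZIP' or 'NOT_ZIP' (hypothetical labels)."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return "NOT_ZIP"
    # Office Open XML packages carry a content types part at the root.
    if "[Content_Types].xml" in names:
        if "word/document.xml" in names:
            return "DOCX"
        if "xl/workbook.xml" in names:
            return "XLSX"
    return "ZIP"
```

This reads only the archive's directory, not the file contents, so it stays cheap even for large attachments.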

Please try them and let us know whether they work as expected.


Easier than creating a calculation that Base64-encodes the container and looks at the first 8 characters to map them against a table of characters specific to each file type. You've done all the heavy lifting in these commands! Awesome.

HEIF is particularly troublesome in mixed FMGo and Windows environments, depending on HOW the user inputs the photo into a container. This will make that work easier. :)


Well, we already had functions to determine whether data has the right starting bytes for many types, and I just made them conveniently available.

Let me know if you find a file which doesn't work properly.

Thanks! This saves me from the TOs and code in every app by having convenient commands to get there. :)

What I could use is recognition of email files like eml, emlx and the like, since I try to save those and store them in distinct storage areas...

...and of course I'd like to detect and set apart all the useless stuff like those Instagram-, Facebook-, YouTube-, xxx-icons that are sent attached to mails, either inline or external, and just spam the inbox documents, while I only want to route and process the incoming PDFs and JPEGs that are scanned documents. So much noise in email attachments :(

I suppose that would be a typical AI job, selecting what is useful content and what is just clutter...
