Privacy-compliant use of AI in FileMaker development: A placeholder approach

Dear community,

I would like to present a method for discussion that allows us to use AI tools in our development work while maintaining data privacy.

BASIC PROCEDURE:

  1. Anonymization: Before sending to AI tools, we replace all sensitive information with placeholders.
  2. AI processing: The AI works with the anonymized data.
  3. Recovery: After receiving the AI response, we replace the placeholders with the original data.

Note: New placeholders are generated for each AI processing run (see the sketch below).
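
To make this concrete, steps 1 and 3 could look roughly like this as FileMaker script steps. The field and variable names (Request::OriginalText, $aiResponse) and the example values are made up:

```
# Step 1 – replace sensitive terms before sending to the AI
Set Variable [ $anonymized ; Value:
    Substitute ( Request::OriginalText ;
        [ "Max Mustermann" ; "Person_1" ] ;
        [ "Acme GmbH" ; "Company_A" ]
    ) ]

# Step 3 – put the original terms back after the AI response arrives
Set Variable [ $restored ; Value:
    Substitute ( $aiResponse ;
        [ "Person_1" ; "Max Mustermann" ] ;
        [ "Company_A" ; "Acme GmbH" ]
    ) ]
```

One detail to watch: during recovery, longer placeholders should be substituted first, so that e.g. Person_12 is not partially matched by Person_1.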

PLACEHOLDER METHODS:

  1. Semi-generic placeholders:
    • Example: Person_1, Company_A, Product_X
    • Evaluation: Easy to implement, clearly recognizable as placeholders, but not very realistic
  2. Random names from predefined lists:
    • Example: Max Müller, Bergstadt GmbH, SyncPro
    • Evaluation: More realistic, retains cultural context, but requires maintaining the lists (see the sketch after this list)
  3. AI-generated names:
    • Example: Lena Bergmann, NexTech Solutions, DataFlow Pro
    • Evaluation: Highly creative and context-aware, but technically more demanding
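
For method 2, the random pick can be done entirely with built-in functions; the list content here is invented:

```
# Method 2 – pick a random replacement from a return-delimited value list
Set Variable [ $companyNames ; Value:
    "Bergstadt GmbH" & ¶ & "Talblick AG" & ¶ & "Nordwind KG" ]
Set Variable [ $placeholder ; Value:
    GetValue ( $companyNames ; Int ( Random * ValueCount ( $companyNames ) ) + 1 ) ]
```

To avoid using the same name twice in one text, the chosen value would also have to be removed from the list before the next pick.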

IMPLEMENTATION IN FILEMAKER:

All necessary routines for placeholder generation and recovery are implemented directly in FileMaker. This includes:

  • Scripts for recognizing and replacing sensitive data
  • Management of name lists for random selection
  • Integration of an AI API for name generation (optional)
  • Mapping tables for assigning placeholders to original data (see the sketch below)
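
As a rough sketch of the last point (table and field names are assumptions, not a finished design), the recovery step could walk a mapping table like this:

```
# Hypothetical table "Mapping" with fields Placeholder and Original,
# one record per replaced term; the found set holds this run's mappings
Set Variable [ $restored ; Value: $aiResponse ]
Go to Record/Request/Page [ First ]
Loop
    Set Variable [ $restored ; Value:
        Substitute ( $restored ; Mapping::Placeholder ; Mapping::Original ) ]
    Go to Record/Request/Page [ Next ; Exit after last: On ]
End Loop
```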

ADVANTAGES OF THIS APPROACH:

  • Maintaining the confidentiality of customer or company data
  • Flexibility in the choice of placeholder method
  • Full integration into the FileMaker workflow

QUESTIONS FOR THE COMMUNITY:

  1. What experiences have you had with similar approaches?
  2. How do you rate the different placeholder methods?
  3. What challenges do you see when implementing them in FileMaker?
  4. Do you have any ideas for optimizing or expanding this concept?
  5. Do you have any concerns about this approach in terms of data protection and confidentiality?

I look forward to hearing your thoughts and experiences on this topic!

~Udo ( @mipiano )

3 Likes

Replacing sensitive data with placeholders is not anonymisation; it is pseudonymisation. Depending on the case, that may or may not be enough.

You could get the same result with data encryption: you would need the key to decrypt. Likewise with placeholders, you need to know which values the placeholders correspond to.

1 Like

You are right, @villegld . I should have worded that more precisely.

In my opinion, this pseudonymisation should be sufficient for many everyday topics. For very sensitive topics such as health information or very confidential company topics, it would probably be better to use a locally installed LLM.
Of course, this also involves more effort and is a real hurdle for some people.

I already mentioned this in the description of my idea.

Do you mean replacing the terms with encrypted values instead of the placeholder methods I mentioned? Then tracking would theoretically be possible, unless I encrypt in a different way each time.

Or have I misunderstood you? If so, I would like you to explain what you mean.

I mean the idea is quite similar: you alter the data in a way that only you know how it was done. You could, for example, use a different encryption key every time, and even one per field.

I have no deeper knowledge or wider experience of this. Just a thought that encrypting the data might be an easier way to avoid reinventing the tools.

But this is a very interesting and important topic. I'll keep following it.

Well, encrypted terms are also a good way to go. The same applies here: you have to tell the AI that you are using placeholders; otherwise you risk confusing it, and the quality of the processing result will suffer.
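
For the encrypted-terms variant, FileMaker's built-in Crypt functions (available since FileMaker 16) could express the per-request key idea. Everything below is just an illustration:

```
# A fresh key for every AI request, as discussed above
Set Variable [ $key ; Value: Get ( UUID ) ]
Set Variable [ $token ; Value: CryptEncryptBase64 ( "Acme GmbH" ; $key ) ]
# ...send the text containing $token to the AI, then recover:
Set Variable [ $original ; Value:
    TextDecode ( CryptDecryptBase64 ( $token ; $key ) ; "utf-8" ) ]
```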

And in some cases it will be useful to provide a certain amount of context, e.g. CRM-Software_X, so that it is clear what type of software is involved. This can help the AI return a better result.
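
Roughly, such a hint to the model could look like this (the wording is only an example, and $anonymized comes from the replacement step):

```
# Hypothetical prompt preamble so the model treats the tokens as opaque
# placeholders and returns them unchanged
Set Variable [ $prompt ; Value:
    "Note: terms like Person_1, Company_A and CRM-Software_X are placeholders. " &
    "Treat them as opaque identifiers and keep them unchanged in your answer." &
    ¶ & ¶ & $anonymized ]
```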

Long story short: It depends on the use case which type of placeholder is best used. I don't think there is "ONE best solution" here that fits all use cases.

I'm a bit surprised that only one person is joining the discussion. I thought this topic would be of interest to many.
Do you ignore AI?
Or do you trust the AI providers so much that you don't bother?
Or have you simply not thought about it yet?
Or have I just written nonsense here?

I would really like to know the thoughts of more forum participants on this topic. Are there any questions about my idea of using placeholders?
:blush:

well...
I did use AI (ChatGPT), not for FileMaker but for Xcode/SwiftUI. I got some really helpful examples and tips, but after maybe a dozen answers the AI didn't like me anymore: no chance of getting a single answer, just error messages and a far too long wait until anything appeared.

Therefore, I'm done - for the moment

1 Like

I have thought about this before. I would love to train an AI, maybe feed my database in as XML, and have a GPT for the solution I am working on.

As of now I keep it as generic as I can.
I have made a GPT with parameters ("Friendly and casual mentor for novice Claris FileMaker developers."). I set it to explain things to a beginner and not get too technical. I am sure I could go deeper with it, but I have no need as of now.

What do you think about this?

We're handling this by using a custom LLM on-site, so data never leaves the premises.

Another aspect is that we are only interested in aggregate data. We don’t need to attach information that would identify individuals.

3 Likes

It's a totally interesting topic, and I am glad to see it posted here. I don't participate because I simply don't have the knowledge or experience to contribute to the discussion meaningfully. Perhaps in a year, or two, that could change.

Your question was tough to answer. I admit it made me uncomfortable to think about it.

I am a bit sheepish to say I actively ignore AI. Sheepish because I believe my reasons are only half-baked right now.

You see, I am a content creator in the realm of photography. There are a ton of concerns regarding AI training and the fair use of images, texts, sound recordings and videos. The courts have so far said little about the disputes that have come up. I am biased on the subject: I opine that AI companies are having a free lunch off the backs of pretty much everyone else.

I am a father. There are a ton of concerns about privacy and AI. No matter how this plays out, there is little chance of regaining a modicum of privacy once privacy slip-ups occur. Worse, there are already news articles about AIs ruining people's reputations, either through indiscretions or fabrications. Heard about AI-generated child pornography? People want to make money no matter the cost, and CEOs and investors are not at all motivated to play it safe. So yeah! I don't trust AIs at all.

I believe in equality. AIs are businesses now and will be for the foreseeable future. AIs can give a huge advantage to those who use them effectively and regularly. You can play with them if you pay. Can't pay? Too bad! You get left in the dust. The irony: AI companies are doing all they can to avoid paying the rights holders of the training material.

Those of you who are interested in economics should already know that AIs will have a huge impact on labour fairly soon… and not in a pretty way. CEOs are sending two diverging messages: they tell the public that AIs will augment employees' capabilities, and they tell investors that AIs will reduce the need for human resources. The expected loss of labour is also expected to hit economies fairly quickly, and the timescale to reach a new equilibrium is expected to be quite long.

I am concerned about security. AIs can be used for benefit or harm. Some AIs are trained to poke holes in system security and to deceive people. There are more threats but those two affect pretty much everyone. I can no longer trust that an image, a voice on the telephone, a recording and other similar products are genuine. The mimicry may be crude today but that's changing fast.

I am afraid that AI mimicry will soon have a detrimental impact on our legal systems, to name only one. How reliable will evidence be if it can be spoofed by AIs? Spoofs are still somewhat easy to detect right now, but AIs are improving awfully fast and producing ever better fakes.

Lastly, there are human factors that make the use of AIs problematic. How many articles have I read where a professional is caught faking submissions with AI? Lawyers, programmers, engineers, doctors and many other pros are actively neglecting their work by using AI output without review. These pros are supposed to be paid the big bucks because their knowledge, experience and work are supposed to be of high value. How can they justify their rates when they no longer do the work?

I am not pretending that my reasons are well thought out or all that relevant. I read the news and see greed, laziness, fear and malice guide AIs way more than makes me comfortable. I am confident that AIs and LLMs are not mature. I do not believe they are ready for prime time. Seems I am not alone.

Many trust and use AI results without question. That's scary! At least to me!

3 Likes

AI is the new buzzword in town. Let's have a look at the past:

  • Cloud was so extraordinary; why would you host your data yourself? Just think about it: someone will host your data, available from everywhere, anytime, for a modest (!!) fee. Life made easier. Until people found out that not all cloud providers were reliable, some suffering outages, others security issues. And on top of that, rules by some governments saying they would have a look if they wished. Many abandoned the cloud.
  • Outsourcing was another clever idea. Fire the employees who run your IT infrastructure; they cost you too much. Service firms invited corporations to sign with them, promising to provide personnel for much less... Those corporations discovered that everything they asked for that was not already in the long contract was charged as a supplement, and that the people sent to their offices were not working for them, the customer, but for their own bosses. Many corporations were disillusioned afterwards.

But AI is different; it is dangerous in that it may harm you. Just think about deepfakes. @bdbd has mentioned some situations you should fear. OK, I get that for some uses it is very useful, for example analysing X-ray images to detect health problems that are not detectable by the human eye. But in the hands of the wrong people, it is very harmful.

THE question is: do the disadvantages outweigh the advantages? That invites a long discussion!

On SRF.ch, the website of the Swiss broadcaster SRF, there was a series about privacy on websites.
Link to that article:

People could ask questions that would be answered by experts.

Here is a question about ChatGPT (asked and answered in German; English translation by DeepL):

Question: What are the personal data security risks associated with the (frequent) use of ChatGPT? Can profiles be created based on the questions? Who has access to them? How can tracking and profiling be prevented/restricted when using ChatGPT?

Answer: ChatGPT usually saves all your previous conversations. Yes, they are compiled in a profile. The manufacturer OpenAI has access to this and can also use this content to train its models, for example (unless you explicitly exclude this). We do not know exactly how OpenAI stores this data, how large the circle of people within the company is who have access to it, and what would be visible if someone were to succeed in penetrating the company. In other words: I would refrain from asking very personal questions or submitting personal files or trade secrets.

(End of quote)

I personally never ask ChatGPT anything containing personal data, just generic code questions, for example (not a real one): "create a sort method for a SwiftUI NavigationSplitView in the first column".

1 Like