- RELATIONSHIPS
What effect do relationships have on performance and what are the alternatives? What are examples of performant and non-performant relationship graph designs?
What effect do relationships have on performance and what are the alternatives? What are examples of performant and non-performant relationship graph designs?
The oldest chestnut in the FileMaker world is that relationships can be expensive regarding performance.
For a new user the ease of creating relationships, and method that has not changed since 2004, is very seductive because it is easy to create chains of relationships and tunnel through to finely defined data.
However, seductive relationships may be regretted with the wisdom that comes with hindsight.
Just because Filemaker is notionally a relational database that doesn't mean that creating many relationships is the best way of using it. Since relationships appeared (in FM3? I only started with 4.1) many other techniques for getting the record or the data you want have been added so that today in 2021 it is really a data management system, not just a relational database.
It is noticeable that in the rest of the world there are other types of database able to perform many operations very quickly. The feature that defines FileMaker's difference is that it "automatically" or "unasked" (depending on your perspective) performs updates of all the indexes used by all the relationships that that can be assessed from the current user "context" (get(layouttablename)) & record when ever a commit is made, either explicitly or on a record, layout or mode change. This can be described as "nearly continually re-evaluating the known universe" from a context point of view. This may comprise very little load or a great deal dependent on the size of the known universe observable from that context.
Here is an amusing and helpful video from Michael Rocharde on the question of context:
The possible means of getting records or data without using any relationships are as follows:
BobinoRegular
One example of alternative techniques:
Record hierarchy (multi-level parent-child relationships).
This is often represented as writing a parent's id as a foreign key in a child record. It is fast for some use cases and slow for some others.
A different model is the nested set model , fast to retrieve a given node's descendants, slow when updates are required amongst other nodes.
Michael Rocharde's latest on Primary Keys: Primary Keys (as in FileMaker™) - YouTube
In which cases filtering through relationships kills performance and in which case is it an advisable method?
Applying the basic principle:
“The less you ask the machine to do the faster it will run.”
(A) so one hop is less work than more than one hop;
(B) so one predicate in the relationship is less work than more than one predicate;
(C) so a simple expression used in a predicate is less work than a complex expression;
(D) so extracting related records from a small data set is less work than extracting them from a large data set; and
(E) when using multiple predicates apply the one that most reduces the found set first and thereafter so that subsequent predicates have smaller sets to work with.
The second example of filtering through relationships is portal filters, which work very well provided you don’t ask them to filter from a very large set. As a general guide consider using a portal filter for less than say 1,000 records but use a scripted find for over that number. Certainly a find is the best best for large data sets anything over 10k.
Portal filters can offer a really flexible means of sifting a reasonable sized data set, even where complex custom functions are used to enable multiple predicate queries.
Such a custom function can seek to set multiple includes and excludes by evaluating different criteria to true or false.
A portal filter can be thought as a relationship with soft wired predicates, it is just a slightly different delivery mechanism for a relationship.
Relationships are bi-directional - but most people are constrained by the left to right anchor buoy idea. When developing your schema consider areas in the relationship graph where you can work right to left as well as left to right. Maybe re-arrange the TOs reversing right and left and look for relationships you can use that already exist?
Bi-directional use of TOs can be good from a performance point of view. This said, the left-to-right constraint serves the equally important goal of self documentation of TOs. TO names will be confusing when reading calculations and scripts when using bi-directional TOs.
For some things, the Party Role Model is a good choice. I use it among other things for contacts. The TO's are used bidirectionally and there are no comprehension problems with the naming. Especially the subdivision of contacts into organizations, persons, customers, employees etc. is very helpful. Working with these subsets of contacts is very performant!
I saw it the first time in a DevCon session video where Dave Graham speaks about it: Data Modeling That Scales (Advanced Track 007) - YouTube
Except with a filtered portal, all child records download, THEN the filter is applied. Benchmarked against a relationship filter, it does not take very many child records to see the performance difference; relationship filters are far faster because they filter the data prior to the data transfer.