Performance core principles: 4. Table widths


What effect do wide tables (tables with many fields) have on performance? How can you reduce table widths? Where do repeating fields fit in?

It is by no means obvious why a table with 1,000 fields and 1,000 records is generally less performant than a table with 10 fields and 1,000,000 records. However, experience shows that this is clearly the case. It's easy to create a couple of tables, fill them with auto-entered random data and use a timer to test this.
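The timer test mentioned above can be sketched outside FileMaker. The snippet below uses SQLite as a stand-in (the timings will not match a FileMaker client/server setup, but the method is the same: auto-generate random data, then time identical reads; the table and column names are invented for illustration):

```python
import random
import sqlite3
import string
import time

def make_table(con, name, n_fields, n_records):
    """Create a table with n_fields text columns, filled with random data."""
    cols = [f"f{i}" for i in range(n_fields)]
    con.execute(f"CREATE TABLE {name} ({', '.join(c + ' TEXT' for c in cols)})")
    rows = ([''.join(random.choices(string.ascii_letters, k=20))
             for _ in cols] for _ in range(n_records))
    con.executemany(
        f"INSERT INTO {name} VALUES ({', '.join('?' for _ in cols)})", rows)

def time_read(con, name):
    """Time reading a single column from every record."""
    start = time.perf_counter()
    con.execute(f"SELECT f0 FROM {name}").fetchall()
    return time.perf_counter() - start

con = sqlite3.connect(":memory:")
make_table(con, "wide", n_fields=200, n_records=1_000)
make_table(con, "narrow", n_fields=10, n_records=1_000)
print("wide:  ", time_read(con, "wide"))
print("narrow:", time_read(con, "narrow"))
```

Scaling `n_fields` and `n_records` up or down lets you chart how width and record count each contribute to read time.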

FileMaker seems to deal well with narrow tables with millions of records, provided the basic rules of performance being explored in these topics are observed, i.e.:

  1. minimise use of relationships
  2. minimise indexing
  3. minimise use of unstored calculated fields

Examples of use cases that performed well and that didn't would be useful here. What can you contribute?

Cheers, Nick


Just as a personal/historical note:

I have always been grateful to Mark Richman of Skeleton Key for raising my awareness of this topic.

It was about the time of v.11 of FMP, and Mark was sharing results of testing he was doing regarding table width and performance. His proposition was that FMP is record centric, and that, with the exception of container data, record data is always transferred from server to client as a unit -- even if only one field is needed by the client, all field data is sent across the wire. This led to some devs experimenting with architecture whereby fields known to contain large quantities of text might be separated out from a table definition and given their own separate table, so as to allow more efficient access to the remainder of the fields.
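That "split the heavy text out" architecture can be sketched in miniature. The example below uses SQLite as shorthand (not FileMaker, and the table and field names are invented): a narrow main table for the fields clients touch constantly, plus a one-to-one side table holding the bulky text that is fetched only on demand.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- narrow main table: the fields clients touch constantly
    CREATE TABLE contact (
        id    INTEGER PRIMARY KEY,
        name  TEXT,
        phone TEXT
    );
    -- heavy text lives in its own 1:1 table, fetched only when needed
    CREATE TABLE contact_notes (
        contact_id INTEGER PRIMARY KEY REFERENCES contact(id),
        notes      TEXT
    );
""")
con.execute("INSERT INTO contact VALUES (1, 'Ada', '555-0100')")
con.execute("INSERT INTO contact_notes VALUES (1, 'a very long note...')")

# everyday lookups never touch the notes table at all
print(con.execute("SELECT name FROM contact WHERE id = 1").fetchone())
# → ('Ada',)
```

In FileMaker terms, the side table plays the role of a related table reached through a one-to-one relationship, so loading a found set of contacts never drags the note text across the wire.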

Now that we are in 2021, the above feels like something that many/most (but not all) devs are aware of, and we design with it in mind. But -- back in around 2009, it was the first time I had heard of it, and I always thank Mark for producing materials at the time which brought this to light.


Evidence that FM sends over entire records of a given table can be seen when bandwidth is limited.
Example: A customer's solution has a table some 150 fields wide (no containers, but some comment fields and a lot of multi-line keys and value lists). In a revamp, the basic fields (used for searching names and a few key properties) have been transferred to a new, narrow table (10 fields).
Now, the search routine loads the entire list (about 30'000 records) in a second over VPN. Loading the 'wide' table for the same purpose takes 5 to 10 sec.

Tables in solutions have a tendency to widen over time (fields added for new purposes). On a gigabit network it does not really matter. But now, with people working from home, the limitations of VPN and internet connections clearly show the performance impact wide tables can cause.


Always a compromise, particularly with FileMaker’s (lack of) ability to create cross table reports.

Thanks @Torsten - always great to get varied examples of this principle's application in practice.

Anyone else wish to contribute other examples / comments on this subject?

Cheers, Nick

Avoid them like COVID-19! A few years ago, I had to rework an old solution containing a laaaaaaarge table of about 450 fields. And some of those fields were repeating fields :hot_face:. For example, there was a group of repeating fields that held data for a specific year. In the beginning they created these fields with a certain number of repetitions... All was well until they hit the year that needed a new repetition. The repetitions were displayed vertically, meaning that the layout height would grow each year :worried:.

Repeating fields are like boomerangs: they come back to you and it may hurt.

I refactored that wide table by sending many fields in other tables where they belong with proper references.


Thanks for your comment @planteg

My experience is that repeating fields have their uses and generally not those they were originally designed for.

My main use is for settings, often in a single record table. If you use them the right way they offer very high performance.

If you don’t then they don’t.

I always bear in mind that on Server all the data in a record (except for container fields that are not visible) is passed over the network to the client when that record is opened. Clearly one needs to avoid that happening repeatedly with a wide table.

Cheers, Nick


What if the table is wide, but only a few fields are ever in use at once?

It is by no means obvious why a table with 1000 fields and 1000 records is generally less performant than a table with 10 fields and 1,000,000 records.

For example what if a table has 1000 fields, but only 10 fields are ever filled at one time?

For example, let's say you have a table of contacts. Each contact has one of 100 different types (doctor, lawyer, EMT, etc...) and for each type there are 10 associated fields (the doctor fields, the lawyer fields, the EMT fields, etc...).

Leaving aside the question of whether that's bad design, does this suffer the 'wide' penalty, and if so, is it less of a penalty? I researched this a long while back, and my understanding was that it had been suggested (in devcon presentations iirc) that this situation wasn't as much of a problem.

Given the convenience of stored calculations vs scripting, I think this type of table could have its place.

I think that this answers your question.

You'd have to have clear evidence of real benefits to convince me that this design model has advantages. Although I have an example from my own work where I did purposefully de-normalise data to create a wide table, it was dozens of fields, not hundreds. It was for a CWP web site. The de-normalising allowed us to grab everything in a single call. At that time, it was faster than repeated calls and slightly faster than having several portals on the layout.


…natively - unless you make the effort to create virtual lists in order to simulate cross table reports.

The second factor to be considered is indexing. A wide table with uncontrolled indexing may result in very large indexes. People who have deliberately kept their tables narrow are generally also alert to the performance hit created by uncontrolled and unnecessary indexing.


Again a workaround, Nick, that has its place, but it is not the answer to all the reports clients need.

What about Global and/or Constant tables? I'm referring to instances where developers will create a table for all the global fields they use. There's just one record, and every field is a global field. Does it matter if that sort of global table gets too large? If you have 500 globals, is it better to split this into multiple global tables? If you reference such a table in the relationship graph, will you get better performance with multiple narrow tables?

I have a question about narrowing tables. Let's say you have a "Products" table that is too wide -- say, 200 fields. 50 of those fields are "currency" fields that are rarely used and never displayed in portals or lists.

Is it worthwhile to create a second table, "Products_currency," in which each Product has one and only one related "Products_currency" record that holds all the currency info?

That is one improvement. How worthwhile? Probably depends on how many product records you have.

Another potential approach is to create a key::value table of currency data, where one product can have zero or more related currency records, and where each currency record has a key (essentially what was previously the field name) and a value (what was previously the field value). That way, you only have currency records for currency data that actually exists.
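The key::value design described above can be sketched as follows (again using SQLite as a stand-in rather than FileMaker; the table names and currency keys are invented). Note that only the attributes that actually hold data get rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_currency (
        product_id INTEGER REFERENCES product(id),
        key        TEXT,   -- what was previously the field name
        value      TEXT    -- what was previously the field value
    );
""")
con.execute("INSERT INTO product VALUES (1, 'Widget')")
# only two of the 50 possible currency attributes exist for this product,
# so only two rows are stored
con.executemany("INSERT INTO product_currency VALUES (1, ?, ?)",
                [("price_eur", "9.90"), ("price_usd", "10.50")])
rows = con.execute(
    "SELECT key, value FROM product_currency "
    "WHERE product_id = 1 ORDER BY key").fetchall()
print(rows)  # → [('price_eur', '9.90'), ('price_usd', '10.50')]
```

The trade-off is that every value is stored as text and reached through a relationship, so sums and searches across many keys need more work than with dedicated typed fields.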


I think @bdbd has the most flexible answer for this. If you have fields that are frequently used but are secondary information then a one to one relationship to another table is a good solution. When you say "fields that are rarely used", especially when there are a lot of records, the situation is different.


If you have numerous fields that do not require indexing, you are better off using a single repeating field for all of them - thereby reducing table width and hence improving performance.

Be rigorous: do not index anything that does not actually require indexing, and by default turn indexing off completely.
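The packing idea above - many non-indexed values held in one field rather than many fields - can be illustrated outside FileMaker by storing them as a single JSON column (a rough analogy only; the setting names below are invented):

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
# one column stands in for the single repeating field holding
# several rarely-searched, never-indexed values
con.execute("CREATE TABLE settings (id INTEGER PRIMARY KEY, extras TEXT)")

extras = {"theme": "dark", "page_size": 25, "beta": True}
con.execute("INSERT INTO settings VALUES (1, ?)", (json.dumps(extras),))

# the whole bundle comes back in one read and is unpacked client-side
loaded = json.loads(con.execute(
    "SELECT extras FROM settings WHERE id = 1").fetchone()[0])
print(loaded["page_size"])  # → 25
```

As with repeating fields, the values inside the bundle are cheap to store and fetch together but cannot be individually indexed or searched efficiently, which is exactly why this only suits data that never needs either.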