Fastest Key Value Pairs

I need speed improvements for a layout with a heavily loaded portal: conditional formatting is applied using ~6 predicates per field, 12 fields are displayed in each row, and each field (and the portal itself) also carries a couple of hide conditions. With lots of triggers, drop-down lists, etc., the end-user data entry experience is slow.

Since most predicates used for hiding or conditional formatting evaluate through one-dimensional keys, like an array, I am wondering which would be fastest: repeating variables, JSON arrays, or JSON key-value pairs. I am also using a hand-crafted custom function where variables have the index encoded in the name, which apparently is fast at returning a value.
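
For illustration, a minimal sketch of the candidates (variable names hypothetical), each storing and fetching a boolean by integer key:

// 1. repeating variable: the repetition number is the key
Set Variable [ $flag[123] ; Value: 1 ]
// fetch: $flag[123]

// 2. JSON array (or JSON object, using e.g. "k123" as the key)
Set Variable [ $json ; Value: JSONSetElement ( $json ; "[123]" ; 1 ; JSONNumber ) ]
// fetch: JSONGetElement ( $json ; "[123]" )

// 3. index encoded in the variable name, created via Evaluate
Set Variable [ $void ; Value: Evaluate ( "Let ( $flag123 = 1 ; \"\" )" ) ]
// fetch: $flag123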

Anyway, for further discussion there will be test results comparing each technique.

BTW, I tried caching, at least for conditional formatting, but wasn't yet successful or done.
By caching I mean that the first field's predicate expression would set a local variable that is then available to the next field in the portal row. Apparently that doesn't work: tab order is not followed when conditional formatting (or other) expressions are evaluated. I am going to do more research on repeating fields, where it seems to be possible to cache results per definition, as in SIMD (single instruction, multiple data).

thanks!


Hello @FileKraft

I once worked on something similar:

  • FMP v.14 (so this was before some threading change for portal rendering introduced in v.15)

  • Use case was a portal used to display the list part of an email viewing interface. There were lots of icons and formatting in the list that needed to hide, show, or otherwise change depending on email status - driven by field data local to the portal context, some related data, and a variety of global fields which allowed the user to toggle display settings.

I worked on a cache approach using a variable (I suspect I used a $$global, not a $local). There were no native JSON functions at the time, so I am guessing that I used a return-delimited list as the data structure stored in the variable. (I haven't seen the code for years.)

IIRC, the logic for accessing the cache data was to first check a separate variable which stored the primary key from the portal row for the last time the cache data was recalculated. So, for the first layout object that needed to access the cache on a portal row, there would be the overhead of recalculating all of the necessary cache values to render the row, which would then be stored. All subsequent attempts to access the cache data would perform the same conditional check against the recorded primary key -- since those subsequent checks would match the previously recorded key, the cache data would be returned without any recalculation.

While I did not experiment with comparing other means of storing the values, I did experiment with turning the cache feature on and off, and I did notice a subjective difference in the speed at which I could scroll through the portal. That is, it did seem to help at some human-perception level which I did not attempt to quantify.

I am going to take a guess at a basic outline of how my calcs worked. I am pretty sure that I made use of custom functions which referenced both variables and schema, and I know that this may not suit you, but the use of custom functions was not essential to the strategy -- the same could be achieved without them. I personally have a distaste/aversion for referencing schema within a custom function, but I made a trade-off in that regard because the custom functions helped improve the readability of my layout object conditional calculations.

So, I think the approach was something like this:

  1. Define a custom function that would recalculate all of the constraints needed to render a portal row. These constraint results were free of newline chars, so it was safe to have this custom function return the values as an ordered CR-delimited list. Let's call this function CalculateConstraints() (a sketch follows the AccessConstraints example below).

  2. Then there was the function that served as the accessor to the cache data.

It probably looked something like this:

AccessConstraints( PortalRowId ):

Let ( [
    ~id = PortalRowId
] ;
    Case (
        // cache hit: this row's values were the last ones computed
        ~id = $$_cache_row_id ; $$_constraint_cache ;

        // cache miss: recalculate, store, and return
        Let ( [
            $$_constraint_cache = CalculateConstraints() ;
            $$_cache_row_id = ~id
        ] ;
            $$_constraint_cache
        )
    )
)
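
For completeness, a purely hypothetical sketch of what CalculateConstraints() might have returned (field names invented; each expression must yield 0 or 1, never empty, both because List() drops empty values and because the line positions must stay stable):

CalculateConstraints():

List (
    not IsEmpty ( Email::AttachmentCount ) ;   // value 1: show attachment icon
    Email::Status = "unread" ;                 // value 2: render subject in bold
    Email::Flagged and $$ShowFlagIcons         // value 3: show flag icon
)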

And then each conditional layout object calculation would access a constraint via:

GetValue( AccessConstraints( MyPortalContext::ID ); n ) // n specific to each constraint

I do not recall if I further wrapped the above in another layer of custom functions with hardcoded values for n, for the sake of readability. That would be something I would be apt to do, but at the same time I had to consider overhead -- I cannot remember which way I decided.
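
Such a wrapper, if it existed, would have been as thin as (name and index hypothetical):

ShowAttachmentIcon ( PortalRowId ) =
GetValue ( AccessConstraints ( PortalRowId ) ; 1 )

so that a layout object's calculation reads simply ShowAttachmentIcon ( MyPortalContext::ID ).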

What I do recall is that there was some strategy involved in crafting the AccessConstraints function itself -- strategy which was specific to the needs of this solution:

  • If all constraints are entirely independent of one another, and all are always necessary and subject to change with each portal row, then I believe this makes for the simplest case, as AccessConstraints must determine everything, and order of determining values may not matter much due to the independence.

  • If, on the other hand, some constraints depend on others and/or it may not be necessary to recalculate everything for each portal row (either because not all values are subject to change, or not all are required), then the game of strategy begins, and one has to decide the optimum structure of AccessConstraints (or even if more than one such AccessConstraints function is necessary). But -- the top level game plan is the same: Calculate what must be calculated, and cache by Portal Row ID.

Again - I will call out that this was with v.14, and so many things have changed, including the option of using JSON (I suspect CR-delimited might perform faster), and the under-the-hood code for rendering a portal.

I will also say that this was one of those cases where I felt some regret about leaving this code behind for another developer to understand. It really is not difficult to understand, but it is unfamiliar, and I sometimes wondered whether anyone ever had to work on that portal and struggled to make sense of it.

All the best to you -- I hope this may be of some help or inspiration.

-Steve

p.s. My uncertain recollection is that front-to-back layering order of layout objects was the primary factor in determining the order of layout object calculations, but I can't claim to have confirmed this any time recently.


This is super great stuff @steve_ssh - thank you. It definitely goes in line with what I had in mind or used before, but I will bet on JSON arrays or indexed repeating vars.

I agree with your thoughts about leaving this behind for the next developer taking over, but frankly I won't reject anything exotic if a performance increase is gained.

Thanks again Steve for your thoughtful and detailed response. I will review it in more detail as soon as I get some tests implemented covering all the suggestions.


I will focus here on KEY VALUE pairs - and maybe start another discussion for layout caching.
Once the fastest KEY VALUE pairs with integer keys are established, the focus can shift to caching separately.

Thanks again!


Sounds good. If I had to place a bet, it would be that you'll get faster key value performance by using the repeating vars. But this is just a guess.


That's my gut feeling too - my other bet would be non-repeating vars with the index in the var name, like $var123 vs $var[123], if the overhead of Evaluate is not too much .. will test ...
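
For reference, a sketch of the access patterns to compare (~i standing for whatever integer index is at hand):

// repeating variable: the repetition number may itself be an expression
$flag[ ~i ]

// index encoded in the name: dynamic access requires Evaluate
Evaluate ( "$flag" & ~i )

// static access is direct, but the index must then be a literal
$flag123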


Ah, great thinking! And very interesting. Evaluate definitely can cost, but perhaps not with something as simple as a single variable reference. I look forward to hearing the test results.


It depends on how repeating vars are implemented to fetch via their index. Maybe the 10^400 index range corresponds with the var name?? (I wish a 'real' DevCon would happen again, so this could be investigated with Claris staff.)


@FileKraft I did not get to read this whole thread, sorry for that. You may be interested in looking into the following (from 2015, but could be relevant):

https://scalefm.com/2015/10/filemaker-serializers-compared/


thanks Bobino - worth investigating!

Handling named parameters as variable repetitions is relatively easy.

$scriptParam[ code("name") ] = "pins"
$scriptParam[ code("qty") ] = 5

is equivalent to

$scriptParam[ 101001090009700110 ] = "pins"
$scriptParam[ 1210011600113 ] = 5
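
The integer keys above match what FileMaker's native Code ( ) function returns: each character's Unicode value is packed into a five-digit group, with the first character in the lowest position. Reading a parameter back, e.g. inside a script, might look like:

Set Variable [ $quantity ; Value: $scriptParam[ Code ( "qty" ) ] ]   // 5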

thanks Malcolm - the question actually was what's fastest, and the index is already an integer, given by a canonical serial number.

I was wondering whether this current scenario is a use case or a completely new topic, in relation to the several topics of @nicklightbody's performance project?

It certainly sounds like this problem would fit well within the performance project.

From the initial description it sounds like some simplification could be applied to the complex UI that has, I imagine, evolved over time?

If there is a discernible pattern to how the UI is supposed to respond, then a single custom function could turn all the hide controls on and off by setting vars.
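
A minimal sketch of that idea, with invented names: one custom function evaluates the pattern once and publishes the results through variable side effects, so each object's Hide condition is reduced to a single variable read:

SetHideVars ( status ) =
Let ( [
    $$hide.editIcon   = ( status = "locked" ) ;
    $$hide.deleteIcon = ( status = "locked" ) or ( status = "archived" )
] ;
    ""   // called for its side effects only
)

Each hide calculation then reads just $$hide.editIcon or $$hide.deleteIcon.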

I share Steve's antipathy to using custom functions with direct inputs from schema as opposed to through parameters; this can create maintenance problems. Better to avoid if at all possible, in my experience.

I recently experimented with putting many (10k, 100k, 1 million) values into a global repeating var array, so 2 dimensions. What worked fast locally didn't work fast remotely - when it was large - because the entire array has to be uploaded from the server to my local user session (because it was global). This highlighted for me the importance of thinking through exactly what has to move where in practice.

Someone I am coaching recently improved the start-up of their remote calendar system from 7 seconds to 1 second by applying the principles of simplification, correct use of variables, etc.

Cheers, Nick


Thanks for your input and suggestions @nicklightbody.

Schema inputs are not used at all. My custom functions are abstracted and completely functional, except for the intended side effect of creating variables, or retrieving a variable's content, by encoding the index within the variable name.

The system has not evolved; it was written super carefully, with all predicates minimized by applying Boole, Karnaugh, Moore, etc.

Since it is slow due to many fields in many rows with many predicates in conditional formatting and hiding, it would benefit from caching. Everything is already set up, just limited by the dark matter of not knowing how it all really gets evaluated.

It mostly needs fetching a boolean value from an index for the conditions etc., and therefore I just need to focus on the fastest key-value pairs. If this doesn't help, I'll backtrack to other measures where possible.

Thanks again.


Great points - sorry, I forgot to mention that the file is local and stand-alone.
