Performance core principles: 10. Scripting

  10. SCRIPTING

How do I write fast, easy-to-maintain scripts to automate my solution?

In part 2 of this section we will examine server-side scripting, called either by Perform Script on Server or by a FileMaker Server scheduled script.

1 Like

Black box scripts
It takes more effort and is a more rigorous process, but there are considerable gains to be had, in terms of longevity and fewer future issues, if scripts are written as self-contained black boxes which receive their inputs through Get ( ScriptParameter ) and output their results to a global variable or a field, in that order of preference.

In other words, you don't pull data from other sources, nor write data, partway through the script.

This follows basic computer science wisdom and has the additional benefit that such a script is much easier to set up to run server-side with Perform Script on Server.

The speed gains, whether the script is called locally or through PSoS, are obvious, because FileMaker has a lot of work to do every time we refer to a field. A field is part of a record; it has context; other sessions may have the record open, or be editing it, and those users may be far away over the WAN. By contrast, when the necessary inputs, whether as value pairs, JSON or some other format, are provided with the script call as a parameter, the script has no context to consider: all values are held in local variables during the process, and finally the results are written to a global variable or, better still, returned with Exit Script and read via Get ( ScriptResult ), the latter again avoiding any context.
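As an illustration, here is a minimal sketch of the pattern; the script name, keys and values are hypothetical:

# Caller
Perform Script [ "Process Order" ; Parameter:
    JSONSetElement ( "{}" ;
        [ "orderId" ; $orderId ; JSONString ] ;
        [ "amount" ; $amount ; JSONNumber ]
    ) ]
Set Variable [ $result ; Value: Get ( ScriptResult ) ]

# "Process Order" itself: a self-contained black box
Set Variable [ $input ; Value: Get ( ScriptParameter ) ]
Set Variable [ $orderId ; Value: JSONGetElement ( $input ; "orderId" ) ]
Set Variable [ $amount ; Value: JSONGetElement ( $input ; "amount" ) ]
# ... all work happens in local variables, with no mid-script reads or writes ...
Exit Script [ Text Result: JSONSetElement ( "{}" ; "status" ; "ok" ; JSONString ) ]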

Internal documentation
Writing a script should be a pleasure, but it should also be a pleasure when you, or someone else, have to work out what you were doing some years later. Hence the preferred method of writing a script is first to write a series of comments describing every step you need to take, then fill in the steps. Later on, try to remember to update your comments to reflect how you actually ended up doing it.
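In practice the comments-first skeleton might look like this before any real steps are added (the steps named here are purely illustrative):

# 1. Read and validate the script parameter
# 2. Go to the working layout to establish context
# 3. Find the records to process
# 4. Loop through the found set, updating each record
# 5. Return the outcome with Exit Script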

Whatever script you write can nearly always be improved by removing unnecessary steps, doing more in a single Let () statement, and streamlining your logic.
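For example, several Set Variable steps can often collapse into a single step with one Let () statement; a sketch with hypothetical values:

# One step instead of three: compute intermediate values and the result together
Set Variable [ $summary ; Value:
    Let ( [
        ~total = $price * $quantity ;
        ~tax = ~total * .2
    ] ;
        JSONSetElement ( "{}" ;
            [ "total" ; ~total ; JSONNumber ] ;
            [ "tax" ; ~tax ; JSONNumber ]
        )
    ) ]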

Breaking up a script without good reason makes it slower
However, one myth to understand and avoid is that breaking a long process into many subscripts makes it faster and better.

If something has to be called by different processes then clearly it should be a free-standing subscript rather than being rewritten in several places.

However, a single long script runs faster, and is far easier to follow, than the same script cut up into smaller steps. This was a Ray Cologon insight from 2014. It's easy to test and measure.

Troubleshooting
If a script runs correctly in the debugger but not otherwise, inserting a short pause at the key point will often fix the issue. The debugger, of course, provides that pause without it being in the script.
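A minimal sketch of the workaround; where the pause belongs is of course specific to your script:

Go to Layout [ "Target Layout" ]
# A brief pause gives the window and context time to settle, as the debugger would
Pause/Resume Script [ Duration (seconds): .1 ]
Perform Find [ Restore ]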

Ease of future development
Where future work needs to be done to a process, if properly self-contained scripts have been written, it is easy to write a new script, or adapt a copy of the original, without touching how the original is written; test the new script; and, if you are having a bad day, revert to the original.

Cheers, Nick

@nicklightbody I've heard something about how FileMaker deals with JSON in variables and some caching-related optimizations. The improvements appeared in a version later than the one that originally introduced JSON, but I can't remember if that was 17 or 18.

All in all, I don't know much about this. I suspect some people like @jwilling, @mrwatson-de, @CamelCase may know more about this.

2 Likes

Let’s hope we hear from them!

1 Like

Well, here I am. This is the TLT;DW; version (Too Little Time; Didn't Write)

As far as I know the internal JSON caching has been present from the start, as it is part of the implementation.

The implementation:

  • FileMaker works internally with a JSON library (name?) which stores JSON as objects
  • Thus all FileMaker JSON functions have to convert between the JSON objects on the inside and the JSON string on the outside
  • Particularly with large JSON structures, it is the repeated conversion back and forth which is the performance killer

The Cache:

  • FileMaker keeps a single copy of the internal objects
  • If you call JSONGetElement with exactly the same JSON string as last time, FileMaker can reuse the object cache without having to reconvert.
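For example, consecutive reads of the same unchanged variable should all be able to benefit from the cache (the keys here are hypothetical):

# $json is converted to internal objects once; the next two calls pass the
# identical string, so the cached objects can be reused
Set Variable [ $id ; Value: JSONGetElement ( $json ; "id" ) ]
Set Variable [ $name ; Value: JSONGetElement ( $json ; "name" ) ]
Set Variable [ $total ; Value: JSONGetElement ( $json ; "total" ) ]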

Best Practices

  • Where possible build smaller JSON trees and then add them to the larger tree, rather than incrementally adding to one huge tree (see the sketch after these tricks)
  • Avoid JSONFormatElements; preferably only use it when outputting for humans/debugging
  • If you have two (large) JSON strings to build (say from the current found set) it may well be faster to scan the found set twice and build them one after the other [<= needs testing]
  • Prefer a single JSONSetElement call with multiple parameters over multiple single JSONSetElement calls
  • Use the following tricks to add conditional elements:

To output null where a value is missing, change the JSON type parameter:

JSONSetElement ( $json ;
    [ $key ; $number ; Case ( not IsEmpty ( $number ) ; JSONNumber ; JSONNull ) ]
)

To remove an element where a value is missing...

  • ...is trickier, because setting the key to "" doesn't have the effect of removing that element
  • ...indeed there is no way to simply skip adding an element in the middle of a JSONSetElement call
  • ...however, if you have required parameters you can overwrite your optional parameters with required parameters:
JSONSetElement ( $json ;
    [ Case ( not IsEmpty ( $optionalValue ) ; "optionalValue" ; "requiredValue" ) ; $optionalValue ; JSONString ] ;
    [ "requiredValue" ; $someValue ; JSONString ]
)
  • ...this way the optional value IS written to the JSON, but is then immediately overwritten by the following value.

If you have no required values, or would like a standardised method, you could try this:

JSONDeleteElement ( JSONSetElement ( $json ;
    [ Case ( not IsEmpty ( $optionalValue ) ; "optionalValue" ; "DELME" ) ; $optionalValue ; JSONString ] ;
    [ Case ( not IsEmpty ( $optionalValue2 ) ; "optionalValue2" ; "DELME" ) ; $optionalValue2 ; JSONString ] ;
    [ Case ( not IsEmpty ( $optionalValue3 ) ; "optionalValue3" ; "DELME" ) ; $optionalValue3 ; JSONString ]
) ; "DELME" )
  • If your JSON is large enough, a single delete at the end like this may even be faster than the overwrite trick above [<= needs testing]
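And, as promised above, a sketch of building a smaller tree first and attaching it to the larger tree in one step; the table and field names are hypothetical:

# Build the small per-record object first...
Set Variable [ $row ; Value:
    JSONSetElement ( "{}" ;
        [ "id" ; Invoice::ID ; JSONString ] ;
        [ "total" ; Invoice::Total ; JSONNumber ]
    ) ]
# ...then attach it to the big array in a single step ($i counts up from 0),
# instead of setting each deep path inside the big tree one by one
Set Variable [ $list ; Value: JSONSetElement ( $list ; "[" & $i & "]" ; $row ; JSONObject ) ]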

Have fun!

MrWatson

9 Likes

Thank you - this detailed description is highly appreciated @mrwatson-de
Cheers, Nick

1 Like

Does anyone have a view on, or experience of, comparing the performance of text parsing with JSON processing?

I am thinking here of passing and parsing data as value pairs versus as JSON, i.e. using JSON as a predefined format as opposed to creating your own format/structure.

The reason is that FileMaker processes plain text very fast.

Ideas? Tell me I am wrong?

1 Like

Hi @nicklightbody Chris Irvine published something back in 2015. It can be found here: FileMaker Serializers Compared – ScaleFM

There is also a DigFM meetup tonight (in your case that is pretty much the middle of the night) that seems to hint towards similarities with this topic: Collections and Maps -- Beyond FileMaker Arrays and Objects (Nat Robinson) | Meetup

DigFM presentations are usually recorded and published on their YouTube channel.

2 Likes

Also wanting to highlight another amazing contribution from @mrwatson-de: he reported extensively on a technique to append data to an existing variable here: https://community.claris.com/en/s/question/0D50H00006dsly8SAA/

The technique relies on the script step Insert Calculated Result to append, and it outperforms the Set Variable script step, which overwrites the whole variable instead of letting us append in a fashion that leverages what is already in the variable.
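If I understand the technique correctly, a minimal sketch looks like this; the loop body is purely illustrative:

Set Variable [ $log ; Value: "" ]
Go to Record/Request/Page [ First ]
Loop
    # Insert Calculated Result appends at the insertion point, rather than
    # rewriting the whole variable as Set Variable [ $log ; $log & ... ] would
    Insert Calculated Result [ Target: $log ; Get ( RecordNumber ) & ¶ ]
    Go to Record/Request/Page [ Next ; Exit after last: On ]
End Loop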

3 Likes

Thanks @Bobino for both - very interesting - will try to digest!
Cheers Nick

Part 2: server side scripting

We have already referred to self-contained black-box scripts being much easier to call server-side. Let's now look at this area, where the most enormous performance gains can be made, but which is probably the most difficult area of vanilla FileMaker, because you cannot directly observe what your script is doing in the Debugger.

Every server-side script call creates a new, unique user session on the server. That user session opens the file with an account, obtains a privilege set, and gains scripted access to the file, equipped initially only with the parameters that accompanied the script call.

The global field and variable values present in the calling user session remain unknown to, and inaccessible from, the server-side session.

More experienced users will recall developing in FileMaker before the Debugger was released, when the only means of observing what your script was doing was Show Custom Dialog. Just as in those ancient days, a server-side script call can only be observed through a feedback mechanism we create ourselves. The most common method is writing current data to an event log, which gives the developer a means of retracing the steps taken by the script.
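A minimal sketch of such a feedback mechanism; the table and field names are hypothetical, and a real version would return to the original layout afterwards:

# "Log Event" subscript: call it with the message as the script parameter
Go to Layout [ "EventLog" ]
New Record/Request
Set Field [ EventLog::Timestamp ; Get ( CurrentTimestamp ) ]
Set Field [ EventLog::Message ; Get ( ScriptParameter ) ]
Commit Records/Requests [ With dialog: Off ]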

So this is not easy; why do it, then?

The big reason is that the work of a server-side session runs entirely on FileMaker Server or FileMaker Cloud. As such, once instigated, it is a local session on the server side. Hence script calls within the session must use Perform Script, not PSoS, because nested PSoS just provokes Error [no.].

The most laborious single actions a script can perform are creating and deleting records, because of the ancillary work required to protect our data and to communicate with the local FileMaker client session. The original design of FileMaker Server delegated as much work as possible to the client machine, to protect the server from load. That approach requires a great deal of communication, which is hindered by a slower network and/or greater latency (delay) over the network. Hence older versions of FileMaker perform more slowly over a network than newer versions, because platform development has focused on improving network performance since about 2010.

Typically, an action to create or delete records server-side will run many times faster than the same action controlled from the client user session. dsBenchmark, an open-source load-testing tool for FileMaker Server released by Nick Lightbody in 2015, creates many records in a session and can delete many thousands of records in a few seconds using server-side scripting. It serves as an example file for the creation and deletion of records using PSoS.

Any server side script call will typically include the following elements:

  1. A calling script on the client side which calls the target script with either PSoS or Perform Script, depending on whether (a) FileMaker Server is hosting the file and (b) this user session is already running server-side (see the sketch after this list).

  2. A scripted mechanism to enable the server-side call to open the file with the correct user account, and hence privilege set, and then to get to the correct context (i.e. layout) so that the requested process can commence.

  3. If the process is to apply to specific records, the server-side session will typically be informed by record identifiers provided as a script parameter; otherwise the process will apply to all the records available to the session.

  4. When the process has concluded, the session should be terminated with Quit Application, which, as the script is running on the host, only closes that session.
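A minimal sketch of elements 1 and 2; the script name, layout name and detection tests are my assumptions, not a canonical recipe:

# Client-side dispatcher
If [ Get ( ConnectionState ) > 0 and PatternCount ( Get ( ApplicationVersion ) ; "Server" ) = 0 ]
    # The file is hosted, and we are not already running on the server
    Perform Script on Server [ "Process Records" ; Parameter: $param ; Wait for completion: On ]
Else
    Perform Script [ "Process Records" ; Parameter: $param ]
End If
Set Variable [ $result ; Value: Get ( ScriptResult ) ]

# Start of "Process Records": establish context before doing anything else
Go to Layout [ "Records" ]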

[to be continued]

1 Like

A recent post in the Community ran tests on different ways of going to the next record.

Here's the link to the post: Do I Need to Worry About Handling "Last Error: [101] Record is Missing" When Looping?

I made a test (3.8 million records).

The test was a loop over all records and was repeated 3 times. Here are the average times in seconds:

Go to Record/Request/Page [ Next ; Exit after last: On ] : 18.947 s.

Exit Loop If [ Get(RecordNumber) = Get(FoundCount) ] : 46.953 s.

Exit Loop If [ Let ( $i = $i + 1 ; $i > $n ) ] : 78.664 s.

Exit Loop If [ Let ( $i = $i + 1 ; $i > Get ( FoundCount ) ) ] : 81.279 s.

The Go to Record [ Next / Previous ] step is the baseline.

Then we see a performance hit: any calc that touches the calc engine increases the time spent on the task.

The second thing to note about the performance hit is that it scales with the number of operations in the calculation. In the simple test above, each operation seems to cost about the same: we have a base of ~19 s for the go-to-record step alone, a single operation in the calc engine adds ~30 s across the 3.8 million iterations, and a second operation adds another ~30 s.
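For reference, the fastest pattern measured above is the plain script-step loop, with no calc engine involvement in the exit test:

Go to Record/Request/Page [ First ]
Loop
    # ... work on the current record ...
    Go to Record/Request/Page [ Next ; Exit after last: On ]
End Loop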

7 Likes

@Malcolm Wow, I didn't think it would make such a difference! Crazy!
Thanks for this valuable tip!

Remember that the timing is the average from walking through 3.8 million records (3,800,000).

That means that on smaller record sets the difference is minuscule. For 100 records, each operation in the calculation is adding less than a millisecond to the duration of the entire process.

1 Like

I would be curious to know how the following scenarios would do in your comparison:

  • Go to Record [ Next ], followed by an Exit Loop If that tests Get ( LastError )
  • Looping over the set simply by using GetNthRecord () incremented with a variable

The second one assumes you are only reading from the set and not dealing with anything across a relationship, but it would save you from actually going to the next record (see the sketch below).
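A minimal sketch of that second approach; the field name is hypothetical:

Set Variable [ $n ; Value: Get ( FoundCount ) ]
Set Variable [ $i ; Value: 0 ]
Loop
    Exit Loop If [ Let ( $i = $i + 1 ; $i > $n ) ]
    # Read from the found set without moving the record pointer
    Set Variable [ $value ; Value: GetNthRecord ( Invoice::Total ; $i ) ]
End Loop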

Also, comparing the use of $i with the other approaches, we have to keep in mind that the variable needs to be written to, and writing tends to be more expensive than simply reading.

As you are pointing to, those nuances can amount to nothing with a small record set. With a large record set, every little thing can make a big difference, because of the scale.

This is really interesting, isn't it? Everything I have been doing seeks to work out, in any given scenario, how to ask the machine to do less. With Go to Record [ Next ] the only question to be answered is "is this the last record?", and that is dealt with at a lower level than the calc engine, hence it is little work. As soon as you invoke the calc engine you are asking for several things to happen:

  1. Get some data on every cycle and store it;
  2. Evaluate the given expression using the stored data;
  3. Do whatever comes next.

So this totally makes sense to me, but it's very interesting to see an assessment of the comparative load / effect on elapsed time.
1 Like

I'm not sure that, in the specific case of Go to Record [ Next ], there is some "magical" lower level involved. I simply think the checkbox checks for error 101 and provokes an exit from the loop when that condition is met. The assumption here (to be tested) is that checking for last error = 101 on every iteration is faster than checking whether the record is the last one.

Someone could test this and see whether that makes sense, or whether there is indeed a deeper/faster/more powerful level than the calc engine involved.

You could well be correct, but the fact that it appears to be several times faster suggests to me that it is checking for "no more records" at a more direct level than the calc engine?

When I created the legions concept, as a method of grouping records to be displayed through a filtered portal, I found that by far the fastest means of doing this was allocating serial numbers to records in order to group them into blocks of 1,000.

Hence I assumed that serial numbers were allocated at a lower level than if I had written an expression which allocated them?

In my mind, the text in an expression has to be encoded or transformed into something that can be used: machine code or whatever. If a commonly used "expression" like "does last error = x" were hard-wired in, so that it didn't require transformation before use, I would assume it would run many times faster. So that is what I meant, which is maybe what you meant too?

I recall, when I was told by an engineer that they had decided to "expose" cURL to devs in the Insert from URL script step, that they were already using it at a lower level and had then opened access to it to us at the script step / calc engine level?

Cheers, Nick

Great stuff, thank you! It would be interesting to see whether the cache/memory settings in FMP would change the outcome.
What version were you using?

It wasn't me running the tests. I'll find out.

1 Like