StartupRestoration

With FileMaker Server 18 having been available for almost a year, I'd be interested to hear about experiences with production servers that are or are not using StartupRestoration.

There appear to be many posts from people who have had problems with it, and just about every response to a reported problem with FMS 18 is to disable StartupRestoration.

However, my understanding is that FMS 18 reverts to handling queries in serial with it disabled, whereas with StartupRestoration enabled it will deal with these in parallel, which offers a potential performance gain.

We have a post on the Community about having to downgrade FMS 18 to FMS 17 due to performance problems on a legacy system, which we'd tested with StartupRestoration both enabled and disabled. We'd not noticed much difference on this or on our development server.

I'd welcome any feedback relating to FMS 18 used on production servers.

One last thing, a useful command to check the status of StartupRestoration:
fmsadmin get serverprefs startuprestorationenabled
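If you want to check this from a monitoring script rather than by eye, here is a minimal Python sketch. It assumes the command prints lines of the form `Name = value`, possibly followed by extra detail such as a default. That output format is an assumption on my part, so compare it against what your own server prints. The function names are hypothetical helpers, not part of any FileMaker tooling.

```python
import re
import subprocess
from typing import Dict, Optional

# Assumed output shape: "StartupRestorationEnabled = true [default: true]".
# Only the name and the first token after "=" are used.
PREF_LINE = re.compile(r"^\s*(?P<name>\w+)\s*=\s*(?P<value>\S+)")


def parse_serverprefs(output: str) -> Dict[str, str]:
    """Turn 'Name = value ...' lines into a dict keyed by lower-cased name."""
    prefs = {}
    for line in output.splitlines():
        match = PREF_LINE.match(line)
        if match:
            prefs[match.group("name").lower()] = match.group("value")
    return prefs


def startup_restoration_enabled(output: str) -> Optional[bool]:
    """Return True/False if the preference is present, None if it is not."""
    value = parse_serverprefs(output).get("startuprestorationenabled")
    if value is None:
        return None
    return value.lower() == "true"


def query_server() -> Optional[bool]:
    """Run the command from this post; needs fmsadmin on the PATH.

    fmsadmin may prompt for admin credentials, so run it from a context
    where that is handled.
    """
    result = subprocess.run(
        ["fmsadmin", "get", "serverprefs", "StartupRestorationEnabled"],
        capture_output=True,
        text=True,
        check=True,
    )
    return startup_restoration_enabled(result.stdout)
```

Keeping the parsing separate from the call to `fmsadmin` means the parsing can be tried against pasted output without touching a live server.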

FMS in general is remarkably stable (and has been for a while), so most problems reported these days involve StartupRestoration being a factor in crashes. I don't see many other problems being reported on the various forums.

It's a little more complex than that, largely depending on the type of operation. So it's not all queries and it is not completely black-or-white (in parallel or serial).

You are correct that the Data Restoration feature is coupled with the Better Parallel Processing feature (BPP). With BPP on, the server spawns more threads on more cores (when it can) than with it off. More threads = an increased risk of one of them running into trouble. The Claris engineers set out to mitigate that greater risk by building the Data Restoration feature. If you turn off Data Restoration, they opted to also disable BPP so that you wouldn't run that higher risk without the safety net.
Unfortunately it seems that their safety net is causing grief of its own.

Not entirely sure I follow what you're saying here, but if you were testing performance with BPP/DR on and off and not seeing much difference: it really depends on both the type of machine and the amount of load you put on the server.
For instance if you test on a 2-core machine with high load: BPP doesn't really have much room to spread the workload around so you won't see much impact.
If you test on a 16-core machine but as a single user then there might not be much need to spread the workload around.
And to add to the complexity: certain types of operation lend themselves better to BPP than others.
"Finds" for instance are a good example of where BPP shines.

Here's an example of doing a search on an unstored calc across 5000 records on a 2-core machine. Each set of bars represents the number of concurrent server-side sessions doing this at the same time. Note that in the busiest of scenarios there is not a lot of difference between 18 with BPP on and off.

And here is that exact same activity on an 8-core machine. BPP makes a huge difference here because there is spare processing capacity.

Doing a sort on the same records, however, doesn't have such a dramatic impact: there isn't a lot of difference between BPP on or off.

So it's a complex subject: depending on the deployment, the sheer user load and the nature of the solution, BPP may have a lot of impact or pretty much none at all.
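To make that interplay a bit more concrete, here is a rough back-of-envelope sketch in Python using Amdahl's law. It is not how FMS schedules work internally. The assumptions are mine: concurrent sessions share cores evenly, and each type of operation has some fraction of its work that can run in parallel. The fractions used below (0.9 for a find, 0.2 for a sort) are made-up illustrative values, not measurements.

```python
def usable_cores(cores: int, busy_sessions: int) -> int:
    """Cores one operation can claim if sessions share the machine evenly.

    Assumption: never fewer than one core per operation.
    """
    return max(1, cores // max(1, busy_sessions))


def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's law: speedup when only part of the work can be parallelised."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def estimated_speedup(cores: int, busy_sessions: int, parallel_fraction: float) -> float:
    """Very rough estimate of the gain from spreading one operation over idle cores."""
    return amdahl_speedup(parallel_fraction, usable_cores(cores, busy_sessions))


if __name__ == "__main__":
    # Hypothetical parallel fractions for illustration only.
    scenarios = [
        ("find, 2 cores, 8 sessions", 2, 8, 0.9),
        ("find, 8 cores, 2 sessions", 8, 2, 0.9),
        ("sort, 8 cores, 2 sessions", 8, 2, 0.2),
    ]
    for label, cores, sessions, fraction in scenarios:
        print(f"{label}: x{estimated_speedup(cores, sessions, fraction):.2f}")
```

With those made-up numbers, the busy 2-core box gains nothing, the 8-core box gains roughly 3x on the find, and the sort barely moves. The model says nothing about whether a lightly loaded server has enough work to spread around in the first place, so treat it as a way to think about the problem rather than a prediction.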

Our own experience:
We've had 2 or 3 customers who have seen crashes after moving to 18 with BPP on. The crashes went away after turning it off. Since we're very risk averse, we now install all FMS 18 with BPP off. Across the board. But we don't go back to 17; we have not seen any need for that. Plus we'd lose the Admin API and the better Data API.


As ever Wim, very useful background and practical information, thank you.

We’re gradually moving towards upgrading some of our primary production servers, although it looks as if we won’t know where we stand until we actually make the move. Certainly, if we can use BPP/DR it may improve some of the find problems I've posted about previously, and we would definitely increase the number of vCPUs on the VMs. I'm hoping that, having already moved these to at least FMS16, there will be fewer surprises for us than from v15.

However, the number of reports of crashing with DR enabled is a concern, and we can assume that, with the annual FileMaker release cycle, we'll not see much more effort going into v18 to resolve this.

Regarding my post in the community (FileMaker Community (English)), we were completely taken by surprise that downgrading from FMS18 to FMS17 brought the performance improvement it did. Our aim was for the client to be able to do side-by-side tests using FMP v15 and FMPA v18, with the possibility of finally having to revert to FMS15 until a lot of optimisation work had been completed on the legacy system.

However, we now have a new problem with a plug-in causing a Windows error that appears to subsequently crash the FMSE, which restarts itself but doesn't reenable the plug-in. The fun never stops!

Make sure the plugin is thread-safe and 'certified' for server-use. I've seen many deployments suffering from this where the plugin tramples all of the memory space and crashes FMSE. Depending on what the plugin does, consider setting up an API-based system that delivers that functionality. Easier to integrate with other apps and say FM Go as well.


Great thread all,

This snippet 102 seconds in was interesting...


(multiple file advantage for now)

Yes, FMS 18 only if startup restoration is off.
https://fmhelp.filemaker.com/help/18/fms/en/index.html#page/FMS_Help/hostdb-startup-restoration.html

fmsadmin set serverprefs StartupRestorationEnabled=false

and

fmsadmin get serverprefs StartupRestorationEnabled

Thanks.


Thanks Wim. We’re hosting on behalf of another developer on this one. The plug-in is from 360Works, which we’ve only ever had good experience with.

Anyway, flying out to Austria for a ski week with the family and friends early tomorrow, so immediate problem is - packing :grinning:

The plugins from 360Works have, in the past, been very sensitive to Java updates... We also found that making sure nothing else is hitting the plugin at the same time causes the FMSE to crash less often. For example, avoid 2 overlapping schedules that both use the plugin around the same time.
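If you have a lot of schedules, one way to spot that kind of clash is to list them with a start time and a rough run time and check for overlaps. Here is a minimal Python sketch of that idea. The `Schedule` fields and the sample schedule names are hypothetical, the durations are estimates you would supply yourself, and it ignores runs that cross midnight.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class Schedule:
    name: str
    start_minute: int       # minutes after midnight
    duration_minutes: int   # estimated run time, supplied by you
    uses_plugin: bool

    @property
    def end_minute(self) -> int:
        return self.start_minute + self.duration_minutes


def overlapping_plugin_schedules(schedules: List[Schedule]) -> List[Tuple[str, str]]:
    """Return name pairs of plugin-using schedules whose run windows overlap."""
    plugin_users = [s for s in schedules if s.uses_plugin]
    clashes = []
    for a, b in combinations(plugin_users, 2):
        # Two half-open intervals overlap when each starts before the other ends.
        if a.start_minute < b.end_minute and b.start_minute < a.end_minute:
            clashes.append((a.name, b.name))
    return clashes


if __name__ == "__main__":
    # Hypothetical schedules for illustration only.
    schedules = [
        Schedule("Nightly PDF export", 120, 30, True),
        Schedule("Email digest", 140, 10, True),
        Schedule("Backup", 130, 60, False),
        Schedule("Morning sync", 360, 15, True),
    ]
    for first, second in overlapping_plugin_schedules(schedules):
        print(f"Overlap: {first} / {second}")
```

Any pair it reports is a candidate for staggering, so that only one script touches the plugin at a time.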

As Wim pointed out, most of that functionality, from the server end, can often be moved to an API service or an external micro-service.