Optimization ideas for Draco calculation engine

Since HOnza posted about his Initiative ’24 about making FileMaker calculations quicker, I spend some time thinking on how to do this practically. Like what could be optimized if Claris puts work into their calculation engine. And I got some ideas:

  • Math seems to be a problem sometimes. Have a flag internal for number values on whether they are 32-bit integers. If math functions are invoked and the flag is set for both values, perform integer math directly with normal C math instructions and skip the precision math library. E.g. the normal "j = j + 1 " in a loop doesn't need to do precision math.
  • Cache parsed calculations. Evaluate() taking a text parsed calculation opcodes. The same calculations are frequently used, so if possible cache the parsed version of the calculation for some time. e.g. the calculation engine could keep a hash map with the last 10 parsed calculations to avoid parsing it again if the same calculation is needed again soon. e.g. for calculations running very frequently in a loop.
  • Apply caching for constants. It may be worth to cache calculations and their result for calculations, that don't need externals. e.g. if the calculation is just a number or text and avoid running it through the full calculation engine each time.
  • Have a flag for text values on whether they carry StyleRuns. Have text functions optimized to run a faster code path if no styles are involved. e.g. a function like text concatting may not need need to bother about merging style runs if both strings have no styles.
  • For small text values, consider a small text optimization. e.g. like std::string class have space for a few characters in the text value data structure, so you don't need to allocate bytes for the data. If I remember correctly std::string uses the high bit in length value to indicate that characters are stored in the data pointer directly instead of it referencing additional memory for the text. 8 bytes for a pointer then fit up to 7 characters in std::string.
  • Use reference counting as much as possible for strings. Avoid copying text values and just reference them. Using e..g Middle($s; 200; 300) on a long text should reference the new text and reference the original memory for the character to avoid copying it. The new text value then has a different data pointer and size value to just point to the requested text portion.
  • Consider walking through older Draco code and find cases where Draco was optimized to work on Macs with 128 K of memory. Like don't try to process data in small chunks, have it enjoy the current machines with GBs of memory and optimize functions to use bigger block sizes or work on the whole string.
  • Have values be marked with a flag for whether they a known to contain only ASCII text, so text functions could have branches which handle full unicode text different than pure ASCII text, since that usually can be handled much quicker.
  • A ton of API calls need text as UTF-8 or UTF-16 or UTF-32 depending on what API is used. Your text values could cache such representations. If the same text encoding is needed later, you don't need the conversion again. If the text changes, you would clear it. And if the same text encoding is not needed again, you just delayed the freeing operation.
  • Cache parsed JSON/XML/Calculation values. If you are at caching in the text values, you could do it for parsed texts as XML, JSON or Calculations.

Well, if I would design a class in C++ to hold a text value (n UTF-16), it may be something like this:

class TextValue {

	// Flags: IsSmallText, IsASCIIOnly, HasStyles
	// one of: hasJSON, hasXML, hasParsedCalculation
	// one of: hasTextUTF8, hasTextUTF32
	uint32 flags;
	uint32 length;
	StyleRuns *styleRuns;
	
	union
	{
		uint16 *bigTextdata;
		uint16 smallTextData[4];
	};
	
	union
	{
		JSONFragment *json;
		XMLDocument *xml;
		ParsedCalculation *calculation;
	};

	union
	{
		uint8* textUTF8;
		uint32* textUTF32;
	};
}

Since we use union to store values exclusively controlled by flags, we can do this all in about 48 bytes. For most texts almost all fields are NULL, but if you'd ask the text value twice for something, the second call can return a cached value. Just make sure all code calls the relevant functions instead of accessing the text directly. I personally use optimizations like this in my plugin code.

For all changes, please measure. As an outsider we have no idea whether the overhead for caching needs more CPU time than it saves. But the goal could be to write fast replacements for a couple of functions which produces the same result and behavior. It would be an interesting job to optimize a calculation engine like FileMaker's and find the spots where an optimization brings a real result to the users.

Or maybe someone at Claris is already working on a translation of calculations to JavaScript to leverage a quick JavaScript engine? But if that approach would be considered, better skip JavaScript and write a LLVM frontend for FileMaker calculations and let LLVM do the compile part and run the machine code it outputs.

6 Likes

thinking of the Draco engine and the db backend I always ask myself wether we would draw gains from adding a special index type for UUIDs - at least adding UUID indexing has a heavy impact on file growth... - but than of course I have no idea about what it would mean in terms of effort for the dev team at Claris and if this is possible at all

I think we put in an UUID field type request years ago.

The idea is like MySQL to store it binary as 16 bytes. Apply index like for a number, so you can quickly find values. And whenever user reads or write it, expand to a text representation.

2 Likes

It would be great to get this list in front of Claris engineers and get their reactions as to how much of an impact some of these might have, reasons why they might not work, etc. More understanding is always better.

Is @Clay the right person to ping?

I really like the idea of reference counting strings and avoiding cloning. That could really improve lots of operations, from parsing out values from delimited data (Middle, Left[Values], Right[Values], GetValue. Iterating lists could get faster (albeit GetValue() is rarely the bottleneck except in super long lists).

It would definitely have some interesting rules to make sure underlying strings don't get mutated "under our feet" while still holding a reference to various string slices. E.g. perhaps Middle ( Field ; 10 ) would always need to allocate a new string, because holding a reference to a portion of a field value seems tricky. But I can see this working for variables if you did some kind of clone-on-write thing where "mutating" a string allocates a new one, leaving the old one in place if any other references to it exist. That could end up with garbage hanging around for a while if, say, the original string is huge, then you take a 10-byte slice of it, and then add 5 characters to the end of the original string variable (producing a clone). Now you have two humongous strings sitting around even though you're only using 10-bytes of the first.

Great food for thought overall @MonkeybreadSoftware

1 Like

Well, the first thing is to get Claris to have calculation performance as a priority. Not sure how much of a priority it has been in the past years.

Once you have the will do improve it, you need to measure. We have no idea whether maybe some of the optimizations are there already or there are good reason why one of them would not be a good idea.

Maybe we can ask Clay or Rick about this at the conference. Or at least ask them who's in charge of this. It would be nice to see an iteration on this and a few percents of speed improvements for each upcoming release, so over one or two years the performance doubles.

5 Likes

As far as I observe and perceive it a priority is when a major account requests it. If one of their highest volume clients would asks for it something might move - would be my bet. IMHO ..

2 Likes

@MonkeybreadSoftware - Did you already share this with Clay?

If they tackle improving calculation performance... I would love to see them in the same breath also tackle aggregation performance improvements.

2 Likes

Hi Vince,

thanks for chiming in!

1 Like

It may be something to talk with him at the conference.

have been talking about this for decades and asked specifics without success. For example security settings are evaluated for each record apparently if still valid and are a major bottleneck therefor. If bitmaps would be used instead for locking and access control it might look different...

Good luck!