Hello @JamesG
Some ideas/thoughts/feedback for you:
Bruce:
Bruce Robertson authored a CF similar to ZapValues, which, I believe, is named AntiFilterValues. You might look for that function, and see if it is more performant.
Agnès:
I would also look to see if you can find anything by the brilliant Agnès Barouh. I seem to recall that she once posted a custom function that leveraged the v.16 FMP native SortValues function in a really genius way.
EDIT: Correction: The function used UniqueValues (not SortValues). See link posted below by @jwilling.
My memory is no longer clear about this, but I believe it might have been another implementation of the ZapValues/AntiFilterValues concept. If you can find it (or if anyone can chime in with a link to it), I would encourage you to also test that for performance.
Agreement with your findings:
FWIW: I have also found that, given the same number of iterations, and the same work being performed with each iteration, the While function has not offered a significant performance advantage in the few cases that I have tested -- though it certainly can help to write easier-to-read code.
Regarding optimization of CFs:
At the risk of stating the obvious:
The key to making such functions perform optimally is to both:
- Keep the overhead per iteration as low as possible.
- Keep the number of iterations to a minimum.
In the context of rewriting a ZapValues-type function:
If you simply re-wrote the ZapValues function by keeping the same concept, but using the While construct, then I believe that you might be able to win a performance gain by making changes to your implementation. The key would be to exploit any potential to zap more than one value in the source list per iteration.
Of course, the resulting function will likely lose the elegance that comes with its simplicity, but perhaps it would be a worthwhile exchange if you are able to write something which is air-tight and more performant. I think it's worth checking out to see.
Examples of things to play with to reduce the iteration count:
1. If your input list contains duplicate values, is the code zapping all instances of a repeated value at once within a single iteration? Or does the code iterate through every value in the input list?
2. Experiment with operating on a sub-block of the input list per iteration, and performing multiple tests/zaps per iteration. This would likely mean some repetitive code, but it would reduce the number of iterations by a factor determined by your block size.
3. Combinations of both #1 and #2 above.
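To make the first two ideas above a bit more concrete, here is a rough sketch that counts iterations for three approaches. It is written in Python rather than FileMaker calculation syntax, purely to keep it short and testable, and the function names are hypothetical (they are not Bruce's, Agnès's, or your implementation). Each pass of a Python loop stands in for one recursion or one While pass in a CF, and the list comprehension inside the second function stands in for a single native call that removes every instance of a value at once. That mapping is an assumption on my part, so treat the counts as an illustration, not a benchmark.

```python
def zap_per_value(source, zap):
    """Baseline: one iteration for every entry in the source list."""
    kept, iterations = [], 0
    for value in source:
        iterations += 1
        if value not in zap:
            kept.append(value)
    return kept, iterations


def zap_per_distinct(source, zap):
    """Idea 1: one iteration per distinct zap value.

    Every instance of that value is removed in the same iteration.
    """
    remaining, iterations = list(source), 0
    for target in dict.fromkeys(zap):  # distinct zap values, order preserved
        iterations += 1
        remaining = [v for v in remaining if v != target]
    return remaining, iterations


def zap_in_blocks(source, zap, block_size=4):
    """Idea 2: test a whole block of source values per iteration."""
    zap_set = set(zap)
    kept, iterations = [], 0
    for start in range(0, len(source), block_size):
        iterations += 1
        block = source[start:start + block_size]
        kept.extend(v for v in block if v not in zap_set)
    return kept, iterations


source = ["a", "b", "a", "c", "b", "a", "d", "e"]
zap = ["a", "b", "a"]

print(zap_per_value(source, zap))     # (['c', 'd', 'e'], 8)
print(zap_per_distinct(source, zap))  # (['c', 'd', 'e'], 2)
print(zap_in_blocks(source, zap))     # (['c', 'd', 'e'], 2)
```

All three return the same list; only the iteration count changes. Whether the lower count actually wins in FileMaker depends on how expensive each iteration becomes, which is exactly the trade-off between the two points listed earlier.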
Finally:
Before investing much time in evaluating and possibly improving the performance of the implementation that you wrote, I'd strongly suggest checking out both the implementations by Bruce and Agnès. Doing so may very well yield some great insights, is likely to be inspiring, and may save you some time.
Kind regards, and good luck!
I hope that you will post back with your findings...
-steve