Best Way To Implement Huge SQL Query In FIleMaker

The issue with the inner query producing so few records is not pagination related; it's just that there aren't that many records that generally satisfy this criteria.

I have some separate code I wrote that will do list functions. I've tested it with 100,000 items in each of two lists and it does the "in this list but not in that list", etc., in about 2 seconds.

--

I'm on hold for now with this project as the other company (competitor) is going to pitch MS CRM. I had a good meeting with the client today and he liked what he saw from FMP so there's hope.

Thanks as always.

Hi Oliver,

The inquiry about pagination was more a question about the reason behind the

LIMIT 25

clause in the main query. It's not important that I understand this. My question was mostly out of curiosity.

I have some separate code I wrote that will do list functions. I've tested it with 100,000 items in each of two lists and it does the "in this list but not in that list", etc., in about 2 seconds.

A thought to consider on the above:

For sure there are going to be extremely fast means for list processing outside of FMP, but in order to leverage them, something has to collect all the data gathered in FMP in order to pass it to the outside process. That process of gathering a very large found set of data within FMP can cost more time than is desirable, i.e., it's where a slow down could happen. In fact, the slowness of the gathering of the data is one of the contributing reasons for why using a FMP script to massage the data would appear to be slow.

If you run into a bottleneck with gathering the data, I hope that you will revisit the idea of creating a more targeted found set of data by using the features of FMP that perform efficiently, such as the finds and the omit request feature of finds.

HTH, and best wishes with it. Glad to hear the meeting went well.

The reason behind the LIMIT 25 is that the query creates a call list for the sales folks. The 25 is just a form default and they can change that value In the actual application

I'll definitely use Finds and Omits, if performant, before reverting to external list logic. That's a great suggestion. FileMaker Finds are typically very fast.

Thanks Steve.

1 Like

OK, I have a script portion that does the sorting for the Outer portion of the SQL statement. Simple.

I also have a script portion that does the inner exclude portion. Also straightforward.

But how do I tell the outer loop's find results to exclude all records for a certain field (not a field's "value") that are in the inner loop's find results?

I tried various combinations of omit and even changing the relationships, but this is why SQL should work better here: simple, industry-wide, and intuitive. (Frustrated with all the extra work FileMaker requires to achieve simple SQL results.)

Thanks,

Please explain what you mean by the following:

Sure, sorry about that confusion.

In a typical SQL subquery, you have something like:

select something where some field is not in (select ...).

So, I meant how do I get FileMaker to do this "where some (a specific) field" is not in the outer find?

So far, ...
I have the outer part of the SQL where I get all the records sorted.
I have the inner part where I've restricted the records.

All of the examples I see are were you have a particular value to "omit".

Thanks.

Ah!

I said the following in a previous post:

You created a list and stored it in a variable as part step 1. That's the inner portion of the SQL query.

You will need to use a loop in step two to create omit requests, one for each value in the list created in step 1. The loop should look something like this:

set variable [theCount; valueCount (theList)]
Loop
  Exit Loop If [theCount = 0]
  New Record / Request
  Set Field [field_1; GetValue (theList; theCount)]
  Omit Record / Request
  Set Variable [theCount; theCount - 1]
End Loop

Hope this help!

2 Likes

In SQL, one tends to put the business logic "outside" of the data, for example as shown by your original somewhat complicated query.

In FileMaker, one tends to put the logic "inside" the data, and build up complex queries using calculated fields first, and relationships second.

For example, suppose you have a Customers table, and you want to know which Customers are due for another Call soon. In SQL you might do that with a query. In FileMaker, you would do that with a calculated field, for example:

Customer::NeedsCallback = 
Let( 
  now = Get(CurrentDate); // today's date
  lastCall = Max(Customer_to_Calls::Date);  // date of last call to customer
  elapsed = now - lastCall ; // # of days since last call
  highValueCustomer = Customer_to_Sales::Total > 1000000; // treat big clients specially
  employeeOnVacation = Customer_to_Employee:OnVacation;  // is the rep on vacation?
];
Case(
  not IsActive; "" ;  // inactive customers never need callbacks
  employeeOnVacation; ""; // if salesperson is on vacation, ignore
  elapsed > 3 and highValueCustomer; 1; // big customers get calls more often
  elapsed < 7; "" ;  //  under one week since last call?  ignore
  
  [ ... other business rules here ... ]

Then, with this complicated calc Field defined, queries become quite simple -

Perform Find Customer::NeedsCallback = 1

Pros of this approach:

  • data is local and encapsulated: this is a more object-oriented approach: the Customers table knows when the Customer is due a callback. You don't have to go searching through reams of SQL to find where the logic is located
  • debugging is easy: the field updates instantly in the data table, so you can view, edit, and change the field by itself, and be confident that any query using this field will also be updated.

Cons:

  • speed - these calc fields that reference other relationships are SLOW in FileMaker. However, there are other techniques for speeding these up (the Triggered-Stored-Indexed-AutoEnter-Calc field trick).
3 Likes

Sorry I missed your previous posting. I appreciate your input.

Here's the script code I have for the final part.

The "$innerQueryList" variable is the list of, in this case, 55 IDs).

Set Variable [ $theCount ; Value: ValueCount ( $innerQueryList ) ]
Loop [ Flush: Always ]
Exit Loop If [ $theCount = 0 ]
New Record/Request
Set Field [ field1 ; GetValue ($innerQueryList; $theCount) ]
Omit Record
Set Variable [ $theCount ; Value: $theCount-1 ]
End Loop

It looks like it's working, but when it's done, all the records still display (Get(FoundCount) = table size).

What's off in what I did?

I checked in MariaDB, and for this ID (using the actual SQL), with no limit filtering, there should only be about 19,000 values left instead of > 500,005 (all table records).

Thanks

Nice ideas, thanks!

Hi @OliverBarrett,

That loop code (excerpted just above) looks good, but I will note that it needs to be executed within the portion of your script that is in "find mode".

To illustrate, the bigger picture of your script might look something like this:

  • Go To Layout (establish the correct context for performing the Find)
  • Enter Find Mode (tell FMP that you are about to define your Find criteria)
  • Set Field (this is a step to handle matching a particular value)
  • Then perform the steps of loop code (which define what you want to omit)
  • Perform Find (Tell FMP to execute the defined Find criteria)

Comments:

  • I deliberately left out error trapping/handling steps, just to keep things to a simple skeleton. Normally I'd be doing some error trapping right after the Perform Find step.
  • There is another way to do this sort of thing where one first finds found records, and then omits other records by using a step called Constrain Found Set, but the above pattern where the omit steps are included within the main Find body is a good pattern to be aware of, as it is very common within the world of FMP scripting.

HTH.

1 Like

Well some progress. I'm now down to 470,000+ found records so still only off the correct result by about 450,000... :frowning:

Appreciate your help.

I need to do this another way, clearly.

I suspect that the issue lies in both accurately and completely translating the intent behind the original SQL into the FMP paradigm. We've been chipping away at the "accurately" part, but probably are only beginning to address the "completely" part.

Some thoughts:

  • The SQL in the original post is doing a GROUP BY.

  • On its own, creating a found set of records in FMP is not applying any sort of GROUP BY construct.

  • So, if the MariaDB result is incorporates GROUP BY, then it feels like expected behavior that the FMP find is returning many more records -- they are two different queries.

  • I would, however, be looking to have the same number of records/rows if one compares the FMP found count to the row count returned by MariaDB if the SQL were run without the GROUP BY.

The bigger picture of the common pattern in FMP for doing this type of work goes something as follows:

  1. Create a found set of (ungrouped) records (as per conversations above)

  2. Use sorting and summary features to "group" records by a field or set of fields.**

  3. Either use layout sub-summary features to display the results to a user, or use scripting to harvest the results into some other data structure for consumption by some other code.

I'm expecting step 1 to be fast. For steps 2 and 3, I think it will take trying it out to see how well it performs.***

I realize that doing the above can feel like a lot of "unnecessary" work if the expectation is that FMP should behave like an SQL database. It can especially feel this way if the FMP patterns are unfamiliar, but I'm pretty sure that lots of us here will be happy to chip in and do our best to help ease that learning curve (as evidenced by some of the participation already happening in this thread).


** Note, however, that while "grouping" does effectively group records in a way that is natural in the FileMaker world, it does not reduce the number of records in the found set. In FMP, this grouping serves a similar conceptual purpose as GROUP BY in SQL, but it does not attempt to be an exact of copy the behavior of an SQL database.

*** In the case where the sorting and summarization takes too long, or even borderline too long, within the context of building this out in FMP, I would consider whether it might be possible to implement in a way where the sums are pre-calculated and cached. That calls for more effort up front to build out, but it pays off handsomely with respect to runtime overhead during reporting.

3 Likes

Thanks Steve. You're awesome! :slight_smile:

1 Like

Christian,

I use SQLLite some times in applications for storing preferences and such.

Following up on your suggestion, how long do you think it would it take MBS to load 500,000 records into an in-memory SQLLIte DB with about 100 fields per record? (The other table to load only has about 20K records).

This loading would have to be done each time a user wants to get this result as the data change constantly (IOW, that import can't be run say just once a day).

Not sure if this loading would take too much time, but I'm wondering what bench-marking you might have done.

Thanks to all. Resolution below.

OK, I got it working, but it took over 2 minutes to process.

However, having presented the solution, the client said "NO" and feels generally uncomfortable with FileMaker's proprietary approach. Plus, given the time it took in FileMaker with all special coding and other techniques to convert a single SQL statement, plus the (relative) immense amount of time for it to run (2 minutes for FileMaker vs. 2 seconds with SQL Server), this is a no-go. FileMaker flunked the non functional query performance requirement laid out by the client in his spec.

To make matters more challenging, his has about 550 complex SQL statements, most with GROUP BY, most on large data.

So, assuming it took us, say, 30 minutes per query to convert standard SQL to something FileMaker could do, it would take (30 minutes / query * 550 queries) over a month to do (34 days+), assuming you did this full time, 8 hours a day and nothing else, no breaks, and you were always as successful at the 30 minutes per conversion.

Plus...his queries change often. So, implementing queries in FileMaker script code means a query change = a code change. Thus, this script technique to run industry standard SQL gives up the declarative ability of SQL (the self-describing non code-based way to run). Constant code maintenance and application updates here.

Final decision -- He decided to go with MS-CRM where he can just use his SQL and get often sub-second results with no conversions or proprietary techniques needed. Although I'm no fan of MS-CRM, I understand his decision. He also said he preferred the UI capabilities of MS-CRM better though I didn't get a full breakdown from him on these.

Will keep looking for a project that fits both FileMaker's capabilities and my clients' willingness to use FileMaker (so far, I'm zero for four in FileMaker app sales attempts). This project definitely wasn't it.

2 Likes

Taking something that works, asking for a month of fees, and returning a much slower version of it is not usually on the spec sheet.

:smile:

2 Likes

It would have been a $500K - $1M project so, yes, it's disappointing (and plenty of lost revenue to Claris).

Thanks M.