I have three datasets totalling approximately 35,000 records. They are CSV files and have to be transferred every day.
I'd like to be able to transform each dataset into JSON and throw them at the target database in batches. Currently, all I can do is create new records row by row, which is time-consuming.
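For concreteness, the row-to-JSON transformation I have in mind looks something like this (the field names and sample data are made up; the Data API creates one record per POST to the layout's records endpoint, so this yields one payload per row):

```python
import csv
import json
from io import StringIO

def csv_to_dataapi_payloads(csv_text):
    """Turn each CSV row into a Data API create-record body.

    Field names come from the CSV header and must match the
    fields on the target layout.
    """
    reader = csv.DictReader(StringIO(csv_text))
    return [{"fieldData": dict(row)} for row in reader]

# Hypothetical three-column sample.
sample = "Name,Qty,Price\nWidget,3,9.99\nGadget,5,4.50\n"
payloads = csv_to_dataapi_payloads(sample)
print(json.dumps(payloads[0]))
# → {"fieldData": {"Name": "Widget", "Qty": "3", "Price": "9.99"}}
```

Each of those payloads still costs one HTTP round trip with the Data API, which is exactly the per-row overhead I'm trying to avoid.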
I'm wondering if a better solution is to push the CSV into a container field in the Cloud app, then write the contents out to a file on the server, and import from there?
I have not been in a situation where I would need to try that, but pushing 35k records sounds like it is worth trying to get the file across and handle it on the server. I believe I've heard of John Renfrew being a fan of something similar.
But I think in his case it was to get data out of the server. Since the API does not push the container contents itself, but instead gives you a link to them, the technique can be used to circumvent some of the API's data metering. I do not know whether it has merit from a performance angle or not.
In your case, as you are pushing data towards the server, the payload is not really impacting your API quota.
The local end runs FMS on-premise, and initially we imported the CSV into the on-premise databases.
Another alternative I had considered is to continue doing that, and have the FileMaker Cloud app grab the data by using Insert from URL to call the Data API.
I have colleagues who swear (in a good way) by the technique you describe above. I suspect that, assuming the server can handle the extra load during the window when this must happen, this approach will be difficult to beat with respect to performance.
As far as APIs go, it sounds as though you are settled on the Data API, but I will note that one of the things I am excited about with the newer OData API is that it allows performing batches of update operations within a single request. I would think this would give you the option to push data into your Cloud server in batches. That said, I don't know of any community feedback regarding OData performance -- OData feels promising, but it also still feels very new at this point. And -- I still suspect that upload-as-container followed by a server-side import is going to be the performance winner.
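For reference, an OData batch is multipart MIME rather than plain JSON: the insert operations get wrapped in a "change set" with per-part headers and boundaries. Here's a rough sketch of assembling one, based on the OData v4 spec rather than anything Claris-specific (the table name and field names are hypothetical, and the result would still need to be nested inside the outer batch boundary and POSTed to the $batch endpoint):

```python
import json

def build_odata_changeset(records, table, boundary="changeset_1"):
    """Wrap a list of insert operations in an OData v4 change set body.

    Sketch only, per the OData v4 multipart batch format; check the
    Claris OData docs for the exact endpoint and any restrictions.
    """
    parts = []
    for i, rec in enumerate(records, start=1):
        parts.append(
            f"--{boundary}\r\n"
            "Content-Type: application/http\r\n"
            "Content-Transfer-Encoding: binary\r\n"
            f"Content-ID: {i}\r\n"
            "\r\n"
            f"POST {table} HTTP/1.1\r\n"
            "Content-Type: application/json\r\n"
            "\r\n"
            f"{json.dumps(rec)}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    return "".join(parts)

# Hypothetical table and fields.
body = build_odata_changeset(
    [{"Name": "Widget", "Qty": 3}, {"Name": "Gadget", "Qty": 5}],
    table="Products",
)
```

Note how each record carries its own block of MIME and HTTP headers -- that per-record envelope is pure overhead relative to a bare CSV row.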
You said you were doing some testing with OData. Did you get anywhere with it? Also, I'm targeting a Cloud app. Can we use OData auth with FileMaker Cloud?
I have been doing research on OData both to gain some familiarity, and to put together some learning material to share with the FM community. At this point I would call myself a strong "armchair" user -- I am comfortable with the basic techniques, but I don't have real world deployment experience to back that up.
The authentication for OData with FM Cloud involves Amazon Cognito, and so, if you have had success working through that, you should be able to leverage that success to use OData. Since I have been doing my testing and learning using a Linux FMS installation, I am taking the easy route, using just a basic authentication method.
The auth for Cloud is still making my head spin.
If I've obtained a tokenID and a sessionID via Cognito, can I use those on an OData session?
In the Claris OData docs, they only mention a "Claris ID Token". If you are able to obtain that, then, I believe it is just a matter of supplying that token in the header of each OData request.
Here is where I am getting that information.
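In practice, supplying the token would look something like this. To be clear about my assumptions: the host name and database path are placeholders, and the `FMID` authorization scheme is my reading of the Claris Cloud docs -- please verify against the current documentation before relying on it.

```python
import urllib.request

# Hypothetical token value; in practice this would be the ID token
# returned by Amazon Cognito after signing in with Claris ID credentials.
claris_id_token = "eyJraWQi..."

# Hypothetical host and database; the /fmi/odata/v4/{database}/{table}
# path shape is taken from the Claris OData documentation.
req = urllib.request.Request(
    "https://example.claris.com/fmi/odata/v4/MyDatabase/Products"
)
# 'FMID' scheme is an assumption from the docs linked above.
req.add_header("Authorization", f"FMID {claris_id_token}")
req.add_header("Accept", "application/json")
# urllib.request.urlopen(req) would then perform the call.
```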
Do you use our MBS FileMaker Plugin?
We do have a FM.InsertRecordCSV function, so you can upload the CSV into a field, then let the plugin create records quickly.
How does that work with FileMaker Cloud?
Oh, sorry. You may need a different cloud host than the one from Claris in order to use plugins.
I've run a quick experiment in Postman. Both the Data API and the OData interface require that you authorise with Cognito. When you do, you receive an ID token. The OData interface lets you authorise using the ID token directly. With the Data API you take an extra step to obtain an API token, and you then use that API token to authorise calls.
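To spell out the Data API's extra step in code (host, database, and layout names are placeholders, and the `FMID` scheme for FileMaker Cloud is my reading of the Claris docs, so treat this as a sketch):

```python
import urllib.request

def dataapi_session_request(host, database, cognito_id_token):
    """Exchange the Cognito ID token for a Data API session token.

    The sessions endpoint path follows the Data API guide; the response
    body carries the session token to use on subsequent calls.
    """
    url = f"https://{host}/fmi/data/vLatest/databases/{database}/sessions"
    req = urllib.request.Request(url, data=b"{}", method="POST")
    req.add_header("Content-Type", "application/json")
    req.add_header("Authorization", f"FMID {cognito_id_token}")
    return req

def dataapi_request(url, session_token):
    """Subsequent Data API calls carry the session token as Bearer auth."""
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Bearer {session_token}")
    return req
```

The OData interface skips the first function entirely: the ID token goes straight into the request header.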
The ability to batch operations in OData is intriguing, but I don't think it's going to be the best way to handle my situation. Processing the records as batches requires that they are transferred as JSON and wrapped in a change set. The overhead generated by the JSON metadata and the change set metadata adds up: the data set is 2,911,170 bytes as CSV, and the JSON and change set boundaries contribute an extra 16,148,171 bytes. That turns a 3MB data set into a 19MB payload.
Thanks for posting back on this @Malcolm
I am not surprised by your assessment, and I think it's sound, too (FWIW). But -- I appreciate that you went so far as to scope it out, and further that you shared your findings back on this thread. That's helpful for the rest of us.