Thinking Outside the Loop

By Daniel Wood, 3 March 2011

If you have spent any time developing in FileMaker, or programming in almost any language, you will be familiar with the concept of a loop. When scripting your loops you probably have a favorite method that you use. This article presents and compares a number of looping techniques that can be used in FileMaker in an attempt to locate the fastest method.

Establishing our Test Scenario

Let's assume that our solution is a CRM with a Contacts table. The CRM also has a letter-writing screen where we can compose letters.

To make the best use of this facility, we should be able to compose a letter and print a personalized version for each contact.

In order to print these personalized letters we require three important things:

  • A Letters table to store the personalized letters
  • A record for each individual letter
  • A Contact ID field on each letter to link it to a Contact

The goal of our loop, then, is to generate Letter records, each personalized for a specific contact by virtue of carrying that contact's Contact ID.

A Note:

This article makes heavy use of the example file that accompanies it. It might be a good idea to download the example and reference it while reading this article :-)

The code in this article is also a simplified version of what appears in the example file, intended to illustrate the basic loop structure. If you want to see it in action, the example file is the best place to look.

Method #1: Layout Switching

This is perhaps the simplest method, and the one most beginners to FileMaker would use. It involves looping through contact records, grabbing information from each in order to create letter records. The loop ends when all records have been traversed.

Set Variable [ $LetterContents ; Home::Letter Contents ]
Freeze Window
Go to Layout [ "Contacts" (Contacts) ]
Show All Records
Go to Record/Request/Page [ First ]
Loop
Set Variable [ $ContactID ; Contacts::Contact ID ]
Go to Layout [ "Letters" (Letters) ]
New Record/Request
Set Field [ Letters::Letter Contents ; $LetterContents ]
Set Field [ Letters::Contact ID ; $ContactID ]
Go to Layout [ "Contacts" (Contacts) ]
Go to Record/Request/Page [ Next ; Exit After Last ]
End Loop

One of the biggest problems with the above method is the number of layout switches it involves. For each letter created there are two layout switches - from Contacts to Letters and back. If you are generating 20,000 letters, that's 40,000 layout switches, which all adds up. This method might be fine for small numbers of letters, but when the number of records generated gets large, an impact on performance can be seen.

Method #2: Chopping a List

This method differs from the first in that there is little or no layout switching involved; instead, the looping is driven by a list of Contact IDs defined before the loop begins. This method is quite common, but it too suffers from some inefficiencies.

Set Variable [ $LetterContents ; Home::Letter Contents ]
Set Variable [ $ContactIDs ; List ( Home to Contacts::Contact ID ) ]
Set Variable [ $Total ; ValueCount ( $ContactIDs ) ]
Go to Layout [ "Letters" (Letters) ]
Loop
New Record/Request
Set Field [ Letters::Letter Contents ; $LetterContents ]
Set Field [ Letters::Contact ID ; GetValue ( $ContactIDs ; 1 ) ]
Set Variable [ $ContactIDs ; RightValues ( $ContactIDs ; ValueCount ( $ContactIDs ) - 1 ) ]
Exit Loop If [ IsEmpty ( $ContactIDs ) ]
End Loop

During each iteration of the loop, the first contact ID in the list is set into the new record. The list is then trimmed of its first value, so that the second value becomes the first. When the loop next iterates, the new first value is used, and so on until the list is empty.

The big issue with this method is the overhead of trimming the list on every iteration. In particular, RightValues is a costly function call, and the more letters you create, the more calls to it are required. Consider generating a large set of letters such as 20,000: your initial list is 20,000 values, so after the first iteration RightValues is being told to grab the 19,999 right-most values from that list to build a brand new list, then 19,998 on the next, and so on - roughly 200 million values copied over the whole run. That can't be good!

So, is there any way we can still use a list of contact IDs, but perhaps avoid using these costly function calls during each iteration of the loop? Fortunately there is...

Method #3: The Counter Approach

You may have noticed in Method #2 that we used the GetValue function to obtain the first value in the list. GetValue takes two parameters, the second of which is the value # in the list to obtain. We were just using the first value, but there is no reason why we can't grab any value.
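As a quick illustration (these example values are mine, not from the example file):

GetValue ( "Anna¶Bob¶Carl" ; 2 )  // returns "Bob"
GetValue ( "Anna¶Bob¶Carl" ; 3 )  // returns "Carl"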

So, we are now going to maintain a running counter variable for each iteration of the loop. We will use this in conjunction with GetValue. This lets us obtain a contact ID during each iteration, without needing to modify the original list:

Set Variable [ $LetterContents ; Home::Letter Contents ]
Set Variable [ $ContactIDs ; List ( Home to Contacts::Contact ID ) ]
Set Variable [ $Total ; ValueCount ( $ContactIDs ) ]
Set Variable [ $Counter ; 1 ]
Go to Layout [ "Letters" (Letters) ]
Loop
New Record/Request
Set Field [ Letters::Letter Contents ; $LetterContents ]
Set Field [ Letters::Contact ID ; GetValue ( $ContactIDs ; $Counter ) ]
Set Variable [ $Counter ; $Counter + 1 ]
Exit Loop If [ $Counter > $Total ]
End Loop

This method has eliminated the need for the RightValues function call, and instead requires one GetValue function call per iteration of the loop.

One thing I noticed in building this particular loop was that during each iteration I was carrying out two "Set Field" operations - one to set the Contact ID, and one to set the letter contents. This means that if we generate 20,000 records, there are 40,000 Set Field operations. Is there another way to set field contents en masse? Why yes there is...

Method #4: Replacing

The premise behind this method is as follows - we generate all the letter records using a standard loop as we have seen in the above methods, however we don't set any field values during this loop. Once all records are generated, we then run two "Replace Field Contents" steps across the found set, one to set contact IDs, and the other to set letter contents.

The question is - are two Replace Field Contents steps faster than X "Set Field" steps, where X is twice the number of records generated?

Set Variable [ $LetterContents ; Home::Letter Contents ]
Set Variable [ $ContactIDs ; List ( Home to Contacts::Contact ID ) ]
Set Variable [ $Total ; ValueCount ( $ContactIDs ) ]
Set Variable [ $Counter ; 1 ]
Go to Layout [ "Letters" (Letters) ]
Show All Records
Show Omitted Only
Loop
New Record/Request
Set Variable [ $Counter ; $Counter + 1 ]
Exit Loop If [ $Counter > $Total ]
End Loop
Replace Field Contents [ No Dialog ; Letters::Contact ID ; GetValue ( $ContactIDs ; Get ( RecordNumber ) ) ]
Replace Field Contents [ No Dialog ; Letters::Letter Contents ; $LetterContents ]

Here you can see the loop creating essentially blank records. When done, the created records then have their fields populated. GetValue is used in conjunction with Get(RecordNumber) to populate Contact ID.

When the replace happens, Get(RecordNumber) is evaluated from the context of each record, so when the 50th record is being replaced, Get(RecordNumber) will be 50, and so on.

Note also that before looping we ensure the found set is empty so we don't interfere with other letter records when we do the replace.

Looping methods without actually looping at all...

Up until this point, all four looping methods have used Loop constructs - that being the Loop and End Loop script steps. We are traversing items, whether they be records, or values in a list.

There is however another way in which records can be generated in a table without explicitly looping or using the New Record/Request script step, and that method is using an Import.

The Import Records script step can be used to import records from a file, such as CSV or tab-delimited files. It can also be used to import from another FileMaker file. More specifically still, it can import from a different table within the same file, and it is this method we will be making use of.

So how do we create multiple records using Import?

The conventional idea behind an import is that you start with X records in the source table, and after the import you have X records in the destination table. In this situation, however, we don't have X records to begin with.

The way in which we will create the letter records is to make use of a particular setting of the Import Records script step:

[Screenshot: the Import Options dialog, showing the choice to split repeating field values into separate records]

When importing values in a repeating field, you have the option either to keep them in one record, or to split them into separate records.

So if we have 20,000 contacts, for example, and we can somehow get each contact ID into a repetition of a repeating field, we can then import that repeating field choosing to split it into separate records - effectively generating 20,000 letter records!

Method #5: Import Madness

And here it is:

Go to Layout [ "Home" (Home) ]
Show All Records
Omit Record
Show Omitted Only
Set Variable [ $$ContactIDs ; List ( Home to Contacts::Contact ID ) ]
Import Records [ No Dialog ; "ThinkingOutsideTheLoop.fp7" ; Add; Mac Roman ]

It's a little hard to deduce what is happening here, so I'll explain. The first few steps ensure that we are in a found set of exactly one record, which in turn ensures the import step generates the correct number of letter records. This comes down to how the import step handles repeating fields: if we have a repeating field of 20,000 repetitions, each with a contact ID in it, and we import from a found set of one record, we get 20,000 letter records. If, however, we had a found set of two records, we would get 20,000 letter records per record - in effect, 40,000 letters! We always need to be on a found set of one.

Setting the global variable $$ContactIDs is how I populate the repeating field that is part of the import. Let's have a look at that repeating field:

[Screenshot: the definition of the unstored repeating calculation field that draws its values from $$ContactIDs]

This is a live, unstored calculation that sets each repetition to a value from the list in $$ContactIDs. So if $$ContactIDs contains 20,000 values, the first 20,000 repetitions of this calculation field will have values; any others will be blank.
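The exact definition is in the example file, but a calculation along these lines would do the job (a sketch - assume the field is defined as an unstored repeating calculation with 32,000 repetitions):

// Unstored repeating calculation, result type Text
// Each repetition pulls its own value from the global variable
GetValue ( $$ContactIDs ; Get ( CalculationRepetitionNumber ) )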

It is important here to note also that only repetitions with a value in them will be imported, and thus have a record generated for them.

In the Import Records step itself, the repeating field is mapped onto the Contact ID field in the Letters table, and the letter contents field is mapped accordingly.

Some inefficiencies with this Import Method

The above method is great: it's simple, it cuts down on the number of lines of code, and it is quite reliable. However, there is an inefficiency here - the repeating field we are using in the import is an unstored calculation. When the import occurs, each repetition has to be evaluated, so in effect X calculations are evaluated, where X is the number of letters being created.

Some Testing and Results: What Do We Learn?

Now would be a good time to look at the example file. In it, I have set up each of the five looping methods mentioned so far.

Here, I present the average time to create 20,000 letters for each of the 5 methods presented. This first test is done on a local copy of the database.

Layout Switching: 45 Seconds
Chopping a List: 72 Seconds
Counter Approach: 30 Seconds
Replacing: 31 Seconds
Import Madness: 32 Seconds

The following test was done over a slow internet connection to a copy of the database hosted on FileMaker Server 11 Advanced. Due to internet connection speeds, I have reduced the number of records created in this test to 500 letters. A reduced record set isn't ideal because it makes the results harder to compare directly, but I'll do my best:

Layout Switching: 210 Seconds
Chopping a List: 209 Seconds
Counter Approach: 209 Seconds
Replacing: 313 Seconds
Import Madness: 110 Seconds

Interpreting the Results

The first thing I notice is that times vary more significantly in the local test than in the WAN test. My assumption is that this is because of the difference in the number of letters generated: 20,000 compared to 500. In the local test, the main reason for the large differences comes down to the script steps themselves and the looping methods, not the actual process of creating the records. In the WAN test, fewer records means the overhead of the looping constructs matters less, as indicated by the fairly consistent speeds for the first three methods; the overhead there is the method of record creation, and the number of calls required between server and client.

Looking at the WAN results, the layout switching, chopping a list, and counter methods all post very similar times - in fact, only a second separates them. This would indicate that the overhead in these three cases is in the New Record/Request action; the looping methods themselves add little. If you think about it, layout switching and list manipulation all happen client-side anyway, with no server interaction. The only interaction with the server is first obtaining the list, and then committing the new records back to the server.

The Replace method is an interesting one. In the WAN test it is by far the slowest, and if you think about it, this makes sense. If we keep the assumption that the overhead is in server/client calls - more specifically, in the setting and getting of data between the two - then this method has three explicit send/receive operations compared to one for the rest: the loop to generate the records, and then the two Replace Field Contents script steps. That is three separate sends of data to the server as opposed to one.

Finally, there is the import method, which proved to be the fastest of all. The calculation that determines the repetition values is done on the client once the contact IDs have been set into the global variable $$ContactIDs, so the overhead here is again just the creation of records via the import.

I can only deduce that record creation via import is more efficient than a loop - perhaps it reduces the number of packets sent back and forth between server and client; I'm not sure. But compared to the looping methods, the import method proved significantly faster in WAN testing, and in local copy testing its speed was comparable to the fastest looping methods.

And for the Local Copy Results...

As always, with more records the results become more distinctly different. If we tested a local copy with 500 records, the time difference between all methods would be negligible - under 1/100th of a second. It is only with very large data sets that these differences present themselves, and the truly efficient methods reveal themselves.

So what does this all mean?

What does this mean for real-world practice? Basically, if using a solution over a WAN, you should look at using an import to generate your records if you can, as it is a more efficient method of record creation. If you are using a local copy, or only generating a small set of records, then any looping method is probably fine - but if you want a truly efficient method, either the import or the counter looping method would be advised.

Added Bonus: Looking for the Ultimate Speed

Looking at the results shown above, the import method is fastest over WAN, but the Counter approach is fastest over local copy (and with large record sets). Is there a way to combine these two approaches to produce an even faster method?

Let's compare a loop vs. an import for simple record creation

For this test, I am using a local copy of the database and 20,000 records to generate. The goal is to compare the speeds of a simple loop, with that of a simple import to generate records.

The first thing I did was to create a very simple loop to generate 20,000 blank records:

Set Variable [ $Counter ; 1 ]
Loop
New Record/Request
Set Variable [ $Counter ; $Counter + 1 ]
Exit Loop If [ $Counter > 20000 ]
End Loop

The second method is a single Import Records script step. Note that to create the 20,000 records I am using a stored repeating field with the values 1 through 20,000 in its repetitions.

Import Records [ No Dialog ; "ThinkingOutsideTheLoop.fp7" ; Add; Mac Roman]
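If you are wondering how that stored field gets its values, a one-off setup script along these lines would do it, run once ahead of the timed test (a sketch - the field name Home::Serials_Stored is invented, and the example file may populate it differently):

Set Variable [ $i ; 1 ]
Loop
Set Field [ Home::Serials_Stored[$i] ; $i ]
Set Variable [ $i ; $i + 1 ]
Exit Loop If [ $i > 20000 ]
End Loop
Commit Records/Requests [ No Dialog ]

Because the field is populated in advance and stored, none of that work is paid for during the import itself.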

Simple Loop: 12 Seconds
Simple Import: 3 Seconds

That's a whole 4x faster to create 20,000 records using an import as opposed to the simplest loop possible.

So the import is clearly the better way to create the records - we already knew that - but why is it so much faster than the loop? The key here is that our repeating field is stored.

Recall that the original Method #5 used an unstored calculation as the repeating field, so there was overhead involved in evaluating each repetition. What if we instead used a loop to set the Contact IDs into a stored repeating field?

Method #6: The Need for Speed

This method uses a simple loop to set each repetition of a stored repeating field. The repeating field is then used in an import records script step to generate the records.

The hope is that this combines the best of both methods. We are hoping a loop to set 20,000 repetitions will have less overhead than the evaluation of 20,000 unstored calculations.

We also hope then that the import of a stored repeating field is faster than a loop of 20,000 record creations (we already know this to be true from the earlier test).

So, here we go!

Set Variable [ $LetterContents ; Home::Letter Contents ]
Set Variable [ $Counter ; 1 ]
Set Variable [ $ContactIDs ; List ( Home to Contacts::Contact ID ) ]
Set Variable [ $Total ; ValueCount ( $ContactIDs ) ]
Loop
Set Field [ Home::ContactIDs_Stored[$Counter] ; GetValue ( $ContactIDs ; $Counter ) ]
Set Variable [ $Counter ; $Counter + 1 ]
Exit Loop If [ $Counter > $Total ]
End Loop
Commit Records/Requests [ Skip data entry validation; No Dialog ]
Import Records [ No Dialog ; "ThinkingOutsideTheLoop.fp7" ; Add ; Mac Roman ]

The Need For Speed: 22 Seconds

That's a whole 10 seconds faster than the earlier Import method which used an unstored repeating field. Not too shabby!

Again, the loop is where the overhead lies when dealing with a local copy. With small record counts over a WAN you may notice no difference between the two import methods; the difference shows up when working with large record counts.

Limitations using the Import Method

While the import methods may be more efficient for creating records, they do have some limitations in other respects. For starters, you are using a repeating field, and repeating fields have a limit of 32,000 repetitions, so you are limited to 32,000 records created in any one import. You can of course wrap the import in an outer loop that runs once for each block of 32,000 records if you need to generate more - see the sketch below.
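Such an outer loop might look something like this (a sketch, reusing the $$ContactIDs convention from Method #5 - and remember each import still needs to run from a found set of one record):

Set Variable [ $AllIDs ; List ( Home to Contacts::Contact ID ) ]
Set Variable [ $Start ; 1 ]
Loop
# Carve off the next block of up to 32,000 IDs for the repeating field
Set Variable [ $$ContactIDs ; MiddleValues ( $AllIDs ; $Start ; 32000 ) ]
Import Records [ No Dialog ; "ThinkingOutsideTheLoop.fp7" ; Add ; Mac Roman ]
Set Variable [ $Start ; $Start + 32000 ]
Exit Loop If [ $Start > ValueCount ( $AllIDs ) ]
End Loop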

The other limitation is that while the records are being generated, you have no way of tracking how many have been created. All you will see is an import dialog saying it is importing one record (which technically it is, but it is splitting that one record into many).

One good thing about a scripted loop is that during each iteration you know exactly what point you are up to, so you can provide progress dialogs or visual cues.
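For example (a sketch - the global field Home::zg Progress is invented), you could update a progress field every hundred iterations inside one of the scripted loops above, such as Method #3, which already maintains $Counter and $Total:

If [ Mod ( $Counter ; 100 ) = 0 ]
Set Field [ Home::zg Progress ; $Counter & " of " & $Total & " letters created" ]
Refresh Window []
End If

The Mod check matters: refreshing the window on every single iteration would noticeably slow the loop down.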

Final Thoughts...

Hopefully this article has presented some alternate styles of record creation and looping you may not have thought to use.

While some are more efficient than others, the performance gain depends on a number of factors, such as how you connect to the database, internet speed, the number of records being created, and so on. So don't feel bad if you are using a looping method that may be slower than another - the difference in speed may be unnoticeable.

I am very keen to hear about other people's looping methods - whether you can take the example file and expand on it, or come up with other, faster ways. Some things to explore might be:

  • SQL Creation of records
  • Exporting to CSV/Tab and Importing back in
  • Alternate methods of setting the repeating field, faster calculations
  • Ways of optimizing scripted loops

If you have any feedback or suggestions please leave a comment, or get in touch directly via e-mail. I'll do my best to add them to the article.

=======================================

 

EXTRA! Reader feedback and New Looping Suggestions

Thank you to everyone for your feedback; with it have come a couple of new methods of record creation. I actually feel a little silly at this point, because one of them now seems so glaringly obvious that I wonder why I completely omitted it - so much for thinking outside the loop, as it were :)

Method #7 - Contact Import

Instead of putting all of your contact IDs into a repeating field, why not just go directly to the contact records and import them into Letters? Why not indeed! In fact, I even mentioned that FileMaker can import from one table to another, yet seemed to gloss over that aspect :)

So, to do this I have put a global field in the Contacts table to store the letter contents for the purpose of the import. Then a Go to Related Record step is used to put the user on the contact records, the import is done, and that is that:

Set Field [ Contacts::zg Letter Contents ; Home::Letter Contents ]
Go to Related Record [ Show only related records ; From table: "Home to Contacts" ; Using layout: "Contacts" (Contacts) ; New window ]
Import Records [ No Dialog ; "ThinkingOutsideTheLoopExtra.fp7" ; Add ; Mac Roman ]
Close Window [ Current Window ]

It turns out this method is by far the fastest tested so far: there is no processing of a repeating field required, and no looping - it is just a pure import. Testing with 20,000 records on a local copy took an average of 8 seconds, a full 14 seconds faster than my "need for speed" method :)

Method #8 - Auto Enter

This method makes use of field-level auto-enter calculations to populate the contact ID and letter contents of the letter records. The key is that local variables exist for the duration of a script, and that applies to records created while the script runs, so auto-enter calculations can reference those local variables.

The script itself appears as a basic loop:

Set Variable [ $LetterContents ; Home::Letter Contents ]
Set Variable [ $ContactIDs ; List ( Home to Contacts::Contact ID ) ]
Set Variable [ $Counter ; 1 ]
Set Variable [ $Total ; ValueCount ( $ContactIDs ) ]
Go to Layout [ "Letters" (Letters) ]
Loop
New Record/Request
Exit Loop If [ Let ( $Counter = $Counter + 1 ; $Counter > $Total ) ]
End Loop

From the script's perspective, Contact ID and Letter Contents are never set, but they are in fact being populated by auto-enter calculations, which reference $ContactIDs and $LetterContents respectively.
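The field definitions themselves aren't visible in the script, but auto-enter calculations along these lines would do the job (a sketch; note that Contact ID reads $Counter before the Exit Loop If step increments it):

// Letters::Contact ID - auto-enter calculated value
GetValue ( $ContactIDs ; $Counter )

// Letters::Letter Contents - auto-enter calculated value
$LetterContents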

Testing shows this method takes around 22 seconds to generate the 20,000 records on a local copy - comparable with the "need for speed" Method #6, but nowhere near as fast as Method #7.

Revised Conclusion

If you are able to generate your records by importing directly from a found set of records in another table, then you should definitely look at doing that, as it is the fastest method. There is no overhead in looping, setting a repeating field, and so on.

However, that is not to discredit the repeating field import method, which still has its place. It would be a good choice if you had to set many fields on the new records from globals, or if you don't have the ability to go to a found set of records. Perhaps you are not specifying contacts, but instead have an arbitrary list of items that need to be turned into records; in that case there is no found set available, so your options are a loop or a repeating field import.

Thank you to those who contributed to this extra section!

Example Files

Please find attached an example file. It was used for all the screenshots in this article, and is provided to help you fully understand what is going on, and to let you experiment with the solution in FileMaker.

Download Example File

This next example file contains methods 7 and 8 that were added after the article was published, thank you to Mike and Stephen for contributing these two methods.

Download Extra Example File
