Speed is where it's at. No client is going to pay you to make his app run slower. Fox Software's initial claim to fame was that FoxBASE (and later, FoxBASE+ and FoxPro) ran faster than its competitors. Fox's reputation for speed was well deserved. However, the speed gain has never been automatic. You have to do things right to make your code "run like the Fox."
The Rushmore optimization technology introduced in FoxPro 2.0 is based on indexes. Rushmore examines index files to determine which records meet the conditions for a particular command. So, in order to make things fast, it's important to create the right indexes and to write code that takes advantage of those indexes. The difference between the lightning-fast code your client uses to make crucial strategic decisions and the plodding code his competition uses might be no more than an operator or two, so listen up!
There are also some other tricks, not related to Rushmore, that can speed up your applications considerably. This section discusses both Rushmore and non-Rushmore optimization techniques.
Fox Software always claimed that Rushmore got its name after Dr. Fulton and the development team watched Hitchcock's North by Northwest. But we have no doubt the name caught on due to the word "rush" embedded in it. In fact, some of the FoxPro documentation and advertising used the phrase "RUSH me some MORE records."
As we mentioned above, the key to getting the most from Rushmore is to create the right indexes and take advantage of them. So how do you know which are the right indexes, and how do you take advantage of them?
Rushmore can optimize the SET FILTER command, and any command involving a FOR clause, as well as SELECT-SQL. The secret (not really a secret—it is documented) is to make the left-hand side of each expression in the filter, FOR or WHERE clause exactly match an existing index tag. For example, to optimize:
SUM OrderTotal FOR state="PA"
you need an index tag for state. If your tag is on UPPER(state), instead, you'd want to write the command as:
SUM OrderTotal FOR UPPER(state)="PA"
Suppose you want to find everyone named Miller in a table of Clients and that you have a tag on UPPER(cLastName+cFirstName) to put folks in alphabetical order. You optimize the BROWSE by writing it as:
BROWSE FOR UPPER(cLastName+cFirstName)="MILLER"
even though you're really interested only in the last name.
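For the record, the tags those two examples assume might be created like this. The table and tag names here are our choice; what matters is that each index expression exactly matches the left-hand side of the FOR clause:

USE Orders
INDEX ON state TAG state                 && optimizes FOR state = "PA"
INDEX ON UPPER(state) TAG upstate        && optimizes FOR UPPER(state) = "PA"

USE Clients
INDEX ON UPPER(cLastName + cFirstName) TAG LastFirst   && optimizes the BROWSE above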
We've answered the second question—how to take advantage of existing tags—but we still haven't tackled the first: What are the right indexes to create? That's because it's not always straightforward. There are a few clear-cut rules, but to a great extent, you'll need to use your judgment and test your theories against your data, on your hardware. Here are the rules:
In general, you'll be trading off update speed for search speed. So, think about what you expect to do with this table. If it's going to have lots of additions but few searches, keep the number of tags to a minimum. If it'll be used for lots of searching, but rarely updated, create more tags.
One of the optimization tips that fools lots of people has to do with SET DELETED. The typical conversation goes something like this:
"I have a query that's taking too long. How can I speed it up?"
"Create a tag on DELETED() for each table, if you have SET DELETED ON."
"But there are only a few deleted records. That shouldn't make much difference."
"Try it anyway."
(Later)
"You're right. It's much faster now. But there are only a few deleted records. How come it matters so much?"
What's going on here? In fact, you'll see the same speed-up even with NO deleted records. It's the setting of DELETED that matters.
Here's the point. Even in many complex queries and FOR clauses, Rushmore performs its magic almost entirely on the relatively small and compact CDX file, a file structured with nodes, branches and leaves to be searched efficiently. When DELETED is ON, FoxPro has to check each and every record in a result set (whether from a query, a filter, or FOR) to see if it's deleted—even if no records are actually deleted. This sequential reading of the entire cursor or file completely defeats the benefits of Rushmore. Don't do it!
By creating a tag on DELETED(), you let Rushmore do the checking instead of looking at each record sequentially, which makes the whole thing much faster. The larger the result set, the more speed-up you'll see.
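If you want something to paste in, one tag per table does it. Customer is just a stand-in for your table, and the tag name is our choice; the expression is what counts:

USE Customer
INDEX ON DELETED() TAG Deleted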
Another common problem goes like this. In troubleshooting sessions we attend, someone complains that a filter should be optimized, but it's dog slow. He's asked to show the filter and the tags. Everything looks good for Rushmore to optimize the filter. Puzzling.
Then he shows the code he's using. Typically, it looks something like this:
SET FILTER TO <something optimizable>
GO TOP        && put filter in effect
and the light goes on. GO TOP and GO BOTTOM are not optimizable commands. They move through the records sequentially, attempting to find the first record matching the filter.
Without a filter (and with SET DELETED OFF), this isn't generally a problem. Moving to the top or bottom of the current order is pretty quick. FoxPro can either locate the first or last record in the index or, if no tag is set, move directly to the beginning or end of the file.
But when a filter is set (or DELETED is ON, which is like having a filter set), once GO gets to the first or last record in the order, it has to search sequentially for the first record that matches the filter condition. This is what's so slow. Smart like a fox, eh? What a dumb idea! It's like writing code to go to record 10 by issuing a SKIP, checking whether RECNO()=10, and if not, SKIPping again.
What can you do about it? Don't use GO TOP and GO BOTTOM. How do you avoid them? By using a neat trick. It turns out that LOCATE with no FOR clause goes to the first record in the current order. So, for GO TOP, you just issue LOCATE, like this:
SET FILTER TO <optimizable condition>
LOCATE        && same as GO TOP
Okay, that works for finding the first record. What about the last record? You have to stand on your head for this. Well, almost. You really have to stand the table on its head. Try it like this:
SET FILTER TO <optimizable condition>
* reverse index order
lDescending = DESCENDING()
IF lDescending
   SET ORDER TO ORDER() ASCENDING
ELSE
   SET ORDER TO ORDER() DESCENDING
ENDIF
* now Top is Bottom and Bottom is Top
LOCATE        && same as GO TOP
IF lDescending
   SET ORDER TO ORDER() DESCENDING
ELSE
   SET ORDER TO ORDER() ASCENDING
ENDIF
After setting the filter (or with a filter already in effect), you turn the index upside down. If it was ascending, you make it descending; if it was descending, you make it ascending. Then, use LOCATE to go to the first record. Since you've reversed the order, that's the last record in the order you want. Then, reverse the order again. Voila! You're on the bottom record.
By the way, the code above works only if there is an index order set. If there might be no order, you have to check for that.
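One way to handle that, sketched rather than gospel: when ORDER() comes back empty, there's no tag to reverse, so you fall back to the plain approach.

IF EMPTY(ORDER())
   * no index order to reverse - fall back to the plain (slow) way
   GO BOTTOM
ELSE
   * use the order-reversing LOCATE trick above
ENDIF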
One more warning. Under particular circumstances, the work-around can be very slightly slower than just using GO. In most cases, though, it tends to be an order of magnitude faster. We think it's worth it.
SQL-SELECT has two clauses that filter data: WHERE and HAVING. A good grasp of the English language might lead us to believe that these are synonyms, but SQL is not English, and mixing these two indiscriminately is a sure-fire disaster in the making! It's not obvious where a particular condition should go at first glance. But getting it wrong can lead to a significant slowdown.
Here's why. The conditions in WHERE filter the original data. Wherever possible, existing index tags are used to speed things up. This produces an intermediate set of results. HAVING operates on the intermediate results, with no tags in sight. So, by definition, HAVING is slower than WHERE, if a query is otherwise constructed to be optimized.
So, when should you use HAVING? When you group data with GROUP BY and want to filter not on data from the original tables, but on "aggregate data" formed as the results of the grouping. For example, if you group customers by state, counting the number in each, and you're interested only in states with three or more customers, you'd put the condition COUNT(*)>=3 in the HAVING clause.
SELECT cState, COUNT(*) ;
   FROM Customer ;
   GROUP BY cState ;
   HAVING COUNT(*) >= 3
A simple rule of thumb: Don't use HAVING unless you also have a GROUP BY. That doesn't cover all the cases, but it eliminates many mistakes. To make the rule complete, remember that a condition in HAVING should contain one of the aggregate functions (COUNT, SUM, AVG, MAX or MIN) or a field that was named with AS and uses an aggregate function.
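Here's a hedged variation on the query above to show where each kind of condition belongs; lActive is a hypothetical logical field marking current customers. The row-level test goes in the (optimizable) WHERE clause; only the aggregate test stays in HAVING.

SELECT cState, COUNT(*) AS nCount ;
   FROM Customer ;
   WHERE lActive ;
   GROUP BY cState ;
   HAVING nCount >= 3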
FoxPro lets you store procedures and functions in a variety of places. But using the Project Manager gives you a strong incentive to put each routine in a separate PRG file. We generally agree with this choice.
But, if you're not careful, there's a nasty performance penalty for doing so. It turns out that having a PROCEDURE or FUNCTION statement at the beginning of a stand-alone PRG file increases the execution time by a factor of as much as 10!
You read that right. It can take 10 times as long to execute a PRG that begins with PROCEDURE or FUNCTION as one with no header. Hearing about this goodie (no, we didn't discover it ourselves), we tested a couple of other alternatives. It turns out that using DO <routine> IN <PRG file> cuts the penalty down some, but it's still twice as slow as simply eliminating or commenting out the header line.
SETting PROCEDURE TO the PRG, then calling the routine, speeds things up if you only have to do it once, but issuing SET PROCEDURE TO over and over again (as you'd need to for many different PRGs) is about 20 times slower than the slow way. That is, it's 200 times slower than omitting the header in the first place.
But wait, there's more. Not surprisingly, if the routine you're calling isn't in the current directory, but somewhere along a path you've set, it takes a little longer. For an ordinary routine with no header, the difference isn't much. Same thing if you're using SET PROCEDURE (which you shouldn't be, except for coded class libraries). However, the other two cases get a lot slower when they have to search a path. Using DO <routine> IN <PRG file> when the file isn't in the current directory is just about as slow as doing a SET PROCEDURE. But that's only the bad case. The horrible situation is calling a routine with a PROCEDURE or FUNCTION header directly—it can be as much as 1000 times slower than calling the same routine without the header!
The good news is that the path penalties go away as soon as you add the routines to a project and build an APP or EXE. That is, unless you're running in a very unusual setup, your users are unlikely to pay this price.
Bottom line. When you migrate a routine into a stand-alone PRG file, comment out the header line and just start with the code. It's easy and it'll speed up your applications considerably.
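A minimal sketch of what we mean, using a made-up routine called IsLeapYear.prg:

*  PROCEDURE IsLeapYear     && header commented out - just start with the code
LPARAMETERS tnYear
RETURN MOD(tnYear, 4) = 0 AND ;
       (MOD(tnYear, 100) # 0 OR MOD(tnYear, 400) = 0)

Call it with IsLeapYear(2000) or DO IsLeapYear WITH 2000, exactly as if the header were still there.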
FoxPro offers three different ways to write a loop. Choosing the right one can make a big difference in your program. So can making sure you put only what you have to inside the loop.
Let's start with the second statement. Every command or function you put inside a loop gets executed every time through the loop. (Big surprise.) Put enough extra stuff in there and you can really slow a program down. The trick is to put each statement only where you need it. This is especially true when you've got nested loops—putting a command farther in than it has to be might mean it gets executed dozens more times than necessary.
Bottom line here: If the command doesn't depend on some characteristic of the loop (like the loop counter or the current record) and it doesn't change a variable that's changed elsewhere in the loop, it can probably go outside the loop.
Here's an example:
* Assume aRay is a 2-D array containing all numeric data.
* We're looking for a row where the sum of the first three columns is
* greater than 100.
lFound = .F.
nRowCnt = 1
DO WHILE NOT lFound AND nRowCnt <= ALEN(aRay,1)
   IF aRay[nRowCnt,1] + aRay[nRowCnt,2] + aRay[nRowCnt,3] > 100
      lFound = .T.
   ELSE
      lFound = .F.
      nRowCnt = nRowCnt + 1
   ENDIF
ENDDO
The version below eliminates repeated calls to ALEN() and the need for the lFound variable. Benchmarks with 10,000 records show that it's almost twice as fast as the original.
nNumofRows = ALEN(aRay,1)
nRowCnt = 1
DO WHILE aRay[nRowCnt,1] + aRay[nRowCnt,2] + aRay[nRowCnt,3] <= 100 AND ;
         nRowCnt < nNumofRows
   nRowCnt = nRowCnt + 1
ENDDO
We find we're most likely to make this particular mistake when we're dealing with nested loops, so scrutinize those especially.
In the case of loops that execute a fixed number of times, FOR is a better choice than DO WHILE. Because the counting and checking feature is built into FOR, it just plain goes faster than DO WHILE. In a simple test with a loop that did nothing at all except loop, FOR was more than 10 times faster than DO WHILE. Never write a loop like this:
nCnt = 1
DO WHILE nCnt <= nTopValue
   * do something here
   nCnt = nCnt + 1
ENDDO
Always use this instead:
FOR nCnt = 1 TO nTopValue
   * do something here
ENDFOR
Guess what? DO WHILE isn't the best choice for looping through records either. SCAN was designed to process a table efficiently and does it faster than DO WHILE. Our results show that SCAN is one-and-a-half to two times faster at simply going through an unordered table one record at a time. (This is where we have to come clean and admit that the phenomenal differences we reported in the original Hacker's Guide appear to have been flawed. We're seeing about the same results in VFP 3.0b and 5.0a as we are in 6.0.)
To give full disclosure, we did find that with some index orders, DO WHILE was as much as 20 percent faster. With other indexes, SCAN is faster, although it doesn't appear to have the same advantage as in an unordered table. (It's also worth noting that, with large tables, if the memory allocation to FoxPro isn't properly tuned—see below—DO WHILE can be faster than SCAN.)
A word to the wise here: Don't just globally replace your DO WHILE loops with SCAN...ENDSCAN. SCAN moves the record pointer to the next record automatically when it reaches ENDSCAN—if your code already has logic to perform a SKIP within the loop, you can inadvertently skip over some records. Make sure to pull out those SKIPs, as in the sketch below.
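Here's a minimal before-and-after sketch of what we mean:

* Before: DO WHILE with an explicit SKIP
GO TOP
DO WHILE NOT EOF()
   * process the current record here
   SKIP
ENDDO

* After: SCAN moves the record pointer itself at ENDSCAN - no SKIP needed
SCAN
   * process the current record here
ENDSCAN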
One of the new capabilities that OOP gives us is "wrapper classes." These classes let us take a collection of related capabilities and put them all into a single class. The class gives us a more consistent interface to the functions involved and generally presents a tidy package.
The Connection Manager class described in the Reference section (see SQLConnect()) is pretty much a wrapper class, though it adds some capabilities. We've seen folks suggest wrapper classes for the FoxTools library (which desperately needs a consistent interface despite the addition of lots of its residents to the language). During the beta test for VFP 3, we played around on and off for months with a wrapper class for array functions that would let us stop worrying about things like the second parameter to ALEN().
On the whole, wrapper classes sound pretty attractive. Unfortunately, they also add a fair amount of overhead.
There's another way to do the same thing—just create an old-fashioned procedure file. Now that SET PROCEDURE has an ADDITIVE clause, it's no big deal to have lots of procedure libraries around. It turns out, of course, that procedure libraries also carry an overhead penalty.
Because the contents of the class or library matter so much, it's hard to produce benchmarks that give you hard and fast rules about this stuff. We tested with our embryonic array handler class, using only some of the simpler methods included (aIsArray, aElemCount, aRowCount, aColCount, aIs2D—all of which do exactly what their names suggest). We set it up as a class and as a procedure library. Then, we wrote a program that made a sample series of calls. We also wrote the same functionality in native code (ALEN() for aElemCount, ALEN(,1) for aRowCount and so on).
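To give you the flavor (this is a sketch, not the actual class we tested), one such method does nothing but forward to ALEN():

DEFINE CLASS ArrayHandler AS Custom
   Name = "ArrayHandler"

   * Number of rows in an array; pass the array by reference (with @)
   FUNCTION aRowCount
      LPARAMETERS taArray
      RETURN ALEN(taArray, 1)
   ENDFUNC
ENDDEFINE

The procedure-library version is the same function without the DEFINE CLASS around it; the native version is just ALEN(aMyArray, 1) called directly.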
The sad result is that either a procedure library or a class is an order of magnitude slower than using the built-in functionality. In this example, the procedure library is faster than the class by about a third.
We also tested the same functions as stand-alone programs. The timing came out pretty much the same as the procedure library and the class. (The difference between this case and the timing reported in "The Only Good Header is No Header" is that a single SET PROCEDURE was used in this case rather than issuing SET PROCEDURE for each function call.)
Finally, we tested with everything (the test program, the stand-alone programs, the procedure file and the class definition) built into a single APP file. Using an APP improved the speed of each case a little, but didn't make a significant difference overall.
Our guess is that, as functionality becomes more complex, the overhead counts less. Given the other, overwhelming benefits of using modular code, we don't recommend you stop writing procedures. But, at this point, we can't recommend wrapper classes where a procedure library would do.
There are some benefits to a wrapper class, of course. The biggest benefit is the ability to sub-class to provide specialized behaviors. Where this is a possibility, it's worth the overhead.
You wouldn't think that a little thing like a name would matter so much. But it does. The name we're referring to is the Name property possessed by almost every object you can create in Visual FoxPro. (A few of the weird marriages of Xbase to OOP, like SCATTER NAME, produce objects without a Name property.)
When you CreateObject() an object whose class definition doesn't assign a value to the Name property, Visual FoxPro makes one up for you. That's nice. Except it insists on making it unique (usually, the class name or a variant thereof, followed by one or more digits, like Form3 or Text17). The problem is, as the number of objects of that class grows, making sure a name is unique takes longer and longer. The Microsoft folks say the time grows exponentially. We suspect that's an overstatement and that it's actually geometric. What it ain't is linear. What it really ain't is fast enough. (Before we go any further with this, we should point out that this applies only to code classes. All VCX-based classes have an implicit assignment of the Name property.)
We tested with a pair of very simple classes based on Custom. One contained nothing. The other contained an explicit assignment to Name. With 10 repetitions, the explicitly named class would instantiate so fast it couldn't be measured, but the nameless class was fast, too. By 100 repetitions, explicit naming was more than four times as fast. At 1,000 repetitions, the explicit version was eight to 10 times faster. At 5,000 of each class, explicit names are about 18 times faster than nameless objects to instantiate.
The moral of the story here is easy. Always assign a value to the Name property for any class you write in code.
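A minimal sketch of what that looks like for a code class:

DEFINE CLASS NamedCustom AS Custom
   * an explicit Name means VFP skips the increasingly expensive hunt
   * for a unique one
   Name = "NamedCustom"
ENDDEFINE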
Incidentally, it turns out that getting rid of all these objects once you instantiate them is pretty expensive, too. Working with VCX-based classes, it took almost six times as long to destroy 5,000 instances of the same object as to create them. With our explicitly named code class, destroying 5,000 instances took two to three times as long as creating them.
Looks can be deceiving. But, in this case, they're not. The form property LockScreen lets you make a series of changes to a form without the individual changes showing as you go. When you set LockScreen back to .F., all the changes appear at once. Visually, it's far more consistent.
We were all set to tell you that this is one of those times where the user's eyes will play tricks on him. He'll think the update is faster because he doesn't see the individual changes take place.
But guess what? The update really is faster this way. We tested a simple form with just a few controls. We changed only a few properties of the form (Height, Width, BackColor, ForeColor and Caption) once each. With LockScreen set to .T., the updates were about one-and-a-half times faster. Surprise—the version that looks better is faster, too. We suspect it's because Windows only has to redraw the screen once.
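Here's a hedged sketch of the kind of update we timed, run from a method of the form; the property values are arbitrary:

ThisForm.LockScreen = .T.
ThisForm.Height     = 400
ThisForm.Width      = 600
ThisForm.BackColor  = RGB(192, 192, 192)
ThisForm.ForeColor  = RGB(0, 0, 0)
ThisForm.Caption    = "Updated"
ThisForm.LockScreen = .F.     && everything paints at once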
Testing the type of a variable or field is one of those things we do a lot in our code. In VFP 6, it's something we can do faster than ever. The new VARTYPE() function is significantly faster than its predecessor, TYPE(). How much faster? With both variables and fields, we consistently find VARTYPE() three to four times as fast as TYPE().
One warning here. VARTYPE() is appropriate only for fields, variables, properties and the like. You can't use it with expressions to find out what type the result will be. In that case, you need TYPE(), which pseudo-evaluates the expression to find out the result type. VARTYPE() simply looks at what you pass it and tells you its type. So, VARTYPE("x+y") returns "C", regardless of the type of x and y, while TYPE("x+y") returns "N". So, don't throw TYPE() out of your toolkit quite yet.
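A quick illustration of the difference:

x = 10
y = 20
? VARTYPE(x)         && "N" - looks at the value you pass, and is faster
? TYPE("x")          && "N" - takes the name as a character string
? TYPE("x+y")        && "N" - pseudo-evaluates the expression
? VARTYPE("x+y")     && "C" - you handed it a character string, so that's what it reports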
VFP 6 also introduces a new way to create objects. The NewObject() function lets you instantiate objects without worrying about whether you've pointed to the class library ahead of time—instead, you just include the library name in the call. CreateObject(), of course, needs a SET CLASSLIB or SET PROCEDURE ahead of time.
So which way is faster? As usual, the answer is "it depends." With VCX-based classes, if you can issue SET CLASSLIB just once and then instantiate classes from that library repeatedly, CreateObject() is the way to go. It's anywhere from one-and-a-half to eight times faster than calling NewObject() with the class library. On the other hand, if you need to load the library each time, the SET CLASSLIB/CreateObject() pair is more than an order of magnitude slower than NewObject().
How about for classes written in code? In that case, issuing a single SET PROCEDURE and calling CreateObject() repeatedly is an order of magnitude faster than either NewObject() or the SET PROCEDURE/CreateObject() pair, which are pretty similar.
Oh, and one more—instantiating a coded class is a little faster than instantiating a VCX-based class. However, except in the case where you're setting the library each time (which you should never do), it's not enough faster to wipe out the benefits of developing classes visually.
We tested and found no performance penalty for having a lot of class libraries open, no matter where in the list the class you're instantiating is found. So the rule here is to think about how you're going to do things before you write the code and, if possible, just keep open the class libraries you use a lot. Then use NewObject() for the one-shots, the classes from libraries you need only once in a while.
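A hedged sketch of the two patterns; MyClasses.vcx and cusLogger are made-up names:

* One-shot: let NewObject() deal with the library
oLog = NEWOBJECT("cusLogger", "MyClasses.vcx")

* Heavy use: open the library once, then CreateObject() repeatedly
SET CLASSLIB TO MyClasses ADDITIVE
FOR nI = 1 TO 1000
   oLog = CREATEOBJECT("cusLogger")
   * ... use the object ...
ENDFOR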
Can you have too much memory? It turns out that, in VFP, the answer is "yes." When you start VFP, it figures out how much memory it ought to be able to use, if it needs it. The number is generally about half as much as the machine actually has. Often, the amount that VFP picks is too much.
How can you have too much memory? Like this, according to our buddy Mac Rubel, who knows more about this topic than anyone else—more even, we suspect, than the folks who wrote VFP. However much memory VFP grabs, it assumes it has that much physical memory to work with. But, because it takes so much memory, it often doesn't—some of the memory it's working with is really disk space pretending to be memory, and that's slow. By decreasing the amount of memory VFP thinks it has available, you ensure that it only uses physical memory. VFP knows what to do when it needs more memory than it has available, and it's good at that. The last thing you want happening is the operating system swapping virtual (disk) memory for real memory while FoxPro thinks it is using RAM. So, as long as you restrict it to using physical memory, things are fast, fast, fast.
Okay, so how do you do that? Use the SYS(3050) function. SYS(3050,1) controls foreground memory, the memory VFP has available when it's in charge. SYS(3050,2) is for background memory—how much memory FoxPro should have when you're off doing something else. In either case, you pass it a number; it rounds that down to a number it likes (a multiple of 256) and that's how much memory it uses. It even tells you how much it really took.
We were really amazed how much of a difference this setting makes. On Tamar's machine with 64MB of RAM, VFP 6 takes 35,072 KB by default. Reducing it to just under 24,000 KB (by calling SYS(3050,1,24000000)) cut the time needed for one of the particularly slow tests in this section by a factor of 4! One disclaimer here: Tamar tends to operate with lots of applications open. A typical load while working on this book was six or seven apps running (not to mention those living in the system tray, like Dial-Up Networking and a virus checker). No doubt they all take some memory.
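If you want to experiment, a sketch like this is all it takes; the sizes here are arbitrary, not recommendations:

? SYS(3050, 1)               && how much foreground memory VFP took by default
? SYS(3050, 1, 24000000)     && ask for roughly 24 MB; VFP rounds down and reports what it really took
? SYS(3050, 2, 12000000)     && background memory is set separately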
All of the tips we've given you here should speed up your code, but your application on your LAN with your data is the true test. Differences in network throughput, the architecture of your system, the design of your tables, your choice of indexes, the phase of the moon, what's on TV that night, and so forth all make significant differences in the performance you see. Our advice is always to examine and benchmark how a particular change affects your system. Keep in mind that a single test isn't conclusive unless it can be repeated, and you need to repeat tests with caution because FoxPro and the operating system and the network and even your disk controller might be caching information.
Copyright © 1998 by Tamar E. Granor and Ted Roche. All Rights Reserved.