by Dann Corbit (USA)
Compiling faster binaries is really more a function of the excellence of the compiler than of the programmer.
Currently, I choose not to use any compiler-specific instructions, because they would cause the program to fail on many machines, and that makes for problems in explanations. For the most part, I just turn on all of the optimization flags, and set the calling convention to fastcall for MSVC++ (which passes arguments in registers and is a little quicker than the cdecl calling convention).
If someone wants to make binaries that run just as fast as the ones that I create, it is as simple as downloading and installing a compiler from here:
If you fiddle around with profile-guided optimization, you can make a binary that is faster than one built without that technique. But I rarely bother with it, as it only adds a few percentage points of speed. Sometimes I do it because a good friend asks me to, and (if they want me to) I post the binary on my site as well. The problem with profile-guided optimization is that it can be machine-specific: a binary may be faster on one CPU and yet slower on another. I tend to benchmark on machines with a large cache, so it may actually cause performance problems on machines with a tiny cache, like the Celeron CPUs.
There are a few simple optimization flags that can make the binaries run faster. But some of the flags I like to use might actually make the program run slower on your machine. The only way to be sure whether a particular choice makes something faster on your machine is to try it and run a benchmark.
The reason that I produce so many binaries is generally curiosity. I want to see if the program will build cleanly on my machine. I also like to see the clever problem solving ability of the programmers who write the programs.
I will generally make a few small changes to the source code for things like uninitialized variables, and those changes will be posted back into the zip file that I put on my ftp site.
The only real values added by my experiments are corrections or improvements that I find and give back to the original authors.
Sometimes, I can see something obvious, like a large object passed by value that does not change in the called function. In such a case, a change to call by reference or by pointer may have value. Or I may see some simple thing where a better algorithm might be used. Once in a while, I correct a bug that I find.
Really, there is no magic to my compiles, and you will probably find that others can make binaries as fast as or faster than mine if they put any real effort into it.
One thing you might find on my site is a port to WIN32 from a POSIX system, because I like to do those for fun, and many engines that you will run across start out as XBOARD engines rather than WINBOARD ones.
Other times, I might update a C++ program to use the current standard, so that it can compile on modern compilers.
With C programs, I may add function prototypes or other safeguards to make the program a bit more robust.
From time to time, I will do profiles of chess programs and send the results to the chess program author, so that they can see where the time is going.
The Intel profiler is the best one that I know of. It has many different modes and produces spectacular detail. By far, that is the best way to figure out where to spend more energy and labor trying to speed up a program. The most important reports are the one that shows time in modules and the one that shows program bottlenecks (which may not be the same thing).