timing_test.cc and timing_test.pl – Testing the speed of the code

Timing results

This example can be used to test how the speed of the code varies depending on the different compilation parameters. It makes use of a simple test case, featuring 100000 particles in a cube side length 2, using periodic boundary conditions. The code is very similar in structure to the basic random points example.

The timing of the Voronoi computation is carried out using the clock() function in the standard <ctime> library. On line 51, the initial value of the clock is stored. The code then calls the compute_all_cells() routine. This evaluates all of the Voronoi cells, but does not do anything further, such as compute or store their volume. Once this is finished, the duration of the run is computed by comparing the new value of the clock(), and it is printed to standard output.

The perl script “timing_test.pl” can be used to compile the code under a variety of different configurations. On line 4, a range of different values is given for the grid size used in the calculation, and on line 12, some optimization flags are specified. From lines 15 to 38, the code compiles the code with different grid sizes and computes an average running time. The grid size is passed to the compiler via the -DNNN flag that sets the preprocessor variable NNN. The average and standard deviation of the running times are printed to the standard output.

The above image shows the results of the timing test for different grid sizes. Four different machines were tested:

On all systems, the standard deviation between runs is very small. The graph shows that the best running times occur with grid sizes between 20 and 32, corresponding to between 12.5 and 3 particles per subgrid region. The best runs occur with a grid size of 27, corresponding to 5.1 particles per subgrid region. On the Mac Pro and Gentoo Linux systems, for the optimal grid size, this gives a running time of 2.55 s, equivalent to around 40000 Voronoi cells per second. The Ubuntu system has an optimal running time of 1.76 s, while the iMac system has an optimal running time of 1.36 s, equivalent to over 73000 Voronoi cells per second. The code exhibits fairly linear scaling if the number of particles is boosted. Switching to non-uniform distributions of particles increases the running time, as the complexity of some of the Voronoi cells may increase.

It is difficult to directly compare to the well-known Qhull software, since the two codes are computing different things, which Qhull returning the Voronoi mesh, and Voro++ returning the individual cells. However, a basic test of Qhull using the standard compilation options on the Mac Pro system for 100000 random points gives 8.58 seconds.

Note that care should be taken when using the C++ clock() command. On many popular systems, the value of clock periodically wraps around, and this must be taken into account when timing very long runs that last more than half an hour. For this example, where times are typically on the order of seconds, this should not pose a problem.

Code listing

 1: // Timing test example code
 2: //
 3: // Author   : Chris H. Rycroft (LBL / UC Berkeley)
 4: // Email    : [email protected]
 5: // Date     : August 30th 2011
 6: 
 7: #include <ctime>
 8: using namespace std;
 9: 
10: #include "voro++.cc"
11: using namespace voro;
12: 
13: // Set up constants for the container geometry
14: const double x_min=-1,x_max=1;
15: const double y_min=-1,y_max=1;
16: const double z_min=-1,z_max=1;
17: 
18: // Set up the number of blocks that the container is divided into. If the
19: // preprocessor variable NNN hasn't been passed to the code, then initialize it
20: // to a good value. Otherwise, use the value that has been passed.
21: #ifndef NNN
22: #define NNN 26
23: #endif
24: const int n_x=NNN,n_y=NNN,n_z=NNN;
25: 
26: // Set the number of particles that are going to be randomly introduced
27: const int particles=100000;
28: 
29: // This function returns a random double between 0 and 1
30: double rnd() {return double(rand())/RAND_MAX;}
31: 
32: int main() {
33:         clock_t start,end;
34:         int i;double x,y,z;
35: 
36:         // Create a container with the geometry given above, and make it
37:         // periodic in each of the three coordinates. Allocate space for eight
38:         // particles within each computational block.
39:         container con(x_min,x_max,y_min,y_max,z_min,z_max,n_x,n_y,n_z,
40:                         true,true,true,8);
41: 
42:         //Randomly add particles into the container
43:         for(i=0;i<particles;i++) {
44:                 x=x_min+rnd()*(x_max-x_min);
45:                 y=y_min+rnd()*(y_max-y_min);
46:                 z=z_min+rnd()*(z_max-z_min);
47:                 con.put(i,x,y,z);
48:         }
49: 
50:         // Store the initial clock time
51:         start=clock();
52: 
53:         // Carry out a dummy computation of all cells in the entire container
54:         con.compute_all_cells();
55: 
56:         // Calculate the elapsed time and print it
57:         end=clock();
58:         double runtime=double(end-start)/CLOCKS_PER_SEC;
59:         printf("%g\n",runtime);
60: }

Perl script listing

 1: #!/usr/bin/perl
 2: 
 3: # The range of grid sizes to consider
 4: @range=(10..40);
 5: 
 6: # The number of trials to consider. If this is set to one, the time for a
 7: # single trial will be outputted. For higher values, the mean of all the trials
 8: # will be outputted, along with the standard deviation.
 9: $tries=1;
10: 
11: # The flags to pass for code optimization 
12: $opt="-fast -O3";
13: 
14: foreach $r (@range) {
15: 
16:         # Compile the code with the current grid size
17:         system "g++ $opt -I../../src -DNNN=$r -o timing_test "
18:                 ."-L../../src timing_test.cc";
19: 
20:         # Carry out the trials for this grid size
21:         $st=$stt=0;
22:         foreach $t (1..$tries) {
23: 
24:                 # Run the code, and output the timing information to the
25:                 # "time_temp" file.
26:                 system "./timing_test >time_temp";
27: 
28:                 # Read the "time_temp" file to find the duration of the run
29:                 open F,"time_temp" or die "Can't open timing file: $!";
30:                 ($t)=split ' ',<F>;
31:                 $st+=$t;$stt+=$t*$t;
32:                 close F;
33:         }
34: 
35:         # Compute the mean and variance and print to standard output
36:         $st/=$tries;
37:         $stt=$stt/$tries-$st*$st;$stt=$stt>0?sqrt($stt):0;
38:         print "$r $st $stt\n";
39: }
40: 
41: # Delete the temporary timing file
42: unlink "time_temp";