timing_test.cc and timing_test.pl – Testing the speed of the code
This example can be used to test how the speed of the code varies with the compilation parameters. It makes use of a simple test case, featuring 100000 particles in a cube of side length 2, using periodic boundary conditions. The code is very similar in structure to the basic random points example.
The Voronoi computation is timed using the clock() function from the standard <ctime> library. On line 51, the initial value of the clock is stored. The code then calls the compute_all_cells() routine. This evaluates all of the Voronoi cells, but does not do anything further, such as compute or store their volumes. Once this is finished, the duration of the run is computed by subtracting the initial value from the new value of clock() and dividing by CLOCKS_PER_SEC, and the result is printed to standard output.
The Perl script "timing_test.pl" can be used to compile the code under a variety of different configurations. On line 4, a range of values is given for the grid size used in the calculation, and on line 12, some optimization flags are specified. From lines 15 to 38, the script compiles the code with each grid size in turn and computes an average running time. The grid size is passed to the compiler via the -DNNN flag, which sets the preprocessor variable NNN. The average and standard deviation of the running times are printed to standard output.
The above image shows the results of the timing test for different grid sizes. Four different machines were tested:
- A Mac Pro with 2.66 GHz Dual-Core Intel Xeon processor, using the Apple build of GCC 4.2, with the -fast optimization flag.
- A Gentoo Linux system with 2.66 GHz Opteron processor, using GCC 4.6 with -O3 -march=native optimization flags.
- An Ubuntu Linux system with 3.33 GHz Xeon processor, using the Intel compiler with the -O3 optimization flag.
- An iMac with 3.4 GHz Intel Core i7 processor, using GCC 4.5 with -O3 -march=core2 optimization flags.
On all systems, the standard deviation between runs is very small. The graph shows that the best running times occur with grid sizes between 20 and 32, corresponding to between 12.5 and about 3 particles per subgrid region. The best runs occur with a grid size of 27, corresponding to 5.1 particles per subgrid region. On the Mac Pro and Gentoo Linux systems, the optimal grid size gives a running time of 2.55 s, equivalent to around 40000 Voronoi cells per second. The Ubuntu system has an optimal running time of 1.76 s, while the iMac system has an optimal running time of 1.36 s, equivalent to over 73000 Voronoi cells per second. The code exhibits fairly linear scaling as the number of particles is increased. Switching to non-uniform distributions of particles increases the running time, as the complexity of some of the Voronoi cells may increase.
It is difficult to make a direct comparison with the well-known Qhull software, since the two codes compute different things, with Qhull returning the Voronoi mesh and Voro++ returning the individual cells. However, a basic test of Qhull using the standard compilation options on the Mac Pro system for 100000 random points gives a running time of 8.58 seconds.
Note that care should be taken when using the C++ clock() function. On many popular systems, the value of clock() periodically wraps around, and this must be taken into account when timing very long runs that last more than half an hour. For this example, where times are typically on the order of seconds, this should not pose a problem.
Code listing
1: // Timing test example code
2: //
3: // Author : Chris H. Rycroft (LBL / UC Berkeley)
4: // Email : [email protected]
5: // Date : August 30th 2011
6:
7: #include <ctime>
8: using namespace std;
9:
10: #include "voro++.cc"
11: using namespace voro;
12:
13: // Set up constants for the container geometry
14: const double x_min=-1,x_max=1;
15: const double y_min=-1,y_max=1;
16: const double z_min=-1,z_max=1;
17:
18: // Set up the number of blocks that the container is divided into. If the
19: // preprocessor variable NNN hasn't been passed to the code, then initialize it
20: // to a good value. Otherwise, use the value that has been passed.
21: #ifndef NNN
22: #define NNN 26
23: #endif
24: const int n_x=NNN,n_y=NNN,n_z=NNN;
25:
26: // Set the number of particles that are going to be randomly introduced
27: const int particles=100000;
28:
29: // This function returns a random double between 0 and 1
30: double rnd() {return double(rand())/RAND_MAX;}
31:
32: int main() {
33:     clock_t start,end;
34:     int i;double x,y,z;
35:
36:     // Create a container with the geometry given above, and make it
37:     // periodic in each of the three coordinates. Allocate space for eight
38:     // particles within each computational block.
39:     container con(x_min,x_max,y_min,y_max,z_min,z_max,n_x,n_y,n_z,
40:             true,true,true,8);
41:
42:     //Randomly add particles into the container
43:     for(i=0;i<particles;i++) {
44:         x=x_min+rnd()*(x_max-x_min);
45:         y=y_min+rnd()*(y_max-y_min);
46:         z=z_min+rnd()*(z_max-z_min);
47:         con.put(i,x,y,z);
48:     }
49:
50:     // Store the initial clock time
51:     start=clock();
52:
53:     // Carry out a dummy computation of all cells in the entire container
54:     con.compute_all_cells();
55:
56:     // Calculate the elapsed time and print it
57:     end=clock();
58:     double runtime=double(end-start)/CLOCKS_PER_SEC;
59:     printf("%g\n",runtime);
60: }
Perl script listing
1: #!/usr/bin/perl
2:
3: # The range of grid sizes to consider
4: @range=(10..40);
5:
6: # The number of trials to consider. If this is set to one, the time for a
7: # single trial will be outputted. For higher values, the mean of all the trials
8: # will be outputted, along with the standard deviation.
9: $tries=1;
10:
11: # The flags to pass for code optimization
12: $opt="-fast -O3";
13:
14: foreach $r (@range) {
15:
16:     # Compile the code with the current grid size
17:     system "g++ $opt -I../../src -DNNN=$r -o timing_test "
18:         ."-L../../src timing_test.cc";
19:
20:     # Carry out the trials for this grid size
21:     $st=$stt=0;
22:     foreach $t (1..$tries) {
23:
24:         # Run the code, and output the timing information to the
25:         # "time_temp" file.
26:         system "./timing_test >time_temp";
27:
28:         # Read the "time_temp" file to find the duration of the run
29:         open F,"time_temp" or die "Can't open timing file: $!";
30:         ($t)=split ' ',<F>;
31:         $st+=$t;$stt+=$t*$t;
32:         close F;
33:     }
34:
35:     # Compute the mean and variance and print to standard output
36:     $st/=$tries;
37:     $stt=$stt/$tries-$st*$st;$stt=$stt>0?sqrt($stt):0;
38:     print "$r $st $stt\n";
39: }
40:
41: # Delete the temporary timing file
42: unlink "time_temp";