Whetstone

The Whetstone benchmark was the first benchmark written specifically to measure computer performance, and it was designed to simulate floating point numerical applications:

  • it contains a large percentage of floating point data and instructions;
  • a high percentage of execution time (approximately 50%) is spent in mathematical library functions;
  • the majority of its variables are global, so the test does not show the advantages of architectures such as RISC, where a large number of processor registers enhances the handling of local variables;
  • it contains a number of very tight loops, so even a fairly small instruction cache will enhance performance considerably;
  • the original program was written in Fortran, using single or double precision calculations.

The history of Whetstone can be found at http://homepage.virgin.net/roy.longbottom/whetstone.htm

The version of Whetstone in the uClinux-dist was taken from http://cm.bell-labs.com/netlib/benchmark/index.html. It was written on 3/20/98 by Rich Painter as an update of the original 1987 C version of the Whetstone benchmark. This updated version corrects a minor oversight that caused the output of the original C version to look different from that of the original Fortran version.

Compiling for Blackfin

rgetz@test:~/whetstone> bfin-uclinux-gcc -Wl,-elf2flt -O3 -ffast-math whetstone.c -o whetstone -lm
rgetz@test:~/whetstone> rcp ./whetstone root@10.64.204.125:/var/.
rgetz@test:~/whetstone> rsh -l root 10.64.204.125 /var/whetstone 5000

Loops: 5000, Iterations: 1, Duration: 9 sec.
C Converted Double Precision Whetstones: 55.6 MIPS
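
The reported rating can be cross-checked against the loop count and the duration. The sketch below assumes the relation KIPS = 100 × loops × iterations / seconds, with MIPS = KIPS / 1000. That relation is consistent with the sample output above (5000 loops in 9 seconds gives 55.6 MIPS), but the helper name whetstone_mips is hypothetical and the code is not copied from whetstone.c.

```c
/* Whetstone rating in MIPS for a run of `loops` loops repeated
 * `iterations` times in `seconds` whole seconds.
 * Assumed relation: KIPS = 100 * loops * iterations / seconds.
 * Returns -1.0 if the run was too short to time at one-second resolution. */
double whetstone_mips(long loops, long iterations, long seconds)
{
    double kips;

    if (seconds <= 0)
        return -1.0;
    kips = 100.0 * (double)loops * (double)iterations / (double)seconds;
    return kips / 1000.0;
}
```

Because the duration is measured in whole seconds, short runs are coarse: at 9 seconds, a one-second difference changes the rating by roughly 10%. A larger loop count reduces that quantization error.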

Results

These results were taken on the processor, with Whetstone compiled as a Linux application.

Compiler Version

rgetz@pinky:~/blackfin/uclinux-dist/user/whetstone> bfin-linux-uclibc-gcc --version
bfin-linux-uclibc-gcc (ADI-trunk/svn-3648) 4.3.4
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Summary

All these tests were run on the same processor (BF537 0.2, running at 500 MHz CCLK, 125 MHz SCLK), varying only the compiler optimization settings. 30000 (thirty thousand) loops are used to obtain accurate results. To run these tests yourself, use the following command (it takes a few hours to run all the tests):

rgetz@imhotep:~/blackfin/releases/2008R1/uClinux-dist> make user/whetstone_only TONLY=results IP=10.64.204.74 | grep "|" | sort -rn -t "|" +4
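
The test matrix pairs each of five optimization levels (-O0, -O1, -O2, -O3, -Os) with every subset of three extra flags (-fomit-frame-pointer, -ffast-math, -mfast-fp), giving 40 builds. The following C sketch enumerates those combinations. It only illustrates the matrix and is not how the uClinux-dist Makefile generates it; the names flag_combo and flag_combo_count are hypothetical.

```c
#include <stddef.h>
#include <string.h>

static const char *const OPT_LEVELS[] = { "-O0", "-O1", "-O2", "-O3", "-Os" };
static const char *const EXTRA_FLAGS[] = {
    "-fomit-frame-pointer", "-ffast-math", "-mfast-fp"
};

#define N_OPT   (sizeof OPT_LEVELS / sizeof OPT_LEVELS[0])
#define N_EXTRA (sizeof EXTRA_FLAGS / sizeof EXTRA_FLAGS[0])

/* Number of distinct flag combinations: 5 levels * 2^3 subsets = 40. */
unsigned flag_combo_count(void)
{
    return (unsigned)N_OPT * (1u << N_EXTRA);
}

/* Writes combination `index` (0 .. flag_combo_count() - 1) into buf as a
 * space-separated flag string. The high part of the index selects the
 * optimization level; the low N_EXTRA bits select which extra flags are on.
 * Returns 0 on success, -1 if index is out of range or buf is too small. */
int flag_combo(unsigned index, char *buf, size_t len)
{
    unsigned opt, mask;
    size_t i, used;

    if (index >= flag_combo_count())
        return -1;
    opt  = index >> N_EXTRA;
    mask = index & ((1u << N_EXTRA) - 1u);

    used = strlen(OPT_LEVELS[opt]);
    if (used + 1 > len)
        return -1;
    strcpy(buf, OPT_LEVELS[opt]);

    for (i = 0; i < N_EXTRA; i++) {
        size_t need;

        if (!(mask & (1u << i)))
            continue;
        need = strlen(EXTRA_FLAGS[i]) + 1;   /* leading space + flag */
        if (used + need + 1 > len)
            return -1;
        buf[used] = ' ';
        strcpy(buf + used + 1, EXTRA_FLAGS[i]);
        used += need;
    }
    return 0;
}
```

Looping index from 0 to flag_combo_count() - 1 and passing each string to the compiler as extra CFLAGS would reproduce the set of builds in the results table. A 64-byte buffer is large enough for the longest combination.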

Flat

testing on 500.000 MHz with bfin-uclinux-gcc

Flags 1)                                       | Size  | Loops | Duration (seconds) | Double Precision Whetstones (MIPS)
-O0                                            | 38540 | 30000 | 321                | 9.3
-O0 -fomit-frame-pointer                       | 38860 | 30000 | 319                | 9.4
-O0 -fomit-frame-pointer -ffast-math           | 38412 | 30000 | 310                | 9.7
-O0 -ffast-math                                | 38124 | 30000 | 307                | 9.8
-O1                                            | 27452 | 30000 | 253                | 11.9
-O1 -fomit-frame-pointer                       | 27452 | 30000 | 252                | 11.9
-O0 -mfast-fp                                  | 31492 | 30000 | 228                | 13.2
-O0 -fomit-frame-pointer -mfast-fp             | 31780 | 30000 | 228                | 13.2
-O0 -fomit-frame-pointer -ffast-math -mfast-fp | 31364 | 30000 | 217                | 13.8
-O0 -ffast-math -mfast-fp                      | 31044 | 30000 | 217                | 13.8
-O1 -ffast-math                                | 24392 | 30000 | 216                | 13.9
-O1 -fomit-frame-pointer -ffast-math           | 24360 | 30000 | 215                | 14.0
-O1 -mfast-fp                                  | 20372 | 30000 | 190                | 15.8
-O1 -fomit-frame-pointer -mfast-fp             | 20404 | 30000 | 189                | 15.9
-O1 -ffast-math -mfast-fp                      | 17380 | 30000 | 159                | 18.9
-O1 -fomit-frame-pointer -ffast-math -mfast-fp | 17348 | 30000 | 158                | 19.0
-O2                                            | 27324 | 30000 | 74                 | 40.5
-Os -fomit-frame-pointer                       | 27004 | 30000 | 73                 | 41.1
-Os                                            | 27052 | 30000 | 73                 | 41.1
-O2 -fomit-frame-pointer                       | 27340 | 30000 | 72                 | 41.7
-O3 -fomit-frame-pointer                       | 28868 | 30000 | 65                 | 46.2
-O3                                            | 28916 | 30000 | 65                 | 46.2
-O2 -mfast-fp                                  | 20276 | 30000 | 57                 | 52.6
-Os -mfast-fp                                  | 20004 | 30000 | 56                 | 53.6
-Os -fomit-frame-pointer -mfast-fp             | 19956 | 30000 | 56                 | 53.6
-O2 -fomit-frame-pointer -mfast-fp             | 20260 | 30000 | 56                 | 53.6
-O3 -fomit-frame-pointer -mfast-fp             | 21820 | 30000 | 50                 | 60.0
-O3 -mfast-fp                                  | 21836 | 30000 | 49                 | 61.2
-O2 -fomit-frame-pointer -ffast-math           | 24160 | 30000 | 35                 | 85.7
-O2 -ffast-math                                | 24176 | 30000 | 35                 | 85.7
-Os -fomit-frame-pointer -ffast-math           | 23912 | 30000 | 34                 | 88.2
-Os -ffast-math                                | 23960 | 30000 | 34                 | 88.2
-O3 -fomit-frame-pointer -ffast-math           | 25752 | 30000 | 26                 | 115.4
-O3 -ffast-math                                | 25800 | 30000 | 26                 | 115.4
-Os -fomit-frame-pointer -ffast-math -mfast-fp | 16900 | 30000 | 25                 | 120.0
-Os -ffast-math -mfast-fp                      | 16948 | 30000 | 24                 | 125.0
-O2 -fomit-frame-pointer -ffast-math -mfast-fp | 17148 | 30000 | 24                 | 125.0
-O2 -ffast-math -mfast-fp                      | 17164 | 30000 | 24                 | 125.0
-O3 -fomit-frame-pointer -ffast-math -mfast-fp | 18740 | 30000 | 18                 | 166.7
-O3 -ffast-math -mfast-fp                      | 18788 | 30000 | 18                 | 166.7
FDPIC
1) standard CFLAGS include -pipe -Wall -g -mcpu=bf537-0.2 -DNO_PROTOTYPES=1