I was porting the CUBE cosmological simulation code and had decided to start at the beginning. The Github repo had some demo / test code so it seemed the best way forward was to walk, not “run”, ha ha.
RNG means Random Number Generators.
FFT means (fast) Fourier Transform. The FFTW3 library is one of the popular and well-tested FFT libraries around.
This experience was a trip through debugging and hypothesizing about fftw3, compiling, testing, retesting, code-reading and more.
The cube_fft subdirectory had three F90 test programs, for testing FFTs and OpenMP and combinations. Well, once you read the code there was no OpenMP in there. But the code was doing 3-D FFTs on cubic arrays.
So build (on OpenBSD 6.6, flang 8.0.1, fftw3 3.3.8, g95 8.3.0), using included build script, and discover g95 conflicts with flang over the location of mod files. Move the files, report the bug, and switch to flang (*).
echo $p
-I/usr/local/include -I/usr/local/include/flang -L/usr/local/lib
echo $libf
-lfftw3f
flang -O3 -o testfft.x $p $libf -x f95 test.f90.orig
./testfft.x
...(snip)...
check results
NaN NaN NaN NaN
NaN NaN NaN NaN
NaN NaN
Whoah there. NaN from an FFT? Shouldn’t anything be allowed as input and therefore generate valid output?
(*) Note that flang is no longer available on OpenBSD from 6.8 onward.
My first thought was gigo, that the input to the FFT was bad, hence the output was bad. But how to test?
subroutine checknan(d,ng)
implicit none
integer, intent(in) :: ng
real, intent(in) :: d(:,:,:)
integer i,j,k
outer1: do k = 1, ng
do j = 1, ng
do i = 1, ng
if(ieee_is_nan(den(i,j,k))) then
type *, ' nan at', i, j, k
exit outer1
end if
enddo
enddo
enddo outer1
endsubroutine checknan
And run this before and after the FFT routine. But no, only the output had NaN, the input was “clean”.
So, count the NaNs:
subroutine checknan(d,ng)
implicit none
integer, intent(in) :: ng
real, intent(in) :: d(:,:,:)
integer i,j,k, nancount
nancount=0
outer1: do k = 1, ng
do j = 1, ng
do i = 1, ng
if(ieee_is_nan(den(i,j,k))) then
nancount = nancount + 1
end if
enddo
enddo
enddo outer1
if(nancount > 0) then
print *,' nans found, ', nancount
end if
endsubroutine checknan
And, as it turned out, there were lots:
nans found, 14047232
Which was closely related to the array size (306 * 304 * 304), 14047232 being (152 * 304 * 304).
It makes sense. FFT works by mixing lots of the numbers throughout the array with lots of arithmetic: a few invalid numbers could spread throughout the array.
And, sometimes, only 14046624 elements were NaNs. Huh again!
Reading the various manuals, I rediscovered the IEEE_ builtins, and the IEEE_IS_NORMAL function which tests for a number that is not denormalized, infinity, or NaN.
But IS_NORMAL always counted 14047232 on the random numbers even when NaNs numbered 14046624.
Back to the code again.
There were two processes going on in this test code. Generate some artificial data using random numbers and an angle and log density function. Then run FFT on the artificial data:
integer, parameter :: ng = 304
real den(ng + 2, ng, ng)
....
call random_number(den(:ng, :, :)) !!! fill with random [0,1)
....
temp_theta = 2*pi*den(i, j, k)
temp_r = sqrt(-2*log(1 - den(i + 1, j, k)))
den(i, j, k) = temp_r*cos(temp_theta)
den(i + 1, j, k) = temp_r*sin(temp_theta)
...
call sfftw_execute(plan_fft) ! forward transform
...
Which, in words, is a weighted cosine and sine of a random angle. The weight includes sqrt( negative * something ). Which is a hint. And the something is log(1 - something). Another hint.
Both sqrt and log are sensitive to values outside their valid domain: zero and positive for sqrt, and positive for log.
The argument to log() is supposed to be positive
since den
is filled with
random numbers strictly less than one. According to the spec of random_number
that is.
So, count more badnesses (and include hexadecimal representation of the floating point number):
if(nancount /= notnormalcount) then
outer3: do k = 1, ng
do j = 1, ng
do i = 1, ng
if((.not. ieee_is_nan (den(i,j,k))) .and. &
(.not. ieee_is_normal(den(i,j,k))) ) then
print '(A,3I5,F16.7,Z12)', ' unnormal ', i,j,k, &
den(i,j,k), den(i,j,k)
exit outer3
end if
enddo
enddo
enddo outer3
and find this output when checking the sqrt(log()) expression:
unnormal 7 165 111 -Inf FF800000
Which confirms the sqrt(log(0)) = negative infinity. And the expression above, log(1-den) is in fact the source of the problem!
Now we can check the random number generator (RNG):
subroutine checkrngone(d,ng)
implicit none
integer, intent(in) :: ng
real, intent(in) :: d(:,:,:)
integer i,j,k
outer1: do k = 1, ng
do j = 1, ng
do i = 1, ng
if(den(i,j,k) >= 1.0) then
print '(A,3I5,F16.7,Z12)', ' snbone ', i,j,k, &
den(i,j,k), den(i,j,k)
exit outer1
end if
enddo
enddo
enddo outer1
And run this test after filling the array with “random numbers less than one”. And get a hit:
snbone 81 79 104 1.0000000 3F800000
Aha! But why? Is there a bug in the runtime?
Read the code for the flang runtime (runtime/flang/random3f.c):
...
extern double drand48();
double ENT3F(RANDOM, random)(void) { return drand48(); }
...
Which is returning a double (real(8) for those of us who like it that way).
Which means, go back to read the code:
Remember this?
real den(ng + 2, ng, ng)
....
call random_number(den(:ng, :, :)) !!! fill with random [0,1)
What we were seeing, but didn’t understand, was an implied
den = round_to_real4_number(real8_number)
Any real(8) value slightly less than 1.0D0 was rounded to the real(4) value 1.0E0. Which was emphatically not a bug, because it is the standard way for Fortran conversion from real(8) to real(4). It’s not a bug in the flang runtime anyway. But it is a bug in this test code. So, to fix:
use ieee_arithmetic, only: ieee_is_nan, ieee_is_normal, &
ieee_set_rounding_mode, ieee_round_type, ieee_to_zero
...
call ieee_set_rounding_mode(ieee_to_zero)
...
call random_number(den(:ng, :, :))
call checkrngone(den, ng)
...
And after a few hundred million random numbers are generated and tested, it seems to work.
December 2019 (revised November 2020)
Links:
OpenBSD Numerics Experience - 1 - RNG
OpenBSD Numerics Experience - 2 - RNG floats
OpenBSD Numerics Experience - 3 - FFTW
OpenBSD Numerics Experience - 4 - CAF
OpenBSD Numerics Experience - 5 - MPI Networking
OpenBSD Numerics Experience - 6 - Memory Models
OpenBSD Numerics Experience - 7 - Python Image Display
OpenBSD Numerics Experience - 8 - RNGs, again
OpenBSD Numerics Experience - 9 - Nim
OpenBSD Numerics Experience - A - Graphical Display
OpenBSD Numerics Experience - B - ParaView
OpenBSD Numerics Experience - C - Numerical Debugging
OpenBSD Numerics