After about an hour, I need to ask for some help.
I've downloaded the LAPACK math benchmark to run under a few CPU configurations in the gem5 simulator, which lets you model x86 systems and run binaries on them.
To help bootstrap this simulation, the entire codebase runs within a Docker container on top of WSL2 on my Windows 10 PC. The hierarchy is rather deep, but this has been working well for me.
One issue with LAPACK is that it doesn't make the generated binaries very transparent: it tries to hide everything behind make commands. While this works most of the time, I need a binary file path that I can use in my gem5 configuration.
After a bit of searching around, I found the binaries. To run them, you need to provide an input file with a set of setup parameters. Here's how I run this in my Docker container:
./LIN/xlintstrfz < ztest_rfp.in
Tests of the COMPLEX*16 LAPACK RFP routines
LAPACK VERSION 3.*.0
The following parameter values will be used:
N : 0 1 2 3 5 6 10 11 50
NRHS: 1 2 15
TYPE: 1 2 3 4 5 6 7 8 9
Routines pass computational tests if test ratio is less than 30.00
Relative machine underflow is taken to be 0.222507-307
Relative machine overflow is taken to be 0.179769+309
Relative machine precision is taken to be 0.111022D-15
COMPLEX*16 RFP routines passed the tests of the error exits
All tests for ZPF drivers passed the threshold ( 2304 tests run)
All tests for ZLANHF auxiliary routine passed the threshold ( 384 tests run)
All tests for the RFP conversion routines passed ( 72 tests run)
All tests for ZTFSM auxiliary routine passed the threshold ( 7776 tests run)
All tests for ZHFRK auxiliary routine passed the threshold ( 2592 tests run)
End of tests
Total time used = 0.71 seconds
Now I need to translate this into my gem5 configuration, which is a Python script:
binary = os.path.join(
    thispath,
    "../../../",
    "tests/test-progs/lapack/TESTING/LIN/xlintstrfz",
)
process = Process()
process.cmd = [binary]
# stdin redirection: the path to the input file, given as a string
process.input = "tests/test-progs/lapack/TESTING/ztest_rfp.in"
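A minimal sketch of one way to rule out a path problem: resolve the binary and the input file against the same base directory and fail early if either is missing. The helper name `resolve_from` is hypothetical (it is not part of gem5), and the usage comments assume `thispath` is already defined in the config script as above.

```python
import os


def resolve_from(base_dir, *parts):
    """Join parts onto base_dir and return a normalized absolute path.

    Hypothetical helper for illustration; raises if no file exists there.
    """
    path = os.path.abspath(os.path.join(base_dir, *parts))
    if not os.path.isfile(path):
        raise FileNotFoundError(f"expected a file at {path}")
    return path


# Possible usage inside the config script (thispath is assumed to exist there):
# binary = resolve_from(thispath, "../../../",
#                       "tests/test-progs/lapack/TESTING/LIN/xlintstrfz")
# stdin_file = resolve_from(thispath, "../../../",
#                           "tests/test-progs/lapack/TESTING/ztest_rfp.in")
```

Since the simulated program does print its first lines, the paths are probably already correct; this only removes one variable from the comparison.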
When I run this, the program is found and the input file does start to be processed:
gem5 Simulator System. https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 version 22.1.0.0
gem5 compiled May 18 2023 15:42:00
gem5 started May 24 2023 00:42:19
gem5 executing on 8fa8d1b910d8, pid 26054
command line: build/GCN3_X86/gem5.opt configs/learning_gem5/part1/simple.py
Global frequency set at 1000000000000 ticks per second
build/GCN3_X86/base/statistics.hh:280: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
0: system.remote_gdb: listening for remote gdb on port 7008
Beginning simulation!
build/GCN3_X86/sim/simulate.cc:192: info: Entering event queue @ 0. Starting simulation...
Tests of the COMPLEX*16 LAPACK RFP routines
LAPACK VERSION 3.*.0
The following parameter values will be used:
N : 0 1 2 3 5 6 10 11 50
build/GCN3_X86/sim/mem_state.cc:443: info: Increasing stack size by one page.
At line 166 of file zchkrfp.f (unit = 5, file = stdin )
Fortran runtime error: Bad integer for item 3 in list input
Error termination.
As you can see, it starts running the same binary and prints the first few lines before crashing, and the crash happens only in the gem5 environment.
Going to the offending file, the section related to NRHS is (starting at line 153 of zchkrfp.f):
*
*     Read the values of NRHS
*
      READ( NIN, FMT = * )NNS
      IF( NNS.LT.1 ) THEN
         WRITE( NOUT, FMT = 9996 )' NNS', NNS, 1
         NNS = 0
         FATAL = .TRUE.
      ELSE IF( NNS.GT.MAXIN ) THEN
         WRITE( NOUT, FMT = 9995 )' NNS', NNS, MAXIN
         NNS = 0
         FATAL = .TRUE.
      END IF
      READ( NIN, FMT = * )( NSVAL( I ), I = 1, NNS )
      DO 30 I = 1, NNS
         IF( NSVAL( I ).LT.0 ) THEN
            WRITE( NOUT, FMT = 9996 )'NRHS', NSVAL( I ), 0
            FATAL = .TRUE.
         ELSE IF( NSVAL( I ).GT.MAXRHS ) THEN
            WRITE( NOUT, FMT = 9995 )'NRHS', NSVAL( I ), MAXRHS
            FATAL = .TRUE.
         END IF
   30 CONTINUE
      IF( NNS.GT.0 )
     $   WRITE( NOUT, FMT = 9993 )'NRHS', ( NSVAL( I ), I = 1, NNS )
*
Counting from line 153, line 166 is READ( NIN, FMT = * )( NSVAL( I ), I = 1, NNS ), so that appears to be where it crashes, but I cannot figure out why it would fail in one environment and not the other.
The contents of ztest_rfp.in are as follows:
Data file for testing COMPLEX*16 LAPACK linear equation routines RFP format
9 Number of values of N (at most 9)
0 1 2 3 5 6 10 11 50 Values of N
3 Number of values of NRHS (at most 9)
1 2 15 Values of NRHS (number of right hand sides)
9 Number of matrix types (list types on next line if 0 < NTYPES < 9)
1 2 3 4 5 6 7 8 9 Matrix Types
30.0 Threshold value of test ratio
T Put T to test the error exits
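To make the error message concrete, here is a rough Python sketch of what the two NRHS reads expect from this file. It is a simplification of Fortran list-directed input, not the gfortran implementation: it splits on blanks and commas only and ignores repeat counts, slashes, and reads that continue onto the next record. The function name `read_list_integers` is made up for illustration.

```python
def read_list_integers(record, count):
    """Parse `count` integers from one record, roughly like READ(NIN, FMT=*).

    Simplified sketch: trailing text on the record (the comments in the
    data file) is ignored once `count` items have been read.
    """
    tokens = record.replace(",", " ").split()
    if len(tokens) < count:
        raise ValueError(f"record has {len(tokens)} items, expected {count}")
    values = []
    for item, token in enumerate(tokens[:count], start=1):
        try:
            values.append(int(token))
        except ValueError:
            raise ValueError(
                f"Bad integer for item {item} in list input: {token!r}"
            ) from None
    return values


# The two NRHS records, copied from ztest_rfp.in.
records = [
    "3 Number of values of NRHS (at most 9)",
    "1 2 15 Values of NRHS (number of right hand sides)",
]

nns = read_list_integers(records[0], 1)[0]   # READ( NIN, FMT = * )NNS
nsval = read_list_integers(records[1], nns)  # READ ... ( NSVAL( I ), I = 1, NNS )
print(nns, nsval)  # 3 [1, 2, 15]
```

Under these simplified rules the file parses cleanly and item 3 of the NRHS record is 15, which matches the native run. That suggests comparing the bytes the program actually receives on stdin under gem5 with the file contents, rather than looking for a problem in the file's layout.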
I thought maybe there is some weird character processing that is different, but I don't know nearly enough about Fortran to identify what's going on. At this point I don't really know how to proceed, since this blocks any further work. I've poked around as best I can without any success, so I'd appreciate anyone else's insight here.
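One way to test the "weird character" idea is to scan the input file for bytes that are not plain printable ASCII, for example carriage returns left over from Windows line endings. This is only a diagnostic sketch; `find_suspicious_bytes` is a hypothetical helper, and the sample bytes below are made up to show the report format.

```python
def find_suspicious_bytes(data):
    """Return (line, column, byte_value) for each byte that is not printable
    ASCII, a tab, or a line feed. Carriage returns (13) are reported."""
    findings = []
    line, col = 1, 0
    for b in data:
        if b == 0x0A:  # line feed: move to the next line
            line += 1
            col = 0
            continue
        col += 1
        if b != 0x09 and not (0x20 <= b <= 0x7E):
            findings.append((line, col, b))
    return findings


# Made-up sample with one Windows-style line ending on the first line.
sample = b"3 Number\r\n1 2 15 Values\n"
print(find_suspicious_bytes(sample))  # [(1, 9, 13)]
```

To check the real file, read it in binary mode, e.g. `find_suspicious_bytes(open("ztest_rfp.in", "rb").read())`. An empty list would mean the file itself is clean, which would point back at how stdin is delivered inside the simulator.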