3. Building and Running the UFS Weather Model

3.1. Supported Platforms & Compilers

Before running the Weather Model (WM), users should determine which level of support applies to their system. Generally, Level 1 & 2 systems are restricted to those with access through NOAA and its affiliates; these are named systems (e.g., Hera, Orion, Derecho). Level 3 & 4 systems include certain personal computers or non-NOAA-affiliated HPC systems. The prerequisite software libraries for building the WM already exist in a centralized location on Level 1 (preconfigured) systems, so users may skip directly to getting the data and downloading the code. On other systems, users will need to build the prerequisite libraries using spack-stack.

3.2. Prerequisite Libraries

The UFS WM requires two categories of libraries, both of which are available as a bundle via spack-stack:

  1. NCEP libraries (NCEPLIBS): These are libraries developed for use with NOAA weather models. Most have an NCEPLIBS prefix in the repository (e.g., NCEPLIBS-bacio). Select tools from the UFS Utilities repository (UFS_UTILS) are also included in this category.

  2. Third-party libraries (NCEPLIBS-external): These are libraries that were developed externally to the UFS Weather Model. They are general software packages that are also used by other community models. Building these libraries is optional if users can point to existing builds of these libraries on their system instead.

3.2.1. Common Modules

As of February 24, 2025, the UFS WM Regression Tests (RTs) on Level 1 systems use the following common modules:

bacio/2.4.1
crtm/2.4.0
esmf/8.6.0
fms/2024.01
g2/3.5.1
g2tmpl/1.13.0
gftl-shared/1.6.1
hdf5/1.14.0
ip/4.3.0
jasper/2.0.32
libpng/1.6.37
mapl/2.40.3-esmf-8.6.0
netcdf-c/4.9.2
netcdf-fortran/4.6.1
parallelio/2.5.10
scotch/7.0.4
sp/2.5.0
w3emc/2.10.0
zlib/1.2.13

The most up-to-date list of common modules can be viewed in the ufs_common.lua modulefile.

Attention

Documentation is available for installing spack-stack. Spack-stack (or the libraries it contains) must be installed before running the UFS Weather Model.

3.3. Get Data

The WM RTs require input files to run. These include static datasets, files that depend on grid resolution and initial/boundary conditions, and model configuration files. On Level 1 and 2 systems, the data required to run the WM RTs are already available at the following DISKNM locations:

Table 3.1 Data Locations ($DISKNM) for Level 1 & 2 Systems

Machine                 File location
----------------------  ------------------------------------------------------
Derecho                 /glade/derecho/scratch/epicufsrt/ufs-weather-model/RT/
Gaea-C6                 /gpfs/f6/bil-fire8/world-shared/role.epic/UFS-WM_RT
Hera                    /scratch2/NAGAPE/epic/UFS-WM_RT
Hercules                /work/noaa/epic/hercules/UFS-WM_RT
Jet (Level 2)           /mnt/lfs5/HFIP/hfv3gfs/role.epic/RT
NOAA Cloud (Level 2)    /contrib/ufs-weather-model/RT
Orion                   /work/noaa/epic/UFS-WM_RT
S4 (Level 2)            /data/prod/emc.nemspara/RT
WCOSS2                  /lfs/h2/emc/nems/noscrub/emc.nems/RT

Within DISKNM, input data for the UFS WM is found in the following directories:

  • INPUTDATA_ROOT: ${DISKNM}/NEMSfv3gfs/input-data-20240501

  • INPUTDATA_ROOT_WW3: ${INPUTDATA_ROOT}/WW3_input_data_20250212

  • INPUTDATA_ROOT_BMIC: ${DISKNM}/NEMSfv3gfs/BM_IC-20220207

  • INPUTDATA_LM4: ${INPUTDATA_ROOT}/LM4_input_data
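The way these paths build on one another can be sketched in a few lines of shell. The helper name disknm_for is hypothetical (it is not part of the WM scripts), and only three of the Table 3.1 entries are reproduced; the directory names are the ones listed above.

```shell
# Hypothetical helper: map a machine name to its DISKNM value from Table 3.1.
# Only three of the table's entries are reproduced here.
disknm_for() {
  case "$1" in
    hera)     echo "/scratch2/NAGAPE/epic/UFS-WM_RT" ;;
    orion)    echo "/work/noaa/epic/UFS-WM_RT" ;;
    hercules) echo "/work/noaa/epic/hercules/UFS-WM_RT" ;;
    *)        echo "unknown machine: $1" >&2; return 1 ;;
  esac
}

# Compose the input data locations exactly as listed above.
DISKNM="$(disknm_for hera)"
INPUTDATA_ROOT="${DISKNM}/NEMSfv3gfs/input-data-20240501"
INPUTDATA_ROOT_WW3="${INPUTDATA_ROOT}/WW3_input_data_20250212"
INPUTDATA_ROOT_BMIC="${DISKNM}/NEMSfv3gfs/BM_IC-20220207"
INPUTDATA_LM4="${INPUTDATA_ROOT}/LM4_input_data"

printf '%s\n' "$INPUTDATA_ROOT" "$INPUTDATA_ROOT_WW3" "$INPUTDATA_ROOT_BMIC" "$INPUTDATA_LM4"
```

Note that the WW3 and LM4 directories sit under INPUTDATA_ROOT, while the BM_IC directory sits directly under ${DISKNM}/NEMSfv3gfs.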

For Level 3-4 systems, the data must be added to the user’s system. Input data is publicly available in the UFS WM Data Bucket, which users can browse at https://noaa-ufs-regtests-pds.s3.amazonaws.com/index.html. Baseline data for the develop branch is available for the most recent 60 days. The regression testing script (rt.sh) has certain default data directories (i.e., INPUTDATA_*) that users may need to change when working on Level 3-4 systems. To run RTs on their own system, users can download the data and update the rt.sh script to point to the appropriate locations.

To download data, users must select the files they want from the bucket and download them either in their browser, via a wget command, or through the AWS CLI.
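As a rough sketch of the wget and AWS CLI routes, the snippet below only prints the commands rather than running them. The bucket name is taken from the URL above; the object key is a placeholder that users must replace after browsing the bucket, and the use of --no-sign-request assumes the bucket permits anonymous access.

```shell
# Bucket name taken from the URL above.
BUCKET="noaa-ufs-regtests-pds"

# Print (do not run) a wget command for one object key in the bucket.
wget_cmd() {
  echo "wget https://${BUCKET}.s3.amazonaws.com/$1"
}

# Print (do not run) the equivalent AWS CLI command.
# --no-sign-request asks the AWS CLI to skip credentials (assumes anonymous access).
aws_cmd() {
  echo "aws s3 cp --no-sign-request s3://${BUCKET}/$1 ."
}

KEY="path/to/file-in-bucket"   # placeholder; replace with a real key from the bucket index
wget_cmd "$KEY"
aws_cmd "$KEY"
```

Once the printed commands look right, users can paste them into a terminal or remove the echo to run them directly.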

Detailed information on input files can be found in Chapter 4.

3.4. Downloading the Weather Model Code

To clone the develop branch of the ufs-weather-model repository and update its submodules, execute the following commands:

git clone --recursive https://github.com/ufs-community/ufs-weather-model.git
cd ufs-weather-model

Compiling the model will take place within the ufs-weather-model directory created by the clone command.

3.5. Building the Weather Model

Note

The most straightforward way to run the UFS WM is to use the regression testing (RT) framework. The RT framework will load modulefiles, build (compile) the desired WM configuration, and run the test(s). Users can create new tests or modify existing tests to correspond to the WM configuration(s) they wish to run. This section is provided for those who do not want to use the RT framework to run the WM. However, most users should skip to Section 3.6 to learn more about RT configuration or Section 3.7 to build/run the WM with the RT framework.

3.5.1. Loading the Required Modules

The process for loading modules is fairly straightforward on NOAA Level 1 Systems. Users may need to make adjustments when running on other systems.

3.5.1.1. On NOAA Level 1 & 2 Systems

Modulefiles for preconfigured platforms are located in modulefiles/ufs_<platform>.<compiler>. For example, to load the modules from the ufs-weather-model directory on Hera:

module use modulefiles
module load ufs_hera.intel

Note that loading this module file will also set the CMake environment variables shown in Table 3.2.

Table 3.2 CMake environment variables required to configure the build for the Weather Model

Environment Variable      Description                                   Hera Intel Value
------------------------  --------------------------------------------  ----------------
CMAKE_C_COMPILER          Name of C compiler                            mpiicc
CMAKE_CXX_COMPILER        Name of C++ compiler                          mpiicpc
CMAKE_Fortran_COMPILER    Name of Fortran compiler                      mpiifort
CMAKE_Platform            String containing platform and compiler name  hera.intel

3.5.1.2. On Other Systems

If you are not running on one of the pre-configured platforms, you will need to set the environment variables manually. For example, in a bash shell, a command in the following form will set the C compiler environment variable:

export CMAKE_C_COMPILER=</path/to/C/compiler>
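For instance, all four variables from Table 3.2 could be set as follows. The values are assumptions for a generic Linux system with GNU compilers behind common MPI wrapper names; users should substitute the compilers and platform string that match their own installation.

```shell
# Assumed values for a generic Linux system with GNU compilers and MPI wrappers.
# Replace each value with the compiler (or full path) available on your system.
export CMAKE_C_COMPILER=mpicc
export CMAKE_CXX_COMPILER=mpicxx
export CMAKE_Fortran_COMPILER=mpif90
export CMAKE_Platform=linux.gnu

echo "Platform: ${CMAKE_Platform}"
```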

3.5.2. Setting the CMAKE_FLAGS and CCPP_SUITES Environment Variables

The UFS Weather Model can be built in one of several configurations (see Table 4.1 for common options). The CMAKE_FLAGS environment variable specifies which configuration to build using the -DAPP and -DCCPP_SUITES options. Users set which components to build using -DAPP. Users select the CCPP physics suite(s) with -DCCPP_SUITES at build time so that one or more suites are available at runtime; multiple suites are given as a comma-separated list. Additional options, such as -D32BIT=ON, can be set if the user chooses. These options are documented in Section 7.1.3. The following examples assume a bash shell.

3.5.2.1. ATM Configurations

Standalone ATM

For the ufs-weather-model ATM configuration (standalone ATM):

export CMAKE_FLAGS="-DAPP=ATM -DCCPP_SUITES=FV3_GFS_v16"

ATMW

For the ufs-weather-model ATMW configuration (standalone ATM coupled to WW3):

export CMAKE_FLAGS="-DAPP=ATMW -DCCPP_SUITES=FV3_GFS_v16"

ATMAERO

For the ufs-weather-model ATMAERO configuration (standalone ATM coupled to GOCART):

export CMAKE_FLAGS="-DAPP=ATMAERO -DCCPP_SUITES=FV3_GFS_v17_p8"

ATMAQ

For the ufs-weather-model ATMAQ configuration (standalone ATM coupled to CMAQ):

export CMAKE_FLAGS="-DAPP=ATMAQ -DCCPP_SUITES=FV3_GFS_v15p2"

ATML

For the ufs-weather-model ATML configuration (standalone ATM coupled to LND):

export CMAKE_FLAGS="-DAPP=ATML -DCCPP_SUITES=FV3_GFS_v17_p8"

ATMF

For the ufs-weather-model ATMF configuration (standalone ATM coupled to UFS Fire):

export CMAKE_FLAGS="-DAPP=ATMF -DCCPP_SUITES=FV3_HRRR -D32BIT=ON"

ATM_DS2S

For the ufs-weather-model ATM_DS2S configuration (ATM/DOCN/DICE):

export CMAKE_FLAGS="-DAPP=ATM_DS2S -DCCPP_SUITES=FV3_GFS_v17_coupled_p8_ugwpv1"

ATM_DS2S-PCICE

For the ufs-weather-model ATM_DS2S-PCICE configuration (ATM/DOCN/CICE6 [prescribed ice mode]):

export CMAKE_FLAGS="-DAPP=ATM_DS2S-PCICE -DCCPP_SUITES=FV3_GFS_v17_coupled_p8"

3.5.2.2. S2S Configurations

S2S

For the ufs-weather-model S2S configuration (coupled atm/ice/ocean):

export CMAKE_FLAGS="-DAPP=S2S -DCCPP_SUITES=FV3_GFS_v17_coupled_p8"

To turn on debugging flags, add the -DDEBUG=ON flag after -DAPP=S2S. Users can enable verbose build messages by running:

export BUILD_VERBOSE=1
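Putting the two settings together, a debug S2S build with verbose output might be configured as in this sketch, which reuses the suite from the S2S example above:

```shell
# S2S configuration with debugging flags turned on and verbose build messages.
export CMAKE_FLAGS="-DAPP=S2S -DDEBUG=ON -DCCPP_SUITES=FV3_GFS_v17_coupled_p8"
export BUILD_VERBOSE=1

echo "$CMAKE_FLAGS"
```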

To receive atmosphere-ocean fluxes from the CMEPS mediator, add the argument -DCMEPS_AOFLUX=ON. For example:

export CMAKE_FLAGS="-DAPP=S2S -DCCPP_SUITES=FV3_GFS_v17_coupled_p8_sfcocn -DCMEPS_AOFLUX=ON"

S2SA

For the ufs-weather-model S2SA configuration (atm/ice/ocean/aerosols):

export CMAKE_FLAGS="-DAPP=S2SA -DCCPP_SUITES=FV3_GFS_2017_coupled,FV3_GFS_v15p2_coupled,FV3_GFS_v16_coupled,FV3_GFS_v16_coupled_noahmp"

S2SW

For the ufs-weather-model S2SW configuration (atm/ice/ocean/wave):

export CMAKE_FLAGS="-DAPP=S2SW -DCCPP_SUITES=FV3_GFS_v17_coupled_p8"

S2SWA

For the ufs-weather-model S2SWA configuration (atm/ice/ocean/wave/aerosols):

export CMAKE_FLAGS="-DAPP=S2SWA -DCCPP_SUITES=FV3_GFS_v17_coupled_p8,FV3_GFS_cpld_rasmgshocnsstnoahmp_ugwp"

S2SWAL

For the ufs-weather-model S2SWAL configuration (atm/ice/ocean/wave/aerosols/land):

export CMAKE_FLAGS="-DAPP=S2SWAL -DCCPP_SUITES=FV3_GFS_v17_coupled_p8,FV3_GFS_v17_coupled_p8_ugwpv1"

3.5.2.3. NG-GODAS Configuration

For the ufs-weather-model NG-GODAS configuration (atm/ocean/ice/data assimilation):

export CMAKE_FLAGS="-DAPP=NG-GODAS"

3.5.2.4. HAFS Configurations

HAFS

For the ufs-weather-model HAFS configuration (atm/ocean) in 32-bit:

export CMAKE_FLAGS="-DAPP=HAFS -D32BIT=ON -DCCPP_SUITES=FV3_HAFS_v0_gfdlmp_tedmf_nonsst,FV3_HAFS_v0_gfdlmp_tedmf"

HAFSW

For the ufs-weather-model HAFSW configuration (atm/HYCOM/wave) in 32-bit with moving nest:

export CMAKE_FLAGS="-DAPP=HAFSW -D32BIT=ON -DMOVING_NEST=ON -DCCPP_SUITES=FV3_HAFS_v0_gfdlmp_tedmf,FV3_HAFS_v0_gfdlmp_tedmf_nonsst,FV3_HAFS_v0_thompson_tedmf_gfdlsf"

HAFS-MOM6W

For the ufs-weather-model HAFS-MOM6W configuration (atm/MOM6/wave) in 32-bit with moving nest:

export CMAKE_FLAGS="-DAPP=HAFS-MOM6W -DREGIONAL_MOM6=ON -DCDEPS_INLINE=ON -DMOVING_NEST=ON -DCCPP_SUITES=FV3_HAFS_v1_gfdlmp_tedmf,FV3_HAFS_v1_gfdlmp_tedmf_nonsst,FV3_HAFS_v1_thompson,FV3_HAFS_v1_thompson_nonsst -D32BIT=ON"

HAFS-ALL

For the ufs-weather-model HAFS-ALL configuration (data/atm/ocean/wave) in 32-bit:

export CMAKE_FLAGS="-DAPP=HAFS-ALL -D32BIT=ON -DCCPP_SUITES=FV3_HAFS_v0_gfdlmp_tedmf,FV3_HAFS_v0_gfdlmp_tedmf_nonsst"

3.5.2.5. Land Configurations

LND

For the ufs-weather-model LND configuration (DATM/land [NOAHMP]):

export CMAKE_FLAGS="-DAPP=LND"

LM4

For the ufs-weather-model LND-LM4 configuration (DATM/land [LM4]):

export CMAKE_FLAGS="-DAPP=LND-LM4"
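All of the examples in this section follow the same pattern: one -DAPP value, an optional comma-separated -DCCPP_SUITES list, and any extra options. The helper below is a hypothetical convenience (make_cmake_flags is not part of the WM repository) that assembles such a string; it expects the application name and a space-separated suite list (which may be empty) as its first two arguments.

```shell
# Hypothetical helper: make_cmake_flags APP "SUITE1 SUITE2 ..." [extra flags...]
# Pass an empty string as the second argument when no suites are needed.
make_cmake_flags() (
  app="$1"; suites="$2"; shift 2
  flags="-DAPP=${app}"
  if [ -n "$suites" ]; then
    # Join the space-separated suite names with commas.
    joined="$(printf '%s' "$suites" | tr -s ' ' ',')"
    flags="${flags} -DCCPP_SUITES=${joined}"
  fi
  # Append any remaining arguments (e.g., -D32BIT=ON) unchanged.
  for extra in "$@"; do
    flags="${flags} ${extra}"
  done
  printf '%s\n' "$flags"
)

export CMAKE_FLAGS="$(make_cmake_flags HAFS "FV3_HAFS_v0_gfdlmp_tedmf_nonsst FV3_HAFS_v0_gfdlmp_tedmf" -D32BIT=ON)"
echo "$CMAKE_FLAGS"
```

The resulting string lists the same options as the HAFS example above, only in a different order.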

3.5.3. Building the Model

The UFS Weather Model uses the CMake build system. There is a build script called build.sh in the top-level directory of the WM repository that configures the build environment and runs the make command. This script also checks that all necessary environment variables have been set.

If any of the environment variables have not been set, the build.sh script will exit with a message similar to:

./build.sh: line 11: CMAKE_Platform: Please set the CMAKE_Platform environment variable, e.g. [macosx.gnu|linux.gnu|linux.intel|hera.intel|...]

The WM can be built by running the following command from the ufs-weather-model directory:

./build.sh

Once build.sh is finished, users should see the executable, named ufs_model, in the ufs-weather-model/build/ directory. If users prefer to build in a different directory, specify the BUILD_DIR environment variable. For example: export BUILD_DIR=test_cpld will build in the ufs-weather-model/test_cpld directory instead.
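Before invoking build.sh, users may find it convenient to check the environment themselves. The function below is a sketch, not part of the WM scripts: check_build_env is a hypothetical name, and the list of variables is an assumption taken from Table 3.2, so the checks that build.sh itself performs may differ.

```shell
# Report any unset variables from Table 3.2 before calling ./build.sh.
# The variable list is an assumption based on Table 3.2.
check_build_env() {
  _missing=0
  for _name in CMAKE_C_COMPILER CMAKE_CXX_COMPILER CMAKE_Fortran_COMPILER CMAKE_Platform; do
    # Look up the value of the variable whose name is stored in _name.
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
      echo "not set: $_name" >&2
      _missing=1
    fi
  done
  return "$_missing"
}

if check_build_env; then
  echo "environment looks complete; run ./build.sh next"
else
  echo "set the variables listed above, then run ./build.sh"
fi
```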

Expert help is available through GitHub Discussions. Users may post questions there for help with difficulties related to the UFS WM.

3.6. Test Configuration

Note

This section explains how forecasts are configured using the regression test (RT) framework. For a full list of supported RT configurations, view the rt.conf file or visit the tests/tests directory.

The UFS Weather Model (WM) can be run in any of several configurations, from a single-component atmospheric model to a fully coupled model with multiple earth system components (e.g., atmosphere, ocean, sea-ice, land, and mediator). Each RT test configuration file (located in the tests/tests directory) sets default variables by calling functions from tests/default_vars.sh. Then, the test file sets test-specific variables. These values will override the defaults.

3.6.1. default_vars.sh

default_vars.sh first sets a series of machine-specific variables. It also contains several functions that set defaults for different types of tests. Table 3.3 describes what each function does.

Table 3.3 default_vars.sh functions

Function Name             Description
------------------------  ------------------------------------------------------------
export_fv3_v16            Set variables to the FV3 default values for GFS v16 cases. This section will be removed once support for GFSv16 is officially deprecated.
export_fv3                Set variables to the FV3 default values.
export_tiled              Set default values for tiled grid namelist.
export_ugwpv1             Set default values for the Unified Gravity Wave Drag Physics v1.
export_cice6              Set default values for the CICE6 model namelist and mx100.
export_mom6               Set default values for the MOM6 model namelist and mx100.
export_ww3                Set default values for the WW3 global model.
export_fire_behavior      Set default values for the Fire Behavior model.
export_cmeps              Set default values for the coupled 5-component tests using CMEPS.
export_cpl                Set default values for coupled / S2S configurations.
export_35d_run            Set default values for EMC’s weekly coupled benchmark 35d tests (see rt_35d.conf).
export_datm_cdeps         Set default values for configurations that use the data atmosphere (DATM) component.
export_hafs_datm_cdeps    Set default values for HAFS configurations that use the data atmosphere (DATM) component.
export_hafs_docn_cdeps    Set default values for HAFS configurations that use the data ocean (DOCN) component.
export_hafs_regional      Set default values for regional HAFS configurations.
export_hafs               Set default values for HAFS configurations.
export_hrrr               Set default values for HRRR test configurations.
export_hrrr_conus13km     Set default values for hrrr_conus13km test configurations.
export_rap_common         Set default values that are common to RAP and RRFS v1 test configurations.
export_rap                Set default values for RAP test configurations.
export_rrfs_v1            Set default values for RRFS v1 test configurations.

Multiple default_vars.sh functions may be called in a given test, usually starting with the most general function and ending with the most specific. Values set in one function will be overridden when the same values are set in a subsequent function.
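The override order can be illustrated with two toy functions. These are stand-ins written for this sketch; their names and values are illustrative and do not come from default_vars.sh.

```shell
# Toy stand-ins for default_vars.sh functions; names and values are illustrative only.
toy_export_general()  { export DT_ATMOS=1800; export FHMAX=24; }
toy_export_specific() { export DT_ATMOS=720; }

toy_export_general     # most general defaults first
toy_export_specific    # later call overrides DT_ATMOS and leaves FHMAX untouched
export FHMAX=6         # a test-specific setting overrides the remaining default

echo "DT_ATMOS=${DT_ATMOS} FHMAX=${FHMAX}"
```

Because each assignment simply replaces the previous value, the last function or statement to set a variable determines what the test uses.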

3.6.2. Test Files

Individual test files typically start with an export TEST_DESCR statement describing the test, followed by an export CNTL_DIR statement indicating the name of the directory that contains the baselines for the experiment. Next, an export LIST_FILES statement indicates which files the test expects to output from the model run. This list often includes RESTART files. After the LIST_FILES statement, the tests typically call functions from default_vars.sh to set default values.

For example, the hafs_regional_atm_ocn_wav test file lists the files that it will output and then calls three export_* functions from default_vars.sh, moving from the most general to the most specific:

export LIST_FILES="atmf006.nc \
                sfcf006.nc \
                archv.2019_241_06.a \
                archs.2019_241_06.a \
                20190829.060000.out_grd.ww3 \
                20190829.060000.out_pnt.ww3 \
                ufs.hafs.ww3.r.2019-08-29-21600.nc \
                ufs.hafs.cpl.r.2019-08-29-21600.nc"

export_fv3
export_hafs
export_hafs_regional

Lastly, the test configuration file sets any test-specific variables for the experiment. These variables will override the default values from default_vars.sh. In the excerpt below, ... indicates omitted lines:

export HAFS=true
export FHMAX=6
export RESTART_N=${FHMAX}
export DT_ATMOS=180
export IDEFLATE=1
export OUTPUT_FH='3 -1'
export OUTPUT_FILE="'netcdf' 'netcdf'"
export SDAY=29
export SHOUR=00
export SMONTH=08
export SYEAR=2019

...

export CDEPS_DOCN=false
export OCEAN_START_DTG=43340.00000

export atm_model=fv3
export ocn_model=hycom
export wav_model=ww3
OCN_tasks=60
WAV_tasks=60
export coupling_interval_sec=360
export MESH_ATM=unset

export FIELD_TABLE=field_table_hafs
export DIAG_TABLE=diag_table_hafs_template
export INPUT_NML=input_regional_hafs.nml.IN
export MODEL_CONFIGURE=model_configure_hafs.IN
export UFS_CONFIGURE=ufs.configure.hafs_atm_ocn_wav.IN
export FV3_RUN="hafs_fv3_run.IN hycom_hat10_run.IN hafs_ww3_run.IN"

if [[ $MACHINE_ID = orion ]]; then
WLCLK=40
fi
...

3.6.3. Creating New Tests

Users are welcome to modify current tests for their own use or create new tests to facilitate their own research. When creating a test, users will need to add a row for the test in rt.conf or in their own custom file. See Section 3.6.4 for more information.

Typically, when developers need to create a new test for their implementation, the first step is to identify a test in the tests/tests directory that can be used as a basis and to examine the variables defined in the test file. The names of appropriate template files for model configuration and initial conditions can be identified via the variables INPUT_NML, UFS_CONFIGURE, MODEL_CONFIGURE, and FV3_RUN, for example by running grep -n INPUT_NML * inside the tests and tests/tests directories.
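To search for all four variables at once, a small wrapper around grep can help. The function name show_templates is hypothetical, and the demonstration runs against a throwaway directory containing a mock test file rather than the real tests/tests directory.

```shell
# Hypothetical helper: list template-related settings in the files of a directory.
# -s silences warnings about subdirectories; /dev/null forces file names to be printed.
show_templates() {
  grep -s -n -E '^export (INPUT_NML|UFS_CONFIGURE|MODEL_CONFIGURE|FV3_RUN)=' "$1"/* /dev/null
}

# Demo on a throwaway directory holding a two-line mock test file.
demo_dir="$(mktemp -d)"
printf '%s\n' 'export INPUT_NML=input_regional_hafs.nml.IN' 'export FHMAX=6' > "${demo_dir}/mock_test"
show_templates "$demo_dir"
rm -rf "$demo_dir"
```

From the top of a ufs-weather-model clone, the equivalent call would be show_templates tests/tests.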

3.6.4. The rt.conf File

The rt.conf file is a pipe-separated values (PSV) file grouped into sections of tests with a COMPILE line followed by several RUN lines. The COMPILE line contains information needed to compile the tests, while the RUN lines contain information on specific tests. COMPILE lines have 6 columns:

  1. COMPILE indicator

  2. Compile name – a category of test to compile

  3. Compiler to use in build (intel or gnu)

  4. CMAKE Options – Provides all CMAKE options for the build. This typically includes the -DAPP and -DCCPP_SUITES flags; these flags set which components to build and which physics suites will be available at runtime. Additional options are documented in Section 7.1.3, but users can examine the CMakeLists.txt file for the most up-to-date list of options.

  5. Machines to run on (- is used to ignore specified machines, + is used to run only on specified machines). For example:

    • + hera orion gaea: Compile will only run on Hera, Orion, and Gaea machines

    • - wcoss2 acorn: Compile will NOT be run on WCOSS2 or Acorn

  6. fv3: Set as fv3. Previously, this was used to run a test without compiling code (e.g., if FV3 was already present).

Each COMPILE line is followed by one or more RUN lines, which have five columns. The build resulting from the COMPILE line above a RUN line is used to run that test.

  1. RUN indicator

  2. Test name – indicates which test in the tests/tests directory should be sourced.

  3. Machines to run on (+) or ignore (-).

  4. Baseline Creation – controls whether the run creates its own baseline or uses the baseline from a different (control) test (see information on -c option below for more).

  5. Comparison Test – Test name to compare baselines with if not itself.

The order of lines in rt.conf matters since rt.sh processes them sequentially; a RUN line should be preceded by a COMPILE line that builds the model used in the test. The following rt.conf file excerpt builds the S2SWA configuration with the FV3_GFS_v17_coupled_p8_ugwpv1 physics suite in 32-bit mode with PDLIB enabled and then runs four coupled tests:

COMPILE | s2swa_32bit_pdlib  | intel | -DAPP=S2SWA -D32BIT=ON -DCCPP_SUITES=FV3_GFS_v17_coupled_p8_ugwpv1 -DPDLIB=ON | - noaacloud | fv3 |
RUN | cpld_control_gfsv17                               | - noaacloud                          | baseline |
RUN | cpld_control_gfsv17_iau                           | - noaacloud                          | baseline | cpld_control_gfsv17
RUN | cpld_restart_gfsv17                               | - noaacloud                          |          | cpld_control_gfsv17
RUN | cpld_mpi_gfsv17                                   | - noaacloud                          |          |

The rt.conf file includes a large number of tests. Users who want to run only specific tests can either (1) comment out the tests to be skipped (using the # prefix) or (2) create a new file (e.g., my_rt.conf), add the tests, and execute ./rt.sh -l my_rt.conf.
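When assembling a custom file, it can help to list which builds and tests it contains. The sketch below reads the pipe-separated columns described above; list_rt_conf and trim are hypothetical helper names, and the sample file reuses lines from the excerpt shown earlier.

```shell
# Strip leading and trailing whitespace from a single field.
trim() {
  printf '%s' "$1" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# Hypothetical helper: print the compile names and test names found in a conf file.
# Lines that are not COMPILE or RUN lines (e.g., # comments) are ignored.
list_rt_conf() {
  while IFS='|' read -r kind name rest; do
    kind="$(trim "$kind")"
    name="$(trim "$name")"
    case "$kind" in
      COMPILE) printf 'build: %s\n' "$name" ;;
      RUN)     printf '  test: %s\n' "$name" ;;
    esac
  done < "$1"
}

# Demo: a small custom file with one COMPILE line and two RUN lines.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
COMPILE | s2swa_32bit_pdlib  | intel | -DAPP=S2SWA -D32BIT=ON -DCCPP_SUITES=FV3_GFS_v17_coupled_p8_ugwpv1 -DPDLIB=ON | - noaacloud | fv3 |
RUN | cpld_control_gfsv17 | - noaacloud | baseline |
# RUN | cpld_mpi_gfsv17   | - noaacloud |          |
RUN | cpld_restart_gfsv17 | - noaacloud |          | cpld_control_gfsv17
EOF
list_rt_conf "$conf"
rm -f "$conf"
```

The commented-out RUN line is skipped, which mirrors how the # prefix is used to exclude tests.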

3.7. Running the Model

Attention

Although the following discussions are general, users may not be able to execute the script successfully “as is” unless they are on a Tier-1 platform.

3.7.1. Using the Regression Test Script

Users can run a number of preconfigured regression test cases from the rt.conf file using the regression test script rt.sh in the tests directory. rt.sh is the top-level script that calls lower-level scripts to build specified WM configurations, set up environments, and run tests. Users should edit the rt.conf file to indicate which tests/configurations to run or create their own configuration file (e.g., my_tests.conf) with the subset of tests they want to run.

3.7.1.1. On NOAA RDHPCS

On Tier-1 platforms, users can run regression tests by editing the rt.conf file and executing:

./rt.sh -a <account> -l rt.conf

where <account> is the account/project number that users charge when submitting batch jobs. Users may need to add additional command line arguments or change information in the rt.sh file as well. This information is provided in Section 3.7.1.3 below.

3.7.1.2. On Other Systems

Users on non-NOAA systems will need to make adjustments to several files in the tests directory before running rt.sh, including:

  • rt.sh

  • run_test.sh

  • detect_machine.sh

  • default_vars.sh

  • fv3_conf/fv3_slurm.IN_*

  • fv3_conf/compile_slurm.IN_*

  • compile.sh

  • module-setup.sh

3.7.1.3. The rt.sh File

This section contains additional information on command line options and troubleshooting for the rt.sh file.

3.7.1.3.1. Optional Arguments

To display detailed information on how to use rt.sh, users can simply run ./rt.sh, which will output the following options:

./rt.sh -a <account> | -b <file> | -c | -d | -e | -h | -k | -l <file> | -m | -n <name> | -o | -r | -v | -w
   -a  <account> to use on for HPC queue
   -b  create new baselines only for tests listed in <file>
   -c  create new baseline results
   -d  delete run directories that are not used by other tests
   -e  use ecFlow workflow manager
   -h  display this help
   -k  keep run directory after rt.sh is completed
   -l  runs test specified in <file>
   -m  compare against new baseline results
   -n  run single test <name>
   -o  compile only, skip tests
   -r  use Rocoto workflow manager
   -v  verbose output
   -w  for weekly_test, skip comparing baseline results

When running a large number (tens or hundreds) of tests, the -e or -r options can significantly decrease testing time by using a workflow manager (ecFlow or Rocoto, respectively) to queue the jobs according to dependencies and run them concurrently. The -n option can be used to run a single test; for example, ./rt.sh -a epic -n "control_c48 intel" will build the ATM model and run the control_c48 test with an Intel compiler using the “epic” account (users should substitute an account where they can charge computational resources). The -c option is used to create a baseline. New baselines are needed when code changes lead to result changes and therefore deviate from existing baselines on a bit-for-bit basis.

To run rt.sh using a custom configuration file and the Rocoto workflow manager, create the configuration file (e.g. my_rt.conf) based on the desired tests in rt.conf, and run:

./rt.sh -r -l my_rt.conf

adding additional arguments as desired.

To run a single test, users can try the following command instead of creating a my_rt.conf file:

./rt.sh -r -k -n "control_p8 <compiler>"

where <compiler> is gnu or intel.

3.7.1.3.2. Troubleshooting

Users may need to adjust certain information in the rt.sh file, such as the Machine and Account variables ($MACHINE_ID and $ACCNR), for the tests to run correctly. If there is a problem with these or other variables (e.g., file paths), the output should indicate where:

+ echo 'Machine: ' hera.intel '    Account: ' nems
Machine:  hera.intel     Account:  nems
+ mkdir -p /scratch1/NCEPDEV/stmp4/First.Last
mkdir: cannot create directory ‘/scratch1/NCEPDEV/stmp4/First.Last’: Permission denied
++ echo 'rt.sh error on line 370'
rt.sh error on line 370

Then, users can adjust the information in rt.sh accordingly.

3.7.1.4. Log Files

The regression test generates a number of log files. The summary log file RegressionTests_<machine>.<compiler>.log in the tests directory compares the results of the test against the baseline for a given platform and reports the outcome:

  • 'Missing file' indicates that the expected files from the simulation were not found; this typically occurs when the simulation did not run to completion;

  • 'OK' indicates that the simulation results are bit-for-bit identical to those of the baseline;

  • 'NOT OK' indicates that the results are not bit-for-bit identical; and

  • 'Missing baseline' indicates that there is no baseline data to compare against.
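A quick way to spot problems in a long summary log is to count the non-passing outcomes. The function below is only a sketch: count_rt_outcomes is a hypothetical name, and it assumes the outcome strings listed above appear verbatim in the log, which users should confirm against their own log file.

```shell
# Hypothetical helper: count non-passing outcomes in a summary log file.
# Assumes the outcome strings appear verbatim in the log.
count_rt_outcomes() {
  for outcome in 'NOT OK' 'Missing file' 'Missing baseline'; do
    # grep -c prints 0 (with a non-zero exit status) when nothing matches.
    n="$(grep -c -F -- "$outcome" "$1" || true)"
    printf '%s: %s\n' "$outcome" "$n"
  done
}
```

Calling count_rt_outcomes with the path to the summary log prints one line per outcome; a run with no problems reports zero for all three.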

More detailed log files are located in the tests/log_<machine>.<compiler>/ directory. The run directory path, which corresponds to the value of RUNDIR in the run_<test-name> file, is particularly useful. $RUNDIR is a self-contained (i.e., sandboxed) directory with the executable file, initial conditions, model configuration files, environment setup scripts and a batch job submission script. The user can run the test by navigating into $RUNDIR and invoking the command:

sbatch job_card

This can be particularly useful for debugging and testing code changes. Note that $RUNDIR is automatically deleted at the end of a successful regression test; specifying the -k option retains the $RUNDIR, e.g. ./rt.sh -l rt.conf -k.

Inside the $RUNDIR directory are a number of model configuration files (input.nml, model_configure, ufs.configure) and other application-dependent files (e.g., ice_in for the Subseasonal-to-Seasonal Application). These model configuration files are generated by rt.sh from the template files in the tests/parm directory. Specific values used to fill in the template files are test-dependent and are set in two stages. First, default values are specified in tests/default_vars.sh. Second, the default values are overridden if necessary by values specified in a test file tests/tests/<test-name>. For example, the variable DT_ATMOS is initially assigned 1800 in the function export_fv3 of the script default_vars.sh, but the test file tests/tests/control overrides this setting by reassigning 720 to the variable.

The files fv3_run and job_card also reside in the $RUNDIR directory. These files are generated from the template files in the tests/fv3_conf directory. job_card is a platform-specific batch job submission script, while fv3_run prepares the initial conditions for the test by copying relevant data from the input data directory of a given platform to the $RUNDIR directory. Table 3.4 summarizes the subdirectories discussed above.

Table 3.4 Regression Test Subdirectories

Name               Description
-----------------  ------------------------------------------------------------------------------------
tests/             Regression test root directory. Contains rt-related scripts and the summary log file
tests/tests/       Contains specific test files
tests/parm/        Contains templates for model configuration files
tests/fv3_conf/    Contains templates for setting up initial conditions and a batch job
tests/log_*/       Contains fine-grained log files

3.7.2. Using the Operational Requirement Test Script

The operational requirement test script opnReqTest in the tests directory can be used to run tests in place of rt.sh. Given the name of a test, opnReqTest carries out a suite of test cases. Each test case addresses an aspect of the requirements that new operational implementations must satisfy. These requirements are shown in Table 3.5. For the following discussions on opnReqTest, the user should note the distinction between 'test name' and 'test case'. Examples of test names are control, cpld_control and regional_control which are all found in the tests/tests directory, whereas test case refers to any one of the operational requirements: thr, mpi, dcp, rst, bit and dbg.

Table 3.5 Operational Requirements

Case    Description
------  -----------------------------------------------------------------------------
thr     Varying the number of threads produces the same results
mpi     Varying the number of MPI tasks produces the same results
dcp     Varying the decomposition (i.e. tile layout of FV3) produces the same results
rst     Restarting produces the same results
bit     Model can be compiled in double/single precision and run to completion
dbg     Model can be compiled and run to completion in debug mode

The operational requirement testing uses the same testing framework as the regression tests, so it is recommended that the user first read Section 3.7.1. All the files in the subdirectories shown in Table 3.4 are relevant to the operational requirement test. The only difference is that the opnReqTest script replaces rt.sh. The tests/opnReqTests directory contains opnReqTest-specific lower-level scripts used to set up run configurations.

On Tier-1 platforms, tests can be run by invoking

./opnReqTest -n <test-name>

For example, ./opnReqTest -n control performs all six test cases listed in Table 3.5 for the control test. At the end of the run, a log file OpnReqTests_<machine>.<compiler>.log is generated in the tests directory, which informs the user whether each test case passed or failed. The user can choose to run a specific test case by invoking

./opnReqTest -n <test-name> -c <test-case>

where <test-case> is one or more comma-separated values selected from thr, mpi, dcp, rst, bit, dbg. For example, ./opnReqTest -n control -c thr,rst runs the control test and checks the reproducibility of threading and restart.
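A simple pre-check can catch typos in the case list before a job is submitted. The function valid_cases is a hypothetical helper that accepts only the six case names from Table 3.5; the help text shown later in this section also lists std and fhz, which this sketch does not accept.

```shell
# Hypothetical pre-check: accept only the six case names from Table 3.5.
valid_cases() {
  for c in $(printf '%s' "$1" | tr ',' ' '); do
    case "$c" in
      thr|mpi|dcp|rst|bit|dbg) ;;
      *) echo "unknown test case: $c" >&2; return 1 ;;
    esac
  done
}

# Print (do not run) the opnReqTest command once the list has been validated.
cases="thr,rst"
if valid_cases "$cases"; then
  echo "./opnReqTest -n control -c ${cases}"
fi
```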

The user can see different command line options available to opnReqTest by executing ./opnReqTest -h, which produces the following results:

Usage: opnReqTest -n <test-name> -a <account> [ -c <test-case> ] [-b] [-d] [-e] [-k] [-h] [-x] [-z]

   -a  specify HPC <account> to use for batch job
   -n  specify <test-name>
   -c  specify <test-case>
       defaults to all test-cases: thr,mpi,dcp,rst,bit,dbg,fhz
       comma-separated list of any combination of std,thr,mpi,dcp,rst,bit,dbg,fhz
   -b  test reproducibility for bit; compare against baseline
   -d  test reproducibility for dbg; compare against baseline
   -s  test reproducibility for std; compare against baseline
   -e  use ecFlow workflow manager
   -k  keep run directory
   -h  display this help and exit
   -x  skip compile
   -z  skip run

Frequently used options are -e to use the ecFlow workflow manager, and -k to keep the $RUNDIR. The Rocoto workflow manager is not used operationally and therefore is not an option.

As discussed in Section 3.7.1.4, the variables and values used to configure model parameters and to set up initial conditions in the $RUNDIR directory are set up in two stages. First, tests/default_vars.sh defines default values; then a specific test file in the tests/tests subdirectory either overrides the default values or creates new variables if required by the test. The regression test treats the different test cases shown in Table 3.5 as different tests. Therefore, each test case requires a test file in the tests/tests subdirectory. Examples include control_2threads, control_decomp, control_restart and control_debug, which are just variations of the control test to check various reproducibilities. There are two potential issues with this approach. First, if several different variations of a given test were created and included in the rt.conf file, there would be too many tests to run. Second, users who add a new test would also have to create these variations themselves. The idea behind the operational requirement test is to automatically configure and run these variations, or test cases, given a test file. For example, ./opnReqTest -n control will run all six test cases in Table 3.5 based on a single control test file. Similarly, if the user adds a new test new_test, then ./opnReqTest -n new_test will run all test cases. The opnReqTest script accomplishes this by adding a third stage of variable overrides. The related scripts can be found in the tests/opnReqTests directory.