gmerge

Making Ensemble, Analysis and Forecast Means

Introduction

There are two fast techniques for making averages with wgrib2: "gmerge" and "fast averaging".

  gmerge: Inputs A, B and C are grib analyses, 3 hours apart, in chronological order
          and with identical field order
  gmerge - A B C | wgrib2 - -ave 3hr avg.grb
     avg.grb has, for each field in A, the average over A, B and C
     the process is explained later

  fast averaging: Inputs A, B and C are grib analyses, 3 hours apart and in chronological order
  cat A B C | wgrib2 - -if "FIELD1" -ave 3hr avg.grb -endif \
                       -if "FIELD2" -ave 3hr avg.grb -endif \
                       ...
                       -if "FIELDN" -ave 3hr avg.grb -endif
     avg.grb has the averages of FIELD1,..,FIELDN
The "fast averaging" technique was developed first and requires multiple averages to be done at one time. The "gmerge" technique was developed for creating ensemble statistics but also applies for making averages of forecasts and analyses. The "gmerge" technique is easier to program, uses less memory but requires the fields in the grib files to be a fixed order. The "fast averaging" requires that the script generate a line of code that gets executed. This page describes the gmerge technique and the use of -merge_fcst to convert 3 hourly TMAX fields to make daily TMAX fields.

Gmerge is a program included with wgrib2 in the aux_progs directory. Gmerge combines grib files by interleaving their messages in round-robin order. Assuming that the input files have their fields in a fixed order, the output file will have the order needed for wgrib2 to create ensemble, analysis and forecast means efficiently.

Grib files consist of messages. Suppose the input files each have two messages.

input1:  (TMP-1)  (HGT-1)
input2:  (TMP-2)  (HGT-2)
input3:  (TMP-3)  (HGT-3)

Suppose that you combine input1, input2 and input3 with the "cat" command.

$ cat input1 input2 input3 >out1
out1: (TMP-1) (HGT-1) (TMP-2) (HGT-2) (TMP-3) (HGT-3)

You can use gmerge to get a "round-robin" ordering of the fields:

$ gmerge out2 input1 input2 input3
out2: (TMP-1) (TMP-2) (TMP-3) (HGT-1) (HGT-2) (HGT-3)

Suppose you want to calculate the average TMP and HGT. With out2, you can read the data sequentially. With out1, you would have to create an index file and then read the TMP and HGT fields by random access. Alternatively, you could adopt a solution that keeps all the partial sums in memory, like "fast averaging".
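
For example, with the round-robin file the averages can be formed in a single sequential pass; a minimal sketch, assuming input1, input2 and input3 are analyses 3 hours apart:

 gmerge - input1 input2 input3 | wgrib2 - -ave 3hr avg.grb
 # avg.grb: the 3-file average of TMP followed by the 3-file average of HGT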

Usage

gmerge OUT FILE-1 FILE-2 ... FILE-N
       OUT: -            write to stdout; the dash is a common convention for stdin/stdout
       OUT: (name)       output grib file
       FILE-i   grib file with no non-grib2 data
                N is limited by the system; it is one less than the value given by
                  linux: $ ulimit -n

Using gmerge only makes sense when (1) the input files contain no non-grib2 data, (2) all the input files have their fields in the same order, and (3) your processing program can handle the fields in a sequential manner. This seems quite restrictive; however, it is common. Later sections show techniques for preparing the data so that it meets these restrictions.

Ensemble Members

Wgrib2 can be used to generate ensemble statistics for members of the same ensemble. The fields need to be in the same order, and the only difference in the metadata can be the ensemble member number. CORe used gmerge and wgrib2 to generate its ensemble statistics.

 An example from CORe (80-member ensemble)

  files: core.t03z.flx.mem001,..,core.t03z.flx.mem080

 $ gmerge - core.t03z.flx.mem0?? | wgrib2 - -ens_processing flxstats.grb 1
 1:0:d=2023111503:DLWRF:surface:anl:ENS=+1
 2:195555:d=2023111503:DLWRF:surface:anl:ENS=+2
 3:391194:d=2023111503:DLWRF:surface:anl:ENS=+3
 4:587071:d=2023111503:DLWRF:surface:anl:ENS=+4
 5:782846:d=2023111503:DLWRF:surface:anl:ENS=+5
 6:978867:d=2023111503:DLWRF:surface:anl:ENS=+6
 ...
 9197:1010069754:d=2023111503:CPRAT:surface:0-3 hour ave fcst:ENS=+77
 9198:1010134971:d=2023111503:CPRAT:surface:0-3 hour ave fcst:ENS=+78
 9199:1010200392:d=2023111503:CPRAT:surface:0-3 hour ave fcst:ENS=+79
 9200:1010266285:d=2023111503:CPRAT:surface:0-3 hour ave fcst:ENS=+80
 $ 
 $ wgrib2 flxstats.grb | head
 1:0:d=2023111503:DLWRF:surface:anl:min all members
 2:262325:d=2023111503:DLWRF:surface:anl:max all members
 3:524650:d=2023111503:DLWRF:surface:anl:ens mean
 4:786975:d=2023111503:DLWRF:surface:anl:ens spread
 ...

Analyses

Wgrib2 can be used to combine analyses into daily or monthly mean analyses.

 $ gmerge - a.* | wgrib2 - -ave DT OUT.grb
   Note: a.* must be a list of files in chronological order.
         To make a daily mean, you have to restrict a.* to the files for that day.
         To make a monthly mean, you have to restrict a.* to the files for that month.
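
 As a sketch, if the analyses were named a.YYYYMMDDHH (an assumed naming convention),
 the restriction can be done with a shell glob:

   # daily mean for 11 Jan 1990, 3-hourly analyses assumed
   gmerge - a.19900111?? | wgrib2 - -ave 3hr day.grb
   # monthly mean for Jan 1990
   gmerge - a.199001???? | wgrib2 - -ave 3hr month.grb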

 An Example from CORe, making a daily mean

 $ gmerge - flx_19900111??_ensmean.grb | wgrib2 - -ave 3hr /tmp/ave.grb
 1:0:d=1990011100:DLWRF:surface:anl:ens mean
 2:168405:d=1990011103:DLWRF:surface:anl:ens mean
 3:337533:d=1990011106:DLWRF:surface:anl:ens mean
 4:506437:d=1990011109:DLWRF:surface:anl:ens mean
 5:675601:d=1990011112:DLWRF:surface:anl:ens mean
 6:844337:d=1990011115:DLWRF:surface:anl:ens mean
 7:1013963:d=1990011118:DLWRF:surface:anl:ens mean
 8:1182991:d=1990011121:DLWRF:surface:anl:ens mean
 9:1352216:d=1990011100:ULWRF:surface:anl:ens mean
 10:1514196:d=1990011103:ULWRF:surface:anl:ens mean
 ...
 918:103736065:d=1990011115:CPRAT:surface:0-3 hour ave fcst:ens mean
 919:103825356:d=1990011118:CPRAT:surface:0-3 hour ave fcst:ens mean
 920:103913495:d=1990011121:CPRAT:surface:0-3 hour ave fcst:ens mean

 $ wgrib2 /tmp/ave.grb
 1:0:d=1990011100:DLWRF:surface:8@3 hour ave(anl),missing=0:ens mean
 2:262349:d=1990011100:ULWRF:surface:8@3 hour ave(anl),missing=0:ens mean
 3:524698:d=1990011100:DSWRF:surface:8@3 hour ave(anl),missing=0:ens mean
 4:754279:d=1990011100:USWRF:surface:8@3 hour ave(anl),missing=0:ens mean
 5:1000244:d=1990011100:UGRD:10 m above ground:8@3 hour ave(anl),missing=0:ens mean
 6:1262593:d=1990011100:VGRD:10 m above ground:8@3 hour ave(anl),missing=0:ens mean
 ...

Forecasts

Combining forecast files into daily, weekly, monthly or seasonal means is much more involved than the previous cases. The first problem is that the forecast hour = 0 file often has fewer grib messages than the following forecast files. This is common with NCEP forecast files because accumulations and temporal averages are unavailable at t=0.
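
A quick way to see the problem is to count the messages at the analysis time and at a later forecast hour; the f000 file name below is an assumption following the GFS naming used in the examples later on.

 wgrib2 gfs.t00z.pgrb2.1p00.f000 | wc -l
 wgrib2 gfs.t00z.pgrb2.1p00.f003 | wc -l
 # the f000 count is smaller because the acc/ave/min/max fields are missing at t=0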

Another problem with NCEP forecast files is that the accumulations, averages, maximums and minimums do not have a simple structure. For example, a simple structure would have all the AVEs look like "M-N hour ave fcst" where N = M + constant. However, in my sample GFS forecast, the constant was either 3 or 6.

Example from NCEP's gfs system.
$ wgrib2 gfs.t00z.pgrb2.1p00.f003 -match "ULWRF:top of atmosphere:"
658:39419182:d=2025051900:ULWRF:top of atmosphere:0-3 hour ave fcst:
$ wgrib2 gfs.t00z.pgrb2.1p00.f006 -match "ULWRF:top of atmosphere:"
658:39698487:d=2025051900:ULWRF:top of atmosphere:0-6 hour ave fcst:
$ wgrib2 gfs.t00z.pgrb2.1p00.f009 -match "ULWRF:top of atmosphere:"
658:39438111:d=2025051900:ULWRF:top of atmosphere:6-9 hour ave fcst:
$ wgrib2 gfs.t00z.pgrb2.1p00.f012 -match "ULWRF:top of atmosphere:"
658:39643948:d=2025051900:ULWRF:top of atmosphere:6-12 hour ave fcst:

Finally, wgrib2 can average forecasts of different leads from the same initial conditions using the -fcst_ave option. The accumulations/averages/maximums/minimums are treated more carefully by the -merge_fcst option, which takes adjacent acc/ave/max/min intervals and combines them into a single longer acc/ave/max/min interval.
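
As a small sketch of what -merge_fcst does (the file names follow the GFS example below; the exact labeling of the output should be checked rather than taken as exact):

 # combine 2 adjacent TMAX intervals (24-30 and 30-36 hour max fcst) into one
 # 24-36 hour maximum; the first argument of -merge_fcst is the number of
 # adjacent intervals to combine
 gmerge - gfs.t00z.pgrb2.1p00.f030 gfs.t00z.pgrb2.1p00.f036 | \
     wgrib2 - -match ":TMAX:" -merge_fcst 2 tmax_24_36.grb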

Step 1: averages of instantaneous forecasts

Example using GFS forecast files

 list=`seq -f gfs.t00z.pgrb2.1p00.f%03.0f 24 3 45`
 : makes a list of files to process, forecast hours 24 27 .. 45
 :    list=
 :    gfs.t00z.pgrb2.1p00.f024
 :    gfs.t00z.pgrb2.1p00.f027
 :    gfs.t00z.pgrb2.1p00.f030
 :    gfs.t00z.pgrb2.1p00.f033
 :    gfs.t00z.pgrb2.1p00.f036
 :    gfs.t00z.pgrb2.1p00.f039
 :    gfs.t00z.pgrb2.1p00.f042
 :    gfs.t00z.pgrb2.1p00.f045

 gmerge - $list | wgrib2 - -not ' (ave|min|max|acc) ' -fcst_ave 3hr ave.grb
 : gmerge - $list              process $list in round-robin order and write to stdout
 : wgrib2 -                    process grib input from stdin
 : -not ' (ave|min|max|acc) '  only process the instantaneous "N hour fcst" fields
 : -fcst_ave 3hr ave.grb       the fields are spaced every 3 hours

Step 2: Examining input acc/ave/min/max fields

Example using GFS forecast files

 listflx=`seq -f gfs.t00z.pgrb2.1p00.f%03.0f 24 3 45`
 gmerge - $listflx | wgrib2 - -match ' (ave|min|max|acc) ' | head
 4681:285550939:d=2025051900:TMAX:2 m above ground:18-24 hour max fcst:
 4682:285598150:d=2025051900:TMAX:2 m above ground:24-27 hour max fcst:
 4683:285644800:d=2025051900:TMAX:2 m above ground:24-30 hour max fcst:
 4684:285691467:d=2025051900:TMAX:2 m above ground:30-33 hour max fcst:
 4685:285739040:d=2025051900:TMAX:2 m above ground:30-36 hour max fcst:
 4686:285786354:d=2025051900:TMAX:2 m above ground:36-39 hour max fcst:
 4687:285834479:d=2025051900:TMAX:2 m above ground:36-42 hour max fcst:
 4688:285882184:d=2025051900:TMAX:2 m above ground:42-45 hour max fcst:
 4689:285929688:d=2025051900:TMIN:2 m above ground:18-24 hour min fcst:
 4690:285978003:d=2025051900:TMIN:2 m above ground:24-27 hour min fcst:

 The time intervals can start up to 6 hours prior to the forecast (file) hour, so we need to adjust $listflx.

 listflx=`seq -f gfs.t00z.pgrb2.1p00.f%03.0f 30 3 51`
 gmerge - $listflx | wgrib2 - -match ' (ave|min|max|acc) ' | head
 4681:285378854:d=2025051900:TMAX:2 m above ground:24-30 hour max fcst:
 4682:285425521:d=2025051900:TMAX:2 m above ground:30-33 hour max fcst:
 4683:285473094:d=2025051900:TMAX:2 m above ground:30-36 hour max fcst:
 4684:285520408:d=2025051900:TMAX:2 m above ground:36-39 hour max fcst:
 4685:285568533:d=2025051900:TMAX:2 m above ground:36-42 hour max fcst:
 4686:285616238:d=2025051900:TMAX:2 m above ground:42-45 hour max fcst:
 4687:285663742:d=2025051900:TMAX:2 m above ground:42-48 hour max fcst:
 4688:285710641:d=2025051900:TMAX:2 m above ground:48-51 hour max fcst:
 4689:285757042:d=2025051900:TMIN:2 m above ground:24-30 hour min fcst:
 4690:285804816:d=2025051900:TMIN:2 m above ground:30-33 hour min fcst:

 The intervals overlap, so they are not appropriate for -merge_fcst; try again with files 6 hours apart.

 listflx=`seq -f gfs.t00z.pgrb2.1p00.f%03.0f 30 6 51`

 $ echo $listflx
 gfs.t00z.pgrb2.1p00.f030 gfs.t00z.pgrb2.1p00.f036 gfs.t00z.pgrb2.1p00.f042 gfs.t00z.pgrb2.1p00.f048

 $ gmerge - $listflx | wgrib2 - -match ' (ave|min|max|acc) ' | head
 2341:142428715:d=2025051900:TMAX:2 m above ground:24-30 hour max fcst:
 2342:142475382:d=2025051900:TMAX:2 m above ground:30-36 hour max fcst:
 2343:142522696:d=2025051900:TMAX:2 m above ground:36-42 hour max fcst:
 2344:142570401:d=2025051900:TMAX:2 m above ground:42-48 hour max fcst:
 2345:142617300:d=2025051900:TMIN:2 m above ground:24-30 hour min fcst:
 2346:142665074:d=2025051900:TMIN:2 m above ground:30-36 hour min fcst:
 2347:142712663:d=2025051900:TMIN:2 m above ground:36-42 hour min fcst:
 2348:142760444:d=2025051900:TMIN:2 m above ground:42-48 hour min fcst:
 2373:144099440:d=2025051900:CPRAT:surface:24-30 hour ave fcst:
 2374:144168768:d=2025051900:CPRAT:surface:30-36 hour ave fcst:

 The intervals are good for -merge_fcst.

Step 3a: Merge the acc/ave/min/max fields

 Now the TMAX intervals cover day 2 (forecast hours 24-48) and can be merged.

 listflx=`seq -f gfs.t00z.pgrb2.1p00.f%03.0f 30 6 51`
 gmerge - $listflx | wgrib2 - -match ' (ave|min|max|acc) ' -merge_fcst 4 acc.grb
 wgrib2 acc.grb | head
 1:0:d=2025051900:TMAX:2 m above ground:1-2 day max fcst:
 2:89798:d=2025051900:TMIN:2 m above ground:1-2 day min fcst:
 3:179596:d=2025051900:CPRAT:surface:1-2 day ave fcst:
 4:301974:d=2025051900:PRATE:surface:1-2 day ave fcst:
 5:408062:d=2025051900:APCP:surface:1-2 day acc fcst:
 6:506005:d=2025051900:ACPCP:surface:1-2 day acc fcst:
 7:587658:d=2025051900:WATR:surface:1-2 day acc fcst:
 8:640148:d=2025051900:CSNOW:surface:1-2 day ave fcst:
 9:648496:d=2025051900:CICEP:surface:1-2 day ave fcst:
 10:648699:d=2025051900:CFRZR:surface:1-2 day ave fcst:

 Finally, combine ave.grb and acc.grb:

 cat ave.grb acc.grb >day2.grb

Step 3b: Average the acc/ave/min/max fields

In step 3a the fields were merged, which made the time intervals longer; that approach is more appropriate for turning the forecasts into daily files. In step 3b the acc/ave/min/max fields are averaged instead.

 listflx=`seq -f gfs.t00z.pgrb2.1p00.f%03.0f 30 6 51`
 gmerge - $listflx | wgrib2 - -match ' (ave|min|max|acc) ' -fcst_ave 6hr flx.grb
 $ wgrib2 flx.grb | head
 1:0:d=2025051900:TMAX:2 m above ground:4@6 hour ave(24-30 hour max fcst)++,missing=0:
 2:89810:d=2025051900:TMIN:2 m above ground:4@6 hour ave(24-30 hour min fcst)++,missing=0:
 3:179620:d=2025051900:CPRAT:surface:4@6 hour ave(24-30 hour ave fcst)++,missing=0:
 4:302010:d=2025051900:PRATE:surface:4@6 hour ave(24-30 hour ave fcst)++,missing=0:
 ...

 4@6 hour ave(24-30 hour max fcst)++,missing=0
   4@6 hour ave
     You average 4 fields which are separated by 6 hours.
     (24-30 hour max fcst)++
       The first field is the 24-30 hour max fcst, the max in the 24-30 hour forecast.
       The second field has 24 and 30 incremented by 6 hours, "30-36 hour max fcst"
       ...
  So the grib metadata precisely describes the statistical operation that was done,
  although the notation can be difficult to understand.

Combining Steps 3a and 3b

The above approach is not ideal for TMAX and TMIN. You will want to merge TMAX and TMIN into daily TMAX and TMIN first, and then average those daily values. So you would use step 3a to make daily files by merging, and then use step 3b to average the daily files, as sketched below.
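
A sketch of that two-pass approach for TMAX, reusing the GFS file naming from the earlier steps; the day-3 forecast-hour range and the "24hr" spacing given to -fcst_ave are assumptions to be checked against your own files:

 # step 3a: merge the 6-hour TMAX intervals into daily (day 2 and day 3) maxima
 list_day2=`seq -f gfs.t00z.pgrb2.1p00.f%03.0f 30 6 51`
 list_day3=`seq -f gfs.t00z.pgrb2.1p00.f%03.0f 54 6 75`
 gmerge - $list_day2 | wgrib2 - -match ":TMAX:" -merge_fcst 4 tmax_day2.grb
 gmerge - $list_day3 | wgrib2 - -match ":TMAX:" -merge_fcst 4 tmax_day3.grb

 # step 3b: average the daily maxima (the fields are 24 hours apart)
 gmerge - tmax_day2.grb tmax_day3.grb | wgrib2 - -fcst_ave 24hr tmax_day2-3.grb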

So creating forecast means from NCEP forecast files is a two-part process. First you create the means of the instantaneous "N hour fcst" fields using the -fcst_ave option. Then you examine the acc/ave/min/max fields, which may require a different set of files to process, and handle them using step 3a, step 3b or both.

Does my data have a constant field order?

For operational NCEP models, the field order will change with major upgrades when new fields are added to the grib file. That can't be avoided, but what about the order within the same version of the model? Most of NCEP's model output is created by the Unified Post Processor (ncep post), and the field order is specified by its control file, so the order is fixed for a given model version.
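
One way to check is to compare the inventories of two files with the record number, byte offset and date stripped off; this is only a sketch, and for ensemble or forecast files you may need to strip additional parts of the inventory (such as the ENS= suffix or the forecast hour):

 # compare the field order of two files; fields 1-3 of the inventory are
 # record number, byte offset and date
 diff <(wgrib2 A.grb | cut -d: -f4-) <(wgrib2 B.grb | cut -d: -f4-) \
     && echo "same field order"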

Removing non-grib2 data

 $ wgrib2 IN.grb -grib OUT.grb

Selecting and removing fields

 $ wgrib2 IN.grb -match "(A|B|C)" -not "(G|H)" -grib OUT.grb
   keep A, B, or C
   remove G and H
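
A concrete (made-up) version of the same idea, keeping TMP and HGT while dropping two pressure levels:

 $ wgrib2 IN.grb -match ":(TMP|HGT):" -not ":(1000|925) mb:" -grib OUT.grb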

Sorting fields

 $ wgrib2 IN.grb | sort (whatever) | wgrib2 IN.grb -i -grib OUT.grb
   puts the fields into OUT.grb in sorted order
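
For example, sorting on the variable name and level (colon-separated fields 4 and 5 of the inventory) groups like fields together; the sort keys here are only a suggestion:

 $ wgrib2 IN.grb | sort -t: -k4,4 -k5,5 | wgrib2 IN.grb -i -grib OUT.grb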

Windows Compatibility

The above examples may not work on Windows because of problems with mixing text and binary I/O on stdin and stdout.
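
One workaround is to avoid the pipes entirely and have gmerge write to a named file, which wgrib2 then reads; a minimal sketch:

 gmerge merged.grb input1 input2 input3
 wgrib2 merged.grb -ave 3hr avg.grb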

See also: -ens_processing, -ave, -fcst_ave, -merge_fcst

