ICOADS Web information page (Wednesday, 11-May-2016 19:34:47 UTC): Release 1c (1784-1949) Beta

1. Introduction

Preliminary ("beta") Release 1c (1784-1949) data are completed. This page provides some background, plus comparison results against original COADS Release 1 data for 1854-1949. After additional analysis of the beta data and assessment of duplicate elimination (dupelim) performance, it is planned to rerun Release 1c processing to produce final observational and statistical products for this period, with LMRF availability planned by mid-February 2001.

2. Data source additions

Following is a brief discussion of the new input datasets. Note that report counts are as input to dupelim, after initial quality controls and conversion to LMR. As a result, some counts differ from earlier (raw) report estimates (e.g., the Maury Collection was previously estimated at 1.4M reports).

a) Blend of the UK Main Marine Data Bank (MDB) with COADS for the period 1854-1949 (decks 201-255; 11.7M reports): Copies of TD-1100 decks in MDB were deleted (9.6M reports during 1854-1949) prior to this stage.

b) Maury Collection (deck 701; 1784-1863; 1.3M reports): This deck provides the only data for 1784-1803, and substantial new data additions after that.

c) Norwegian Logbook Collection (deck 702; 1867-89; 201K reports).

d) Japanese Kobe Collection data (deck 762; 1890-1932; 1M reports): These are data more recently keyed by Japan (decks 118-119, which are among the COADS Release 1 data, were keyed in the 1960s).

e) US Merchant Marine 1912-46 Collection (decks 705-707; 3.5M reports): A few data are also included back to 1910.

f) Russian Makarov Collection (deck 731; 1804-1891; 3.5K reports): 27 ships including the "Vitiaz" in two partially overlapping collections.
g) World Ocean Database 1998 (WOD98; deck 780; 405K reports), including sea surface temperature estimates derived from the uppermost layers of ocean profiles, and some surface meteorological fields (CTD and XBT archives were outside of the Release 1c period).

h) Arctic drift stations (deck 734): For this period the deck includes two Norwegian ships overwintering in the Arctic:
   i) Data from the North Polar expedition of the "Fram" (1893-96, North of 76N; 8K reports) were obtained from Volker Wagner at the Deutscher Wetterdienst (German Weather Service), and with the assistance of the US National Snow and Ice Data Center (NSIDC).
   ii) NCDC-keyed data covering 1922-24 (7K reports) from the North Polar expedition with the "Maud" (1918-25) were also obtained with the assistance of NSIDC.

i) Russian AARI North Pole (NP) Station (manned drifting ice floe) data from the Polar Science Center (deck 733; NP-1 for 1937-38; 1K reports).

j) Russian MARMET (deck 732; 268K reports starting about 1888) marine meteorological archive (previously known as MORMET).

Figures 1-2 illustrate new data additions made to the Release 1c period, in comparison to existing Release 1 data (note that in some cases significant reductions in report counts occur as a result of dupelim processing):

Figure 1: Dupelim output: 1796-1889 deck composition. Decks originally used for Release 1, and also output for Release 1c, are aggregated under the gray bar. Other bar colors are used for decks new to Release 1c. The line shows the total dupelim output for Release 1, for comparison.

Figure 2: Dupelim output: 1890-1949 deck composition (otherwise as for Fig. 1).

Notes:

i) The Release 1 dupelim output as shown in Figures 1-2 is based on counts of Compressed Marine Reports (CMR), in which uncertain duplicates had been removed. The beta output is based on LMR, but the counts were reduced to account for most uncertain duplicates (see sec. 3c) and for landlocked reports.
After adjustment the counts correspond exactly to LMRF, but are only roughly comparable with CMR counts (for reasons including retention in CMR of landlocked reports). (This approximate relationship is not estimated to have a major impact on items ii-iii.)

ii) In many cases in Figure 2 (e.g., 1904-14), fewer reports from Release 1 decks were output in the beta (i.e., the line is above the tops of the gray bars). We think this usually indicates that decks new to Release 1c were selected instead. For example, some of the UK MDB decks thereby replaced inferior or less complete copies of the data that were originally included in Release 1.

iii) Conversely, in some cases (e.g., 1864) the line falls below the tops of the gray bars. This means that more data from the Release 1 decks were output in the beta than were output for Release 1. Sec. 3d describes a duplicate elimination problem (undetected duplicates) that we believe accounts for this unexpected inflation.

3. Data problems

Examples follow, but it should be noted that many of the early collections may be lacking in metadata or contain data problems not listed here (e.g., unadjusted pressures or magnetic wind directions):

a) Maury Collection: Most of the wind directions may be magnetic. We are still looking into the feasibility of adjusting directions based on historical fields of magnetic declination (NOAA/NGDC may be able to help with this). In the absence of any metadata as to instrument type, barometers were all assumed to be mercurial and adjusted for gravity, and also adjusted for temperature if attached thermometer data were available (a flag was set indicating whether one or both corrections were made).
Detailed information about the conversion of this Collection to LMR format, including corrections made to temperatures, is available here: icoads.noaa.gov/maury.html. Further examination of the temperatures and other data is underway at NCDC (e.g., to explore whether Reaumur temperatures are embedded among those now labeled Celsius).

b) Dutch (deck 193) sea level pressures: Pressures were recovered from the supplemental attachment and adjusted for gravity. This accounts partially for large increases in pressure data coverage (see sec. 4), particularly in the 19th century. However, an estimated 3% of the data may have been taken with aneroid barometers, and thus should not have been adjusted for gravity (the problem appears unresolvable at this time, due to a lack of metadata).

c) Dupelim problems: Additional tuning of the dupelim procedure still appears needed to ensure that more unique data are retained, such as a "pass through" of some decks that were subject to comparison with other decks during the beta run. Also, some rules (e.g., exact time/space match) that were developed for Release 1a processing appear to have been too stringent for these earlier data (in the beta version of LMRF we retained exact time/space uncertain duplicates, i.e., DS=6, to alleviate some problems).

d) Undetected duplicates: An additional dupelim problem in the beta run was that duplicates went undetected for some combinations of German, Dutch, HSST, and MDB decks, because no allowance was made for small sea level pressure (SLP) differences. For example, this occurred in Dutch (deck 193) versus HSST (decks 155-156) matches as a side effect of the recovery of SLP in deck 193. Based on sample matches, the recovered SLP values tended to match HSST decks to about 0.1 hPa. But this altered dupelim performance, since SLP was expected to match exactly.
We are addressing this in the rerun by extending to deck 193 (and similarly to other known deck combinations impacted by SLP differences) an existing deck 192-HSST allowance (#4), which considered pressures to match if they agreed to whole hPa.

4. Comparisons of near-global (62N-62S) time-series using "concurrent" 2° boxes

Comparisons (see Appendix for plot details) are presented for two periods: 1854-99, and 1900-49. Dataset1 is COADS Release 1, and dataset2 is the COADS Release 1c beta (output from dupelim) for the overlapping periods (Release 1c extends back to 1784 with some new data). Year-month summaries for 2° boxes were calculated from the beta data for 1854-1949, such that the data were trimmed at 4.5 sigma and all platform types were included (e.g., some oceanographic data become available in the late 19th century).

Figure 3a: 1854-99: Sea surface temperature.
Figure 3b: 1854-99: Air temperature.
Figure 3c: 1854-99: Scalar wind.
Figure 3d: 1854-99: Sea level pressure.
Figure 3e: 1854-99: Total cloudiness.

Figure 4a: 1900-49: Sea surface temperature.
Figure 4b: 1900-49: Air temperature.
Figure 4c: 1900-49: Scalar wind.
Figure 4d: 1900-49: Sea level pressure.
Figure 4e: 1900-49: Total cloudiness.

Notes on Figures 3a-3e and 4a-4e:

i) Pushing the print button will print all the figures in a set (e.g., 3a-3e), since they are all colocated on the same page.

===============================================================================

Appendix: Explanation of each 4-panel plot page comparing two datasets

a) Departures

For each variable, 2° latitude/longitude boxes in the region 62°N-62°S were included in the comparison only if they possessed data "concurrently" (i.e., for a given year-month) from both dataset1 (left) and dataset2 (right). This ensures a comparable grid within each monthly time step, but not a frozen grid through time.
Departures were calculated for each year-month-2° box, and each dataset, of monthly means with respect to a basic 1950-79 COADS Release 1 long-term monthly mean (LTM). (Note that the data used to construct the LTM were also means, not medians.) Each concurrent 2° box departure value was cosine-weighted,* from which the area-weighted average was computed. The top panel contains two separate curves of the area-weighted average departures (black = dataset1; green = dataset2). It should be emphasized that this type of comparison does not reveal anything about data patterns in either dataset outside of the region defined by the set of concurrent boxes for each year-month.

b) Differences

The green curve is the area-weighted average, as calculated for plot a), of the difference between 2° monthly means for dataset1 minus dataset2 (i.e., both datasets must possess a monthly mean for a given 2° box to be included in the difference). This set of boxes may be larger than the set of concurrent boxes used for the departures.** The black curve, which corresponds almost exactly (and is thus invisible on many of the plots), is the non-area-weighted average of the differences.

c) 2° boxes

The black curve shows the number of concurrent*** 2° boxes; the green curve shows that number plus the number of non-concurrent 2° boxes containing only data from dataset2 (the number of non-concurrent 2° boxes in dataset1, if any, is not shown). The green curve also includes any 2° boxes containing data in dataset2 that were not available in the 1950-79 LTM; such boxes were also not included in plot a).

d) Numbers of observations

Using only the set of concurrent**** boxes for each year-month, the green curve shows the number of observations for dataset2, and the black curve shows the corresponding number of observations for dataset1. Curves are not shown for any additional observations falling in 2° boxes represented in either dataset outside of the set of concurrent boxes.
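As a rough illustration of plots a) and b), the following Python sketch computes area-weighted average departures and differences for a single year-month. It is not the code used to produce the figures. The data layout (a dictionary keyed by the south-edge latitude and west-edge longitude of each 2° box), all function names, and the sample values are hypothetical, and the weight is taken as the cosine of the box-centre latitude, which is a simplifying assumption (footnote * describes the weighting actually applied by GrADS).

```python
import math

BOX_DEG = 2.0  # box size in degrees, as used in the comparisons


def box_weight(lat_south):
    """Assumed area weight: cosine of the box-centre latitude."""
    return math.cos(math.radians(lat_south + BOX_DEG / 2.0))


def weighted_mean(values):
    """Area-weighted average of {(lat_south, lon_west): value}."""
    total_weight = sum(box_weight(lat) for lat, _ in values)
    weighted_sum = sum(box_weight(lat) * v for (lat, _), v in values.items())
    return weighted_sum / total_weight


def departures(means, ltm, boxes):
    """Monthly mean minus long-term monthly mean, for the given boxes."""
    return {box: means[box] - ltm[box] for box in boxes}


def differences(means1, means2):
    """dataset1 minus dataset2 wherever both have a monthly mean (no LTM needed)."""
    return {box: means1[box] - means2[box] for box in set(means1) & set(means2)}


# Hypothetical values for one year-month; the LTM lacks the box at 10N.
ltm = {(0.0, 0.0): 27.0, (60.0, 0.0): 8.0}
dataset1 = {(0.0, 0.0): 27.5, (60.0, 0.0): 7.0, (10.0, 0.0): 26.0}
dataset2 = {(0.0, 0.0): 27.2, (60.0, 0.0): 7.4, (10.0, 0.0): 26.3}

# Plot a): boxes must have data in both datasets and in the LTM.
departure_boxes = set(dataset1) & set(dataset2) & set(ltm)
dep1 = weighted_mean(departures(dataset1, ltm, departure_boxes))
dep2 = weighted_mean(departures(dataset2, ltm, departure_boxes))

# Plot b): area-weighted (green) and simple (black) average differences.
diff = differences(dataset1, dataset2)
mean_diff = weighted_mean(diff)
simple_diff = sum(diff.values()) / len(diff)

print(len(departure_boxes), len(diff))
print(dep1, dep2, mean_diff, simple_diff)
```

In this made-up case the difference set holds three boxes while the departure set holds two, because one box has data in both datasets but no LTM value. A real implementation would also need to handle year-months with no concurrent boxes, which this sketch does not.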
----------

* The method used by GrADS employs the "delta of the sin of the latitudes at the edges of the grid box," rather than the central latitude of the box.

** The set of boxes used for the differences (plot b) may be a superset of the set of concurrent boxes used for the departures (plot a), because we did not require that the LTM exist for a given box-month for it to be included in the differences. Moreover, the set of "concurrent" boxes used for the counts of boxes and observations (plots c and d) is that defined by the differences, not the departures. The problem arises because the trimming limits (owing to interpolation and extrapolation), as well as the Release 1c data, may be more extensive (covering more boxes) than the 1950-79 LTM. Note that if one were comparing two untrimmed datasets, which might therefore both contain many box-months not represented in the 1950-79 LTM, the set of boxes used for the differences might be considerably different than that used for the departures.

*** The set of concurrent boxes is that defined by the differences, which may be a superset of that defined by the departures (see footnote under b).

**** The set of concurrent boxes is that defined by the differences, which may be a superset of that defined by the departures (see footnote under b).
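To relate footnote * to the "cosine-weighted" wording in the Appendix, the short Python check below compares the two weight definitions for boxes of equal latitudinal height. Since sin(c + h) - sin(c - h) = 2 cos(c) sin(h), the edge-based weight is a constant multiple of the cosine of the central latitude, so both give the same weighted average once normalized. The function names are hypothetical, and the sketch assumes box edges aligned on even latitudes between 62°S and 62°N.

```python
import math


def weight_sin_delta(lat_south, lat_north):
    """Difference of the sines of the box-edge latitudes (footnote *)."""
    return math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))


def weight_cos_centre(lat_south, lat_north):
    """Cosine of the central latitude of the box."""
    return math.cos(math.radians((lat_south + lat_north) / 2.0))


# South edges of the assumed 2-degree boxes from 62S to 62N.
south_edges = range(-62, 62, 2)
ratios = [
    weight_sin_delta(s, s + 2.0) / weight_cos_centre(s, s + 2.0)
    for s in south_edges
]

# Constant predicted by the identity, with half-height h = 1 degree.
expected_ratio = 2.0 * math.sin(math.radians(1.0))
print(min(ratios), max(ratios), expected_ratio)
```

The equivalence holds only when every box has the same height; it would not hold for grids with mixed box sizes or for boxes clipped at the edge of the region.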
U.S. National Oceanic and Atmospheric Administration hosts the icoads website. Document maintained by icoads@noaa.gov. Updated: May 11, 2016 19:34:47 UTC. http://www.icoads.noaa.gov/r1c_beta.html