Annual QA/QC Pipeline: 2025 Data

Published

April 22, 2026

Introduction

This document contains the complete data ingestion, QA/QC evaluation, flagging, and CDX export workflow for the 2025 Kenai River Baseline Water Quality Monitoring dataset.

The structure of this document follows the Data Evaluation Checklist provided by the Alaska Department of Environmental Conservation. Field observations that do not meet QA/QC standards are flagged before upload to the EPA Water Quality Exchange (WQX).

For the 2021 worked example (with full narrative and documentation), see the report repo: https://github.com/Kenai-Watershed-Forum/kenai-river-wqx


Year Configuration


Part A: Data Ingestion

2025-specific notes:

  • Spring sampling: April 30, 2025. Summer sampling: July 23, 2025.
  • SGS EDD delivered as XLSX (Sheet8) for both seasons. No ALS dissolved metals supplement this year.
  • No BTEX (8260D) analysis in spring 2025; BTEX analyzed at 4 sites in summer 2025.
  • TSS from SWWTP: both seasons read from the Updated_Formatting sheet. Spring source is KRWF TSS MONITORING 05-01-25.xlsx; summer is KRWF TSS MONITORING 07-25-25.xls (skip = 1 past title row). The original spring .xls (wide block format) is retained in the same directory for reference.
  • FC from SWWTP in standard XLS format (spring: skip = 11; summer: skip = 10). Summer time sampled stored as HHMM integer.
  • YSI ProQuatro / Hach turbidimeter data: single file covering both seasons, read in below.
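
The HHMM integer encoding noted above for the summer FC sample times needs converting before it can be combined with a date. The following is a minimal Python sketch of that conversion, not the pipeline's own code; the function name parse_hhmm is hypothetical.

```python
from datetime import time

def parse_hhmm(value: int) -> time:
    """Convert an HHMM integer such as 935 or 1410 to a datetime.time."""
    hours, minutes = divmod(int(value), 100)
    if not (0 <= hours <= 23 and 0 <= minutes <= 59):
        raise ValueError(f"not a valid HHMM value: {value!r}")
    return time(hour=hours, minute=minutes)

print(parse_hhmm(935))   # 09:35:00
print(parse_hhmm(1410))  # 14:10:00
```

Rejecting out-of-range values (for example 1275) surfaces data-entry errors at ingestion instead of producing a silently wrong time.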

SGS Lab Results

Lab QC rows written: 460 
Summer lab QC rows written: 436 
SGS field rows: 567  (spring: 233 | summer: 316 | trip blank expansion adds 2 x 9 rows)
SGS sites: 22 
SGS analytes: 20 

Fecal Coliform Results

FC rows: 46  (spring: 22 | summer: 24 )

Total Suspended Solids

TSS rows: 48  (spring: 24 | summer: 24 )

YSI ProQuatro / Turbidity Field Measurements

YSI rows: 222 
YSI sites: 22 
YSI parameters: Dissolved Oxygen, pH, Specific Conductance, Turbidity, Water Temperature 

Bind All Results into dat

Total dat rows: 883 
Analytes: 1,2-Dichloroethane-D4 (surr), 4-Bromofluorobenzene (surr), Arsenic, Benzene, Cadmium, Calcium, Chromium, Copper, Dissolved Oxygen, Ethylbenzene, Fecal Coliform, Iron, Lead, Magnesium, o-Xylene, P & M -Xylene, pH, Specific Conductance, Toluene, Toluene-d8 (surr), Total Nitrate/Nitrite-N, Total Phosphorus, Total suspended solids, Turbidity, Water Temperature, Xylenes (total), Zinc 
Sites: 22 
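
The row counts reported in Part A can be reconciled against the combined total. The sketch below repeats that arithmetic in Python using only the counts printed above; the variable names are illustrative.

```python
# Row counts as reported in Part A.
sgs_spring, sgs_summer, trip_blank_extra = 233, 316, 2 * 9
source_rows = {
    "SGS field": 567,
    "Fecal Coliform": 46,
    "Total Suspended Solids": 48,
    "YSI / turbidity": 222,
}
reported_total = 883

# The SGS count should equal spring + summer + trip blank expansion rows.
sgs_total = sgs_spring + sgs_summer + trip_blank_extra
assert sgs_total == source_rows["SGS field"]

# The combined table should contain every source row exactly once.
combined = sum(source_rows.values())
assert combined == reported_total
print(sgs_total, combined)  # 567 883
```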

Part B: WQX Formatting

Provisional Results (Prior to QA/QC Review)

The following table summarizes results after WQX formatting, before QA/QC evaluation.

            characteristic_name result_analytical_method_id activity_start_date  n
1                       Arsenic                       200.8          2025-04-30 15
2                       Cadmium                       200.8          2025-04-30 15
3                       Calcium                       200.8          2025-04-30 23
4                      Chromium                       200.8          2025-04-30 15
5                        Copper                       200.8          2025-04-30 30
6              Dissolved Oxygen                       360.1          2025-04-30 18
7                Fecal Coliform                       9222D          2025-04-30 22
8                          Iron                       200.8          2025-04-30 23
9                          Lead                       200.8          2025-04-30 15
10                    Magnesium                       200.8          2025-04-30 23
11         Specific Conductance                       120.1          2025-04-30 21
12      Total Nitrate/Nitrite-N                 4500-NO3(F)          2025-04-30 22
13             Total Phosphorus                    4500-P-E          2025-04-30 22
14       Total suspended solids                      2540-D          2025-04-30 24
15                    Turbidity                       180.1          2025-04-30 21
16            Water Temperature                       170.1          2025-04-30 21
17                         Zinc                       200.8          2025-04-30 30
18                           pH                       150.1          2025-04-30 21
19 1,2-Dichloroethane-D4 (surr)                       8260D          2025-07-23  8
20  4-Bromofluorobenzene (surr)                       8260D          2025-07-23  8
21                      Arsenic                       200.8          2025-07-23 17
22                      Benzene                       8260D          2025-07-23  8
23                      Cadmium                       200.8          2025-07-23 17
24                      Calcium                       200.8          2025-07-23 26
25                     Chromium                       200.8          2025-07-23 17
26                       Copper                       200.8          2025-07-23 34
27             Dissolved Oxygen                       360.1          2025-07-23 24
28                 Ethylbenzene                       8260D          2025-07-23  8
29               Fecal Coliform                       9222D          2025-07-23 24
30                         Iron                       200.8          2025-07-23 26
31                         Lead                       200.8          2025-07-23 17
32                    Magnesium                       200.8          2025-07-23 26
33                P & M -Xylene                       8260D          2025-07-23  8
34         Specific Conductance                       120.1          2025-07-23 24
35                      Toluene                       8260D          2025-07-23  8
36            Toluene-d8 (surr)                       8260D          2025-07-23  8
37      Total Nitrate/Nitrite-N                 4500-NO3(F)          2025-07-23 24
38             Total Phosphorus                    4500-P-E          2025-07-23 24
39       Total suspended solids                      2540-D          2025-07-23 24
40                    Turbidity                       180.1          2025-07-23 24
41            Water Temperature                       170.1          2025-07-23 24
42              Xylenes (total)                       8260D          2025-07-23  8
43                         Zinc                       200.8          2025-07-23 34
44                     o-Xylene                       8260D          2025-07-23  8
45                           pH                       150.1          2025-07-23 24

Part C: QA/QC Checklist

Prior to upload to the EPA WQX, all water quality data is checked against a standard Data Evaluation Checklist developed in coordination with the Alaska Department of Environmental Conservation.

Pre-Database

Overall Project Success

Review in progress as of April 22, 2026.

1.) Were the appropriate analytical methods used for all parameters?


2.) Were appropriate QA/QC procedures followed in the field and laboratory?


3.) Were the appropriate number of samples collected?

# A tibble: 42 × 4
# Groups:   result_analytical_method_id, activity_start_date, activity_type
#   [42]
   result_analytical_method…¹ activity_start_date activity_type actual_results_n
   <chr>                      <date>              <chr>                    <int>
 1 120.1                      2025-04-30          Field Msr/Obs               19
 2 120.1                      2025-04-30          Quality Cont…                2
 3 120.1                      2025-07-23          Field Msr/Obs               22
 4 120.1                      2025-07-23          Quality Cont…                2
 5 150.1                      2025-04-30          Field Msr/Obs               19
 6 150.1                      2025-04-30          Quality Cont…                2
 7 150.1                      2025-07-23          Field Msr/Obs               22
 8 150.1                      2025-07-23          Quality Cont…                2
 9 170.1                      2025-04-30          Field Msr/Obs               19
10 170.1                      2025-04-30          Quality Cont…                2
# ℹ 32 more rows
# ℹ abbreviated name: ¹​result_analytical_method_id


4.) Do the laboratory reports provide results for all sites and parameters?


5.) Is a copy of the Chain of Custody included with the laboratory reports?


6.) Do the laboratory reports match the Chain of Custody and requested methods throughout?


7.) Is the number of samples on the laboratory reports the same as on the Chain of Custody?


8.) Was all supporting info provided in the laboratory report, such as reporting limits?


9.) Are site names, dates, and times correct and as expected?

   monitoring_location_id activity_start_date
1                10000002          2025-04-30
2                10000002          2025-07-23
3                10000005          2025-04-30
4                10000005          2025-07-23
5                10000008          2025-04-30
6                10000008          2025-07-23
7                10000015          2025-04-30
8                10000015          2025-07-23
9                10000016          2025-04-30
10               10000016          2025-07-23
11               10000017          2025-04-30
12               10000017          2025-07-23
13               10000018          2025-04-30
14               10000018          2025-07-23
15               10000020          2025-04-30
16               10000020          2025-07-23
17               10000021          2025-04-30
18               10000021          2025-07-23
19               10000022          2025-04-30
20               10000022          2025-07-23
21               10000023          2025-04-30
22               10000023          2025-07-23
23               10000024          2025-04-30
24               10000024          2025-07-23
25               10000025          2025-04-30
26               10000025          2025-07-23
27               10000026          2025-04-30
28               10000026          2025-07-23
29               10000027          2025-04-30
30               10000027          2025-07-23
31               10000028          2025-04-30
32               10000028          2025-07-23
33               10000029          2025-04-30
34               10000029          2025-07-23
35               10000030          2025-04-30
36               10000030          2025-07-23
37               10000031          2025-04-30
38               10000031          2025-07-23
39               10000032          2025-04-30
40               10000032          2025-07-23
41               10000424          2025-04-30
42               10000424          2025-07-23
43               10000425          2025-04-30
44               10000425          2025-07-23


10.) Were there any issues with instrument calibration?


11.) Did the instruments perform as expected?


12.) Was instrument calibration performed according to the QAPP?


13.) Was instrument verification during the field season performed according to the QAPP?


14.) Were instrument calibration verification logs kept?


15.) Do instrument data file site IDs, timestamps, and filenames match?


16.) Is any in-situ field data rejected and why?


17.) Were preservation, hold time, and temperature requirements met?

Hold time failures: 0 
# A tibble: 0 × 15
# ℹ 15 variables: sample <chr>, epa_analysis_id <chr>, analyte <chr>,
#   collect_date <date>, collect_time <time>, lab_name <chr>, rec_date <date>,
#   rec_time <time>, activity_datetime <dttm>, rec_datetime <dttm>,
#   hold_time_hours <dbl>, result_analytical_method_id <chr>,
#   max_holding_time_text <chr>, max_holding_time_hours <dbl>,
#   hold_time_pass <chr>
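
The column names above suggest that hold time is the elapsed time between collection and lab receipt, compared against a per-method maximum. A minimal Python sketch of that comparison follows. The "Y"/"N" coding, the inclusive limit, the timestamps, and the 8-hour limit are all assumptions chosen for illustration, not values from this dataset or the QAPP.

```python
from datetime import datetime

def hold_time_check(collected: datetime, received: datetime, max_hours: float):
    """Return (hold_time_hours, pass_flag) for one sample."""
    hold_hours = (received - collected).total_seconds() / 3600
    return hold_hours, ("Y" if hold_hours <= max_hours else "N")

# Hypothetical sample: timestamps and the 8-hour limit are illustrative only.
hours, flag = hold_time_check(
    datetime(2025, 4, 30, 10, 0),
    datetime(2025, 4, 30, 16, 30),
    max_hours=8,
)
print(hours, flag)  # 6.5 Y
```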


18.) Are dissolved metal quantities less than total metal quantities?

Copper — rows with total >= dissolved: 24 of 24 
Zinc — rows with total >= dissolved: 14 of 24 
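
A count like the ones above can be produced by pairing each total result with the dissolved result from the same sample and comparing them. The Python sketch below shows the comparison on hypothetical concentration pairs; the values and the function name are illustrative and are not drawn from the dataset.

```python
# Hypothetical (total, dissolved) concentration pairs in ug/L, for illustration only.
pairs = [(4.2, 3.9), (1.1, 1.1), (0.8, 1.3)]

def count_consistent(pairs):
    """Count pairs where the total concentration is at least the dissolved one."""
    return sum(1 for total, dissolved in pairs if total >= dissolved)

consistent = count_consistent(pairs)
print(f"rows with total >= dissolved: {consistent} of {len(pairs)}")
# rows with total >= dissolved: 2 of 3
```

Pairs that fail the comparison (dissolved greater than total) are the ones that would warrant review.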


19.) Are the duplicate sample(s) RPD within range described in QAPP?

RPD-eligible pairs: 34 
Pairs exceeding QAPP threshold: 6 
# A tibble: 6 × 13
  monitoring_location_id activity_start_date characteristic_name    
                   <dbl> <chr>               <chr>                  
1               10000022 2025-04-30          Zinc                   
2               10000022 2025-04-30          Total Nitrate/Nitrite-N
3               10000018 2025-04-30          Fecal Coliform         
4               10000018 2025-04-30          Total suspended solids 
5               10000022 2025-04-30          Total suspended solids 
6               10000022 2025-04-30          Turbidity              
# ℹ 10 more variables: result_detection_condition <chr>, result_unit <chr>,
#   result_detection_limit_type_1 <chr>, result_detection_limit_value_1 <dbl>,
#   result_detection_limit_unit_1 <chr>,
#   quality_control_field_replicate_msr_obs <dbl>, field_msr_obs <dbl>,
#   rpd_eligible <lgl>, rpd_pct <dbl>, threshold <dbl>
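
Relative percent difference (RPD) is conventionally the absolute difference between the primary and replicate results divided by their mean, expressed as a percentage. The Python sketch below uses that standard formula; the 20% threshold and the example values are placeholders, since the actual thresholds come from the QAPP and are not shown here.

```python
def rpd_pct(primary: float, replicate: float):
    """Relative percent difference between a primary result and its replicate."""
    mean = (primary + replicate) / 2
    if mean == 0:
        return None  # RPD is undefined when both results are zero
    return abs(primary - replicate) / mean * 100

def exceeds_threshold(primary: float, replicate: float, threshold: float) -> bool:
    """True when the pair's RPD is defined and above the threshold."""
    value = rpd_pct(primary, replicate)
    return value is not None and value > threshold

# Hypothetical pair and threshold, for illustration only.
print(round(rpd_pct(10.0, 12.0), 2))         # 18.18
print(exceeds_threshold(10.0, 12.0, 20.0))   # False
print(exceeds_threshold(10.0, 15.0, 20.0))   # True
```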


20.) Were there any laboratory discrepancies, errors, data qualifiers, or QC failures?

Matrix spike recovery failures: 5 
  collect_date sample_type result                 analyte analytical_method
1   2025-04-30          MS   5.93 Total Nitrate/Nitrite-N    SM21 4500NO3-F
2   2025-04-29          MS   6.34 Total Nitrate/Nitrite-N    SM21 4500NO3-F
3   2025-04-29         MSD   6.27 Total Nitrate/Nitrite-N    SM21 4500NO3-F
4   2025-05-08          MS   9.66 Total Nitrate/Nitrite-N    SM21 4500NO3-F
5   2025-05-08         MSD   9.43 Total Nitrate/Nitrite-N    SM21 4500NO3-F
  resultflag percent_recovered rec_limit_low rec_limit_high sample_rpd
1          =             114.0            90            110         NA
2          =             125.0            90            110         NA
3          =             124.0            90            110        1.1
4          =              88.0            90            110         NA
5          =              83.5            90            110        2.4
  rpd_limit_low rpd_limit_high loq  lod sample_condition rec_limit_pass
1            NA             NA 0.2 0.15               NA              N
2            NA             NA 0.2 0.15               NA              N
3             0             25 0.2 0.15               NA              N
4            NA             NA 0.2 0.15               NA              N
5             0             25 0.2 0.15               NA              N
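
Each failure above is a spike whose percent recovery falls outside its recovery limits. The Python sketch below reapplies that check to the five recoveries listed in the table; treating the limits as inclusive is an assumption, and the function name simply mirrors the rec_limit_pass column.

```python
# (sample_type, percent_recovered, rec_limit_low, rec_limit_high) from the table above.
spikes = [
    ("MS", 114.0, 90, 110),
    ("MS", 125.0, 90, 110),
    ("MSD", 124.0, 90, 110),
    ("MS", 88.0, 90, 110),
    ("MSD", 83.5, 90, 110),
]

def rec_limit_pass(pct: float, low: float, high: float) -> str:
    """Return "Y" when recovery is within limits (assumed inclusive), else "N"."""
    return "Y" if low <= pct <= high else "N"

failures = [s for s in spikes if rec_limit_pass(s[1], s[2], s[3]) == "N"]
print(f"Matrix spike recovery failures: {len(failures)}")
# Matrix spike recovery failures: 5
```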


21.) Is any laboratory data rejected and why?


22.) Review raw data files as received. Document changes and corrections.


23.) Is the dataset complete?


24.) Was data collected representative of environmental conditions?


25.) Does project meet Completeness Measure A criteria?

From the QAPP, CMA = primary samples collected / usable samples submitted (goal: 85%).

CMA (overall): 100% (goal: 85%)
# A tibble: 27 × 6
# Rowwise:  result_analytical_method_id, characteristic_name
   result_analytical_met…¹ characteristic_name flag_N flag_Y total_samples   cma
   <chr>                   <chr>                <int>  <int>         <int> <dbl>
 1 120.1                   Specific Conductan…     45     NA            45     1
 2 150.1                   pH                      45     NA            45     1
 3 170.1                   Water Temperature       45     NA            45     1
 4 180.1                   Turbidity               45     NA            45     1
 5 200.8                   Arsenic                 32     NA            32     1
 6 200.8                   Cadmium                 32     NA            32     1
 7 200.8                   Calcium                 49     NA            49     1
 8 200.8                   Chromium                32     NA            32     1
 9 200.8                   Copper                  64     NA            64     1
10 200.8                   Iron                    49     NA            49     1
# ℹ 17 more rows
# ℹ abbreviated name: ¹​result_analytical_method_id
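
The cma column above is consistent with dividing the unflagged count (flag_N) by the total sample count, with a missing flag_Y treated as zero. The Python sketch below reproduces that arithmetic for two rows of the table; this is an inference from the printed columns, not the pipeline's confirmed formula.

```python
def completeness(flag_n, flag_y):
    """Fraction of unflagged results; None stands in for NA and counts as zero."""
    unflagged = flag_n or 0
    flagged = flag_y or 0
    total = unflagged + flagged
    return unflagged / total if total else None

# (characteristic_name, flag_N, flag_Y) taken from the table above.
rows = [("Specific Conductance", 45, None), ("Copper", 64, None)]
for name, flag_n, flag_y in rows:
    print(name, completeness(flag_n, flag_y))
# Specific Conductance 1.0
# Copper 1.0
```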


26.) Does project meet Completeness Measure B criteria?

From the QAPP, CMB = unflagged results / planned results (goal: 60%).

CMB (overall): 100% (goal: 60%)
# A tibble: 42 × 7
# Groups:   result_analytical_method_id, activity_start_date, activity_type
#   [42]
   result_analytical_method_id activity_start_date activity_type   flag_N flag_Y
   <chr>                       <date>              <chr>            <int>  <int>
 1 120.1                       2025-04-30          Field Msr/Obs       19     NA
 2 120.1                       2025-04-30          Quality Contro…      2     NA
 3 120.1                       2025-07-23          Field Msr/Obs       22     NA
 4 120.1                       2025-07-23          Quality Contro…      2     NA
 5 150.1                       2025-04-30          Field Msr/Obs       19     NA
 6 150.1                       2025-04-30          Quality Contro…      2     NA
 7 150.1                       2025-07-23          Field Msr/Obs       22     NA
 8 150.1                       2025-07-23          Quality Contro…      2     NA
 9 170.1                       2025-04-30          Field Msr/Obs       19     NA
10 170.1                       2025-04-30          Quality Contro…      2     NA
# ℹ 32 more rows
# ℹ 2 more variables: expected_results_n <int>, cmb <dbl>


27.) Was the QA officer consulted for any data concerns?


28.) Are the correct monitoring locations associated with the project?


29.) Are the QAPP and other supporting documents attached?


30.) Is all project metadata correct?


31.) Is the organization ID correct?

Expected organization ID: Kenai_WQX


32.) Are the time zones consistent and correct?

Time zone applied: AKDT
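
Both sampling dates (April 30 and July 23, 2025) fall within daylight saving time, so AKDT corresponds to a fixed offset of UTC-8. The Python sketch below attaches that offset to a local timestamp; the 09:35 sample time is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Alaska Daylight Time is eight hours behind UTC.
AKDT = timezone(timedelta(hours=-8), "AKDT")

# Hypothetical local sample time on the summer sampling date.
local = datetime(2025, 7, 23, 9, 35, tzinfo=AKDT)
print(local.isoformat())                           # 2025-07-23T09:35:00-08:00
print(local.astimezone(timezone.utc).isoformat())  # 2025-07-23T17:35:00+00:00
```

A fixed offset is only safe because no sampling occurred during standard time; a dataset spanning winter months would need a full time zone database entry instead.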


33.) Are all media types included?

Activity media types in dataset:
[1] "Water"


34.) Check Sample Collection, Preparation and Preservation Methods.

Result sample fractions:
[1] "Dissolved"  "Unfiltered" "Total"      NA           "None"      
[6] "Suspended" 

Chemical preservatives:
NULL

Container types:
[1] "Plastic Bottle" NA              


35.) Are all expected activity types present and are QC samples correctly identified?

Activity types:
[1] "Field Msr/Obs"                          
[2] "Quality Control Field Replicate Msr/Obs"
[3] "Quality Control Sample-Trip Blank"      


36.) Is the Activity media subdivision filled in?

[1] "Surface Water"


37.) For Water activity media, is the relative depth filled in?

[1] 15


38.) Is the number of results for each Characteristic correct?

# A tibble: 27 × 2
   characteristic_name              n
   <chr>                        <int>
 1 1,2-Dichloroethane-D4 (surr)     8
 2 4-Bromofluorobenzene (surr)      8
 3 Arsenic                         32
 4 Benzene                          8
 5 Cadmium                         32
 6 Calcium                         49
 7 Chromium                        32
 8 Copper                          64
 9 Dissolved Oxygen                42
10 Ethylbenzene                     8
# ℹ 17 more rows


39.) Does the range of result values make sense?

# A tibble: 27 × 5
   characteristic_name          result_unit    min       max     n
   <chr>                        <chr>        <dbl>     <dbl> <int>
 1 1,2-Dichloroethane-D4 (surr) %           105       118        8
 2 4-Bromofluorobenzene (surr)  %            98.6     104        8
 3 Arsenic                      ug/L          0         7.57    32
 4 Benzene                      ug/L          0         0        8
 5 Cadmium                      ug/L          0         0       32
 6 Calcium                      ug/L          0    158000       49
 7 Chromium                     ug/L          0         0       32
 8 Copper                       ug/L          0        48.1     64
 9 Dissolved Oxygen             mg/L          6.83     17.0     42
10 Ethylbenzene                 ug/L          0         0        8
# ℹ 17 more rows


40.) Are units correct and consistent for each parameter?

Unique unit values:
[1] "ug/L"      "mg/L"      "%"         "MPN/100ml" "mg/l"      "uS/cm"    
[7] "NTU"       "deg C"     "None"     
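
The unit list above contains both "mg/L" and "mg/l", which differ only in letter case. The Python sketch below shows one way to detect such case variants automatically; the function name is hypothetical, and which spelling to keep is a separate decision.

```python
from collections import defaultdict

# Unit values as printed above.
units = ["ug/L", "mg/L", "%", "MPN/100ml", "mg/l", "uS/cm", "NTU", "deg C", "None"]

def case_variants(values):
    """Group values that are identical apart from letter case."""
    groups = defaultdict(set)
    for value in values:
        groups[value.casefold()].add(value)
    return {key: sorted(spellings) for key, spellings in groups.items() if len(spellings) > 1}

print(case_variants(units))  # {'mg/l': ['mg/L', 'mg/l']}
```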


41.) Are detection limits and laboratory qualifiers included?

Result qualifiers:
[1] "U" "J" "="

Detection limit types:
[1] "Limit of Quantitation"


42.) Are results in trip blanks and/or field blanks above detection limits?

Trip blank results: 36 
Detections in trip blanks (result_value > 0): 12 
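
The count above uses result_value > 0 as the detection test, while the checklist question asks about results above detection limits. The Python sketch below computes both counts side by side on hypothetical trip blank rows; the analytes, values, and limits are illustrative only.

```python
# Hypothetical trip blank rows: (analyte, result_value, detection_limit), in ug/L.
trip_blanks = [
    ("Benzene", 0.0, 0.4),
    ("Toluene", 0.2, 1.0),
    ("Ethylbenzene", 1.5, 1.0),
]

# Any nonzero result, matching the result_value > 0 test above.
nonzero = sum(1 for _, value, _ in trip_blanks if value > 0)

# Stricter test: result exceeds the row's own detection limit.
above_limit = sum(1 for _, value, limit in trip_blanks if value > limit)

print(f"Trip blank results: {len(trip_blanks)}")
print(f"Detections (result_value > 0): {nonzero}")
print(f"Detections above detection limit: {above_limit}")
```

The two counts can differ when a blank reports a small value below its limit, as the Toluene row does here.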

Part D: Flag + CDX Export

CDX Upload Files

The following files are required for upload to the EPA Water Quality Exchange through the Central Data Exchange (CDX) at https://cdx.epa.gov/. Any file marked NOT FOUND must be generated before upload:

results_activities.csv : 883 rows
project.csv : NOT FOUND -- run generate_cdx_export.R
station.csv : NOT FOUND -- run generate_cdx_export.R
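
A status listing like the one above can be produced by checking whether each export file exists and counting its data rows. The Python sketch below illustrates the idea; the function name and the assumption that each CSV has a single header row are mine, and it does not reproduce generate_cdx_export.R.

```python
import csv
from pathlib import Path

def export_status(directory, filenames):
    """Map each filename to its data row count, or None if the file is missing."""
    status = {}
    for name in filenames:
        path = Path(directory) / name
        if not path.is_file():
            status[name] = None
            continue
        with path.open(newline="") as handle:
            rows = sum(1 for _ in csv.reader(handle))
        status[name] = max(rows - 1, 0)  # assumes one header row
    return status

# Check the current directory for the three expected export files.
names = ["results_activities.csv", "project.csv", "station.csv"]
for name, count in export_status(".", names).items():
    print(f"{name} : {'NOT FOUND' if count is None else f'{count} rows'}")
```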