Reproduce the intermediate outputs

Required steps to modify the script
  • Change the file path in “setwd()” and the path for the output
  • Fix the file names, e.g., sample.csv -> Sample.csv and Non-EVOS SINs.csv -> Non-EVOS_SINs.csv
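The file-name fixes above matter on case-sensitive file systems, where sample.csv and Sample.csv are different files. As a minimal sketch, a case-insensitive lookup can report the exact on-disk name before the script reads it. The helper resolve_file() and the temporary directory are illustrative assumptions, not part of the original script.

```r
# Hypothetical helper: find a file whose name matches `name` ignoring case
# and return its actual on-disk path.
resolve_file <- function(name, dir = ".") {
  candidates <- list.files(dir)
  hit <- candidates[tolower(candidates) == tolower(name)]
  if (length(hit) == 0) {
    stop("No file matching '", name, "' in ", dir)
  }
  file.path(dir, hit[1])
}

# Demonstration in a throwaway directory (assumed layout, for illustration only).
demo_dir <- file.path(tempdir(), "resolve_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:3), file.path(demo_dir, "Sample.csv"),
          row.names = FALSE)

# Ask for the lower-case name; get back the path with the real spelling.
resolved <- resolve_file("sample.csv", demo_dir)
basename(resolved)
```

Passing the resolved path to read.csv() avoids editing the name by hand, although renaming the file once, as the steps above do, keeps the script simpler.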
A patch recording the updates
system("diff -u ./Total_PAH_and_Alkanes_GoA_Hydrocarbons_Clean.R ./new_Total_PAH_and_Alkanes_GoA_Hydrocarbons_Clean.R > ./merge.patch")
system("tab2space -unix -t2 ./merge.patch", intern = TRUE)
## character(0)
Apply the patch to the script
system("patch -p0 < ./merge.patch", intern = TRUE)
## character(0)
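The diff and patch steps can be checked with a small round trip. diff exits with status 1 when the files differ, so a non-zero status from the diff call is expected and is not an error. The sketch below assumes the diff and patch command-line tools are on the PATH; the file names old.R, new.R, and demo.patch are hypothetical stand-ins for the real scripts.

```r
# Round trip in a temporary directory: create a patch with diff, apply it with patch.
work <- file.path(tempdir(), "patch_demo")
dir.create(work, showWarnings = FALSE)
old_file   <- file.path(work, "old.R")       # hypothetical file names
new_file   <- file.path(work, "new.R")
patch_file <- file.path(work, "demo.patch")

writeLines(c('x <- read.csv("sample.csv")', "str(x)"), old_file)
writeLines(c('x <- read.csv("Sample.csv")', "str(x)"), new_file)

have_tools <- all(nzchar(Sys.which(c("diff", "patch"))))
if (have_tools) {
  # diff exits with 0 (identical), 1 (differences found), or 2 (trouble).
  diff_status <- system2("diff",
                         c("-u", shQuote(old_file), shQuote(new_file)),
                         stdout = patch_file)
  if (diff_status > 1) stop("diff failed with status ", diff_status)

  # Name the target explicitly so patch does not have to guess it from the header.
  patch_status <- system2("patch",
                          c(shQuote(old_file), shQuote(patch_file)),
                          stdout = FALSE)
  if (patch_status != 0) stop("patch failed with status ", patch_status)

  # After patching, the old file should match the new one line for line.
  print(identical(readLines(old_file), readLines(new_file)))
}
```

Checking the exit status this way makes a failed patch visible, whereas system(..., intern = TRUE) only returns whatever the command printed.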
Run the script to reproduce the outputs
#setwd(".")
source("Total_PAH_and_Alkanes_GoA_Hydrocarbons_Clean.R")
## 'data.frame':    16960 obs. of  72 variables:
##  $ Funding     : Factor w/ 1 level "EVOSTC": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Sin         : int  -600 -600 -557 -557 -900 20120312 20120313 20120310 20120309 20120308 ...
##  $ type        : Factor w/ 6 levels "","-552","-553",..: 6 6 6 6 6 1 1 1 1 1 ...
##  $ Rep         : int  1 2 1 2 1 1 1 1 1 1 ...
##  $ LAB         : Factor w/ 4 levels "ABL","GERG","NECD",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ QCbatch     : Factor w/ 1071 levels "20120827MZ","20120920MZ",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ strMAT      : Factor w/ 17 levels "","part","part, wat",..: 7 7 7 7 7 7 7 7 7 7 ...
##  $ LabSam      : Factor w/ 2164 levels "","-558","-877",..: 2161 2161 2157 2159 2160 1982 1983 1980 1979 1978 ...
##  $ Vol         : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ Proportion  : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ DryWt       : num  1 1 0.00678 0.00678 1 ...
##  $ WetWt       : num  1 1 0.00678 0.00678 1 ...
##  $ AnalysisType: Factor w/ 37 levels "","BLANK","COAL",..: 16 16 19 19 15 11 11 11 11 11 ...
##  $ Catno       : Factor w/ 319 levels "","6100","6101",..: 222 222 222 222 222 319 319 319 319 319 ...
##  $ NaphD8      : num  92.5 92.5 88.7 86.7 108.4 ...
##  $ Acend10     : num  100.6 101.1 94.5 94.2 116.1 ...
##  $ Phend10     : num  101.1 101 94.9 95.2 116.3 ...
##  $ Anthra10    : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ Banth12     : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ Chryd12     : num  104.5 104.8 94.6 93.5 114.3 ...
##  $ Benad12     : num  114 114 115 113 133 ...
##  $ Peryd12     : num  106.5 109.5 89.3 91.8 106.7 ...
##  $ Units       : Factor w/ 7 levels "","ng","ng/device",..: 4 4 5 5 4 5 5 5 5 5 ...
##  $ Naph        : num  247 247 124275 136809 0 ...
##  $ Menap2      : num  180 181 477226 487966 0 ...
##  $ MENAP1      : num  160 161 341500 345107 0 ...
##  $ DIMETH      : num  138 137 501639 503703 0 ...
##  $ C2NAPH      : num  412 412 1784545 1737098 0 ...
##  $ TRIMETH     : num  0 0 237407 239645 0 ...
##  $ C3NAPH      : num  0 0 1602999 1701227 0 ...
##  $ C4NAPH      : num  0 0 961949 990175 0 ...
##  $ BIPHENYL    : num  114 114 23663 23529 111 ...
##  $ ACENTHY     : num  138 134 15476 0 0 ...
##  $ ACENTHE     : num  118 118 16798 16649 0 ...
##  $ FLUORENE    : num  106 107 32643 33024 0 ...
##  $ C1FLUOR     : num  0 0 111440 113162 0 ...
##  $ C2FLUOR     : num  0 0 179608 180330 0 ...
##  $ C3FLUOR     : num  0 0 209519 199383 0 ...
##  $ C4FLUOR     : num  0 0 25311 25879 0 ...
##  $ DITHIO      : num  96.1 96.8 48726.8 45625.9 0 ...
##  $ C1DITHIO    : num  0 0 181422 183124 0 ...
##  $ C2DITHIO    : num  0 0 293077 296922 0 ...
##  $ C3DITHIO    : num  0 0 258189 258230 0 ...
##  $ C4DITHIO    : num  0 0 36010 37554 0 ...
##  $ PHENANTH    : num  245 245 97646 99282 0 ...
##  $ MEPHEN1     : num  202 205 97291 97390 0 ...
##  $ C1PHENAN    : num  978 980 470442 477279 0 ...
##  $ C2PHENAN    : num  0 167 733661 734837 0 ...
##  $ C3PHENAN    : num  0 0 502830 539533 0 ...
##  $ C4PHENAN    : num  0 0 230727 228864 0 ...
##  $ ANTHRA      : num  83.7 84.1 3669.9 2878 0 ...
##  $ FLUORANT    : num  190 189 2667 2660 0 ...
##  $ PYRENE      : num  208 208 8622 8475 0 ...
##  $ C1FLUORA    : num  0 0 40463 38925 0 ...
##  $ C2FLUORA    : num  0 0 83041 87831 0 ...
##  $ C3FLUORA    : num  0 0 65452 59747 0 ...
##  $ C4FLUORA    : num  0 0 46432 45956 0 ...
##  $ BENANTH     : num  107 107 4676 3706 0 ...
##  $ CHRYSENE    : num  104 104 11122 11243 0 ...
##  $ C1CHRYS     : num  89 88.6 26538.6 28080.2 0 ...
##  $ C2CHRYS     : num  0 0 53113 47651 0 ...
##  $ C3CHRYS     : num  0 0 22029 21114 0 ...
##  $ C4CHRYS     : num  0 0 7310 8871 0 ...
##  $ BENZOBFL    : num  193 195 834 1026 0 ...
##  $ BENZOKFL    : num  74.5 74.7 130.6 136.6 0 ...
##  $ BENEPY      : num  108 109 2575 3014 0 ...
##  $ BENAPY      : num  112 113 632 883 0 ...
##  $ PERYLENE    : num  103 102 33971 33769 0 ...
##  $ INDENO      : num  117 113 0 0 0 ...
##  $ DIBENZ      : num  111 112 302 350 0 ...
##  $ BENZOP      : num  142 132 1335 1611 0 ...
##  $ comment     : logi  NA NA NA NA NA NA ...
## 'data.frame':    15955 obs. of  53 variables:
##  $ Funding     : Factor w/ 1 level "EVOSTC": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Sin         : int  -902 -902 -902 -902 -902 -902 -902 -902 -902 -902 ...
##  $ type        : Factor w/ 6 levels "","-552","-553",..: 6 6 6 6 6 6 6 6 6 6 ...
##  $ Rep         : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ LAB         : Factor w/ 3 levels "ABL","GERG","NECD": 1 1 1 1 1 1 1 1 1 1 ...
##  $ QCbatch     : Factor w/ 1013 levels "20120827MZ","20120920MZ",..: 1006 145 1004 1003 142 1000 991 161 175 100 ...
##  $ strMAT      : Factor w/ 13 levels "","part","part, wat",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LabSam      : Factor w/ 1458 levels "","-99","1","1005303",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Vol         : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ Proportion  : logi  NA NA NA NA NA NA ...
##  $ DryWt       : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ WetWt       : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ AnalysisType: Factor w/ 27 levels "","Coal","COAL",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Catno       : Factor w/ 311 levels "","6100","6101",..: 144 144 144 144 144 144 144 144 144 144 ...
##  $ C12d26      : num  15.4 40.9 44.4 52.3 56.3 ...
##  $ C16d34      : num  45.2 68.9 77.8 62.3 64.8 ...
##  $ C20d42      : num  79.7 88.6 89.2 79.9 79.1 ...
##  $ C24d50      : num  87.8 92.2 86.9 84.7 82 ...
##  $ C30d64      : num  88.3 96.2 86.1 84.3 70.1 ...
##  $ Units       : Factor w/ 6 levels "","ng","ng/g",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ C9ALK       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ C10ALK      : num  5799 9024 9399 9067 9731 ...
##  $ C11ALK      : num  7115 9666 9279 9244 9789 ...
##  $ C12ALK      : num  10957 11305 10176 10265 10754 ...
##  $ C13ALK      : num  15396 12642 10918 10936 11227 ...
##  $ C14ALK      : num  6703 8743 7349 9347 9903 ...
##  $ C15ALK      : num  8123 9500 8664 9265 9797 ...
##  $ C16ALK      : num  11290 11966 10943 10940 11554 ...
##  $ C17ALK      : num  14662 13658 12383 12285 13160 ...
##  $ PRISTANE    : num  15357 14620 12439 12720 13234 ...
##  $ C18ALK      : num  10570 12613 12216 11400 12166 ...
##  $ PHYTANE     : num  51.9 83.7 10.6 31 99.4 ...
##  $ C19ALK      : num  10422 11554 10845 10516 11108 ...
##  $ C20ALK      : num  10096 10536 9709 9759 10198 ...
##  $ C21ALK      : num  10345 10350 9295 9685 10083 ...
##  $ C22ALK      : num  10735 11262 10937 10703 10884 ...
##  $ C23ALK      : num  9798 10094 9605 9596 9805 ...
##  $ C24ALK      : num  10298 10559 9773 9919 10220 ...
##  $ C25ALK      : num  9695 10025 9026 9288 9635 ...
##  $ C26ALK      : num  9627 10042 8828 9123 9512 ...
##  $ C27ALK      : num  4097 4298 3642 3791 4000 ...
##  $ C28ALK      : num  10516 10852 10689 10838 10926 ...
##  $ C29ALK      : num  9499 9863 9407 9448 9671 ...
##  $ C30ALK      : num  10179 10519 9718 9672 10053 ...
##  $ C31ALK      : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ C32ALK      : num  10168 10027 9036 8612 8990 ...
##  $ C33ALK      : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ C34ALK      : num  10776 10199 8883 8425 5148 ...
##  $ C35ALK      : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ C36ALK      : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ TOTALKANES  : num  242657 254894 233495 235218 241905 ...
##  $ UCM         : num  7942 0 0 0 0 ...
##  $ comment     : Factor w/ 6 levels "","There were 2 replicate 1 entries.  This is an arbitrary replicate 2",..: 1 1 1 1 1 1 1 1 1 1 ...
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:plyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
file.exists("rp_Total_Aromatic_Alkanes_PWS.csv")
## [1] TRUE
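file.exists() only confirms that the file was written. If a copy of the original intermediate output is available, the reproduced file can also be compared against it. The following is a minimal sketch; the variables reference and reproduced are hypothetical paths, and temporary files with made-up values stand in for the real CSVs.

```r
# Compare a reproduced CSV with a reference copy.
reference  <- tempfile(fileext = ".csv")   # hypothetical: original output
reproduced <- tempfile(fileext = ".csv")   # hypothetical: reproduced output
demo <- data.frame(Sin = c(-600L, -557L), Naph = c(247, 124275))
write.csv(demo, reference,  row.names = FALSE)
write.csv(demo, reproduced, row.names = FALSE)

# Byte-level check: identical checksums mean identical files.
sums <- tools::md5sum(c(reference, reproduced))
same_bytes <- unname(sums[1] == sums[2])

# Content-level check: compares the parsed tables, so it tolerates
# formatting differences such as line endings.
same_content <- isTRUE(all.equal(read.csv(reference), read.csv(reproduced)))

c(same_bytes = same_bytes, same_content = same_content)
```

The checksum is the stricter test; the all.equal() comparison is useful when the files differ only in formatting.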

Reproduce the final results (analysis outputs)

Required steps to modify the script
  • Set the path for the CSV file that the script reads (e.g., using setwd())
  • Add Coordinating Node (CN) and Member Node (MN) information to connect to the repository
  • Fix unused object “hcdb2” -> “hcdb”
  • Filter out null values in the “matrix” column of the dataset so that they are not plotted in the results
  • Set color for the plot in the first output
  • (Optional) Change the titles of the artifacts to match the originals
  • Add “ggsave()” to save each plot to its own file
  • Write the binary data, which is a shapefile, to a file, rename the file to .zip, and then unzip it
  • Fix the incorrect library name “rColorBrewer” -> “RColorBrewer”
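Two of the steps above can be sketched in base R. This is an illustration under stated assumptions, not the code in the patch: the data frame samples and the helper extract_zip_bytes() are hypothetical, and "null" is assumed to mean NA or an empty string.

```r
# 1. Drop rows whose "matrix" value is missing or empty before plotting.
#    `samples` is a made-up stand-in for the script's data frame.
samples <- data.frame(
  matrix = c("sediment", "", NA, "tissue"),
  value  = c(1.2, 3.4, 5.6, 7.8),
  stringsAsFactors = FALSE
)
keep <- !is.na(samples$matrix) & nzchar(samples$matrix)
samples_clean <- samples[keep, , drop = FALSE]
nrow(samples_clean)

# 2. Write raw bytes to disk, rename the file to .zip, then unzip it.
#    `bytes` would be the raw vector retrieved from the repository.
extract_zip_bytes <- function(bytes, exdir) {
  raw_path <- tempfile()                  # no extension yet
  writeBin(bytes, raw_path)
  zip_path <- paste0(raw_path, ".zip")
  if (!file.rename(raw_path, zip_path)) {
    stop("could not rename ", raw_path)
  }
  dir.create(exdir, showWarnings = FALSE, recursive = TRUE)
  utils::unzip(zip_path, exdir = exdir)   # returns the extracted file paths
}
```

The rename is not strictly needed by utils::unzip(), which reads the archive regardless of extension, but it follows the step as listed and makes the file recognizable to other tools.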
A patch recording the updates
system("diff -u ./hcdbSites.R ./new_hcdbSites.R > ./genResults.patch")
system("tab2space -unix -t2 ./genResults.patch", intern = TRUE)
## character(0)
Apply the patch to the script
system("patch -p0 < ./genResults.patch", intern = TRUE)
## character(0)
Run the script to reproduce the results
source("hcdbSites.R")
## Warning in strptime(x, fmt, tz = "GMT"): unknown timezone 'default/America/
## Chicago'
## Loading required package: sp
## ### Welcome to rworldmap ###
## For a short introduction type :   vignette('rworldmap')
## rgdal: version: 1.3-3, (SVN revision 759)
##  Geospatial Data Abstraction Library extensions to R successfully loaded
##  Loaded GDAL runtime: GDAL 2.3.0, released 2018/05/04
##  Path to GDAL shared files: /usr/local/Cellar/gdal/2.3.0/share/gdal
##  GDAL binary built with GEOS: TRUE 
##  Loaded PROJ.4 runtime: Rel. 5.1.0, June 1st, 2018, [PJ_VERSION: 510]
##  Path to PROJ.4 shared files: (autodetected)
##  Linking to sp version: 1.3-1
## Regions defined for each Polygons
## Warning: Ignoring unknown aesthetics: x, y
## Warning: Removed 1270 rows containing missing values (geom_point).
## OGR data source with driver: ESRI Shapefile 
## Source: "/Users/eunjungpark/Documents/dataone/DataONE_2018_Summer_Intern_Project1/reproduce/data/GIS", layer: "statep010"
## with 62 features
## It has 9 fields
## Regions defined for each Polygons
## Warning: Ignoring unknown aesthetics: x, y

## Warning: Removed 1270 rows containing missing values (geom_point).
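The “Removed 1270 rows containing missing values (geom_point)” warning is ggplot2 reporting rows it dropped because a plotted value was missing or fell outside the axis limits. A minimal base R sketch for counting such rows ahead of plotting follows; the data frame sites, the column names lon and lat, and the limits are illustrative assumptions, not the script's actual names or values.

```r
# Made-up coordinates standing in for the sample sites.
sites <- data.frame(
  lon = c(-147.1, NA, -148.3, -160.0),
  lat = c(60.8, 60.5, NA, 58.0)
)

# Hypothetical plot limits.
xlim <- c(-150, -145)
ylim <- c(59, 62)

# TRUE when a value is present and inside the given limits.
in_range <- function(v, lim) !is.na(v) & v >= lim[1] & v <= lim[2]

# Rows that would not be drawn: a missing or out-of-range coordinate.
dropped <- !(in_range(sites$lon, xlim) & in_range(sites$lat, ylim))
sum(dropped)

# Keep only the rows that can be drawn.
sites_plot <- sites[!dropped, ]
```

If the count matches the number in the warning, the dropped rows are accounted for and the warning can be treated as expected.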
knitr::include_graphics("./data/rp_hcdbSamplesGOA.png")