Average over different layers and several netCDF files with R
I have 15 netCDF files (.nc), one for each year from 2000 to 2014. Each file contains hourly data of one variable in 8760 layers.
The 3 dimensions are:
time (size 8760),
latitude (size 90) and
longitude (size 180, 2° resolution).
I want to compute the average of my variable between 8am and 7pm, from April to September, over the period 2000-2014.
For one .nc file, this corresponds to the average over
- time layers 2169 (i.e. 01/04/2000 8am) to 2180 (i.e. 01/04/2000 7pm), that is i = 2169 to i + 11,
- then 2193 (i.e. 02/04/2000 8am) to 2204 (i.e. 02/04/2000 7pm), that is i + 24 to i + 35,
- and so on...
- ... up to 6537 (i.e. 30/09/2000 8am) to 6548 (i.e. 30/09/2000 7pm),
- and then the average over all .nc files.
The result should be one .nc file with 3 dimensions:
- time (a single value, the average),
- latitude (size 90) and
- longitude (size 180, 2° resolution),
so that I can draw the map of the variable averaged over 2000-2014 (Apr to Sept, 8am to 7pm).
I am able to read each .nc file and draw a map for each hour of each file, but I have no idea how to compute the required mean. If anybody can help me, that would be great.
Name of my variable: dname <- "sfvmro3"
Here is my code as a first read of the data:
library(ncdf4)   # to read the netCDF files
library(chron)   # to decode the time axis

ncin <- nc_open("sfvmro3_hourly_2000.nc")
print(ncin)

# longitude and latitude
lon <- ncvar_get(ncin, "lon")
lon[lon > 180] <- lon[lon > 180] - 360   # convert 0..360 to -180..180
nlon <- dim(lon)
head(lon)
lat <- ncvar_get(ncin, "lat", verbose = FALSE)
nlat <- dim(lat)
head(lat)
print(c(nlon, nlat))

# time axis
t <- ncvar_get(ncin, "time")
tunits <- ncatt_get(ncin, "time", "units")
nt <- dim(t)

# variable
dname <- "sfvmro3"
var.array <- ncvar_get(ncin, dname) * 10^9   # from mol mol-1 to ppb
dlname <- ncatt_get(ncin, dname, "long_name")
dunits <- ncatt_get(ncin, dname, "units")
fillvalue <- ncatt_get(ncin, dname, "_FillValue")
var.array[var.array == fillvalue$value] <- NA
dim(var.array)   # 180 x 90 x 8760 (lon x lat x time)

# decode "days since 1850-01-01" into chron dates
tustr <- strsplit(tunits$value, " ")
tdstr <- strsplit(unlist(tustr)[3], "-")
tyear <- as.integer(unlist(tdstr)[1])
tmonth <- as.integer(unlist(tdstr)[2])
tday <- as.integer(unlist(tdstr)[3])
time_chron <- chron(t, origin = c(tmonth, tday, tyear))  # caution: the file uses a "noleap" calendar, so chron's real-calendar dates will be offset
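Based on the layer arithmetic described above (12 layers per day from 8am to 7pm, one block of 24 layers per day, the first block starting at layer 2169 and the last at 6537), a minimal sketch of the mean for one file could look like this; the start and end layers are taken from the question and should be checked against the actual time axis:
# Minimal sketch for one yearly file, assuming layer 2169 is 01/04 8am and
# layer 6537 is 30/09 8am (as described above), each day adding 24 layers.
day_starts <- seq(2169, 6537, by = 24)                 # first 8am layer of every Apr-Sep day
idx <- sort(as.vector(outer(0:11, day_starts, "+")))   # the 12 layers (8am-7pm) of each day
mean_map <- apply(var.array[, , idx], c(1, 2), mean, na.rm = TRUE)  # 180 x 90 lon x lat average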
Here are the details of one of the yearly .nc files:
4 variables (excluding dimension variables):
double time_bnds[bnds,time]
double lat_bnds[bnds,lat]
double lon_bnds[bnds,lon]
float sfvmro3[lon,lat,time]
standard_name: mole_fraction_of_ozone_in_air
long_name: Ozone Volume Mixing Ratio in the Lowest Model Layer
units: mole mole-1
original_name: O_x
original_units: 1
history: 2016-04-22T05:20:31Z altered by CMOR: Converted units from '1' to 'mole mole-1'.
cell_methods: time: point (interval: 30 minutes)
cell_measures: area: areacella
missing_value: 1.00000002004088e+20
_FillValue: 1.00000002004088e+20
associated_files: ...
4 dimensions:
time Size:8760 *** is unlimited ***
bounds: time_bnds
units: days since 1850-01-01
calendar: noleap
axis: T
long_name: time
standard_name: time
lat Size:90
bounds: lat_bnds
units: degrees_north
axis: Y
long_name: latitude
standard_name: latitude
lon Size:180
bounds: lon_bnds
units: degrees_east
axis: X
long_name: longitude
standard_name: longitude
bnds Size:2
26 global attributes:
institution: aaaa
institute_id: aaaa
experiment_id: aaaa
source: aaaa
model_id: aaaa
forcing: HG, SA, S
parent_experiment_id: N/A
parent_experiment_rip: N/A
branch_time: 0
contact: aaa
history: aaa
initialization_method: 1
physics_version: 1
tracking_id: aaa
product: output
experiment: aaa
frequency: hr
creation_date: 2016-04-22T05:20:31Z
Conventions: aaa
project_id: aaa
table_id: aaa
title: aaaa
parent_experiment: N/A
modeling_realm: aaa
realization: 1
cmor_version: 2.7.1
r average netcdf
Comments:

please give some sample reproducible examples – ghub24, Oct 2 '18 at 14:09

I added a description of the nc file above. – virginie, Oct 2 '18 at 14:59

Not a full solution, but if you have access to the CDO utilities or can install them, you can get the mean over the required hours and months using
cdo timmean -selhour,8,9,10,11,12,13,14,15,16,17,18,19 -selmonth,4,5,6,7,8,9 input.nc output.nc
But then you want to combine years as well. – Robert Davy, Jan 2 at 1:10
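To also combine the years with CDO, one option (a sketch; the wildcard pattern is only illustrative) is to merge the yearly files along time first and then apply the same selection and time mean:
cdo mergetime sfvmro3_hourly_*.nc merged.nc
cdo timmean -selhour,8,9,10,11,12,13,14,15,16,17,18,19 -selmonth,4,5,6,7,8,9 merged.nc mean_2000-2014.nc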
asked Oct 2 '18 at 13:03, edited Oct 2 '18 at 14:58 – virginie
1 Answer
I know two possible solutions for your problem. One is based on taking the average for each .nc file and then taking a weighted average of those; the other is to build one really large array and average over it.
- First possible solution
Each .nc file that you read gives you an array: array1, array2 and so on. For each array you also have a time series associated with the time dimension, so time_serie1 holds all the timestamps in POSIXct format for array1. First you have to build that vector. Once you have it, you can build a logical index of the times you want to include in the average. For this I would use the lubridate package, but it is not strictly necessary.
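For example, a minimal sketch of building such a vector; this assumes each file holds one full noleap year of 8760 hourly steps, and since only the month and hour are used for the filter below, a non-leap dummy year (2001 here) can stand in for the real dates:
# Sketch only: 8760 hourly timestamps laid out against a non-leap dummy year,
# because only month() and hour() are used by the filter.
time_serie1 <- seq(ISOdate(2001, 1, 1, 0, tz = "UTC"), by = "hour", length.out = 8760)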
library(lubridate)

index1 <- month(time_serie1) < 10 & month(time_serie1) > 3           # keep April to September
index1 <- index1 & hour(time_serie1) <= 19 & hour(time_serie1) >= 8  # then add the hour restriction (8am to 7pm)
mean1 <- apply(array1[, , index1], 1:2, mean)                        # lon x lat mean over the selected hours
This code gives you a 2D array with the mean for the first year; you can put your arrays and time series into lists and loop over them. You will then have, for each year, a 2D array with that year's mean, and you can average those arrays. The "weighted" average I mentioned matters because, if February were included, the yearly means would be computed from different numbers of days; for your case it is not necessary, but if you include February you have to weight each yearly mean by the amount of data that went into it.
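A rough sketch of that loop, assuming the yearly files are named sfvmro3_hourly_2000.nc to sfvmro3_hourly_2014.nc (adjust the file names and the time vector to your data):
library(ncdf4)
library(lubridate)

files <- sprintf("sfvmro3_hourly_%d.nc", 2000:2014)   # hypothetical file names

# One noleap year of hourly steps against a non-leap dummy year (see the sketch above);
# it is identical for every file, so build it once.
tser <- seq(ISOdate(2001, 1, 1, 0, tz = "UTC"), by = "hour", length.out = 8760)
idx  <- month(tser) %in% 4:9 & hour(tser) >= 8 & hour(tser) <= 19

yearly_means <- lapply(files, function(f) {
  nc  <- nc_open(f)
  arr <- ncvar_get(nc, "sfvmro3") * 1e9              # mol mol-1 -> ppb
  fv  <- ncatt_get(nc, "sfvmro3", "_FillValue")$value
  nc_close(nc)
  arr[arr == fv] <- NA
  apply(arr[, , idx], c(1, 2), mean, na.rm = TRUE)   # lon x lat mean for this year
})

# Every year contributes the same number of hours, so a plain average of the yearly means is enough.
mean_all <- Reduce(`+`, yearly_means) / length(yearly_means)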
- Second possible solution
This solution is almost the same as the first one, but I like it more. You can merge all your arrays into one big array, in order, so that the time index is increasing; I will call this array BigArray. Then merge the time series associated with each array; I will call it BigTime. Then look for the indexes you want to average, and it is done. The big advantage is that you don't have to loop over a list of arrays, and you don't have to worry about February changing size.
Index <- month(BigTime) < 10 & month(BigTime) > 3           # keep April to September
Index <- Index & hour(BigTime) <= 19 & hour(BigTime) >= 8   # then add the hour restriction (8am to 7pm)
Mean <- apply(BigArray[, , Index], 1:2, mean)               # lon x lat mean over all selected hours
And that gives you the mean of your values.
In both approaches a 2D array is built; if you want a 3D array whose time dimension holds a single value, just add that dimension. If you want to read more about this, taking the mean over specific time values is usually called the composite technique in meteorology.
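If you do want the result as a .nc file with a single-valued time dimension, a minimal sketch with ncdf4 could look like this (the output file name, variable long name and time value are placeholders; lon and lat are the coordinate vectors read from the input files):
library(ncdf4)

londim  <- ncdim_def("lon", "degrees_east", as.double(lon))
latdim  <- ncdim_def("lat", "degrees_north", as.double(lat))
timedim <- ncdim_def("time", "days since 1850-01-01", 0, unlim = TRUE)

fillvalue <- 1e20
var_def <- ncvar_def("sfvmro3_mean", "ppb", list(londim, latdim, timedim), fillvalue,
                     longname = "Ozone VMR averaged 2000-2014, Apr-Sep, 8am-7pm", prec = "float")

ncout <- nc_create("sfvmro3_mean_2000-2014.nc", list(var_def))
ncvar_put(ncout, var_def, Mean)   # Mean is the 180 x 90 lon x lat array computed above
nc_close(ncout)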
I hope this solves your problem.
answered Nov 16 '18 at 13:59 – Santiago I. Hurtado