Task: Write a script to read the Norfolk temperature file for 2005 to 2010 (details in task 1). Plot all of the February and August temperatures vs year as two lines on a single plot. Calculate the monthly average February and August temperatures for Norfolk, and display the values on the plot.
Missing values are a perpetual problem when dealing with data. In the previous task, missing values in the csv file were indicated by a blank, and the input function converted each blank to zero. Zero is not always a good way to indicate that a measurement is missing. For example, an air temperature of zero (Centigrade or Fahrenheit) is possible.
In many data sets, an impossible value is chosen as a missing value. In many meteorology data sets, a value of -999 or -99 is used to indicate a missing value.
In any case, if you get data from some source, it is important to know what the missing value is or if there is a flag attached to observations indicating their quality. You will have to do something to avoid using these missing values in your analysis.
MATLAB has a nice option for missing values: Not-a-Number or NaN. It is a valid floating-point value (like inf and -inf, which are positive and negative infinity), but it is clearly not a possible measurement. It occurs in normal arithmetic when evaluating 0/0, which has no defined value. Some operations with infinity, such as 0*inf or inf-inf, also result in NaN (but 1*inf is simply inf). Any arithmetic involving NaN, such as 1*NaN or 1+NaN, results in NaN.
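These rules can be checked directly at the command line. The following is a minimal sketch; the variable names are illustrative only.

```matlab
% Operations that produce NaN
a = 0/0;        % undefined ratio
b = 0*inf;      % zero times infinity
c = inf - inf;  % difference of two infinities
% An operation with infinity that does not produce NaN
d = 1*inf;      % still inf
% NaN propagates through ordinary arithmetic
e = 1 + NaN;
f = 1*NaN;
disp(isnan([a b c d e f]))   % 1 where the value is NaN, 0 otherwise
% NaN is not equal to anything, including itself, so test with isnan
g = (NaN == NaN);            % false
```

Because NaN == NaN is false, a test such as A == NaN never finds missing values; isnan is the function to use.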
Missing values can be indicated by NaN. As an extra benefit, MATLAB will not plot through NaN, so there will be a blank spot on a plot of a list of values containing NaN.
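If a data set marks missing values with a code such as -999, the code can be replaced by NaN before plotting. The sketch below assumes a missing-value code of -999 and uses made-up temperatures; check the documentation of your own data source for the actual code.

```matlab
% Hypothetical yearly temperatures with -999 marking missing values
yr   = 2005:2010;
temp = [41.2 -999 39.8 43.1 -999 40.5];
missingCode = -999;                 % assumed code; depends on the data source
temp(temp == missingCode) = NaN;    % replace the missing-value code with NaN
plot(yr, temp, 'b-o')               % the line breaks wherever temp is NaN
xlabel('year')
ylabel('deg F')
```

The circle markers are used so that a good value with missing neighbors on both sides still shows up on the plot.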
So far, so good. The problem is that any arithmetic involving NaN results in NaN, so the sum of a set of values containing one or more NaN is NaN. We need some way to avoid these missing values when doing calculations.
A logical function, isnan, can be used to find which values in a list or array are NaN. Logical variables have only two values: 0 = false and 1 = true. So, isnan will return a zero for normal numbers and 1 for NaN. For example, consider the following:
A = [ 10 20 NaN 40 50 NaN 70];
L = isnan(A); % the result is L = [0 0 1 0 0 1 0]
The isnan function returns a set of 0's and 1's the same length (and shape) as the input variable. The values of L(3) and L(6) above are 1, indicating that A(3) and A(6) are NaN.
We need some way to pick out which values are NaN; the find function does this. find returns the indexes of the true values in a list of logical values. Given the setup above, find(L) will return the values 3 and 6, indicating that the third and sixth entries are true. All of this can be combined as
A = [ 10 20 NaN 40 50 NaN 70];
bad = find(isnan(A));
where the variable bad has the values 3 and 6. The find function returns a list of indexes (locations in the array) for which the test is true, in this case the locations of the NaN values in the array.
What we really want is the values which are not NaN. We can use the logical negative (not) operator "~" which reverses the logical values. So, we can use the following to find the "good", that is, non-missing, values:
good=find(~isnan(A)); % good = [1 2 4 5 7];
It is possible to get a mean of the values in the array that are not NaN with
MeanA=sum(A(good))/length(good);
where the sum function uses values A(1), A(2), A(4), A(5) and A(7).
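The same mean can also be computed without find, because MATLAB accepts a logical array directly as an index. The sketch below compares the two approaches on the same array A; the names keep and MeanA2 are illustrative.

```matlab
A = [10 20 NaN 40 50 NaN 70];
% Approach using find: indexes of the non-missing values
good  = find(~isnan(A));
MeanA = sum(A(good))/length(good);
% Logical indexing: use the true/false array itself as the index
keep   = ~isnan(A);
MeanA2 = sum(A(keep))/sum(keep);   % sum(keep) counts the true entries
disp([MeanA MeanA2])               % both are 38
% A plain sum over all of A is NaN because of the missing values
S = sum(A);
```

Either form works; find is useful when the index numbers themselves are needed, for example to pick the matching entries from another array.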
We will work with logical variables in future classes, but we can expand this capability to do a variety of useful things:
ig = find(A >0); % find the positive values
ig = find(month == 3); % find entries for month 3
ig = find(month <=2 | month >=11); % find entries associated with months 1, 2, 11, 12
These examples show some common logical operators (we will return to this topic in a later class).
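The choice between the "and" operator (&) and the "or" operator (|) matters when selecting ranges. As a small sketch with a made-up month list: no month can be both at most 2 and at least 11, so selecting months 1, 2, 11 and 12 needs |, while & suits a range between two bounds.

```matlab
month  = 1:12;                             % hypothetical list of months
winter = find(month <= 2 | month >= 11);   % months 1, 2, 11, 12
summer = find(month >= 6 & month <= 8);    % months 6, 7, 8
none   = find(month <= 2 & month >= 11);   % empty: both tests cannot hold at once
disp(winter)
disp(summer)
disp(isempty(none))
```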
Flow chart for this task:
%%% task44.m
%%% load data from file, extract variables
%%% extract temperature for Feb and Aug
%%% save year values for one of the months for plotting
%%% calculate mean for each month
%%% plot temperature for two months
%%% add legends and a title
%%% add mean temperature lines
Script to complete this task:
%%% task44.m
%%% load data from file, extract variables
data=load('NorfolkMeanTemp2005.dat');
year=data(:,1);month=data(:,2);temp=data(:,3);
clear data
%%% extract temperature for Feb and Aug
feb=find(month == 2);aug=find(month == 8);
%%% save year values for one of the months for plotting
myear=year(feb);
%%% calculate mean for each month
FebTemp=temp(feb);AugTemp=temp(aug);
FebMean=mean(FebTemp);AugMean=mean(AugTemp);
%%% plot temperature for two months
figure
plot(myear,FebTemp,'b',myear,AugTemp,'r')
%%% add legends and a title
title('Monthly Mean Temp for Feb(b) and Aug(r)')
xlabel('year')
ylabel('deg F')
text(2007,50,['Feb Mean Temp ' num2str(FebMean)])
text(2007,75,['Aug Mean Temp ' num2str(AugMean)])
%%% add mean temperature lines
hold on
plot([myear(1) myear(end)],[FebMean FebMean],'k')
plot([myear(1) myear(end)],[AugMean AugMean],'k')
hold off
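If the temperature column contains NaN for missing months, mean returns NaN for that month. The following is a hedged sketch of a NaN-safe version of the averaging step. It uses made-up values in place of the Norfolk file, so the numbers are assumptions and not real data.

```matlab
% Hypothetical stand-in for the month and temperature columns of the file
month = [2 8 2 8 2 8]';
temp  = [40 78 NaN 80 44 NaN]';
% select each month, then average only the non-missing temperatures
FebTemp = temp(month == 2);
AugTemp = temp(month == 8);
FebMean = mean(FebTemp(~isnan(FebTemp)));   % (40 + 44)/2 = 42
AugMean = mean(AugTemp(~isnan(AugTemp)));   % (78 + 80)/2 = 79
disp([FebMean AugMean])
```

The plotting commands do not need any change, since plot simply leaves a gap at each NaN.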