Summary Statistics

Given a vector x, we can find several quantities with the functions max(), min(), range(), mean(), median(), var(), cor(), and quantile(). When there are missing values in the data, the functions max(), min(), range(), mean(), and median() return NA, and the functions var(), cor(), and quantile() return an error message. Adding the argument na.rm=T, as in max(x, na.rm=T), forces Splus to remove any missing values from the vector x before returning the maximum value in x. Many of the commands above accept additional arguments. For example, var(x, unbiased=F) finds the population variance (it divides by n rather than n-1).

> x <- c(1,2,3,3,3,4,7,8,9,NA)
> max(x, na.rm=T)
[1] 9                         
> mean(x, na.rm=T)
[1]  4.444444
> var(x[!is.na(x)])
[1] 8.027778 
> y <- c(1,2,3,4,5,6,7,8,9,10)
> cor(x[!is.na(x)], y[!is.na(x)])
[1] 0.9504597 
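
The difference between the sample variance and the population variance can be checked by hand. The following is a minimal sketch, written so that it does not rely on the unbiased argument; the names z, sample.var, pop.var, and pop.var.direct are made up for this illustration and are not part of the session above.

```r
# Hypothetical data: the non-missing values of x from the session above.
z <- c(1, 2, 3, 3, 3, 4, 7, 8, 9)
n <- length(z)

# var() divides the sum of squared deviations by n - 1.
sample.var <- var(z)

# Rescaling by (n - 1) / n gives the variance that divides by n.
pop.var <- sample.var * (n - 1) / n

# The same quantity computed directly from the definition.
pop.var.direct <- sum((z - mean(z))^2) / n
```

Here sample.var agrees with the value 8.027778 shown above, and pop.var and pop.var.direct should coincide up to rounding.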
We can also find trimmed means. The argument trim takes any value between 0 and 0.5 inclusive and gives the fraction of observations to be trimmed from each end of the ordered data:
       
> mean(x, trim=0.2, na.rm=T)
[1]  4.285714             
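
To see where this value comes from, the trimming can be carried out by hand. This is only a sketch, assuming that the number of observations dropped from each end is the integer part of n times trim; the names x.sorted, k, trimmed, and manual.trimmed.mean are hypothetical.

```r
x <- c(1, 2, 3, 3, 3, 4, 7, 8, 9, NA)
trim <- 0.2

# Drop the missing value and order the data.
x.sorted <- sort(x[!is.na(x)])
n <- length(x.sorted)

# Assumed rule: drop floor(n * trim) observations from each end.
k <- floor(n * trim)

# Keep the middle part and average it (valid here because k < n / 2).
trimmed <- x.sorted[(k + 1):(n - k)]
manual.trimmed.mean <- mean(trimmed)
```

With n = 9 and trim = 0.2, one observation is dropped from each end, leaving 2, 3, 3, 3, 4, 7, 8, whose mean is 30/7 = 4.285714.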
The function quantile() returns the quantiles of x specified in the argument probs:
  
> quantile(x, probs=c(0,0.1,0.9), na.rm=T)
  0% 10% 90%                  
   1 1.8 8.2                                                   
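
The values 1.8 and 8.2 are consistent with linear interpolation between neighbouring order statistics. The sketch below assumes that rule; interp.quantile is a hypothetical helper written for this illustration, not a built-in function, and it reproduces the output above for this particular x.

```r
# Hypothetical helper: quantiles by linear interpolation between
# order statistics, at position h = (n - 1) * p + 1 in the sorted data.
interp.quantile <- function(v, p) {
  s <- sort(v[!is.na(v)])    # remove missing values, then order
  n <- length(s)
  h <- (n - 1) * p + 1       # fractional position of each quantile
  lo <- floor(h)             # order statistic at or below the position
  hi <- pmin(lo + 1, n)      # next order statistic, capped at n
  s[lo] + (h - lo) * (s[hi] - s[lo])
}

x <- c(1, 2, 3, 3, 3, 4, 7, 8, 9, NA)
q <- interp.quantile(x, c(0, 0.1, 0.9))
```

For probs = 0.1 the position is h = 1.8, so the result lies 0.8 of the way from the first ordered value (1) to the second (2), which gives 1.8.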
These functions may also be used on matrices. They are not applied to the rows or columns individually; instead they find the max, min, etc. of the whole matrix.
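
As a small illustration, the sketch below compares max() on a whole matrix with row-wise and column-wise maxima obtained through apply(). The matrix m and the result names are made up for this example.

```r
# Hypothetical 2 x 3 matrix, filled column by column:
#      1 3 5
#      2 4 6
m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)

# max() treats the matrix as one long vector.
whole.max <- max(m)

# apply() with margin 1 works row by row, with margin 2 column by column.
row.max <- apply(m, 1, max)
col.max <- apply(m, 2, max)
```

Here whole.max is 6, row.max is 5 6, and col.max is 2 4 6. The same pattern works with min in place of max.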

Comments to: Miguel A. Arcones
