Descriptive Statistics
Uni-Variate
statistics
summary(DF) provides a descriptive summary of all the variables in the data frame (use DF$VARIABLE for a specific variable)
describe(DF) from the package Hmisc does a similar job
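desc <- data.frame(describe(DF)) stores the descriptive table in a data frame called desc that is easy to edit and export (this relies on describe() from the package psych, whose output is table-like; the Hmisc version returns a list-like object instead)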
freq(DF) from the package summarytools provides detailed frequencies for all the variables in the data frame (use DF$VARIABLE for a specific variable)
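A minimal sketch of the commands above, assuming the built-in mtcars data set stands in for DF (the data set, the chosen columns and the object name cyl_counts are illustrative assumptions, not part of the original notes). The psych step is optional and only runs if that package is installed.

```r
# Illustrative data: mtcars stands in for DF (an assumption)
DF <- mtcars
DF$cyl <- factor(DF$cyl)          # treat the number of cylinders as categorical

summary(DF)                       # descriptive summary of every variable
summary(DF$mpg)                   # descriptive summary of one variable

# Frequencies of a categorical variable with base R
cyl_counts <- table(DF$cyl)       # counts per level
prop.table(cyl_counts)            # the same counts as proportions

# Optional: an editable descriptive table (needs the psych package)
if (requireNamespace("psych", quietly = TRUE)) {
  desc <- data.frame(psych::describe(DF[, c("mpg", "hp", "wt")]))
  print(desc)
}
```

The base R table() and prop.table() calls are shown as a dependency-free stand-in for freq(); they give counts and proportions but not the cumulative columns that summarytools adds.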
graphs
hist(DF$VARIABLE) provides a histogram of a numeric variable
plot(DF$VARIABLE) provides a scatter plot or a bar chart, depending on whether the variable is numeric or categorical (a factor)
boxplot(DF) provides box plots for all the numeric variables in the data frame (use DF$VARIABLE for a specific variable)
barchart(DF$VARIABLE) from the package lattice provides a bar chart for a categorical variable
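The plotting commands above can be sketched as follows, again assuming mtcars as a stand-in for DF (an illustrative assumption). The lattice call is guarded and uses a frequency table as input, which is one way to obtain a bar chart of counts.

```r
# Illustrative data: mtcars stands in for DF (an assumption)
DF <- mtcars
DF$cyl <- factor(DF$cyl)

h <- hist(DF$mpg)                           # histogram of a numeric variable
plot(DF$mpg)                                # numeric: values plotted against their index
plot(DF$cyl)                                # factor: bar chart of the level counts
b <- boxplot(DF[, c("mpg", "qsec", "wt")])  # one box per numeric column

# Optional: lattice bar chart of the counts (lattice objects must be printed in scripts)
if (requireNamespace("lattice", quietly = TRUE)) {
  print(lattice::barchart(table(DF$cyl)))
}
```

Storing the return values of hist() and boxplot() is optional; they hold the bin counts and box statistics behind the plots.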
Bi-Variate
statistics
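TBL <- table(DF$FACTOR1, DF$FACTOR2) creates a two-way contingency table (its dimensions depend on the number of levels of each factor)
chisq.test(TBL) can be used to test whether the two factors are associated (the function is part of the stats package that ships with R, so no additional package is required)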
describeBy(DF$VARIABLE_num, DF$VARIABLE_cat) from the package psych provides detailed descriptive statistics of a numeric variable by the levels of the categorical variable
cor(DF, use = "pairwise.complete.obs") prints a correlation matrix, with use specifying pairwise deletion of missing cases (all the columns of DF must be numeric)
t.test(), aov(), lm() and glm() functions can be used to model and test statistical effects, see more in 5
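A minimal sketch of the bivariate commands above, assuming mtcars as a stand-in for DF (the data set, the chosen variables and the object names are illustrative assumptions). The psych step is optional; tapply() is shown as a base R alternative for group means.

```r
# Illustrative data: mtcars stands in for DF (an assumption)
DF <- mtcars
DF$cyl <- factor(DF$cyl)
DF$am  <- factor(DF$am, labels = c("automatic", "manual"))

# Two categorical variables: contingency table and chi-squared test
TBL <- table(DF$cyl, DF$am)
TBL
chisq.test(TBL)   # may warn about small expected counts in a sample this small

# Numeric by categorical: group means with base R
group_means <- tapply(DF$mpg, DF$cyl, mean)
group_means

# Optional: detailed statistics by group (needs the psych package)
if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::describeBy(DF$mpg, DF$cyl))
}

# Numeric by numeric: correlation matrix on numeric columns only
COR <- cor(DF[, c("mpg", "hp", "wt")], use = "pairwise.complete.obs")
round(COR, 2)

# Testing effects
t.test(mpg ~ am, data = DF)            # two-group comparison
summary(aov(mpg ~ cyl, data = DF))     # one-way analysis of variance
summary(lm(mpg ~ wt, data = DF))       # simple linear regression
```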
graphs
boxplot(DF$VARIABLE_num ~ DF$VARIABLE_cat) provides box plots of a numeric variable by the levels of the categorical variable
plotmeans(DF$VARIABLE_num ~ DF$VARIABLE_cat, mean.labels=TRUE) from the package gplots provides a chart with the means of a numeric variable by the levels of the categorical variable
plot(DF$VARIABLE_num1 ~ DF$VARIABLE_num2) provides a scatter plot of one numeric variable against another numeric variable
plot(lm(DF$VARIABLE_num1 ~ DF$VARIABLE_num2)) provides the set of diagnostic plots that R produces for a fitted lm() model
COR <- cor(DF, use = "pairwise.complete.obs") and corrplot(COR) from the corrplot package create a visual correlation matrix for the COR object (the arguments method, type and diag allow for substantial configuration of the plot)
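The bivariate plots above can be sketched as follows, assuming mtcars as a stand-in for DF (an illustrative assumption, as are the chosen variables and the particular corrplot settings). The gplots and corrplot steps only run if those packages are installed.

```r
# Illustrative data: mtcars stands in for DF (an assumption)
DF <- mtcars
DF$cyl <- factor(DF$cyl)

boxplot(mpg ~ cyl, data = DF)        # numeric variable by the levels of a factor
plot(mpg ~ wt, data = DF)            # scatter plot of two numeric variables

fit <- lm(mpg ~ wt, data = DF)       # fitted linear model
abline(fit)                          # add the regression line to the scatter plot

op <- par(mfrow = c(2, 2))           # show the four diagnostic plots on one page
plot(fit)
par(op)                              # restore the previous layout

# Optional: plot of group means with labels (needs the gplots package)
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::plotmeans(mpg ~ cyl, data = DF, mean.labels = TRUE)
}

# Optional: visual correlation matrix (needs the corrplot package)
if (requireNamespace("corrplot", quietly = TRUE)) {
  COR <- cor(DF[, c("mpg", "hp", "wt", "qsec")], use = "pairwise.complete.obs")
  corrplot::corrplot(COR, method = "circle", type = "upper", diag = FALSE)
}
```

Using the data argument with a formula (mpg ~ wt, data = DF) is equivalent to the DF$VARIABLE form and tends to give cleaner axis labels.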
Data visualization [in preparation]
Online resources on data visualization