Qualitative variables
We import the dataset again to avoid conflicts with earlier changes.
animacion <- read.csv("shows.csv", sep = ";")
We rename the columns
names(animacion) <- c("Puesto","Nombre","Año","Duracion","Género","Puntaje","Votos")
We drop the rows with missing values
animacion <- na.omit(animacion)
Example: Determine whether there is a relationship between the “Género” (genre) and the “Puntaje” (score) of the series and films in the list.
For this example we need to recode the values of the “Puntaje” column according to the following satisfaction levels:
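# Satisfaction levels (labels kept in Spanish, as used in the code)
# Excelente (excellent): 9 - 10
# Muy bueno (very good): 7 - 8
# Bueno (good): 5 - 6
# Malo (bad): 3 - 4
# Pesimo (terrible): 1 - 2
# Recode Puntaje into these levels; cut() uses right-closed intervals by default
animacion$Puntaje <- cut(animacion$Puntaje, breaks = c(1,2,4,6,8,10),
                         labels = c("Pesimo", "Malo","Bueno","Muy bueno", "Excelente"))
animacion$Puntaje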
## [1] Muy bueno Muy bueno Excelente Excelente Muy bueno Excelente Muy bueno
## [8] Excelente Muy bueno Excelente Excelente Excelente Muy bueno Excelente
## [15] Excelente Muy bueno Muy bueno Muy bueno Excelente Excelente Muy bueno
## [22] Excelente Excelente Excelente Muy bueno Muy bueno Excelente Muy bueno
## [29] Excelente Excelente Excelente Muy bueno Excelente Malo Excelente
## [36] Excelente Excelente Muy bueno Excelente Excelente Muy bueno Muy bueno
## [43] Excelente Excelente Excelente Excelente Excelente
## Levels: Pesimo Malo Bueno Muy bueno Excelente
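Because `cut()` builds right-closed intervals by default, the breaks above produce the bins (1,2], (2,4], (4,6], (6,8] and (8,10]. The following sketch, which uses a few hypothetical scores (not taken from shows.csv), shows how boundary and decimal values are classified, and how `include.lowest = TRUE` keeps a score of exactly 1 from becoming `NA`:

```r
# Hypothetical scores chosen to sit on or near the interval boundaries
scores <- c(1, 2, 6.5, 8, 8.1, 10)
labels_es <- c("Pesimo", "Malo", "Bueno", "Muy bueno", "Excelente")
breaks <- c(1, 2, 4, 6, 8, 10)

# Default behaviour: intervals are (a, b], so a score of exactly 1 is not covered
default_cut <- cut(scores, breaks = breaks, labels = labels_es)

# include.lowest = TRUE closes the first interval on the left: [1, 2]
closed_cut <- cut(scores, breaks = breaks, labels = labels_es,
                  include.lowest = TRUE)

data.frame(scores, default_cut, closed_cut)
```

In this sketch a score of 8 is labelled "Muy bueno" while 8.1 is already "Excelente", so decimal scores above 8 fall in the top category.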
0.14 Contingency table (cross-tabulation)
tab <- table(animacion$Género, animacion$Puntaje)
tab
##
##                                Pesimo Malo Bueno Muy bueno Excelente
##   Animation, Action, Adventure      0    0     0         3        14
##   Animation, Action, Comedy         0    0     0         0         3
##   Animation, Adventure, Comedy      0    0     0         7         3
##   Animation, Adventure, Drama       0    0     0         0         1
##   Animation, Comedy                 0    0     0         1         4
##   Animation, Comedy, Drama          0    0     0         0         1
##   Animation, Comedy, Family         0    1     0         2         0
##   Animation, Comedy, Romance        0    0     0         1         0
##   Animation, Drama, Family          0    0     0         1         0
##   Animation, Family                 0    0     0         0         1
##   Animation, Family, Fantasy        0    0     0         1         0
##   Animation, Short, Action          0    0     0         0         2
##   Animation, Short, Adventure       0    1     0         0         0
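The table above keeps the "Pesimo" and "Bueno" columns even though no title falls in them, because `table()` counts every level of a factor. One possible way to show only the levels that occur is `droplevels()`. The sketch below uses a small hypothetical sample (the vectors `genero` and `puntaje` are illustrative, not the real data) and adds totals with `addmargins()`:

```r
# Hypothetical mini-sample with the same factor levels as Puntaje
puntaje <- factor(
  c("Muy bueno", "Excelente", "Excelente", "Malo"),
  levels = c("Pesimo", "Malo", "Bueno", "Muy bueno", "Excelente")
)
genero <- c("Animation, Comedy", "Animation, Family",
            "Animation, Comedy", "Animation, Comedy, Family")

# table() keeps unused levels as all-zero columns
tab_full <- table(genero, puntaje)

# droplevels() removes the levels with no observations before tabulating
tab_used <- table(genero, droplevels(puntaje))

# addmargins() appends row and column totals
addmargins(tab_used)
```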
library(gmodels)
CrossTable(animacion$Género, animacion$Puntaje)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | Chi-square contribution |
## | N / Row Total |
## | N / Col Total |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 47
##
##
## | animacion$Puntaje
## animacion$Género | Malo | Muy bueno | Excelente | Row Total |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Action, Adventure | 0 | 3 | 14 | 17 |
## | 0.362 | 1.613 | 1.175 | |
## | 0.000 | 0.176 | 0.824 | 0.362 |
## | 0.000 | 0.176 | 0.483 | |
## | 0.000 | 0.064 | 0.298 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Action, Comedy | 0 | 0 | 3 | 3 |
## | 0.064 | 1.085 | 0.713 | |
## | 0.000 | 0.000 | 1.000 | 0.064 |
## | 0.000 | 0.000 | 0.103 | |
## | 0.000 | 0.000 | 0.064 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Adventure, Comedy | 0 | 7 | 3 | 10 |
## | 0.213 | 3.164 | 1.629 | |
## | 0.000 | 0.700 | 0.300 | 0.213 |
## | 0.000 | 0.412 | 0.103 | |
## | 0.000 | 0.149 | 0.064 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Adventure, Drama | 0 | 0 | 1 | 1 |
## | 0.021 | 0.362 | 0.238 | |
## | 0.000 | 0.000 | 1.000 | 0.021 |
## | 0.000 | 0.000 | 0.034 | |
## | 0.000 | 0.000 | 0.021 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy | 0 | 1 | 4 | 5 |
## | 0.106 | 0.361 | 0.271 | |
## | 0.000 | 0.200 | 0.800 | 0.106 |
## | 0.000 | 0.059 | 0.138 | |
## | 0.000 | 0.021 | 0.085 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Drama | 0 | 0 | 1 | 1 |
## | 0.021 | 0.362 | 0.238 | |
## | 0.000 | 0.000 | 1.000 | 0.021 |
## | 0.000 | 0.000 | 0.034 | |
## | 0.000 | 0.000 | 0.021 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Family | 1 | 2 | 0 | 3 |
## | 13.730 | 0.771 | 1.851 | |
## | 0.333 | 0.667 | 0.000 | 0.064 |
## | 1.000 | 0.118 | 0.000 | |
## | 0.021 | 0.043 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Romance | 0 | 1 | 0 | 1 |
## | 0.021 | 1.126 | 0.617 | |
## | 0.000 | 1.000 | 0.000 | 0.021 |
## | 0.000 | 0.059 | 0.000 | |
## | 0.000 | 0.021 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Drama, Family | 0 | 1 | 0 | 1 |
## | 0.021 | 1.126 | 0.617 | |
## | 0.000 | 1.000 | 0.000 | 0.021 |
## | 0.000 | 0.059 | 0.000 | |
## | 0.000 | 0.021 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Family | 0 | 0 | 1 | 1 |
## | 0.021 | 0.362 | 0.238 | |
## | 0.000 | 0.000 | 1.000 | 0.021 |
## | 0.000 | 0.000 | 0.034 | |
## | 0.000 | 0.000 | 0.021 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Family, Fantasy | 0 | 1 | 0 | 1 |
## | 0.021 | 1.126 | 0.617 | |
## | 0.000 | 1.000 | 0.000 | 0.021 |
## | 0.000 | 0.059 | 0.000 | |
## | 0.000 | 0.021 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Short, Action | 0 | 0 | 2 | 2 |
## | 0.043 | 0.723 | 0.475 | |
## | 0.000 | 0.000 | 1.000 | 0.043 |
## | 0.000 | 0.000 | 0.069 | |
## | 0.000 | 0.000 | 0.043 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Short, Adventure | 0 | 1 | 0 | 1 |
## | 0.021 | 1.126 | 0.617 | |
## | 0.000 | 1.000 | 0.000 | 0.021 |
## | 0.000 | 0.059 | 0.000 | |
## | 0.000 | 0.021 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Column Total | 1 | 17 | 29 | 47 |
## | 0.021 | 0.362 | 0.617 | |
## -----------------------------|-----------|-----------|-----------|-----------|
##
##
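The "Chi-square contribution" reported in each cell is (observed - expected)^2 / expected, where the expected count is row total × column total / table total. As a check, the sketch below recomputes it for the "Animation, Comedy, Family" and "Malo" cell, using values read from the output above (the variable names are illustrative):

```r
# Values read from the CrossTable output for the cell
# "Animation, Comedy, Family" x "Malo"
observed  <- 1    # N in the cell
row_total <- 3    # Row Total for "Animation, Comedy, Family"
col_total <- 1    # Column Total for "Malo"
n         <- 47   # Total Observations in Table

# Expected count under independence
expected <- row_total * col_total / n

# Contribution of this cell to the chi-squared statistic
contribution <- (observed - expected)^2 / expected
round(contribution, 3)
```

The result agrees with the 13.730 shown in the table, by far the largest contribution, and it comes from a single observation.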
?CrossTable
# Show the table with the observed counts (N) only
CrossTable(animacion$Género, animacion$Puntaje, prop.r = FALSE,
           prop.c = FALSE, prop.t = FALSE, prop.chisq = FALSE)
##
##
## Cell Contents
## |-------------------------|
## | N |
## |-------------------------|
##
##
## Total Observations in Table: 47
##
##
## | animacion$Puntaje
## animacion$Género | Malo | Muy bueno | Excelente | Row Total |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Action, Adventure | 0 | 3 | 14 | 17 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Action, Comedy | 0 | 0 | 3 | 3 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Adventure, Comedy | 0 | 7 | 3 | 10 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Adventure, Drama | 0 | 0 | 1 | 1 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy | 0 | 1 | 4 | 5 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Drama | 0 | 0 | 1 | 1 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Family | 1 | 2 | 0 | 3 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Romance | 0 | 1 | 0 | 1 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Drama, Family | 0 | 1 | 0 | 1 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Family | 0 | 0 | 1 | 1 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Family, Fantasy | 0 | 1 | 0 | 1 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Short, Action | 0 | 0 | 2 | 2 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Short, Adventure | 0 | 1 | 0 | 1 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Column Total | 1 | 17 | 29 | 47 |
## -----------------------------|-----------|-----------|-----------|-----------|
##
##
# Show the row proportions (N / Row Total)
CrossTable(animacion$Género, animacion$Puntaje, prop.r = TRUE,
           prop.c = FALSE, prop.t = FALSE, prop.chisq = FALSE)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Row Total |
## |-------------------------|
##
##
## Total Observations in Table: 47
##
##
## | animacion$Puntaje
## animacion$Género | Malo | Muy bueno | Excelente | Row Total |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Action, Adventure | 0 | 3 | 14 | 17 |
## | 0.000 | 0.176 | 0.824 | 0.362 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Action, Comedy | 0 | 0 | 3 | 3 |
## | 0.000 | 0.000 | 1.000 | 0.064 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Adventure, Comedy | 0 | 7 | 3 | 10 |
## | 0.000 | 0.700 | 0.300 | 0.213 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Adventure, Drama | 0 | 0 | 1 | 1 |
## | 0.000 | 0.000 | 1.000 | 0.021 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy | 0 | 1 | 4 | 5 |
## | 0.000 | 0.200 | 0.800 | 0.106 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Drama | 0 | 0 | 1 | 1 |
## | 0.000 | 0.000 | 1.000 | 0.021 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Family | 1 | 2 | 0 | 3 |
## | 0.333 | 0.667 | 0.000 | 0.064 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Romance | 0 | 1 | 0 | 1 |
## | 0.000 | 1.000 | 0.000 | 0.021 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Drama, Family | 0 | 1 | 0 | 1 |
## | 0.000 | 1.000 | 0.000 | 0.021 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Family | 0 | 0 | 1 | 1 |
## | 0.000 | 0.000 | 1.000 | 0.021 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Family, Fantasy | 0 | 1 | 0 | 1 |
## | 0.000 | 1.000 | 0.000 | 0.021 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Short, Action | 0 | 0 | 2 | 2 |
## | 0.000 | 0.000 | 1.000 | 0.043 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Short, Adventure | 0 | 1 | 0 | 1 |
## | 0.000 | 1.000 | 0.000 | 0.021 |
## -----------------------------|-----------|-----------|-----------|-----------|
## Column Total | 1 | 17 | 29 | 47 |
## -----------------------------|-----------|-----------|-----------|-----------|
##
##
# Show the column proportions (N / Col Total)
CrossTable(animacion$Género, animacion$Puntaje, prop.r = FALSE,
           prop.c = TRUE, prop.t = FALSE, prop.chisq = FALSE)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Col Total |
## |-------------------------|
##
##
## Total Observations in Table: 47
##
##
## | animacion$Puntaje
## animacion$Género | Malo | Muy bueno | Excelente | Row Total |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Action, Adventure | 0 | 3 | 14 | 17 |
## | 0.000 | 0.176 | 0.483 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Action, Comedy | 0 | 0 | 3 | 3 |
## | 0.000 | 0.000 | 0.103 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Adventure, Comedy | 0 | 7 | 3 | 10 |
## | 0.000 | 0.412 | 0.103 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Adventure, Drama | 0 | 0 | 1 | 1 |
## | 0.000 | 0.000 | 0.034 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy | 0 | 1 | 4 | 5 |
## | 0.000 | 0.059 | 0.138 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Drama | 0 | 0 | 1 | 1 |
## | 0.000 | 0.000 | 0.034 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Family | 1 | 2 | 0 | 3 |
## | 1.000 | 0.118 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Romance | 0 | 1 | 0 | 1 |
## | 0.000 | 0.059 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Drama, Family | 0 | 1 | 0 | 1 |
## | 0.000 | 0.059 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Family | 0 | 0 | 1 | 1 |
## | 0.000 | 0.000 | 0.034 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Family, Fantasy | 0 | 1 | 0 | 1 |
## | 0.000 | 0.059 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Short, Action | 0 | 0 | 2 | 2 |
## | 0.000 | 0.000 | 0.069 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Short, Adventure | 0 | 1 | 0 | 1 |
## | 0.000 | 0.059 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Column Total | 1 | 17 | 29 | 47 |
## | 0.021 | 0.362 | 0.617 | |
## -----------------------------|-----------|-----------|-----------|-----------|
##
##
# Show the proportions relative to the table total (N / Table Total)
CrossTable(animacion$Género, animacion$Puntaje, prop.r = FALSE,
           prop.c = FALSE, prop.t = TRUE, prop.chisq = FALSE)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 47
##
##
## | animacion$Puntaje
## animacion$Género | Malo | Muy bueno | Excelente | Row Total |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Action, Adventure | 0 | 3 | 14 | 17 |
## | 0.000 | 0.064 | 0.298 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Action, Comedy | 0 | 0 | 3 | 3 |
## | 0.000 | 0.000 | 0.064 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Adventure, Comedy | 0 | 7 | 3 | 10 |
## | 0.000 | 0.149 | 0.064 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Adventure, Drama | 0 | 0 | 1 | 1 |
## | 0.000 | 0.000 | 0.021 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy | 0 | 1 | 4 | 5 |
## | 0.000 | 0.021 | 0.085 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Drama | 0 | 0 | 1 | 1 |
## | 0.000 | 0.000 | 0.021 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Family | 1 | 2 | 0 | 3 |
## | 0.021 | 0.043 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Comedy, Romance | 0 | 1 | 0 | 1 |
## | 0.000 | 0.021 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Drama, Family | 0 | 1 | 0 | 1 |
## | 0.000 | 0.021 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Family | 0 | 0 | 1 | 1 |
## | 0.000 | 0.000 | 0.021 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Family, Fantasy | 0 | 1 | 0 | 1 |
## | 0.000 | 0.021 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Short, Action | 0 | 0 | 2 | 2 |
## | 0.000 | 0.000 | 0.043 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Animation, Short, Adventure | 0 | 1 | 0 | 1 |
## | 0.000 | 0.021 | 0.000 | |
## -----------------------------|-----------|-----------|-----------|-----------|
## Column Total | 1 | 17 | 29 | 47 |
## -----------------------------|-----------|-----------|-----------|-----------|
##
##
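The three kinds of proportions reported by CrossTable can also be obtained in base R with `prop.table()`. The following is a minimal sketch, assuming the counts are typed in by hand from the output above instead of being read from shows.csv (the object names `genres`, `counts`, `row_prop`, `col_prop` and `tot_prop` are illustrative):

```r
# Counts copied from the CrossTable output (columns: Malo, Muy bueno, Excelente)
genres <- c("Animation, Action, Adventure", "Animation, Action, Comedy",
            "Animation, Adventure, Comedy", "Animation, Adventure, Drama",
            "Animation, Comedy", "Animation, Comedy, Drama",
            "Animation, Comedy, Family", "Animation, Comedy, Romance",
            "Animation, Drama, Family", "Animation, Family",
            "Animation, Family, Fantasy", "Animation, Short, Action",
            "Animation, Short, Adventure")
counts <- matrix(
  c(0, 3, 14,   0, 0, 3,   0, 7, 3,   0, 0, 1,   0, 1, 4,
    0, 0, 1,    1, 2, 0,   0, 1, 0,   0, 1, 0,   0, 0, 1,
    0, 1, 0,    0, 0, 2,   0, 1, 0),
  ncol = 3, byrow = TRUE,
  dimnames = list(genres, c("Malo", "Muy bueno", "Excelente"))
)

row_prop <- prop.table(counts, margin = 1)  # N / Row Total
col_prop <- prop.table(counts, margin = 2)  # N / Col Total
tot_prop <- prop.table(counts)              # N / Table Total

# Compare one row with the CrossTable output
round(row_prop["Animation, Action, Adventure", ], 3)
round(col_prop["Animation, Action, Adventure", ], 3)
round(tot_prop["Animation, Action, Adventure", ], 3)
```

For the "Animation, Action, Adventure" row and the "Excelente" column these give 14/17, 14/29 and 14/47, which correspond to the 0.824, 0.483 and 0.298 printed by CrossTable.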
To test formally whether the two variables in the table are related, we can use the chi-squared test:
For the chi-squared test
H0: The variables X and Y are independent (not related)
H1: The variables X and Y are not independent (related)
Because our dataset has few observations, and most cells of the table therefore have very small counts, we need to use Fisher's exact test instead. It has the same null and alternative hypotheses as the chi-squared test.
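A minimal sketch of this reasoning follows, assuming the counts are re-entered by hand from the CrossTable output (the names `counts`, `expected` and `fisher_res` are illustrative). It first inspects the expected counts under independence and then runs `fisher.test()`. The Monte Carlo option `simulate.p.value = TRUE` is a choice made here to keep the computation light for a 13 x 3 table, so the p-value is an approximation and not the exact one:

```r
# Counts copied from the CrossTable output; rows follow the order of the
# genres in the table, columns are Malo, Muy bueno, Excelente
counts <- matrix(
  c(0, 3, 14,   0, 0, 3,   0, 7, 3,   0, 0, 1,   0, 1, 4,
    0, 0, 1,    1, 2, 0,   0, 1, 0,   0, 1, 0,   0, 0, 1,
    0, 1, 0,    0, 0, 2,   0, 1, 0),
  ncol = 3, byrow = TRUE
)

# Expected counts under H0; chisq.test() warns that the approximation
# may be incorrect for this table, so the warning is silenced here
expected <- suppressWarnings(chisq.test(counts))$expected
sum(expected < 5)   # number of cells with an expected count below 5
length(expected)    # total number of cells

# Fisher's test with a simulated (Monte Carlo) p-value
set.seed(1)         # arbitrary seed, only for reproducibility
fisher_res <- fisher.test(counts, simulate.p.value = TRUE, B = 10000)
fisher_res$p.value
```

Here 36 of the 39 cells have an expected count below 5, which is why the chi-squared approximation is not reliable for this table. If the resulting p-value is below the chosen significance level (for example 0.05), H0 is rejected.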