R - Cleaning Data
Example Workflow:
- Access data -> Explore & Process -> Extract Insights -> Report
1. Accessing data
- Glimpse
library(dplyr)
glimpse(df)
library(assertive)
assert_is_numeric(df$column)
# or simply
str(df)
is.numeric(df$column)
# is.character, is.logical, is.factor, ...; for dates use lubridate::is.Date()
class(df$column) #returns class/type
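The checks above can be run over every column at once. A minimal sketch in base R, assuming a small made-up data frame (the names `df`, `price`, and `units` are hypothetical and only for illustration):

```r
# Hypothetical data frame, for illustration only
df <- data.frame(
  price = c("54,567", "1,200", "980"),  # numbers stored as text
  units = c(3L, 5L, 2L),
  stringsAsFactors = FALSE
)

# Class of every column in one call
col_classes <- sapply(df, class)
print(col_classes)

# Base R alternative to an assert: stops with an error if the check fails
stopifnot(is.numeric(df$units))

# price is character, so it must be converted before any arithmetic
is.character(df$price)
```

Checking classes right after loading helps catch numeric columns that were read in as text before they cause errors later on.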
2. String to Numerics
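# Numbers stored as strings with commas, e.g. "54,567"
library(stringr)
# ex. str_remove(column, ",")
# Note: str_remove drops only the first comma; use str_remove_all for values like "1,234,567"
df_no_commas = str_remove(df$string, ",")
numbers = as.numeric(df_no_commas)
# Or use mutate to convert the column inside the data frame
df = df %>% mutate(col = as.numeric(str_remove(col, ",")))
mean(df$col)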
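As a rough end-to-end sketch of this step, the example below cleans a made-up column and compares `str_remove` with `str_remove_all`. The tibble `sales` and its column names are hypothetical; only the dplyr and stringr functions come from the notes above.

```r
library(dplyr)
library(stringr)

# Hypothetical data, for illustration only
sales <- tibble(revenue = c("54,567", "1,234,567", "980"))

sales_clean <- sales %>%
  mutate(
    # str_remove drops only the FIRST comma: "1,234,567" becomes "1234,567"
    revenue_first = str_remove(revenue, ","),
    # str_remove_all drops every comma, so the conversion is safe
    revenue_num = as.numeric(str_remove_all(revenue, ","))
  )

# The mean is taken on the numeric column, not on the data frame
mean(sales_clean$revenue_num)
```

If values can contain more than one comma, `str_remove_all` is probably the safer default, since `as.numeric("1234,567")` would return `NA` with a warning.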
3. Factors to Numerics
# Factors to numeric
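product_type = as.factor(c("100", "200", "300"))
class(product_type) # factor
# as.numeric() alone would return the level codes (1, 2, 3), so convert to character first
as.numeric(as.character(product_type)) # 100 200 300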
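The reason for going through `as.character` is easier to see side by side. A small sketch in base R, with variable names (`codes`, `values`, `values_alt`) chosen here for illustration:

```r
# Same idea as above, with a repeated value to make the level codes visible
product_type <- factor(c("100", "200", "300", "200"))

# as.numeric() on a factor returns the internal level codes, not the labels
codes <- as.numeric(product_type)                   # 1 2 3 2

# Converting to character first recovers the labels as numbers
values <- as.numeric(as.character(product_type))    # 100 200 300 200

# Equivalent alternative: convert the levels once, then index by the factor
values_alt <- as.numeric(levels(product_type))[product_type]
```

Both approaches give the same result; the `levels()` version converts each distinct level only once, which may matter for long vectors with few levels.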