I have a very large dataset with some missing/bad data. I would like to recode the missing data using an if-else statement. Instead of assigning a single value to all of the missing/bad entries, I want to assign values based on a fraction.
For instance, for the df below:
Assign 50% of the rows where df$col2 == "b" to BLUE and the other 50% to RED.
col1 col2
1 a
2 a
3 b
4 b
I know you can do:
ifelse(df$col2 == "b", "BLUE", df$col2)
but I want:
col1 col2
1 a
2 a
3 BLUE
4 RED
I'm looking to do the partitioning based on the condition.
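
To make the desired behavior concrete, here is a minimal sketch of one way the split could be done in base R. It assumes col2 is stored as character (not factor), that the 50% should be rounded down when the number of matching rows is odd, and that which rows get which label should be random. The names idx, n_blue and labels are only for illustration.

```r
# Example data from the question (col2 kept as character, not factor)
df <- data.frame(col1 = 1:4, col2 = c("a", "a", "b", "b"),
                 stringsAsFactors = FALSE)

set.seed(42)  # only so the random assignment is reproducible

idx    <- which(df$col2 == "b")       # rows that match the condition
n_blue <- floor(length(idx) / 2)      # assumed: 50%, rounded down, become BLUE
labels <- rep(c("BLUE", "RED"),
              times = c(n_blue, length(idx) - n_blue))

# Shuffle the labels so the row-to-label assignment is random,
# then overwrite only the matching rows
df$col2[idx] <- sample(labels)
```

For a different fraction, only n_blue would need to change (for example floor(0.3 * length(idx))). If the order should be fixed rather than random (first half BLUE, second half RED), assigning labels directly instead of sample(labels) should do it.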