Assigning cells in R dataframe average value by value in different column

Question

I have a very large dataframe, part of which looks like this:

col1	col2
A	3
A	4
B	5
B	7

I know all rows with the same value in col1 should have the same value in col2, and any deviation is due to measuring uncertainty. I want to assign the cells in col2 the average value for all col2 cells in rows with the same col1 value, resulting in something like this:

col1	col2
A	3.5
A	3.5
B	6
B	6

The real dataset is too large to do this manually for each individual unique col1 value. Does anyone have any idea on how to automate this? Thanks in advance.

score 0 · Answer 1 · answered Sep 06 '22 at 13:22

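You can use group_by() and mutate() from dplyr (loaded as part of the tidyverse) to achieve this. First group by col1, then use mutate() to overwrite col2 with the mean of each group. The pipeline returns a new tibble rather than modifying df in place, so assign the result back to df if you want to keep it: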

library(tidyverse)
col1 <- c("A", "A", "B", "B")
col2 <- c(3,4,5,7)

df <- data.frame(col1, col2)

df %>%
  group_by(col1) %>%
  mutate(col2 = mean(col2))
#> # A tibble: 4 × 2
#> # Groups:   col1 [2]
#>   col1   col2
#>   <chr> <dbl>
#> 1 A       3.5
#> 2 A       3.5
#> 3 B       6  
#> 4 B       6

Created on 2022-09-06 with reprex v2.0.2
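If you would rather not load any packages, the same per-group averaging can be sketched in base R with ave(), which returns a vector of the same length as its input with each element replaced by the summary of its group. This is an alternative to the dplyr approach above, not part of the original answer. It reuses the example data from the question; the na.rm = TRUE handling is an assumption about how missing measurements should be treated and can be dropped if col2 has no NAs.

```r
# Example data from the question
df <- data.frame(
  col1 = c("A", "A", "B", "B"),
  col2 = c(3, 4, 5, 7)
)

# Replace each col2 value with the mean of its col1 group.
# na.rm = TRUE is an assumption: it ignores missing values
# when computing each group's mean.
df$col2 <- ave(df$col2, df$col1,
               FUN = function(x) mean(x, na.rm = TRUE))

print(df)
```

Because ave() assigns straight into df$col2, the result stays a plain data.frame with no grouping attached, which can be convenient if later steps do not expect a grouped tibble.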
