Best way to check for duplicates of one column grouped by another? #394

Answered by rich-iannone
Aariq asked this question in Q&A
Eric, I gotcha there too! With rows_distinct() you can focus on a subset of columns:

library(pointblank)
library(tidyverse)

df <- 
  tibble(
    year = rep(2001:2003, each = 3),
    ID = c("A1", "A2", "A3",
           "A1", "A1", "A3",
           "A1", "A2", "A3"
    )
  )

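agent <-
  create_agent(
    tbl = df,
    # Flag a step with a warning as soon as one test unit fails
    actions = action_levels(warn_at = 1)
  ) %>%
  # Check that ID is distinct, with one validation step per value of `year`
  rows_distinct(columns = "ID", segments = vars(year)) %>%
  interrogate()

# Print the agent to display the validation report
agent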

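This yields a report with one validation step per year segment. Given the data above, only the 2002 segment, where A1 appears twice, should trigger the warning.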
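As a cross-check outside pointblank, the same per-year duplicates can be listed with plain dplyr. This is only a minimal sketch, not part of the original answer: it rebuilds the same `df`, and the names `dupes` and `n_rows` are made up for illustration.

```r
library(dplyr)
library(tibble)

# Same example data as in the answer above
df <-
  tibble(
    year = rep(2001:2003, each = 3),
    ID = c("A1", "A2", "A3",
           "A1", "A1", "A3",
           "A1", "A2", "A3")
  )

# Count rows per (year, ID) pair and keep pairs that occur more than once
dupes <-
  df %>%
  count(year, ID, name = "n_rows") %>%
  filter(n_rows > 1)

print(dupes)
```

For this data the only row left in `dupes` is year 2002 with ID "A1", which is the segment the `rows_distinct()` step is meant to catch.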

Answer selected by Aariq