Computing a union in R: filtering data by planting and harvest months

Assuming the data frame for dataset 1 is named df1, the process described above can be implemented in the following steps:

Create the data frame:

df1 <- data.frame(
  country = c('China', 'China', 'India', 'India', 'USA', 'USA', 'Brazil', 'Brazil'),
  sets = c('plant', 'harvest', 'plant', 'harvest', 'plant', 'harvest', 'plant', 'harvest'),
  month = c(3, 9, 4, 10, 5, 11, 6, 12)
)

Use the group_by and summarize functions from the dplyr package to compute, for each country, the minimum and maximum month of plant and of harvest:

library(dplyr)

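df2 <- df1 %>%
  group_by(country, sets) %>%
  # .groups = "drop" returns an ungrouped result and silences the regrouping message
  summarize(min_month = min(month), max_month = max(month), .groups = "drop")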
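
As a quick sanity check on this summary, the sketch below rebuilds the sample data and counts the rows in each (country, sets) group. The n_rows column is a helper added here for illustration only. In the sample data every group holds a single row, so min_month and max_month are the same value within each group.

```r
library(dplyr)

# Same sample data as above
df1 <- data.frame(
  country = c('China', 'China', 'India', 'India', 'USA', 'USA', 'Brazil', 'Brazil'),
  sets = c('plant', 'harvest', 'plant', 'harvest', 'plant', 'harvest', 'plant', 'harvest'),
  month = c(3, 9, 4, 10, 5, 11, 6, 12)
)

# n_rows is an illustrative helper column, not part of the original answer
df2_check <- df1 %>%
  group_by(country, sets) %>%
  summarize(
    min_month = min(month),
    max_month = max(month),
    n_rows = n(),
    .groups = "drop"
  )

# TRUE when every group has one row, i.e. min and max coincide
single_row_groups <- all(df2_check$n_rows == 1)
same_min_max <- all(df2_check$min_month == df2_check$max_month)
```

If real data has several plant or harvest months per country, n_rows would be larger than 1 and the two columns could differ.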

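Join the summary df2 back onto df1 by country and sets so that min_month and max_month are available as columns, then use filter to keep the rows whose month is earlier than the minimum or later than the maximum (the | operator takes the union of the two conditions):

df3 <- df1 %>%
  # df1 itself has no min_month/max_month columns, so bring them in from df2
  inner_join(df2, by = c("country", "sets")) %>%
  # keep rows that fall outside the [min_month, max_month] range of their group
  # note: in the sample data each (country, sets) group has one row,
  # so min_month == max_month == month and no row passes this filter
  filter(month < min_month | month > max_month) %>%
  select(country, month)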
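
The filter above compares each row with the statistics of its own (country, sets) group. If the goal is instead to list, for each country, the calendar months that fall before the plant month or after the harvest month, one possible sketch is shown below. It rests on assumptions that are not stated in the original: every country has one plant and one harvest month, the plant month comes first, and the candidate months are 1 to 12. The names season and off_season are illustrative.

```r
library(dplyr)

# Same sample data as above
df1 <- data.frame(
  country = c('China', 'China', 'India', 'India', 'USA', 'USA', 'Brazil', 'Brazil'),
  sets = c('plant', 'harvest', 'plant', 'harvest', 'plant', 'harvest', 'plant', 'harvest'),
  month = c(3, 9, 4, 10, 5, 11, 6, 12)
)

# One row per country with its plant and harvest month
season <- df1 %>%
  group_by(country) %>%
  summarize(
    plant_month = min(month[sets == "plant"]),
    harvest_month = max(month[sets == "harvest"]),
    .groups = "drop"
  )

# Assumption: the candidate months are the twelve calendar months.
# merge() with by = NULL builds the Cartesian product (country x month).
off_season <- merge(as.data.frame(season), data.frame(month = 1:12), by = NULL) %>%
  # union of the two conditions: before planting OR after harvest
  filter(month < plant_month | month > harvest_month) %>%
  select(country, month) %>%
  arrange(country, month)
```

With the sample data, China (plant in month 3, harvest in month 9) ends up with months 1, 2, 10, 11 and 12.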

Use the distinct function from the dplyr package to remove duplicate rows:

df4 <- distinct(df3)
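
To illustrate why an OR filter followed by distinct amounts to a set union, here is a small sketch on made-up data (the toy data frame and its values are hypothetical). It compares that approach with dplyr's union applied to two separately filtered data frames.

```r
library(dplyr)

# Hypothetical toy data: the row (A, 1) appears twice
toy <- data.frame(
  country = c("A", "A", "A", "B"),
  month = c(1, 1, 12, 6)
)

early <- toy %>% filter(month < 3)   # rows before month 3
late <- toy %>% filter(month > 10)   # rows after month 10

# dplyr's union() on data frames combines the rows and drops duplicates
via_union <- union(early, late)

# Same idea in a single pass: OR the conditions, then drop duplicates
via_or <- toy %>%
  filter(month < 3 | month > 10) %>%
  distinct()
```

Both via_union and via_or contain the same two rows, (A, 1) and (A, 12).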

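The resulting data frame df4 is the union of the two conditions (month earlier than the minimum, or month later than the maximum), with duplicate rows removed.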