R - find highest proportion in data.frame
I have a data frame that looks like this:
    x <- data.frame(sector = rep(1:5, each = 2),
                    subspecies = rep(c("type a", "type b"), 5),
                    proportion = c(.2, 1-.2, .3, 1-.3, .4, 1-.4, .5, 1-.5, .6, 1-.6))
    x$dominance <- NA
    x[,1] <- sort(x[,1])
    x

       sector subspecies proportion dominance
    1       1     type a        0.2        NA
    2       1     type b        0.8        NA
    3       2     type a        0.3        NA
    4       2     type b        0.7        NA
    5       3     type a        0.4        NA
    6       3     type b        0.6        NA
    7       4     type a        0.5        NA
    8       4     type b        0.5        NA
    9       5     type a        0.6        NA
    10      5     type b        0.4        NA
In each sector (1-5), if type a has the highest proportion, I need to add 'a dominant' to the 'dominance' column; if type b has the highest proportion, I need to add 'b dominant' to the 'dominance' column. If there is a tie, I need to add 'tie' to the 'dominance' column.
This should output the following data frame:
    x$dominance <- c("b dominant", "b dominant", "b dominant", "b dominant",
                     "b dominant", "b dominant", "tie", "tie",
                     "a dominant", "a dominant")
    x

       sector subspecies proportion  dominance
    1       1     type a        0.2 b dominant
    2       1     type b        0.8 b dominant
    3       2     type a        0.3 b dominant
    4       2     type b        0.7 b dominant
    5       3     type a        0.4 b dominant
    6       3     type b        0.6 b dominant
    7       4     type a        0.5        tie
    8       4     type b        0.5        tie
    9       5     type a        0.6 a dominant
    10      5     type b        0.4 a dominant
Here is a base R solution:
    compare <- function(x) {
        ## return subspecies of max proportion
        res <- x[which(x$proportion == max(x$proportion)), "subspecies"]
        if (length(res) > 1L) {
            ## if tied, length(res) == 2
            out <- "tie"
        } else {
            ## simple string replacement
            out <- paste(sub("type ", "", res), "dominant")
            ## or use
            # out <- if (res == "type a") {"a dominant"} else {"b dominant"}
        }
        out
    }
    x$dominance <- unsplit(lapply(split(x, x$sector), compare), x$sector)

    > x
       sector subspecies proportion  dominance
    1       1     type a        0.2 b dominant
    2       1     type b        0.8 b dominant
    3       2     type a        0.3 b dominant
    4       2     type b        0.7 b dominant
    5       3     type a        0.4 b dominant
    6       3     type b        0.6 b dominant
    7       4     type a        0.5        tie
    8       4     type b        0.5        tie
    9       5     type a        0.6 a dominant
    10      5     type b        0.4 a dominant
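The tie check in the solution compares proportions with `==`. That is exact for the sample data, but it could miss a tie when proportions are the result of floating-point arithmetic. As a possible variation, the sketch below groups row indices by sector and treats values within a small tolerance of the maximum as tied. The function name `label_dominance` and the `tol` argument are made up for this illustration and are not part of the original answer; the column names are assumed to match the question's data frame.

```r
# Hypothetical helper (not from the original answer): returns one label per row.
# Assumes columns `sector`, `subspecies` ("type a"/"type b") and `proportion`.
label_dominance <- function(df, tol = 1e-8) {
  # Row indices of df, grouped by sector
  groups <- split(seq_len(nrow(df)), df$sector)
  labels <- vapply(groups, function(i) {
    p <- df$proportion[i]
    # Rows whose proportion is within `tol` of the sector maximum
    top <- i[abs(p - max(p)) < tol]
    if (length(top) > 1L) {
      "tie"
    } else {
      paste(sub("type ", "", df$subspecies[top]), "dominant")
    }
  }, character(1))
  # Map the per-sector label back onto every row of that sector
  unname(labels[as.character(df$sector)])
}

x <- data.frame(sector = rep(1:5, each = 2),
                subspecies = rep(c("type a", "type b"), 5),
                proportion = c(.2, 1-.2, .3, 1-.3, .4, 1-.4, .5, 1-.5, .6, 1-.6))
x$dominance <- label_dominance(x)
```

For the sample data this should give the same `dominance` column as `compare()`. The tolerance only matters for near-equal values such as `0.1 + 0.2` versus `0.3`, which `==` reports as different.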