R - Replace values in a sampled subset
I have a dataset of 2000 individuals. 330 of them have missing data in a vector named y.n.17 (culture results). I want to replace 17 of the 330 missing (NA) values with "1", indicating a positive culture result.
This is the line of code I am trying to use:
y.n.17[sample(is.na(y.n.17),17)]=1
It seems to replace 17 individuals with "1" in every 100 individuals, whether or not they are NA! What am I doing wrong?
Let n be the length of y.n.17 and m < n the number of NAs in the vector. is.na(y.n.17) is a boolean vector of length n containing m TRUE and n-m FALSE values. When you sample that vector by doing sample(is.na(y.n.17),17), you get a vector of length 17 made of randomly selected TRUE or FALSE values: a lot of FALSE, and maybe one TRUE. When you then do y.n.17[sample(is.na(y.n.17),17)]=1, that logical vector of length 17 is recycled along the whole of y.n.17, so a 1 is inserted at regular intervals, whether or not the value at that position is NA.
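The recycling described above can be seen on a small made-up vector. This is only an illustrative sketch: the vector y, its length of 20, and the hand-written index short.idx (standing in for what sample(is.na(y), 4) might return) are assumptions and not part of the original data.

```r
# Toy data (hypothetical): 20 values, with NAs at positions 3 and 11
y <- rep(0, 20)
y[c(3, 11)] <- NA

# A short logical index, standing in for the result of sample(is.na(y), 4)
short.idx <- c(FALSE, TRUE, FALSE, FALSE)

# R recycles the 4-element logical index along all 20 positions
y[short.idx] <- 1

which(y == 1)    # 2 6 10 14 18: none of these positions held an NA
which(is.na(y))  # 3 11: the NAs are untouched
```

The single TRUE in the short index lands on every 4th position, which mirrors the regular pattern seen in the question.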
What you mean to do is:
na.idx <- which(is.na(y.n.17))
replace.idx <- head(sample(na.idx), 17)
y.n.17[replace.idx] <- 1
Note: doing head(sample(na.idx), 17) is more robust than sample(na.idx, 17), since it will still work when your data has fewer than 17 NAs. If you prefer your code to error out in that case, use y.n.17[sample(which(is.na(y.n.17)), 17)] <- 1.
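One caveat worth keeping in mind: when sample() is given a single number x >= 1, it samples from 1:x, so both forms above can misbehave if the vector happens to contain exactly one NA. A minimal sketch that sidesteps this by sampling positions within na.idx is shown below. The helper name replace_na_sample, its arguments, and the simulated data are hypothetical and only meant to match the counts in the question.

```r
# Hypothetical helper: set up to k of the NA entries of x to `value`
replace_na_sample <- function(x, k, value = 1) {
  na.idx <- which(is.na(x))
  # Sample positions within na.idx rather than na.idx itself, so a single
  # NA index is never interpreted by sample() as the range 1:na.idx
  pick <- na.idx[sample.int(length(na.idx), min(k, length(na.idx)))]
  x[pick] <- value
  x
}

set.seed(42)                      # arbitrary seed, for reproducibility only
y.n.17 <- rep(0, 2000)            # simulated data, not the real culture results
y.n.17[sample(2000, 330)] <- NA   # 330 distinct missing values

y.new <- replace_na_sample(y.n.17, 17)
sum(is.na(y.n.17))                # 330
sum(is.na(y.new))                 # 313
sum(y.new == 1, na.rm = TRUE)     # 17
```

Capping the sample size with min(k, length(na.idx)) keeps the "works with fewer than 17 NAs" behaviour of the head() version; dropping the min() would make the function error out instead.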