Efficiency in assigning programmatically in R
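In summary, I have a script that imports lots of data stored in several txt files. Within a single file, not all rows go into the same table (a data frame, which I am switching to data.table), so for each file I select the rows belonging to the same data frame, get that data frame, and assign the rows to it.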
The first time I create a data frame named, say, table1, I do:

    name <- "table1"  # in my code the value of name depends on different factors
                      # and is **not** known in advance
    assign(name, somerows)
Then, during execution, my code may find (in other files) further lines to put in the table1 data frame, so:

    name <- "table1"
    assign(name, rbind.fill(get(name), somerows))
My question is: is assign(get(string), anyobject) the best way of doing assignment programmatically? Thanks.
Edit:
Here is a simplified version of my code (each item in datasource is the result of read.table() on one single text file):

    set.seed(1)
    #
    datasource <- list(
      data.frame(filetype = rep(letters[1:2], each=4),
                 id = rep(letters[1:4], each=2),
                 var1 = as.integer(rnorm(8))),
      data.frame(filetype = rep(letters[1:2], each=4),
                 id = rep(letters[1:4], each=2),
                 var1 = as.integer(rnorm(8))))
    #
    library(plyr)
    #
    tablesnames <- unique(unlist(lapply(datasource, function(x) as.character(unique(x[,1])))))
    for(l in tablesnames){
      temp <- lapply(datasource, function(x) x[x[,1]==l, -1])
      if(exists(l)) {
        assign(l, rbind.fill(get(l), rbind.fill(temp)))
      } else {
        assign(l, rbind.fill(temp))
      }
    }
    #
    # now 2 data frames, a and b, are created
    #
    # different method using rbindlist in place of rbind.fill (faster and, so far,
    # I have no missing columns to fill)
    #
    rm(a,b)
    library(data.table)
    #
    tablesnames <- unique(unlist(lapply(datasource, function(x) as.character(unique(x[,1])))))
    for(l in tablesnames){
      temp <- lapply(datasource, function(x) x[x[,1]==l, -1])
      if(exists(l)) {
        assign(l, rbindlist(list(get(l), rbindlist(temp))))
      } else {
        assign(l, rbindlist(temp))
      }
    }
I recommend using a named list and skipping assign and get altogether. Many of the cool R features (lapply, for example) work on lists and do not work with variables created through assign and get. In addition, you can easily pass a list into a function, while this can be cumbersome with groups of variables combined through assign and get.
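As a rough illustration of the named-list idea, here is a minimal sketch in base R. It assumes every source has the same columns, so plain rbind is enough (if columns can differ, rbind.fill from plyr would take its place). The list name `tables` and the loop variables `src`, `nm` and `rows` are made up for this example, and `datasource` is a stand-in for the one in the question.

```r
# Stand-in for the question's datasource: one data frame per input file
set.seed(1)
datasource <- list(
  data.frame(filetype = rep(letters[1:2], each = 4),
             id = rep(letters[1:4], each = 2),
             var1 = as.integer(rnorm(8))),
  data.frame(filetype = rep(letters[1:2], each = 4),
             id = rep(letters[1:4], each = 2),
             var1 = as.integer(rnorm(8)))
)

# A named list takes the place of the variables created with assign()
tables <- list()
for (src in datasource) {
  for (nm in unique(as.character(src$filetype))) {
    rows <- src[src$filetype == nm, -1, drop = FALSE]
    # tables[[nm]] is NULL the first time a name is seen,
    # and rbind(NULL, rows) simply returns rows
    tables[[nm]] <- rbind(tables[[nm]], rows)
  }
}

# The list can be handed to lapply/sapply or passed to a function as one object
row_counts <- sapply(tables, nrow)
```

No exists() check is needed, because looking up a missing name in a list returns NULL rather than raising an error, and a table name can never collide with an unrelated object in the workspace.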
If you want to read a set of files into one big data.frame, I'd use something like this (assuming CSV-like text files):

    library(plyr)
    # pattern is a regular expression, not a shell glob
    list_of_files = list.files(pattern = "\\.csv$")
    big_dataframe = ldply(list_of_files, read.csv)
Or, if you want to keep the result in a list:

    big_list = lapply(list_of_files, read.csv)

and possibly use rbind.fill:

    big_dataframe = do.call("rbind.fill", big_list)
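To connect this back to the question, one possible follow-up is to split the combined data frame into one table per file type, again held in a named list. The sketch below is only an illustration under assumptions: it writes two small made-up CSV files to a temporary directory so it can run on its own, uses base rbind instead of rbind.fill because the columns match, and assumes the first column (`filetype`) identifies the target table, as in the question's example.

```r
# Write two small hypothetical CSV files to a temporary directory
tmp <- file.path(tempdir(), "assign_demo")
dir.create(tmp, showWarnings = FALSE)
for (i in 1:2) {
  df <- data.frame(filetype = rep(c("a", "b"), each = 4),
                   id = rep(letters[1:4], each = 2),
                   var1 = i * (1:8))
  write.csv(df, file.path(tmp, paste0("file", i, ".csv")), row.names = FALSE)
}

# Read every CSV file into a list, then bind the pieces into one data frame
list_of_files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
big_list <- lapply(list_of_files, read.csv)
big_dataframe <- do.call(rbind, big_list)

# One data frame per file type, stored in a named list (no assign/get needed)
tables <- split(big_dataframe[, -1], big_dataframe$filetype)
```

After this, `tables$a` or `tables[["a"]]` plays the role that the variable `a` had in the question, and the name can come from a string computed at run time.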