Description of the filehash tag
Working with large datasets in R can be cumbersome because objects must be kept in physical memory. While many see that as a feature of the system, the need to keep whole objects in memory creates challenges for those who want to work interactively with large datasets. Here we take a simple definition of "large dataset": any dataset that cannot be loaded into R as a single R object because of memory limitations. For example, a data frame might be too large for all of its rows and columns to be loaded at once. In such a situation, one might load only a subset of the rows or columns, if that is possible.
The filehash package provides a full read-write implementation of a key-value database for R. The package does not depend on any external packages (beyond those provided in a standard R installation) or software systems and is written entirely in R, making it readily usable on most platforms. The filehash package represents a database as an instance of an S4 class and operates directly on the S4 object via various methods.
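The key-value workflow described above can be sketched as follows. This is a minimal illustration rather than a complete tour of the package: it assumes filehash is installed and uses its default database format, and the temporary path and the keys "a" and "b" are made up for the example.

```r
library(filehash)

# Hypothetical location for the database; a temporary path keeps
# the example self-contained and leaves no files behind.
db_path <- tempfile("filehash_demo_")

dbCreate(db_path)        # create an empty database on disk
db <- dbInit(db_path)    # S4 object that represents the database

# Store two R objects under string keys (illustrative names).
dbInsert(db, "a", rnorm(100))
dbInsert(db, "b", letters)

has_a <- dbExists(db, "a")   # check whether a key is present
keys  <- dbList(db)          # list all keys currently stored
x     <- dbFetch(db, "b")    # read the object stored under "b" back from disk

dbDelete(db, "a")            # remove a single key
dbUnlink(db)                 # delete the whole database from disk
```

Only the objects that are fetched are brought into memory, so the values stored in the database do not all have to fit in RAM at once.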
Text adapted from: Peng, Roger, "INTERACTING WITH DATA USING THE FILEHASH PACKAGE FOR R" (June 2006). Johns Hopkins University, Dept. of Biostatistics Working Papers. Working Paper 108. http://biostats.bepress.com/jhubiostat/paper108 and http://cran.r-project.org/web/packages/filehash/vignettes/filehash.pdf