Package: ff
Version: 2.0.1
Date: 2008-08-03
Title: memory-efficient storage of large atomic vectors and arrays on
        disk and fast access functions
Author: Daniel Adler <dadler@uni-goettingen.de>, Christian Gläser
        <christian_glaeser@gmx.de>, Oleg Nenadic
        <onenadi@uni-goettingen.de>, Jens Oehlschlägel
        <Jens.Oehlschlaegel@truecluster.com>, Walter Zucchini
        <wzucchi@uni-goettingen.de>
Maintainer: Jens Oehlschlägel <Jens.Oehlschlaegel@truecluster.com>
Depends: R (>= 2.1), utils
Suggests:
Description: The ff package provides atomic data structures that are
        stored on disk but behave (almost) as if they were in RAM by
        transparently mapping only a section (pagesize) into main memory,
        which is the effective virtual memory consumption per ff object. ff
        supports atomic data types 'double', 'logical', 'raw' and
        'integer' and close-to-atomic types 'factor', 'POSIXct' and
        custom close-to-atomic types. ff now has native C-support for
        vectors, matrices and arrays with flexible dimorder (column-major
        order, row-major order and generalizations for arrays).
        ff objects store raw data in binary flat files in native
        encoding, and complement this with metadata stored in R as
        physical and virtual attributes. ff objects have well-defined
        hybrid copying semantics, which gives rise to certain
        performance improvements through virtualization. ff objects can
        be stored and reopened across R sessions. ff files can be
        shared by multiple ff R objects (using different data
        en/de-coding schemes) in the same process or from multiple R
        processes to exploit parallelism. A wide choice of finalizer
        options allows working with 'permanent' files as well as
        creating/removing 'temporary' ff files completely transparently
        to the user. On certain OS/filesystem combinations, creating
        the ff files works without notable delay thanks to using sparse
        file allocation. Several access optimization techniques such as
        Hybrid Index Preprocessing and Virtualization are implemented
        to achieve good performance even with large datasets, for
        example virtual matrix transpose without touching a single byte
        on disk. Further, to reduce disk I/O, the atomic data gets
        stored natively and compactly in binary flat files, e.g. logicals
        take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond
        basic access functions, the ff package also provides
        compatibility functions that facilitate writing code for ff and
        ram objects and support for batch processing on ff objects
        (e.g. as.ram, as.ff, ffapply). NOTE: A professional extension
        is available from the authors, which integrates additional
        high-performance features neatly into the ff package. The
        extension allows efficient handling of symmetric matrices and
        supports more packed data types: boolean (1 bit), quad (2 bit
        unsigned), nibble (4 bit unsigned), byte (1 byte signed with
        NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs),
        ushort (2 byte unsigned), single (4 byte float with NAs). For
        example 'quad' allows efficient storage of genomic data as an
        'A','T','G','C' factor. The unsigned types support 'circular'
        arithmetic.
License: file LICENSE
Encoding: latin1
Packaged: 2009-04-16 06:49:00 UTC; hornik
Repository: CRAN
Date/Publication: 2009-04-16 06:57:14
