Advantages and Disadvantages of R

I want to share an excerpt from the book Data Mining with Rattle and R by Graham Williams, which neatly summarizes the advantages and disadvantages of R.

ADVANTAGES

R is the most comprehensive statistical analysis package available. It incorporates all of the standard statistical tests, models, and analyses, as well as providing a comprehensive language for managing and manipulating data. New technology and ideas often appear first in R.

R is a programming language and environment developed for statistical analysis by practising statisticians and researchers. It reflects well on a very competent community of computational statisticians.

R is now maintained by a core team of some 19 developers, including some very senior statisticians.

The graphical capabilities of R are outstanding, providing a fully programmable graphics language that surpasses most other statistical and graphical packages.

The validity of the R software is ensured through openly validated and comprehensive governance as documented for the US Food and Drug Administration (R Foundation for Statistical Computing, 2008). Because R is open source, unlike closed source software, it has been reviewed by many internationally renowned statisticians and computational scientists.

R is free and open source software, allowing anyone to use and, importantly, to modify it. R is licensed under the GNU General Public License, with copyright held by The R Foundation for Statistical Computing.

R has no license restrictions (other than ensuring our freedom to use it at our own discretion), and so we can run it anywhere and at any time, and even sell it under the conditions of the license.

Anyone is welcome to provide bug fixes, code enhancements, and new packages, and the wealth of quality packages available for R is a testament to this approach to software development and sharing.

R has over 4800 packages available from multiple repositories specializing in topics like econometrics, data mining, spatial analysis, and bio-informatics.

R is cross-platform. R runs on many operating systems and different hardware. It is popularly used on GNU/Linux, Macintosh, and Microsoft Windows, running on both 32-bit and 64-bit processors.

R plays well with many other tools, importing data, for example, from CSV files, SAS, and SPSS, or directly from Microsoft Excel, Microsoft Access, Oracle, MySQL, and SQLite. It can also produce graphics output in PDF, JPG, PNG, and SVG formats, and table output for LaTeX and HTML.

R has active user groups where questions can be asked and are often quickly responded to, often by the very people who developed the environment; this support is second to none. Have you ever tried getting support from the core developers of a commercial vendor?

New books for R (the Springer Use R! series) are emerging, and there is now a very good library of books for using R.
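The import and export point above can be tried out with base R alone. The following is a minimal sketch, not part of the book excerpt; the file names are temporary files and the data are made up purely for illustration.

```r
# Write a small CSV to a temporary file, then read it back in.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:5, y = c(2, 4, 6, 8, 10)),
          csv_path, row.names = FALSE)
dat <- read.csv(csv_path)

# Send a scatter plot to a PDF file instead of the screen.
pdf_path <- tempfile(fileext = ".pdf")
pdf(pdf_path)
plot(dat$x, dat$y, main = "y against x")
invisible(dev.off())  # close the device so the file is written out
```

Swapping pdf() for png() or svg() should give the other graphics formats mentioned, though svg() depends on how R was built.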

DISADVANTAGES

R has a steep learning curve (it does take a while to get used to the power of R), but no steeper than for other statistical languages.

R is not so easy to use for the novice. There are several simple-to-use graphical user interfaces (GUIs) for R that encompass point-and-click interactions, but they generally do not have the polish of the commercial offerings.

Documentation is sometimes patchy and terse, and impenetrable to the non-statistician. However, some very high-standard books are increasingly plugging the documentation gaps.

The quality of some packages is less than perfect, although if a package is useful to many people, it will quickly evolve into a very robust product through collaborative efforts.

There is, in general, no one to complain to if something doesn’t work. R is a software application that many people freely devote their own time to developing. Problems are usually dealt with quickly on the open mailing lists, and bugs disappear with lightning speed. Users who do require it can purchase support from a number of vendors internationally.

Many R commands give little thought to memory management, and so R can very quickly consume all available memory. This can be a restriction when doing data mining. There are various solutions, including using 64-bit operating systems that can access much more memory than 32-bit ones.
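To get a feel for the memory point, it can help to measure how much space an object takes. This is a minimal sketch using base R functions, not something from the book; the vector length is an arbitrary choice for illustration.

```r
# A numeric vector stores 8 bytes per element, plus a small header.
x <- numeric(1e6)
size <- object.size(x)
print(size, units = "auto")  # human-readable size of the vector

# Drop the object and ask R to run a garbage collection.
rm(x)
invisible(gc())
```

Checking object.size() on intermediate results is one simple way to spot which step of an analysis is consuming memory.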

Here is the link to purchase the book on Amazon: Data Mining with Rattle and R
