R Free Statistical Software Overview
R, from the R Project for Statistical Computing, is a free, open-source statistical package. R uses a command-line syntax, meaning that you type commands to tell R what to do. Separate projects add a graphical user interface (GUI) to R: RStudio adds point-and-click conveniences like opening files and viewing charts, and R Commander adds menus for common analyses. Even with these, most work in R is still done by typing commands.
R is available in both 32-bit and 64-bit versions for Windows and Mac. R is also available for Debian, Red Hat, Suse, and Ubuntu versions of Linux. R can be downloaded from the R Project for Statistical Computing website. All versions of R are completely free.
The standard R interface consists of a number of menus and a window for the command-line syntax. Menus include File, Edit, View, Misc, Packages, Windows, and Help. Misc provides point-and-click access to control over computations, help in syntax writing such as automatic word completion, and control over objects. All of these commands can also be accessed through the syntax. The Packages menu lets the user install, update, and load packages.
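As a rough sketch of what "accessed through the syntax" means, the object controls in the Misc menu have typed equivalents. The object name scores below is made up for illustration.

```r
# Create a small object so there is something to list (hypothetical example data).
scores <- c(12, 15, 9)

# List the objects in the current workspace.
ls()

# List the search path: the packages and environments R looks through.
search()

# Remove a single object from the workspace.
rm(scores)
```

Typing rm(list = ls()) removes every object at once, so use it with care.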
R Base and Packages
R consists of a base and packages. When you download and install R, you get the base and a small set of packages. When you open R only the base is in use. If you want to do something that is not available in the base, such as creating a specific type of chart, you first install a package with that functionality. Then you load that package. Installing and loading a package can be accomplished through either the GUI or the syntax. Once a package is installed in R it doesn’t have to be installed again, but it does have to be loaded in every session where it will be used. Like R, packages are completely free.
R packages are great for a number of reasons. First, using packages for advanced features allows the R base to be clean and small because it doesn’t have to be everything to everyone. Second, packages allow anyone to contribute to R. If someone wants specific functionality that isn’t available in R, they can develop a package that provides that function and make it available to every R user. Third, R packages provide options. Many developers can develop packages and users can choose the packages they like best. To install packages in R, click Packages, then Install Package(s). Select the mirror site (server) that you want to download from. Then select the package. R will install the package automatically.
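The same install-then-load steps can be typed at the console. This is a minimal sketch that uses ggplot2 as the example package. The mirror address and the check that skips an already-installed package are my own choices, not requirements.

```r
# Install once. The check skips the download if ggplot2 is already installed.
# Assumption: the cloud.r-project.org mirror; any CRAN mirror works.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2", repos = "https://cloud.r-project.org")
}

# Load the package. This step is needed in every session that uses it.
library(ggplot2)
```

The install step needs an internet connection the first time. After that, only the library() line is needed.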
R Learning Curve
If you’re used to SPSS or any other statistical software that looks like a spreadsheet and is operated through a GUI, R will seem very unfamiliar. Unfortunately, R has an extremely steep learning curve because almost any procedure you might want to do in R requires one or more lines of code that you need to remember or copy from a reference. For example, if I want to show R’s Data Editor, I first have to load some of the example data that comes with R. To do this, I type one line of code into the command-line interface, also known as the R Console, and press Enter. The command tells R to show the example dataset ChickWeight in the R Data Editor window. Note: Type your code just to the right of the > in the R Console.
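One command that does this is fix(). I am assuming it here as an illustration, because other commands such as edit() also open the editor. The check for an interactive session is my addition so that the same lines also run from a script, where no editor window can open.

```r
# ChickWeight ships with R in the datasets package, so no download is needed.
if (interactive()) {
  # Opens the dataset in the R Data Editor window (needs a desktop session).
  fix(ChickWeight)
} else {
  # Outside an interactive session, just print the first few rows instead.
  print(head(ChickWeight))
}
```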
R Data Editor
In the R Data Editor window each row represents a case (for example, a test subject). Each column represents a variable. Double-click on a cell to change the data. Click the name of a variable to change the name or type of a variable. Unfortunately, that is all you can do in the R Data Editor. It is not possible to add or remove rows or columns in the R Data Editor. Other data editing must be done through the command-line interface.
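As a sketch of what that command-line editing might look like, the lines below add a column, remove rows, and remove a column. They work on a copy of ChickWeight. The names chicks and weight_kg are made up for illustration.

```r
# Work on a plain data frame copy so the built-in dataset stays unchanged.
chicks <- as.data.frame(ChickWeight)

# Add a column: weight is recorded in grams, so divide by 1000 for kilograms.
chicks$weight_kg <- chicks$weight / 1000

# Remove rows: keep only the measurements taken after day 0.
chicks <- chicks[chicks$Time > 0, ]

# Remove a column by setting it to NULL.
chicks$Diet <- NULL

# Check the result.
head(chicks)
```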
One of R’s strengths is data analysis. R packages can produce almost any analysis you can think of. These range from simple descriptive statistics to very specific, sophisticated packages like SGP, used to produce student growth percentiles. The trick is figuring out which package you need and how to use it. Since packages are written by many different developers, the commands can be inconsistent and the documentation can be hit-or-miss.
An example of descriptive statistics is the summary command, which is part of base R. Using this command we’ll get descriptives on the variables in the ChickWeight dataset.
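Typed at the console, the command looks like this. The second line is an extra illustration showing that the same command also works on a single variable.

```r
# Descriptive statistics for every variable in the example dataset.
summary(ChickWeight)

# The same command applied to one variable only.
summary(ChickWeight$weight)
```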
R displays the results in the command-line interface.
As the results of the summary command show, R’s plain-text output leaves something to be desired. The descriptive statistics are functional but they don’t look very polished. It is possible to save R output to a file, but it requires additional commands that this article does not cover in depth. Oftentimes it is best to copy and paste the output into Word and clean it up there.
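For the curious, here is one minimal sketch of saving output to a text file. It captures the printed summary as lines of text and writes them out. The file name is made up, and the file goes to R's temporary folder so nothing on your computer is overwritten.

```r
# Hypothetical output location inside R's temporary folder.
out_file <- file.path(tempdir(), "chickweight_summary.txt")

# Capture what summary() would print, as a vector of text lines.
summary_lines <- capture.output(summary(ChickWeight))

# Write those lines to the text file.
writeLines(summary_lines, out_file)
```

The resulting file is plain text, so it still needs cleaning up before it goes into a report.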
While it is possible to create very impressive documents using R with LaTeX or other additional code, these are projects for an advanced R user and even then can be very frustrating. Despite the difficulty, if you want to create reproducible research, R is one method.
R is capable of producing some very impressive charts. Although the R base can produce charts, many users prefer the package ggplot2. Like everything else in R, charts are produced through the command-line interface. Fortunately, charts in R are easily output to png or similar files for pasting into documents.
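As a small sketch of a base R chart saved to a png file, the lines below plot chick weight against age. The file name and image size are arbitrary choices of mine.

```r
# Hypothetical output location inside R's temporary folder.
png_file <- file.path(tempdir(), "chick_weight_base.png")

# Open a png file as the drawing surface.
png(png_file, width = 800, height = 600)

# Draw a scatterplot of weight against age using base R graphics.
plot(weight ~ Time, data = ChickWeight,
     xlab = "Age (days)", ylab = "Weight (grams)",
     main = "Chick weight by age")

# Close the file so the chart is written to disk.
dev.off()
```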
An example of a ggplot2 chart:
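As a minimal sketch, the code below draws one possible ggplot2 chart of the ChickWeight data: one line per chick, coloured by diet. It assumes ggplot2 is already installed. The chart design, file name, and image size are my own choices for illustration.

```r
library(ggplot2)  # assumes the package has already been installed

# Build the chart: each chick's weight over time, coloured by diet.
p <- ggplot(ChickWeight, aes(x = Time, y = weight, colour = Diet)) +
  geom_line(aes(group = Chick)) +
  labs(x = "Age (days)", y = "Weight (grams)",
       title = "Chick weight by age and diet")

# Save the chart as a png file (hypothetical name, in R's temporary folder).
chart_file <- file.path(tempdir(), "chick_weight_ggplot.png")
ggsave(chart_file, plot = p, width = 6, height = 4, dpi = 150)
```

Typing p at the console shows the chart on screen instead of saving it.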
R syntax files are saved as .R. The R workspace, including objects and functions, is saved as an .RData file. The R commands used in a session are saved as .Rhistory. All of these filetypes are specific to R and are not shared with other software. Note, though, that R can open many types of data files, including comma-delimited, tab-delimited, Excel, EpiInfo, Minitab, S-PLUS, SAS, SPSS, Stata, Systat, and Octave. Some of these formats require an additional package. R can also connect to databases and write many types of files.
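As a short sketch of reading and writing files, the lines below write the example data to a comma-delimited file, read it back, and then save and reload it as an .RData file. The file and object names are made up, and everything goes to R's temporary folder.

```r
# Hypothetical file locations inside R's temporary folder.
csv_file   <- file.path(tempdir(), "chick_weight.csv")
rdata_file <- file.path(tempdir(), "chicks.RData")

# Write the example dataset to a comma-delimited file.
write.csv(ChickWeight, csv_file, row.names = FALSE)

# Read the comma-delimited file back into R.
chicks <- read.csv(csv_file)

# Save the object in R's own format, remove it, and load it again.
save(chicks, file = rdata_file)
rm(chicks)
load(rdata_file)

# The object is back in the workspace under its original name.
head(chicks)
```

Other formats work along the same lines but use different functions, often from an add-on package.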