<li><aname="toc-Importing-from-other-statistical-systems-1"href="#Importing-from-other-statistical-systems">3 Importing from other statistical systems</a>
Next: <ahref="#Acknowledgements"accesskey="n"rel="next">Acknowledgements</a> [<ahref="#SEC_Contents"title="Table of contents"rel="contents">Contents</a>][<ahref="#Function-and-variable-index"title="Index"rel="index">Index</a>]</p>
</div>
<aname="R-Data-Import_002fExport"></a>
<h1class="top">R Data Import/Export</h1>
<p>This is a guide to importing and exporting data to and from R.
<tr><tdalign="left"valign="top">•<ahref="#Importing-from-other-statistical-systems"accesskey="4">Importing from other statistical systems</a>:</td><td> </td><tdalign="left"valign="top">
<tr><tdalign="left"valign="top">•<ahref="#Function-and-variable-index">Function and variable index</a>:</td><td> </td><tdalign="left"valign="top">
Next: <ahref="#Introduction"accesskey="n"rel="next">Introduction</a>, Previous: <ahref="#Top"accesskey="p"rel="prev">Top</a>, Up: <ahref="#Top"accesskey="u"rel="up">Top</a> [<ahref="#SEC_Contents"title="Table of contents"rel="contents">Contents</a>][<ahref="#Function-and-variable-index"title="Index"rel="index">Index</a>]</p>
</div>
<aname="Acknowledgements-1"></a>
<h2class="unnumbered">Acknowledgements</h2>
<p>The relational databases part of this manual is based in part on an
earlier manual by Douglas Bates and Saikat DebRoy. The principal author
of this manual was Brian Ripley.
</p>
<p>Many volunteers have contributed to the packages used here. The
principal authors of the packages mentioned are
</p>
<blockquote>
<tablesummary="">
<tr><td><ahref="https://CRAN.R-project.org/package=DBI"><strong>DBI</strong></a></td><td>David A. James</td></tr>
<tr><td><ahref="https://CRAN.R-project.org/package=dataframes2xls"><strong>dataframes2xls</strong></a></td><td>Guido van Steen</td></tr>
<tr><td><ahref="https://CRAN.R-project.org/package=foreign"><strong>foreign</strong></a></td><td>Thomas Lumley, Saikat DebRoy, Douglas Bates, Duncan Murdoch and Roger Bivand</td></tr>
<tr><td><ahref="https://CRAN.R-project.org/package=gdata"><strong>gdata</strong></a></td><td>Gregory R. Warnes</td></tr>
<tr><tdalign="left"valign="top">•<ahref="#Export-to-text-files"accesskey="2">Export to text files</a>:</td><td> </td><tdalign="left"valign="top">
Next: <ahref="#Export-to-text-files"accesskey="n"rel="next">Export to text files</a>, Previous: <ahref="#Introduction"accesskey="p"rel="prev">Introduction</a>, Up: <ahref="#Introduction"accesskey="u"rel="up">Introduction</a> [<ahref="#SEC_Contents"title="Table of contents"rel="contents">Contents</a>][<ahref="#Function-and-variable-index"title="Index"rel="index">Index</a>]</p>
it is best to use what Windows calls ‘Unicode’<aname="DOCF2"href="#FOOT2"><sup>2</sup></a>, that is <code>"UTF-16LE"</code>. Using UTF-8 is a good way
to make portable files that will not easily be confused with any other
encoding, but even OS X applications (where UTF-8 is the system
encoding) may not recognize them, and Windows applications are most
unlikely to. Apparently Excel:mac 2004/8 expects <code>.csv</code> files in
<code>"macroman"</code> encoding (the encoding used in much earlier versions
of Mac OS).
</p>
</li></ol>
<aname="index-write_002ematrix"></a>
<p>Function <code>write.matrix</code> in package <ahref="https://CRAN.R-project.org/package=MASS"><strong>MASS</strong></a> provides a
specialized interface for writing matrices, with the option of writing
them in blocks and thereby reducing memory usage.
</p>
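<p>A minimal sketch of block-wise writing follows. The matrix <code>x</code> and the block size of 100 rows are illustrative assumptions, not values taken from this manual.
</p>

```r
library(MASS)  # recommended package, normally installed with R

# Hypothetical data: a 1000 x 4 integer matrix with column names.
x <- matrix(seq_len(4000), nrow = 1000, ncol = 4,
            dimnames = list(NULL, c("a", "b", "c", "d")))

out <- tempfile(fileext = ".txt")

# blocksize = 100 formats and writes 100 rows at a time, so the whole
# matrix is never converted to character in one go.
write.matrix(x, file = out, blocksize = 100)

# Read the file back to check the round trip (fields are space-separated).
back <- as.matrix(read.table(out, header = TRUE))
```

<p>Each block is formatted separately, so column widths may differ between blocks; this does not matter when the file is read back with white space as the separator.
</p>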
<aname="index-sink"></a>
<p>It is possible to use <code>sink</code> to divert the standard R output to
a file, and thereby capture the output of (possibly implicit)
<code>print</code> statements. This is not usually the most efficient route,
and the <code>options(width)</code> setting may need to be increased.
</p>
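<p>As a sketch of this approach, the following diverts printed output to a temporary file. The width of 200 is an arbitrary choice for illustration, and explicit <code>print</code> calls are used so that the example behaves the same when run via <code>source</code>.
</p>

```r
out <- tempfile(fileext = ".txt")
old <- options(width = 200)  # widen lines so printed rows are not wrapped

sink(out)                    # divert standard output to the file
print(head(mtcars, 3))       # printed output now goes to 'out'
print(summary(mtcars$mpg))
sink()                       # restore output to the console
options(old)                 # restore the previous width

captured <- readLines(out)   # the text that would have been shown
```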
<aname="index-write_002eforeign"></a>
<p>Function <code>write.foreign</code> in package <ahref="https://CRAN.R-project.org/package=foreign"><strong>foreign</strong></a> uses
<code>write.table</code> to produce a text file and also writes a code file
that will read this text file into another statistical package. There is
currently support for export to <code>SAS</code>, <code>SPSS</code> and <code>Stata</code>.
</p>
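<p>A short sketch of such an export, here to <code>SPSS</code>; the data frame and file names are made up for illustration.
</p>

```r
library(foreign)  # recommended package, normally installed with R

# Hypothetical data frame with a numeric and a factor column.
df <- data.frame(id = 1:3,
                 score = c(2.5, 3.0, 4.5),
                 group = factor(c("a", "b", "a")))

datafile <- tempfile(fileext = ".txt")  # plain-text data
codefile <- tempfile(fileext = ".sps")  # SPSS syntax that reads 'datafile'

write.foreign(df, datafile = datafile, codefile = codefile,
              package = "SPSS")

data_lines <- readLines(datafile)
code_lines <- readLines(codefile)
```

<p>The code file refers to the data file by the path given here, so in practice both files should be written to a location the other system can see.
</p>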
<hr>
<aname="XML"></a>
<aname="XML-1"></a>
<h3class="section">1.3 XML</h3>
<aname="index-XML"></a>
<p>When reading data from text files, it is the responsibility of the user
to know and to specify the conventions used to create that file,
e.g. the comment character, whether a header line is present, the value
separator, the representation for missing values (and so on) described
in <ahref="#Export-to-text-files">Export to text files</a>. A markup language which can be used to
describe not only content but also the structure of the content can
make a file self-describing, so that one need not provide these details
to the software reading the data.
</p>
<p>The eXtensible Markup Language – more commonly known simply as
<acronym>XML</acronym> – can be used to provide such structure, not only for
standard datasets but also more complex data structures.
<acronym>XML</acronym> is becoming extremely popular and is emerging as a
standard for general data markup and exchange. It is being used by
different communities to describe geographical data such as maps,
graphical displays, mathematics and so on.
</p>
<p><acronym>XML</acronym> provides a way to specify the file’s encoding, e.g.
<p>NB: <ahref="https://CRAN.R-project.org/package=XML"><strong>XML</strong></a> is available as a binary package for Windows, normally
from the ‘CRAN extras’ repository (which is selected by default on
Windows).
</p>
<hr>
<aname="Spreadsheet_002dlike-data"></a>
<divclass="header">
<p>
Next: <ahref="#Importing-from-other-statistical-systems"accesskey="n"rel="next">Importing from other statistical systems</a>, Previous: <ahref="#Introduction"accesskey="p"rel="prev">Introduction</a>, Up: <ahref="#Top"accesskey="u"rel="up">Top</a> [<ahref="#SEC_Contents"title="Table of contents"rel="contents">Contents</a>][<ahref="#Function-and-variable-index"title="Index"rel="index">Index</a>]</p>
<tr><tdalign="left"valign="top">•<ahref="#Variations-on-read_002etable"accesskey="1">Variations on read.table</a>:</td><td> </td><tdalign="left"valign="top">
<tr><tdalign="left"valign="top">•<ahref="#Data-Interchange-Format-_0028DIF_0029"accesskey="3">Data Interchange Format (DIF)</a>:</td><td> </td><tdalign="left"valign="top">
<p>(This would most likely work without specifying an encoding in a UTF-8 locale.)
</p>
<p>Another problem with this (real-life) example is that whereas
<code>file-5.03</code> reported the BOM, <code>file-4.17</code> found on OS X
10.5 (Leopard) did not.
</p></li></ol>
<aname="index-read_002ecsv"></a>
<aname="index-read_002ecsv2"></a>
<aname="index-read_002edelim"></a>
<aname="index-read_002edelim2"></a>
<aname="index-CSV-files-1"></a>
<aname="index-Sys_002elocaleconv"></a>
<aname="index-locales"></a>
<p>Convenience functions <code>read.csv</code> and <code>read.delim</code> provide
arguments to <code>read.table</code> appropriate for CSV and tab-delimited
files exported from spreadsheets in English-speaking locales. The
variations <code>read.csv2</code> and <code>read.delim2</code> are appropriate for
use in those locales where the comma is used for the decimal point and
(for <code>read.csv2</code>) for spreadsheets which use semicolons to separate
fields.
</p>
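<p>To illustrate the difference, the sketch below reads the same (invented) two-row table written under each convention, using the <code>text</code> argument of <code>read.table</code> instead of a file.
</p>

```r
# The same data as exported in an English-speaking locale ...
english <- "x,y\n1.5,a\n2.25,b\n"
# ... and in a locale using a decimal comma and semicolon separators.
continental <- "x;y\n1,5;a\n2,25;b\n"

d1 <- read.csv(text = english)       # sep = ",", dec = "."
d2 <- read.csv2(text = continental)  # sep = ";", dec = ","
```

<p>Both calls should give the same data frame, with <code>x</code> read as numeric.
</p>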
<p>If the options to <code>read.table</code> are specified incorrectly, the error
message will usually be of the form
</p>
<divclass="example">
<preclass="example">Error in scan(file = file, what = what, sep = sep, :
line 1 did not have 5 elements
</pre></div>
<p>or
</p>
<divclass="example">
<preclass="example">Error in read.table("files.dat", header = TRUE) :
more columns than column names
</pre></div>
<aname="index-count_002efields"></a>
<p>This may give enough information to find the problem, but the auxiliary
function <code>count.fields</code> can be useful to investigate further.
</p>
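<p>For instance, a ragged line can be located as follows; the file contents are invented for illustration.
</p>

```r
f <- tempfile(fileext = ".csv")
# Hypothetical file in which the third line is one field short.
writeLines(c("a,b,c", "1,2,3", "4,5", "6,7,8"), f)

n <- count.fields(f, sep = ",")  # number of fields on each line
bad <- which(n != n[1])          # lines that disagree with the header line
```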
<p>Efficiency can be important when reading large data grids. It will help
to specify <code>comment.char = ""</code>, <code>colClasses</code> as one of the
atomic vector types (logical, integer, numeric, complex, character or
perhaps raw) for each column, and to give <code>nrows</code>, the number of
rows to be read (and a mild over-estimate is better than not specifying
this at all). See the examples in later sections.
</p>
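<p>A sketch of such a call is given below. The file is generated on the spot, and its column names, types and size are assumptions made for the example.
</p>

```r
set.seed(1)
f <- tempfile(fileext = ".csv")
# Hypothetical 1000-row file with an integer, a numeric and a text column.
write.csv(data.frame(id = 1:1000,
                     value = rnorm(1000),
                     label = sample(letters, 1000, replace = TRUE)),
          f, row.names = FALSE)

big <- read.table(f, header = TRUE, sep = ",", quote = "\"",
                  comment.char = "",            # no comment scanning
                  colClasses = c("integer", "numeric", "character"),
                  nrows = 1100)                 # mild over-estimate
```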
<hr>
<aname="Fixed_002dwidth_002dformat-files"></a>
<divclass="header">
<p>
Next: <ahref="#Data-Interchange-Format-_0028DIF_0029"accesskey="n"rel="next">Data Interchange Format (DIF)</a>, Previous: <ahref="#Variations-on-read_002etable"accesskey="p"rel="prev">Variations on read.table</a>, Up: <ahref="#Spreadsheet_002dlike-data"accesskey="u"rel="up">Spreadsheet-like data</a> [<ahref="#SEC_Contents"title="Table of contents"rel="contents">Contents</a>][<ahref="#Function-and-variable-index"title="Index"rel="index">Index</a>]</p>
<p>The <code>reshape</code> function has a more complicated syntax than
<code>stack</code> but can be used for data where the ‘long’ form has more
than the one column in this example. With <code>direction="wide"</code>,
<code>reshape</code> can also perform the opposite transformation.
</p>
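<p>As a small sketch of both directions, using an invented data frame with two repeated measurements per subject:
</p>

```r
# Hypothetical 'wide' data: one row per subject, one column per occasion.
wide <- data.frame(id = 1:3, x.1 = c(5, 6, 7), x.2 = c(8, 9, 10))

# Wide to long: the names x.1 and x.2 are split at "." into the
# variable name "x" and the times 1 and 2.
long <- reshape(wide, direction = "long",
                varying = c("x.1", "x.2"), idvar = "id")

# Long back to wide, naming the identifying columns explicitly.
wide2 <- reshape(long, direction = "wide",
                 idvar = "id", timevar = "time", v.names = "x", sep = ".")
```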
<p>Some people prefer the tools in packages <ahref="https://CRAN.R-project.org/package=reshape"><strong>reshape</strong></a>,
<ahref="https://CRAN.R-project.org/package=reshape2"><strong>reshape2</strong></a> and <ahref="https://CRAN.R-project.org/package=plyr"><strong>plyr</strong></a>.
</p>
<hr>
<aname="Flat-contingency-tables"></a>
<divclass="header">
<p>
Previous: <ahref="#Re_002dshaping-data"accesskey="p"rel="prev">Re-shaping data</a>, Up: <ahref="#Spreadsheet_002dlike-data"accesskey="u"rel="up">Spreadsheet-like data</a> [<ahref="#SEC_Contents"title="Table of contents"rel="contents">Contents</a>][<ahref="#Function-and-variable-index"title="Index"rel="index">Index</a>]</p>
<tr><tdalign="left"valign="top">•<ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat"accesskey="1">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a>:</td><td> </td><tdalign="left"valign="top">
Next: <ahref="#Octave"accesskey="n"rel="next">Octave</a>, Previous: <ahref="#Importing-from-other-statistical-systems"accesskey="p"rel="prev">Importing from other statistical systems</a>, Up: <ahref="#Importing-from-other-statistical-systems"accesskey="u"rel="up">Importing from other statistical systems</a> [<ahref="#SEC_Contents"title="Table of contents"rel="contents">Contents</a>][<ahref="#Function-and-variable-index"title="Index"rel="index">Index</a>]</p>
<p>The recommended package <ahref="https://CRAN.R-project.org/package=foreign"><strong>foreign</strong></a> provides import facilities for
files produced by these statistical systems, and for export to Stata. In
some cases these functions may require substantially less memory than
<code>read.table</code> would. <code>write.foreign</code> (see <ahref="#Export-to-text-files">Export to text files</a>) provides an export mechanism with support currently for
<code>SAS</code>, <code>SPSS</code> and <code>Stata</code>.
</p>
<aname="index-EpiInfo"></a>
<aname="index-EpiData"></a>
<aname="index-read_002eepiinfo"></a>
<p>EpiInfo versions 5 and 6 stored data in a self-describing fixed-width
text format. <code>read.epiinfo</code> will read these <samp>.REC</samp> files into
an R data frame. EpiData also produces data in this format.
</p>
<aname="index-Minitab"></a>
<aname="index-read_002emtp"></a>
<p>Function <code>read.mtp</code> imports a ‘Minitab Portable Worksheet’. This
returns the components of the worksheet as an R list.
</p>
<aname="index-SAS"></a>
<aname="index-read_002export"></a>
<p>Function <code>read.xport</code> reads a file in SAS Transport (XPORT) format
and returns a list of data frames. If SAS is available on your system,
function <code>read.ssd</code> can be used to create and run a SAS script that
saves a SAS permanent dataset (<samp>.ssd</samp> or <samp>.sas7bdat</samp>) in
Transport format. It then calls <code>read.xport</code> to read the resulting
file. (Package <ahref="https://CRAN.R-project.org/package=Hmisc"><strong>Hmisc</strong></a> has a similar function <code>sas.get</code>, also
running SAS.) For those without access to SAS but running on Windows,
the SAS System Viewer (a zero-cost download) can be used to open SAS
datasets and export them to e.g. <samp>.csv</samp> format.
</p>
<aname="index-S_002dPLUS"></a>
<aname="index-read_002eS"></a>
<aname="index-data_002erestore"></a>
<p>Function <code>read.S</code> can read binary objects produced by S-PLUS
3.x, 4.x or 2000 on (32-bit) Unix or Windows (and can read them on a
different OS). This is able to read many but not all S objects: in
particular it can read vectors, matrices and data frames and lists
containing those.
</p>
<p>Function <code>data.restore</code> reads S-PLUS data dumps (created by
<code>data.dump</code>) with the same restrictions (except that dumps from the
Alpha platform can also be read). It should be possible to read data
dumps from S-PLUS 5.x and later written with <code>data.dump(oldStyle=T)</code>.
</p>
<p>If you have access to S-PLUS, it is usually more reliable to <code>dump</code>
the object(s) in S-PLUS and <code>source</code> the dump file in R. For
S-PLUS 5.x and later you may need to use <code>dump(..., oldStyle=T)</code>,
and to read in very large objects it may be preferable to use the dump
file as a batch script rather than use the <code>source</code> function.
</p>
<aname="index-SPSS"></a>
<aname="index-SPSS-Data-Entry"></a>
<aname="index-read_002espss"></a>
<p>Function <code>read.spss</code> can read files created by the ‘save’ and
‘export’ commands in <acronym>SPSS</acronym>. It returns a list with one
component for each variable in the saved data set. <acronym>SPSS</acronym>
variables with value labels are optionally converted to R factors.
</p>
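<p>The two return forms can be compared with the sketch below. It assumes that the example file <samp>electric.sav</samp> is installed with package <strong>foreign</strong> (it is used in that package's help examples) and does nothing if the file is not found.
</p>

```r
library(foreign)

# Example file assumed to ship with package 'foreign'.
sav <- system.file("files", "electric.sav", package = "foreign")

if (nzchar(sav)) {
  as_list <- read.spss(sav)                      # list, one component per variable
  as_df   <- read.spss(sav, to.data.frame = TRUE)  # the same data as a data frame
}
```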
<p><acronym>SPSS</acronym> Data Entry is an application for creating data entry
forms. By default it creates data files with extra formatting
information that <code>read.spss</code> cannot handle, but it is possible to
export the data in an ordinary <acronym>SPSS</acronym> format.
</p>
<p>Some third-party applications claim to produce data ‘in SPSS format’ but
with differences in the formats: <code>read.spss</code> may or may not be able
to handle these.
</p>
<aname="index-Stata"></a>
<aname="index-read_002edta"></a>
<aname="index-write_002edta"></a>
<p>Stata <samp>.dta</samp> files are a binary file format. Files from versions 5
up to 11 of Stata can be read and written by functions <code>read.dta</code>
and <code>write.dta</code>. Stata variables with value labels are optionally
converted to (and from) R factors. Stata version 12 by default
writes ‘format-115 datasets’: <code>read.dta</code> currently may not be able
to read those.
</p>
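<p>A round trip through a <samp>.dta</samp> file might look like the sketch below; the data frame is invented, and the defaults of <code>write.dta</code> are used for the file version.
</p>

```r
library(foreign)

# Hypothetical data with a factor, which is written as a labelled variable.
df <- data.frame(id = 1:3,
                 sex = factor(c("f", "m", "f")),
                 height = c(1.62, 1.80, 1.71))

f <- tempfile(fileext = ".dta")
write.dta(df, f)      # factor levels become Stata value labels
back <- read.dta(f)   # value labels are converted back to a factor
```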
<aname="index-Systat"></a>
<aname="index-read_002esystat"></a>
<p><code>read.systat</code> reads those Systat <code>SAVE</code> files that are
rectangular data files (<code>mtype = 1</code>) written on little-endian
machines (such as from Windows). These have extension <samp>.sys</samp>
or (more recently) <samp>.syd</samp>.
</p>
<hr>
<aname="Octave"></a>
<aname="Octave-1"></a>
<h3class="section">3.2 Octave</h3>
<aname="index-Octave"></a>
<aname="index-read_002eoctave"></a>
<p>Octave is a numerical linear algebra system
(<ahref="http://www.octave.org">http://www.octave.org</a>), and function <code>read.octave</code> in
package <ahref="https://CRAN.R-project.org/package=foreign"><strong>foreign</strong></a> can read in files in Octave text data format
created using the Octave command <code>save -ascii</code>, with support for
most of the common types of variables, including the standard atomic
(real and complex scalars, matrices, and <em>N</em>-d arrays, strings,
ranges, and boolean scalars and matrices) and recursive (structs, cells,
and lists) ones.
</p>
<hr>
<aname="Relational-databases"></a>
<divclass="header">
<p>
Next: <ahref="#Binary-files"accesskey="n"rel="next">Binary files</a>, Previous: <ahref="#Importing-from-other-statistical-systems"accesskey="p"rel="prev">Importing from other statistical systems</a>, Up: <ahref="#Top"accesskey="u"rel="up">Top</a> [<ahref="#SEC_Contents"title="Table of contents"rel="contents">Contents</a>][<ahref="#Function-and-variable-index"title="Index"rel="index">Index</a>]</p>
<tr><tdalign="left"valign="top">•<ahref="#Why-use-a-database_003f"accesskey="1">Why use a database?</a>:</td><td> </td><tdalign="left"valign="top">
</td></tr>
<tr><tdalign="left"valign="top">•<ahref="#Overview-of-RDBMSs"accesskey="2">Overview of RDBMSs</a>:</td><td> </td><tdalign="left"valign="top">
is used by KDE4 to store personal information. Several OS X
applications, including Mail and Address Book, use SQLite.
</p>
<hr>
<aname="Overview-of-RDBMSs"></a>
<aname="Overview-of-RDBMSs-1"></a>
<h3class="section">4.2 Overview of RDBMSs</h3>
<p>Traditionally there had been large (and expensive) commercial RDBMSs
Next: <ahref="#Data-types"accesskey="n"rel="next">Data types</a>, Previous: <ahref="#Overview-of-RDBMSs"accesskey="p"rel="prev">Overview of RDBMSs</a>, Up: <ahref="#Overview-of-RDBMSs"accesskey="u"rel="up">Overview of RDBMSs</a> [<ahref="#SEC_Contents"title="Table of contents"rel="contents">Contents</a>][<ahref="#Function-and-variable-index"title="Index"rel="index">Index</a>]</p>
<p>The more comprehensive R interfaces generate <acronym>SQL</acronym> behind the
scenes for common operations, but direct use of <acronym>SQL</acronym> is needed
for complex operations in all of them. Conventionally <acronym>SQL</acronym> is written
in upper case, but many users will find it more convenient to use lower
case in the R interface functions.
</p>
<p>A relational DBMS stores data as a database of <em>tables</em> (or
<em>relations</em>) which are rather similar to R data frames, in that
they are made up of <em>columns</em> or <em>fields</em> of one type
(numeric, character, date, currency, …) and <em>rows</em> or
<em>records</em> containing the observations for one entity.
</p>
<p><acronym>SQL</acronym> ‘queries’ are quite general operations on a relational
database. The classical query is a SELECT statement of the type
</p>
<divclass="example">
<preclass="example">SELECT State, Murder FROM USArrests WHERE Rape > 30 ORDER BY Murder

SELECT t.sch, c.meanses, t.sex, t.achieve
  FROM student as t, school as c WHERE t.sch = c.id

SELECT sex, COUNT(*) FROM student GROUP BY sex

SELECT sch, AVG(sestat) FROM student GROUP BY sch LIMIT 10
</pre></div>
<p>The first of these selects two columns from the R data frame
<code>USArrests</code> that has been copied across to a database table,
subsets on a third column and asks that the results be sorted. The second
performs a database <em>join</em> on two tables <code>student</code> and
<code>school</code> and returns four columns. The third and fourth queries do
some cross-tabulation and return counts or averages. (The five
aggregation functions are COUNT(*) and SUM, MAX, MIN and AVG, each
applied to a single column.)
</p>
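<p>For readers more familiar with R than with <acronym>SQL</acronym>, the first query can be mimicked on the data frame itself. This is only a sketch of what the query computes, not of how a DBMS evaluates it; the name <code>arrests</code> is introduced here for illustration.
</p>

```r
# Make the state names an ordinary column, as they would be in a table.
arrests <- data.frame(State = rownames(USArrests), USArrests,
                      row.names = NULL)

# SELECT State, Murder ... WHERE Rape > 30
q1 <- arrests[arrests$Rape > 30, c("State", "Murder")]

# ... ORDER BY Murder
q1 <- q1[order(q1$Murder), ]
```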
<p>SELECT queries use FROM to select the table, WHERE to specify a
condition for inclusion (or more than one condition separated by AND or
OR), and ORDER BY to sort the result. Unlike data frames, rows in RDBMS
tables are best thought of as unordered, and without an ORDER BY
statement the ordering is indeterminate. You can sort (in
lexicographical order) on more than one column by separating them by
commas. Placing DESC after an ORDER BY puts the sort in descending
order.
</p>
<p>SELECT DISTINCT queries will only return one copy of each distinct row
in the selected table.
</p>
<p>The GROUP BY clause selects subgroups of the rows according to the
criterion. If more than one column is specified (separated by commas)
then multi-way cross-classifications can be summarized by one of the
five aggregation functions. A HAVING clause allows the select to
include or exclude groups depending on the aggregated value.
</p>
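<p>The effect of GROUP BY and HAVING can be sketched in R with <code>aggregate</code>. The small <code>student</code> data frame below is invented; only its column names follow the queries shown earlier.
</p>

```r
# Hypothetical stand-in for the 'student' table.
student <- data.frame(sch    = c(1, 1, 2, 2, 2, 3),
                      sex    = c("F", "M", "F", "F", "M", "M"),
                      sestat = c(0.5, -0.2, 1.1, 0.3, 0.4, -1.0))

# SELECT sex, COUNT(*) FROM student GROUP BY sex
counts <- aggregate(sestat ~ sex, data = student, FUN = length)

# SELECT sch, AVG(sestat) FROM student GROUP BY sch
means <- aggregate(sestat ~ sch, data = student, FUN = mean)

# ... HAVING AVG(sestat) > 0: keep only groups whose aggregate qualifies
having <- means[means$sestat > 0, ]
```

<p>Note that the HAVING condition is applied to the aggregated values, after grouping, whereas a WHERE condition is applied to individual rows before grouping.
</p>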
<p>If the SELECT statement contains an ORDER BY statement that produces a
unique ordering, a LIMIT clause can be added to select (by number) a
contiguous block of output rows. This can be useful to retrieve rows a
block at a time. (It may not be reliable unless the ordering is unique,
as the LIMIT clause can be used to optimize the query.)
</p>
<p>There are queries to create a table (CREATE TABLE, although in these
interfaces one usually copies a data frame to the database instead) and to
INSERT, DELETE or UPDATE data. A table is destroyed by a DROP TABLE ‘query’.
</p>
<p>Kline and Kline (2001) discuss the details of the implementation of SQL
in Microsoft SQL Server 2000, Oracle, MySQL and PostgreSQL.
</p>
<hr>
<aname="Data-types"></a>
<aname="Data-types-1"></a>
<h4class="subsection">4.2.2 Data types</h4>
<p>Data can be stored in a database in various data types. The range of
data types is DBMS-specific, but the <acronym>SQL</acronym> standard defines many
types, including the following that are widely implemented (often not by
the <acronym>SQL</acronym> name).
</p>
<dlcompact="compact">
<dt><code>float(<var>p</var>)</code></dt>
<dd><p>Real number, with optional precision. Often called <code>real</code> or
<code>double</code> or <code>double precision</code>.
</p></dd>
<dt><code>integer</code></dt>
<dd><p>32-bit integer. Often called <code>int</code>.
</p></dd>
<dt><code>smallint</code></dt>
<dd><p>16-bit integer
</p></dd>
<dt><code>character(<var>n</var>)</code></dt>
<dd><p>fixed-length character string. Often called <code>char</code>.
</p></dd>
<dt><code>character varying(<var>n</var>)</code></dt>
<dd><p>variable-length character string. Often called <code>varchar</code>. Almost
always has a limit of 255 chars.
</p></dd>
<dt><code>boolean</code></dt>
<dd><p>true or false. Sometimes called <code>bool</code> or <code>bit</code>.
</p></dd>
<dt><code>date</code></dt>
<dd><p>calendar date
</p></dd>
<dt><code>time</code></dt>
<dd><p>time of day
</p></dd>
<dt><code>timestamp</code></dt>
<dd><p>date and time
</p></dd>
</dl>
<p>There are variants on <code>time</code> and <code>timestamp</code>, <code>with
timezone</code>. Other types widely implemented are <code>text</code> and
<code>blob</code>, for large blocks of text and binary data, respectively.
</p>
<p>The more comprehensive of the R interface packages hide the type
conversion issues from the user.
</p>
<hr>
<aname="R-interface-packages"></a>
<aname="R-interface-packages-1"></a>
<h3class="section">4.3 R interface packages</h3>
<p>There are several packages available on <acronym>CRAN</acronym> to help R
communicate with DBMSs. They provide different levels of abstraction.
Some provide means to copy whole data frames to and from databases. All
have functions to select data within the database via <acronym>SQL</acronym>
queries, and to retrieve the result as a whole as a
data frame or in pieces (usually as groups of rows).
</p>
<p>All except <ahref="https://CRAN.R-project.org/package=RODBC"><strong>RODBC</strong></a> are tied to one DBMS, but there has been a
proposal for a unified ‘front-end’ package <ahref="https://CRAN.R-project.org/package=DBI"><strong>DBI</strong></a>
(<ahref="https://developer.r-project.org/db">https://developer.r-project.org/db</a>) in conjunction with a
‘back-end’, the most developed of which is <ahref="https://CRAN.R-project.org/package=RMySQL"><strong>RMySQL</strong></a>. Also on
<acronym>CRAN</acronym> are the back-ends <ahref="https://CRAN.R-project.org/package=ROracle"><strong>ROracle</strong></a>, <ahref="https://CRAN.R-project.org/package=RPostgreSQL"><strong>RPostgreSQL</strong></a> and
<ahref="https://CRAN.R-project.org/package=RSQLite"><strong>RSQLite</strong></a> (which works with the bundled DBMS <code>SQLite</code>,
<ahref="https://www.sqlite.org">https://www.sqlite.org</a>), <ahref="https://CRAN.R-project.org/package=RJDBC"><strong>RJDBC</strong></a> (which uses Java and can
connect to any DBMS that has a JDBC driver) and <ahref="https://CRAN.R-project.org/package=RpgSQL"><strong>RpgSQL</strong></a> (a
specialist interface to PostgreSQL built on top of <ahref="https://CRAN.R-project.org/package=RJDBC"><strong>RJDBC</strong></a>).
</p>
<p>The BioConductor project has updated <strong>RdbiPgSQL</strong> (formerly on
<acronym>CRAN</acronym> ca 2000), a first-generation interface to PostgreSQL.
</p>
<p><strong>PL/R</strong> (<ahref="http://www.joeconway.com/plr/"><code>http://www.joeconway.com/plr/</code></a>) is a project to embed R into
PostgreSQL.
</p>
<p>Package <ahref="https://CRAN.R-project.org/package=RMongo"><strong>RMongo</strong></a> provides an R interface to a Java client for
‘MongoDB’ (<ahref="https://en.wikipedia.org/wiki/MongoDB">https://en.wikipedia.org/wiki/MongoDB</a>) databases, which
are queried using JavaScript rather than SQL. Package <ahref="https://CRAN.R-project.org/package=rmongodb"><strong>rmongodb</strong></a> is
another client using <strong>mongodb</strong>’s C driver.
</p>
<h4class="subsection">4.3.1 Packages using DBI</h4>
<aname="index-MySQL-database-system"></a>
<p>Package <ahref="https://CRAN.R-project.org/package=RMySQL"><strong>RMySQL</strong></a> on <acronym>CRAN</acronym> provides an interface to the
MySQL database system (see <ahref="https://www.mysql.com">https://www.mysql.com</a> and Dubois,
2000) or its fork MariaDB (see <ahref="https://mariadb.org/">https://mariadb.org/</a>). The
description here applies to versions <code>0.5-0</code> and later: earlier
versions had a substantially different interface. The current version
requires the <ahref="https://CRAN.R-project.org/package=DBI"><strong>DBI</strong></a> package, and this description will apply with
minor changes to all the other back-ends to <ahref="https://CRAN.R-project.org/package=DBI"><strong>DBI</strong></a>.
</p>
<p>MySQL exists on Unix/Linux/OS X and Windows: there is a ‘Community
Edition’ released under GPL but commercial licenses are also available.
MySQL was originally a ‘light and lean’ database. (It preserves the
case of names where the operating system’s file system is case-sensitive, so not
on Windows.)
</p>
<aname="index-dbDriver"></a>
<aname="index-dbConnect"></a>
<aname="index-dbDisconnect"></a>
<p>The call <code>dbDriver("MySQL")</code> returns a database connection manager
object, and then a call to <code>dbConnect</code> opens a database connection
which can subsequently be closed by a call to the generic function
<code>dbDisconnect</code>. Use <code>dbDriver("Oracle")</code>,
<code>dbDriver("PostgreSQL")</code> or <code>dbDriver("SQLite")</code> with those
DBMSs and packages <ahref="https://CRAN.R-project.org/package=ROracle"><strong>ROracle</strong></a>, <ahref="https://CRAN.R-project.org/package=RPostgreSQL"><strong>RPostgreSQL</strong></a> or <ahref="https://CRAN.R-project.org/package=RSQLite"><strong>RSQLite</strong></a>
respectively.
</p>
<aname="index-dbSendQuery"></a>
<aname="index-dbClearResult"></a>
<aname="index-dbGetQuery"></a>
<p><acronym>SQL</acronym> queries can be sent by either <code>dbSendQuery</code> or
<code>dbGetQuery</code>. <code>dbGetQuery</code> sends the query and retrieves the
results as a data frame. <code>dbSendQuery</code> sends the query and returns
an object of class inheriting from <code>"DBIResult"</code> which can be used
to retrieve the results, and subsequently used in a call to
<code>dbClearResult</code> to remove the result.
</p>
<aname="index-fetch"></a>
<p>Function <code>fetch</code> is used to retrieve some or all of the rows in the
query result, as a data frame. The function <code>dbHasCompleted</code> indicates if
all the rows have been fetched, and <code>dbGetRowCount</code> returns the
number of rows in the result.
</p>
<aname="index-dbReadTable"></a>
<aname="index-dbWriteTable"></a>
<aname="index-dbExistsTable"></a>
<aname="index-dbRemoveTable"></a>
<p>These are convenient interfaces to read/write/test/delete tables in the
database. <code>dbReadTable</code> and <code>dbWriteTable</code> copy to and from
an R data frame, mapping the row names of the data frame to the field
<code>row_names</code> in the <code>MySQL</code> table.
</p>
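<p>The functions above can be tried without a database server. The sketch below swaps in an in-memory SQLite database via <strong>RSQLite</strong> instead of MySQL (an assumption made so that the example is self-contained) and is skipped if <strong>DBI</strong> or <strong>RSQLite</strong> is not installed. It uses <code>dbFetch</code>, the name under which current versions of <strong>DBI</strong> provide <code>fetch</code>; the table name <code>arrests</code> is illustrative.
</p>

```r
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {

  # Open a connection to a temporary in-memory SQLite database.
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

  # Copy a data frame to a database table.
  DBI::dbWriteTable(con, "arrests",
                    data.frame(State = rownames(USArrests), USArrests,
                               row.names = NULL))

  # dbGetQuery: send the query and retrieve the whole result at once.
  high_rape <- DBI::dbGetQuery(con,
    "SELECT State, Murder FROM arrests WHERE Rape > 30 ORDER BY Murder")

  # dbSendQuery: retrieve the result in pieces of at most 20 rows.
  res <- DBI::dbSendQuery(con,
    "SELECT State, Assault FROM arrests ORDER BY State")
  total_rows <- 0
  while (!DBI::dbHasCompleted(res)) {
    chunk <- DBI::dbFetch(res, n = 20)
    total_rows <- total_rows + nrow(chunk)
  }
  DBI::dbClearResult(res)   # release the result set

  DBI::dbDisconnect(con)    # close the connection
}
```

<p>With another back-end only the call to <code>dbConnect</code> should need to change, although the <acronym>SQL</acronym> dialect accepted may differ between DBMSs.
</p>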
<divclass="smallexample">
<preclass="smallexample">> library(RMySQL) # will load DBI as well
## open a connection to a MySQL database
> con <- dbConnect(dbDriver("MySQL"), dbname = "test")
## list the tables in the database
> dbListTables(con)
## load a data frame into the database, deleting any existing copy
<tr><tdalign="left"valign="top">•<ahref="#Binary-data-formats"accesskey="1">Binary data formats</a>:</td><td> </td><tdalign="left"valign="top">
<ahref="https://CRAN.R-project.org/package=RNetCDF"><strong>RNetCDF</strong></a>, <ahref="https://CRAN.R-project.org/package=ncdf"><strong>ncdf</strong></a> and <ahref="https://CRAN.R-project.org/package=ncdf4"><strong>ncdf4</strong></a> on <acronym>CRAN</acronym> provide
interfaces to <acronym>NASA</acronym>’s HDF5 (Hierarchical Data Format, see
<ahref="https://www.hdfgroup.org/HDF5/">https://www.hdfgroup.org/HDF5/</a>) and to UCAR’s netCDF data files
<p>Both of these are systems to store scientific data in array-oriented
ways, including descriptions, labels, formats, units, …. HDF5 also
allows <em>groups</em> of arrays, and the R interface maps lists
to HDF5 groups, and can write numeric and character vectors and
matrices.
</p>
<p>NetCDF’s version 4 format (confusingly, implemented in netCDF 4.1.1 and
later, but not in 4.0.1) includes the use of various HDF5 formats. This
is handled by package <ahref="https://CRAN.R-project.org/package=ncdf4"><strong>ncdf4</strong></a> whereas <ahref="https://CRAN.R-project.org/package=RNetCDF"><strong>RNetCDF</strong></a> and
<ahref="https://CRAN.R-project.org/package=ncdf"><strong>ncdf</strong></a> handle version 3 files.
</p>
<p>The availability of software to support these formats is somewhat
limited by platform, especially on Windows.
</p>
<hr>
<aname="dBase-files-_0028DBF_0029"></a>
<aname="dBase-files-_0028DBF_0029-1"></a>
<h3class="section">5.2 dBase files (DBF)</h3>
<aname="index-dBase"></a>
<aname="index-DBF-files"></a>
<p><code>dBase</code> was a DOS program written by Ashton-Tate and later owned by
Borland. Its binary flat-file format, with file extension <samp>.dbf</samp>,
became popular and has been adopted for the ‘Xbase’ family
of databases, covering dBase, Clipper, FoxPro and their Windows
equivalents Visual dBase, Visual Objects and Visual FoxPro (see
<ahref="http://www.e-bachmann.dk/docs/xbase.htm">http://www.e-bachmann.dk/docs/xbase.htm</a>). A dBase file contains
a header and then a series of fields and so is most similar to an R
data frame. The data itself is stored in text format, and can include
character, logical and numeric fields, and other types in later versions.
</p>
<p>Functions <code>read.dbf</code> and <code>write.dbf</code> provide ways to read and
write basic DBF files on all R platforms. For Windows users
<code>odbcConnectDbase</code> in package <ahref="https://CRAN.R-project.org/package=RODBC"><strong>RODBC</strong></a> provides more
comprehensive facilities to read DBF files <em>via</em> Microsoft’s dBase
ODBC driver (and the Visual FoxPro driver can also be used via
<code>odbcDriverConnect</code>).
<aname="index-odbcConnectDbase"></a>
</p>
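<p>As a minimal sketch of the round trip described above, assuming the
recommended package <strong>foreign</strong> (which provides <code>read.dbf</code> and
<code>write.dbf</code>) is installed. The data frame and the temporary file name
are made up for illustration.
</p>

```r
library(foreign)  # assumed to be installed (it is a recommended package)

## a small illustrative data frame with numeric, character and logical columns
df <- data.frame(id = 1:3,
                 name = c("a", "b", "c"),
                 flag = c(TRUE, FALSE, TRUE),
                 stringsAsFactors = FALSE)

dbf_file <- tempfile(fileext = ".dbf")
write.dbf(df, dbf_file)                    # write a basic DBF file
back <- read.dbf(dbf_file, as.is = TRUE)   # as.is = TRUE: do not convert character fields to factors
str(back)
unlink(dbf_file)
```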
<hr>
<aname="Image-files"></a>
<aname="Image-files-1"></a>
<h2class="chapter">6 Image files</h2>
<p>A particular class of binary files are those representing images, and a
not uncommon request is to read such a file into R as a matrix.
</p>
<p>There are many formats for image files (most with lots of variants), and
it may be necessary to use external conversion software to first convert
the image into one of the formats for which a package currently provides
an R reader. A versatile example of such software is ImageMagick and
its fork GraphicsMagick. These provide command-line programs
<code>convert</code> and <code>gm convert</code> to convert images from one
format to another: what formats they can input is determined when they
are compiled, and the supported formats can be listed by e.g.
<code>convert -list format</code>.
</p>
<p>Package <ahref="https://CRAN.R-project.org/package=pixmap"><strong>pixmap</strong></a> has a function <code>read.pnm</code> to read ‘portable
anymap’ images in PBM (black/white), PGM (grey) and PPM (RGB colour)
formats. These are also known as ‘netpbm’ formats.
</p>
<p>Packages <ahref="https://CRAN.R-project.org/package=bmp"><strong>bmp</strong></a>, <ahref="https://CRAN.R-project.org/package=jpeg"><strong>jpeg</strong></a> and <ahref="https://CRAN.R-project.org/package=png"><strong>png</strong></a> read the
formats after which they are named. See also packages <ahref="https://CRAN.R-project.org/package=biOps"><strong>biOps</strong></a>
and <ahref="https://CRAN.R-project.org/package=Momocs"><strong>Momocs</strong></a>, and Bioconductor package <strong>EBImage</strong>.
</p>
<p>TIFF is more a meta-format, a wrapper within which a very large variety
of image formats can be embedded. Packages <ahref="https://CRAN.R-project.org/package=rtiff"><strong>rtiff</strong></a> (orphaned)
and <ahref="https://CRAN.R-project.org/package=tiff"><strong>tiff</strong></a> can read some of the sub-formats (depending on the
external <code>libtiff</code> software against which they are compiled).
There are some facilities for specialized sub-formats, for example in
Bioconductor package <strong>beadarray</strong>.
</p>
<p>Raster files are common in the geographical sciences, and package
<ahref="https://CRAN.R-project.org/package=rgdal"><strong>rgdal</strong></a> provides an interface to GDAL which provides some
facilities of its own to read raster files and links to many others.
Which formats it supports is determined when GDAL is compiled: use
<code>gdalDrivers()</code> to see what these are for the build you are using.
It can be useful for uncommon formats such as JPEG 2000 (which is a
different format from JPEG, and not currently supported in either the OS X
or the Windows binary versions of <ahref="https://CRAN.R-project.org/package=rgdal"><strong>rgdal</strong></a>).
</p>
<aname="Types-of-connections-1"></a>
<h3class="section">7.1 Types of connections</h3>
<aname="index-Connections-1"></a>
<aname="index-file"></a>
<aname="index-File-connections"></a>
<p>The most familiar type of connection will be a file, and file
connections are created by function <code>file</code>. File connections can
(if the OS will allow it for the particular file) be opened for reading
or writing or appending, in text or binary mode. In fact, files can be
opened for both reading and writing, and R keeps a separate file
position for reading and writing.
</p>
<aname="index-open"></a>
<aname="index-close-1"></a>
<p>Note that by default a connection is not opened when it is created. The
rule is that a function using a connection should open the connection
if it is not already open, and close the connection after use if (and
only if) the function opened it. In brief, leave the connection in the state
you found it in. There are generic functions <code>open</code> and
<code>close</code> with methods to explicitly open and close connections.
</p>
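<p>The following sketch illustrates the rule, using a temporary file whose
name is chosen only for this example. It shows that a connection is closed
when created, can be opened explicitly, and is left closed by
<code>readLines</code> when it was passed in closed.
</p>

```r
path <- tempfile()               # illustrative scratch file

con <- file(path)                # created, but not yet opened
open_at_creation <- isOpen(con)  # FALSE
open(con, "w")                   # explicitly open for writing in text mode
writeLines(c("first line", "second line"), con)
close(con)                       # close (and destroy) the connection

con <- file(path)                # a new, unopened connection to the same file
lines_read <- readLines(con)     # readLines opens the connection itself ...
open_after_read <- isOpen(con)   # ... and leaves it closed again
close(con)
unlink(path)
```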
<aname="index-gzfile"></a>
<aname="index-bzfile"></a>
<aname="index-Compressed-files"></a>
<p>Files compressed via the algorithm used by <code>gzip</code> can be used as
connections created by the function <code>gzfile</code>, whereas files
compressed by <code>bzip2</code> can be used via <code>bzfile</code>.
</p>
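<p>A short sketch of both kinds of compressed-file connection, writing a few
lines and reading them back. The file names come from <code>tempfile</code> and
are purely illustrative.
</p>

```r
gz_path <- tempfile(fileext = ".gz")
con <- gzfile(gz_path, "w")      # write through gzip compression
writeLines(c("alpha", "beta", "gamma"), con)
close(con)

con <- gzfile(gz_path, "r")      # read back, decompressing on the fly
gz_lines <- readLines(con)
close(con)

bz_path <- tempfile(fileext = ".bz2")
con <- bzfile(bz_path, "w")      # the same pattern with bzip2 compression
writeLines("delta", con)
close(con)

con <- bzfile(bz_path, "r")
bz_lines <- readLines(con)
close(con)
unlink(c(gz_path, bz_path))
```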
<aname="index-Terminal-connections"></a>
<aname="index-stdin"></a>
<aname="index-stdout"></a>
<aname="index-stderr"></a>
<p>Unix programmers are used to dealing with special files <code>stdin</code>,
<code>stdout</code> and <code>stderr</code>. These exist as <em>terminal
connections</em> in R. They may be normal files, but they might also
refer to input from and output to a GUI console. (Even with the standard
Unix R interface, <code>stdin</code> refers to the lines submitted from
<code>readline</code> rather than a file.)
</p>
<p>The three terminal connections are always open, and cannot be opened or
closed. <code>stdout</code> and <code>stderr</code> are conventionally used for
normal output and error messages respectively. They may normally go to
the same place, but whereas normal output can be re-directed by a call
to <code>sink</code>, error output is sent to <code>stderr</code> unless re-directed
by <code>sink(file, type="message")</code>. Note carefully the language used here:
the connections cannot be re-directed, but output can be sent to other
connections.
</p>
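<p>As an illustration of sending output elsewhere, the sketch below diverts
normal output to one temporary file and messages to another. The file names
are assumptions made for the example. Note that for
<code>type = "message"</code> the target must be an already open connection.
</p>

```r
out_path <- tempfile()
sink(out_path)                   # divert normal output to a file
print(1:3)
sink()                           # send normal output to the console again
captured <- readLines(out_path)

msg_path <- tempfile()
msg_con <- file(msg_path, "w")   # must be open before use with type = "message"
sink(msg_con, type = "message")  # divert messages (normally sent to stderr)
message("a diagnostic")
sink(type = "message")           # restore messages to stderr
close(msg_con)
msgs <- readLines(msg_path)
unlink(c(out_path, msg_path))
```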
<aname="index-Text-connections"></a>
<aname="index-textConnection"></a>
<p><em>Text connections</em> are another source of input. They allow R
character vectors to be read as if the lines were being read from a text
file. A text connection is created and opened by a call to
<code>textConnection</code>, which copies the current contents of the
character vector to an internal buffer at the time of creation.
</p>
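<p>For instance, a small table held in a character vector can be read with
<code>read.table</code>. This is only a sketch with made-up data. Because the
contents are copied when the connection is created, changing the vector
afterwards does not affect what is read.
</p>

```r
txt <- c("x y", "1 2", "3 4")    # illustrative lines of a text 'file'
tc <- textConnection(txt)        # created and opened for reading
txt[2] <- "changed afterwards"   # has no effect on the connection's buffer
dat <- read.table(tc, header = TRUE)
close(tc)
dat
```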
<p>Text connections can also be used to capture R output to a character
vector. <code>textConnection</code> can be asked to create a new character
object or append to an existing one, in both cases in the user’s
workspace. The connection is opened by the call to
<code>textConnection</code>, and at all times the complete lines output to the
connection are available in the R object. Closing the connection
writes any remaining output to a final element of the character vector.
</p>
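<p>A sketch of capturing output, where the object name <code>captured</code> is
chosen for this example only. It shows that a complete line is available
immediately, whereas an incomplete line appears only when the connection is
closed.
</p>

```r
tc <- textConnection("captured", "w")  # creates character vector 'captured'
writeLines("a complete line", tc)
cat("an incomplete line", file = tc)   # no trailing newline yet
before_close <- captured               # only the complete line so far
close(tc)                              # flushes the remaining output
after_close <- captured
```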
<aname="index-Pipe-connections"></a>
<aname="index-pipe"></a>
<p><em>Pipes</em> are a special form of file that connects to another
process, and pipe connections are created by the function <code>pipe</code>.
Opening a pipe connection for writing (it makes no sense to append to a
pipe) runs an OS command, and connects its standard input to whatever
R then writes to that connection. Conversely, opening a pipe
connection for input runs an OS command and makes its standard output
available for R input from that connection.
</p>
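<p>A minimal sketch of reading from a pipe, assuming the OS command
interpreter provides an <code>echo</code> command (as the usual Unix shells and
the Windows command interpreter do).
</p>

```r
con <- pipe("echo hello")   # the command is run when the pipe is opened
out <- readLines(con)       # readLines opens the pipe, reads, and closes it
close(con)                  # destroy the connection object
out
```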
<aname="index-URL-connections"></a>
<aname="index-url"></a>
<p><acronym>URL</acronym>s of types ‘<samp>http://</samp>’, ‘<samp>ftp://</samp>’ and ‘<samp>file://</samp>’
can be read from using the function <code>url</code>. For convenience,
<code>file</code> will also accept these as the file specification and call
<code>url</code>. On most platforms ‘<samp>https://</samp>’ URLs are also accepted.
</p>
<aname="index-Sockets"></a>
<aname="index-socketConnection"></a>
<p>Sockets can also be used as connections via function
<code>socketConnection</code> on platforms which support Berkeley-like sockets
(most Unix systems, Linux and Windows). Sockets can be written to or
read from, and both client and server sockets can be used.
</p>
<hr>
<aname="Output-to-connections"></a>
<aname="Output-to-connections-1"></a>
<h3class="section">7.2 Output to connections</h3>
<aname="index-Connections-2"></a>
<aname="index-cat-1"></a>
<aname="index-write-1"></a>
<aname="index-write_002etable-1"></a>
<aname="index-sink-1"></a>
<p>We have described functions <code>cat</code>, <code>write</code>, <code>write.table</code>
and <code>sink</code> as writing to a file, possibly appending to a file if
argument <code>append = TRUE</code>, and this is what they did prior to R
version 1.2.0.
</p>
<p>The current behaviour is equivalent, but what actually happens is that
when the <code>file</code> argument is a character string, a file connection
is opened (for writing or appending) and closed again at the end of the
function call. If we want to repeatedly write to the same file, it is
more efficient to explicitly declare and open the connection, and pass
the connection object to each call to an output function. This also
makes it possible to write to pipes, which was implemented earlier in a
limited way via the syntax <code>file = "|cmd"</code> (which can still be
used).
</p>
<aname="index-writeLines"></a>
<p>There is a function <code>writeLines</code> to write complete text lines
to a connection.
</p>
<p>Some simple examples are
</p>
<divclass="example">
<preclass="example">zz <- file("ex.data", "w") # open an output file connection
## now ‘ex.lm.out’ contains the output for further processing.
## Look at it by, e.g.,
cat(ex.lm.out, sep = "\n")
</pre></div>
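<p>As a further self-contained sketch of repeated writes to one explicitly
opened connection (the file name and contents are illustrative only):
</p>

```r
out_path <- tempfile()
zz <- file(out_path, "w")                 # open once for writing
cat("header line\n", file = zz)
for (i in 1:3) {
    ## each call appends at the current position of the open connection
    cat("row ", i, "\n", sep = "", file = zz)
}
writeLines("last line", zz)
close(zz)
written <- readLines(out_path)
unlink(out_path)
```

<p>Passing the connection object rather than a file name avoids re-opening
and closing the file on every call.
</p>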
<hr>
<aname="Input-from-connections"></a>
<aname="Input-from-connections-1"></a>
<h3class="section">7.3 Input from connections</h3>
<aname="index-scan-2"></a>
<aname="index-read_002etable-1"></a>
<aname="index-readLines-1"></a>
<p>The basic functions to read from connections are <code>scan</code> and
<code>readLines</code>. Given a character string argument, these open a
file connection for the duration of the function call, but explicitly
opening a file connection allows a file to be read sequentially in
different formats.
</p>
<p>Other functions that call <code>scan</code> can also make use of connections,
in particular <code>read.table</code>.
</p>
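<p>To illustrate reading sequentially in different formats, the sketch below
writes a small file (with made-up contents), then reads its first line as
text and the remainder as numbers from the same open connection.
</p>

```r
in_path <- tempfile()
writeLines(c("TITLE extra line", "2 3 5 7", "11 13 17"), in_path)

zz <- file(in_path, "r")          # open once; the read position is kept
title <- readLines(zz, n = 1)     # first line as text
nums <- scan(zz, quiet = TRUE)    # remaining lines as numbers
close(zz)
unlink(in_path)
```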
<p>Some simple examples are
</p>
<divclass="example">
<preclass="example">## read in file created in last examples
</pre></div>
<aname="Pushback-1"></a>
<h4class="subsection">7.3.1 Pushback</h4>
<aname="index-pushBack_002e"></a>
<aname="index-Pushback-on-a-connection"></a>
<p>C programmers may be familiar with the <code>ungetc</code> function to push
back a character onto a text input stream. R connections have the
same idea in a more powerful way, in that an (essentially) arbitrary
number of lines of text can be pushed back onto a connection via a call
to <code>pushBack</code>.
</p>
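<p>A small sketch of pushback on a text connection, with made-up lines. The
base function <code>pushBackLength</code> reports how many pushed-back lines are
pending, and the most recent pushback is read first.
</p>

```r
zz <- textConnection(c("line 1", "line 2"))
pushBack(c("pushed A", "pushed B"), zz)
pending <- pushBackLength(zz)     # 2 lines waiting to be re-read
first <- readLines(zz, n = 1)     # "pushed A"
pushBack("pushed C", zz)          # a later pushback goes on top of the stack
rest <- readLines(zz)             # "pushed C", then "pushed B", then the connection
close(zz)
```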
<p>Pushbacks operate as a stack, so a read request first uses each line
from the most recently pushed-back text, then those from earlier
pushbacks and finally reads from the connection itself. Once a
pushed-back line is read completely, it is cleared. The number of
pending lines pushed back can be found via a call to
<h3class="section">7.4 Listing and manipulating connections</h3>
<aname="index-Connections-3"></a>
<aname="index-showConnections"></a>
<p>A summary of all the connections currently opened by the user can be
found by <code>showConnections()</code>, and a summary of all connections,
including closed and terminal connections, by
<code>showConnections(all = TRUE)</code>.
</p>
<aname="index-seek"></a>
<aname="index-isSeekable"></a>
<p>The generic function <code>seek</code> can be used to read and (on some
connections) reset the current position for reading or writing.
Unfortunately it depends on OS facilities which may be unreliable
(e.g. with text files under Windows). Function <code>isSeekable</code>
reports if <code>seek</code> can change the position on the connection
given by its argument.
</p>
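<p>A sketch of querying and resetting the position on a binary file
connection, using a temporary file of ten bytes created for the example. As
noted above, behaviour can depend on the OS, so this should be taken as
illustrative.
</p>

```r
path <- tempfile()
writeBin(as.raw(1:10), path)          # ten bytes: 01 02 ... 0a

con <- file(path, "rb")
seekable <- isSeekable(con)           # file connections are normally seekable
first3 <- readBin(con, "raw", n = 3)  # advances the read position
pos <- seek(con)                      # current position as a byte offset
old <- seek(con, where = 0)           # move to the start; returns the old position
again <- readBin(con, "raw", n = 1)   # reads the first byte once more
close(con)
unlink(path)
```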
<aname="index-truncate"></a>
<p>The function <code>truncate</code> can be used to truncate a file opened for
writing at its current position. It works only for <code>file</code>
connections, and is not implemented on all platforms.
</p>
<hr>
<aname="Binary-connections"></a>
<aname="Binary-connections-1"></a>
<h3class="section">7.5 Binary connections</h3>
<aname="index-Binary-files-1"></a>
<aname="index-readBin"></a>
<aname="index-writeBin"></a>
<p>Functions <code>readBin</code> and <code>writeBin</code> read from and write to
binary connections. A connection is opened in binary mode by appending
<code>"b"</code> to the mode specification, that is using mode <code>"rb"</code> for
reading, and mode <code>"wb"</code> or <code>"ab"</code> (where appropriate) for
writing.
</p>
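<p>A minimal sketch of a binary round trip, assuming the default sizes of
4 bytes per integer and 8 bytes per double. The file name is illustrative.
</p>

```r
bin_path <- tempfile()

zz <- file(bin_path, "wb")        # binary mode for writing
writeBin(1:5, zz)                 # five integers
writeBin(pi, zz)                  # one double
close(zz)

zz <- file(bin_path, "rb")        # binary mode for reading
ints <- readBin(zz, what = "integer", n = 5)
dbl <- readBin(zz, what = "double", n = 1)
close(zz)

n_bytes <- file.size(bin_path)    # 5 * 4 + 8 bytes under the stated assumption
unlink(bin_path)
```

<p>The reader must ask for the same types, in the same order, as were
written, since the file itself carries no description of its contents.
</p>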
<p>Base R comes with some facilities to communicate <em>via</em>
<acronym>BSD</acronym> sockets on systems that support them (including the common
Linux, Unix and Windows ports of R). One potential problem with
using sockets is that these facilities are often blocked for security
reasons or to force the use of Web caches, so these functions may be
more useful on an intranet than externally. For new projects it
is suggested that socket connections are used instead of the
low-level interface.
</p>
<aname="index-make_002esocket"></a>
<aname="index-read_002esocket"></a>
<aname="index-write_002esocket"></a>
<aname="index-close_002esocket"></a>
<p>The earlier low-level interface is given by functions <code>make.socket</code>,
<code>read.socket</code>, <code>write.socket</code> and <code>close.socket</code>.
</p>
<hr>
<aname="Using-download_002efile"></a>
<aname="Using-download_002efile-1"></a>
<h3class="section">8.2 Using <code>download.file</code></h3>
<p>Function <code>download.file</code> is provided to read a file from a
Web resource via FTP or HTTP and write it to a file. Often this can be
avoided, as functions such as <code>read.table</code> and <code>scan</code> can read
directly from a URL, either by explicitly using <code>url</code> to open a
connection, or implicitly using it by giving a URL as the <code>file</code>
argument.
</p>
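<p>The two routes can be compared without network access by using a
‘<samp>file://</samp>’ URL that points at a local temporary file. This is only a
sketch: the file, its contents and the way the URL is assembled are
assumptions made for the example, and an FTP or HTTP URL would be used in
the same way.
</p>

```r
src <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2"), src)

## build a file:// URL for the local file (hypothetical stand-in for a Web resource)
src_url <- paste0("file://",
                  if (.Platform$OS.type == "windows") "/" else "",
                  normalizePath(src, winslash = "/"))

dest <- tempfile()
status <- download.file(src_url, dest, quiet = TRUE)  # 0 indicates success
from_copy <- read.csv(dest)       # read the downloaded copy

from_url <- read.csv(src_url)     # or read directly from the URL
unlink(c(src, dest))
```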
<hr>
<aname="Reading-Excel-spreadsheets"></a>
<h2class="chapter">9 Reading Excel spreadsheets</h2>
<p>The most common R data import/export question seems to be ‘how do I read
an Excel spreadsheet’. This chapter collects together advice and
options given earlier. Note that most of the advice is for pre-Excel
2007 spreadsheets and not the later <samp>.xlsx</samp> format.
</p>
<aname="index-read_002ecsv-1"></a>
<aname="index-read_002edelim-1"></a>
<aname="index-read_002eDIF-1"></a>
<aname="index-read_002etable-2"></a>
<aname="index-readClipboard"></a>
<p>The first piece of advice is to avoid doing so if possible! If you have
access to Excel, export the data you want from Excel in tab-delimited or
comma-separated form, and use <code>read.delim</code> or <code>read.csv</code> to
import it into R. (You may need to use <code>read.delim2</code> or
<code>read.csv2</code> in a locale that uses comma as the decimal point.)
Exporting a DIF file and reading it using <code>read.DIF</code> is another
possibility.
</p>
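<p>As a sketch of the text-export route, the files below stand in for data
exported from a spreadsheet (their contents are made up). The second uses
semicolons as separators and commas as decimal points, as is typical in
locales that use comma as the decimal point.
</p>

```r
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,height", "1,1.75", "2,1.62"), csv_path)
dat <- read.csv(csv_path)         # comma-separated, '.' as decimal point

csv2_path <- tempfile(fileext = ".csv")
writeLines(c("id;height", "1;1,75", "2;1,62"), csv2_path)
dat2 <- read.csv2(csv2_path)      # semicolon-separated, ',' as decimal point

same <- isTRUE(all.equal(dat, dat2))
unlink(c(csv_path, csv2_path))
```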
<p>If you do not have Excel, many other programs are able to read such
spreadsheets and export in a text format on both Windows and Unix, for
example Gnumeric (<ahref="http://www.gnome.org/projects/gnumeric/">http://www.gnome.org/projects/gnumeric/</a>) and
OpenOffice (<ahref="https://www.openoffice.org">https://www.openoffice.org</a>). You can also
cut-and-paste between the display of a spreadsheet in such a program and
R: <code>read.table</code> will read from the R console or, under Windows,
from the clipboard (via <code>file = "clipboard"</code> or
<code>readClipboard</code>). The <code>read.DIF</code> function can also read from
the clipboard.
</p>
<p>Note that an Excel <samp>.xls</samp> file is not just a spreadsheet: such
files can contain many sheets, and the sheets can contain formulae,
macros and so on. Not all readers can read sheets other than the first,
and some may be confused by other contents of the file.
</p>
<aname="index-odbcConnectExcel-1"></a>
<aname="index-odbcConnectExcel2007"></a>
<p>Windows users (of 32-bit R) can use <code>odbcConnectExcel</code> in
package <ahref="https://CRAN.R-project.org/package=RODBC"><strong>RODBC</strong></a>. This can select rows and columns from any of the
sheets in an Excel spreadsheet file (at least from Excel 97–2003,
depending on your ODBC drivers: by calling <code>odbcConnect</code> directly
versions back to Excel 3.0 can be read). The version
<code>odbcConnectExcel2007</code> will read the Excel 2007 formats as well as
earlier ones (provided the drivers are installed, including with 64-bit
Windows R: see <ahref="#RODBC">RODBC</a>). OS X users can also use <ahref="https://CRAN.R-project.org/package=RODBC"><strong>RODBC</strong></a> if
they have a suitable driver (e.g. that from Actual Technologies).
</p>
<aname="index-read_002exls"></a>
<p><code>Perl</code> users have contributed a module
<code>OLE::SpreadSheet::ParseExcel</code> and a program <code>xls2csv.pl</code> to
convert Excel 95–2003 spreadsheets to CSV files. Package <ahref="https://CRAN.R-project.org/package=gdata"><strong>gdata</strong></a>
provides a basic wrapper in its <code>read.xls</code> function. With suitable
<code>Perl</code> modules installed this function can also read Excel 2007
spreadsheets.
</p>
<aname="index-xlsReadWrite"></a>
<p>32-bit Windows package <ahref="https://CRAN.R-project.org/package=xlsReadWrite"><strong>xlsReadWrite</strong></a> from
<ahref="http://www.swissr.org/">http://www.swissr.org/</a> and CRAN has a function <code>read.xls</code> to
read <samp>.xls</samp> files (based on a third-party non-Open-Source Delphi
component).
</p>
<aname="index-dataframes2xls"></a>
<aname="index-WriteXLS"></a>
<p>Packages <ahref="https://CRAN.R-project.org/package=dataframes2xls"><strong>dataframes2xls</strong></a> and <ahref="https://CRAN.R-project.org/package=WriteXLS"><strong>WriteXLS</strong></a> each contain a function
to <em>write</em> one or more data frames to an <samp>.xls</samp> file, using
Python and Perl respectively. Another version of <code>write.xls</code> is
available in package <ahref="https://CRAN.R-project.org/package=xlsReadWrite"><strong>xlsReadWrite</strong></a>.
</p>
<aname="index-xlsx"></a>
<aname="index-RExcelXML"></a>
<p>Two packages which can read and manipulate Excel 2007/10
spreadsheets but not earlier formats are <ahref="https://CRAN.R-project.org/package=xlsx"><strong>xlsx</strong></a> (which requires
Java) and the Omegahat package <strong>RExcelXML</strong>.
</p>
<aname="index-XLConnect"></a>
<p>Package <ahref="https://CRAN.R-project.org/package=XLConnect"><strong>XLConnect</strong></a> can read, write and manipulate both Excel
97–2003 and Excel 2007/10 spreadsheets, requiring Java.
</p>
<hr>
<aname="References"></a>
<aname="References-1"></a>
<h2class="appendix">Appendix A References</h2>
<p>R. A. Becker, J. M. Chambers and A. R. Wilks (1988)
<em>The New S Language. A Programming Environment for Data Analysis
and Graphics.</em> Wadsworth & Brooks/Cole.
</p>
<p>J. Bowman, S. Emberson and M. Darnovsky (1996) <em>The
Practical <acronym>SQL</acronym> Handbook. Using Structured Query Language.</em>
Addison-Wesley.
</p>
<p>J. M. Chambers (1998) <em>Programming with Data. A Guide to the S
Language.</em> Springer-Verlag.
</p>
<p>P. Dubois (2000) <em>MySQL.</em> New Riders.
</p>
<p>M. Henning and S. Vinoski (1999) <em>Advanced CORBA Programming
with C++.</em> Addison-Wesley.
</p>
<p>K. Kline and D. Kline (2001) <em>SQL in a Nutshell.</em> O’Reilly.
</p>
<p>B. Momjian (2000) <em>PostgreSQL: Introduction and Concepts.</em>
Addison-Wesley.
Also available at <ahref="http://momjian.us/main/writings/pgsql/aw_pgsql_book/">http://momjian.us/main/writings/pgsql/aw_pgsql_book/</a>.
</p>
<p>B. D. Ripley (2001) Connections. <em>R News</em>, <strong>1/1</strong>, 16–7.
</p>
<tr><td></td><tdvalign="top"><ahref="#index-bzfile"><code>bzfile</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-cat"><code>cat</code></a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-cat-1"><code>cat</code></a>:</td><td> </td><tdvalign="top"><ahref="#Output-to-connections">Output to connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-close-1"><code>close</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-close_002esocket"><code>close.socket</code></a>:</td><td> </td><tdvalign="top"><ahref="#Reading-from-sockets">Reading from sockets</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-count_002efields"><code>count.fields</code></a>:</td><td> </td><tdvalign="top"><ahref="#Variations-on-read_002etable">Variations on read.table</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-data_002erestore"><code>data.restore</code></a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-file"><code>file</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-format"><code>format</code></a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-gzfile"><code>gzfile</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-hdf5"><code>hdf5</code></a>:</td><td> </td><tdvalign="top"><ahref="#Binary-data-formats">Binary data formats</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-isSeekable"><code>isSeekable</code></a>:</td><td> </td><tdvalign="top"><ahref="#Listing-and-manipulating-connections">Listing and manipulating connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-make_002esocket"><code>make.socket</code></a>:</td><td> </td><tdvalign="top"><ahref="#Reading-from-sockets">Reading from sockets</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-netCDF"><code>netCDF</code></a>:</td><td> </td><tdvalign="top"><ahref="#Binary-data-formats">Binary data formats</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-open"><code>open</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-pipe"><code>pipe</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002ecsv"><code>read.csv</code></a>:</td><td> </td><tdvalign="top"><ahref="#Variations-on-read_002etable">Variations on read.table</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002ecsv2"><code>read.csv2</code></a>:</td><td> </td><tdvalign="top"><ahref="#Variations-on-read_002etable">Variations on read.table</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002edelim"><code>read.delim</code></a>:</td><td> </td><tdvalign="top"><ahref="#Variations-on-read_002etable">Variations on read.table</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002edelim2"><code>read.delim2</code></a>:</td><td> </td><tdvalign="top"><ahref="#Variations-on-read_002etable">Variations on read.table</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002eDIF"><code>read.DIF</code></a>:</td><td> </td><tdvalign="top"><ahref="#Data-Interchange-Format-_0028DIF_0029">Data Interchange Format (DIF)</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002edta"><code>read.dta</code></a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002eepiinfo"><code>read.epiinfo</code></a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002emtp"><code>read.mtp</code></a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002eS"><code>read.S</code></a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002esocket"><code>read.socket</code></a>:</td><td> </td><tdvalign="top"><ahref="#Reading-from-sockets">Reading from sockets</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002espss"><code>read.spss</code></a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002esystat"><code>read.systat</code></a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002etable"><code>read.table</code></a>:</td><td> </td><tdvalign="top"><ahref="#Variations-on-read_002etable">Variations on read.table</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002etable-1"><code>read.table</code></a>:</td><td> </td><tdvalign="top"><ahref="#Input-from-connections">Input from connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-read_002export"><code>read.xport</code></a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-readLines-1"><code>readLines</code></a>:</td><td> </td><tdvalign="top"><ahref="#Input-from-connections">Input from connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-scan-2"><code>scan</code></a>:</td><td> </td><tdvalign="top"><ahref="#Input-from-connections">Input from connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-seek"><code>seek</code></a>:</td><td> </td><tdvalign="top"><ahref="#Listing-and-manipulating-connections">Listing and manipulating connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-showConnections"><code>showConnections</code></a>:</td><td> </td><tdvalign="top"><ahref="#Listing-and-manipulating-connections">Listing and manipulating connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-sink"><code>sink</code></a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-sink-1"><code>sink</code></a>:</td><td> </td><tdvalign="top"><ahref="#Output-to-connections">Output to connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-socketConnection"><code>socketConnection</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-stderr"><code>stderr</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-stdin"><code>stdin</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-stdout"><code>stdout</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Sys_002elocaleconv"><code>Sys.localeconv</code></a>:</td><td> </td><tdvalign="top"><ahref="#Variations-on-read_002etable">Variations on read.table</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-textConnection"><code>textConnection</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-truncate"><code>truncate</code></a>:</td><td> </td><tdvalign="top"><ahref="#Listing-and-manipulating-connections">Listing and manipulating connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-url"><code>url</code></a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-write"><code>write</code></a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-write-1"><code>write</code></a>:</td><td> </td><tdvalign="top"><ahref="#Output-to-connections">Output to connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-write_002ecsv"><code>write.csv</code></a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-write_002ecsv2"><code>write.csv2</code></a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-write_002edta"><code>write.dta</code></a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-write_002eforeign"><code>write.foreign</code></a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-write_002ematrix"><code>write.matrix</code></a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-write_002esocket"><code>write.socket</code></a>:</td><td> </td><tdvalign="top"><ahref="#Reading-from-sockets">Reading from sockets</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-write_002etable"><code>write.table</code></a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-write_002etable-1"><code>write.table</code></a>:</td><td> </td><tdvalign="top"><ahref="#Output-to-connections">Output to connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-writeLines"><code>writeLines</code></a>:</td><td> </td><tdvalign="top"><ahref="#Output-to-connections">Output to connections</a></td></tr>
<aname="Concept-index-1"></a>
<h2class="unnumbered">Concept index</h2>
<tablesummary=""><tr><thvalign="top">Jump to: </th><td><aclass="summary-letter"href="#Concept-index_cp_letter-A"><b>A</b></a>
<tr><td></td><tdvalign="top"><ahref="#index-comma-separated-values">comma separated values</a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Compressed-files">Compressed files</a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Connections-1">Connections</a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Connections-2">Connections</a>:</td><td> </td><tdvalign="top"><ahref="#Output-to-connections">Output to connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Connections-3">Connections</a>:</td><td> </td><tdvalign="top"><ahref="#Listing-and-manipulating-connections">Listing and manipulating connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-CSV-files">CSV files</a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-CSV-files-1">CSV files</a>:</td><td> </td><tdvalign="top"><ahref="#Variations-on-read_002etable">Variations on read.table</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Data-Interchange-Format-_0028DIF_0029">Data Interchange Format (DIF)</a>:</td><td> </td><tdvalign="top"><ahref="#Data-Interchange-Format-_0028DIF_0029">Data Interchange Format (DIF)</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Encodings-1">Encodings</a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-EpiData">EpiData</a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-EpiInfo">EpiInfo</a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Exporting-to-a-text-file">Exporting to a text file</a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-File-connections">File connections</a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Hierarchical-Data-Format">Hierarchical Data Format</a>:</td><td> </td><tdvalign="top"><ahref="#Binary-data-formats">Binary data formats</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Importing-from-other-statistical-systems">Importing from other statistical systems</a>:</td><td> </td><tdvalign="top"><ahref="#Importing-from-other-statistical-systems">Importing from other statistical systems</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-locales">locales</a>:</td><td> </td><tdvalign="top"><ahref="#Variations-on-read_002etable">Variations on read.table</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Minitab">Minitab</a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Missing-values">Missing values</a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Missing-values-1">Missing values</a>:</td><td> </td><tdvalign="top"><ahref="#Variations-on-read_002etable">Variations on read.table</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-network-Common-Data-Form">network Common Data Form</a>:</td><td> </td><tdvalign="top"><ahref="#Binary-data-formats">Binary data formats</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-ODBC">ODBC</a>:</td><td> </td><tdvalign="top"><ahref="#Overview-of-RDBMSs">Overview of RDBMSs</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Open-Database-Connectivity">Open Database Connectivity</a>:</td><td> </td><tdvalign="top"><ahref="#Overview-of-RDBMSs">Overview of RDBMSs</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Pipe-connections">Pipe connections</a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Pushback-on-a-connection">Pushback on a connection</a>:</td><td> </td><tdvalign="top"><ahref="#Pushback">Pushback</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Quoting-strings">Quoting strings</a>:</td><td> </td><tdvalign="top"><ahref="#Export-to-text-files">Export to text files</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Quoting-strings-1">Quoting strings</a>:</td><td> </td><tdvalign="top"><ahref="#Variations-on-read_002etable">Variations on read.table</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-S_002dPLUS">S-PLUS</a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-SAS">SAS</a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Sockets">Sockets</a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Sockets-1">Sockets</a>:</td><td> </td><tdvalign="top"><ahref="#Reading-from-sockets">Reading from sockets</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-SPSS">SPSS</a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-SPSS-Data-Entry">SPSS Data Entry</a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Stata">Stata</a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Systat">Systat</a>:</td><td> </td><tdvalign="top"><ahref="#EpiInfo-Minitab-SAS-S_002dPLUS-SPSS-Stata-Systat">EpiInfo Minitab SAS S-PLUS SPSS Stata Systat</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Terminal-connections">Terminal connections</a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-Text-connections">Text connections</a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-URL-connections">URL connections</a>:</td><td> </td><tdvalign="top"><ahref="#Types-of-connections">Types of connections</a></td></tr>
<tr><td></td><tdvalign="top"><ahref="#index-URL-connections-1">URL connections</a>:</td><td> </td><tdvalign="top"><ahref="#Input-from-connections">Input from connections</a></td></tr>