
How to quickly load some sample data (in Julia)


Task

Sometimes you just need to try out a new piece of code, whether for data manipulation, statistical computation, plotting, or something else. It’s handy to be able to quickly load some example data to work with. There is a lot of freely available sample data out there. What’s the easiest way to load it?

Solution

The R programming language comes with many free datasets built in. To make these same datasets available to Julia programmers as well, you can install and import the RDatasets package.
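The installation step can be scripted so that it runs only when needed. This is a minimal sketch, assuming you want to skip the download when the package is already present in the active environment; the `Base.find_package` check is an addition for illustration and not part of the original recipe.

```julia
using Pkg

# Install RDatasets only if it cannot be found in the active environment.
# (The conditional is an illustrative assumption; calling Pkg.add directly also works.)
if Base.find_package("RDatasets") === nothing
    Pkg.add("RDatasets")
end

# Import the package so that dataset() becomes available.
using RDatasets
```

Calling `Pkg.add` on a package that is already installed is harmless, so the check mainly saves time in scripts that are run repeatedly.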

First, make sure the package is installed by running the Julia commands using Pkg and then Pkg.add( "RDatasets" ). You can then load a dataset by giving the name of its R package and the name of the dataset, as follows:

using RDatasets
iris = dataset( "datasets", "iris" )
first( iris, 5 ) # just show the first 5 rows
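5×5 DataFrame
 Row │ SepalLength  SepalWidth  PetalLength  PetalWidth  Species
     │ Float64      Float64     Float64      Float64     Cat…
─────┼───────────────────────────────────────────────────────────
   1 │         5.1         3.5          1.4         0.2  setosa
   2 │         4.9         3.0          1.4         0.2  setosa
   3 │         4.7         3.2          1.3         0.2  setosa
   4 │         4.6         3.1          1.5         0.2  setosa
   5 │         5.0         3.6          1.4         0.2  setosa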
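Once a dataset is loaded, a few quick checks confirm that it has the shape you expect before you build on it. The sketch below uses only functions from Julia's Base (`size`, `names`, `unique`) applied to the same iris data; the variable names are illustrative.

```julia
using RDatasets

# Load the same dataset used in the example above.
iris = dataset("datasets", "iris")

# Number of rows and columns, as a tuple.
dims = size(iris)
println("rows and columns: ", dims)

# Column names, as a vector of strings.
cols = names(iris)
println("columns: ", cols)

# Distinct values in the categorical Species column.
species = unique(iris.Species)
println("species: ", species)
```

These checks are cheap, so they are a reasonable first step with any unfamiliar sample dataset.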

But what datasets are available? There are many! The package itself can list every R package whose datasets it includes.

RDatasets.packages()
34×2 DataFrame
9 rows omitted
 Row │ Package       Title
     │ String15      String
─────┼────────────────────────────────────────────────────────────────────────────────────
   1 │ COUNT         Functions, data and code for count data.
   2 │ Ecdat         Data sets for econometrics
   3 │ HSAUR         A Handbook of Statistical Analyses Using R (1st Edition)
   4 │ HistData      Data sets from the history of statistics and data visualization
   5 │ ISLR          Data for An Introduction to Statistical Learning with Applications in R
   6 │ KMsurv        Data sets from Klein and Moeschberger (1997), Survival Analysis
   7 │ MASS          Support Functions and Datasets for Venables and Ripley's MASS
   8 │ SASmixed      Data sets from "SAS System for Mixed Models"
   9 │ Zelig         Everyone's Statistical Software
  10 │ adehabitatLT  Analysis of Animal Movements
  11 │ boot          Bootstrap Functions (Originally by Angelo Canty for S)
  12 │ car           Companion to Applied Regression
  13 │ cluster       Cluster Analysis Extended Rousseeuw et al.
  ⋮  │      ⋮        ⋮
  23 │ plm           Linear Models for Panel Data
  24 │ plyr          Tools for splitting, applying and combining data
  25 │ pscl          Political Science Computational Laboratory, Stanford University
  26 │ psych         Procedures for Psychological, Psychometric, and Personality Research
  27 │ quantreg      Quantile Regression
  28 │ reshape2      Flexibly Reshape Data: A Reboot of the Reshape Package.
  29 │ robustbase    Basic Robust Statistics
  30 │ rpart         Recursive Partitioning and Regression Trees
  31 │ sandwich      Robust Covariance Matrix Estimators
  32 │ sem           Structural Equation Models
  33 │ survival      Survival Analysis
  34 │ vcd           Visualizing Categorical Data
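The table above lists packages, not individual datasets. To see which datasets one package contains, RDatasets also provides `RDatasets.datasets`, which accepts a package name. The sketch below is a hedged example: the choice of the "MASS" package and its "Boston" dataset is only for illustration, and the variable names are hypothetical.

```julia
using RDatasets

# List the datasets that come from one R package (here "datasets", which holds iris).
available = RDatasets.datasets("datasets")
println(size(available, 1), " datasets listed for the \"datasets\" package")
println(first(available, 5))

# Load a dataset from a different package by passing both names to dataset().
# "MASS" and "Boston" are chosen purely as an illustration.
boston = dataset("MASS", "Boston")
println(first(boston, 5))
```

Calling `RDatasets.datasets()` with no argument should return the index for all packages at once, which is useful when you do not yet know where a dataset lives.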

Content last modified on 24 July 2023.


Contributed by Nathan Carter (ncarter@bentley.edu)