Information
Introduction
This tutorial covers material which is not actually in R for Data Science (2e) by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. But the material is in keeping with the spirit of that book. Some of the material from Chapter 6 Workflow: scripts and projects appears here.
This tutorial introduces the structure and uses of the Terminal from within RStudio. The older, more general term for the terminal is the “command line.” All modern operating systems provide a command line, from which sophisticated users can modify files and directories, using commands like pwd, ls, touch, mkdir, mv, rm, and cd. We also discuss regular expressions as well as some of the metacharacters (like *, ^, and $) from which they are constructed.
Note that a default Windows installation is different from Mac or Linux. If you are using Windows, you must first install RTools. Download the latest version from here. Currently, the Rtools44 installer is what you want. The link is in the seventh paragraph of this page.
For the most part, Windows works like Mac/Unix. There are two main exceptions which are important for this tutorial:
Environment variables look like $R_HOME on Mac/Unix and %R_HOME% on Windows. In both cases, we are using the R_HOME environment variable. We just access this variable differently depending on the operating system.
Directory separators are forward slashes (/) on Mac/Unix and backward slashes (\) on Windows.
The tutorial defaults use Mac/Unix nomenclature. If you are using Windows, you need to translate, for example:
ls $R_HOME/etc
into
ls %R_HOME%\etc
Terminal
The Terminal tab is next to the Console tab within the Console pane in the lower left portion of RStudio. (Yes, it is a bit confusing that we call the entire structure the Console pane even though only one part of it is the Console tab.) We use the Console tab to talk to R. We use the Terminal tab to talk directly with the computer. Sadly, the Console and the Terminal speak different languages.
Exercise 1
Hit the “return” key (the “enter” key on Windows) two times in the Terminal to see what happens. The terminal has a string of characters ending in a $ before your cursor, called the prompt. After a command has been executed, a prompt will be generated on a new line to let you know that Terminal is ready for a new command. Copy and paste the three (blank) lines from the Terminal as your answer below. We do a lot of copy/pasting of commands/responses, so we abbreviate those instructions as CP/CR.
Your answer should look something like:
Davids-MacBook-Pro-2:averages dkane$
Davids-MacBook-Pro-2:averages dkane$
Davids-MacBook-Pro-2:averages dkane$
Your answer will look different from mine. Default prompts vary from
computer to computer. In my case, the first part of the prompt is the
name of my computer: Davids-MacBook-Pro-2
. The second part,
separated with a colon, is the name of the directory,
averages
, in which I am running the tutorial. The third
part, separated by a space, is dkane
, my username.
The prompt acts as a quick way to tell you where in the computer you are. The Terminal is sensitive to which folder location you are in, just like the R Console is.
Exercise 2
Let’s figure out the location of the folder in which we currently
are. (Note that the terms “folder” and “directory” mean the same thing.)
To see your current location within your computer, type the command pwd (print working directory) in the Terminal. Hit the return/enter key to run the command.
Going forward, we will instruct you to “run” a given command. You do this by typing the command at the prompt and then hitting the return/enter key.
CP/CR.
Your answer should look something like mine:
Davids-MacBook-Pro-2:averages dkane$ pwd
/Users/dkane/Desktop/projects/averages
Davids-MacBook-Pro-2:averages dkane$
Notice how the last directory (folder) in the path is included in the prompt.
Exercise 3
Let’s list the contents of the directory we are in, i.e., our working directory. Run ls. CP/CR.
If you do not give it any more information, ls
assumes
that you want a list of what is in your current working directory.
Exercise 4
Let’s make a
directory called example
. Run
mkdir example
. Run ls
to confirm that the new
directory exists inside of the current working directory. CP/CR.
mkdir
creates an empty directory. We — and most other
people — use the terms “folder” and “directory” interchangeably.
Exercise 5
Move into the example
directory by running
cd example
. Run pwd
. CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:averages dkane$ cd example
Davids-MacBook-Pro-2:example dkane$ pwd
/Users/dkane/Desktop/projects/averages/example
Davids-MacBook-Pro-2:example dkane$
The example
directory is one level below the directory
in which I started.
Exercise 6
Let’s create a file called my.txt
. Run
touch my.txt
. Run ls
to confirm that the new
file exists inside the current directory. CP/CR.
Davids-MacBook-Pro-2:example dkane$ touch my.txt
Davids-MacBook-Pro-2:example dkane$ ls
my.txt
Davids-MacBook-Pro-2:example dkane$
touch
creates an empty file.
Exercise 7
Make a copy of my.txt
called my_2.txt
by
using the cp
command. That is, run
cp my.txt my_2.txt
. Confirm that this worked by running
ls
. CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:example dkane$ cp my.txt my_2.txt
Davids-MacBook-Pro-2:example dkane$ ls
my.txt my_2.txt
Davids-MacBook-Pro-2:example dkane$
Exercise 8
We rename files with the mv command, which is derived from move. The first argument is the file we want to rename; the second argument is the new name.
Rename my.txt
to fake.txt
by running
mv my.txt fake.txt
. Run ls
to confirm.
CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:example dkane$ mv my.txt fake.txt
Davids-MacBook-Pro-2:example dkane$ ls
fake.txt my_2.txt
Davids-MacBook-Pro-2:example dkane$
From the computer’s point of view, renaming a file is the same thing as moving a file. In fact, moving is more general since it allows us to change both the name and the location of a file.
Exercise 9
Let’s remove the file
fake.txt
. Run rm fake.txt
. Run ls
to confirm. CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:example dkane$ rm fake.txt
Davids-MacBook-Pro-2:example dkane$ ls
my_2.txt
Davids-MacBook-Pro-2:example dkane$
Be careful with the rm
command in the Terminal. Unlike
moving files to Trash on your computer, it is (usually)
irreversible.
Exercise 10
We want to remove the example
directory. But we can’t do
that while we are “located” in that directory. That is, the computer is
aware that our Terminal process — the entity which is executing our
commands like ls
and cp
— is currently working
from the example
directory.
Run cd ..
. The “dot-dot” symbol — ..
—
references the directory which contains the current directory. Run
pwd
to confirm. CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:example dkane$ cd ..
Davids-MacBook-Pro-2:averages dkane$ pwd
/Users/dkane/Desktop/projects/averages
Davids-MacBook-Pro-2:averages dkane$
Again, note how the prompt changed because we moved directories. The
..
symbol — two periods — always indicates the directory
one level up, i.e., the directory in which example
is
located.
Exercise 11
Run rm -r example
to remove the directory
example
. CP/CR.
-r is an option that allows us to delete a directory. The r stands for recursive because, in order to delete a directory, you must also (recursively) delete every directory and file within that directory.
It is good to clean up. Don’t leave junk files lying around.
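The commands from this section can be strung together in one short session. The sketch below is only an illustration: it assumes a Mac/Linux-style shell, and it works inside a scratch directory created with mktemp (a command this tutorial does not cover) so that nothing in your project is touched. The variable name scratch is made up for this example.

```shell
# Work in a throwaway directory so we cannot damage real files.
scratch=$(mktemp -d)
cd "$scratch"

mkdir example            # create an empty directory
cd example               # move into it
touch my.txt             # create an empty file
cp my.txt my_2.txt       # copy it under a new name
mv my.txt fake.txt       # rename the original
rm fake.txt              # remove the renamed file
ls                       # only my_2.txt remains

cd ..                    # step back up before deleting the directory
rm -r example            # remove the directory and everything in it
```

Running the commands one at a time, with ls or pwd in between, is a good way to check each step before moving on.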
Paths
You will practice using paths both inside and outside the current working directory. You will learn a few shortcuts to make this easier.
Exercise 1
By default, the Terminal in RStudio starts with a working directory
which is the same as the folder in which the current R project resides,
i.e., the folder with the .Rproj
file in it. Run
pwd
to show your current working directory. CP/CR.
The Output pane (bottom right corner) in RStudio includes several
tabs, the most important of which is the Files tab. Click on it. At the
top of the Files tab, you will see a series of directories, separated by
>
, on your computer, beginning with the Home icon. This
should match the path generated by pwd
.
Exercise 2
Use mkdir
to make a directory named paths
inside your current working directory. CP/CR.
The new directory is created directly inside the working directory.
Exercise 3
Let’s change the working directory
to paths
. Run cd paths
. Run pwd
to confirm that you have changed the working directory. CP/CR.
Your answer should look something like:
Davids-MacBook-Pro-2:averages dkane$ cd paths
Davids-MacBook-Pro-2:paths dkane$ pwd
/Users/dkane/Desktop/projects/averages/paths
Davids-MacBook-Pro-2:paths dkane$
Note how the prompt changed after I ran the cd paths
command. Before, the prompt included averages
because that
was the name of the current working directory. Using cd
changed the current working directory to paths
, causing the
prompt to change as well.
Exercise 4
Use mkdir
to make a directory called
lessons
inside paths
by running
mkdir lessons
. Confirm that this worked by running
ls
. CP/CR.
Your answer should look like this:
Davids-MacBook-Pro-2:paths dkane$ mkdir lessons
Davids-MacBook-Pro-2:paths dkane$ ls
lessons
Davids-MacBook-Pro-2:paths dkane$
Because we have changed working directories, the lessons
directory is created inside the paths
directory. It is the
only object in the paths
directory.
Exercise 5
Make a directory within lessons
called
fruits
by running mkdir lessons/fruits
. In
Windows, this would be mkdir lessons\fruits
. The only
difference between the Mac/Linux command and the Windows command is that
the former uses a forward slash, /
, to separate out
directories while the latter uses a back slash, \
.
Run ls lessons
to check to see if fruits
exists inside the lessons
directory. CP/CR.
Your answer should look like this:
Davids-MacBook-Pro-2:paths dkane$ mkdir lessons/fruits
Davids-MacBook-Pro-2:paths dkane$ ls lessons/
fruits
Davids-MacBook-Pro-2:paths dkane$
We could not, as before, simply give fruits
as an
argument to mkdir
because then the directory would be
created directly inside paths
, the current working
directory, rather than within lessons
. To refer to a
location other than the working directory, we need to use a
path, which describes the location directory by
directory. The path above is called a relative path
because it assumes the working directory as its starting point.
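To make the idea concrete, here is a small sketch showing that a relative path and an absolute path can name the same directory. It assumes a Mac/Linux-style shell and uses a scratch directory from mktemp, which is not part of this tutorial. The variable names scratch and start are invented for the example.

```shell
scratch=$(mktemp -d)     # throwaway location for the demonstration
cd "$scratch"
mkdir paths
cd paths
mkdir lessons

# Relative path: interpreted starting from the working directory (paths).
mkdir lessons/fruits

# Absolute path: starts at the root, so it works from any location.
start=$(pwd)
ls "$start/lessons"      # lists fruits
ls lessons               # same result, via the relative path

cd "$scratch"
rm -r paths              # clean up
```

On Windows, the same idea applies with backward slashes in place of forward slashes.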
Exercise 6
Let’s change our working directory to fruits
. Run
cd lessons/fruits
. In Windows,
cd lessons\fruits
.
As you are typing this command, type only the f
in
fruits
and then press the tab key (twice may be necessary
on some computers). Pressing tab autocompletes the name of a file or
directory.
CP/CR.
Get in the practice of using the tab key. Avoid typing whenever possible. Be lazy!
Exercise 7
Use touch
to make a text file inside
fruits
, named pineapple.txt
. Recall that your
current location — i.e., your working directory — is now
fruits
. So, to make a new file within fruits
,
you just need touch pineapple.txt
. Run ls
to
confirm. CP/CR.
Your answer should look like this:
Davids-MacBook-Pro-2:fruits dkane$ touch pineapple.txt
Davids-MacBook-Pro-2:fruits dkane$ ls
pineapple.txt
Davids-MacBook-Pro-2:fruits dkane$
pineapple.txt is still a relative path, but since the location is directly inside the assumed starting point, i.e., the working directory, the name of the file is all we need. If we had used lessons/fruits/pineapple.txt, as we would have had to do if our working directory were paths, we would have gotten an error because those directories would not be found inside our actual starting point of fruits.
Exercise 8
Use touch
to make two more text files inside
fruits
, named pear.txt
and
does-not-belong
. That is, run
touch pear.txt does-not-belong
. Confirm by running
ls
. CP/CR.
Your answer should look like this:
Davids-MacBook-Pro-2:fruits dkane$ touch pear.txt does-not-belong
Davids-MacBook-Pro-2:fruits dkane$ ls
does-not-belong pear.txt pineapple.txt
Davids-MacBook-Pro-2:fruits dkane$
As always, note how the prompt reminds us that we are in the
fruits
directory.
Exercise 9
Let’s change our working directory to lessons
. Run
cd ..
to go to the folder immediately above the working
directory. Use pwd
to confirm that you are in the right
directory. CP/CR.
Your answer should look like this:
Davids-MacBook-Pro-2:fruits dkane$ cd ..
Davids-MacBook-Pro-2:lessons dkane$ pwd
/Users/dkane/Desktop/projects/averages/paths/lessons
Davids-MacBook-Pro-2:lessons dkane$
..
is shorthand for the directory immediately above the
current working directory. The phrase “current working directory” is a
bit redundant. You don’t need the word “current.”
Exercise 10
Use mkdir
to make a directory within
lessons
, named tbd
. Run ls
to
confirm. CP/CR.
Your answer should look like this:
Davids-MacBook-Pro-2:lessons dkane$ mkdir tbd
Davids-MacBook-Pro-2:lessons dkane$ ls
fruits tbd
Davids-MacBook-Pro-2:lessons dkane$
Exercise 11
Use mv
to move the
file does-not-belong
from the fruits
directory
to the tbd
directory by running
mv fruits/does-not-belong tbd
. Run ls tbd
to
confirm. (Again, change forward slashes to backward slashes if you are
using Windows.)
CP/CR.
Your answer should look like this:
Davids-MacBook-Pro-2:lessons dkane$ mv fruits/does-not-belong tbd
Davids-MacBook-Pro-2:lessons dkane$ ls tbd
does-not-belong
Davids-MacBook-Pro-2:lessons dkane$
We can act on files which are not in the working directory as long as we provide a path to those files, either relative (as here) or absolute. In this case, neither mv nor ls is acting on the current directory. Instead, they are acting on lower directories: fruits and tbd.
Exercise 12
Use cd
to change the working directory up to
paths
by running cd ..
. Confirm with
pwd
. Use mv
and the .
shorthand
to move the directory tbd
directly inside of
paths
by running mv lessons/tbd .
. Confirm
with ls
. CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:lessons dkane$ cd ..
Davids-MacBook-Pro-2:paths dkane$ pwd
/Users/dkane/Desktop/projects/averages/paths
Davids-MacBook-Pro-2:paths dkane$ mv lessons/tbd .
Davids-MacBook-Pro-2:paths dkane$ ls
lessons tbd
Davids-MacBook-Pro-2:paths dkane$
When using mv
, there is no difference between moving a
directory and moving a file.
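As a rough illustration of that point, the sketch below moves first a file and then a whole directory using exactly the same syntax. It assumes a Mac/Linux-style shell and a scratch directory made with mktemp (not covered in this tutorial); the variable name scratch is hypothetical.

```shell
scratch=$(mktemp -d)     # throwaway location
cd "$scratch"
mkdir lessons
mkdir lessons/fruits
mkdir lessons/tbd
touch lessons/fruits/does-not-belong

# Move a file: the source and destination are both relative paths.
mv lessons/fruits/does-not-belong lessons/tbd

# Move a directory: same syntax, and . means "right here".
mv lessons/tbd .

ls                       # lessons  tbd
ls tbd                   # does-not-belong
```

The file travels along with the directory that contains it, which is why does-not-belong ends up inside the relocated tbd.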
Exercise 13
Move up one more directory, back to the original working directory,
by running cd ..
. Confirm with pwd
. Run
rm -r paths
to remove all the directories and files with
which we have been working. CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:paths dkane$ cd ..
Davids-MacBook-Pro-2:averages dkane$ pwd
/Users/dkane/Desktop/projects/averages
Davids-MacBook-Pro-2:averages dkane$ rm -r paths
Davids-MacBook-Pro-2:averages dkane$
In olden times, professional programmers would spend a lot of time
learning the various options to commands like rm
. Now, we
just ask ChatGPT or a similar tool.
Important symbols
We have seen one important symbol — ..
— already. This
section will explore it along with .
and ~
.
The most common English terms for these symbols are “dot” for ., “dot dot” for .., and “tilde” for ~. The . symbol indicates the current directory. The .. symbol is for the directory one above the current directory. The ~ symbol is for your “home” directory.
Exercise 1
Use mkdir symbols
to create a new directory called
symbols
. cd
into that directory by running
cd symbols
. Confirm the change with pwd
.
Confirm that the directory is empty by running ls
.
CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:averages dkane$ mkdir symbols
Davids-MacBook-Pro-2:averages dkane$ cd symbols
Davids-MacBook-Pro-2:symbols dkane$ pwd
/Users/dkane/Desktop/projects/averages/symbols
Davids-MacBook-Pro-2:symbols dkane$ ls
Davids-MacBook-Pro-2:symbols dkane$
The more practice we get with command line commands (often called “shell” commands), the easier it is to string together several of them in a row.
Exercise 2
But is the symbols
directory really empty? Run
ls -a
to check. The -
indicates an “option” to
the ls
command. The a
option tells
ls
to return all the members of the directory.
CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:symbols dkane$ ls -a
. ..
Davids-MacBook-Pro-2:symbols dkane$
Every directory, even an “empty” one, includes the entries . and .. by default. The . is a link to the current directory, i.e., to the symbols directory in which we are currently located. The .. is a link to the directory one level up, which is the averages directory in my case.
Exercise 3
Run ls ..
to examine the contents of the directory one
level above the current directory. CP/CR.
My answer looks like this:
Davids-MacBook-Pro-2:symbols dkane$ ls ..
R averages.Rproj theory.html theory_files
README.md symbols theory.qmd
Davids-MacBook-Pro-2:symbols dkane$
Your answer will look different because you started this tutorial in a different directory, with different contents, than I did.
Exercise 4
Run ls ../..
to examine the contents of the directory
two levels above the current directory. CP/CR.
My answer looks like this:
Davids-MacBook-Pro-2:symbols dkane$ ls ../..
GBBO-Analysis mtc simmons
averages ncf soccer_player_birth_months
bootcamp numerai syllabi
boston-college organization tidycensus.tutorials
coordination primer tidymodels.tutorials
...
Davids-MacBook-Pro-2:symbols dkane$
I cut off the long list of directories in my projects directory. The averages directory, which contains the symbols directory in which we are currently located, is in that longer list.
Your answer will look different because you have a different collection of projects, presumably many fewer.
Exercise 5
Run pwd
again. This returns the current working
directory. Note that the return value is an absolute
path. It tells you exactly where you — meaning your Terminal
session or instance — are located on your computer.
My answer looks like this:
Davids-MacBook-Pro-2:symbols dkane$ pwd
/Users/dkane/Desktop/projects/averages/symbols
Davids-MacBook-Pro-2:symbols dkane$
The absolute path for my current location, meaning my current working directory, contains these locations:
The starting / (which would be a \ in Windows) indicates the root, or origin, of the file system. There is no directory higher than the root directory.
Users, followed by my user name, dkane. Most Mac/Linux machines follow this organization.
Desktop, followed by three more directories: first, projects, which is where I store all my R projects; second, averages, which just happens to be the name of the directory in which I started this tutorial; and, third, symbols, which is the directory we just created.
Exercise 6
Run cd
without any argument. Doing so changes your
current directory to your home directory because that is the default
behavior for cd
. Run pwd
to confirm where you
are. CP/CR.
My answer looks like this:
Davids-MacBook-Pro-2:symbols dkane$ cd
Davids-MacBook-Pro-2:~ dkane$ pwd
/Users/dkane
Davids-MacBook-Pro-2:~ dkane$
On a Mac, the most common location for a user’s home directory is within the Users directory (on Linux, it is usually within /home). Note how, after we run cd, the prompt changes to report ~ as the current working directory. ~ is the symbol for the current user’s home directory.
Exercise 7
Use cd, along with the full path to the symbols directory which you determined above, to change the working directory back to the symbols directory. Run pwd to confirm. CP/CR.
My answer looks like this:
Davids-MacBook-Pro-2:~ dkane$ cd /Users/dkane/Desktop/projects/averages/symbols
Davids-MacBook-Pro-2:symbols dkane$ pwd
/Users/dkane/Desktop/projects/averages/symbols
Davids-MacBook-Pro-2:symbols dkane$
This is an example of using an “absolute” path to change locations.
Exercise 8
Run ls ~
. The list shows the contents of your home directory, just as if you had run ls while located there. Because the home directory is
used in the absolute path of so many files, its path — generally
something like /Users/your-user-name-here
— can be written
with the shorthand ~
, which is a tilde, pronounced TIL-D.
CP/CR.
The shorthands we have learned so far, .
,
..
, ~
, stand for the text of the paths for the
corresponding directory. They are not specific to any function and can
be used in any situation where the text of the relevant path could be
used.
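Because these shorthands simply stand in for paths, they work with any command. The sketch below is a minimal illustration, assuming a Mac/Linux-style shell and a scratch directory made with mktemp (not part of this tutorial). The variable names scratch, here, same, and up are invented for the example.

```shell
scratch=$(mktemp -d)     # throwaway location
cd "$scratch"
mkdir symbols
cd symbols

here=$(pwd)
same=$(cd . && pwd)      # . is the current directory, so cd . goes nowhere
up=$(cd .. && pwd)       # .. is the directory that contains symbols
echo "$same"
echo "$up"
echo ~                   # ~ expands to the path of your home directory

cd ..
rm -r symbols            # clean up
```

The first two echo lines should differ only by the trailing /symbols, and the last one should show the same path that pwd reports right after running cd with no argument.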
Exercise 9
Confirm that you are currently located in the symbols
directory by running pwd
. Run cd ..
to move
one directory up, out of the symbols
directory. Delete the
symbols
directory with rm -rf symbols
.
CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:symbols dkane$ pwd
/Users/dkane/Desktop/projects/averages/symbols
Davids-MacBook-Pro-2:symbols dkane$ cd ..
Davids-MacBook-Pro-2:averages dkane$ rm -rf symbols
Davids-MacBook-Pro-2:averages dkane$
Note that rm -rf symbols and rm -rf symbols/, with the trailing forward slash, have the same effect. Except in rare circumstances (an example involving $R_HOME appears later in this tutorial), including the trailing forward slash in the name of a directory has no effect.
Options
Options modify the behavior of command line functions like
ls
.
Exercise 1
Use pwd
to confirm that you are in your default working
directory. Make a directory directly inside the working directory, named
options
, by running mkdir options
.
cd
into options
. Run ls
to
confirm that it is empty. CP/CR.
My answer looks like:
Davids-MacBook-Pro-2:averages dkane$ pwd
/Users/dkane/Desktop/projects/averages
Davids-MacBook-Pro-2:averages dkane$ mkdir options
Davids-MacBook-Pro-2:averages dkane$ cd options/
Davids-MacBook-Pro-2:options dkane$ ls
Davids-MacBook-Pro-2:options dkane$
Note how easy it is to string together several commands in order to accomplish our goals.
Exercise 2
Add a text file to the options
directory, named
.my-hidden.txt
— make sure to include the .
at
the front of the file name — by running
touch .my-hidden.txt
. CP/CR.
Other than .
, avoid using special characters or spaces
anywhere in file names. These are difficult to work with in the
Terminal.
Exercise 3
Run ls
to look for the file you created. CP/CR.
Your answer should be something like:
Davids-MacBook-Pro-2:options dkane$ ls
Davids-MacBook-Pro-2:options dkane$
If you did the previous exercise correctly, you should
not see the new file. This is because we prefixed the
name with .
, which hid the file from normal view. Files
whose name begins with a .
are called “hidden” files for
this reason.
Exercise 4
Run ls -a
. Using the option
-a
, you should be able to see all the
files in the directory. CP/CR.
Options are preceded by a - and come before the argument. Your answer should look like:
Davids-MacBook-Pro-2:options dkane$ ls -a
. .. .my-hidden.txt
Davids-MacBook-Pro-2:options dkane$
The . file refers to the options directory itself. The .. refers to the directory in which the options directory is located. Every directory has both a . and a .. file inside of it, but we only see those files if we use the -a option.
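The difference between ls and ls -a is easiest to see side by side. This is a small sketch under a few assumptions: a Mac/Linux-style shell, a scratch directory from mktemp (not covered in this tutorial), and a second file, visible.txt, whose name is made up for contrast.

```shell
scratch=$(mktemp -d)     # throwaway location
cd "$scratch"
touch .my-hidden.txt visible.txt

ls                       # shows only visible.txt
ls -a                    # also shows ., .., and .my-hidden.txt
```

Nothing about the hidden file is special apart from its name: rename it without the leading dot and plain ls will list it.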
Hidden files live all over your computer. You can turn on your ability to see hidden files in the RStudio Files pane (bottom right corner) by clicking the “More” button towards the top.
Exercise 5
The environmental variable R_HOME
represents the path to a folder of set-up files that R has installed on
your computer. Type and enter echo $R_HOME
alone to make
sure we are telling you the truth.
Recall that, on Windows, this would be echo %R_HOME%.
CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:options dkane$ echo $R_HOME
/Library/Frameworks/R.framework/Resources
Davids-MacBook-Pro-2:options dkane$
Note how the prompt has changed because I am now in the
options
directory.
Exercise 6
Run ls $HOME
. This should generate a list of the items
in your home directory. $HOME
is an environmental variable
that represents the home directory.
If you are on Windows, you can try ls %HOME%
, but,
sometimes, %HOME%
is not defined on Windows. In that case,
ls %USERPROFILE%
should work.
CP/CR.
There are many different environmental variables for places throughout your computer. You can also make your own.
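Here is one way you might define a variable of your own on Mac/Linux. Treat it as a sketch: MY_DATA and scratch are made-up names, mktemp is not covered in this tutorial, and a variable created with export this way lasts only for the current Terminal session.

```shell
scratch=$(mktemp -d)          # throwaway directory to point the variable at
export MY_DATA="$scratch"     # MY_DATA is a hypothetical variable name

touch "$MY_DATA/notes.txt"    # use the variable anywhere a path is expected
echo "$MY_DATA"               # prints the stored path
ls "$MY_DATA"                 # notes.txt
```

The quotes around "$MY_DATA" are a precaution in case the stored path ever contains a space.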
Exercise 7
Use pwd
to confirm that you are still in
options
. Use ls
to see what is inside
R_HOME
by running ls $R_HOME
. CP/CR.
Don’t worry about the details of these files. Our main point is that environmental variables are very handy because they make it easy to refer to a specific location on a computer, like where R is installed, even though the absolute path to this location is different on every person’s computer.
Exercise 8
Run ls -l $R_HOME
, so that we can see more information
in a long list. CP/CR.
My answer looks like this:
Davids-MacBook-Pro-2:options dkane$ ls -l $R_HOME
lrwxrwxr-x 1 root admin 26 Nov 27 11:20 /Library/Frameworks/R.framework/Resources -> Versions/Current/Resources
Davids-MacBook-Pro-2:options dkane$
This is an example of a situation in which $R_HOME and $R_HOME/ produce different results, as we will see in the next exercise. When the location pointed to by an environmental variable is a link, as in this case, using ls -l $R_HOME just gives us back the link.
Exercise 9
Run ls -l $R_HOME/. Just adding the trailing forward slash (a backward slash on Windows) changes the output. CP/CR.
My answer looks like:
Davids-MacBook-Pro-2:options dkane$ ls -l $R_HOME/
total 200
-rw-rw-r-- 1 root admin 18011 Oct 31 21:02 COPYING
-rw-rw-r-- 1 root admin 488 Oct 31 21:02 Info.plist
lrwxrwxr-x 1 root admin 5 Nov 27 11:20 R -> bin/R
-rwxrwxr-x 1 root admin 70944 Oct 31 21:04 Rscript
-rw-rw-r-- 1 root admin 46 Oct 31 21:02 SVN-REVISION
drwxrwxr-x 28 root admin 896 Nov 27 11:20 bin
drwxrwxr-x 23 root admin 736 Nov 27 11:20 doc
...
In a long list, names are in the last column and file type is in the first: if the string in the first column begins with d, it is a directory; if it begins with -, it is a file; if it begins with l, it is a link to another file or directory on the computer. The remaining characters of that first string give the permissions. The middle columns give, from left to right, the number of links, the owner, the group, the file size (in bytes), and the date/time last modified.
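You can reproduce the link behavior without relying on where R is installed. The sketch below assumes a Mac/Linux-style shell and uses two commands this tutorial does not cover: mktemp, for a scratch directory, and ln -s, which creates a link. The names scratch, real, and shortcut are invented for the example.

```shell
scratch=$(mktemp -d)     # throwaway location
cd "$scratch"
mkdir real
touch real/inside.txt
ln -s real shortcut      # shortcut is a link that points to real

ls -l shortcut           # one line, starting with l: the link itself
ls -l shortcut/          # the contents of real, i.e., inside.txt
```

The trailing slash asks for whatever the link points to, so the second command lists the directory rather than the link.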
Exercise 10
Check what is inside the doc
directory by running
ls -l $R_HOME/doc
. CP/CR.
As you could guess from the name, the directory contains mostly
files, including AUTHORS
.
Exercise 11
Let’s make a copy of $R_HOME/doc/AUTHORS
in
options
. Use the command cp
, whose syntax is
identical to mv
, and the .
shorthand. That is,
run cp $R_HOME/doc/AUTHORS .
. CP/CR.
On Windows, we would refer to this file as
%R_HOME%\doc\AUTHORS
. Recall that, in Windows, environment
variables are surrounded by percentage signs – %
– and
directories are separated by backward slashes: \
.
Like mv
, cp
can change a file name. Like
mv
, if the destination file name is omitted,
cp
will retain the same name by default.
Exercise 12
Try to use cp
and .
to copy the
doc
directory into options
. You will get an
error informing you that the thing you tried to copy was a directory.
Run the same command using the option -r
. CP/CR.
As we have seen for rm
, -r
allows us to use
cp
with directories. This is different from
mv
, which does not need -r
.
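The contrast between cp and mv for directories can be sketched as follows. This assumes a Mac/Linux-style shell and a scratch directory from mktemp (not part of this tutorial); the doc directory here is an empty stand-in created for the example, not R's real doc directory.

```shell
scratch=$(mktemp -d)     # throwaway location
cd "$scratch"
mkdir doc
touch doc/AUTHORS        # empty stand-in file

cp -r doc doc-copy       # -r is required to copy a directory
mv doc-copy doc-moved    # mv needs no option, even for a directory
ls                       # doc  doc-moved
ls doc-moved             # AUTHORS
```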
Exercise 13
Let’s view just the initial lines of the AUTHORS file. Run head AUTHORS. CP/CR.
Now you know how R got its name.
Exercise 14
Let’s view more than just the initial 10 lines. The option
-n
allows you to specify the number of initial lines you
want to see. Run head -n 18 AUTHORS
. CP/CR.
The option -n
takes an argument which allows head to
display any number of initial lines.
Exercise 15
Use the command tail
and the same -n
option
to view the last 15 lines of AUTHORS
. CP/CR.
If the argument of -n
is greater than the number of lines
in the file, head
or tail
will just print the
whole file.
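A short practice file makes the behavior of -n easy to check. This is only a sketch: it assumes a Mac/Linux-style shell, uses mktemp for a scratch directory, and builds the file with printf and >, neither of which has been introduced at this point in the tutorial. The file name lines.txt is made up.

```shell
scratch=$(mktemp -d)     # throwaway location
cd "$scratch"
# Build a 5-line practice file.
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > lines.txt

head -n 2 lines.txt      # line 1 and line 2
tail -n 2 lines.txt      # line 4 and line 5
head -n 100 lines.txt    # fewer than 100 lines exist, so all 5 are printed
```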
Exercise 16
Let’s view the whole file at once. Run cat AUTHORS
.
CP/CR.
cat stands for concatenate, which means “to chain together.” While it is often used to print a single text file, the command can take two or more arguments and print the files together.
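To see the “chaining together” in action, the sketch below prints two small files with one cat command. It assumes a Mac/Linux-style shell, a scratch directory from mktemp, and the > redirect (explained in a later exercise) to create the files. The names one.txt and two.txt are hypothetical.

```shell
scratch=$(mktemp -d)     # throwaway location
cd "$scratch"
echo "first file" > one.txt
echo "second file" > two.txt

cat one.txt two.txt      # prints both files, one after the other
cat -n one.txt two.txt   # same text, with line numbers added
```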
Exercise 17
Run cat -n AUTHORS
. Notice that -n
has no
argument. CP/CR.
cat -n
prints the text file with line numbers. Note that
the names end at line 28.
Exercise 18
Let’s make our own text file. Start by running echo, which takes a quoted string, for example, “Your First and Last Name Here”, as its argument. This “echoes” the argument by printing it in the next line of the terminal. CP/CR.
echo
can also take a string that is not in quotes as its
argument.
Exercise 19
We can change the place that we echo to, from the Terminal to a text file, using > to redirect the output. Run echo "Your First and Last Name Here" > test.txt.
Then use ls
to see if the file test.txt
exists
within your working directory. Then use cat
to read this
file. After you have confirmed that the file contains your name, use
rm
to remove test.txt
. CP/CR.
Exercise 20
>
can be used to redirect the output of any command,
not just echo
. Let’s try to make a new text file in
options
with just the first 28 lines of
AUTHORS
called AUTHORS-fake
. Run
head -n 28 AUTHORS > AUTHORS-fake
. CP/CR.
Exercise 21
Using >>
, whose syntax is the same as
>
, we can append output to an existing document. Use
echo
and >>
to append your name to the
list of AUTHORS-fake
. CP/CR.
Like >
, >>
is not limited to
echo
, but can be used to append the output of any
command.
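The difference between > and >> matters most when the target file already exists. The following sketch assumes a Mac/Linux-style shell and a scratch directory from mktemp (not part of this tutorial). The file name names.txt and the names written to it are placeholders.

```shell
scratch=$(mktemp -d)                 # throwaway location
cd "$scratch"
echo "Ada Lovelace" > names.txt      # > creates (or overwrites) the file
echo "Grace Hopper" >> names.txt     # >> appends a new line at the end
cat names.txt                        # two lines, in the order written

echo "Alan Turing" > names.txt       # > again: previous contents are replaced
cat names.txt                        # one line only
```

Since > silently replaces whatever was in the file, it deserves the same care as rm.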
Exercise 22
Use tail
and >>
to append the last 11
lines of AUTHORS
to AUTHORS-fake
. Refer to the syntax in Exercises 15 and 21 if needed. CP/CR.
Your fake is not great because your name is likely not placed in correct alphabetical order.
Exercise 23
Run cd ..
to change your directory to the directory in
which you started the tutorial. Run rm -r options
to remove
the options
directory. CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:options dkane$ cd ..
Davids-MacBook-Pro-2:averages dkane$ rm -r options
Davids-MacBook-Pro-2:averages dkane$
As usual, note how the prompt changes from Davids-MacBook-Pro-2:options dkane$ to Davids-MacBook-Pro-2:averages dkane$ as we move from the options directory to the directory from which we started this tutorial, which is the averages directory in my case.
Wildcards
You will learn how to search for files by using wildcards to form regular expressions.
Exercise 1
Use pwd
to confirm that you are in your default working
directory. Use mkdir
to create a directory named
wildcards
. CP/CR.
The mkdir
command does not produce any output. To
confirm that it had the desired effect, you could run
ls
.
Exercise 2
Change your working directory to wildcards
using
cd
. Confirm the change with pwd
. CP/CR.
Your answer should look something like:
Davids-MacBook-Pro-2:averages dkane$ cd wildcards/
Davids-MacBook-Pro-2:wildcards dkane$ pwd
/Users/dkane/Desktop/projects/averages/wildcards
Davids-MacBook-Pro-2:wildcards dkane$
Note how the prompt changed after we switched directories.
Keep track of your current location. There is a difference between being in the directory above wildcards and being inside of wildcards. Right now, you are inside wildcards.
Exercise 3
Using touch, make three text files named
hat.txt
, cat.txt
, and ate.txt
.
Confirm that you have done so by running ls
. (No worries if
you make a mistake. Just fix it by removing the misnamed file.)
CP/CR.
touch
can take multiple arguments, making a file for
each one. touch hat.txt cat.txt ate.txt
is equivalent to
running touch
three times.
Exercise 4
Terminal commands allow the use of wildcards. A
wildcard *
represents zero or more of any character, and it
is used to work with similarly named files as a grouping.
Let’s generate a list of all our files whose names contain the letter
a
. First simply try to run ls a
. You will get
an error message — “No such file or directory” — because there is no
file named a
in the working directory. CP/CR.
Though we have been using ls
to return all the content
of a directory, ls
can also take a name as an argument, as
above.
Exercise 5
Let’s try again. Run ls a*
. The only return will be
ate.txt
. Because you put *
after
a
, ls
returns only files that begin with
a
. The “star” wildcard — *
— matches anything.
CP/CR.
Though *
will match strings of any length, its
positioning matters. a*
is not the same thing as
*a
. The first matches any string (i.e., any file name)
which starts with a
. The second matches any string which
ends with a
.
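The effect of position can be sketched in a throwaway directory. This is only an illustration: the scratch directory comes from mktemp, and the file named data is a hypothetical addition (it is not part of the exercises) so that *a has something to match.

```shell
# Work in a fresh temporary directory so no real files are touched.
start_dir=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"

# The three tutorial files, plus "data", a made-up file whose name ends in a.
touch hat.txt cat.txt ate.txt data

starts_with_a=$(ls a*)    # names that start with a
ends_with_a=$(ls *a)      # names that end with a
contains_a=$(ls *a*)      # names with an a anywhere

echo "a*  matches: $(echo $starts_with_a)"
echo "*a  matches: $(echo $ends_with_a)"
echo "*a* matches: $(echo $contains_a)"

# Clean up.
cd "$start_dir"
rm -r "$scratch"
```

Because * can match zero characters, *a* also matches names that start or end with a.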
Exercise 6
Now run ls *a*
. This should return
ate.txt cat.txt hat.txt
. CP/CR.
Note that these regular expressions are case-sensitive. A regular expression is a “pattern” which you can use for many purposes, including as a search criterion.
Exercise 7
Make a list of all the files with names, before the suffix, which end
in t
by running ls *t.*
. CP/CR.
This should return hat.txt
and cat.txt
.
Read this regular expression — *t.*
— as matching any
string which includes a t
followed directly by a period,
regardless of any characters which are also included, before or after.
ate.txt
is not matched because the t
and the
.
are separated by e
.
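One way to check a pattern against names without creating any files is a shell case statement, which uses the same wildcard rules. This is a hedged sketch only; the case statement is not used elsewhere in this tutorial.

```shell
# Test each name against the pattern *t.* and collect the ones that match.
matched=""
for name in hat.txt cat.txt ate.txt; do
  case "$name" in
    *t.*) matched="$matched $name" ;;             # a t directly followed by a period
    *)    echo "$name does not match *t.*" ;;
  esac
done
echo "Matched:$matched"
```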
Exercise 8
Save the output of our list of all the file names that contained
a
to a new file named afiles.txt
. Keep
pressing the up arrow key to navigate through your
previous commands until you get to the command you used above:
ls *a*
. Then add > afiles.txt
afterwards
and run the command. Run ls
to confirm that it worked.
CP/CR.
You can use your up
and down
arrow keys to
navigate command history. The >
symbol outputs the
result of the left-hand side command into the right-hand side
location.
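A small sketch of redirection, run in a scratch directory created with mktemp (an assumption of this example, not part of the exercise). It also shows that using > on an existing file replaces the old contents.

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"
touch hat.txt cat.txt ate.txt

# Nothing is printed here: the listing goes into afiles.txt instead.
ls *a* > afiles.txt
saved=$(cat afiles.txt)
echo "$saved"

# Redirecting a different command to the same file overwrites it.
ls h* > afiles.txt
replaced=$(cat afiles.txt)
echo "After the second command, afiles.txt holds: $replaced"

cd "$start_dir"
rm -r "$scratch"
```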
Exercise 9
Wildcards can be used with more than ls
. Let’s move
hat.txt
and cat.txt
together into one
directory.
First, make a new directory called seuss
within the
wildcards
directory in which you are currently located.
Run ls
to confirm. CP/CR.
Your answer should look like:
Davids-MBP:wildcards dkane$ mkdir seuss
Davids-MBP:wildcards dkane$ ls
afiles.txt ate.txt cat.txt hat.txt seuss
Davids-MBP:wildcards dkane$
The wildcards
directory, in which you are currently
located, now includes 5 items, one of which is the seuss
directory.
Exercise 10
Run ls -l
. CP/CR.
The -l
option provides a long-format
listing of the contents of the directory, including information about
the type of content (file versus directory) and last-modification date/times. The
leading d
in the line for seuss
indicates that
it is a directory.
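To see the leading character on its own, a sketch like the following can help. It assumes a scratch directory from mktemp and uses cut, which is not otherwise covered in this tutorial, to keep only the first character of each line.

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"
mkdir seuss
touch ate.txt

ls -l

# The -d option asks ls to describe the directory itself, not its contents.
dir_type=$(ls -ld seuss | cut -c1)
file_type=$(ls -l ate.txt | cut -c1)
echo "seuss starts with:   $dir_type"
echo "ate.txt starts with: $file_type"

cd "$start_dir"
rm -r "$scratch"
```

A directory line starts with d, while an ordinary file line starts with a hyphen.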
Exercise 11
Run ls *t.*
. CP/CR.
Before moving or otherwise acting on a group of files, we often want
to ensure that we have the correct grouping. Running ls
with the appropriate regular expression is an easy way to test.
Exercise 12
Use the previous command (since we have confirmed that it generates
the list of files we want to move) but replace ls
with
mv
and then add seuss
, which indicates the
location to which we want to move the files. In other words, run
mv *t.* seuss
. CP/CR.
The mv
command, like mkdir
, does not
produce any output. You need to check by hand to ensure it has done what
you wanted.
Exercise 13
Run ls -R
. CP/CR.
Your answer should look like:
Davids-MBP:wildcards dkane$ ls -R
afiles.txt ate.txt seuss
./seuss:
cat.txt hat.txt
Davids-MBP:wildcards dkane$
The -R
option to the ls
command stands for
recursive. We see, not simply the contents of the
current directory (two files and the seuss
directory) but
also the contents of the seuss
directory itself.
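The preview, move, verify routine from the last few exercises can be sketched in one place. The scratch directory from mktemp is an assumption of this example, so that it can be run anywhere without touching your own files.

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"
touch hat.txt cat.txt ate.txt
mkdir seuss

ls *t.*          # preview: which files does the pattern select?
mv *t.* seuss    # move exactly those files (no output on success)
ls -R            # verify: list this directory and everything below it

in_seuss=$(ls seuss)
left_behind=$(ls)

cd "$start_dir"
rm -r "$scratch"
```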
Exercise 14
Use cd ..
to move out of the wildcards
directory, one level up. Run pwd
. CP/CR.
In order to delete a directory, you cannot be located within that
directory. We want to delete (that is, “remove”) the
wildcards
directory since we always want to clean up any
temporary files/directories which we have created.
Exercise 15
Run rm wildcards
. That command will fail because
“wildcards: is a directory”. You can’t just rm
directories
in the same way that you rm
files. Instead, you need to use
rm -r
, where the -r
indicates
recursive. Run rm -r wildcards
. CP/CR.
Note that both wildcards
and wildcards/
refer to the same entity, the wildcards
directory. The
Terminal often “adds” a forward slash (/
) — or, on Windows,
a backward slash (\
) — to the end of directory names to
indicate that the object is a directory, not simply a file.
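The difference between rm and rm -r can be sketched as follows, again in a scratch directory from mktemp (an assumption of this example). The error message from the failing command is hidden so that only the outcome is reported.

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"
mkdir wildcards
touch wildcards/ate.txt

# Plain rm refuses to remove a directory and returns a failure status.
if rm wildcards 2>/dev/null; then
  plain_rm="succeeded"
else
  plain_rm="failed"
fi
echo "rm wildcards $plain_rm"

# rm -r removes the directory and everything inside it.
rm -r wildcards
if [ -d wildcards ]; then still_there="yes"; else still_there="no"; fi
echo "wildcards still exists: $still_there"

cd "$start_dir"
rm -r "$scratch"
```

Be careful with rm -r: it deletes everything below the named directory, without asking first.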
Files
Downloading files and moving them to a sensible location is a core skill for data scientists.
Exercise 1
Confirm your location with pwd
. Run
mkdir files
to create a temporary directory in which to
work. cd
to that directory with cd files
.
Confirm that it is empty with ls
. CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:averages dkane$ pwd
/Users/dkane/Desktop/projects/averages
Davids-MacBook-Pro-2:averages dkane$ mkdir files
Davids-MacBook-Pro-2:averages dkane$ cd files/
Davids-MacBook-Pro-2:files dkane$ ls
Davids-MacBook-Pro-2:files dkane$
We often, as here, move to the directory in which we will be working.
But we could have stayed in the starting directory, which is
/Users/dkane/Desktop/projects/averages
for me, and then
worked from there, using files/
in the various commands to
come.
Exercise 2
Every computer has a default directory in which files are downloaded.
The most common name for this directory is Downloads
and
its most common location is your home directory.
Run ls ~/Downloads
to examine the most likely location
for that directory.
CP/CR.
If this command fails for you, then figure out where your default download location is and then adjust this command, and future commands, accordingly.
If, for some reason, you can’t locate your Downloads
directory, then just answer all the exercises in this section of the
tutorial with “I could not find my Downloads directory.”
My answer looks like:
Davids-MacBook-Pro-2:files dkane$ ls ~/Downloads/
113-6439507-9501836.pdf
113-6441072-0767436.pdf
2024_11_06_10_29_42.pdf
223591000809.JPEG
Bassett.jpg
...
The ...
indicates that there are many more files in this
directory. I am not as organized as I ought to be. Try not to have your
Downloads
directory overrun with old files.
Exercise 3
In your browser, go to
https://github.com/PPBDS/r4ds.tutorials
.
This is the location for the R package which contains this tutorial.
Click on the TODO.txt
file, which should bring you to this
URL:
https://github.com/PPBDS/r4ds.tutorials/blob/main/TODO.txt
.
Click on the “Download raw file” symbol to download the
TODO.txt
file to your computer.
Run ls ~/Downloads/T*
. CP/CR.
My answer looks like:
Davids-MacBook-Pro-2:files dkane$ ls ~/Downloads/T*
/Users/dkane/Downloads/TODO.txt /Users/dkane/Downloads/Toronto.jpg
/Users/dkane/Downloads/TOM.pdf
Davids-MacBook-Pro-2:files dkane$
The *
is a wildcard character, meaning that the
ls
command should return all the files which begin with a
capital T
. For me, there are three such files in my
Downloads
directory.
Note that, depending on your computer/setup, the downloaded file might
not be named exactly TODO.txt
. If your downloaded file has a different name, use that name in place of
TODO.txt
in the upcoming exercises.
Exercise 4
Run ls ~/Downloads/TODO.txt
. CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:files dkane$ ls ~/Downloads/TODO.txt
/Users/dkane/Downloads/TODO.txt
Davids-MacBook-Pro-2:files dkane$
I often use ls
to confirm that a file exists in the
location in which I think it should. ls
also returns the
absolute path to the file.
Exercise 5
Run mv /Users/dkane/Downloads/TODO.txt .
. Run
ls
to confirm that you have moved the TODO.txt
file from the Downloads
directory to the current working
directory. CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:files dkane$ mv /Users/dkane/Downloads/TODO.txt .
Davids-MacBook-Pro-2:files dkane$ ls
TODO.txt
Davids-MacBook-Pro-2:files dkane$
Recall that the .
symbol indicates the current
directory. Since we did not give a new name for the file, the old name
is retained. We could have given the file a new name by using:
mv /Users/dkane/Downloads/TODO.txt TODO_2.txt
We also could have used ~
to avoid providing the
absolute path by running:
mv ~/Downloads/TODO.txt .
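The two forms of mv, keeping the name versus giving a new one, can be compared side by side. In this sketch the downloads and files directories inside a mktemp scratch directory are stand-ins for ~/Downloads and the working directory, and TOM.pdf is just a second example file.

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)
mkdir "$scratch/downloads" "$scratch/files"
touch "$scratch/downloads/TODO.txt" "$scratch/downloads/TOM.pdf"
cd "$scratch/files"

mv ../downloads/TODO.txt .           # destination is a directory: the name is kept
mv ../downloads/TOM.pdf TOM_2.pdf    # destination is a new name: the file is renamed

moved=$(ls)
remaining=$(ls ../downloads)
echo "In files:     $(echo $moved)"
echo "In downloads: ${remaining:-nothing left}"

cd "$start_dir"
rm -r "$scratch"
```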
Exercise 6
Delete the file by running rm TODO.txt
. Go one level up
with cd ..
. Delete the files
directory with
rm -r files
. CP/CR.
Your answer should look like:
Davids-MacBook-Pro-2:files dkane$ rm TODO.txt
Davids-MacBook-Pro-2:files dkane$ cd ..
Davids-MacBook-Pro-2:averages dkane$ rm -r files/
Davids-MacBook-Pro-2:averages dkane$
grep
In a previous section, we used wildcards to create
regular expressions in order to generate lists of file
names which matched a specific pattern. In this section, we will use
grep
to look for words inside of files. The same ideas carry over to
grep
, although some details differ. In a grep pattern, for example, a star applies to the character just before it instead of matching anything on its own.
Exercise 1
Make a directory called grepping
using
mkdir
. Run ls
. CP/CR.
We call this folder grepping
rather than
grep
because you should not use a command, like
grep
, as the name of a file or directory. It is possible to
do this, just like it is possible to jump off a bridge. But, in both
cases, you will probably regret it.
Exercise 2
cd
into the grepping
directory. Run this
command:
printf "hat.txt\ncat.txt\nate.txt\n\n" > files.txt
The \n
symbol will insert a new line between
hat.txt
, cat.txt
, and
ate.txt
.
Run cat files.txt
. CP/CR.
The extra \n
at the end ensures that there is a blank
line at the end of the file, which is always a good idea when dealing
with text files.
Don’t worry about the printf
command. It is not used
very often. We didn’t use echo
because it is annoying to
use in the case of multiple lines of input when you are writing code
which needs to work on both Mac/Unix and Windows.
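To confirm what printf wrote, one option is to look at the last line of the file. This sketch assumes a scratch directory from mktemp and uses tail, which is not otherwise covered in this tutorial, to print only the final line.

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"

# Each \n ends a line, so the final \n\n leaves one empty line at the end.
printf "hat.txt\ncat.txt\nate.txt\n\n" > files.txt
cat files.txt

last_line=$(tail -n 1 files.txt)
if [ -z "$last_line" ]; then
  echo "The last line of files.txt is blank."
fi

cd "$start_dir"
rm -r "$scratch"
```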
Exercise 3
Let’s search the text file for lines containing the letter “a”. Run
grep a files.txt
. CP/CR.
Remember to use the tab key to take advantage of auto-completion. Don’t type! Tab complete!
This should return all three lines from the file, which are the names of the three files we initially created. The blank line is not returned because it does not include an “a”.
grep
is used to search for a string of characters within
a text file. When it finds a match, it prints the line. You can put the
regular expression, in this case a
, in quotes but that is
not required.
Exercise 4
Let’s search through files.txt
for lines that begin with
the letter “a”. Run grep ^a files.txt
. CP/CR.
This should return ate.txt
.
^
is put at the beginning of a regular expression to
indicate that the expression must occur at the beginning of a line. A
dollar sign — $
— at the end of a regular expression
indicates that the expression must occur at the end of the line.
We will not attempt the previous search we made of these files with
wildcards because it is difficult, though possible, to search for
strings with special characters like .
with
grep
.
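For reference, here is a minimal sketch of both anchors, plus one possible way to search for a literal period by putting a backslash in front of it. The patterns are quoted so that the shell passes them to grep unchanged. The scratch directory from mktemp is an assumption of this example.

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"
printf "hat.txt\ncat.txt\nate.txt\n\n" > files.txt

begins_with_a=$(grep '^a' files.txt)   # lines that begin with a
ends_with_t=$(grep 't$' files.txt)     # lines that end with t
t_then_dot=$(grep 't\.' files.txt)     # a t directly followed by a literal period

echo "Begins with a: $(echo $begins_with_a)"
echo "Ends with t:   $(echo $ends_with_t)"
echo "t then period: $(echo $t_then_dot)"

cd "$start_dir"
rm -r "$scratch"
```

Without the backslash, a period in a grep pattern matches any single character, which is why the earlier wildcard search needs this extra care.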
Exercise 5
Run ls $R_HOME
— ls %R_HOME%
on Windows —
to make a list of the contents of the R home directory. CP/CR.
R_HOME
is an environment variable which should behave
similarly regardless of your computer. There should be several
directories and files in this listing, including COPYING
and tests
.
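Environment variables can be explored safely with a variable of your own. In this Mac/Unix sketch, DEMO_HOME is a made-up variable (not something R or RStudio defines) that points at a scratch directory, and the last command shows one way to check whether R_HOME is set in the current shell.

```shell
# DEMO_HOME is hypothetical: it exists only for this example.
export DEMO_HOME="$(mktemp -d)"
mkdir "$DEMO_HOME/etc" "$DEMO_HOME/doc"

# Quoting the variable keeps the command working even if the path contains spaces.
listing=$(ls "$DEMO_HOME")
echo "$listing"

# Print R_HOME if it is set, or a fallback message if it is not.
echo "${R_HOME:-R_HOME is not set in this shell}"

rm -r "$DEMO_HOME"
```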
Exercise 6
Add >
to ls $R_HOME
to save this output
to a new file named RHOME.txt
located within the
grepping
directory. That is, run
ls $R_HOME > RHOME.txt
or, on Windows,
ls %R_HOME% > RHOME.txt
Run ls
. CP/CR.
The >
symbol is known as the “output redirection
operator.” Because all the output has been sent (or redirected) into a
new file, RHOME.txt
, this command produces no output.
There are now two files in grepping
:
files.txt
and RHOME.txt
.
Exercise 7
Run wc RHOME.txt
. The wc
command stands for
word count. CP/CR.
The first number returned by wc
is the number of lines
in the file. The second is the total number of words. The third is the
number of characters. In this case, the number of lines is the same as
the number of words.
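The three numbers can also be requested one at a time. The sketch below uses a small, made-up listing (three names, one per line) rather than your real RHOME.txt, and pipes through tr to strip the padding spaces that some versions of wc add.

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"

# A hypothetical stand-in for RHOME.txt: one name per line.
printf "bin\ndoc\netc\n" > listing.txt

lines=$(wc -l < listing.txt | tr -d ' ')   # number of lines
words=$(wc -w < listing.txt | tr -d ' ')   # number of words
chars=$(wc -c < listing.txt | tr -d ' ')   # number of characters, newlines included

echo "lines=$lines words=$words chars=$chars"

cd "$start_dir"
rm -r "$scratch"
```

Lines and words agree here because each line holds exactly one name with no spaces in it.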
Exercise 8
Use grep
to search RHOME.txt
for lines that
contain R
. That is, run:
grep R RHOME.txt
CP/CR.
A mistyped grep
command can cause the Terminal to get
stuck and not generate a new prompt. If this happens, navigate to the
top of the Terminal page and click the dropdown menu where Terminal says
“busy.” Select “Interrupt Terminal Command” from this menu in order to
generate a new prompt and start again.
Exercise 9
Use grep
and ^
to search
RHOME.txt
for lines that begin with
R
. That is, run:
grep ^R RHOME.txt
CP/CR.
^
and $
are called metacharacters. It is
often useful to put quotes around a regular expression which includes
metacharacters.
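One reason quoting helps: the shell expands wildcards before grep ever sees them. The sketch below, run in a mktemp scratch directory (an assumption of this example), gives different results for the same pattern with and without quotes. The -c option makes grep print a count of matching lines.

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"
touch ate.txt
printf "hat.txt\ncat.txt\nate.txt\n\n" > files.txt

# Unquoted: the shell replaces a* with the matching file name, ate.txt,
# so grep actually searches files.txt for the pattern ate.txt.
unquoted=$(grep a* files.txt)
echo "Unquoted a* printed: $unquoted"

# Quoted: grep receives a* itself, which means "zero or more a",
# so every line matches, including the blank one.
quoted_count=$(grep -c 'a*' files.txt)
echo "Quoted 'a*' matched $quoted_count lines"

cd "$start_dir"
rm -r "$scratch"
```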
Exercise 10
Use grep
and $
to search
RHOME.txt
for lines that end with c
. That is,
run:
grep c$ RHOME.txt
CP/CR.
This should return two lines: doc
and etc
.
(Your computer set up might be different, so don’t worry if your answer
is not the same.)
Exercise 11
Use grep
to search RHOME.txt
for lines that
contain the string ib
. That is, run:
grep ib RHOME.txt
CP/CR.
egrep
is an even more powerful version of
grep
. Learning more about egrep
is beyond the
scope of this tutorial.
Exercise 12
Use cd ..
to change the working directory to the
directory which is one level above your current location. This should be
the default working directory from which we started. Run
pwd
. CP/CR.
If you keep your R projects in the same place on your computer, only the last directory in the path for the default working directory will change between projects.
Exercise 13
Use rm -r
to remove grepping
. CP/CR.
It is always a good idea to clean up after yourself. As the signs say when taking a hike in the forest: “Take only photos. Leave only footprints.”
Summary
This tutorial covered material which is not actually in R for Data Science (2e) by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. But the material is in keeping with the spirit of that book. Some of the material from Chapter 6 Workflow: scripts and projects appeared here.
This tutorial introduced the structure and uses of the Terminal from
within RStudio. The older, more general term for the terminal is the
“command line.” All modern operating systems provide a command line,
from which sophisticated users can modify files and directories, using
commands like pwd
, ls
, touch
,
mkdir
, mv
, rm
, and
cd
. We also discussed regular expressions as well as some
of the metacharacters (like, *
, ^
and
$
) from which they are constructed.
Want to learn more about the command line and related topics? Read The Unix Workbench by Sean Kross.