How to use R Markdown

27 R Markdown

R Markdown provides a unified authoring framework for data science, combining your code, its results, and your prose commentary. R Markdown documents are fully reproducible and support dozens of output formats, like PDFs, Word files, slideshows, and more.

R Markdown files are designed to be used in three ways:

  1. For communicating to decision makers, who want to focus on the conclusions, not the code behind the analysis.
  2. For collaborating with other data scientists (including future you!), who are interested in both your conclusions, and how you reached them (i.e. the code).
  3. As an environment in which to do data science, as a modern day lab notebook where you can capture not only what you did, but also what you were thinking.

R Markdown integrates a number of R packages and external tools. This means that help is, by and large, not available through ?. Instead, as you work through this chapter, and use R Markdown in the future, keep these resources close to hand:

  • R Markdown Cheat Sheet: Help > Cheatsheets > R Markdown Cheat Sheet,
  • R Markdown Reference Guide: Help > Cheatsheets > R Markdown Reference Guide.

27.1.1 Prerequisites

You need the rmarkdown package, but you don’t need to explicitly install it or load it, as RStudio automatically does both when needed.

27.2 R Markdown basics

This is an R Markdown file, a plain text file that has the extension .Rmd :

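---
title: "Diamond sizes"
date: 2016-08-25
output: html_document
---

```{r setup, include = FALSE}
library(ggplot2)
library(dplyr)

smaller <- diamonds %>% 
  filter(carat <= 2.5)
```

We have data about `r nrow(diamonds)` diamonds. Only 
`r nrow(diamonds) - nrow(smaller)` are larger than
2.5 carats. The distribution of the remainder is shown
below:

```{r, echo = FALSE}
smaller %>% 
  ggplot(aes(carat)) + 
  geom_freqpoly(binwidth = 0.01)
```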

It contains three important types of content:

  1. An (optional) YAML header surrounded by ---s.
  2. Chunks of R code surrounded by ```.
  3. Text mixed with simple text formatting like # heading and _italics_.

When you open an .Rmd , you get a notebook interface where code and output are interleaved. You can run each code chunk by clicking the Run icon (it looks like a play button at the top of the chunk), or by pressing Cmd/Ctrl + Shift + Enter. RStudio executes the code and displays the results inline with the code:

To produce a complete report containing all text, code, and results, click “Knit” or press Cmd/Ctrl + Shift + K. You can also do this programmatically with rmarkdown::render("1-example.Rmd"). This will display the report in the viewer pane, and create a self-contained HTML file that you can share with others.
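As a rough sketch of the programmatic route, the snippet below writes a tiny .Rmd file to a temporary directory and renders it with rmarkdown::render(). The file name mini-report.Rmd and its contents are made up for illustration, and the render step is skipped when rmarkdown or Pandoc is not available.

```r
# Build the chunk fence programmatically to keep this example easy to paste.
fence <- strrep("`", 3)

# Write a minimal R Markdown file (hypothetical content) to a temp directory.
rmd_path <- file.path(tempdir(), "mini-report.Rmd")
writeLines(c(
  "---",
  "title: \"Mini report\"",
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r}"),
  "summary(cars$speed)",
  fence
), rmd_path)

# Render only if the toolchain is present; render() returns the output path.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()
html_path <- if (can_render) {
  rmarkdown::render(rmd_path, output_dir = tempdir(), quiet = TRUE)
} else {
  NA_character_
}
html_path
```

Calling render() from a script like this is convenient when a report has to be rebuilt on a schedule rather than by clicking Knit.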

When you knit the document, R Markdown sends the .Rmd file to knitr, http://yihui.name/knitr/, which executes all of the code chunks and creates a new markdown (.md) document which includes the code and its output. The markdown file generated by knitr is then processed by pandoc, http://pandoc.org/, which is responsible for creating the finished file. The advantage of this two step workflow is that you can create a very wide range of output formats, as you’ll learn about in R markdown formats.
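The two steps can also be run by hand, which may help when working out where a failure occurs. This is only a sketch: the file names are hypothetical, knitr::knit() performs the first step, and rmarkdown::pandoc_convert() stands in for the Pandoc step (render() normally adds many more Pandoc arguments than shown here).

```r
fence <- strrep("`", 3)
rmd_path <- file.path(tempdir(), "two-step.Rmd")
md_path  <- file.path(tempdir(), "two-step.md")
writeLines(c(
  "# Two-step demo",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), rmd_path)

if (requireNamespace("knitr", quietly = TRUE)) {
  # Step 1: knitr runs the chunks and writes plain markdown.
  knitr::knit(rmd_path, output = md_path, quiet = TRUE)
  md_lines <- readLines(md_path)
  print(md_lines)

  # Step 2: Pandoc converts that markdown to the final format.
  if (requireNamespace("rmarkdown", quietly = TRUE) &&
      rmarkdown::pandoc_available()) {
    rmarkdown::pandoc_convert(
      md_path,
      to = "html",
      output = file.path(tempdir(), "two-step.html")
    )
  }
}
```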

To get started with your own .Rmd file, select File > New File > R Markdown… in the menubar. RStudio will launch a wizard that you can use to pre-populate your file with useful content that reminds you how the key features of R Markdown work.

The following sections dive into the three components of an R Markdown document in more detail: the markdown text, the code chunks, and the YAML header.

27.2.1 Exercises

  1. Create a new notebook using File > New File > R Notebook. Read the instructions. Practice running the chunks. Verify that you can modify the code, re-run it, and see modified output.
  2. Create a new R Markdown document with File > New File > R Markdown… Knit it by clicking the appropriate button. Knit it by using the appropriate keyboard shortcut. Verify that you can modify the input and see the output update.
  3. Compare and contrast the R notebook and R markdown files you created above. How are the outputs similar? How are they different? How are the inputs similar? How are they different? What happens if you copy the YAML header from one to the other?
  4. Create one new R Markdown document for each of the three built-in formats: HTML, PDF and Word. Knit each of the three documents. How does the output differ? How does the input differ? (You may need to install LaTeX in order to build the PDF output — RStudio will prompt you if this is necessary.)

27.3 Text formatting with Markdown

Prose in .Rmd files is written in Markdown, a lightweight set of conventions for formatting plain text files. Markdown is designed to be easy to read and easy to write. It is also very easy to learn. The guide below shows how to use Pandoc’s Markdown, a slightly extended version of Markdown that R Markdown understands.

Text formatting
------------------------------------------------------------

*italic* or _italic_
**bold** __bold__
`code`
superscript^2^ and subscript~2~

Headings
------------------------------------------------------------

# 1st Level Header

## 2nd Level Header

### 3rd Level Header

Lists
------------------------------------------------------------

* Bulleted list item 1

* Item 2

    * Item 2a

    * Item 2b

1. Numbered list item 1

1. Item 2. The numbers are incremented automatically in the output.

Links and images
------------------------------------------------------------

[linked phrase](http://example.com)

![optional caption text](path/to/img.png)

Tables
------------------------------------------------------------

First Header  | Second Header
------------- | -------------
Content Cell  | Content Cell
Content Cell  | Content Cell

The best way to learn these is simply to try them out. It will take a few days, but soon they will become second nature, and you won’t need to think about them. If you forget, you can get to a handy reference sheet with Help > Markdown Quick Reference.

27.3.1 Exercises

  1. Practice what you’ve learned by creating a brief CV. The title should be your name, and you should include headings for (at least) education or employment. Each of the sections should include a bulleted list of jobs/degrees. Highlight the year in bold.
  2. Using the R Markdown quick reference, figure out how to:
    1. Add a footnote.
    2. Add a horizontal rule.
    3. Add a block quote.

    27.4 Code chunks

    To run code inside an R Markdown document, you need to insert a chunk. There are three ways to do so:

    1. The keyboard shortcut Cmd/Ctrl + Alt + I
    2. The “Insert” button icon in the editor toolbar.
    3. By manually typing the chunk delimiters ```{r} and ```.

    Obviously, I’d recommend you learn the keyboard shortcut. It will save you a lot of time in the long run!

    You can continue to run the code using the keyboard shortcut that by now (I hope!) you know and love: Cmd/Ctrl + Enter. However, chunks get a new keyboard shortcut: Cmd/Ctrl + Shift + Enter, which runs all the code in the chunk. Think of a chunk like a function. A chunk should be relatively self-contained, and focussed around a single task.

    The following sections describe the chunk header, which consists of ```{r, followed by an optional chunk name, followed by comma-separated options, followed by }. Next comes your R code, and the chunk end is indicated by a final ```.

    27.4.1 Chunk name

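    Chunks can be given an optional name: ```{r by-name}. This has three advantages:

      1. You can more easily navigate to specific chunks using the drop-down code navigator in the bottom-left of the script editor.
      2. Graphics produced by the chunks will have useful names that make them easier to use elsewhere.
      3. You can set up networks of cached chunks to avoid re-performing expensive computations on every run.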

    There is one chunk name that imbues special behaviour: setup . When you’re in a notebook mode, the chunk named setup will be run automatically once, before any other code is run.

    27.4.2 Chunk options

    Chunk output can be customised with options, arguments supplied to the chunk header. Knitr provides almost 60 options that you can use to customize your code chunks. Here we’ll cover the most important chunk options that you’ll use frequently. You can see the full list at http://yihui.name/knitr/options/.

    The most important set of options controls if your code block is executed and what results are inserted in the finished report:

    • eval = FALSE prevents code from being evaluated. (And obviously if the code is not run, no results will be generated). This is useful for displaying example code, or for disabling a large block of code without commenting each line.
    • include = FALSE runs the code, but doesn’t show the code or results in the final document. Use this for setup code that you don’t want cluttering your report.
    • echo = FALSE prevents code, but not the results from appearing in the finished file. Use this when writing reports aimed at people who don’t want to see the underlying R code.
    • message = FALSE or warning = FALSE prevents messages or warnings from appearing in the finished file.
    • results = 'hide' hides printed output; fig.show = 'hide' hides plots.
    • error = TRUE causes the render to continue even if code returns an error. This is rarely something you’ll want to include in the final version of your report, but can be very useful if you need to debug exactly what is going on inside your .Rmd. It’s also useful if you’re teaching R and want to deliberately include an error. The default, error = FALSE, causes knitting to fail if there is a single error in the document.

    The following table summarises which types of output each option suppresses:

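    Option            | Run code | Show code | Output | Plots | Messages | Warnings
    ----------------- | -------- | --------- | ------ | ----- | -------- | --------
    eval = FALSE      | x        |           | x      | x     | x        | x
    include = FALSE   |          | x         | x      | x     | x        | x
    echo = FALSE      |          | x         |        |       |          |
    results = "hide"  |          |           | x      |       |          |
    fig.show = "hide" |          |           |        | x     |          |
    message = FALSE   |          |           |        |       | x        |
    warning = FALSE   |          |           |        |       |          | x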
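To see two of these options in action without creating a file, one possible sketch is to knit a single chunk from a character vector with knitr::knit(text = ...) and inspect the markdown it returns. The helper knit_chunk() is hypothetical and exists only for this illustration.

```r
fence <- strrep("`", 3)

# Knit one chunk with the given header options and return the markdown text.
knit_chunk <- function(options) {
  src <- c(paste0(fence, "{r, ", options, "}"), "6 * 7", fence)
  knitr::knit(text = src, quiet = TRUE)
}

if (requireNamespace("knitr", quietly = TRUE)) {
  hidden_code <- knit_chunk("echo = FALSE")  # result shown, code hidden
  not_run     <- knit_chunk("eval = FALSE")  # code shown, nothing evaluated
  cat(hidden_code, not_run, sep = "\n")
}
```

The first result contains the printed value but not the expression; the second contains the expression but no value.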

    27.4.3 Table

    By default, R Markdown prints data frames and matrices as you’d see them in the console:

    mtcars[1:5, ]
    #>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
    #> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
    #> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
    #> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
    #> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
    #> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2

    If you prefer that data be displayed with additional formatting you can use the knitr::kable function. The code below generates Table 27.1.

    knitr::kable(
      mtcars[1:5, ],
      caption = "A knitr kable."
    )
    Table 27.1: A knitr kable.

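                      | mpg  | cyl | disp | hp  | drat | wt    | qsec  | vs | am | gear | carb
    ----------------- | ---- | --- | ---- | --- | ---- | ----- | ----- | -- | -- | ---- | ----
    Mazda RX4         | 21.0 | 6   | 160  | 110 | 3.90 | 2.620 | 16.46 | 0  | 1  | 4    | 4
    Mazda RX4 Wag     | 21.0 | 6   | 160  | 110 | 3.90 | 2.875 | 17.02 | 0  | 1  | 4    | 4
    Datsun 710        | 22.8 | 4   | 108  | 93  | 3.85 | 2.320 | 18.61 | 1  | 1  | 4    | 1
    Hornet 4 Drive    | 21.4 | 6   | 258  | 110 | 3.08 | 3.215 | 19.44 | 1  | 0  | 3    | 1
    Hornet Sportabout | 18.7 | 8   | 360  | 175 | 3.15 | 3.440 | 17.02 | 0  | 0  | 3    | 2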

    Read the documentation for ?knitr::kable to see the other ways in which you can customise the table. For even deeper customisation, consider the xtable, stargazer, pander, tables, and ascii packages. Each provides a set of tools for returning formatted tables from R code.
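As an illustration of the kind of customisation kable() supports, the sketch below rounds the numbers and renames the columns. The column selection and the new header names are arbitrary choices made for this example.

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  tbl <- knitr::kable(
    mtcars[1:3, c("mpg", "wt", "qsec")],
    format = "markdown",                        # plain-text pipe table
    digits = 1,                                 # round numeric columns
    col.names = c("MPG", "Weight", "1/4 mile")  # friendlier headers
  )
  print(tbl)
}
```

Inside an R Markdown document the format argument can usually be left out, because kable() picks a format that suits the output being knitted.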

    There is also a rich set of options for controlling how figures are embedded. You’ll learn about these in saving your plots.

    27.4.4 Caching

    Normally, each knit of a document starts from a completely clean slate. This is great for reproducibility, because it ensures that you’ve captured every important computation in code. However, it can be painful if you have some computations that take a long time. The solution is cache = TRUE . When set, this will save the output of the chunk to a specially named file on disk. On subsequent runs, knitr will check to see if the code has changed, and if it hasn’t, it will reuse the cached results.

    The caching system must be used with care, because by default it is based on the code only, not its dependencies. For example, here the processed_data chunk depends on the raw_data chunk:

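    ```{r raw_data}
    rawdata <- readr::read_csv("a_very_large_file.csv")
    ```

    ```{r processed_data, cache = TRUE}
    processed_data <- rawdata %>% 
      filter(!is.na(import_var)) %>% 
      mutate(new_variable = complicated_transformation(x, y, z))
    ```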

    Caching the processed_data chunk means that it will get re-run if the dplyr pipeline is changed, but it won’t get rerun if the read_csv() call changes. You can avoid that problem with the dependson chunk option:

    ```{r processed_data, cache = TRUE, dependson = "raw_data"}
    processed_data <- rawdata %>% 
      filter(!is.na(import_var)) %>% 
      mutate(new_variable = complicated_transformation(x, y, z))
    ```

    dependson should contain a character vector of every chunk that the cached chunk depends on. Knitr will update the results for the cached chunk whenever it detects that one of its dependencies has changed.

    Note that the chunks won’t update if a_very_large_file.csv changes, because knitr caching only tracks changes within the .Rmd file. If you want to also track changes to that file you can use the cache.extra option. This is an arbitrary R expression that will invalidate the cache whenever it changes. A good function to use is file.info() : it returns a bunch of information about the file including when it was last modified. Then you can write:

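    ```{r raw_data, cache.extra = file.info("a_very_large_file.csv")}
    rawdata <- readr::read_csv("a_very_large_file.csv")
    ```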
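To make the idea concrete, this small sketch shows that the value returned by file.info() changes when the file does, which is what lets it act as a cache key. The file is a throwaway created in a temporary directory; only its name is borrowed from the example above.

```r
# Hypothetical data file used only for this demonstration.
csv_path <- file.path(tempdir(), "a_very_large_file.csv")
writeLines(c("x,y", "1,2"), csv_path)
before <- file.info(csv_path)[, c("size", "mtime")]

# Appending a row changes the file's size (and usually its mtime).
cat("3,4\n", file = csv_path, append = TRUE)
after <- file.info(csv_path)[, c("size", "mtime")]

identical(before, after)  # FALSE: the cache.extra value has changed
```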

    As your caching strategies get progressively more complicated, it’s a good idea to regularly clear out all your caches with knitr::clean_cache() .

    I’ve used the advice of David Robinson to name these chunks: each chunk is named after the primary object that it creates. This makes it easier to understand the dependson specification.

    27.4.5 Global options

    As you work more with knitr, you will discover that some of the default chunk options don’t fit your needs and you want to change them. You can do this by calling knitr::opts_chunk$set() in a code chunk. For example, when writing books and tutorials I set:

    knitr::opts_chunk$set(
      comment = "#>",
      collapse = TRUE
    )

    This uses my preferred comment formatting, and ensures that the code and output are kept closely entwined. On the other hand, if you were preparing a report, you might set:

    knitr::opts_chunk$set(
      echo = FALSE
    )

    That will hide the code by default, showing only the chunks you deliberately choose to show (with echo = TRUE). You might consider setting message = FALSE and warning = FALSE, but that would make it harder to debug problems because you wouldn’t see any messages in the final document.
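When experimenting with global options in an interactive session, it can help to save the current values first and restore them afterwards. A minimal sketch, assuming knitr is installed:

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  # Remember the current values so they can be restored afterwards.
  old <- knitr::opts_chunk$get(c("echo", "message", "warning"))

  # Report-style default: hide code unless a chunk sets echo = TRUE itself.
  knitr::opts_chunk$set(echo = FALSE)
  echo_now <- knitr::opts_chunk$get("echo")
  print(echo_now)

  # Put the previous values back.
  do.call(knitr::opts_chunk$set, old)
}
```

In a real document the opts_chunk$set() call would normally live in the setup chunk, so there is nothing to restore.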

    27.4.6 Inline code

    There is one other way to embed R code into an R Markdown document: directly into the text, with: `r ` . This can be very useful if you mention properties of your data in the text. For example, in the example document I used at the start of the chapter I had:

    We have data about `r nrow(diamonds)` diamonds. Only `r nrow(diamonds) - nrow(smaller)` are larger than 2.5 carats. The distribution of the remainder is shown below:

    When the report is knit, the results of these computations are inserted into the text:

    We have data about 53940 diamonds. Only 126 are larger than 2.5 carats. The distribution of the remainder is shown below:

    When inserting numbers into text, format() is your friend. It allows you to set the number of digits so you don’t print to a ridiculous degree of accuracy, and a big.mark to make numbers easier to read. I’ll often combine these into a helper function:

    comma <- function(x) format(x, digits = 2, big.mark = ",")
    comma(3452345)
    #> [1] "3,452,345"
    comma(.12358124331)
    #> [1] "0.12"
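The same approach extends to other kinds of numbers. The sketch below adds a hypothetical percent() helper alongside comma() and builds a sentence from the two counts quoted earlier (53940 diamonds, 126 of them larger than 2.5 carats); in a real document these values would come from inline `r ` expressions rather than being typed in.

```r
comma <- function(x) format(x, digits = 2, big.mark = ",")

# Hypothetical helper: express a proportion as a percentage string.
percent <- function(x) paste0(format(100 * x, digits = 2), "%")

n_total <- 53940  # rows in diamonds, as quoted above
n_large <- 126    # diamonds larger than 2.5 carats, as quoted above

sentence <- paste0(
  "Of ", comma(n_total), " diamonds, ", percent(n_large / n_total),
  " are larger than 2.5 carats."
)
sentence
#> [1] "Of 53,940 diamonds, 0.23% are larger than 2.5 carats."
```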

    27.4.7 Exercises

    1. Add a section that explores how diamond sizes vary by cut, colour, and clarity. Assume you’re writing a report for someone who doesn’t know R, and instead of setting echo = FALSE on each chunk, set a global option.
    2. Download diamond-sizes.Rmd from https://github.com/hadley/r4ds/tree/master/rmarkdown. Add a section that describes the largest 20 diamonds, including a table that displays their most important attributes.
    3. Modify diamond-sizes.Rmd to use comma() to produce nicely formatted output. Also include the percentage of diamonds that are larger than 2.5 carats.
    4. Set up a network of chunks where d depends on c and b , and both b and c depend on a . Have each chunk print lubridate::now() , set cache = TRUE , then verify your understanding of caching.

    27.5 Troubleshooting

    Troubleshooting R Markdown documents can be challenging because you are no longer in an interactive R environment, and you will need to learn some new tricks. The first thing you should always try is to recreate the problem in an interactive session. Restart R, then “Run all chunks”, either from the Code menu, under Run region, or with the keyboard shortcut Ctrl + Alt + R. If you’re lucky, that will recreate the problem, and you can figure out what’s going on interactively.

    If that doesn’t help, there must be something different between your interactive environment and the R Markdown environment. You’re going to need to systematically explore the options. The most common difference is the working directory: the working directory of an R Markdown document is the directory in which it lives. Check that the working directory is what you expect by including getwd() in a chunk.

    Next, brainstorm all the things that might cause the bug. You’ll need to systematically check that they’re the same in your R session and your R markdown session. The easiest way to do that is to set error = TRUE on the chunk causing the problem, then use print() and str() to check that settings are as you expect.
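One way to make that comparison systematic is to run the same diagnostic code both in the console and in a chunk, then compare the two printouts. The selection of settings below is only a suggestion, and the list element names are arbitrary.

```r
# Collect a few settings that often differ between sessions.
diagnostics <- list(
  working_dir = getwd(),
  r_version   = R.version.string,
  lib_paths   = .libPaths(),
  locale      = Sys.getlocale("LC_COLLATE"),
  packages    = loadedNamespaces()
)
str(diagnostics)
```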

    27.6 YAML header

    You can control many other “whole document” settings by tweaking the parameters of the YAML header. You might wonder what YAML stands for: it originally stood for “yet another markup language” (the official expansion is now the recursive “YAML Ain’t Markup Language”), and it is designed for representing hierarchical data in a way that’s easy for humans to read and write. R Markdown uses it to control many details of the output. Here we’ll discuss two: document parameters and bibliographies.

    27.6.1 Parameters

    R Markdown documents can include one or more parameters whose values can be set when you render the report. Parameters are useful when you want to re-render the same report with distinct values for various key inputs. For example, you might be producing sales reports per branch, exam results by student, or demographic summaries by country. To declare one or more parameters, use the params field.

    This example uses a my_class parameter to determine which class of cars to display:

    ---
    output: html_document
    params:
      my_class: "suv"
    ---

    ```{r setup, include = FALSE}
    library(ggplot2)
    library(dplyr)

    class <- mpg %>% filter(class == params$my_class)
    ```

    # Fuel economy for `r params$my_class`s

    ```{r, message = FALSE}
    ggplot(class, aes(displ, hwy)) + 
      geom_point() + 
      geom_smooth(se = FALSE)
    ```

    As you can see, parameters are available within the code chunks as a read-only list named params .

    You can write atomic vectors directly into the YAML header. You can also run arbitrary R expressions by prefacing the parameter value with !r . This is a good way to specify date/time parameters.

    params:
      start: !r lubridate::ymd("2015-01-01")
      snapshot: !r lubridate::ymd_hms("2015-01-01 12:30:00")

    In RStudio, you can click the “Knit with Parameters” option in the Knit dropdown menu to set parameters, render, and preview the report in a single user friendly step. You can customise the dialog by setting other options in the header. See http://rmarkdown.rstudio.com/developer_parameterized_reports.html#parameter_user_interfaces for more details.

    Alternatively, if you need to produce many such parameterised reports, you can call rmarkdown::render() with a list of params :

    rmarkdown::render("fuel-economy.Rmd", params = list(my_class = "suv"))
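For a self-contained way to try this out, the sketch below writes a small parameterised document, reads its declared defaults with rmarkdown::yaml_front_matter(), and then renders it with a different value. The file names are hypothetical, and envir = new.env() is used here so that the params object created during rendering does not collide with anything in your session.

```r
fence <- strrep("`", 3)
rmd_path <- file.path(tempdir(), "params-demo.Rmd")
writeLines(c(
  "---",
  "output: html_document",
  "params:",
  "  my_class: \"suv\"",
  "---",
  "",
  paste0(fence, "{r}"),
  "params$my_class",
  fence
), rmd_path)

if (requireNamespace("rmarkdown", quietly = TRUE)) {
  # Defaults declared in the header.
  defaults <- rmarkdown::yaml_front_matter(rmd_path)$params
  str(defaults)

  # Override the default at render time (needs Pandoc).
  if (rmarkdown::pandoc_available()) {
    out <- rmarkdown::render(
      rmd_path,
      params = list(my_class = "compact"),
      output_file = "params-demo-compact.html",
      output_dir = tempdir(),
      envir = new.env(),
      quiet = TRUE
    )
  }
}
```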

    This is particularly powerful in conjunction with purrr::pwalk(). The following example creates a report for each value of class found in mpg. First we create a data frame that has one row for each class, giving the filename of the report and the params:

    reports <- tibble(
      class = unique(mpg$class),
      filename = stringr::str_c("fuel-economy-", class, ".html"),
      params = purrr::map(class, ~ list(my_class = .))
    )
    reports
    #> # A tibble: 7 × 3
    #>   class   filename                  params          
    #>   <chr>   <chr>                     <list>          
    #> 1 compact fuel-economy-compact.html <named list [1]>
    #> 2 midsize fuel-economy-midsize.html <named list [1]>
    #> 3 suv     fuel-economy-suv.html     <named list [1]>
    #> 4 2seater fuel-economy-2seater.html <named list [1]>
    #> 5 minivan fuel-economy-minivan.html <named list [1]>
    #> 6 pickup  fuel-economy-pickup.html  <named list [1]>
    #> # ℹ 1 more row

    Then we match the column names to the argument names of render() , and use purrr’s parallel walk to call render() once for each row:

    reports %>% 
      select(output_file = filename, params) %>% 
      purrr::pwalk(rmarkdown::render, input = "fuel-economy.Rmd")
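If the pwalk() call feels opaque, the following base R sketch spells out what happens for each row. It is a dry run: dry_render() is a hypothetical stand-in for rmarkdown::render() that only records the call it would make, and the three classes are a hand-picked subset rather than the values taken from mpg.

```r
# Hypothetical subset of classes; the real example derives them from mpg.
classes      <- c("compact", "suv", "pickup")
output_files <- paste0("fuel-economy-", classes, ".html")
params_list  <- lapply(classes, function(cl) list(my_class = cl))

# Stand-in for rmarkdown::render() that only describes the call.
dry_render <- function(input, output_file, params) {
  sprintf("render(%s -> %s, my_class = %s)",
          input, output_file, params$my_class)
}

# One call per report: output_file and params vary, input stays fixed.
calls <- character(length(classes))
for (i in seq_along(classes)) {
  calls[i] <- dry_render(
    input = "fuel-economy.Rmd",
    output_file = output_files[i],
    params = params_list[[i]]
  )
}
calls
```

Swapping dry_render for rmarkdown::render would perform the actual renders, assuming fuel-economy.Rmd exists in the working directory.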

    27.6.2 Bibliographies and Citations

    Pandoc can automatically generate citations and a bibliography in a number of styles. To use this feature, specify a bibliography file using the bibliography field in your file’s header. The field should contain a path from the directory that contains your .Rmd file to the bibliography file:

     bibliography: rmarkdown.bib

    You can use many common bibliography formats including BibLaTeX, BibTeX, EndNote, and MEDLINE.

    To create a citation within your .Rmd file, use a key composed of ‘@’ + the citation identifier from the bibliography file. Then place the citation in square brackets. Here are some examples:

    Separate multiple citations with a `;`: Blah blah [@smith04; @doe99].

    You can add arbitrary comments inside the square brackets:
    Blah blah [see @doe99, pp. 33-35; also @smith04, ch. 1].

    Remove the square brackets to create an in-text citation: @smith04
    says blah, or @smith04 [p. 33] says blah.

    Add a `-` before the citation to suppress the author's name:
    Smith says blah [-@smith04].

    When R Markdown renders your file, it will build and append a bibliography to the end of your document. The bibliography will contain each of the cited references from your bibliography file, but it will not contain a section heading. As a result it is common practice to end your file with a section header for the bibliography, such as # References or # Bibliography .

    You can change the style of your citations and bibliography by referencing a CSL (citation style language) file in the csl field:

    bibliography: rmarkdown.bib
    csl: apa.csl

    As with the bibliography field, the csl field should contain a path to the file. Here I assume that the csl file is in the same directory as the .Rmd file. A good place to find CSL style files for common bibliography styles is http://github.com/citation-style-language/styles.
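As a small end-to-end sketch of how the pieces fit together, the code below creates a bibliography file and a document that cites it, side by side in one directory. It uses knitr::write_bib(), which generates BibTeX entries for R packages with keys prefixed by "R-"; the directory and file names are made up for the example.

```r
work_dir <- file.path(tempdir(), "bib-demo")
dir.create(work_dir, showWarnings = FALSE)

if (requireNamespace("knitr", quietly = TRUE)) {
  # Create rmarkdown.bib with an entry for base R (citation key: R-base).
  bib_path <- file.path(work_dir, "rmarkdown.bib")
  knitr::write_bib("base", file = bib_path)

  # A document in the same directory that cites the entry.
  writeLines(c(
    "---",
    "output: html_document",
    "bibliography: rmarkdown.bib",
    "---",
    "",
    "All computations were done in R [@R-base].",
    "",
    "# References"
  ), file.path(work_dir, "cited.Rmd"))

  print(list.files(work_dir))
}
```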

    27.7 Learning more

    R Markdown is still relatively young, and is still growing rapidly. The best place to stay on top of innovations is the official R Markdown website: http://rmarkdown.rstudio.com.

    There are two important topics that we haven’t covered here: collaboration, and the details of accurately communicating your ideas to other humans. Collaboration is a vital part of modern data science, and you can make your life much easier by using version control tools, like Git and GitHub. We recommend two free resources that will teach you about Git:

    1. “Happy Git with R”: a user friendly introduction to Git and GitHub for R users, by Jenny Bryan. The book is freely available online: http://happygitwithr.com
    2. The “Git and GitHub” chapter of R Packages, by Hadley. You can also read it for free online: http://r-pkgs.had.co.nz/git.html.

    Chapter 1 Installation

    We assume you have already installed R (https://www.r-project.org) (R Core Team 2023) and the RStudio IDE (https://www.rstudio.com). RStudio is not required but recommended, because it makes it easier for an average user to work with R Markdown. If you do not have the RStudio IDE installed, you will have to install Pandoc (http://pandoc.org); otherwise there is no need to install Pandoc separately, because RStudio bundles it. Next you can install the rmarkdown package in R:

    # Install from CRAN
    install.packages('rmarkdown')

    # Or if you want to test the development version,
    # install from GitHub
    if (!requireNamespace("devtools"))
      install.packages('devtools')
    devtools::install_github('rstudio/rmarkdown')

    If you want to generate PDF output, you will need to install LaTeX. For R Markdown users who have not installed LaTeX before, we recommend that you install TinyTeX (https://yihui.name/tinytex/):

    install.packages('tinytex')
    tinytex::install_tinytex()  # install TinyTeX

    TinyTeX is a lightweight, portable, cross-platform, and easy-to-maintain LaTeX distribution. The R companion package tinytex (Xie 2023d) can help you automatically install missing LaTeX packages when compiling LaTeX or R Markdown documents to PDF, and also ensures a LaTeX document is compiled the correct number of times to resolve all cross-references. If you do not understand what these two things mean, you should probably follow our recommendation to install TinyTeX, because these details are often not worth your time or attention.

    With the rmarkdown package, RStudio/Pandoc, and LaTeX, you should be able to compile most R Markdown documents. In some cases, you may need other software packages, and we will mention them when necessary.
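A quick way to check which of these pieces are already in place is sketched below. It installs nothing; the helper has_pkg() and the names in the setup list are just illustrative choices.

```r
has_pkg <- function(pkg) requireNamespace(pkg, quietly = TRUE)

setup <- list(
  rmarkdown = has_pkg("rmarkdown"),
  pandoc    = has_pkg("rmarkdown") && rmarkdown::pandoc_available(),
  tinytex   = has_pkg("tinytex") && tinytex::is_tinytex(),
  # Any LaTeX distribution on the PATH will do for PDF output.
  pdflatex  = unname(nzchar(Sys.which("pdflatex")))
)
str(setup)
```

A FALSE for pandoc outside RStudio usually means Pandoc needs to be installed separately, as described above.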

    R Markdown: The Definitive Guide

    An R Notebook is an R Markdown document with chunks that can be executed independently and interactively, with output visible immediately beneath the input. See Figure 3.3 for an example.

    FIGURE 3.3: An R Notebook example.

    R Notebooks are an implementation of Literate Programming that allows for direct interaction with R while producing a reproducible document with publication-quality output.

    Any R Markdown document can be used as a notebook, and all R Notebooks can be rendered to other R Markdown document types. A notebook can therefore be thought of as a special execution mode for R Markdown documents. The immediacy of notebook mode makes it a good choice while authoring the R Markdown document and iterating on code. When you are ready to publish the document, you can share the notebook directly, or render it to a publication format with the Knit button.

    3.2.1 Using Notebooks

    3.2.1.1 Creating a Notebook

    You can create a new notebook in RStudio with the menu command File -> New File -> R Notebook , or by using the html_notebook output type in your document’s YAML metadata.

    ---
    title: "My Notebook"
    output: html_notebook
    ---

    By default, RStudio enables inline output (Notebook mode) on all R Markdown documents, so you can interact with any R Markdown document as though it were a notebook. If you have a document with which you prefer to use the traditional console method of interaction, you can disable notebook mode by clicking the gear button in the editor toolbar, and choosing Chunk Output in Console (Figure 3.4).

    FIGURE 3.4: Send the R code chunk output to the console.

    If you prefer to use the console by default for all your R Markdown documents (restoring the behavior in previous versions of RStudio), you can make Chunk Output in Console the default by unchecking Tools -> Options -> R Markdown -> Show output inline for all R Markdown documents.

    3.2.1.2 Inserting chunks

    Notebook chunks can be inserted quickly using the keyboard shortcut Ctrl + Alt + I (macOS: Cmd + Option + I ), or via the Insert menu in the editor toolbar.

    Because all of a chunk’s output appears beneath the chunk (not alongside the statement which emitted the output, as it does in the rendered R Markdown output), it is often helpful to split chunks that produce multiple outputs into two or more chunks which each produce only one output. To do this, select the code to split into a new chunk (Figure 3.5), and use the same keyboard shortcut for inserting a new code chunk (Figure 3.6).

    FIGURE 3.5: Select the code to split into a new chunk.

    FIGURE 3.6: Insert a new chunk from the code selected before.

    3.2.1.3 Executing code

    Code in the notebook is executed with the same gestures you would use to execute code in an R Markdown document:

    1. Use the green triangle button on the toolbar of a code chunk that has the tooltip “Run Current Chunk,” or Ctrl + Shift + Enter (macOS: Cmd + Shift + Enter ) to run the current chunk.
    2. Press Ctrl + Enter (macOS: Cmd + Enter ) to run just the current statement. Running a single statement is much like running an entire chunk consisting only of that statement.
    3. There are other ways to run a batch of chunks if you click the menu Run on the editor toolbar, such as Run All , Run All Chunks Above , and Run All Chunks Below .

    The primary difference is that when executing chunks in an R Markdown document, all the code is sent to the console at once, but in a notebook, only one line at a time is sent. This allows execution to stop if a line raises an error.

    When you execute code in a notebook, an indicator will appear in the gutter to show you execution progress (Figure 3.7). Lines of code that have been sent to R are marked with dark green; lines that have not yet been sent to R are marked with light green. If at least one chunk is waiting to be executed, you will see a progress meter appear in the editor’s status bar, indicating the number of chunks remaining to be executed. You can click on this meter at any time to jump to the currently executing chunk. When a chunk is waiting to execute, the Run button in its toolbar will change to a “queued” icon. If you do not want the chunk to run, you can click on the icon to remove it from the execution queue.

    FIGURE 3.7: The indicator in the gutter to show the execution progress of a code chunk in the notebook.

    In general, when you execute code in a notebook chunk, it will do exactly the same thing as it would if that same code were typed into the console. There are however a few differences:

    • Output: The most obvious difference is that most forms of output produced from a notebook chunk are shown in the chunk output rather than, for example, the RStudio Viewer or the Plots pane. Console output (including warnings and messages) appears both at the console and in the chunk output.
    • Working directory: The current working directory inside a notebook chunk is always the directory containing the notebook .Rmd file. This makes it easier to use relative paths inside notebook chunks, and also matches the behavior when knitting, making it easier to write code that works identically both interactively and in a standalone render. You’ll get a warning if you try to change the working directory inside a notebook chunk, and the directory will revert back to the notebook’s directory once the chunk is finished executing. You can suppress this warning by using the warnings = FALSE chunk option. If it is necessary to execute notebook chunks in a different directory, you can change the working directory for all your chunks by using the knitr root.dir option. For instance, to execute all notebook chunks in the grandparent folder of the notebook:

     knitr::opts_knit$set(root.dir = normalizePath(".."))

    To execute an inline R expression in the notebook, put your cursor inside the chunk and press Ctrl + Enter (macOS: Cmd + Enter ). As in the execution of ordinary chunks, the content of the expression will be sent to the R console for evaluation. The results will appear in a small pop-up window next to the code (Figure 3.8).

    FIGURE 3.8: Output from an inline R expression in the notebook.

    In notebooks, inline R expressions can only produce text (not figures or other kinds of output). It is also important that inline R expressions execute quickly and do not have side-effects, as they are executed whenever you save the notebook.

    Notebooks are typically self-contained. However, in some situations, it is preferable to re-use code from an R script as a notebook chunk, as in knitr’s code externalization. This can be done by using knitr::read_chunk() in your notebook’s setup chunk, along with a special ## ---- chunkname annotation in the R file from which you intend to read code. Here is a minimal example with two files:

    example.Rmd

     ```{r}
     knitr::read_chunk("example.R")
     ```

     ```{r chunk}
     ```

    example.R

     ## ---- chunk
     1 + 1

    When you execute the empty chunk in the notebook example.Rmd , code from the external file example.R will be inserted, and the results displayed inline, as though the chunk contained that code (Figure 3.9).

    FIGURE 3.9: Execute a code chunk read from an external R script.
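    Outside the notebook interface, the same mechanism can be checked from the R console. The sketch below is an illustration only: it writes a stand-in for example.R to a temporary file (a hypothetical path), registers it with knitr::read_chunk(), and then looks up the stored code through knitr::knit_code, the object knitr uses to manage chunk code.

```r
library(knitr)

# Write a stand-in for example.R to a temporary file (hypothetical path).
script <- tempfile(fileext = ".R")
writeLines(c("## ---- chunk", "1 + 1"), script)

# Register every labelled chunk found in the script with knitr.
read_chunk(script)

# Look up the code that knitr stored under the label "chunk".
chunk_code <- as.character(knit_code$get("chunk"))
chunk_code

# Evaluate the stored code, roughly what running the empty chunk would do.
result <- eval(parse(text = chunk_code))
result
```

    The label after ## ---- must match the label of the empty chunk in the .Rmd file, since that is how knitr pairs the two.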

    3.2.1.4 Chunk output

    When code is executed in the notebook, its output appears beneath the code chunk that produced it. You can clear an individual chunk’s output by clicking the X button in the upper right corner of the output, or collapse it by clicking the chevron.

    It is also possible to clear or collapse all of the output in the document at once using the Collapse All Output and Clear All Output menu items available on the gear menu in the editor toolbar (Figure 3.4).

    If you want to fully reset the state of the notebook, the item Restart R and Clear Output on the Run menu on the editor toolbar will do the job.

    Ordinary R Markdown documents are “knitted,” but notebooks are “previewed.” While the notebook preview looks similar to a rendered R Markdown document, the notebook preview does not execute any of your R code chunks. It simply shows you a rendered copy of the Markdown output of your document along with the most recent chunk output. This preview is generated automatically whenever you save the notebook (whether you are viewing it in RStudio or not); see the section beneath on the *.nb.html file for details.

    When html_notebook is the topmost (default) format in your YAML metadata, you will see a Preview button in the editor toolbar. Clicking it will show you the notebook preview (Figure 3.10).

    FIGURE 3.10: Preview a notebook.

    If you have configured R Markdown previewing to use the Viewer pane (as illustrated in Figure 3.10), the preview will be automatically updated whenever you save your notebook.

    When an error occurs while a notebook chunk is executing (Figure 3.11):

    FIGURE 3.11: Errors in a notebook.

    1. Execution will stop; the remaining lines of that chunk (and any chunks that have not yet been run) will not be executed.
    2. The editor will scroll to the error.
    3. The line of code that caused the error will have a red indicator in the editor’s gutter.

    If you want your notebook to keep running after an error, you can suppress the first two behaviors by specifying error = TRUE in the chunk options.

    In most cases, it should not be necessary to have the console open while using the notebook, as you can see all of the console output in the notebook itself. To preserve vertical space, the console will be automatically collapsed when you open a notebook or run a chunk in the notebook.

    If you prefer not to have the console hidden when chunks are executed, uncheck the option from the menu Tools -> Global Options -> R Markdown -> Hide console automatically when executing notebook chunks .

    3.2.2 Saving and sharing

    3.2.2.1 Notebook file

    When a notebook *.Rmd file is saved, a *.nb.html file is created alongside it. This file is a self-contained HTML file which contains both a rendered copy of the notebook with all current chunk outputs (suitable for display on a website) and a copy of the *.Rmd file itself.

    You can view the *.nb.html file in any ordinary web browser. It can also be opened in RStudio; when you open there (e.g., using File -> Open File ), RStudio will do the following:

    1. Extract the bundled *.Rmd file, and place it alongside the *.nb.html file.
    2. Open the *.Rmd file in a new RStudio editor tab.
    3. Extract the chunk outputs from the *.nb.html file, and place them appropriately in the editor.

    Note that the *.nb.html file is only created for R Markdown documents that are notebooks (i.e., at least one of their output formats is html_notebook ). It is possible to have an R Markdown document that includes inline chunk output beneath code chunks, but does not produce an *.nb.html file, when html_notebook is not specified as an output format for the R Markdown document.

    3.2.2.2 Output storage

    The document’s chunk outputs are also stored in an internal RStudio folder beneath the project’s .Rproj.user folder. If you work with a notebook but do not have a project open, the outputs are stored in the RStudio state folder in your home directory (the location of this folder varies between the desktop and the server).

    3.2.2.3 Version control

    One of the major advantages of R Notebooks compared to other notebook systems is that they are plain-text files and therefore work well with version control. We recommend checking in both the *.Rmd and *.nb.html files into version control, so that both your source code and output are available to collaborators. However, you can choose to include only the *.Rmd file (with a .gitignore that excludes *.nb.html ) if you want each collaborator to work with their own private copies of the output.

    3.2.3 Notebook format

    While RStudio provides a set of integrated tools for authoring R Notebooks, the notebook file format itself is decoupled from RStudio. The rmarkdown package provides several functions that can be used to read and write R Notebooks outside of RStudio.

    In this section, we describe the internals of the notebook format. It is primarily intended for front-end applications using or embedding R, or other users who are interested in reading and writing documents using the R Notebook format. We recommend that beginners skip this section when reading this book or using notebooks for the first time.

    R Notebooks are HTML documents with data written and encoded in such a way that:

    1. The source Rmd document can be recovered, and
    2. Chunk outputs can be recovered.

    To generate an R Notebook, you can use rmarkdown::render() and specify the html_notebook output format in your document’s YAML metadata. Documents rendered in this form will be generated with the .nb.html file extension, to indicate that they are HTML notebooks.

    To ensure chunk outputs can be recovered, the elements of the R Markdown document are enclosed with HTML comments, providing more information on the output. For example, chunk output might be serialized in the form:

        
     <!-- rnb-output-begin -->
     <pre><code>Hello, World!</code></pre>
     <!-- rnb-output-end -->

    Because R Notebooks are just HTML documents, they can be opened and viewed in any web browser; in addition, hosting environments can be configured to recover and open the source Rmd document, and also recover and display chunk outputs as appropriate.

    3.2.3.1 Generating R Notebooks with custom output

    It is possible to render an HTML notebook with custom chunk outputs inserted in lieu of the result that would be generated by evaluating the associated R code. This can be useful for front-end editors that show the output of chunk execution inline, or for conversion programs from other notebook formats where output is already available from the source format. To facilitate this, one can provide a custom “output source” to rmarkdown::render() . Let’s investigate with a simple example:

     rmd_stub = "examples/r-notebook-stub.Rmd"
     cat(readLines(rmd_stub), sep = "\n")

     ---
     title: "R Notebook Stub"
     output: html_notebook
     ---

     ```{r chunk-one}
     print("Hello, World!")
     ```

    Let’s try to render this document with a custom output source, so that we can inject custom output for the single chunk within the document. The output source function will accept:

    • code : The code within the current chunk.
    • context : An environment containing active chunk options and other chunk information.
    • ... : Optional arguments reserved for future expansion.

    In particular, the context elements label and chunk.index can be used to help identify which chunk is currently being rendered.

     output_source = function(code, context, ...) {
       # Path to the R logo shipped with the R documentation
       logo = file.path(R.home("doc"), "html", "logo.jpg")
       # Inject custom output only for the chunk labelled "chunk-one"
       if (context$label == "chunk-one") list(
         rmarkdown::html_notebook_output_code("# R Code"),
         paste("Custom output for chunk:", context$chunk.index),
         rmarkdown::html_notebook_output_code("# R Logo"),
         rmarkdown::html_notebook_output_img(logo)
       )
     }

    We can pass our output_source along as part of the output_options list to rmarkdown::render() .

     output_file = rmarkdown::render(
       rmd_stub,
       output_options = list(output_source = output_source),
       quiet = TRUE
     )

     ## Warning in eng_r(options): Failed to tidy R code in chunk 'chunk-one'. Reason:
     ## Error : The formatR package is required by the chunk option tidy = TRUE but not installed; tidy = TRUE will be ignored.

    We have now generated an R Notebook. Open this document in a web browser, and it will show that the output_source function has effectively side-stepped evaluation of code within that chunk, and instead returned the injected result.

    3.2.3.2 Implementing output sources

    In general, you can provide regular R output in your output source function, but rmarkdown also provides a number of endpoints for insertion of custom HTML content. These are documented within ?html_notebook_output .

    Using these functions ensures that you produce an R Notebook that can be opened in R frontends (e.g., RStudio).

    3.2.3.3 Parsing R Notebooks

    The rmarkdown::parse_html_notebook() function provides an interface for recovering and parsing an HTML notebook.

     parsed = rmarkdown::parse_html_notebook(output_file)
     str(parsed, width = 60, strict.width = 'wrap')

     List of 4
      $ source     : chr [1:294] "" "" "" "" ...
      $ rmd        : chr [1:8] "---" "title: \"R Notebook Stub\"" "output: html_notebook" "---" ...
      $ header     : chr [1:180] "" "" "" "" ...
      $ annotations:List of 12
       ..$ :List of 4
       .. ..$ row  : int 213
       .. ..$ label: chr "text"
       .. ..$ state: chr "begin"
       .. ..$ meta : NULL
       ..$ :List of 4
       .. ..$ row  : int 214
       .. ..$ label: chr "text"
       .. ..$ state: chr "end"
       .. ..$ meta : NULL
       ..$ :List of 4
       .. ..$ row  : int 215
       .. ..$ label: chr "chunk"
       .. ..$ state: chr "begin"
       .. ..$ meta : NULL
       ..$ :List of 4
       .. ..$ row  : int 216
       .. ..$ label: chr "source"
       .. ..$ state: chr "begin"
       .. ..$ meta :List of 1
       .. .. ..$ data: chr "```r\n# R Code\n```"
       ..$ :List of 4
       .. ..$ row  : int 218
       .. ..$ label: chr "source"
       .. ..$ state: chr "end"
       .. ..$ meta : NULL
       ..$ :List of 4
       .. ..$ row  : int 219
       .. ..$ label: chr "output"
       .. ..$ state: chr "begin"
       .. ..$ meta :List of 1
       .. .. ..$ data: chr "Custom output for chunk: 1\n"
       ..$ :List of 4
       .. ..$ row  : int 221
       .. ..$ label: chr "output"
       .. ..$ state: chr "end"
       .. ..$ meta : NULL
       ..$ :List of 4
       .. ..$ row  : int 222
       .. ..$ label: chr "source"
       .. ..$ state: chr "begin"
       .. ..$ meta :List of 1
       .. .. ..$ data: chr "```r\n# R Logo\n```"
       ..$ :List of 4
       .. ..$ row  : int 224
       .. ..$ label: chr "source"
       .. ..$ state: chr "end"
       .. ..$ meta : NULL
       ..$ :List of 4
       .. ..$ row  : int 225
       .. ..$ label: chr "plot"
       .. ..$ state: chr "begin"
       .. ..$ meta : NULL
       ..$ :List of 4
       .. ..$ row  : int 227
       .. ..$ label: chr "plot"
       .. ..$ state: chr "end"
       .. ..$ meta : NULL
       ..$ :List of 4
       .. ..$ row  : int 228
       .. ..$ label: chr "chunk"
       .. ..$ state: chr "end"
       .. ..$ meta : NULL

    This interface can be used to recover the original Rmd source, and also (with some more effort from the front-end) the ability to recover chunk outputs from the document itself.
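    As a rough illustration of the "more effort" a front-end might spend, the sketch below flattens an annotations list into a data frame and picks out the output annotations. The helper annotation_table() is hypothetical (it is not part of rmarkdown), and demo is a hand-built stand-in that copies four entries from the str() output above, so the sketch runs in base R without rendering anything.

```r
# Hypothetical helper: flatten an `annotations` list, shaped like the one
# returned by rmarkdown::parse_html_notebook(), into a data frame.
annotation_table <- function(annotations) {
  data.frame(
    row   = vapply(annotations, function(a) as.integer(a$row), integer(1)),
    label = vapply(annotations, function(a) a$label, character(1)),
    state = vapply(annotations, function(a) a$state, character(1)),
    stringsAsFactors = FALSE
  )
}

# Hand-built stand-in for parsed$annotations (four entries copied from above).
demo <- list(
  list(row = 215L, label = "chunk",  state = "begin", meta = NULL),
  list(row = 219L, label = "output", state = "begin",
       meta = list(data = "Custom output for chunk: 1\n")),
  list(row = 221L, label = "output", state = "end",   meta = NULL),
  list(row = 228L, label = "chunk",  state = "end",   meta = NULL)
)

tab <- annotation_table(demo)

# Rows of the HTML source that delimit chunk output.
output_rows <- tab$row[tab$label == "output"]

# The text carried by the annotation that opens the output.
is_output_begin <- tab$label == "output" & tab$state == "begin"
output_text <- demo[[which(is_output_begin)]]$meta$data
```

    With a real notebook, parsed$annotations would take the place of demo, and the original source could be written back out with writeLines(parsed$rmd, ...).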

    Settings Sync

    Settings Sync lets you share your Visual Studio Code configurations such as settings, keybindings, and installed extensions across your machines so you are always working with your favorite setup.

    Turning on Settings Sync

    You can turn on Settings Sync using the Turn On Settings Sync... entry in the Manage gear menu at the bottom of the Activity Bar.

    You will be asked to sign in and to choose which preferences to sync; currently Settings, Keyboard Shortcuts, Extensions, User Snippets, and UI State are supported.

    Selecting the Sign in & Turn on button will ask you to choose between signing in with your Microsoft or GitHub account.

    After making this selection, the browser will open so that you can sign in to your Microsoft or GitHub account. When a Microsoft account is chosen, you can use either personal accounts, such as Outlook accounts, or Azure accounts, and you can also link a GitHub account to a new or existing Microsoft account.

    After signing in, Settings Sync will be turned on and continue to synchronize your preferences automatically in the background.

    Merge or Replace

    If you have already synced from one machine and are turning on sync from another machine, you will be shown the following Merge or Replace dialog.

    • Merge: Selecting this option will merge local settings with remote settings from the cloud.
    • Replace Local: Selecting this option will overwrite local settings with remote settings from the cloud.
    • Merge Manually...: Selecting this option will open the Merges view, where you can merge preferences one by one.

    Configuring synced data

    Machine settings (with machine or machine-overridable scopes) are not synchronized by default, since their values are specific to a given machine. You can also add settings to this list of ignored settings, or remove them from it, either from the Settings editor or with the setting settingsSync.ignoredSettings .

    Keyboard Shortcuts are synchronized per platform by default. If your keyboard shortcuts are platform-agnostic, you can synchronize them across platforms by disabling the setting settingsSync.keybindingsPerPlatform .

    All built-in and installed extensions are synchronized along with their global enablement state. You can skip synchronizing an extension, either from the Extensions view (⇧⌘X on macOS, Ctrl+Shift+X on Windows and Linux) or using the setting settingsSync.ignoredExtensions .

    The following UI State is currently synchronized:

    • Display Language
    • Activity Bar entries
    • Panel entries
    • Views layout and visibility
    • Recently used commands
    • Do not show again notifications

    You can always change what is synced via the Settings Sync: Configure command or by opening the Manage gear menu, selecting Settings Sync is On, and then Settings Sync: Configure.

    Conflicts

    When synchronizing settings between multiple machines, there may occasionally be conflicts. Conflicts can happen when first setting up sync between machines or when settings change while a machine is offline. When conflicts occur, you will be presented with the following options:

    • Accept Local: Selecting this option will overwrite remote settings in the cloud with your local settings.
    • Accept Remote: Selecting this option will overwrite local settings with remote settings from the cloud.
    • Show Conflicts: Selecting this will display a diff editor similar to the Source Control diff editor, where you can preview the local and remote settings and choose to either accept local or remote or manually resolve the changes in your local settings file and then accept the local file.

    Switching Accounts

    If at any time you want to sync your data to a different account, you can turn Settings Sync off and then turn it on again with a different account.

    Syncing Stable versus Insiders

    By default, the VS Code Stable and Insiders builds use different Settings Sync services, and therefore do not share settings. You can sync your Insiders with Stable by selecting the Stable sync service while turning on Settings Sync. This option is only available in VS Code Insiders.

    Note: Since Insiders builds are newer than Stable builds, syncing them can sometimes lead to data incompatibility. In such cases, Settings Sync will be disabled automatically on Stable to prevent data inconsistencies. Once a newer version of the Stable build is released, you can upgrade your Stable client and turn on sync to continue syncing.

    Restoring data

    VS Code always stores local and remote backups of your preferences while syncing and provides views for accessing these. In case something goes wrong, you can restore your data from these views.

    You can open these views using the Settings Sync: Show Synced Data command from the Command Palette. The Local Sync activity view is hidden by default; you can enable it using the Views submenu under the Settings Sync view overflow actions.

    The local backups folder on disk can be accessed via the Settings Sync: Open Local Backups Folder command. The folder is organized by the type of preference and contains versions of your JSON files, named with a timestamp of when the backup occurred.

    Note: Local backups are automatically deleted after 30 days. For remote backups, the latest 20 versions of each individual resource (settings, extensions, etc.) are retained.

    Synced Machines

    VS Code keeps track of the machines synchronizing your preferences and provides a view to access them. Every machine is given a default name based on the type of VS Code (Insiders or Stable) and the platform it is on. You can always update the machine name using the edit action available on the machine entry in the view. You can also disable sync on another machine using the Turn off Settings Sync context menu action on the machine entry in the view.

    You can open this view using the Settings Sync: Show Synced Data command from the Command Palette.

    Extension authors

    If you are an extension author, you should make sure your extension behaves appropriately when users enable Settings Sync. For example, you probably don't want your extension to display the same dismissed notifications or welcome pages on multiple machines.

    Sync user global state between machines

    If your extension needs to preserve some user state across different machines, provide that state to Settings Sync using vscode.ExtensionContext.globalState.setKeysForSync . Sharing state such as UI dismissed or viewed flags across machines can provide a better user experience.

    There is an example of using setKeysForSync in the Extension Capabilities topic.

    Reporting issues

    Settings Sync activity can be monitored in the Log (Settings Sync) output view. If you experience a problem with Settings Sync, include this log when creating the issue. If your problem is related to authentication, also include the log from the Account output view.

    How do I delete my data?

    If you want to remove all your data from our servers, turn off sync via the Settings Sync is On menu, available under the Manage gear menu, and select the checkbox to clear all cloud data. If you choose to re-enable sync, it will be as if you're signing in for the first time.

    Next steps

    • User and Workspace settings - Learn how to configure VS Code to your preferences through user and workspace settings.

    Common questions

    Is VS Code Settings Sync the same as the Settings Sync extension?

    No, the Settings Sync extension by Shan Khan uses a private Gist on GitHub to share your VS Code settings across different machines and is unrelated to the VS Code Settings Sync.

    What types of accounts can I use for Settings Sync sign in?

    VS Code Settings Sync supports signing in with either a Microsoft account (for example Outlook or Azure accounts) or a GitHub account. Sign in with GitHub Enterprise accounts is not supported. Other authentication providers may be supported in the future and you can review the proposed Authentication Provider API in issue #88309.

    Can I use a different backend or service for Settings Sync?

    Settings Sync uses a dedicated service to store settings and coordinate updates. A service provider API may be exposed in the future to allow for custom Settings Sync backends.

    Troubleshooting keychain issues

    NOTE: This section applies to VS Code version 1.80 and higher. In 1.80, we moved away from keytar, due to its archival, in favor of Electron's safeStorage API.

    NOTE: keychain, keyring, wallet, credential store are synonymous in this document.

    Settings Sync persists authentication information on desktop using the OS keychain for encryption. Using the keychain can fail in some cases if the keychain is misconfigured or the environment isn't recognized.

    To help diagnose the problem, you can restart VS Code with the following flags to generate a verbose log:

    code --verbose --vmodule="*/components/os_crypt/*=1" 

    Windows & macOS

    At this time, there are no known configuration issues on Windows or macOS but, if you suspect something is wrong, you can open an issue on VS Code with the verbose logs from above. This is important for us to support additional desktop configurations.

    Linux

    Towards the top of the logs from the previous command, you will see something to the effect of:

    [9699:0626/093542.027629:VERBOSE1:key_storage_util_linux.cc(54)] Password storage detected desktop environment: GNOME
    [9699:0626/093542.027660:VERBOSE1:key_storage_linux.cc(122)] Selected backend for OSCrypt: GNOME_ANY

    We rely on Chromium's oscrypt module to discover and store encryption key information in the keyring. Chromium supports a number of different desktop environments. Outlined below are some popular desktop environments and troubleshooting steps that may help if the keyring is misconfigured.

    GNOME or UNITY (or similar)

    If the error you're seeing is "Cannot create an item in a locked collection", chances are your keyring's Login keyring is locked. You should launch your OS's keyring (Seahorse is the commonly used GUI for seeing keyrings) and ensure the default keyring (usually referred to as Login keyring) is unlocked. This keyring needs to be unlocked when you log into your system.

    KDE

    KDE 6 is not yet fully supported by Visual Studio Code. As a workaround, the latest kwallet6 is also accessible as kwallet5, so you can force VS Code to use it by setting the password store to kwallet5 as explained below in Configure the keyring to use with VS Code.

    It's possible that your wallet (aka keyring) is closed. If you open KWalletManager, you can see if the default kdewallet is closed and if it is, make sure you open it.

    If you are using KDE5 or higher and are having trouble connecting to kwallet5 (like users of the unofficial VS Code Flatpak in issue #189672), you can try configuring the keyring to gnome-libsecret as this will use the Secret Service API to communicate with any valid keyring. kwallet5 implements the Secret Service API and can be accessed using this method.

    Other Linux desktop environments

    First off, if your desktop environment wasn't detected, you can open an issue on VS Code with the verbose logs from above. This is important for us to support additional desktop configurations.

    (recommended) Configure the keyring to use with VS Code

    You can manually tell VS Code which keyring to use by passing the password-store flag. Our recommended configuration is to first install gnome-keyring if you don't have it already and then launch VS Code with code --password-store="gnome" .

    If this solution works for you, you can persist the value of password-store by opening the Command Palette (⇧⌘P on macOS, Ctrl+Shift+P on Windows and Linux) and running the Preferences: Configure Runtime Arguments command. This will open the argv.json file where you can add the setting "password-store":"gnome" .

    Here are all the possible values of password-store if you would like to try using a different keyring than gnome-keyring :

    • kwallet5 : For use with kwalletmanager5.
    • gnome : This option will first try the gnome-libsecret implementation and, if that fails, fall back to the gnome-keyring implementation.
    • gnome-libsecret : For use with any package that implements the Secret Service API (for example gnome-keyring , kwallet5 , KeepassXC ).
    • (not recommended) kwallet : For use with older versions of kwallet .
    • (not recommended) gnome-keyring : A different implementation for accessing gnome-keyring ; it should only be used if gnome-libsecret has a problem.
    • (not recommended) basic : See the section below on basic text for more details.

    Don't hesitate to open an issue on VS Code with the verbose logs if you run into any issues.

    (not recommended) Configure basic text encryption

    We rely on Chromium's oscrypt module to discover and store encryption key information in the keyring. Chromium offers an opt-in fallback encryption strategy that uses an in-memory key based on a string that is hardcoded in the Chromium source. Because of this, this fallback strategy is, at best, obfuscation, and should only be used if you are accepting of the risk that any process on the system could, in theory, decrypt your stored secrets.

    If you accept this risk, you can set password-store to basic by opening the Command Palette (⇧⌘P on macOS, Ctrl+Shift+P on Windows and Linux) and running the Preferences: Configure Runtime Arguments command. This will open the argv.json file where you can add the setting "password-store":"basic" .

    Can I share settings between VS Code Stable and Insiders?

    Yes. Please refer to the Syncing Stable versus Insiders section for more information.

    Please note that this can sometimes lead to data incompatibility because Insiders builds are newer than Stable builds. In such cases, Settings Sync will be disabled automatically on Stable to prevent data inconsistencies. Once a newer version of the Stable build is released, you can upgrade your client and turn on Settings Sync to continue syncing.
