I’m Quartoed - Converted to using Quarto for my writings and my new blog!!

After many years of using R Markdown and Jupyter notebooks with Python, I have converted to Quarto, a new generation of scientific and technical writing tool.
Quarto
Reproducible report
RMarkdown
Jupyter Notebook
Author

Steven Wang

Published

September 8, 2022


🎯 TL;DR

After many years of using R Markdown and Jupyter notebooks with Python, I have converted to Quarto, a new generation of scientific and technical writing tool. Quarto is an open-source publishing system that works with authoring tools such as RStudio, JupyterLab and Visual Studio Code for text editing and code execution. It currently supports Python, R, Julia and Observable.

A Quarto document can be rendered to many different document formats, such as HTML, PDF and Word.

1 Context

For many years, I have used R Markdown and Jupyter Notebook to write up data science project outcomes and reproducible reports, and Distill and Bookdown to write technical specifications, operational manuals, documentation and a full, accumulated data science knowledge base for my work. I also use R Markdown and Bookdown in conjunction with Zotero to write my personal research reports and articles. This has not only saved me a tremendous amount of time in combining text, charts and computation results, but has also made it easy to generate stylish reports in multiple formats.

Quarto, dubbed the next generation of R Markdown, had been discussed in the data science community for quite a while. But it had been kept hush-hush by RStudio (oops, I should really say Posit) for a long time, until Posit officially announced it on 28 July during rstudio::conf(2022). While I had been testing Quarto and using it sparingly, I was not brave enough at that time to convert all my R Markdown files and Jupyter notebooks to QMD files. However, the more I used Quarto, the more I liked it. Many small formatting issues and style-tweaking headaches have gone away, and many new features have been added. I finally decided to write all new reports and documentation in Quarto, and I also took the plunge and converted my blog to Quarto!! This will motivate and encourage me to write more posts in the future.

2 So What is Quarto?

If you have never used Markdown, think about hand-coding an HTML file with fabulous styling and easy navigation. In HTML you have to add lots and lots of markup tags; for most people this is daunting and not really a feasible way to write, so they fall back on Microsoft Word. However, Word has its limitations when it comes to reproducibility. If a chart was produced in another programming language, then every time it changes you need to replot it, export it and update the document. Scientific research makes this worse, as you may have many updates to statistical results, tabulated numbers, equations and the bibliography. R Markdown and Jupyter Notebook are two great solutions for weaving together text, code and computation results, including plots and tables, with very little or no markup for formatting.

Quarto® is an open-source scientific and technical publishing system built on Pandoc. Although Quarto is billed as the next generation of R Markdown, it is not just for R. Unlike R Markdown, Quarto has no dependency on or requirement for R. It was developed to be multilingual, and the current release supports creating dynamic content with Python, R, Julia and Observable. More languages could potentially be incorporated in the future. Quarto is really a brand-new system that builds on a decade of user experience with R Markdown and extends it to other languages. For the R community, Quarto brings together many improvements that previously came from many different packages; for the Jupyter community, it is like manna from heaven. At least for me, I had always hoped that I could do in Jupyter Notebook everything I can do in R Markdown.

3 What Tools Are Used for Quarto Editing?

A Quarto file (.qmd) can be authored in a variety of IDEs and text editors, such as RStudio, VS Code and JupyterLab, or in any other text editor. My experience can be summarized as follows:

  1. For veteran RStudio and R Markdown users: the recommended tool is still RStudio, with Visual Studio Code worth a look as a complement for its great extensibility.

  2. For veteran Python Jupyter users: keep using Jupyter, and look into Visual Studio Code for its great extensibility and flexibility.

  3. For new users: regardless of whether you are learning R, Python or Julia, Visual Studio Code is probably a good place to start; then look into RStudio for documentation and book writing.
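Whichever editor you pick, a QMD file is just plain text: a YAML header, Markdown prose and executable code chunks. The sketch below builds the text of a minimal document in Python to make that structure visible. The function name `minimal_qmd` and the document content are hypothetical and only serve as an illustration.

```python
FENCE = "`" * 3  # assembled at runtime so the chunk delimiter stays readable here


def minimal_qmd(title: str, engine: str = "python") -> str:
    """Return the text of a small Quarto document with one executable chunk."""
    lines = [
        "---",                      # YAML header: document metadata and output format
        f'title: "{title}"',
        "format: html",
        "---",
        "",
        "## Summary",               # ordinary Markdown prose
        "",
        "Plain Markdown text goes here.",
        "",
        f"{FENCE}{{{engine}}}",     # executable chunk, e.g. {python}, {r} or {julia}
        "#| echo: true",            # chunk option: show the code in the output
        "1 + 1",
        FENCE,
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(minimal_qmd("Hello Quarto"))
```

Saving the printed text under a `.qmd` extension gives a file that any of the editors above can open.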

4 How Does Quarto Render a Document?

For R users, Quarto uses the same back-end engine as R Markdown, namely knitr; the process is illustrated in Figure 1.

graph LR
  subgraph Authoring
    R([RStudio]) & V([VS Code])--> Q{QMD}
  end
  
  subgraph Rendering
    Q-->K(Knitr)
    K-->M{MD}
    M-->P(Pandoc)
  end
 
  subgraph Output
    P-->W[Word]
    P-->F[PDF]
    P-->H[HTML]
    P-->O[Others]
  end
 

  style R fill:#6DCFF6,stroke:#165caa,stroke-width:2px;
  style V fill:#6DCFF6,stroke:#4B8BBE,stroke-width:2px;
  style Q fill:#00FFFF,stroke:#4B8BBE,stroke-width:2px;
  style K color:#FFFFFF,fill:#CB3C33,stroke:#389826,stroke-width:2px;
  style M fill:#00FFFF,stroke:#4B8BBE,stroke-width:2px;
  style P fill:#FFD43B,stroke:#4B8BBE,stroke-width:2px;

Figure 1: Quarto rendering process

For Jupyter notebooks, Quarto uses the Jupyter engine. Quarto also supports Observable as a processing engine.

All the document formats Quarto can render are listed on this Quarto webpage.
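Rendering can also be scripted from outside the editor through the `quarto render` command. The following is a minimal sketch, assuming the Quarto CLI is installed and on the PATH; the helper names `build_render_command` and `render` and the file name `report.qmd` are hypothetical.

```python
import shutil
import subprocess


def build_render_command(qmd_path: str, output_format: str = "html") -> list[str]:
    """Build the argument list for `quarto render <file> --to <format>`."""
    return ["quarto", "render", qmd_path, "--to", output_format]


def render(qmd_path: str, output_format: str = "html") -> bool:
    """Render the document; return False if the quarto executable is not found."""
    if shutil.which("quarto") is None:
        return False
    # check=True raises CalledProcessError if rendering fails
    subprocess.run(build_render_command(qmd_path, output_format), check=True)
    return True


if __name__ == "__main__":
    # Print the commands for the three formats shown in Figure 1.
    for fmt in ("html", "pdf", "docx"):
        print(" ".join(build_render_command("report.qmd", fmt)))
```

Keeping the command construction separate from its execution makes the sketch easy to inspect without having Quarto installed.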

5 Language Interoperability in RStudio

The interoperability of Python and R in RStudio is well developed and widely used through the R reticulate package. In R Markdown, within an R code chunk, you can access a Python object by writing py$ followed by the Python object name. In a Python code chunk, you can access an R object by writing r. followed by the R object name. The same works in a Quarto document: you can run native R and Python code chunks and exchange results between them as needed.

1. Python Chunk: Read csv by using Python Pandas

You can click here to get the penguins csv data file

import pandas as pd

df_py = pd.read_csv("palmer-penguins.csv")

# simply read the CSV data; the result is a pandas DataFrame

2. R Chunk: Call the Pandas dataframe in R and make some change

In an R code chunk, use py$Python_Object to read any Python object produced by a Python computation.

library(readr)
library(dplyr)

# a deliberately silly change (add the row number to each value)
# so that we can see the pandas data frame was modified in R
df_r <- py$df_py %>% mutate(bill_depth_mm = bill_depth_mm + row_number())

# aggregate the original, unmodified pandas data frame by year
r_aggregate <- py$df_py %>%
   group_by(year) %>%
   summarize(average = mean(bill_depth_mm, na.rm = TRUE))

print(r_aggregate)
# A tibble: 3 × 2
   year average
  <dbl>   <dbl>
1  2007    17.4
2  2008    16.9
3  2009    17.1

3. Python Chunk: Read the R changed dataframe and do the aggregation

In a Python code chunk, use r.R_Object to read any R object produced by an R computation.

py_aggregate = r.df_r.groupby("year")["bill_depth_mm"].mean().\
                  reset_index().rename(columns = {"bill_depth_mm": "mean"})

print(py_aggregate)
# the means differ from the R output because the data was modified in the R chunk
     year        mean
0  2007.0  151.014679
1  2008.0  183.747368
2  2009.0  231.276471

The simple operations above demonstrate free data interchange from Python -> R and then from R -> Python. Both the R and the Python code used native syntax rather than reticulate wrapper functions.
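To see why the Python means come out so much larger than the R ones, the two steps can be replayed on a tiny made-up table using only the Python standard library. The rows below are hypothetical toy values, not the penguins data, and the helpers `add_row_number` and `mean_by` are stand-ins for the dplyr and pandas calls used above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy rows standing in for the penguins data frame.
rows = [
    {"year": 2007, "bill_depth_mm": 18.0},
    {"year": 2007, "bill_depth_mm": 17.0},
    {"year": 2008, "bill_depth_mm": 16.0},
    {"year": 2008, "bill_depth_mm": 19.0},
]


def add_row_number(rows, column):
    """Mimic mutate(column = column + row_number()): add the 1-based row index."""
    return [{**row, column: row[column] + i} for i, row in enumerate(rows, start=1)]


def mean_by(rows, key, column):
    """Mimic a group-by followed by a mean of one column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    return {group: mean(values) for group, values in groups.items()}


original_means = mean_by(rows, "year", "bill_depth_mm")
modified_means = mean_by(add_row_number(rows, "bill_depth_mm"), "year", "bill_depth_mm")

if __name__ == "__main__":
    print(original_means)
    print(modified_means)
```

Because later rows receive larger row numbers, the later years are inflated the most, which matches the pattern in the Python output above.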

Julia objects can also be consumed in an R code chunk. With the JuliaCall package, a Quarto file can contain Julia code chunks written in native Julia syntax. Any available Julia object can then be retrieved in an R code chunk by calling julia_eval("Julia_Object_Name").

Observable JS is an integral part of the Quarto installation, so Quarto natively supports Observable code chunks, known as ojs chunks. While Observable can import data sources in the Observable JS way, in RStudio any R object can also be converted to an Observable object for use in an ojs code chunk.

ojs_define(ojs_data = df_r)

The ojs_data object defined in the R code chunk can be used as the data source for the Observable Plot chart in Figure 2. Observable Plot is a great option for beautiful, interactive charts. However, before the data frame passed through ojs_define can be plotted, one additional step is needed: applying the transpose function to ojs_data in the ojs code chunk. The transpose function converts a column-oriented dataset (an R or Python data frame) into a row-oriented dataset that JavaScript plotting libraries can use.

// first transpose the data from R into row-oriented form for plotting
data = transpose(ojs_data)
Plot.plot({
  x: {label: "Bill Depth (mm)"},
  facet: {
    data: data,
    x: "sex",
    y: "species",
    marginLeft: 20,
    marginRight: 80
  },
  marks: [
    Plot.frame(),
    // histogram of bill depth, binned on x and counted on y
    Plot.rectY(data, Plot.binX({y: "count"},
      {x: "bill_depth_mm", fill: "species", thresholds: 20}))
  ]
})

Figure 2: Observable Plot using R chunk data
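The column-to-row conversion performed by transpose can be sketched in plain Python. This is only an illustration of the idea, not Quarto's own implementation; the function below and its sample values are hypothetical.

```python
def transpose(columns: dict) -> list:
    """Turn {"col": [v1, v2, ...], ...} into [{"col": v1, ...}, {"col": v2, ...}, ...]."""
    names = list(columns)
    lengths = {len(columns[name]) for name in names}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    # zip(*columns) walks the columns in parallel, yielding one tuple per row
    return [dict(zip(names, values)) for values in zip(*(columns[n] for n in names))]


# Column-oriented: one list per variable, the way a data frame is laid out.
column_oriented = {
    "species": ["Adelie", "Gentoo"],
    "bill_depth_mm": [18.7, 14.3],
}

# Row-oriented: one record per observation, the layout plotting code iterates over.
row_oriented = transpose(column_oriented)

if __name__ == "__main__":
    print(row_oriented)
```

Each record now carries its own field names, so a plotting library can read `bill_depth_mm` or `species` directly from every row.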

The data exchange flow for language interoperability is depicted in Figure 3.

Figure 3: Language interoperability data exchange flow


6 Some Great Posts for Quarto

  1. RStudio: Announcing Quarto, a new scientific and technical publishing system

  2. Alison Hill’s article: We don’t talk about Quarto

  3. Yihui Xie’s article: With Quarto Coming, is R Markdown Going Away? No.

  4. Albert Rapp wrote a great article on how to blog with Quarto: The ultimate guide to starting a Quarto blog