A beginner’s guide to using Observable JavaScript, R, and Python with Quarto


There’s an intriguing new option for people who want to do data-wrangling and analysis in R or Python but visualization in JavaScript: Quarto.

This article shows you how to set up a Quarto document to use Observable JavaScript, including how to pass data from R or Python to an Observable code chunk. In Part 2, you’ll find out how to learn Observable JavaScript with Observable notebooks—and why that’s worth doing even if you only plan to use JavaScript in Quarto. Part 3 gives you the basics of data visualization with Observable JavaScript, including how to make your plots interactive.

Let’s get started!

Why use Quarto with Observable JavaScript?

Quarto is an open-source technical publishing platform from RStudio that natively supports Python, R, Julia, and Observable JavaScript—a JavaScript flavor designed for data analysis. Using Quarto with Observable could be a compelling option for R and Python users who also want to use JavaScript. Here are a few reasons why:

  • Quarto includes two easy-to-use, one-line functions to hand off data from Python or R for use in JavaScript.
  • It’s fairly simple to combine the results from code written in multiple languages into one document and to add explanatory text.
  • One Quarto document can generate multiple HTML formats: a single page, reveal.js slides, and Quarto websites, for example. (Quarto can create dozens of static formats, too, like PDF or Word, but the Observable integration isn’t designed for those. Currently, there’s no built-in way to export JavaScript output as static images.)

Why visualize in Observable if you already generate plots with R or Python? One reason is that Observable has interactivity built in. Using Observable, there are simple ways to add interactive filters on your data that control what appears in tables and graphs.

Additionally, Quarto’s rendered HTML files can be hosted on any web server or opened locally with a simple browser, with no separate language or framework installations required. That’s not the case for options like Shiny for R or Dash for Python (alpha Shiny for Python can run without a Shiny server, but it’s not yet production-ready). Using Quarto with Observable offers an elegant workflow if you want to combine data analysis in Python and R with reactivity.

Finally, Observable was set up with collaboration in mind, so it’s fairly easy to find and use someone else’s open source code.

Bottom line: If you want to quickly code an interactive report or analysis and email it to colleagues or host it on an intranet that handles HTML files, integrating R or Python and Observable JavaScript in Quarto could be a great choice.

Install and set up Quarto in your IDE

Quarto is a separate software application and not an R or Python package. If you use R and have an up-to-date version of RStudio, Quarto software is bundled in. If you use Python or R in Visual Studio Code (VS Code), download the Quarto software application from the Quarto Get Started page and the separate VS Code Quarto extension.

Using Observable JavaScript in a Quarto document

To incorporate Observable JavaScript in a Quarto document, use {ojs} for an Observable code chunk, as opposed to {r} for R code and {python} for Python. Chunk options are preceded with a #| in R and Python but //| for Observable. Here’s an example:

//| echo: false
//| eval: true 

Placed at the top of an {ojs} code chunk, these options tell Quarto to execute the chunk’s code but not display that code in the final document.

Both RStudio and VS Code offer code completion help when writing the YAML header at the top of a Quarto document. This header defines the file metadata and various document-wide options. However, you won’t find the same kind of robust IDE support for things like executing single cells or code completion for Observable code chunks as you would with R and Python.

Importing data to Observable from R or Python

You can import, wrangle, and analyze data in R or Python and then send those results to Observable with the ojs_define() function. “That’s the magic,” said Maya Ganz, a developer at Atorus who was an intern at RStudio several years ago. It “dramatically reduces the barrier to entry” for R and Python users to add JavaScript to their workflow.

To send wrangled data from R or Python to Observable JavaScript, the syntax is:

ojs_define(my_ojs_data = my_wrangled_r_object)

Note that ojs_define() also works in Python code chunks. As of this writing, however, you can’t run an R or Python chunk with ojs_define() interactively in RStudio. If you try clicking the chunk’s run icon, it will fail. For now, ojs_define() only works when Quarto renders the entire document at once.

An RStudio spokesperson said that they hope to have an improved experience in the future, but because it’s easy to get that user experience (UX) wrong, for now they’ve focused on entire-document execution.

The transpose() function

There is one more step to use data from R or Python in Observable: transpose().

JavaScript visualizations usually use a different data format than the rectangular data frames typically needed in R or Python. For instance, below is a partial example of data for a multiline graph from the Apache ECharts website:

series: [
    {
      name: 'Email',
      type: 'line',
      stack: 'Total',
      data: [120, 132, 101, 134, 90, 230, 210]
    },
    {
      name: 'Search Engine',
      type: 'line',
      stack: 'Total',
      data: [820, 932, 901, 934, 1290, 1330, 1320]
    }
]

That is an array of JavaScript objects, not the rows-and-columns format used by libraries like ggplot2. Fortunately, there’s a single function, transpose(), that will turn R or Python data frames sent to Observable into the format Observable needs. If you forget to use transpose() on data imported with ojs_define(), and then try to use that data in Observable, you may get an error like this one:

TypeError: e is not iterable

If you see that error, check whether you’ve used transpose(my_ojs_data) on data imported with ojs_define().

You can create a new transposed object within an Observable code chunk with something like this:

mydata = transpose(my_ojs_data)

Then, use mydata in all your subsequent functions. Or, you can refer to transpose(my_ojs_data) each time you access your data set in Observable.
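To make the reshaping concrete, here is a minimal plain-JavaScript sketch (runnable in Node.js, not an Observable cell) of the idea behind transpose(). It assumes the data frame arrives in column-oriented form, with one array per column. The helper name toRows and the sample values are hypothetical and only for illustration; this is not Quarto’s actual implementation.

```javascript
// Assumption: a data frame handed over from R or Python arrives column-oriented,
// as one array per column. The values below are made-up sample data.
const columnOriented = {
  month: ["Jan", "Feb", "Mar"],
  sales: [120, 132, 101],
};

// Hypothetical helper that mimics the idea behind transpose():
// build one object per row from the parallel column arrays.
function toRows(columns) {
  const names = Object.keys(columns);
  const rowCount = names.length === 0 ? 0 : columns[names[0]].length;
  const rows = [];
  for (let i = 0; i < rowCount; i++) {
    const row = {};
    for (const name of names) {
      row[name] = columns[name][i];
    }
    rows.push(row);
  }
  return rows;
}

const rows = toRows(columnOriented);
console.log(rows);

// Looping over the untransposed object fails because a plain object is not
// iterable, which is the kind of TypeError described above.
let iterationError = null;
try {
  for (const row of columnOriented) {
    console.log(row);
  }
} catch (err) {
  iterationError = err;
}
console.log(iterationError instanceof TypeError);
```

The row-oriented result is an array, so it can be looped over, filtered, and mapped, which is what most JavaScript plotting and table code expects.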

Finally, note that ojs_define() and transpose() are specific to Quarto documents, not Observable notebooks hosted on the Observable web platform at ObservableHQ.com. The hosted notebooks don’t run R or Python code.

Importing external data directly into Observable

You can also import external data directly into Observable. Here’s the syntax for importing a local CSV file with Observable’s FileAttachment() function inside an ojs code chunk:

mydata = FileAttachment("my_csv-file.csv").csv({ typed: true })

The { typed: true } option asks FileAttachment() to infer the column types. To properly parse the data for ojs use, be sure to include the .csv() after FileAttachment() for a CSV file.
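The effect of type guessing can be illustrated with a simplified plain-JavaScript stand-in (runnable in Node.js). This is a sketch only: parseCsv and guessType are hypothetical helpers, the sample text is made up, and the parser assumes no quoted fields or embedded commas. Real CSV parsers handle many more cases.

```javascript
// Made-up sample data for illustration.
const csvText = "name,score\nA,10\nB,20";

// Hypothetical helper: convert a field to a number when it looks numeric,
// otherwise leave it as the original string.
function guessType(value) {
  const trimmed = value.trim();
  if (trimmed !== "" && !Number.isNaN(Number(trimmed))) {
    return Number(trimmed);
  }
  return value;
}

// Hypothetical helper: split the text into row objects keyed by header name.
// Assumes no quoted fields and no commas inside values.
function parseCsv(text, { typed = false } = {}) {
  const [headerLine, ...lines] = text.trim().split("\n");
  const headers = headerLine.split(",");
  return lines.map((line) => {
    const fields = line.split(",");
    const row = {};
    headers.forEach((header, i) => {
      row[header] = typed ? guessType(fields[i]) : fields[i];
    });
    return row;
  });
}

const untyped = parseCsv(csvText);
const typedRows = parseCsv(csvText, { typed: true });
console.log(typeof untyped[0].score);   // string
console.log(typeof typedRows[0].score); // number
```

Without type guessing, every value stays a string, so numeric sorting and arithmetic can give surprising results until the values are converted.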

Several other formats work with FileAttachment(), too, such as FileAttachment().json().

For a CSV file hosted online, you can use the D3 JavaScript library’s csv() function:

mydata = d3.csv("https://www.domain.com/myfile.csv")

D3.js is included by default in Quarto’s Observable setup.

Cell order in Observable vs. Jupyter or R Markdown

In a Jupyter notebook or R Markdown file, code cells run from the top down when you render a document. The only exception is if you manually trigger cells individually in a different order instead of executing the whole file at once. This means you can’t have a cell that uses a variable, x, before a cell that defines x. So, the following will throw an error:

y = x * 10
x = 6 + 12

In addition, you can’t call a function in a cell that sits above the cell where the function is defined.

In Observable, though, cells run in the order they’re needed, an approach called reactive dataflow. That means you can refer to x with y = x * 10 in a cell above the cell that defines x. And you can use a custom function at the top of your notebook but define it at the very end. That’s part of Observable’s built-in reactivity: if a variable’s value changes in one cell, all cells using the variable get updated, not only the ones below.

But while this is handy for interactivity, it also means you can’t define a variable more than once, unless the definitions sit within a code block in the same cell, such as an if ... else statement surrounded by braces. Otherwise, Observable doesn’t know which definition should run first.
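The ordering idea can be sketched as a toy model in plain JavaScript (runnable in Node.js). This is not Observable’s actual runtime: the cells array, the runCells function, and the error messages are all hypothetical. Each cell declares the name it defines and the names it depends on, and values are computed in dependency order rather than top-down.

```javascript
// Toy model of dependency-ordered evaluation; not Observable's actual runtime.
// Each hypothetical "cell" names the variable it defines, its inputs,
// and a function that computes its value from those inputs.
const cells = [
  { name: "y", inputs: ["x"], compute: (x) => x * 10 }, // listed first, but needs x
  { name: "x", inputs: [], compute: () => 6 + 12 },
];

function runCells(cellList) {
  // Reject duplicate definitions: with two cells defining the same name,
  // there is no way to know which one should win.
  const byName = new Map();
  for (const cell of cellList) {
    if (byName.has(cell.name)) {
      throw new Error(`${cell.name} is defined more than once`);
    }
    byName.set(cell.name, cell);
  }

  const values = new Map();
  const inProgress = new Set();

  // Compute a cell's inputs first, then the cell itself, caching the result.
  function evaluate(name) {
    if (values.has(name)) return values.get(name);
    const cell = byName.get(name);
    if (!cell) throw new Error(`${name} is not defined`);
    if (inProgress.has(name)) {
      throw new Error(`circular definition involving ${name}`);
    }
    inProgress.add(name);
    const args = cell.inputs.map((input) => evaluate(input));
    inProgress.delete(name);
    const value = cell.compute(...args);
    values.set(name, value);
    return value;
  }

  for (const cell of cellList) evaluate(cell.name);
  return Object.fromEntries(values);
}

const result = runCells(cells);
console.log(result.x, result.y); // 18 180

// Defining x a second time is rejected.
let duplicateError = null;
try {
  runCells([...cells, { name: "x", inputs: [], compute: () => 1 }]);
} catch (err) {
  duplicateError = err;
}
console.log(duplicateError.message); // x is defined more than once
```

In this model, the position of a cell in the list does not matter, only what it depends on. That is also why a second definition of the same name has to be an error.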

There are a few more differences between Observable and “vanilla” JavaScript, which you can read about in the “Observable’s not JavaScript” documentation. Next: Learn Observable JavaScript with Observable notebooks.

Copyright © 2022 IDG Communications, Inc.

