When you’re using database from R, managing your connections is really important because you want to avoid leaking connections, leaving them open and occupying resources even when you’re not using them. Connection management is usually straightforward in scripts because you open them, use them, and close them. Connection management gets more complex in shiny apps, because apps might run for a long time (possibly days or weeks between updates) and they can used by multiple people at the same time.
This vignette describes two extremes for managing connections (once per app vs once per query) then shows you why pool provides a happy middle ground that is safer, more robust, and offers better overall performance.
The first extreme is have one connection per app:
library(shiny)
library(DBI)
# In a multi-file app, you could create conn at the top of your
# server.R file or in global.R
conn <- DBI::dbConnect(RSQLite::SQLite(), dbname = pool::demoDb())
onStop(function() {
DBI::dbDisconnect(conn)
})
ui <- fluidPage(
textInput("cyl", "Enter your number of cylinders:", "4"),
tableOutput("tbl"),
numericInput("nrows", "How many cars to show?", 10),
plotOutput("popPlot")
)
server <- function(input, output, session) {
output$tbl <- renderTable({
sql <- "SELECT * FROM mtcars WHERE cyl = ?cyl;"
query <- sqlInterpolate(conn, sql, cyl = input$cyl)
dbGetQuery(conn, query)
})
output$popPlot <- renderPlot({
sql <- "SELECT * FROM mtcars LIMIT ?n;"
query <- sqlInterpolate(conn, sql, n = input$nrows)
df <- dbGetQuery(conn, query)
barplot(setNames(df$mpg, df$model))
})
}
if (interactive())
shinyApp(ui, server)
This approach is fast, because you only ever create one connection, but has some serious drawbacks:
Let’s now turn our attention to the other extreme: opening and closing a connection for each query:
library(shiny)
library(DBI)
connect <- function() {
DBI::dbConnect(RSQLite::SQLite(), dbname = pool::demoDb())
}
ui <- fluidPage(
textInput("cyl", "Enter your number of cylinders:", "4"),
tableOutput("tbl"),
numericInput("nrows", "How many cars to show?", 10),
plotOutput("popPlot")
)
server <- function(input, output, session) {
output$tbl <- renderTable({
conn <- connect()
on.exit(DBI::dbDisconnect(conn))
sql <- "SELECT * FROM mtcars WHERE cyl = ?cyl;"
query <- sqlInterpolate(conn, sql, cyl = input$cyl)
dbGetQuery(conn, query)
})
output$popPlot <- renderPlot({
conn <- connect()
on.exit(DBI::dbDisconnect(conn))
sql <- "SELECT * FROM mtcars LIMIT ?n;"
query <- sqlInterpolate(conn, sql, n = input$nrows)
df <- dbGetQuery(conn, query)
barplot(setNames(df$mpg, df$model))
})
}
if (interactive())
shinyApp(ui, server)
The advantages to this approach are the reverse of the disadvantages of the first approach:
dbDisconnect()
).On the other hand, it does less well on the things that the former approach excelled at:
Wouldn’t it be nice if you could combine the advantages of the two approaches? That’s exactly the goal of pool!
A connection pool abstracts away the logic of connection management, so that, for the vast majority of cases, you never have to deal with connections directly. Since the pool knows when it needs more connections and how to open and close them, it creates them on demand and can share existing connections that have already been created.
The code is just as simple as the connection per app approach: all
you need to do is substitute pool::dbPool()
for
DBI::dbConnect()
and pool::poolClose()
for
DBI::dbDisconnect()
.
library(shiny)
library(DBI)
pool <- pool::dbPool(RSQLite::SQLite(), dbname = pool::demoDb())
onStop(function() {
pool::poolClose(pool)
})
ui <- fluidPage(
textInput("cyl", "Enter your number of cylinders:", "4"),
tableOutput("tbl"),
numericInput("nrows", "How many cars to show?", 10),
plotOutput("popPlot")
)
server <- function(input, output, session) {
cars <- tbl(pool, "mtcars")
output$tbl <- renderTable({
cars %>% filter(cyl == !!input$cyl) %>% collect()
})
output$popPlot <- renderPlot({
df <- cars %>% head(input$nrows) %>% collect()
pop <- df %>% pull("mpg", name = "model")
barplot(pop)
})
}
if (interactive())
shinyApp(ui, server)
By default, the pool will maintain one idle connection. When you make a query to the pool, it will always use that connection, unless it happens to already be busy. In that case, the pool will create another connection, use it, and then return it to the pool. If that second connection isn’t used for more then a minute (by default), the pool will disconnect it.