term | definition |
---|---|
API | An API, or application programming interface, is a set of defined rules that enable different applications to communicate with each other. It acts as an intermediary layer that processes data transfers between systems, letting companies open their application data and functionality to external third-party developers, business partners, and internal departments within their companies. (<a href="https://www.ibm.com/topics/api">IBM</a>) |
CI | Continuous integration (CI) is the practice of integrating all your code changes into the main branch of a shared source code repository early and often, automatically testing each change when you commit or merge it, and automatically kicking off a build. |
CSV | Comma-separated values (CSV) files are text files in which the values are separated by commas. They have the file extension .csv. |
Git | Git is a distributed version control system that tracks changes to files, enabling developers to manage and collaborate on software projects efficiently. It allows users to keep a record of all changes made to their code over time, making it easy to revert to previous versions if needed. Git is widely used in the software development industry, with nearly 95% of developers reporting it as their primary version control system as of 2022. |
GitHub | GitHub is a web-based platform that allows developers to create, store, manage, and share their code. It uses Git, a distributed version control system, to track changes in code and facilitate collaboration among developers. GitHub provides a user-friendly interface that makes it easier for individuals and teams to use Git for version control and collaboration. GitHub is commonly used to host open-source software development projects. |
StackOverflow | Stack Overflow is a question-and-answer website for computer programmers. (https://stackoverflow.com/) |
17 General guidelines
17.1 Introduction
This chapter introduces the most important software engineering skills you’ll need when writing Shiny apps:
- code organisation (Section 17.2),
- testing (Section 17.3),
- dependency management (Section 17.4),
- source code control (Section 17.5),
- continuous integration (Section 17.6),
- and code reviews (Section 17.7).
These skills are not specific to Shiny apps, but you’ll need to learn a bit about all of them if you want to write complex apps that get easier to maintain over time, not harder.
Hadley recommends setting aside some time each week to practice your software development skills. During this time, try to avoid touching the behavior or appearance of your app, but instead focus your efforts on making the app easier to understand and develop.
17.2 Code organization
General guidelines:
- Are the variable and function names clear and concise? If not, what names would better communicate the intent of the code?
- Do I have comments where needed to explain complex bits of code?
- Does this whole function fit on my screen or could it be printed on a single piece of paper? If not, is there a way to break it up into smaller pieces?
- Am I copying-and-pasting the same block of code many times throughout my app? If so, is there a way to use a function or a variable to avoid the repetition?
- Are all the parts of my application tangled together, or can I manage the different components of my application in isolation?
Particularly important tools are:
- Functions, the topic of Chapter 18, allow you to reduce duplication in your UI code, make your server functions easier to understand and test, and allow you to more flexibly organize your app code.
- Shiny modules, the topic of Chapter 19, make it possible to write isolated, re-usable code, that coordinates front end and back end behavior. Modules allow you to gracefully separate concerns so that (e.g.) individual pages in your application can operate independently, or repeated components no longer need to be copied and pasted.
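As a minimal sketch of the copy-and-paste question above, the snippet below pulls a repeated calculation into one small function. The function name `summarise_range()` and the use of the built-in `mtcars` data are assumptions made for illustration; they do not come from the chapter.

```r
# Before: the same three lines would be pasted once per variable.
# After: one small, named function that fits on a screen.
summarise_range <- function(x) {
  stopifnot(is.numeric(x))
  c(
    min  = min(x, na.rm = TRUE),
    mean = mean(x, na.rm = TRUE),
    max  = max(x, na.rm = TRUE)
  )
}

# Reuse the function for several columns instead of repeating the code.
# The result is a matrix with one column per variable.
ranges <- vapply(mtcars[c("mpg", "hp", "wt")], summarise_range, numeric(3))
print(ranges)
```

Because the function does not depend on `input` or `output`, it could be called from inside a reactive and also checked on its own outside the app.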
17.3 Testing
Developing a test plan for an application is critical to ensure its ongoing stability. Without a test plan, every change jeopardizes the application.
A testing plan could be entirely manual. A great place to start is a simple text file containing a script to follow to check that all is well. But as the application becomes more complex, the next step is to automate some of your testing. Automation takes time to set up, but it pays off because you can run the tests more frequently. Various forms of automated testing have been developed for Shiny, as outlined in Chapter 21:
- Unit tests that confirm the correct behaviour of an individual function.
- Integration tests to confirm the interactions between reactives.
- Functional tests to validate the end-to-end experience from a browser.
- Load tests to ensure that the application can withstand the amount of traffic you anticipate for it.
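To make the first item in this list concrete, here is a small sketch of a unit test written with {testthat}. The helper `clamp()` is a hypothetical function invented for this example; only `test_that()` and `expect_equal()` come from the package.

```r
library(testthat)

# Hypothetical helper that an app might call from a reactive expression.
clamp <- function(x, lower, upper) {
  pmin(pmax(x, lower), upper)
}

# Each expectation checks one behaviour of the function in isolation.
test_that("clamp() keeps values inside the requested range", {
  expect_equal(clamp(5, 0, 10), 5)
  expect_equal(clamp(-3, 0, 10), 0)
  expect_equal(clamp(c(2, 50), 0, 10), c(2, 10))
})
```

Tests like this are cheap to run after every change, which is what makes them a natural first step beyond a manual script.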
17.4 Dependency management
An app’s dependencies are anything beyond the source code that it requires to run. These could include files on the hard drive, an external database or API, or other R packages that are used by the app.
17.4.1 {renv}
For any analysis that you may want to reproduce in the future, consider using {renv} which enables you to create reproducible R environments.
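A typical {renv} workflow, sketched under the assumption that the commands are run at the R console from the project root, might look like the following. The `interactive()` guard is an addition for this example, so that sourcing the script non-interactively does not change a project by accident.

```r
# Run these steps at the R console from the project root.
if (interactive()) {
  # Create a project-local library and an initial renv.lock file.
  renv::init()

  # After installing or updating packages, record the versions in use.
  renv::snapshot()

  # Later, or on another machine, reinstall the recorded versions.
  renv::restore()
}
```

In practice the three calls are run at different times rather than back to back: `init()` once per project, `snapshot()` whenever the package set changes, and `restore()` when the environment needs to be recreated.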
I have already used {renv} several times, but I am still uncomfortable with it. There are some properties and side effects I do not fully understand. Maybe I should give it another try?
17.4.2 {config}
Another tool for managing dependencies is the {config} package. The config package doesn’t actually manage dependencies itself, but it does provide a convenient place for you to track and manage dependencies other than R packages. For instance, you might specify the path to a CSV file that your application depends on, or the URL of an API that you require. Having these enumerated in the config file gives you a single place where you can track and manage these dependencies. Even better, it enables you to create different configurations for different environments.
For example, if your application analyses a database with lots of data, you might choose to configure a few different environments:
- In the production environment, you connect the app to the real “production” database.
- In a test environment, you can configure the app to use a test database so that you properly exercise the database connections in your tests but you don’t risk corrupting your production database if you accidentally make a change that corrupts the data.
- In development, you might configure the application to use a small CSV with a subset of data to allow for faster iterating.
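The three environments above could be expressed roughly as follows. This is only a sketch: the key `data_source` and its values are hypothetical, and the configuration file is written to a temporary directory so that the example is self-contained. It assumes the {config} package (and its {yaml} dependency) is installed.

```r
# Write a hypothetical config.yml to a temporary directory for illustration.
cfg_file <- file.path(tempdir(), "config.yml")
writeLines(c(
  "default:",
  "  data_source: data/cars-sample.csv",
  "test:",
  "  data_source: test-database",
  "production:",
  "  data_source: production-database"
), cfg_file)

# config::get() reads the section named by `config`. When `config` is not
# given, it uses the R_CONFIG_ACTIVE environment variable, else "default".
dev_source  <- config::get("data_source", config = "default", file = cfg_file)
test_source <- config::get("data_source", config = "test", file = cfg_file)
prod_source <- config::get("data_source", config = "production", file = cfg_file)

print(c(dev = dev_source, test = test_source, production = prod_source))
```

With a setup like this, the app code asks for `data_source` in one place, and the active environment decides which value it receives.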
Lastly, be wary of making assumptions about the local file system. If your code has references to data at C:\data\cars.csv or ~/my-projects/genes.rds, for example, you need to realize that it’s very unlikely that these files will exist on another computer. Instead, either use a path relative to the app directory (e.g. data/cars.csv or genes.rds), or use the config package to make the external path explicit and configurable.
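The relative-path advice can be sketched like this. The directory name `my-app` and the helper `read_app_data()` are assumptions made for the example, and the app directory is created in a temporary location so the code runs anywhere.

```r
# Hypothetical app directory created in a temporary location for illustration.
app_dir <- file.path(tempdir(), "my-app")
dir.create(file.path(app_dir, "data"), recursive = TRUE, showWarnings = FALSE)
write.csv(head(mtcars), file.path(app_dir, "data", "cars.csv"), row.names = FALSE)

# Build the path relative to the app directory instead of hard-coding
# an absolute location that only exists on one computer.
read_app_data <- function(app_dir, relative_path) {
  path <- file.path(app_dir, relative_path)
  if (!file.exists(path)) {
    stop("Data file not found: ", path, call. = FALSE)
  }
  read.csv(path)
}

cars <- read_app_data(app_dir, file.path("data", "cars.csv"))
print(dim(cars))
```

Inside a running Shiny app the working directory is normally the app directory, so a plain `read.csv("data/cars.csv")` would usually be enough; the explicit helper is used here only to make the failure case visible.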
I had never heard of this package. I am not sure whether I will need it, because my learning trajectory does not involve a separation between production and test environments.
17.5 Source code management
Anyone who’s been programming for a long time has inevitably arrived at a state where they’ve accidentally broken their app and want to roll back to a previous working state. This is incredibly arduous when done manually. Fortunately, however, you can rely on a “version-control system” that makes it easy to track atomic changes, roll back to previous work, and integrate the work of multiple contributors.
The most popular version-control system in the R community is Git. Git is typically paired with GitHub, a website that makes it easy to share your Git repos with others. It definitely takes work to become proficient with Git and GitHub, but any experienced developer will confirm that the effort is well worth it. If you’re new to Git, I’d highly recommend starting with Happy Git and GitHub for the useR, by Jenny Bryan.
I am still not an expert with Git/GitHub and, in unusual situations, have to look up relevant questions on Stack Overflow.
17.6 Continuous integration/deployment (CI, CD)
Once you are using a version control system and have a robust set of automated tests, you might benefit from continuous integration (CI). CI is a way to perpetually validate that the changes you’re making to your application haven’t broken anything. You can use it retroactively (to notify you if a change you just made broke your application) or proactively (to notify you if a proposed change would break your app).
There are a variety of services that can connect to a Git repo and automatically run tests when you push a new commit or propose changes. Depending on where your code is hosted, you can consider GitHub Actions, Travis CI, Azure Pipelines, AppVeyor, Jenkins, or GitLab CI/CD, to name a few.
Having a CI process not only prevents experienced developers from making accidental mistakes, but also helps new contributors feel confident in their changes.
The reason I have not used CI so far is that I have not (yet) established an automated test environment with {testthat}. This should be one of my next learning steps, so I am curious to learn more about it in Chapter 21.
17.7 Code reviews
Many software companies have found the benefits of having someone else review code before it’s formally incorporated into a code base. This process of “code review” has a number of benefits:
- Catches bugs before they get incorporated into the application, making them much less expensive to fix.
- Offers teaching opportunities — programmers at all levels often learn something new by reviewing others’ code or by having their code reviewed.
- Facilitates cross-pollination and knowledge sharing across a team to eliminate having only one person who understands the app.
The resulting conversation often improves the readability of the code.
Typically, a code review involves someone other than you, but you can still benefit even if it’s only you. Most experienced developers will agree that taking a moment to review your own code often reveals some small flaw, particularly if you can let it sit for at least a few hours between writing and review.
Here are a few questions to hold in your head when reviewing code:
- Do new functions have concise but evocative names?
- Are there parts of the code you find confusing?
- What areas are likely to change in the future, and would particularly benefit from automated testing?
- Does the style of the code match the rest of the app? (Or even better, your group’s documented code style.)
If you’re embedded in an organisation with a strong engineering culture, setting up code reviews for data science code should be relatively straightforward, and you’ll have existing tools and experience to draw on. If you’re in an organisation that has few other software engineers, you may need to do more convincing.
Resource 17.1 : Tips for code reviews
- Code Review by Mattheus Richard
- Google engineering practices
17.8 Glossary Entries
Session Info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.5.1 (2025-06-13)
#> os macOS Sequoia 15.5
#> system aarch64, darwin20
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Vienna
#> date 2025-07-18
#> pandoc 3.7.0.2 @ /opt/homebrew/bin/ (via rmarkdown)
#> quarto 1.8.4 @ /usr/local/bin/quarto
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> cli 3.6.5 2025-04-23 [1] CRAN (R 4.5.0)
#> commonmark 2.0.0 2025-07-07 [1] CRAN (R 4.5.0)
#> curl 6.4.0 2025-06-22 [1] CRAN (R 4.5.0)
#> dichromat 2.0-0.1 2022-05-02 [1] CRAN (R 4.5.0)
#> digest 0.6.37 2024-08-19 [1] CRAN (R 4.5.0)
#> evaluate 1.0.4 2025-06-18 [1] CRAN (R 4.5.0)
#> farver 2.1.2 2024-05-13 [1] CRAN (R 4.5.0)
#> fastmap 1.2.0 2024-05-15 [1] CRAN (R 4.5.0)
#> glossary * 1.0.0.9003 2025-06-08 [1] local
#> glue 1.8.0 2024-09-30 [1] CRAN (R 4.5.0)
#> htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.5.0)
#> htmlwidgets 1.6.4 2023-12-06 [1] CRAN (R 4.5.0)
#> jsonlite 2.0.0 2025-03-27 [1] CRAN (R 4.5.0)
#> kableExtra 1.4.0 2024-01-24 [1] CRAN (R 4.5.0)
#> knitr 1.50 2025-03-16 [1] CRAN (R 4.5.0)
#> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.5.0)
#> litedown 0.7 2025-04-08 [1] CRAN (R 4.5.0)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.5.0)
#> markdown 2.0 2025-03-23 [1] CRAN (R 4.5.0)
#> R6 2.6.1 2025-02-15 [1] CRAN (R 4.5.0)
#> RColorBrewer 1.1-3 2022-04-03 [1] CRAN (R 4.5.0)
#> rlang 1.1.6 2025-04-11 [1] CRAN (R 4.5.0)
#> rmarkdown 2.29 2024-11-04 [1] CRAN (R 4.5.0)
#> rstudioapi 0.17.1 2024-10-22 [1] CRAN (R 4.5.0)
#> rversions 2.1.2 2022-08-31 [1] CRAN (R 4.5.0)
#> scales 1.4.0 2025-04-24 [1] CRAN (R 4.5.0)
#> sessioninfo 1.2.3 2025-02-05 [1] CRAN (R 4.5.0)
#> stringi 1.8.7 2025-03-27 [1] CRAN (R 4.5.0)
#> stringr 1.5.1 2023-11-14 [1] CRAN (R 4.5.0)
#> svglite 2.2.1 2025-05-12 [1] CRAN (R 4.5.0)
#> systemfonts 1.2.3 2025-04-30 [1] CRAN (R 4.5.0)
#> textshaping 1.0.1 2025-05-01 [1] CRAN (R 4.5.0)
#> vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.5.0)
#> viridisLite 0.4.2 2023-05-02 [1] CRAN (R 4.5.0)
#> xfun 0.52 2025-04-02 [1] CRAN (R 4.5.0)
#> xml2 1.3.8 2025-03-14 [1] CRAN (R 4.5.0)
#> yaml 2.3.10 2024-07-26 [1] CRAN (R 4.5.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.5-arm64/library
#> [2] /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library
#> * ── Packages attached to the search path.
#>
#> ──────────────────────────────────────────────────────────────────────────────