Important note
These functions were developed because existing R tools for loading RIS files often fail to preserve the original formatting of PsycINFO exports. The new functions are designed to handle this RIS file structure alongside other database formats. For a full-scale use of these functions, see the Using OpenAI’s GPT API models for Title and Abstract Screening in Systematic Reviews vignette. If you experience any issues with these functions, please report them on GitHub. As an alternative, you can try synthesisr::read_refs().
Overview
This vignette introduces two helpers for working with RIS files. The function read_ris_to_dataframe(file_path) parses an RIS file into a data frame with the following features:
- Automatically maps RIS tags to descriptive column names (e.g., AU → author, TI → title, PY → year)
- Preserves the order of tags as they first appear in the file
- Collapses repeated tags within a record into a single semicolon-separated string
- Stores metadata to preserve original formatting when writing back
The function save_dataframe_to_ris(df, file_path) writes a data frame back to RIS format:
- Writes TY (source type) first for each record, followed by all other fields
- Splits semicolon-separated values into multiple RIS tag lines
- Preserves original formatting when available (from metadata)
- Terminates each record with ER -
Load the package
Read an RIS file
The example below builds a small RIS file in a temporary location and reads it.
ris <- c(
"TY - JOUR",
"AU - Author, One",
"AU - Author, Two",
"TI - An example title",
"PY - 2020",
"ER - ",
"",
"TY - CHAP",
"TI - Another title",
"AU - Author, Three",
"ER - "
)
tmp_in <- tempfile(fileext = ".ris")
writeLines(ris, tmp_in, useBytes = TRUE)
df <- read_ris_to_dataframe(tmp_in)
df
  source_type                   author            title year
1        JOUR Author, One; Author, Two An example title 2020
2        CHAP            Author, Three    Another title
The output data frame has descriptive column names instead of RIS tags. For example:
- TY (type) becomes source_type
- AU (author) becomes author
- TI (title) becomes title
- PY (publication year) becomes year
Repeated tags, such as multiple AU lines, are collapsed into a single string with the values separated by “; ” (e.g., “Author, One; Author, Two”).
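The collapsing step can be pictured with a few lines of base R. This is a minimal sketch under stated assumptions, not the package's actual implementation: the regular expression and the names record, tags, values, and collapsed are hypothetical and chosen only for illustration.

```r
# Illustrative sketch only; read_ris_to_dataframe() may work differently.
# One RIS record, already split into lines (without the closing ER line).
record <- c(
  "TY  - JOUR",
  "AU  - Author, One",
  "AU  - Author, Two",
  "TI  - An example title",
  "PY  - 2020"
)

# Assumed line layout: two-character tag, optional spaces, hyphen, value.
pattern <- "^([A-Z][A-Z0-9])[[:space:]]*-[[:space:]]*(.*)$"
tags   <- sub(pattern, "\\1", record)
values <- sub(pattern, "\\2", record)

# unique() keeps the tags in order of first appearance.
tag_order <- unique(tags)

# Join all values that share a tag into one "; "-separated string.
collapsed <- vapply(
  tag_order,
  function(tag) paste(values[tags == tag], collapse = "; "),
  character(1)
)

collapsed[["AU"]]
```

Here the two AU lines end up as the single string "Author, One; Author, Two", while tags that occur once are left unchanged.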
Write a data frame to RIS
Create a data frame and write it to a .ris file. You can use either descriptive column names (as returned by read_ris_to_dataframe()) or raw RIS tags. Semicolon-separated values are automatically split into multiple tag lines.
# Using raw RIS tags
df_out <- data.frame(
TY = c("JOUR", "CHAP"),
AU = c("Author, One; Author, Two", "Author, Three"),
TI = c("An example title", "Another title"),
PY = c("2020", ""),
stringsAsFactors = FALSE
)
tmp_out <- tempfile(fileext = ".ris")
invisible(capture.output(save_dataframe_to_ris(df_out, tmp_out)))
readLines(tmp_out, encoding = "UTF-8")
 [1] "TY - JOUR"             "AU - Author, One"      "AU - Author, Two"
 [4] "TI - An example title" "PY - 2020"             "ER - "
 [7] ""                      "TY - CHAP"             "AU - Author, Three"
[10] "TI - Another title"    "ER - "                 ""
Each record writes the TY tag first, splits any field value containing “;” into multiple RIS tag lines, and ends with ER - followed by a blank line.
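The writing rules described above can be sketched in base R for a single record. This is a hypothetical illustration, not the code behind save_dataframe_to_ris(): the names row, sep, and lines are invented here, the separator assumes the conventional RIS layout (tag, two spaces, hyphen, space), and skipping empty fields is an assumption based on the output shown above, where the second record has no PY line.

```r
# Illustrative sketch only; save_dataframe_to_ris() may work differently.
row <- list(
  AU = "Author, One; Author, Two",
  TY = "JOUR",
  TI = "An example title",
  PY = ""
)

sep <- "  - "  # assumed conventional RIS separator

# TY goes first, then the remaining fields in their existing order.
tag_order <- c("TY", setdiff(names(row), "TY"))

lines <- character(0)
for (tag in tag_order) {
  value <- row[[tag]]
  # Assumption: missing or empty fields produce no tag line.
  if (is.na(value) || !nzchar(value)) next
  # Split on ";" and trim, so each value gets its own tag line.
  parts <- trimws(strsplit(value, ";", fixed = TRUE)[[1]])
  lines <- c(lines, paste0(tag, sep, parts))
}

# Close the record with ER and a blank separator line.
lines <- c(lines, paste0("ER", sep), "")
lines
```

The semicolon-separated AU value yields two AU lines, and the empty PY field is dropped rather than written as a tag with no value.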
You can also use descriptive column names (they will be automatically mapped back to RIS tags):
# Using descriptive names
df_descriptive <- data.frame(
source_type = c("JOUR", "CHAP"),
author = c("Author, One; Author, Two", "Author, Three"),
title = c("An example title", "Another title"),
year = c("2020", ""),
stringsAsFactors = FALSE
)
tmp_out2 <- tempfile(fileext = ".ris")
invisible(capture.output(save_dataframe_to_ris(df_descriptive, tmp_out2)))
readLines(tmp_out2, encoding = "UTF-8")
 [1] "TY - JOUR"             "AU - Author, One"      "AU - Author, Two"
 [4] "TI - An example title" "PY - 2020"             "ER - "
 [7] ""                      "TY - CHAP"             "AU - Author, Three"
[10] "TI - Another title"    "ER - "                 ""
Both approaches produce identical RIS output.
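One way to picture why the two inputs are equivalent is a simple lookup from descriptive names to RIS tags, applied before writing. The sketch below is an assumption for illustration only: name_to_tag and to_ris_names() are hypothetical and cover just the four fields used in this vignette, whereas the package's own mapping may be larger or organised differently.

```r
# Illustrative sketch only; the package's internal mapping is not shown here.
# Hypothetical lookup covering the four fields used in this vignette.
name_to_tag <- c(source_type = "TY", author = "AU", title = "TI", year = "PY")

# Rename known descriptive columns to RIS tags; leave other columns as-is.
to_ris_names <- function(df, lookup = name_to_tag) {
  known <- names(df) %in% names(lookup)
  names(df)[known] <- unname(lookup[names(df)[known]])
  df
}

df_descriptive <- data.frame(
  source_type = c("JOUR", "CHAP"),
  author = c("Author, One; Author, Two", "Author, Three"),
  title = c("An example title", "Another title"),
  year = c("2020", ""),
  stringsAsFactors = FALSE
)

df_tags <- to_ris_names(df_descriptive)
names(df_tags)
```

After renaming, the data frame has the columns TY, AU, TI, and PY, so it matches the raw-tag input from the first example. Columns that already use RIS tags pass through untouched.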