Citing & Describing Redivis

Citing Redivis

Publications and other outputs should cite Redivis as research software, using the appropriate conventions for your publication medium.

The Digital Object Identifier (DOI) for Redivis is: https://doi.org/10.71778/V2DW-7A53
The canonical URL for Redivis is: https://redivis.com

If referencing Redivis inline, the abbreviated citation can be used:

Redivis [https://doi.org/10.71778/V2DW-7A53]

The following BibTeX entry can be used:

@misc{https://doi.org/10.71778/V2DW-7A53,
  doi = {10.71778/V2DW-7A53},
  url = {https://redivis.com},
  author = {Redivis},
  keywords = {FOS: Software},
  language = {en},
  title = {Redivis},
  publisher = {Redivis},
  year = {2025}
}
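
As a rough usage sketch, the entry above might be cited from a LaTeX document as shown below. The file name `references.bib` and the `plain` bibliography style are assumptions for illustration only; the citation key is the one from the entry. If your toolchain objects to a key containing `:` and `/`, renaming the key in both places (for example to `redivis`) is a common workaround.

```latex
\documentclass{article}
\begin{document}

% Cite the platform using the key from the BibTeX entry above.
Data were stored and analyzed on Redivis~\cite{https://doi.org/10.71778/V2DW-7A53}.

% Assumes the entry was saved to references.bib (hypothetical file name).
\bibliographystyle{plain}
\bibliography{references}

\end{document}
```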

These identifiers are also associated with Redivis, though they do not need to be included in every citation:

Re3data identifier: https://www.re3data.org/repository/r3d100014398
RRID: RRID:SCR_023111

Describing Redivis in grants and publications

We provide the language below to help you describe Redivis in your publication or grant application. It may be adapted and modified as needed. No attribution is necessary when using or modifying this content.

Overview

Redivis is a secure, scalable, cloud-based data platform developed to meet the needs of academic research. Redivis was first developed in collaboration with the Stanford Center for Population Health Sciences and is now deployed across a number of leading research institutions [1], supporting the distribution and analysis of large, sensitive datasets across multiple disciplines in alignment with FAIR data practices.

Users of Redivis are able to upload large-scale numeric, text, structured, and unstructured data directly through the browser or via APIs and other integrations. Built-in tools are available to curate and tag rich metadata, maximizing the shareability of datasets. The platform allows for robust search and exploration across multiple datasets and their metadata and variables.

Additionally, Redivis provides a rich toolset for analysis and exploration. Users can filter, merge, analyze, and visualize billions of records in real time, and can easily bring together disparate datasets to answer novel questions. They can leverage a massively parallel architecture to execute SQL queries (either composed as code or built through a graphical user interface), and can run customizable Jupyter notebooks in Python, R, Stata, and SAS. REST APIs give users and applications additional programmatic interfaces to the data, ensuring interoperability with other tools and ecosystems.

Technical infrastructure

The platform is built on Google Cloud Platform infrastructure using open-source software. The main application runs in containerized services orchestrated with Kubernetes. It integrates high-performance tools including Google BigQuery and its ANSI-SQL interface for large-scale tabular data processing and JupyterLab for interactive analytics. Researchers may provision environments preconfigured for R, Python, SAS, and Stata, with support for customizable environments. Compute capacity scales dynamically to meet workload demands, with configurations available up to 416 CPUs, 11.5 TB of RAM, and 16 NVIDIA A100 GPUs, facilitating complex statistical and machine learning workflows on terabyte-scale datasets.

Reproducibility

Redivis is designed such that reproducibility is an automatic byproduct of researchers' use of the platform. A novel version control system for datasets enables efficient data updates without duplication, supporting full reproducibility and cost-efficient storage management. All analytical activity — including code, queries, and derivative outputs — is tracked and recoverable, ensuring transparency and compliance with NIH data-sharing policies.

All datasets, workflows, and versions thereof can be assigned a unique Digital Object Identifier (DOI), allowing researchers to link persistently to a fully reproducible artifact of their research. Future investigators, assuming they have appropriate access to the underlying data, can then re-run these analyses and produce identical results, in turn modifying and building upon prior works.

Security

Redivis supports rigorous data governance with a tiered attribute-based access control (ABAC) system, customizable user agreements, egress restrictions, and an intuitive, searchable audit trail.

Redivis has been audited and approved for use with FERPA, PII, PHI, and HIPAA data. The platform is SOC 2 and NIST 800-171 (rev 3) compliant and undergoes regular audits and penetration testing. All data and metadata are stored in a multi-redundant, AES-256 encrypted datastore. All connections to the platform use TLS 1.2 or later. User login is managed by single sign-on via eduGAIN and InCommon, allowing users and collaborators to authenticate with their institutional credentials. The platform supports HTTP Strict Transport Security (HSTS) and is on the HSTS preload list for all major browsers, preventing users from establishing an unencrypted connection.

Data administrators on Redivis have access to detailed, searchable audit logs, and reports can easily be generated for auditability and traceability. The platform allows for the restriction of data exports and downloads as well as the automatic expiration of dataset access. For highly sensitive data, multi-layer protection exists to prevent accidental data sharing, viewing, and downloading. Redivis also allows for the complete deletion of data, including the automatic and instantaneous deletion of all data derivatives, as needed.

—

[1] Stanford, Columbia, UCLA Libraries, Kellogg Business School, Duke Libraries, Mass General Brigham CSPH

