Redivis Documentation

Metadata

Last updated 2 months ago

Overview

Redivis determines variable names and types during data upload. It also automatically parses certain metadata based on the uploaded file format:

  • SAS (.sas7bdat): labels

  • Stata (.dta): labels and value labels

  • SPSS (.sav): labels and value labels

For other file types (e.g., CSV), you will need to add the metadata yourself. To apply metadata in bulk, you can upload a file containing the metadata directly from your computer. This file can be either CSV or JSON.

Is your metadata stuck in a PDF? We're truly sorry. If you can, please let the data provider know that it is essential that they provide metadata in a machine-readable format; hopefully this will change in time.

While you can just upload the PDF to the dataset's documentation, you'll be doing your researchers a huge service if you can add structured metadata to the variables. That might mean some manual copying and pasting from the PDF, or you could consider the various (and imperfect) online PDF-to-CSV conversion tools, or a Python library that does the same job.

If you don't have the bandwidth, consider asking your researchers to contribute by making them dataset editors.

CSV metadata format

The CSV should have no header row. Each row corresponds to one variable: column 1 is the name, column 2 the label, and column 3 the description. If a variable doesn't have a label or description, leave those columns empty.

variable1_name,variable1_label,variable1_description
variable2_name,variable2_label,variable2_description

For example:

sex,patient sex,patient's recorded sex
id,patient identifier,unique patient identifier
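
If your variable documentation already lives in code, a short script can produce this layout for you. The following is a minimal Python sketch, not an official Redivis tool: the function name `build_metadata_csv` and the `visit_date` variable are hypothetical, and it assumes the uploader accepts standard CSV quoting for labels or descriptions that contain commas.

```python
import csv
import io


def build_metadata_csv(variables):
    """Return headerless CSV text with one row per variable:
    name, label, description. Missing labels/descriptions become empty columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for var in variables:
        writer.writerow([
            var["name"],
            var.get("label", ""),
            var.get("description", ""),
        ])
    return buffer.getvalue()


variables = [
    {"name": "sex", "label": "patient sex", "description": "patient's recorded sex"},
    {"name": "id", "label": "patient identifier", "description": "unique patient identifier"},
    {"name": "visit_date"},  # hypothetical variable with no label or description
]

csv_text = build_metadata_csv(variables)
print(csv_text)
```

To produce a file for upload, write `csv_text` to disk, for example with `open("metadata.csv", "w", newline="")`.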

JSON metadata format

When uploading a JSON file, specify the name, label, description, and valueLabels using the appropriately named attributes in the object corresponding to each variable. If the variable doesn't have a label, description, or value labels, you don't need to include these attributes.

For example (the JSON file is an array of objects, with each object representing a variable):

[
    {
        "name": "sex",
        "label": "patient sex",
        "description": "patient's recorded sex",
        "valueLabels": [
            {
                "value": 1,
                "label": "Male"
            },
            {
                "value": 2,
                "label": "Female"
            }
        ]
    }
]
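
Before uploading, it can help to check that a metadata file has the shape described above. The sketch below is an illustrative local check only, based on the attributes named in this section (`name`, `label`, `description`, `valueLabels`); the function `validate_metadata` is hypothetical and the rules Redivis actually enforces on upload may differ.

```python
import json

OPTIONAL_TEXT_KEYS = ("label", "description")


def validate_metadata(text):
    """Return a list of problems; an empty list means the shape looks right."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(data, list):
        return ["top level must be an array of variable objects"]

    problems = []
    for i, var in enumerate(data):
        if not isinstance(var, dict):
            problems.append(f"item {i}: expected an object")
            continue
        if not isinstance(var.get("name"), str) or not var["name"]:
            problems.append(f"item {i}: missing or empty 'name'")
        for key in OPTIONAL_TEXT_KEYS:
            if key in var and not isinstance(var[key], str):
                problems.append(f"item {i}: '{key}' should be a string")
        value_labels = var.get("valueLabels", [])
        if not isinstance(value_labels, list):
            problems.append(f"item {i}: 'valueLabels' should be an array")
            continue
        for j, entry in enumerate(value_labels):
            if not isinstance(entry, dict) or "value" not in entry or "label" not in entry:
                problems.append(f"item {i}: valueLabels[{j}] needs 'value' and 'label'")
    return problems


sample = """
[
    {
        "name": "sex",
        "label": "patient sex",
        "valueLabels": [{"value": 1, "label": "Male"}, {"value": 2, "label": "Female"}]
    }
]
"""
print(validate_metadata(sample))
```

The check treats `label`, `description`, and `valueLabels` as optional, matching the note above that absent attributes can simply be left out.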

To upload value labels in bulk, you must use the JSON format. We no longer support bulk upload of value labels via CSV.
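
If you already maintain metadata in the CSV layout and now need value labels, one option is to convert the CSV to JSON and attach the labels there. The following Python sketch assumes the three-column CSV described earlier; the function `csv_metadata_to_json` and the shape of its `value_labels` argument (a dict mapping a variable name to `{value: label}` pairs) are hypothetical choices for illustration.

```python
import csv
import io
import json


def csv_metadata_to_json(csv_text, value_labels=None):
    """Convert headerless name,label,description rows to the JSON metadata
    format, adding valueLabels for any variable listed in value_labels."""
    value_labels = value_labels or {}
    variables = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        # Pad short rows so a missing label or description reads as empty
        name, label, description = (row + ["", ""])[:3]
        var = {"name": name}
        if label:
            var["label"] = label
        if description:
            var["description"] = description
        if name in value_labels:
            var["valueLabels"] = [
                {"value": value, "label": text}
                for value, text in value_labels[name].items()
            ]
        variables.append(var)
    return json.dumps(variables, indent=4)


csv_text = (
    "sex,patient sex,patient's recorded sex\n"
    "id,patient identifier,unique patient identifier\n"
)
json_text = csv_metadata_to_json(csv_text, {"sex": {1: "Male", 2: "Female"}})
print(json_text)
```

Empty labels and descriptions are dropped from the output, since the JSON format does not require those attributes.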
