Redivis Documentation
Analyze data in a workflow

Last updated 5 months ago

Overview

Workflows are where you work with data on Redivis. In a workflow you can query, merge, reshape, and analyze any data that you have access to, all from within your web browser.

In a workflow, you can construct reproducible data transformations and analyses, and share and collaborate with your peers in real time.

1. Create a workflow

Add a dataset to a new or existing workflow from any Dataset page where you have "Data access" by clicking the Analyze in workflow button.

Within workflows you can navigate between entities on the left side of the screen, and inspect them further on the right panel. You can inspect your dataset further by clicking on any table to see its cells and summary statistics.

To add more data to this workflow you can click the Add data button in the top left of the workflow toolbar. It is also possible to add other linked workflows to this workflow. This is useful as you develop more complex analyses that you want to segment into discrete pieces of work that you can link together.

You can find this workflow later by going back to your workspace.

2. Transform data

Transforming tables is a crucial step in working with data on Redivis. Conceptually, transforms execute a query on source table(s), whose results are materialized in a new output table. In most cases you'll want to use transforms to reshape your data to contain the information you're interested in, before analyzing that table in a notebook or exporting it for further use.
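The idea of a query whose result is materialized as a new table can be sketched outside of Redivis. The example below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for the workflow, and the table and column names (visits, patient_id, cost) are hypothetical rather than taken from any Redivis dataset.

```python
import sqlite3

# In-memory database standing in for a workflow (assumption: not Redivis itself).
conn = sqlite3.connect(":memory:")

# Hypothetical source table.
conn.execute("CREATE TABLE visits (patient_id INTEGER, visit_date TEXT, cost REAL)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [(1, "2021-01-05", 120.0), (1, "2021-03-10", 80.0), (2, "2021-02-20", 200.0)],
)

# The "transform": a query over the source table whose result is
# materialized as a new output table. The source table is left untouched.
conn.execute(
    """
    CREATE TABLE visits_q1_totals AS
    SELECT patient_id, SUM(cost) AS total_cost
    FROM visits
    WHERE visit_date BETWEEN '2021-01-01' AND '2021-03-31'
    GROUP BY patient_id
    """
)

output_rows = conn.execute(
    "SELECT patient_id, total_cost FROM visits_q1_totals ORDER BY patient_id"
).fetchall()
source_count = conn.execute("SELECT COUNT(*) FROM visits").fetchone()[0]
```

Because the output is a separate table, it can be inspected or queried again without touching the source data.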

To create a transform, select a table in this dataset and click the + Transform button. From there you can build a query through the point-and-click interface, or write SQL directly by adding a SQL step.

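For every transform, you will need to select which variables to keep in your output table. The remaining steps are up to you. Some common operations to get started with include:

  • Joining in any other dataset table or output table in this workflow
  • Creating variables
  • Filtering records to match defined parameters
  • Renaming variables or changing their type
  • Aggregating data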

Once you've built your query, execute it by clicking the Run button in the top right of the transform. This creates a new output table. To inspect the result, click on the table beneath the transform in the map and confirm that it contains the data you expect.

From here you can create a new transform from this output table to continue reshaping your data, or go back to your original transform to make changes and rerun it.
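Chaining a second transform onto an output table, and rerunning after a change, might look like the following sketch. It again uses sqlite3 as a stand-in; the run_transform helper, the table names, and the thresholds are all hypothetical. In this sketch the downstream transform is rerun explicitly after the upstream one changes.

```python
import sqlite3


def run_transform(conn, output_table, query):
    """Re-materialize output_table from query, replacing any earlier result."""
    conn.execute(f"DROP TABLE IF EXISTS {output_table}")
    conn.execute(f"CREATE TABLE {output_table} AS {query}")


# Hypothetical query for the second transform, which reads the first one's output.
AGGREGATE_QUERY = (
    "SELECT state, COUNT(*) AS n, SUM(charge) AS total "
    "FROM large_charges GROUP BY state"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE encounters (person_id INTEGER, state TEXT, charge REAL)")
conn.executemany(
    "INSERT INTO encounters VALUES (?, ?, ?)",
    [(1, "CA", 50.0), (2, "CA", 70.0), (3, "NY", 40.0), (4, "NY", 10.0)],
)

# Transform 1: filter the source table.
run_transform(
    conn,
    "large_charges",
    "SELECT person_id, state, charge FROM encounters WHERE charge >= 40",
)
# Transform 2: aggregate, built on the output table of transform 1.
run_transform(conn, "charges_by_state", AGGREGATE_QUERY)
first_run = conn.execute(
    "SELECT state, n, total FROM charges_by_state ORDER BY state"
).fetchall()

# Go back to transform 1, change the filter, and rerun both steps in order.
run_transform(
    conn,
    "large_charges",
    "SELECT person_id, state, charge FROM encounters WHERE charge >= 60",
)
run_transform(conn, "charges_by_state", AGGREGATE_QUERY)
second_run = conn.execute(
    "SELECT state, n, total FROM charges_by_state ORDER BY state"
).fetchall()
```

Keeping each step as its own output table makes intermediate results easy to inspect before building further on them.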

As you become more familiar with transforms, you can start doing more advanced work such as geospatial joins, complex aggregations, and statistical analyses.

3. Analyze data in a notebook

Once you have a table you're ready to analyze, you can select any table and click the + Notebook button to create a notebook that references this table.

The default notebook configuration is free, and provides access to 2 CPUs and 32GB of working memory, alongside a 60GB (SSD) disk and gigabit network. The computational power of these default notebooks is comparable to that of most personal computers, and will be more than enough for many analyses.
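As a rough way to judge whether a table is likely to fit in the default notebook's memory, a back-of-the-envelope estimate can help. The sketch below is an assumption-laden heuristic, not a Redivis feature: it assumes every cell is an 8-byte number, uses decimal gigabytes, and applies a hypothetical overhead factor of 2 for working copies made during analysis.

```python
DEFAULT_MEMORY_GB = 32  # working memory of the default notebook, per the text above


def estimated_memory_gb(n_rows, n_numeric_columns, overhead_factor=2.0):
    """Rough in-memory size in GB, assuming 8 bytes per numeric cell.

    overhead_factor is a hypothetical multiplier for intermediate copies.
    """
    raw_bytes = n_rows * n_numeric_columns * 8
    return raw_bytes * overhead_factor / 1e9


def fits_default_notebook(n_rows, n_numeric_columns, overhead_factor=2.0):
    """True if the estimate is within the default notebook's working memory."""
    estimate = estimated_memory_gb(n_rows, n_numeric_columns, overhead_factor)
    return estimate <= DEFAULT_MEMORY_GB


# 10 million rows x 20 numeric columns: about 3.2 GB under these assumptions.
small_estimate = estimated_memory_gb(10_000_000, 20)
# 500 million rows x 20 numeric columns: about 160 GB under these assumptions.
large_estimate = estimated_memory_gb(500_000_000, 20)
```

Tables with many string columns can take considerably more memory than this estimate suggests, so treat the result as a lower bound and reduce the table in a transform first when in doubt.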

From here, how you analyze and visualize your data is up to you. Once you’ve finalized your notebook, you can export it in different formats to share your findings.

4. Share and collaborate

You can share your in-progress work or finished results with collaborators by sharing this workflow.

Researchers can work side by side in this workflow in real time. Leave comments to communicate, and see a visual cue for what each person is working on. You can even collaborate within a running notebook at the same time.

If any of the data in your workflow is restricted, your collaborator must also have access to those datasets in order to view their derivatives within your workflow.
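The rule above can be thought of as a check over the workflow's lineage: a derived table is viewable only if the collaborator has access to every restricted dataset it descends from. The sketch below is a simplified model with hypothetical node names. It treats access as a yes/no flag per dataset, which is an assumption and not a description of how Redivis implements access control.

```python
# Hypothetical workflow graph: each node lists the nodes it is derived from.
upstream = {
    "public_census": [],
    "restricted_claims": [],
    "joined_table": ["public_census", "restricted_claims"],
    "summary_table": ["joined_table"],
    "census_only_table": ["public_census"],
}

# Hypothetical set of datasets that require approved access.
restricted_datasets = {"restricted_claims"}


def required_datasets(node):
    """Return the restricted datasets that a node is ultimately derived from."""
    if node in restricted_datasets:
        return {node}
    required = set()
    for parent in upstream[node]:
        required |= required_datasets(parent)
    return required


def can_view(node, collaborator_access):
    """True if the collaborator has access to every restricted ancestor of node."""
    return required_datasets(node) <= set(collaborator_access)


# A collaborator with no approved access can see only purely public derivatives.
no_access_summary = can_view("summary_table", set())
no_access_census = can_view("census_only_table", set())
# Once access to the restricted dataset is granted, its derivatives become viewable.
with_access_summary = can_view("summary_table", {"restricted_claims"})
```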

Next steps

Share and collaborate

Redivis workflows are built for collaboration and include real-time visuals to see where collaborators with edit access are working in the workflow, and a comments interface to discuss changes asynchronously.

Export data

Browse our example workflows

Redivis workflows excel at working with large tables, whether it's filtering and joining, complex aggregation and date manipulation, or visualization and analysis.

Learn more about joining, creating variables, filtering, renaming or retyping variables, and aggregating data in the Reshape data in transforms guide.

Notebooks are available in Python and R, as well as Stata or SAS (with a corresponding license). Notebooks come pre-installed with common data science libraries, but you can also customize the notebook’s dependencies and startup script to create a custom, reproducible analysis environment that meets your needs.

If you're working with larger tables, creating an ML model, or performing other particularly intensive tasks, you may choose to configure additional compute resources for the notebook. These resources are billed at an hourly rate based on your chosen environment, and require you to purchase compute credits on your account.

Notebooks come pre-populated with starter code that you can use to import data, and the API docs contain comprehensive documentation and further examples.

To learn more about analyzing data, see our Work with data in notebooks guide.

Share your workflow to work with collaborators in real time, and make it public so that others can fork and build upon your work.

If you'd like to export data to a different system, you can download it in various file formats, reference it in Python / R, or visualize it in tools such as Google Data Studio.

Learn more about exporting in the Export to other environments guide.

For more examples, see the Example workflows guide.