Analyze data in a workflow

Overview

Workflows are where you work with data on Redivis. In a workflow you can query, merge, reshape, and analyze any data that you have access to, all from within your web browser.

In a workflow, you can construct reproducible data transformations and analyses, and share and collaborate with your peers in real time.

1. Create a workflow

From any Dataset page where you have "Data access", click the Analyze in workflow button to add that dataset to a new or existing workflow.

Within a workflow, you can navigate between entities on the left side of the screen and inspect them in the right panel. Click on any table in your dataset to see its cells and summary statistics.

To add more data to this workflow, click the Add data button in the top left of the workflow toolbar. You can also add other workflows as linked sources. This is useful as you develop more complex analyses that you want to segment into discrete pieces of work and link together.

You can find this workflow later by going back to your workspace.

2. Transform data

Transforming tables is a crucial step in working with data on Redivis. Conceptually, transforms execute a query on source table(s), whose results are materialized in a new output table. In most cases you'll want to use transforms to reshape your data to contain the information you're interested in, before analyzing that table in a notebook or exporting it for further use.

To create a transform, select a table in this dataset and click the +Transform button. From here you can build a query through the point-and-click interface, or write SQL directly by adding a SQL step.

For all transforms you will need to select which variables you want to keep in your output table. The rest of the steps are up to you. Some common operations you can get started with include:

Joining in any other dataset table or output table in this workflow
Creating variables
Filtering records to match defined parameters
Renaming variables or changing their type
Aggregating data
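
To make these operations concrete, here is a minimal sketch of what a transform does conceptually: it runs a query against source tables and materializes the result as a new output table. It uses Python's built-in sqlite3 module as a local stand-in, so it is not Redivis itself and the SQL dialect may differ from what a SQL step accepts. The table names (patients, visits, output) and their variables are hypothetical.

```python
import sqlite3

# In-memory database standing in for two source tables (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (patient_id INTEGER, birth_year INTEGER, state TEXT);
CREATE TABLE visits (patient_id INTEGER, visit_cost REAL);
INSERT INTO patients VALUES (1, 1980, 'CA'), (2, 1995, 'CA'), (3, 1970, 'NY');
INSERT INTO visits VALUES (1, 100.0), (1, 50.0), (2, 75.0), (3, 20.0);
""")

# The "transform": one query whose result is materialized as a new table.
conn.execute("""
CREATE TABLE output AS
SELECT
    p.state AS region,                                -- rename a variable
    COUNT(DISTINCT p.patient_id) AS n_patients,       -- aggregate
    CAST(SUM(v.visit_cost) AS INTEGER) AS total_cost, -- aggregate, change type
    SUM(v.visit_cost) / COUNT(DISTINCT p.patient_id)
        AS cost_per_patient                           -- create a variable
FROM patients AS p
JOIN visits AS v ON v.patient_id = p.patient_id       -- join another table
WHERE v.visit_cost >= 50                              -- filter records
GROUP BY p.state
""")

# Inspect the output table, as you would after running a transform.
output_rows = conn.execute(
    "SELECT region, n_patients, total_cost, cost_per_patient "
    "FROM output ORDER BY region"
).fetchall()
print(output_rows)
```

Only the variables named in the SELECT clause end up in the output table, which mirrors the requirement to choose which variables to keep.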

Once you've built your query, execute it by clicking the Run button in the top right of the transform. This creates a new output table, shown beneath the transform in the map. Click on that table to inspect the results of your query and confirm it contains the data you expect.

From here you can create a new transform from this output table to continue reshaping your data, or go back to your original transform to make changes and rerun it.

As you become more familiar with transforms, you can start doing more advanced work such as geospatial joins, complex aggregations, and statistical analyses.

Learn more in the Reshape data in transform guide.

3. Analyze data in a notebook

Once you have a table you're ready to analyze, you can select any table and click the + Notebook button to create a notebook that references this table.

Notebooks are available in Python and R, as well as Stata or SAS (with a corresponding license). Notebooks come pre-installed with common libraries in the data science toolkit, but you can also customize the notebook’s dependencies and startup script to create a custom, reproducible analysis environment that meets your needs.

The default notebook configuration is free, and provides access to 2 CPUs and 32GB working memory, alongside a 60GB (SSD) disk and gigabit network. The computational power of these default notebooks is comparable to most personal computers, and will be more than enough for many analyses.

If you're working with larger tables, creating an ML model, or performing other particularly intensive tasks, you may choose to configure additional compute resources for the notebook. These notebooks are billed at an hourly rate based on your chosen environment, and require you to purchase compute credits on your account.
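
As a rough way to judge whether a table is likely to fit in the default notebook, you can estimate its in-memory size before loading it. The sketch below is a back-of-the-envelope calculation, not Redivis guidance: the 8 bytes per value (typical for 64-bit numbers) and the 50% headroom factor are assumptions, and string-heavy tables can take considerably more memory.

```python
DEFAULT_NOTEBOOK_MEMORY_GB = 32  # working memory of the default notebook


def estimated_memory_gb(n_rows, n_columns, bytes_per_value=8):
    """Rough in-memory size of a table, assuming fixed-width values."""
    return n_rows * n_columns * bytes_per_value / 1e9


def fits_in_default_notebook(n_rows, n_columns, headroom=0.5):
    """True if the estimate stays within a fraction of default memory.

    The headroom factor (an assumption) leaves room for copies made
    during analysis.
    """
    budget_gb = DEFAULT_NOTEBOOK_MEMORY_GB * headroom
    return estimated_memory_gb(n_rows, n_columns) <= budget_gb


small = fits_in_default_notebook(10_000_000, 20)     # about 1.6 GB
large = fits_in_default_notebook(1_000_000_000, 50)  # about 400 GB
print(small, large)
```

If the estimate is well above the budget, consider reducing the table in a transform first, or configuring a larger notebook.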

Notebooks come pre-populated with some starter code you can use to import data, and the API docs contain comprehensive documentation and further examples.
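
For orientation, loading a notebook's source table in Python might look something like the sketch below. It assumes the redivis Python library's table(...).to_pandas_dataframe() call and the "_source_" identifier for the notebook's source table; check the starter code in your own notebook and the API docs for the exact form. So that the sketch also runs outside Redivis, it falls back to a few hypothetical sample rows when the library is not installed.

```python
def load_records(table_name="_source_"):
    """Return a table as a list of dicts, one per row."""
    try:
        import redivis  # expected to be available inside Redivis notebooks
    except ImportError:
        # Outside Redivis: hypothetical sample rows so the sketch still runs.
        return [
            {"station": "A", "temp_c": 11.5},
            {"station": "B", "temp_c": 14.0},
            {"station": "A", "temp_c": 12.5},
        ]
    # Assumed API: reference the table, then load it as a pandas DataFrame.
    df = redivis.table(table_name).to_pandas_dataframe()
    return df.to_dict("records")


records = load_records()

# A first look at the data: row count and variable names.
n_rows = len(records)
columns = sorted(records[0].keys()) if records else []
print(n_rows, columns)
```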

From here, how you analyze and visualize your data is up to you. Once you’ve finalized your notebook, you can export it in different formats to share your findings.

To learn more about analyzing data, see our Work with data in notebooks guide.

4. Share and collaborate

You can share your in-progress work or finished results with collaborators by sharing this workflow.

Researchers can work side by side in this workflow in real time. Leave comments to communicate, and see a visual cue for what each person is working on. You can even collaborate within a running notebook at the same time.

If any of the data in your workflow is restricted, your collaborator must also have access to those datasets in order to view their derivatives within your workflow.

Next steps

Redivis workflows are built for collaboration and include real-time visuals to see where collaborators with edit access are working in the workflow, and a comments interface to discuss changes asynchronously.

Share your workflow to work with collaborators in real time, and make it public so that others can fork it and build upon your work.

Export data

If you'd like to export data to a different system, you can download it in various file formats, reference it in Python or R, or visualize it in tools such as Google Data Studio.

Learn more in the Export to other environments guide.

Browse our example workflows

Redivis workflows excel at working with large tables, whether it's filtering and joining, complex aggregation and date manipulation, or visualization and analysis.

Learn more in the Example workflows guide.

Last updated 8 months ago