Analyze data in a project
Projects are where you work with data on Redivis. In a project you can query, merge, reshape, and analyze any datasets that you have access to, all from within your web browser.
In a project, you can construct reproducible data transformations and analyses, and share and collaborate with your peers in real time.
Add a dataset to a new or existing project from any Dataset page where you have "Data access" by clicking the Analyze in project button.
Within a project, you can navigate between entities on the left side of the screen and inspect them further in the right panel. To inspect your dataset, click on any table to see its cells and summary statistics.
To add more data to this project you can click the Add dataset button in the top left of the project toolbar.
You can find this project later by going back to your workspace.
Transforming tables is a crucial step in working with data on Redivis. Conceptually, transforms execute a query on source table(s), whose results are materialized in a new output table. In most cases you'll want to use transforms to reshape your data to contain the information you're interested in, before analyzing that table in a notebook or exporting it for further use.
To create a transform, select a table in this dataset and click the +Transform button. From here, you can build a query through the point-and-click interface, or write SQL directly by adding a SQL step.
For all transforms you will need to select which variables you want to keep in your output table. The rest of the steps are up to you. Some common operations you can get started with include:
Joining in any other dataset table or output table in this project
Filtering records to match defined parameters
Renaming variables or changing their type
Aggregating data
Once you've built your query, execute it by clicking the Run button in the top right of the transform. This will create a new output table. To inspect the output of your query, click on the table beneath the transform in the map and make sure it contains the data you expect.
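The idea of a transform (a query over source tables whose results are materialized in a new output table) can be sketched outside of Redivis. The example below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for the project, and the tables visits and patients, their columns, and their values are all hypothetical. It is not Redivis's query engine or SQL dialect.

```python
import sqlite3

# An in-memory SQLite database stands in for a project.
# Assumption: this is an illustration only, not how Redivis runs transforms.
conn = sqlite3.connect(":memory:")

# Two hypothetical source tables with made-up values.
conn.executescript("""
CREATE TABLE visits (patient_id INTEGER, visit_year INTEGER, cost REAL);
CREATE TABLE patients (patient_id INTEGER, state TEXT);
INSERT INTO visits VALUES (1, 2020, 100.0), (1, 2021, 250.0),
                          (2, 2021, 80.0), (3, 2019, 40.0);
INSERT INTO patients VALUES (1, 'CA'), (2, 'NY'), (3, 'CA');
""")

# The "transform": join, filter, rename, and aggregate in one query,
# then materialize the result as a new table named output.
conn.execute("""
CREATE TABLE output AS
SELECT p.state     AS patient_state,  -- rename a variable
       COUNT(*)    AS n_visits,       -- aggregate
       SUM(v.cost) AS total_cost
FROM visits AS v
JOIN patients AS p ON p.patient_id = v.patient_id  -- join
WHERE v.visit_year >= 2020                         -- filter
GROUP BY p.state
""")

# Inspect the output table, as you would by clicking on it in the map.
rows = conn.execute("SELECT * FROM output ORDER BY patient_state").fetchall()
print(rows)
```

Only the variables named in the SELECT clause appear in the output table, which mirrors the requirement to choose which variables to keep.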
From here you can create a new transform from this output table to continue reshaping your data, or go back to your original transform to make changes and rerun it.
As you become more familiar with transforms, you can start doing more advanced work such as geospatial joins, complex aggregations, and statistical analyses.
Learn more in the Reshape data in transform guide.
Once you have a table you're ready to analyze, you can select any table and click the + Notebook button to create a notebook that references this table.
Notebooks are available in Python and R, as well as Stata or SAS (with a corresponding license). Notebooks come pre-installed with common libraries in the data science toolkit, but you can also customize the notebook’s dependencies and startup script to create a custom, reproducible analysis environment that meets your needs.
The default notebook configuration is free, and provides access to 2 CPUs and 32GB working memory, alongside a 60GB (SSD) disk and gigabit network. The computational power of these default notebooks is comparable to most personal computers, and will be more than enough for many analyses.
If you're working with larger tables, creating an ML model, or performing other particularly intensive tasks, you may choose to configure additional compute resources for the notebook. These environments are billed at an hourly rate that depends on the configuration you choose, and require you to purchase compute credits on your account.
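To get a rough sense of whether a table will fit in the default notebook's memory, a back-of-the-envelope estimate can help. The sketch below is not a Redivis tool. It assumes 8 bytes per value (typical for 64-bit numbers; strings often take more) and a rule-of-thumb headroom of half the available memory, since analyses commonly make intermediate copies. The function names are hypothetical.

```python
# The 32GB figure is the default notebook memory stated in this guide.
DEFAULT_MEMORY_GB = 32


def estimated_size_gb(n_rows, n_columns, bytes_per_value=8):
    """Rough in-memory size of a table in GB.

    Assumption: every value takes bytes_per_value bytes.
    """
    return n_rows * n_columns * bytes_per_value / 1e9


def fits_in_default_notebook(n_rows, n_columns, headroom=0.5):
    """True if the estimate uses at most `headroom` of the default memory.

    Assumption: headroom=0.5 is a rule of thumb, not a documented limit.
    """
    return estimated_size_gb(n_rows, n_columns) <= DEFAULT_MEMORY_GB * headroom


# 10 million rows x 50 columns: about 4 GB, comfortably within the default.
small = fits_in_default_notebook(10_000_000, 50)

# 1 billion rows x 20 columns: about 160 GB, so reshape the table in a
# transform first or configure additional compute resources.
large = fits_in_default_notebook(1_000_000_000, 20)

print(small, large)
```

When the estimate is too large, narrowing the table in a transform before opening it in a notebook is often the simpler option.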
Notebooks come pre-populated with some starter code you can use to import data, and the API docs contain comprehensive documentation and further examples.
From here, how you analyze and visualize your data is up to you. Once you’ve finalized your notebook, you can export it in different formats to share your findings.
To learn more about analyzing data, see our Work with data in notebooks guide.
You can share your in-progress work or finished results with collaborators by sharing this project.
Researchers can work side by side in this project in real-time. Leave comments to communicate, and see a visual cue for what each person is working on. You can even collaborate within a running notebook at the same time.
If any of the data in your project is restricted, your collaborator must also have access to those datasets in order to view their derivatives within your project.
Redivis projects are built for collaboration and include real-time visuals to see where collaborators with edit access are working in the project, and a comments interface to discuss changes asynchronously.
Share your project to work with collaborators in real time, and make it public so that others can fork off of and build upon your work.
If you'd like to export data to a different system, you can download it in various file formats, reference it in Python or R, or visualize it in tools such as Google Data Studio.
Learn more in the Export to other environments guide.
Redivis projects excel at working with large tables, whether it's filtering and joining, complex aggregation and date manipulation, or visualization and analysis.
Learn more in the Example projects guide.