Table nodes

Table nodes in your project are either generated by running a transform or are part of a dataset you've added. The main purpose of table nodes in projects is to store data that you can sanity check and query further.
Derivative tables in a project behave quite similarly to dataset tables throughout Redivis, where you can preview cells, view summary statistics, and run quick queries.


All table nodes have one upstream parent. You can view the table's data and metadata similarly to other tables on Redivis. You cannot edit or update the metadata here.
You can create multiple transforms to operate on a single table, which allows you to create various branches within your project. To create a new transform, select the table or dataset and click the small + icon that appears under the bottom right corner of the node.

Sanity check output

After you run a transform, you can investigate the downstream output table to get feedback on the success and validity of your querying operation – both the filtering criteria you've applied and the new features you've created.
Understanding the contents of an output table allows you to perform important sanity checks at each step of your research process, answering questions like:
  • Did my filtering criteria remove the rows I expected?
  • Do my new variables contain the information I expect?
  • Does the distribution of values in a given variable make sense?
  • Have I dropped unnecessary variables?
To sanity check the contents of a table node, you can inspect the general table characteristics, check the summary statistics of different variables, look at the table's cells, or create a notebook for more in-depth analysis.
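
For the notebook route, the questions above can be expressed as a few short checks. The following is a minimal sketch in Python with pandas, not a Redivis-specific recipe: it assumes the transform's source and output tables are already loaded as DataFrames, and it uses small synthetic data in their place. The variable names (patient_id, age, visit_cost, internal_note, is_senior) and the sanity_check helper are hypothetical and exist only for illustration.

```python
import pandas as pd

# Synthetic stand-in for the table upstream of a transform (hypothetical data).
source = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5],
    "age": [34, 17, 52, 71, 15],
    "visit_cost": [120.0, 80.0, 310.0, 95.0, 60.0],
    "internal_note": ["a", "b", "c", "d", "e"],
})

# Stand-in for the transform itself: keep adults, add a new variable,
# and drop a variable that is no longer needed.
output = (
    source[source["age"] >= 18]
    .assign(is_senior=lambda df: df["age"] >= 65)
    .drop(columns=["internal_note"])
)


def sanity_check(source: pd.DataFrame, output: pd.DataFrame) -> dict:
    """Summarize how the output table differs from its source (hypothetical helper)."""
    return {
        # Did my filtering criteria remove the rows I expected?
        "rows_removed": len(source) - len(output),
        "no_minors_remain": bool((output["age"] >= 18).all()),
        # Do my new variables contain the information I expect?
        "new_variable_has_no_nulls": bool(output["is_senior"].notna().all()),
        # Have I dropped unnecessary variables?
        "dropped_variables": sorted(set(source.columns) - set(output.columns)),
    }


report = sanity_check(source, output)
print(report)

# Does the distribution of values in a given variable make sense?
print(output["visit_cost"].describe())
```

Comparing the output against its source, rather than inspecting the output alone, makes it easier to tell whether a surprising row count comes from the filter or from the data itself.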


If you haven't interacted with tables in your project for a while, these tables may become archived, which will temporarily limit your ability to view cells and query data in that table. This is done to prevent runaway storage costs, while leveraging the built-in reproducibility of projects to allow you to unarchive the table and pick up where you last left off.
The archival algorithm prioritizes tables that are large, quick to regenerate, and intermediary (not at the bottom of the tree). It currently does not archive tables smaller than 1 GB, so in many cases you may never encounter an archived table.
If a table is archived, you can still see the name, row count, and variable names/types. To access variable summary statistics, view cells, or run transforms downstream of an archived table, you'll have to reconstitute the table by re-running upstream transforms.
Note that if the transform immediately upstream (or any additional upstream transform, if multiple sequential tables are archived) is invalid, you'll have to resolve the invalid state before un-archiving the table.

File index tables

If you've added a dataset that contains files (storage for non-tabular data) to a project, you will see a table with a File index label in that dataset's list of tables. This is an automatically generated table in which every row represents one file. You can work with this table just like any other in the project tool, and the file_id variable will remain linked to the files in that dataset for use in a notebook.