Custom data storage

Overview

Custom data storage allows organizations to store the data in their Redivis datasets within their own Google Cloud environments, while continuing to take advantage of the Redivis platform for dataset discovery, access management, and querying.

Generally, storing your data in the default Redivis location is the most cost-effective and secure option. However, custom storage locations allow for increased control and visibility, which may be necessary to meet certain compliance requirements.

Custom storage locations are an advanced feature that introduces new operational requirements for your organization and may have data security ramifications. Please make sure you understand these requirements before using custom data storage.

Creating a storage location

You can create a new storage location from the Storage locations tab within your organization's settings.

Each storage location must correspond to a unique Google Cloud project. You can also give it a separate nickname in order to easily remember that location's role and purpose. All datasets assigned to this storage location will have their data stored within BigQuery datasets (and tables) in the corresponding Google Cloud project.

Authorizing a storage location

In order for Redivis to create and operate on data assigned to this storage location, you must grant BigQuery administrator privileges to the corresponding Redivis-managed service account. Follow the on-screen instructions for configuring these permissions, either through the Google Cloud console or command-line interface.
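As a rough illustration of the command-line route, the sketch below assembles a `gcloud projects add-iam-policy-binding` call that grants the predefined `roles/bigquery.admin` role on the project. The project ID and service account email are hypothetical placeholders, and treating `roles/bigquery.admin` as the role Redivis expects is an assumption; the on-screen instructions remain the authoritative source.

```python
import shlex


def build_grant_command(project_id: str, service_account_email: str) -> list[str]:
    """Assemble a gcloud command that grants BigQuery Admin on a project.

    Assumption: roles/bigquery.admin matches the privileges Redivis asks for.
    Check the on-screen instructions before running anything.
    """
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=serviceAccount:{service_account_email}",
        "--role=roles/bigquery.admin",
    ]


# Hypothetical values; copy the real ones from the Redivis authorization screen.
command = build_grant_command(
    "my-storage-project",
    "redivis-example@example-project.iam.gserviceaccount.com",
)

# Print the command for review instead of executing it.
print(shlex.join(command))
```

Printing the command rather than running it keeps the sketch free of side effects; you can compare the output against the instructions shown by Redivis before pasting it into a terminal.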

Redivis will only ever access Redivis-managed datasets within this Google Cloud project, and only when an authorized Redivis user is ingesting, querying, or exporting data, pursuant to the access controls defined on Redivis. Each dataset and table will be named with a unique, persistent, Redivis-specific identifier. While not required, it is strongly recommended that any project used as a custom storage location store only Redivis data within BigQuery, rather than intermingling Redivis-managed BigQuery datasets with other BigQuery resources created directly within your GCP project.

Make sure that you don't accidentally revoke this service account's access in the future. If you do, users will no longer be able to access, query, or upload to any datasets in this storage location.
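One way to guard against an accidental revocation is to check the project's IAM policy periodically. The sketch below assumes the policy has been exported as JSON (for example with `gcloud projects get-iam-policy PROJECT_ID --format=json`) and looks for the service account in a `roles/bigquery.admin` binding. The email address, the sample policy, and the choice of role are hypothetical, and the check ignores roles inherited from a folder or organization.

```python
def service_account_has_role(
    policy: dict,
    service_account_email: str,
    role: str = "roles/bigquery.admin",  # assumed role; confirm against Redivis instructions
) -> bool:
    """Return True if the service account is a direct member of a binding for `role`."""
    member = f"serviceAccount:{service_account_email}"
    return any(
        binding.get("role") == role and member in binding.get("members", [])
        for binding in policy.get("bindings", [])
    )


# Hypothetical policy, shaped like the JSON that gcloud prints for a project.
sample_policy = {
    "bindings": [
        {
            "role": "roles/bigquery.admin",
            "members": [
                "serviceAccount:redivis-example@example-project.iam.gserviceaccount.com"
            ],
        },
        {"role": "roles/viewer", "members": ["user:analyst@example.org"]},
    ]
}

still_authorized = service_account_has_role(
    sample_policy, "redivis-example@example-project.iam.gserviceaccount.com"
)
print("Service account still authorized:", still_authorized)
```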

Authorizing projects in a VPC

If your organization utilizes a Virtual Private Cloud (VPC), you will need to take additional steps to authorize data communication between Redivis and your VPC. Specifically, you will need to create a service perimeter with the appropriate data ingress and egress rules.

After performing the initial authorization steps outlined above, click the "Validate" button. If the project is within a VPC, you will be taken to a secondary screen outlining the steps necessary to configure the service perimeter. After following these steps, you will be able to fully validate your configuration and create the custom storage location.

Assigning storage locations

To store datasets in a custom storage location, navigate to the desired dataset through the "Datasets" page in the organization administration panel. Then select "Manage storage location" from the dropdown under the dataset's name.

From here, you can choose the storage location you'd like to use for this dataset. All existing data will be transferred to this new location, and all data uploaded to this dataset will now be stored in the Google Cloud project corresponding to the selected storage location. You can change the storage location at any time to move data to a different Google Cloud project or back to the default Redivis storage location.

Management and security

Access to and use of datasets whose data is stored in a custom storage location is similar to that of any other Redivis dataset; only authorized users can access data through Redivis interfaces.

However, it is up to the owner(s) of any Google Cloud project(s) used to store Redivis data to secure any raw data stored there. Anyone with BigQuery access to this Google Cloud project will be able to bypass Redivis access controls and view, query, and export all datasets stored in the location. It is important that access to this Google Cloud project is appropriately scoped.
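To review who could reach the raw data directly, it can help to list every principal that holds a BigQuery role or a broad basic role on the project. The following is a minimal sketch that assumes the IAM policy is available as JSON (for instance from `gcloud projects get-iam-policy PROJECT_ID --format=json`). The sample policy and member names are hypothetical, and the check does not cover custom roles, dataset-level grants, or roles inherited from a folder or organization.

```python
# Basic roles that also carry BigQuery permissions on the project.
BROAD_ROLES = {"roles/owner", "roles/editor", "roles/viewer"}


def members_with_bigquery_access(policy: dict) -> dict[str, set[str]]:
    """Map each member to the BigQuery-related roles it holds directly on the project."""
    access: dict[str, set[str]] = {}
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        if role.startswith("roles/bigquery.") or role in BROAD_ROLES:
            for member in binding.get("members", []):
                access.setdefault(member, set()).add(role)
    return access


# Hypothetical policy for illustration only.
sample_policy = {
    "bindings": [
        {"role": "roles/bigquery.dataViewer", "members": ["user:analyst@example.org"]},
        {"role": "roles/editor", "members": ["user:admin@example.org"]},
        {"role": "roles/storage.objectViewer", "members": ["user:other@example.org"]},
    ]
}

report = members_with_bigquery_access(sample_policy)
for member, roles in sorted(report.items()):
    print(member, sorted(roles))
```

Any member in the report other than the Redivis-managed service account and your intended administrators is worth a closer look.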

Redivis does not back up data stored in any custom storage location. Any deletion of BigQuery datasets or tables through the Google Cloud console will cause permanent data loss.

Billing

Redivis will bill you at a reduced rate for data stored in any custom location, though you may separately incur Google Cloud charges for storing that data. Additional information about reduced rates and cost management is available in the Billing reference.
