File

class File

An interface for working with raw files stored on Redivis.

Redivis automatically registers itself with fsspec on import. If you are working with an fsspec-compatible library, you can use the Redivis file URI scheme to reference files directly:

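import fsspec
import redivis  # importing redivis registers the redivis:// scheme with fsspec
with fsspec.open("redivis://table_reference/path/to/file") as f:
    data = f.read()  # use f like any other file object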

Constructors

Reference a file within a table.

Reference a file within a query result.

Get a file (or directory) within a directory.

Directory.list([max_results, *, ...])

List files (and/or directories) within a directory

Query.list_files([max_results, *, ...])

List files contained within a query result. The query result must contain at least one file_id variable.

Table.list_files([max_results, *, ...])

List files contained within a file index table. The table must contain at least one file_id variable.

Examples

import pystac
import redivis
from io import TextIOWrapper
from PIL import Image

# See https://redivis.com/datasets/yz1s-d09009dbb/files for example data
table = redivis.table("demo.example_data_files:yz1s:v1_3.example_file_types:4c10")
text_file = table.file("pandas_core.py")
image_file = table.file("bogota.tiff")

## Read file contents
text = text_file.read(as_text=True)  # avoid shadowing the built-in str
data = image_file.read()  # avoid shadowing the built-in bytes

## Open the file, as if it were on the filesystem
with image_file.open("rb") as f:
    f.read(100) # read 100 bytes

with text_file.open() as f:
    f.readline() # read first line
  
# Tools that integrate with fsspec can open Redivis URIs:
pystac.Catalog.from_file("redivis://table_ref/stac/catalog.json")
  
Image.open(table.file("bogota.tiff")) # PIL will automatically call open() on the file
  
## Download the file  
image_file.download("./path") # will be downloaded as ./path/bogota.tiff
text_file.download("./path/renamed.txt") # will be downloaded as ./path/renamed.txt

Attributes

directory

A reference to the associated Directory for this file.

id

The globally unique identifier for the file, as a string.

name

The name of the file as a string, without any directory subpaths. Same as file.path.name.

path

The full path of the file, as a pathlib.Path.

query

A reference to the Query from which this file was loaded. Either this or table will be present.

properties

A dict containing properties associated with the file. This will always contain the following properties, derived from the file's original index table:

  • file_id (str): The globally unique id of the file

  • file_name (str): The full name of the file, including any extensions

  • size (int): The size of the file, in bytes

  • added_at (datetime): When the file was initially uploaded to Redivis

  • md5_hash (str): The md5 checksum of the file, as a base64 string

Additionally, if the file was loaded from a table or query with additional variables, those variables' values will exist in properties.
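Because md5_hash is described as a base64 string rather than the more familiar hex digest, comparing it against a locally computed checksum takes one extra encoding step. The sketch below assumes the value is the raw 16-byte MD5 digest encoded with the standard base64 alphabet (with padding); md5_base64 is a hypothetical helper, not part of the Redivis client.

```python
import base64
import hashlib


def md5_base64(data: bytes) -> str:
    """Return the MD5 digest of data, encoded as a standard base64 string."""
    digest = hashlib.md5(data).digest()  # raw 16-byte digest, not hex
    return base64.b64encode(digest).decode("ascii")


print(md5_base64(b""))  # 1B2M2Y8AsgTpgAmY7PhCfg==

# Hypothetical integrity check against a file's properties
# (assumes the encoding described above):
#     data = image_file.read()
#     assert md5_base64(data) == image_file.properties["md5_hash"]
```

If the comparison fails for a file known to be intact, the encoding assumption is the first thing to revisit.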

table

A reference to the Table from which this file was loaded. Either this or query will be present.

Methods

file.download(path[, ...])

Download the file.

file.read(*[, as_text, start_byte, end_byte])

Read the file contents into memory, either as bytes (the default) or as a string if as_text=True.
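The start_byte and end_byte arguments make it possible to read a large file in pieces rather than all at once. The following is a minimal sketch of computing the ranges; byte_ranges is a hypothetical helper, and whether end_byte is inclusive is an assumption here (exposed as a flag), so it should be checked against the client's behavior before relying on it.

```python
def byte_ranges(size, chunk_size, inclusive_end=True):
    """Yield (start_byte, end_byte) pairs that together cover `size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, size, chunk_size):
        stop = min(start + chunk_size, size)  # exclusive upper bound
        yield (start, stop - 1) if inclusive_end else (start, stop)


print(list(byte_ranges(10, 4)))  # [(0, 3), (4, 7), (8, 9)]

# Hypothetical usage with a file (assumes an inclusive end_byte):
#     size = image_file.properties["size"]
#     for start, end in byte_ranges(size, 1024 * 1024):
#         chunk = image_file.read(start_byte=start, end_byte=end)
```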

file.open(*, [start_byte, end_byte])

Open the file as a BytesIO stream, as if it were located on disk.
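Since the stream is binary, text can be decoded incrementally by wrapping it in io.TextIOWrapper. The sketch below uses an in-memory BytesIO as a stand-in for the stream returned by file.open(), and assumes that stream implements the standard binary stream interface; first_lines is a hypothetical helper and UTF-8 is an assumed encoding.

```python
from io import BytesIO, TextIOWrapper
from itertools import islice


def first_lines(binary_stream, n, encoding="utf-8"):
    """Return the first n decoded lines of a binary stream, newlines stripped."""
    text_stream = TextIOWrapper(binary_stream, encoding=encoding)
    try:
        return [line.rstrip("\n") for line in islice(text_stream, n)]
    finally:
        # Detach so the wrapper does not close the caller's stream.
        text_stream.detach()


# BytesIO stands in for an opened file; with a real file this might be:
#     with text_file.open("rb") as f:
#         print(first_lines(f, 2))
sample = BytesIO(b"import pandas\nimport numpy\nprint('hi')\n")
print(first_lines(sample, 2))  # ['import pandas', 'import numpy']
```

Detaching the wrapper leaves the underlying stream open, so the surrounding with block remains responsible for closing it.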
