SAS notebooks
SAS notebooks are available for those researchers who are more comfortable using SAS and its ecosystem. These are built off the same base image as , but include the to allow for the execution of SAS in a notebook environment.
Working with SAS in a notebook environment is slightly different from working in the SAS desktop application, in that we need to use Python to exchange data with SAS. This step is quite simple and doesn't require any expertise in Python – see below.
Because SAS is proprietary software, you will need to have a licensed version of SAS 9.4 in order to enable SAS notebooks on Redivis. Organizations can specify license information in , which will make SAS notebooks available to all members of their organization. Alternatively, you can provide your own SAS license in .
SAS notebooks are based off the , and can combine SAS with Python and optional python dependencies to create novel workflows.
To further customize your compute environment, you can specify various dependencies by clicking the Dependencies button at the top-right of your notebook. Here you will see three tabs: Packages, pre_install.sh, and post_install.sh.
Use packages to specify the python packages that you would like to install. When adding a new package, it will be pinned to the latest version of that package, but you can specify another version if preferred.
To manage system dependencies, and for more complicated workflows, you can use the pre- and post-install shell scripts. These scripts are executed on either side of the Python package installation, and are used to execute arbitrary code in the shell. For example, you can use apt to install system packages (apt-get update && apt-get install -y <package>), or mamba to install from conda.
In order to load data into SAS, we first pull it into a data frame in python, and then pass that variable into SAS. If you're unfamiliar with python, you can just copy+paste the below into the first cell of your notebook to load the data in python.
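As a minimal sketch of this step, the helper below reads a table into a pandas data frame and copies it into SAS as a dataset named "df". It assumes the redivis and saspy packages are preinstalled in the notebook and that saspy.SASsession() picks up the notebook's SAS configuration. The function name load_table_into_sas and the table name "my_table" are placeholders, not part of the Redivis documentation.

```python
def load_table_into_sas(table_name, sas_table="df"):
    """Read a Redivis table into pandas, then copy it into a SAS session."""
    # Imported inside the function: both packages are assumed to be
    # available in a SAS notebook, but not necessarily elsewhere.
    import redivis
    import saspy

    # Assumption: the notebook's SAS configuration is found automatically.
    sas_session = saspy.SASsession()

    # Pull the table into a pandas data frame.
    df = redivis.table(table_name).to_pandas_dataframe()

    # df2sd copies the data frame into SAS as a dataset named `sas_table`.
    sas_data = sas_session.df2sd(df, table=sas_table)
    return sas_session, sas_data


# In the first notebook cell ("my_table" is a hypothetical table name):
# sas_session, sas_data = load_table_into_sas("my_table")
```

Keeping the session in a variable called sas_session matches the name used in the SAS cell prefix.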
Now that we have the dataset "df" in SAS, we can run SAS code against the data. To do so, we must prefix any SAS cell with the line %%SAS sas_session:
SAS offers some support for geospatial datatypes. However, we can't pass geospatial data from python natively, and instead need to first create a shapefile that can then be loaded into SAS.
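One way to produce the shapefile is sketched below. It assumes the table can be read with to_geopandas_dataframe() from the redivis package and then written with geopandas' to_file(). The function name write_shapefile, the table name "my_geo_table", and the output path are placeholders chosen for illustration.

```python
def write_shapefile(table_name, path="/scratch/geo.shp"):
    """Read a geospatial table into geopandas and write it out as a shapefile."""
    # Imported inside the function: redivis is assumed to be available
    # in the notebook, but not necessarily elsewhere.
    import redivis

    # Assumption: the table contains a geography variable.
    gdf = redivis.table(table_name).to_geopandas_dataframe()

    # A .shp extension makes geopandas write the ESRI Shapefile format.
    gdf.to_file(path)
    return path


# In a notebook cell ("my_geo_table" is a hypothetical table name):
# shp_path = write_shapefile("my_geo_table")
```

Note that the shapefile format limits field names to 10 characters, so longer variable names may be truncated on write.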
Next, we can load this shapefile via SAS:
If your data is too big to fit into memory, you may need to first download the data as a CSV, and then read that file into SAS:
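The download step might look like the sketch below. It assumes the redivis table object exposes a download() method that accepts a path and a format argument; treat that call, the function name download_table_csv, the table name "my_table", and the output path as assumptions to check against the redivis-python reference.

```python
def download_table_csv(table_name, path="/scratch/data.csv"):
    """Download a table to disk as a CSV without loading it into memory."""
    # Imported inside the function: redivis is assumed to be available
    # in the notebook, but not necessarily elsewhere.
    import redivis

    # Assumption: download() writes the table to `path` in the given format.
    redivis.table(table_name).download(path, format="csv")
    return path


# In a notebook cell ("my_table" is a hypothetical table name):
# csv_path = download_table_csv("my_table")
```

The returned path can then be referenced from a SAS cell that reads the CSV.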
To create an output table, we first need to pass our SAS data back to Python. We can then use the redivis.current_notebook().create_output_table() method in Python to output our data.
If an output table for the notebook already exists, it will be overwritten by default. You can pass append=True to append to the table rather than overwrite it. For the append to succeed, every variable in the appended data that is also present in the existing table must have the same type.
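The round trip back to Python could be sketched as follows. It assumes sas_session is the saspy session created when loading the data, and uses saspy's sd2df() to convert a SAS dataset into a pandas data frame. The function name output_sas_table and the dataset name "results" are placeholders.

```python
def output_sas_table(sas_session, sas_table, append=False):
    """Copy a SAS dataset back to pandas and write it as the notebook's output table."""
    # Imported inside the function: redivis is assumed to be available
    # in the notebook, but not necessarily elsewhere.
    import redivis

    # sd2df converts the named SAS dataset into a pandas data frame.
    df = sas_session.sd2df(sas_table)

    # append=False overwrites any existing output table; append=True adds rows.
    return redivis.current_notebook().create_output_table(df, append=append)


# In a notebook cell ("results" is a hypothetical SAS dataset name):
# output_sas_table(sas_session, "results")
# output_sas_table(sas_session, "results", append=True)
```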
As you perform your analysis, you may generate files and figures that are stored on the notebook's hard disk. There are two locations that you should write files to: /out for persistent storage, and /scratch for temporary storage.
Any files written to persistent storage will be available when the notebook is stopped, and will be restored to the same state when the notebook is run again. Alternatively, any files written to temporary storage will only exist for the duration of the current notebook session.
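As a small illustration of choosing between the two locations, the helper below writes a text file to either directory. The function name save_text and its parameters are hypothetical. The directory arguments default to the /out and /scratch paths described above and can be overridden when running outside a notebook.

```python
from pathlib import Path


def save_text(name, text, persistent=True, out_dir="/out", scratch_dir="/scratch"):
    """Write `text` to persistent or temporary notebook storage and return the path."""
    # Pick the persistent directory or the temporary one.
    base = Path(out_dir if persistent else scratch_dir)
    # Create the directory if it does not exist yet.
    base.mkdir(parents=True, exist_ok=True)
    target = base / name
    target.write_text(text)
    return target


# In a notebook cell:
# save_text("summary.txt", "model converged")            # kept across sessions
# save_text("debug.log", "iteration 3", persistent=False)  # current session only
```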
Consult the full samples of interchanging data between SAS and python in the .
Redivis notebooks offer the ability to materialize notebook outputs as a new table in your workflow. This table can then be processed by transforms, read into other notebooks, exported, or even .