# Populate metadata

## Dataset metadata

You can populate the following fields on a dataset's overview page:

<table data-header-hidden><thead><tr><th width="189.82421875"></th><th></th></tr></thead><tbody><tr><td><strong>Abstract</strong></td><td>The abstract is limited to 256 characters and will show up in previews and search results for the dataset. This should be a concise, high-level summary of this dataset.</td></tr><tr><td><strong>Provenance</strong></td><td>This section is intended to display information about where this dataset came from and how it came to be in its current form. Redivis will auto-populate fields where possible but you can add additional information or override it. More detail is available on the <a href="dois-and-provenance">DOIs and provenance</a> page.</td></tr><tr><td><strong>Supporting files</strong></td><td>Files of any type and up to 100MB can be uploaded to the dataset page where anyone with access can download them. These should not contain any data for this dataset, as access to them is managed separately. </td></tr><tr><td><strong>Links</strong></td><td>Links can be added with display names to direct someone to another URL with more information.</td></tr><tr><td><strong>License</strong></td><td><p>This is where you can add the license information about your dataset's redistribution policies. If this data is governed by a common redistribution license you can select it here from the menu of standard licenses. If you want to reference a license that isn't listed here you can include the link, or upload a custom license. This will be displayed on the dataset front page to let others know how they can use your data. This information will be included on the dataset's DOI.</p><p></p><p>Do you think a common license is missing? <a href="https://redivis.com/contact">Contact us</a> to let us know what you'd like to see here.</p></td></tr><tr><td><strong>Funding</strong></td><td>If this dataset was funded by an institution you'd like to recognize, this is the section where you can include information about funder(s). 
You'll need the funding organization's name and ROR, as well as an award number if applicable. You can add multiple funders to each dataset. This information will be included on the dataset's DOI.</td></tr><tr><td><strong>Contact</strong></td><td>This section should be used to let someone viewing this dataset know how to get in touch if there is any issue or question.</td></tr><tr><td><strong>Custom sections</strong></td><td><p>You can create documentation sections with their own titles and assign them custom access levels. <br></p><p>By default, all dataset documentation is visible to anyone with <a href="../../../data-access/access-levels#overview-access">overview access</a> to the dataset. However, there may be some content in the documentation that is sensitive — for example, information about named variables that would require metadata access.<br></p><p>To protect this information you can create a custom documentation section with a more restrictive access level. Users without the appropriate level of access will only see a placeholder for that section of the documentation.</p></td></tr><tr><td><strong>Tags</strong></td><td>In addition to documentation, you may add up to 25 tags to your dataset, which will help researchers discover and understand the dataset.</td></tr><tr><td><strong>Other metadata</strong></td><td>Additionally, information about the dataset's size and temporal range will be automatically computed from the <a href="../../data#table-characteristics">metadata on its tables.</a> Additional table documentation, as well as the <a href="../../../tables/variables#characteristics">variable metadata</a>, will be indexed and surfaced as part of the <a href="../../../../guides/getting-started#2-discover-data">dataset discovery process</a>.</td></tr></tbody></table>

## Variable metadata

Redivis determines variable names and types during [data upload](https://docs.redivis.com/reference/datasets/create-and-edit-datasets/import-tabular-data). Additionally, it will automatically parse certain metadata based on the uploaded file format:

* SAS (.sas7bdat): labels
* Stata (.dta): labels and value labels
* SPSS (.sav): labels and value labels

For other file types (e.g., CSV), you will need to add the metadata yourself. To apply metadata in bulk, you can upload a metadata file directly from your computer. This file can be either CSV or JSON.

{% hint style="info" %}
Is your metadata stuck in a PDF? We're truly sorry. If you can, please let the data provider know that it is *essential* that they provide metadata in a machine-readable format; hopefully in time this will change.

While you can just upload the PDF to the dataset's documentation, you'll be doing your researchers a huge service if you add structured metadata to the variables. That might mean some manual copying and pasting from the PDF, or you could try one of the various (and imperfect) online PDF-to-CSV conversion tools, or the [tabula-py](https://tabula-py.readthedocs.io/en/latest/) Python library.

If you don't have the bandwidth, consider asking your researchers to contribute by making them a [dataset editor](https://docs.redivis.com/data-access/access-levels#dataset-editor).
{% endhint %}

#### **CSV metadata format**

The CSV should have no header row. Each row corresponds to one variable: column 1 is the `name`, column 2 is the `label`, and column 3 is the `description`. If a variable doesn't have a label or description, leave those columns empty.

```
variable1_name,variable1_label,variable1_description
variable2_name,variable2_label,variable2_description
```

For example:

```
sex,patient sex,patient's recorded sex
id,patient identifier,unique patient identifier
```
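If your variable metadata already lives in a script or spreadsheet export, a file in this layout can be generated rather than typed by hand. The following is a minimal sketch using Python's standard `csv` module. The variable list, the `build_metadata_csv` helper, and the output filename are all hypothetical. It also assumes that standard CSV quoting (for labels or descriptions that contain commas) is accepted on upload, which this page does not state.

```python
import csv
import io

# Hypothetical variable metadata as (name, label, description) tuples.
variables = [
    ("sex", "patient sex", "patient's recorded sex"),
    ("id", "patient identifier", "unique patient identifier"),
    ("visit_dt", "", ""),  # no label or description: columns left empty
]


def build_metadata_csv(rows):
    """Return header-less CSV text with one row per variable."""
    buffer = io.StringIO()
    # csv.writer quotes fields containing commas or quotes automatically.
    writer = csv.writer(buffer, lineterminator="\n")
    for name, label, description in rows:
        writer.writerow([name, label, description])
    return buffer.getvalue()


csv_text = build_metadata_csv(variables)

# Write the file that would then be uploaded (filename is an assumption).
with open("variable_metadata.csv", "w", newline="", encoding="utf-8") as f:
    f.write(csv_text)
```

Using `csv.writer` rather than joining strings with commas keeps each variable on exactly three columns even when a description itself contains a comma.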

#### **JSON metadata format**

When uploading a JSON file, specify the `name`, `label`, `description`, and `valueLabels` using the appropriately named attributes in the object corresponding to each variable. If the variable doesn't have a label, description, or value labels, you don't need to include these attributes.

For example:

```javascript
// JSON format is an array of objects, with each object representing a variable
[
    {
        "name": "sex",
        "label": "patient sex",
        "description": "patient's recorded sex",
        "valueLabels": [
            {
                "value": 1,
                "label": "Male"
            },
            {
                "value": 2,
                "label": "Female"
            }
        ]
    }
]
```

{% hint style="info" %}
To upload value labels in bulk, you must use the JSON format. We no longer support bulk upload of value labels via CSV.
{% endhint %}
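A JSON metadata file in the shape described above can also be assembled programmatically. This is a minimal sketch using Python's standard `json` module. The `variable_entry` helper, the sample variables, and the output filename are hypothetical; only the attribute names (`name`, `label`, `description`, `valueLabels`, and the `value`/`label` pairs) come from the format shown on this page.

```python
import json


def variable_entry(name, label=None, description=None, value_labels=None):
    """Build one variable object, omitting attributes that are not set."""
    entry = {"name": name}
    if label:
        entry["label"] = label
    if description:
        entry["description"] = description
    if value_labels:
        # value_labels maps each raw value to its human-readable label.
        entry["valueLabels"] = [
            {"value": value, "label": text}
            for value, text in value_labels.items()
        ]
    return entry


# Hypothetical variables; the first mirrors the example above.
metadata = [
    variable_entry(
        "sex",
        label="patient sex",
        description="patient's recorded sex",
        value_labels={1: "Male", 2: "Female"},
    ),
    variable_entry("id", "patient identifier", "unique patient identifier"),
]

# Write the array of variable objects (filename is an assumption).
with open("variable_metadata.json", "w", encoding="utf-8") as f:
    json.dump(metadata, f, indent=4)
```

Omitting unset attributes, rather than writing empty strings, matches the note that missing labels, descriptions, or value labels don't need to be included.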
