Select first/last encounter

Selecting first / last encounter

E.g.: I want the date of the first and last hospital admission for every patient in my table
Let's take an example where we have a cohort of hospital patients, where in addition to a unique identifier we have a timestamp / date variable for all of their encounters. For each of these patients, we want to find the first and last encounter — or more formally, the min and max encounter date for that patient
Any time we want to compute aggregate variables for a particular subset of our data, we have a pretty strong indicator to use partition clause(s) within a transform. Partitioned queries allow for us to compute aggregate values on any arbitrary partition that we define on our table, with a partition being a unique combination of one or more variables. For example, we could use the SUM partition method, partitioning on patient_id and year , to take the sum of a cost variable to create a new variable that represents the total cost for each patient over a given year.
For this particular case, we want to find the MIN and MAX of the encounter_date of each patient; in other words, partitioned on patient_id . This will create two new variables, min_encounter_date and max_encounter_date for each patient. When we run this query, note that no records will be dropped; rather, these new variables will be have consistent values for any given patient_id .
patient_id
encounter_date
min_encounter_date
max_encounter_date
1
2012-01-01
2010-01-01
2013-01-01
1
2011-01-01
2010-01-01
2013-01-01
1
2010-01-01
2010-01-01
2013-01-01
1
2013-01-01
2010-01-01
2013-01-01
2
2009-01-01
2008-01-01
2015-01-01
2
2008-01-01
2008-01-01
2015-01-01
2
2015-01-01
2008-01-01
2015-01-01
Now, let's say we only want each patient's first and last encounter. We can do this easily be creating a new row filter, choosing records where min_encounter_date = encounter_date OR max_encounter_date = encounter_date !
We can also generalize this approach to find the Nth encounter for each patient — take a look at the RANK, DENSE_RANK, and ROW_NUMBER methods. And if we want to be more specific on our partition (e.g., first / last encounter in a given calendar year), we can always apply additional variables (year ) when defining our partition.
Export as PDF
Copy link