Monday, September 19, 2011

A Readmissions Data Model, Mk II (part 3)

This is part three in a series of thoughts about how to track these readmission measures, you can read part 1 and part 2.

So after a lot of thought, struggle and some guidance from a very wise data modeler I had the opportunity to work with on another project I've come up with some ideas on how to manage this in various time periods. It's not trivial by any stretch of the imagination, but it is doable. While I don't have my schema fully designed yet, I've been able to test portions of it and it seems to work well. I have tested this basic design with SQL Server Reporting Services, Microstrategy, and Information Builders WebFocus. It appears to function well in all three environments.

The very first thing that needs to be done is to clean up and stage the data, in our case we are downloading the raw data from our EMR and storing it in SQL Server. Then I am using SSIS with stored procedures to go over and manipulate the data into the shape I need. I start by pulling every visit by encounter_id and I join that to itself by the medical_record_id where the prior visit encounter_id does not equal the current encounter_id and the date of the prior visit is less than or equal to the current visit, and the prior visit is an inpatient stay. This meets our basic needs, but I still need to iterate over the data again getting the max(prior_discharge_date) for each current.encounter_id and then filtering this out. I find that a series of temporary tables make this work much easier. Don't try to do this in one single statement loop over and remove the bits you need to from the dataset each time until you have a list of current and matching prior encounter_id fields.

Now the easy part is to just use this map (you did remember to store the output in a real and not temporary table right?) and link a current encounter to a prior encounter. I wind up making a very wide table (or view) that contains all the data elements I'm interested in including a bunch of calculated fields that I called things like [3day], [7day], [10day] and [30day]. Finally I populate those fields like so:

CASE WHEN datediff(dd,prior.discharge_date,current.admit_date) between 0 and 3 THEN 1 ELSE 0 END

Finally just open this table in your reporting app and merge with your date dimension table and then SUM on the above described calculated fields to get the number of patients in each bucket. In my case I'm interested in Fiscal_Month and Financial_Class buckets so I SUM those fields against those buckets and get an accurate and fast count of patients returning.

To make this model even more functional I decided to add another calculated field that I called [PatientCount] which simple equals 1.0 for each encounter_id. Then I can use this field to count patient visits in each bucket and give my users multiple ratio/percentage calculations on the same report.

While this is a lot more prep work than my first idea of using MS SSAS, it's a lot simpler to actually accomplish and since most of the execution is run by the stored procedures before the users ever start their work it's also a lot faster at runtime.

Let me know if you fine this useful or if anyone wants to discuss this further.