Why are 5% of Services Occurring on January 1, 1960?
Claims data passed as an extract to a vendor, state agency, or at-risk provider has usually been transferred more than half a dozen times.
It begins its journey as billing data sent from the provider to the payer. Then it is transferred to the adjudication system (frequently this data goes back and forth a few times before it is finalized, but that’s another blog post). Adjudicated data becomes paid data via the payment system. Within the payer organization this data passes through some kind of financial or general ledger accounting system. Usually it is then transferred to a ‘warehouse’ for analysis. Lastly, the data is extracted and transferred for a specific purpose. At each of these steps data is passed from one data processing system to another, and in healthcare some of these are legacy systems built on older platforms. Date fields (I’m using ‘date’ here as shorthand for both date and date-time data) are one area where this process can lead to problems.
Think about it this way. Dates are usually stored as a number indicating how many units (days or seconds, for example) have elapsed since a particular reference day. Importantly, a numeric date of 0 indicates not the absence of an amount (as it would with a standard numeric field) but an actual day. Which day that is depends on the database or software storing the data:
MS SQL SERVER: 1900-01-01 00:00:00.000
MYSQL: 0000-00-00 00:00:00.000
EXCEL: 1900-01-00
SAS: 1960-01-01
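As a rough sketch of the arithmetic, the Python below turns a stored day count into a calendar date under two of the day-zero dates listed above. The function name numeric_to_date and the EPOCHS table are made up for illustration, and the sketch assumes the number is a count of days. Excel and MySQL are left out because their zero values (1900-01-00 and 0000-00-00) are not real calendar days that Python's date type can represent.

```python
from datetime import date, timedelta

# Day-zero reference dates taken from the list above.
# EPOCHS and numeric_to_date are illustrative names, not part of any system.
EPOCHS = {
    "SAS": date(1960, 1, 1),
    "MS SQL SERVER": date(1900, 1, 1),
}

def numeric_to_date(days, system):
    """Interpret a day count as a calendar date under the given system's day zero."""
    return EPOCHS[system] + timedelta(days=days)

print(numeric_to_date(0, "SAS"))            # 1960-01-01
print(numeric_to_date(0, "MS SQL SERVER"))  # 1900-01-01
print(numeric_to_date(21915, "SAS"))        # 2020-01-01
```

The same stored 0 lands sixty years apart depending on which system reads it, which is why a 0 that was never meant to be a date is so easy to mistake for one.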
The SAS ‘0’ is an important date because many legacy warehouses, and especially data extraction programs, run on SAS. As data gets passed back and forth, a null date can get converted to 0, and what was missing becomes an actual date. So quality check processes that only look for “null” in the date field can miss these important omissions. We’ll explore this more in future posts.
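Here is a minimal sketch of a quality check that treats day-zero dates as suspect in addition to nulls. The names SENTINEL_DATES, is_suspect_date, and suspect_share are hypothetical, the sample admit dates are invented, and which day-zero dates belong on the list is an assumption that depends on the systems your data has passed through.

```python
from datetime import date

# Day-zero dates that may be nulls in disguise (an assumed list; extend as needed).
SENTINEL_DATES = {date(1960, 1, 1), date(1900, 1, 1)}

def is_suspect_date(value):
    """True if the value is missing or matches a known day-zero date."""
    return value is None or value in SENTINEL_DATES

def suspect_share(values):
    """Fraction of values that are missing or day-zero dates."""
    values = list(values)
    if not values:
        return 0.0
    return sum(is_suspect_date(v) for v in values) / len(values)

# Invented admit dates: one true null and one null that became SAS day zero.
admit_dates = [date(2015, 3, 1), None, date(1960, 1, 1), date(2015, 3, 2)]
print(suspect_share(admit_dates))  # 0.5
```

A null-only check would report one missing value in four here instead of two. For a field like date of birth, where January 1, 1960 can be a real value, a flag like this is a prompt for review rather than proof of a missing date.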
This happens with many fields, but most frequently with admit date, discharge date, and adjudication date, as shown in this dataset. It is a problem for trending data and particularly for calculating average length of stay.
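To make the length-of-stay problem concrete, here is a small sketch using invented admit and discharge dates, where the third discharge date is assumed to be a null that turned into SAS day zero. The names stays, average_los, and SAS_ZERO are illustrative only.

```python
from datetime import date

SAS_ZERO = date(1960, 1, 1)

# Invented (admit, discharge) pairs; the third discharge date stands in
# for a null that was converted to SAS day zero.
stays = [
    (date(2015, 3, 1), date(2015, 3, 4)),
    (date(2015, 3, 10), date(2015, 3, 12)),
    (date(2015, 3, 20), SAS_ZERO),
]

def average_los(pairs):
    """Mean length of stay in days across (admit, discharge) pairs."""
    days = [(discharge - admit).days for admit, discharge in pairs]
    return sum(days) / len(days)

# Average with the day-zero record left in, then with it excluded.
naive = average_los(stays)
clean = average_los([(a, d) for a, d in stays if SAS_ZERO not in (a, d)])

print(naive)  # a large negative number of days
print(clean)  # 2.5
```

One bad record in three drags the average from 2.5 days to a large negative number. In a real extract the effect is smaller per record but harder to spot, because it is spread across thousands of stays.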
To see an example of claims data with this problem, go to the APCD Journal.