Adventures in Computing, Social Security Edition

One-liner
"Which dataset is Elon referring to when he says “the SS database”? I don’t know. I’m not sure he knows."
Report Excerpt

[I]n 1961 the millions of already existing paper SSN records had to be imported into newly invented computer databases. This was done through a very similar process to the creation of the records originally. Some human would grab an existing paper SSN record, and type on a keyboard to enter that data into the computer database (an activity sensibly called “data entry”). But what does a data entry person do when they have a record with a DOB from 1787? They might figure it was just a transposition error, and they’d correct it to 1877. Or they might input it as 1787 because, hey, that’s what the record says. Or they might leave it blank, recognizing it to be erroneous. Or maybe enter it as 6-6-0000, or 6-6-XXXX. I wasn’t there, but I imagine people probably did all of the above and the choice probably varied from office to office and even person to person….

Given this error-prone, fallible system, it’s not at all surprising that some of the data would be erroneous. Anyone who knows anything about how things work in the real world would see these inconsistencies in, say, 2025, and think “ah, that’s probably due to data entry errors,” unless you are the richest and smartest man in the world, in which case you would conclude it’s part of some Marxist neoliberal conspiracy to give lazy people mansions, or something.
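To make the data entry problem concrete, here is a minimal Python sketch of how the variants described above might be sorted into buckets after the fact. Everything in it is an assumption for illustration: the `classify_dob` helper, the `M-D-YYYY` string format, and the 115-year age cutoff are hypothetical, not anything the SSA is known to use.

```python
from datetime import date


def classify_dob(raw, as_of=date(2025, 1, 1), max_age=115):
    """Return a coarse label for a date-of-birth string in M-D-YYYY form.

    The format, labels, and max_age cutoff are assumptions for illustration.
    """
    if raw is None or raw.strip() == "":
        return "blank"
    parts = raw.strip().split("-")
    if len(parts) != 3:
        return "unparseable"
    month, day, year = parts
    if not (month.isdigit() and day.isdigit() and year.isdigit()):
        return "placeholder"  # e.g. 6-6-XXXX
    if int(year) == 0:
        return "placeholder"  # e.g. 6-6-0000
    try:
        dob = date(int(year), int(month), int(day))
    except ValueError:
        return "unparseable"  # e.g. month 13 or day 32
    # Age in whole years on the as_of date.
    age = as_of.year - dob.year - ((as_of.month, as_of.day) < (dob.month, dob.day))
    return "implausible" if age < 0 or age > max_age else "plausible"


# One paper record, several ways a data entry clerk might have keyed it.
for entry in ["6-6-1787", "6-6-1877", "", "6-6-0000", "6-6-XXXX", "6-6-1950"]:
    print(repr(entry), "->", classify_dob(entry))
```

Note that the "corrected" 1877 still comes out as implausible when viewed from 2025, which is the point: none of these entries says anything about fraud, only about how the record was keyed.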

It sure seems like anyone who can craft a SELECT statement with an ORDER BY clause (or even sort a spreadsheet) could have and should have noticed this a long time ago. The incompetence and/or stupidity required to overlook this is just UNBELIEVABLE! I mean, it’s literally incredible no one noticed this, isn’t it? Yes, it is incredible and unbelievable, which makes sense because it isn’t true. For example, an SSA OIG report from March 2015² discusses the issue of missing dates of death in some detail. That report discusses three different datasets:

  • Master Beneficiary Records (MBR) – this is the source of truth SSA uses for managing benefits and payments. If you’re interested in whether dead people are getting payments, this is the database you’re going to be concerned with.
  • NUMIDENT (Numerical Identification System) – this is a subset of MBR data, extracted from the MBR.
  • Death Master File (DMF) – this is a subset of NUMIDENT data containing records for numberholders who have died. It’s made available to financial institutions, insurance companies, other government agencies, etc. to (among other things) enable them to catch fraud.

NUMIDENT (and therefore the DMF) is missing data that’s in the MBR. This won’t surprise anyone who has had to manage data extraction pipelines that have been running for any significant amount of time. But importantly, neither NUMIDENT nor the DMF is used for managing payments. Which dataset is Elon referring to when he says “the SS database”? I don’t know. I’m not sure he knows. But it should be the MBR for any of these announcements to be meaningful. And it would be really helpful for someone in one of these press conferences to ask him which one he’s talking about.
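Here is a rough sketch of why the choice of dataset matters, using Python's built-in `sqlite3` module and toy tables. The table names echo the datasets above, but the columns, the `payment_status` values, the `born_before` cutoff, and every row are made up for illustration; this is not the SSA's actual schema or data.

```python
import sqlite3

# Toy, in-memory stand-ins for two of the datasets. All columns and rows are
# hypothetical; SSNs use the never-issued 000 area number.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE numident (ssn TEXT PRIMARY KEY, dob TEXT, dod TEXT);
    CREATE TABLE mbr (ssn TEXT PRIMARY KEY, payment_status TEXT);
    INSERT INTO numident VALUES
        ('000-00-0001', '1787-06-06', NULL),
        ('000-00-0002', '1877-06-06', NULL),
        ('000-00-0003', '1950-03-14', NULL),
        ('000-00-0004', '1931-11-02', '2009-08-21');
    INSERT INTO mbr VALUES
        ('000-00-0002', 'terminated'),
        ('000-00-0003', 'current');
""")


def oldest_without_death_date(conn, born_before="1910-01-01"):
    """Sort NUMIDENT-style rows by DOB and attach payment status from the MBR.

    ISO-8601 date strings compare correctly as plain text, so ORDER BY and
    the < filter work without any date parsing.
    """
    cursor = conn.execute(
        """
        SELECT n.ssn, n.dob, m.payment_status
        FROM numident AS n
        LEFT JOIN mbr AS m ON m.ssn = n.ssn
        WHERE n.dod IS NULL AND n.dob < ?
        ORDER BY n.dob
        """,
        (born_before,),
    )
    return cursor.fetchall()


results = oldest_without_death_date(conn)
for row in results:
    print(row)
```

In this toy data, sorting by date of birth immediately surfaces two impossibly old numberholders with no recorded death, yet neither has a current payment status in the payments table. The first query is the easy part; the join is the part that tells you whether anyone is actually being paid.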

This 2015 OIG report is primarily focused on the lack of death info on some NUMIDENT records, but importantly it states that said problem led to zero improper payments being made to dead people…. [This] also explains why the SSA hasn’t fixed this bad data: because after analysis they concluded that it would be expensive, potentially cause problems, and have very little benefit. In other words, it would be a waste of taxpayer money.
