Drugmakers misspelled drug names and listed the same drug under multiple names, which made analysis of prescription data difficult, according to an article published by ProPublica and co-published with The New York Times.

As part of its ongoing “Dollars for Docs” series on payments from pharmaceutical companies to doctors, ProPublica reported there are widespread problems with the data that is posted to the “Open Payments” section of the Centers for Medicare and Medicaid Services (CMS) website.

Forest Laboratories misspelled the name of its depression drug, Fetzima, as “Fetziima” 953 times, in more than one-third of all the reports on the drug.

CMS doesn’t double-check the data, correct spelling errors or alter the data in any way, according to the story.

For people who work with data on a daily basis, much of the job is still janitorial, as The New York Times reported in August. It’s the mundane work of cleaning and standardizing terms in a database. Cleaning data involves resolving inconsistencies, such as different spellings of the same term, to a single standard, and finding inaccuracies.

Names of people, drugs, or companies can be spelled or entered differently depending on the person or people entering or collecting the data.
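One way to picture this kind of standardization is a short Python sketch that maps each entered name to the closest entry on a reference list. This is an illustration only, not the method ProPublica used: the reference list, the `standardize` function and the 0.85 similarity cutoff are all assumptions made for the example.

```python
from difflib import get_close_matches

# Hypothetical reference list; a real project would load a vetted list of drug names.
CANONICAL_NAMES = ["Fetzima"]


def standardize(name, canonical=CANONICAL_NAMES, cutoff=0.85):
    """Return the closest canonical name, or the trimmed input if nothing is close enough."""
    cleaned = name.strip()
    # Compare in lowercase so differences in capitalization do not block a match.
    lookup = {c.lower(): c for c in canonical}
    matches = get_close_matches(cleaned.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else cleaned


print(standardize("Fetziima"))
print(standardize("Aspirin"))
```

Fuzzy matching like this can also merge names that only look alike, so the matches it suggests would still need to be reviewed by a person before any totals are reported.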

From the ProPublica story:

Take H.P. Acthar Gel, an expensive injectable drug used to treat multiple sclerosis, kidney disease, lupus and other conditions. The drug’s maker, Questcor Pharmaceuticals, logged payments related to the drug under eight names, including Acthar, Acthar-Pulm, Acthar-IS, Acthar-Rheum and Acthar-MS. The payments associated with each name didn’t stand out much. But when they were all added together, the drug ranked in the top 20 for spending on doctors.
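The effect described in that passage, where small totals under several names add up to a large one, can be sketched in Python. The dollar amounts below are made up for illustration and do not come from the Open Payments data, and the rule of treating the text before the first hyphen as the drug name is an assumption that happens to fit the name variants quoted above.

```python
from collections import defaultdict

# Made-up records for illustration only; the amounts are not real payment figures.
payments = [
    ("Acthar", 100.0),
    ("Acthar-Pulm", 250.0),
    ("Acthar-IS", 75.0),
    ("Acthar-Rheum", 50.0),
    ("Acthar-MS", 25.0),
    ("Fetzima", 40.0),
]


def base_name(name):
    """Assumed rule: the text before the first hyphen identifies the drug."""
    return name.split("-", 1)[0].strip()


def total_by_drug(records):
    """Sum payment amounts after collapsing name variants to a base name."""
    totals = defaultdict(float)
    for name, amount in records:
        totals[base_name(name)] += amount
    return dict(totals)


totals = total_by_drug(payments)
print(totals)
```

A rule this simple would wrongly combine two different drugs whose names share the same text before a hyphen, and it would not catch a variant such as "H.P. Acthar Gel," so a real cleanup would likely rely on a hand-checked lookup table instead.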

The reporters didn’t find any evidence that errors in the data were deliberately made by companies trying to be evasive.

With so many errors in the original data, it would be very difficult for the average person to access the data and derive any meaning from it. This is often the case with large government databases that haven’t been cleaned. Terms entered with a different spelling or in a different format appear as separate entries in a database, even though they refer to the same thing.

This type of data cleaning makes a government database ready to be analyzed and used to draw conclusions for a news story.

Reach Eric Holmberg at 412-315-0266 or at eholmberg@publicsource.org. Follow him on Twitter @holmberges.

Eric Holmberg was a reporter for PublicSource between 2014 and 2016.