Drugmakers misspelled drug names and listed the same drug under multiple names, which made analysis of prescription data difficult, according to an article by ProPublica that was co-published with The New York Times.

As part of its ongoing “Dollars for Docs” series on payments from pharmaceutical companies to doctors, ProPublica reported there are widespread problems with the data that is posted to the “Open Payments” section of the Centers for Medicare and Medicaid Services (CMS) website.

Forest Laboratories misspelled the name of its depression drug, Fetzima, as “Fetziima” 953 times, in more than one-third of all the reports on the drug.

CMS doesn’t double-check the data, correct spelling errors or alter the data in any way, according to the story.

For people who work with data on a daily basis, much of the job is still janitorial, as The New York Times reported in August. It’s the mundane work of cleaning and standardizing terms in a database. Cleaning data means finding inaccuracies and resolving inconsistencies, such as different spellings of the same term, into a single standard.

Names of people, drugs, or companies can be spelled or entered differently depending on the person or people entering or collecting the data.
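As a rough illustration of that kind of standardization, here is a minimal Python sketch that maps a misspelled name to its closest match in a reference list. It uses the standard library's difflib; the reference list, the function name standardize and the 0.8 similarity cutoff are assumptions made for this example, not anything described in the ProPublica story.

```python
from difflib import get_close_matches

# Hypothetical reference list of correct spellings. A real project would
# need a vetted, much longer list.
CANONICAL_NAMES = ["Fetzima", "Acthar"]


def standardize(name, canonical=CANONICAL_NAMES, cutoff=0.8):
    """Return the closest canonical spelling, or the trimmed input if none is close."""
    cleaned = name.strip()
    # Compare in lowercase so differences in capitalization don't matter.
    lookup = {c.lower(): c for c in canonical}
    matches = get_close_matches(cleaned.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else cleaned


print(standardize("Fetziima"))
print(standardize("Aspirin"))
```

Fuzzy matching like this can also merge names that are genuinely different, so any automatic correction would still need to be checked by a person.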

From the ProPublica story:

Take H.P. Acthar Gel, an expensive injectable drug used to treat multiple sclerosis, kidney disease, lupus and other conditions. The drug’s maker, Questcor Pharmaceuticals, logged payments related to the drug under eight names, including Acthar, Acthar-Pulm, Acthar-IS, Acthar-Rheum and Acthar-MS. The payments associated with each name didn’t stand out much. But when they were all added together, the drug ranked in the top 20 for spending on doctors.

The reporters didn’t find any evidence that errors in the data were deliberately made by companies trying to be evasive.

With so many errors in the original data, it would be very difficult for the average person to access the data and derive any meaning from it. This is often the case with large government databases that haven’t been cleaned. Terms entered with a different spelling or in a different format appear as separate entries in a database, even though they refer to the same thing.

This type of data cleaning makes a government database ready to be analyzed and used to draw conclusions for a news story.
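To show why the separate entries matter, the sketch below groups several of the Acthar name variants quoted above under one label before adding up payments. The dollar amounts are made up for illustration, and the mapping table and the function name total_by_drug are assumptions for this example, not ProPublica's actual method.

```python
from collections import defaultdict

# Name variants quoted in the ProPublica story, mapped to a single label.
# The mapping is an illustrative assumption.
VARIANT_TO_DRUG = {
    "Acthar": "H.P. Acthar Gel",
    "Acthar-Pulm": "H.P. Acthar Gel",
    "Acthar-IS": "H.P. Acthar Gel",
    "Acthar-Rheum": "H.P. Acthar Gel",
    "Acthar-MS": "H.P. Acthar Gel",
}

# Made-up payment records: (drug name as reported, amount in dollars).
payments = [
    ("Acthar", 100.0),
    ("Acthar-MS", 250.0),
    ("Acthar-Rheum", 50.0),
    ("Fetzima", 75.0),
]


def total_by_drug(records, mapping):
    """Sum payments per drug after replacing known variants with one label."""
    totals = defaultdict(float)
    for name, amount in records:
        # Names that are not in the mapping are kept as reported.
        totals[mapping.get(name, name)] += amount
    return dict(totals)


totals = total_by_drug(payments, VARIANT_TO_DRUG)
print(totals)
```

Without the mapping, the three Acthar rows would show up as three small totals instead of one larger one.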

Reach Eric Holmberg at 412-315-0266 or at eholmberg@publicsource.org. Follow him on Twitter @holmberges.

Eric Holmberg was a reporter for PublicSource between 2014 and 2016.