CMS Transparency Law: United Healthcare Data Deep Dive

Dec 5 By Ezzy Sriram

This blog post accompanies the video presentation I posted here.

One of the most painful parts of healthcare is how hard it is to shop around compared with the other purchases you make. Take booking a hotel stay, for example: everyone knows the good hotels, and Kayak tells you if a price is higher than average.

Shopping for healthcare is nightmarish because you don’t know if you are overpaying, and you also don’t know if you are getting your money’s worth.

The Transparency in Coverage rule requires insurance companies to share their price lists across covered services and items for each of their available insurance plans. For in-network providers, they must share the negotiated rates (a.k.a. fee schedule rates). For out-of-network providers, they must use historical claims data to show allowed amounts (the maximum amount that a payer will pay for a service) and the billed charges (the total charge recently billed to a plan by a provider).

Payers need to make the data available via monthly data files posted publicly on their website. And great news - starting Jan. 1st, 2024, these data files must follow the CMS recommended file format and include all covered services and items.

Data Deep Dive on United Healthcare's Choice Plus Plan

Fast Facts

The single file was over 120GB in size
Included rates for over 17,000 services and items
Included 1.4 million unique NPIs, grouped into 57k provider buckets for the purpose of showing prices

Learnings

Garbage rates: When looking at rates for Psychotherapy 60 minutes (CPT Code 90837), I found many garbage rates for providers like radiology and surgery groups that do not offer therapy services at all. Outliers like these need to be removed before the data can be analyzed.
Vague provider-level rates: Rates are shown at a macro level across an organization. I saw an instance where 1,500 providers at NewYork-Presbyterian Hospital/Weill Cornell were bucketed into a single entity for the purpose of surfacing rates. Unfortunately, this raises the questions:
- What are the individual prices per physician? This is an important question because costs depend on the location of treatment and the doctor's credential (e.g. MD vs. physician assistant).
- Is the price the average price across physicians? How is it calculated?
Data analysis requires complementary datasets: The payer data is not useful on its own. What consumers really want is the ability to find the best bang for their buck when it comes to healthcare, but consumers don’t search in terms of NPIs and tax IDs. They would probably like to look at the data by specialty (e.g. dermatology), practice name, location, rating, etc., but there is no free lunch here. We have to bring other datasets together with the payer data to make it useful for consumers:
- NUCC Taxonomy: Taxonomy code levels and descriptions.
- CMS Doctors and Clinicians Data File: Medicare enrollment records with doctor and facility names and locations.
- NPPES NPI Database: NPI database with taxonomy, credential text, locations, and more.
Unwieldy data access: The data’s size makes it unwieldy. For the single plan I analyzed, the data was smooshed into a single file at a whopping 120GB file size. Some payers even require you to make thousands of HTTP requests to pull a complete dataset rather than downloading one file.
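To make the "garbage rates" and "complementary datasets" points concrete, here is a minimal Python sketch of one way to screen out implausible rates: look up each provider entity's NPIs in a taxonomy mapping built from NPPES and keep only entities where enough providers have a relevant specialty. This is not necessarily how I cleaned the data. The function names, the `npis` key, the 50% threshold, and the short allow-list of taxonomy codes are all assumptions for illustration, and the codes should be checked against the NUCC taxonomy before use.

```python
from typing import Any, Dict, Iterable, List, Set

# Illustrative allow-list only (assumption); verify against the NUCC taxonomy.
THERAPY_TAXONOMY_CODES: Set[str] = {"103T00000X", "1041C0700X", "101YM0800X"}


def share_of_relevant_providers(
    npis: Iterable[str],
    taxonomy_by_npi: Dict[str, str],
    relevant_codes: Set[str],
) -> float:
    """Fraction of an entity's matched NPIs whose taxonomy code is relevant."""
    matched = [taxonomy_by_npi[n] for n in npis if n in taxonomy_by_npi]
    if not matched:
        return 0.0
    return sum(code in relevant_codes for code in matched) / len(matched)


def drop_garbage_rates(
    entities: List[Dict[str, Any]],
    taxonomy_by_npi: Dict[str, str],
    relevant_codes: Set[str] = THERAPY_TAXONOMY_CODES,
    min_share: float = 0.5,  # hypothetical threshold, tune for your data
) -> List[Dict[str, Any]]:
    """Keep entities where at least min_share of matched NPIs are relevant."""
    return [
        e
        for e in entities
        if share_of_relevant_providers(e["npis"], taxonomy_by_npi, relevant_codes)
        >= min_share
    ]
```

A stricter or looser threshold changes which entities survive, so it is worth eyeballing what gets dropped before trusting the filtered rates.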

Sample Dataset (NYC Therapists)

You can download the dataset here.

We found that NYC in-network provider rates for Psychotherapy 60 minutes (CPT Code 90837) ranged from $135 to $160 under the United Healthcare Choice Plus Plan.

Each row has the negotiated in-network rate for a provider entity listed in the payer file, where a provider entity refers to how payers bucket providers together for the purpose of surfacing a rate. Each provider entity row also has columns with aggregate details from other datasets (NPPES, NUCC taxonomy, and CMS Doctors and Clinicians File). See the data dictionary below for details.
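For readers who want to see how rows like these could be derived, here is a rough Python sketch that flattens a small, already-parsed payer file into one row per provider entity and negotiated price. The JSON field names (`in_network`, `negotiated_rates`, `provider_references`, `provider_group_id`, `provider_groups`, `npi`, `tin`, `negotiated_prices`) reflect my reading of the CMS in-network file schema and should be treated as assumptions to verify against the payer's actual file. The function name is hypothetical, and counting unique NPIs and TINs is one interpretation of the provider counts.

```python
from typing import Any, Dict, Iterator


def iter_rate_rows(mrf: Dict[str, Any], billing_code: str) -> Iterator[Dict[str, Any]]:
    """Yield one row per (provider entity, negotiated price) for a billing code."""
    # Index provider entities by the ID that negotiated rates point to.
    groups_by_ref = {
        ref["provider_group_id"]: ref["provider_groups"]
        for ref in mrf.get("provider_references", [])
    }
    for item in mrf.get("in_network", []):
        if item.get("billing_code") != billing_code:
            continue
        for rate in item.get("negotiated_rates", []):
            for ref_id in rate.get("provider_references", []):
                groups = groups_by_ref.get(ref_id, [])
                # Assumption: count unique NPIs and unique TINs per entity.
                npis = {npi for g in groups for npi in g.get("npi", [])}
                tins = {g["tin"]["value"] for g in groups if "tin" in g}
                for price in rate.get("negotiated_prices", []):
                    yield {
                        "payer_provider_reference_id": ref_id,
                        "payer_num_providers": len(npis),
                        "payer_num_tax_ids": len(tins),
                        "negotiated_rate": price.get("negotiated_rate"),
                        "negotiated_type": price.get("negotiated_type"),
                        "billing_class": price.get("billing_class"),
                        "expiration_date": price.get("expiration_date"),
                    }
```

This version expects the whole file as a Python dict, which is fine for a small sample but not for a 120GB file. For the real thing you would need a streaming JSON parser (ijson is one option) and would apply the same logic record by record.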

Note: This dataset only includes provider entities that contained at least 1 NPI that we were able to match to a provider with a practice address in a New York City zip code via the NPPES NPI dataset.
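The inclusion rule in the note above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the exact code behind the dataset: `practice_zip_by_npi` is a hypothetical lookup built from the NPPES practice address, `nyc_zips` is whatever set of New York City zip codes you choose to supply, and trimming to five digits assumes the postal code may be stored in ZIP+4 form.

```python
from typing import Dict, Iterable, Set


def has_nyc_provider(
    npis: Iterable[str],
    practice_zip_by_npi: Dict[str, str],
    nyc_zips: Set[str],
) -> bool:
    """True if at least one NPI has a practice zip code in nyc_zips."""
    for npi in npis:
        zip_code = practice_zip_by_npi.get(npi)
        # Compare on the first five digits in case the value is ZIP+4.
        if zip_code and zip_code[:5] in nyc_zips:
            return True
    return False
```

Because a single matching NPI is enough, a large hospital system with one NYC clinician would be included along with all of its non-NYC providers, which is worth keeping in mind when reading the provider counts.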

Data dictionary:

cms_facility_name: Facility name of the most common facility found across NPIs listed for this provider entity (from CMS data file)
cms_facility_num_providers: Number of providers (from CMS data file)
negotiated_rate: Negotiated in-network rate (from payer data file)
payer_num_providers: Total number of providers across all provider groups under this provider entity (from payer data file)
payer_num_tax_ids: Total number of provider groups calculated using unique TINs (from payer data file)
nppes_num_taxonomy_codes: Total number of unique taxonomy codes found by matching providers listed under this provider entity (from NPPES file)
nppes_num_matched_providers: Total number of unique provider NPIs listed under this provider entity (from NPPES file)
nppes_top_1_taxonomy_code: #1 most popular taxonomy code listed across provider NPIs under this provider entity (from NPPES file)
nppes_top_2_taxonomy_code: #2 most popular taxonomy code listed under this provider entity (from NPPES file)
nppes_top_3_taxonomy_code: #3 most popular taxonomy code listed under this provider entity (from NPPES file)
cms_facility_state: Facility state of the most common facility found across NPIs under this provider entity (from CMS data file)
cms_facility_zip_code: Facility zip code of the most common facility found across NPIs listed under this provider entity (from CMS data file)
most_common_name: most popular name listed across provider NPIs under this provider entity (from NPPES file)
most_common_address: most popular practice addresses listed across provider NPIs under this provider entity (from NPPES file)
most_common_mailing_address: most popular mailing addresses listed across provider NPIs under this provider entity (from NPPES file)
negotiated_type: type of negotiated rate as reported by the payer (from payer data file)
expiration_date: expiration date (from payer data file)
payer_provider_reference_id: ID of the provider entity, provided by the payer. The payer shows provider groups underneath a provider entity (from payer data file)
billing_class: billing class (from payer data file)
billing_code_modifier: service modifier (from payer data file)
additional_information: additional information (from payer data file)
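As a rough illustration of how the NPPES-derived columns above could be computed for one provider entity, here is a short Python sketch using `collections.Counter`. It simplifies things by assuming a single taxonomy code per NPI in a hypothetical `taxonomy_by_npi` lookup, whereas NPPES can list several codes per provider. The function name is made up for this example, and ties between equally common codes are broken by sorted NPI order here, which is an arbitrary choice.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional


def taxonomy_summary(
    npis: Iterable[str], taxonomy_by_npi: Dict[str, str]
) -> Dict[str, object]:
    """Aggregate taxonomy columns for the NPIs under one provider entity."""
    # Unique NPIs that could be matched in the NPPES lookup.
    matched = sorted({n for n in npis if n in taxonomy_by_npi})
    counts = Counter(taxonomy_by_npi[n] for n in matched)
    top: List[Optional[str]] = [code for code, _ in counts.most_common(3)]
    top += [None] * (3 - len(top))  # pad when fewer than three codes exist
    return {
        "nppes_num_matched_providers": len(matched),
        "nppes_num_taxonomy_codes": len(counts),
        "nppes_top_1_taxonomy_code": top[0],
        "nppes_top_2_taxonomy_code": top[1],
        "nppes_top_3_taxonomy_code": top[2],
    }
```

The same pattern, with a different lookup, would give the most common name or address for an entity.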

