National Practitioner Data Bank online lookup tool #22

Open
cornstein opened this issue Mar 20, 2015 · 15 comments

@cornstein

The National Practitioner Data Bank is a federally run repository for adverse actions and malpractice payouts involving doctors and other health professionals. Details on individual doctors are confidential, but summary statistics and de-identified data are releasable. There's a very rudimentary data analytic tool available (http://www.npdb.hrsa.gov/analysistool/), but it would be great if there were more granularity without having to download the Public Use File and agree to unacceptable restrictions. It would be nice to be able to add more than two layers of analysis, for instance different categories of sanctions, by state, by year, and to let users layer in items like med school graduation, etc. (Perhaps limiting cell sizes to 2+?) I see this as having the potential to be as useful as the OPTN/UNOS data, also overseen by HRSA. That data site is: http://optn.transplant.hrsa.gov/converge/latestData/viewDataReports.asp.

@dportnoy
Member

Created page for full use case specifications and solution: http://hhs.ddod.us/wiki/Use_Case_22

@marks

marks commented Jun 20, 2015

+1

@cornstein correct me if I'm wrong, but it sounds like a lot of this would be much better if the PUF (from http://www.npdb.hrsa.gov/resources/publicData.jsp) had fewer restrictions?

Personally, I'd like to be able to repost the dataset with the stipulation that I would require others to agree to the other terms, such as not trying to identify practitioners or patients.

@betshsu

betshsu commented Aug 20, 2015

@cornstein I'd like to open conversations with HRSA around this and use case #23 (performance data on health centers).

In approaching program owners, it would be helpful to be able to demonstrate the value and the use of the data, and also to have clarity around the following:

  • Is it primarily a matter of finer granularity in the data fields currently available on the web tool (e.g., not rolling up practitioner type and sanction type)?
  • If not, which data fields from the PUF would you like to see available through the web tool?
  • Or was @marks on target that fewer restrictions on downloading the PUF would be the optimal solution?

Thanks.

@cornstein
Author

Hi @betshsu. Thank you for your questions. This is primarily a matter of granularity. For instance, sanctions aren't broken down by type: all state licensure sanctions are grouped together. And you can only display your results in very limited ways.
It would also be great to be able to see how many providers have various types of actions, from lawsuit settlements to state sanctions.

@betshsu

betshsu commented Sep 30, 2015

Great, thanks. That's helpful info for approaching HRSA about what might be possible. I'll open the conversation with them.

@betshsu

betshsu commented Sep 30, 2015

Reached out to HRSA to find a contact for the National Practitioner Data Bank web tool. Will update as things progress.

@betshsu

betshsu commented Oct 2, 2015

Talking next week with the folks that run the NPDB to see what disclosure issues might prevent providing data at a finer granularity.

@betshsu

betshsu commented Oct 8, 2015

Had a conversation with the folks that run the NPDB web tool. It's not clear that the platform used to host the web tool would allow anything beyond a 2x2 table (e.g., adding additional fields like the OPTN tool allows). However, you can choose which fields to display as rows and columns, and you can filter the data. For example, you could get sanction type by practitioner type, but it would be rolled up across state and year if you selected multiple states or years. That could allow a user to build more granular data in a tedious way (multiple exports of 2x2 tables). They are looking into whether the platform could allow additional fields, and we also inquired whether they would be interested in deploying an API as an alternative.
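
For anyone attempting the "multiple exports" workaround in the meantime, here is a minimal sketch of how the separate tables might be stacked into one long-format dataset. It assumes each export was produced with the filter set to a single state and year, and that the exported table has already been read into a nested dict (row label, then column label, then count). The `Record` type, the `stack_exports` function, and the sample labels and counts are all hypothetical and for illustration only; they do not reflect the web tool's actual export format or real NPDB figures.

```python
from typing import Dict, List, NamedTuple, Tuple

# Hypothetical shape of one exported table: row label -> column label -> count.
Table = Dict[str, Dict[str, int]]


class Record(NamedTuple):
    state: str
    year: int
    row_label: str   # e.g. sanction type
    col_label: str   # e.g. practitioner type
    count: int


def stack_exports(exports: Dict[Tuple[str, int], Table]) -> List[Record]:
    """Flatten per-(state, year) tables into one list of long-format records."""
    records: List[Record] = []
    for (state, year), table in sorted(exports.items()):
        for row_label, row in table.items():
            for col_label, count in row.items():
                records.append(Record(state, year, row_label, col_label, count))
    return records


# Made-up example data: two exports, each filtered to one state and one year.
exports = {
    ("CA", 2014): {"State licensure action": {"Physician": 12, "Dentist": 4}},
    ("NY", 2014): {"State licensure action": {"Physician": 9, "Dentist": 0}},
}
records = stack_exports(exports)
```

Each record then carries state and year alongside the two table dimensions, which is the extra granularity the single rolled-up table loses.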

The idea behind the web tool was to allow someone to answer high-level questions; if they wanted to dig deeper into the data, they could then use the PUF. At the same time, the team has to balance potential identification and disclosure issues due to federal law. The team would like to hear from the public on how to make the tool more useful (e.g., if there are additional data fields from the PUF that should be included in the web tool). They are also very responsive to support requests filed through the NPDB help email (they will respond within one day and call users to help walk through analysis questions or PUF questions).

@betshsu

betshsu commented Dec 7, 2015

Update -- the program owners have some ideas that they would like to discuss with us; we'll be setting up a time in the next couple of weeks with them.

@betshsu

betshsu commented Dec 14, 2015

Talking to program owners on Dec 16 re: potential ideas for tool improvement

@betshsu

betshsu commented Dec 16, 2015

@cornstein We spoke with the program owners today; they are supportive of allowing for more granularity in the 2x2 tables available via the web tool, with the stipulation that results might need to be restricted to a minimum cell size in order to protect against disclosure. The most feasible time frame is to roll out improvements to the tool in conjunction with the annual update to the data; that update doesn't get released publicly until June, but they will start working with the contractor in March on the first quarter data. We will be scheduling a follow-up call in Jan with the program owners and then their contractor to explore what can be done and provide technical support as we can. We will post updates as we are able to meet with the NPDB contractor about what might be possible with the web tool.

@marks The program owners are also supportive of the idea of easing the DUA language on the PUF around point 3: "Not repost the dataset and only report, disclose or post data from the dataset in connection with statistical reporting or analysis that does not identify any individual or entity." The most critical component of the DUA is point 1, which prohibits using the dataset alone or in conjunction with another dataset to attempt to identify individuals. They are open to the idea of reposting subsets of the data, as long as those terms are agreed upon. However, as you can imagine, changing the DUA language requires approval from above the program owners; we will start working the channels to see what can be done and will keep you updated.

@cornstein
Author

Thanks so much for doing this. I'm happy to be a part of a conversation about how the data could be more useful. I very much understand and support that the data needs to be restricted to a minimum cell size in order to protect against disclosure (CMS does this now), although it's worth including in those discussions that a cell-size minimum should still display 0. Zero should not be redacted. (CMS does not redact zero.) Let me know how I can help.

Thanks,
Charlie

@betshsu

betshsu commented Jan 5, 2016

@cornstein Good point about displaying the 0's. We have a conversation with the program owners scheduled this week to start discussing the technical issues behind improving the web tool, and will add this to the list. We will consult/loop you in as we get into further discussions of the exact data that would be most useful for the web tool to be able to display. I know the program owners are eager to hear from the public how to make the data and the tool most useful.

@betshsu

betshsu commented Jan 7, 2016

We had to reschedule this meeting for next week; will post an update after the meeting.

@betshsu

betshsu commented Jan 13, 2016

@cornstein The program owners absolutely agree that 0's should be displayed when minimum cell size restrictions are implemented, as 0 has a very different meaning.
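
To make the rule concrete, here is a minimal sketch of small-cell suppression that leaves zeros visible. The `suppress_small_cells` function, the `*` marker, the default threshold of 11, and the sample counts are all assumptions for illustration; the actual minimum cell size for the NPDB web tool has not been decided in this thread. The sketch also does not attempt complementary suppression (masking additional cells so a suppressed value can't be recovered from row or column totals), which a real implementation would likely need to consider.

```python
from typing import Dict, Tuple, Union

Key = Tuple[str, str]    # (row label, column label)
Cell = Union[int, str]   # a count, or the suppression marker


def suppress_small_cells(
    counts: Dict[Key, int],
    min_cell_size: int = 11,   # assumed threshold, not an NPDB policy
    marker: str = "*",
) -> Dict[Key, Cell]:
    """Mask counts from 1 to min_cell_size - 1; show zeros and larger counts."""
    return {
        key: (n if n == 0 or n >= min_cell_size else marker)
        for key, n in counts.items()
    }


# Made-up counts for illustration only.
table = {
    ("State licensure action", "Physician"): 42,
    ("State licensure action", "Dentist"): 3,
    ("Malpractice payment", "Dentist"): 0,
}
masked = suppress_small_cells(table, min_cell_size=11)
```

Here the 3 would be masked while the 0 stays visible, so a reader can tell "none" apart from "a few, withheld."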

The program owners would like to explore linking the three NPDB data tools they currently have available (the PUF, the web tool, and the statistical reports), as they are all based on the same data. They are looking into linking the web tool to the PUF data so that it is updated whenever the PUF is updated (quarterly, as opposed to the current annual update). Along with allowing finer granularity in the tables that the web tool builds, they would then produce visualizations based on those tables (currently they provide downloadable visualizations using pre-defined tables here: http://www.npdb.hrsa.gov/resources/npdbstats/npdbStatistics.jsp).

Will continue to update as things move along.

@betshsu betshsu removed their assignment Feb 2, 2016