Deduplicated spreadsheets

OIDA has deduplicated the hundreds of thousands of native-format spreadsheet files in its three largest collections (Insys, Mallinckrodt, and McKinsey) to save time for users who wish to work with these files. SearchMyFiles, a program created by NirSoft, was used to deduplicate files with .xls, .xlsx, .xlsm, and .csv extensions. The first occurrence of each duplicate spreadsheet was preserved and the rest were deleted. A second ‘Duplicate Search’ was then run to confirm that no duplicates remained. The deduplication method was also validated by testing it on a sample of spreadsheets. The process reduced the spreadsheet count by an average of 45% across collections.

The sets of deduplicated spreadsheets are located in [external website]. Please note that spreadsheets with other file extensions, e.g. .xlsb, were not included in the deduplication and are not part of these sets.
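
For users who want to run a similar check on their own copies of the files, the keep-the-first-occurrence approach can be sketched in Python. This is an illustration only, not the SearchMyFiles procedure OIDA used. It assumes that duplicates are byte-identical files, that "first occurrence" means first in sorted path order, and that the function names (file_digest, find_duplicates) are hypothetical. It reports duplicates rather than deleting them.

```python
import hashlib
from pathlib import Path

# Extensions covered by the deduplication described above; .xlsb and others are skipped.
SPREADSHEET_EXTENSIONS = {".xls", ".xlsx", ".xlsm", ".csv"}


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root):
    """Walk root and split spreadsheet files into (kept, duplicates).

    Assumption: two files are duplicates when their contents hash to the
    same digest, and the first file in sorted path order is the one kept.
    """
    seen = {}  # digest -> first path seen with that content
    kept, duplicates = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.suffix.lower() not in SPREADSHEET_EXTENSIONS:
            continue
        digest = file_digest(path)
        if digest in seen:
            duplicates.append(path)
        else:
            seen[digest] = path
            kept.append(path)
    return kept, duplicates
```

Calling find_duplicates on a folder of downloaded spreadsheets returns the files to keep and the files identified as duplicates; running it again on the kept files alone should return an empty duplicates list, which mirrors the second search described above.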

Insys Authorized Prescriptions Data

A searchable table that represents 11,827,050 authorized prescriptions submitted through the FDA-required Transmucosal Immediate Release Fentanyl (TIRF) Risk Evaluation and Mitigation Strategy (REMS) program from 2012 to 2016.

Authorized Prescriptions (authorized_rx)

11,827,050 rows in 1 table

Mallinckrodt Sales Visit Data

A searchable table that represents over 1,000,000 visits by Mallinckrodt sales representatives to prescribers, distributors, and pharmacists around the United States from 2009 to 2015.