uu.se Uppsala University Publications
Automated data extraction: A feasible way to construct patient registers of primary care utilization
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Public Health and Caring Sciences, Family Medicine and Preventive Medicine.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Public Health and Caring Sciences, Family Medicine and Preventive Medicine.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Public Health and Caring Sciences, Family Medicine and Preventive Medicine.
2012 (English). In: Upsala Journal of Medical Sciences, ISSN 0300-9734, E-ISSN 2000-1967, Vol. 117, no. 1, pp. 52-56. Article in journal (Refereed), Published
Abstract [en]

Introduction. Electronic medical records (EMRs) enable analysis of health care data by using data mining techniques to build research databases. Although the reliability of the data extraction process is crucial for the credibility of the final analysis, few validations of this process have been published. In this paper we validate the performance of an automated data mining tool on EMRs in a primary care setting.

Methods. The Pygargus Customized eXtraction Program (CXP) was programmed to find and then extract data from patients meeting criteria for type 2 diabetes mellitus (T2DM) at one primary health care clinic (PHC). The ability of CXP to extract relevant cases was assessed by comparing its output with the cases extracted by an EMR-integrated search engine. The concordance of the extracted data with the original EMR source was checked manually.

Results. The prevalence of T2DM was 4.0%, which corresponds well to previous estimates. Searching for drug prescriptions, diagnosis codes, and laboratory values found 38%, 53%, and 91% of relevant cases, respectively. The sensitivity of CXP regarding extraction of relevant cases was 100%. The specificity was 99.9%, because 12 non-T2DM cases were extracted. The congruity at the single-item level was 99.6%. The 13 incorrect data items were all located in the same structural module.
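
The sensitivity, specificity, and item-level congruity reported above follow the standard definitions. A minimal Python sketch of that arithmetic is given here for orientation only. Apart from the 12 falsely extracted cases and the 13 incorrect items, every count below is hypothetical and was chosen only so that the results round to the reported percentages; the abstract does not state the underlying denominators.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of true cases that the tool extracted."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of non-cases that the tool correctly left out."""
    return true_neg / (true_neg + false_pos)


def congruity(correct_items: int, total_items: int) -> float:
    """Share of extracted data items that match the EMR source."""
    return correct_items / total_items


# HYPOTHETICAL counts (not from the paper), except where noted.
true_pos, false_neg = 400, 0   # hypothetical: every T2DM case is found
false_pos = 12                 # reported: 12 non-T2DM cases extracted
true_neg = 9_588               # hypothetical number of correctly excluded patients
incorrect_items = 13           # reported: 13 incorrect data items
total_items = 3_250            # hypothetical number of items checked

print(f"sensitivity: {100 * sensitivity(true_pos, false_neg):.1f}%")
print(f"specificity: {100 * specificity(true_neg, false_pos):.1f}%")
print(f"congruity:   {100 * congruity(total_items - incorrect_items, total_items):.1f}%")
```

With a fixed number of false positives, specificity depends strongly on how many non-cases were screened, which is why the hypothetical denominator matters more here than the 12 reported errors.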

Conclusion. The CXP is a reliable and accurate data mining tool for extracting selected data from EMRs.

Place, publisher, year, edition, pages
2012. Vol. 117, no 1, 52-56 p.
Keyword [en]
Data extraction, data mining, electronic medical records (EMRs), knowledge discovery in databases (KDD), primary health care
National Category
Medical and Health Sciences
Identifiers
URN: urn:nbn:se:uu:diva-170621
DOI: 10.3109/03009734.2011.653015
ISI: 000300304000009
PubMedID: 22335391
OAI: oai:DiVA.org:uu-170621
DiVA: diva2:509365
Available from: 2012-03-12. Created: 2012-03-12. Last updated: 2017-12-07. Bibliographically approved.


Authors

Martinell, Mats; Stålhammar, Jan; Hallqvist, Johan
