What is Clinical Data Mining?

Dec 8, 2011, 08:24

Jason Gaya

Clinical Data-Mining (CDM) involves the conceptualization, extraction, analysis, and interpretation of available clinical data for practice knowledge-building, clinical decision-making and practitioner reflection.

Clinical data can be obtained from sources such as medical transcript files and Electronic Medical Records (EMR). Combining these two sources, we can build a clinical database that accumulates large quantities of information about patients and their medical conditions. Relationships and patterns within this data could yield new medical knowledge.

Importance of Clinical Data Mining:

In 2010, more than 30 million people were treated for life-threatening diseases, cancer and heart disease among them. Early signs of cancer and heart disease can be identified, and doing so can save thousands of lives. Analyzing a database of thousands of patients can reveal valuable information about probable causes, the nature of progression, and so on. That information can help in developing systems that identify disease at its earliest signs, leading to timely treatment and preventive techniques.
Every year, new guidelines are issued on the usage and dosage of different drugs. Sometimes these guidelines show that certain drugs taken in combination can produce adverse effects. A recent example:

On June 8, 2011, the FDA issued new guidelines for the use of simvastatin, particularly noting specific combinations of medications that are now defined as "contraindicated" with simvastatin at any dose. Using a clinical database of this kind, we can find the patients taking those contraindicated drugs.
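
A lookup of this kind can be sketched in a few lines of Python. This is a minimal sketch, not the article's actual system: the function name `find_flagged_patients`, the patient-to-medications mapping, and the sample records are all hypothetical, and the drug set is only an illustrative subset of the combinations named in the FDA communication, which remains the authoritative list.

```python
# Illustrative subset of drugs flagged as contraindicated with simvastatin.
# Assumption: this set is for demonstration only and is not complete.
CONTRAINDICATED_WITH_SIMVASTATIN = {
    "itraconazole", "ketoconazole", "erythromycin", "clarithromycin",
    "gemfibrozil", "cyclosporine", "danazol",
}


def find_flagged_patients(medications):
    """Return {patient_id: sorted conflicting drugs} for patients on simvastatin.

    `medications` maps a patient id to an iterable of drug names
    (a hypothetical schema chosen for this sketch).
    """
    flagged = {}
    for patient_id, drugs in medications.items():
        # Normalize names so "Simvastatin" and "simvastatin " compare equal.
        names = {drug.strip().lower() for drug in drugs}
        if "simvastatin" not in names:
            continue
        conflicts = names & CONTRAINDICATED_WITH_SIMVASTATIN
        if conflicts:
            flagged[patient_id] = sorted(conflicts)
    return flagged


# Made-up records, for illustration only.
sample = {
    "P001": ["Simvastatin", "Gemfibrozil", "Aspirin"],
    "P002": ["Simvastatin", "Metformin"],
    "P003": ["Clarithromycin"],
}
print(find_flagged_patients(sample))  # {'P001': ['gemfibrozil']}
```

Only P001 is flagged: P002 takes simvastatin without a conflicting drug, and P003 takes a listed drug without simvastatin. In practice the drug names would come from the coded output of the parsing phase rather than from free-text strings.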

Approach of Clinical Data Mining:

The process of clinical data mining is divided into four phases: i) Data Collection, ii) Pre-Processing, iii) Data Parsing, and iv) Application of Knowledge.

Data Collection: Clinical data for a patient is stored in two different formats: i) the medical transcript file (contains 25 to 30% of the information) and ii) the EMR (contains 75 to 80% of the information). In this phase, each patient's information in the transcript file is mapped to that patient's information in the EMR.
Pre-Processing: To get accurate output from the parser, the input document needs to be in Clinical Document Architecture (CDA) format. So in the pre-processing phase, the given input document is converted into CDA format.
Data Parsing: The pre-processed data is parsed into a single structured format. In this phase, negation, SNOMED codes, RxNorm codes, ICD-9 codes, body measurements, drug dosages, smoking status, and allergies are detected.
Application of Knowledge: Using this knowledge, we can create a new database; querying it can be useful in medical research and in improving patient healthcare. For example, we can query:
1. What is the patient's LDL laboratory level? Is it below 100? Does the patient also have MI (a history of heart attack)? If so, is LDL below 70?
2. If EF < 40%, the patient needs a 2D Echo and a 3D Echo.
   If EF still remains < 40%, the patient needs EP Level 4.
   If EF < 35%, the patient needs an AICD.
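
The two example queries above can be written as simple rules over a structured patient record. The following is a minimal sketch that encodes the thresholds exactly as stated in the queries; it is not clinical guidance. The `Patient` fields, the function names, the mg/dL unit for LDL, and the `ef_still_low_on_followup` flag (one reading of "still remains") are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout; field names are assumptions for this sketch.
    patient_id: str
    ldl_mg_dl: float          # LDL level, assumed to be in mg/dL
    has_mi_history: bool      # history of heart attack (MI)
    ef_percent: float         # ejection fraction, in percent
    ef_still_low_on_followup: bool = False  # EF remained < 40% at follow-up


def ldl_at_target(p):
    """Query 1: LDL below 100, or below 70 when the patient has an MI history."""
    target = 70 if p.has_mi_history else 100
    return p.ldl_mg_dl < target


def ef_followups(p):
    """Query 2: follow-ups implied by the EF thresholds in the example."""
    needs = []
    if p.ef_percent < 40:
        needs.append("2D Echo and 3D Echo")
        if p.ef_still_low_on_followup:
            needs.append("EP Level 4")
    if p.ef_percent < 35:
        needs.append("AICD")
    return needs


# Made-up patients, for illustration only.
a = Patient("A", ldl_mg_dl=85, has_mi_history=True, ef_percent=33,
            ef_still_low_on_followup=True)
b = Patient("B", ldl_mg_dl=85, has_mi_history=False, ef_percent=55)

print(ldl_at_target(a), ef_followups(a))
print(ldl_at_target(b), ef_followups(b))
```

Both patients have the same LDL value, but only patient B meets the target, because the MI history lowers patient A's threshold from 100 to 70. Keeping each rule in its own small function makes it easy to update a threshold when guidelines change.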
