Feature Analysis and Machine Learning Methods Applied to Liver Disease Detection (特徵分析和機器學習方法應用於肝臟疾病檢測)
陳志華、楊子緯、張訓楨、賴永崧
DOI:10.6283/JOCSG.2016.4.3.417
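Abstract (translated from the Chinese) This study aims to build a predictive model for liver disease from biochemical test data commonly collected in clinical practice, so that patients with liver disease can be screened early and referred for treatment promptly. The analysis uses the liver disorders data set provided by the UCI (University of California, Irvine) Machine Learning Repository and considers six feature attributes, including alanine aminotransferase (GPT) and aspartate aminotransferase (GOT), to determine whether a patient has liver disease. Drawing on clinical experience, the study also converts the GOT and GPT liver-function values into ratios, adding two feature attributes, GPT/GOT and GOT/GPT, to improve screening accuracy. For the machine learning methods, the study implements and analyzes decision tree, random forest, naive Bayes, support vector machine, k-nearest neighbors, and neural network. The results show that random forest combined with the two newly added features (GPT/GOT and GOT/GPT) raises accuracy to 73.91%, better than the other machine learning methods. Future screening of liver disease patients can therefore consider using the GPT/GOT and GOT/GPT features to improve screening accuracy.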
Keywords: liver disease, machine learning, feature selection, random forest
Article created: 2016-08-28
Citation format (APA):
陳志華、楊子緯、張訓楨、賴永崧(2016)。 特徵分析和機器學習方法應用於肝臟疾病檢測。
福祉科技與服務管理學刊, 4(3), 417-430。
Feature Extraction and Machine Learning Methods for Liver Disease Detection and Prediction
Chen, C.-H., Yang, T.-W., Chang, H.-C., Lai, Y.-S.
English Abstract This study develops liver disease detection and prediction models that use biochemistry data to screen patients. The liver disorders data set from the UCI (University of California, Irvine) Machine Learning Repository, which includes six features (e.g., alanine aminotransferase (GPT) and aspartate aminotransferase (GOT)), is adopted to detect and predict liver disease. Furthermore, this study draws on clinical experience to analyze the GPT and GOT data and derive two additional features, GPT/GOT and GOT/GPT, to improve detection and prediction accuracy. To evaluate machine learning methods, this study implements decision tree, random forest, naive Bayes, support vector machine, k-nearest neighbors, and neural network classifiers. Experimental results show that random forest with the two proposed features can improve detection and prediction accuracy to as much as 73.91%. Therefore, the two proposed features can be adopted in liver disease detection and prediction models for future clinical trials.
Keywords: liver disease, machine learning, feature extraction, random forest
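
The two ratio features described in the abstract can be illustrated with a minimal sketch. The field names `gpt` and `got`, the helper names, and the choice to return `None` for a zero denominator are all assumptions made for illustration; the abstract does not say how the authors computed the ratios or handled edge cases, and the sample values are invented rather than taken from the UCI data set.

```python
from typing import Dict, List, Optional


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def add_ratio_features(record: Dict[str, float]) -> Dict[str, Optional[float]]:
    """Copy a record and append the GPT/GOT and GOT/GPT ratio features.

    The keys "gpt" and "got" are hypothetical column names.
    """
    out: Dict[str, Optional[float]] = dict(record)
    out["gpt_over_got"] = safe_ratio(record["gpt"], record["got"])
    out["got_over_gpt"] = safe_ratio(record["got"], record["gpt"])
    return out


# Invented values for illustration only; not real patient data.
samples: List[Dict[str, float]] = [
    {"gpt": 40.0, "got": 20.0},
    {"gpt": 15.0, "got": 30.0},
]
augmented = [add_ratio_features(r) for r in samples]

for row in augmented:
    print(row)
```

The augmented records could then be passed, alongside the original features, to any of the classifiers the abstract lists (for example a random forest implementation); that training step is omitted here because the abstract gives no hyperparameters or evaluation protocol.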