6192 - Data Visualization Analysis
Course Objectives
Data visualization helps with cleaning data, exploring data structure, detecting outliers and anomalous groups, identifying trends and clusters, discovering local patterns, evaluating model output, and presenting results. It is indispensable for exploratory data analysis, data mining, and network analysis.
The main goal of this course is to use R and real survey data to show what information data visualization can reveal about a dataset. Through hands-on practice in designing, drawing, and interpreting statistical graphics, students build the experience needed to learn data visualization efficiently.
Course Description
Data visualization is a central problem in high-dimensional data analysis, and it has become increasingly important with advances in computing and graphics technology. The difficulty lies in how to visualize a high-dimensional structure or data set, a question with deep roots in statistics. This course introduces statistical methods useful for exploring voluminous data. The main topics fall into two parts, though the course is not limited to them. The first part covers dimension reduction methods, including Principal Component Analysis (PCA), Projection Pursuit, Sliced Inverse Regression (SIR), Principal Hessian Directions (pHd), Minimum Average Variance Estimation (MAVE), and LASSO. The second part is a collection of dimension-free methods, including the Parallel Coordinate Plot, Matrix Visualization, and Generalized Association Plots (GAP). Most methods will be discussed from both theoretical and practical perspectives throughout the course, with examples drawn from various application areas.
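To make the idea of dimension reduction concrete, here is a minimal sketch of PCA, the first method listed in the description. It is not taken from the course materials: the data are synthetic, the function names are hypothetical, and it is written in Python using only the standard library, whereas the course itself works in R. It centers the data, forms the sample covariance matrix, approximates the first principal direction by power iteration, and reports the share of total variance captured by that single direction.

```python
import math
import random


def center(rows):
    """Subtract the column means so every variable has mean zero."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (divisor n - 1) of already-centered rows."""
    n, p = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def leading_direction(cov, iterations=500):
    """Approximate the first principal direction by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, direction):
    """Scores of each observation along the given direction."""
    return [sum(x * d for x, d in zip(row, direction)) for row in centered]


# Synthetic data (an assumption for illustration): three variables,
# two of which are driven by one hidden factor t, plus small noise.
rng = random.Random(0)
data = []
for _ in range(200):
    t = rng.gauss(0, 3)
    data.append([
        t + rng.gauss(0, 0.3),
        2 * t + rng.gauss(0, 0.3),
        rng.gauss(0, 0.3),
    ])

centered = center(data)
cov = covariance(centered)
pc1 = leading_direction(cov)
scores = project(centered, pc1)

total_variance = sum(cov[i][i] for i in range(len(cov)))
pc1_variance = sum(s * s for s in scores) / (len(scores) - 1)
explained = pc1_variance / total_variance

print("first principal direction:", [round(x, 3) for x in pc1])
print("share of variance explained:", round(explained, 3))
```

Because the three variables are built from one hidden factor, a single direction captures nearly all of the variation, so a one-dimensional plot of `scores` would lose little information. This is the situation dimension reduction methods are designed to detect.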
Reference Books
(1) Antony Unwin, 2015, Graphical Data Analysis with R, CRC Press.
(2) Tamara Munzner, 2014, Visualization Analysis and Design, CRC Press.
(3) Claus O. Wilke, 2019, Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures, O’Reilly Media.
(4) Winston Chang, 2012, R Graphics Cookbook: Practical Recipes for Visualizing Data, O’Reilly Media.
(5) Eric D. Kolaczyk and Gábor Csárdi, 2020, Statistical Analysis of Network Data with R, Springer.
Grading
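| Grading Item | Percentage (%) | Description |
|---|---|---|
| Class participation and attendance | 15 | Attendance is taken every week; the final attendance score is based on five weeks selected at random using R. If you need to take leave, email the supporting documentation to the instructor. |
| Regular group assignments | 20 | Includes in-class software exercises and their output. |
| Midterm group oral report | 15 | Covers the use of R and the interpretation of the analysis. Produce visual analyses and interpretations that address the research question of your chosen dataset. Citations and writing format should follow journal manuscript style. Plagiarism results in a score of zero. |
| Midterm group written report | 15 | Covers the use of R and the interpretation of the analysis. Produce visual analyses and interpretations that address the research question of your chosen dataset. Citations and writing format should follow journal manuscript style. Plagiarism results in a score of zero. |
| Final group oral report | 15 | Covers the use of R and the interpretation of the analysis. Produce visual analyses and interpretations that address the research question of your chosen dataset. Citations and writing format should follow journal manuscript style. Plagiarism results in a score of zero. |
| Final group written report | 20 | Covers the use of R and the interpretation of the analysis. Produce visual analyses and interpretations that address the research question of your chosen dataset. Citations and writing format should follow journal manuscript style. Plagiarism results in a score of zero. |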
Course Information
Basic Information
- Course Code: 6192
- Credit: 0-3
- Course Time: Wednesday/5,6,7[M442]
- Teacher: 陳語婕
- Class: Department of Statistics, undergraduate years 3 and 4; master's years 1 and 2
- Memo: Also applies to the undergraduate Big Data course group for the 106-109 cohorts