Department of Statistics
Course information for semester 113-2 | 1594 Data Visualization Analysis (資料視覺化分析)

1594 - 資料視覺化分析 Data Visualization Analysis


Course Objectives

Empirical research in the social sciences often involves data collection, and data visualization turns large, messy data sets into images that are easier to understand. A good chart communicates the information in the data effectively to readers, whereas a poor chart may mislead them. Data visualization helps with cleaning data, exploring data structure, detecting outliers and unusual groups, identifying trends and clusters, discovering local patterns, evaluating model output, and presenting analysis results. It is indispensable for exploratory data analysis and latent variable analysis.

The main goal of this course is to use R and real social science survey data to reveal the information in the data through visualization. The first half of the semester introduces concepts from the theory of graphics, then builds an efficient understanding of exploratory data visualization through hands-on practice in designing, drawing, and interpreting statistical graphics. The second half introduces how to use visualized models to analyze concepts or attitudes that interest most researchers but cannot be measured directly. After completing the course, students will be able to draw charts in R, carry out latent variable analysis, and interpret the results.


Course Description

Data visualization is an important issue that arises in high-dimensional data analysis. It has become increasingly important with the advent of modern computer and graphics technology. The difficulty lies in how to visualize a high-dimensional structure or data set. Such questions share a common root in statistics. This course introduces statistical methodologies useful for exploring voluminous data. The main topics fall into two parts, though they are not limited to them. The first part covers dimension reduction methods, including Principal Component Analysis (PCA), Projection Pursuit, Sliced Inverse Regression (SIR), Principal Hessian Directions (PHD), Minimum Average Variance Estimation (MAVE), and LASSO. The second part is a collection of dimension-free methods, including the Parallel Coordinate Plot, Matrix Visualization, and Generalized Association Plots (GAP). Most methods will be discussed from both theoretical and practical perspectives throughout the course. Examples from various application areas will be given.
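As a rough illustration of the first dimension reduction method named above, the following sketch finds the first principal component of a small synthetic data set. It is not course material: it is written in Python with only the standard library rather than the R software the course uses, and the toy data, the function names (center, covariance, first_component), and the choice of power iteration to find the leading eigenvector are all assumptions made for this example.

```python
import math
import random


def center(rows):
    """Subtract the column means so every variable has mean zero."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (p x p) of already-centered rows."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_component(cov, iterations=500):
    """Leading eigenvector and eigenvalue of cov, found by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the variance along direction v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue


# Hypothetical toy data: three variables that mostly vary along one line.
random.seed(0)
data = []
for _ in range(200):
    t = random.gauss(0, 1)
    data.append([t, 2 * t + random.gauss(0, 0.1), random.gauss(0, 0.1)])

centered = center(data)
cov = covariance(centered)
direction, variance = first_component(cov)

# Project each observation onto the first component (a 3-D to 1-D reduction).
scores = [sum(r[j] * direction[j] for j in range(3)) for r in centered]

# Share of the total variance captured by the first component.
total_variance = sum(cov[i][i] for i in range(3))
explained = variance / total_variance
print("direction:", direction)
print("explained variance ratio:", explained)
```

Because the toy data were generated to lie close to a single line, one component captures nearly all of the variance, which is the situation in which a one-dimensional plot of the scores is a faithful picture of the three-dimensional data.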


Reference Books

(1) Kieran Healy, 2019, Data Visualization: A Practical Introduction, Princeton University Press: Princeton and Oxford.

(2) Antony Unwin, 2015, Graphical Data Analysis with R, CRC Press.

(3) W. Holmes Finch, Brian F. French, 2015, Latent Variable Modeling with R, Routledge.

(4) A. Alexander Beaujean, 2014, Latent Variable Modeling Using R: A Step-by-Step Guide, Routledge.


Grading

Grading Method    Grading Percentage    Description




Course Information

Description

Credit: 0-3
Course Time: Friday/2,3,4 [ST023]
Teacher: 陳語婕
Class: Department of Statistics 2-4
Memo: Big Data course group (applicable to 109-113)

Enrollment Status

There are currently 3 students enrolled in the class.
