Natural Language Processing (NLP) aims to enable computers to understand, analyze, and generate human language, covering topics such as text preprocessing, word vectors, language models, sequence labeling, semantic analysis, and generative models. This course progresses from fundamentals to advanced topics, combining theory with hands-on practice, and guides students to:
Become familiar with the basic characteristics of natural language and its common challenges (ambiguity, sparsity, context dependence, etc.).
Master text preprocessing (tokenization, stop words, stemming/lemmatization, term frequency/inverse document frequency, etc.) and feature extraction techniques.
Understand the principles and implementation of word embeddings and distributed representations (Word2Vec, GloVe, FastText).
Learn how statistical and deep learning methods (RNN, LSTM, GRU, Transformer) are applied to tasks such as text classification, sequence labeling, and machine translation.
Become familiar with the architectures and fine-tuning workflows of current mainstream pre-trained language models (e.g., BERT, GPT).
Be able to build simple NLP projects (e.g., sentiment analysis, text generation, dialogue systems) and evaluate model performance with metrics such as Accuracy, Precision, Recall, F1-score, BLEU, and ROUGE.
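The preprocessing steps named in the objectives above (tokenization, stop-word removal, and term frequency/inverse document frequency weighting) can be sketched in a few lines of plain Python. This is a minimal illustration, not a production pipeline; the tiny stop-word list and the sample documents are invented for the example, and real coursework would typically use a library such as scikit-learn or NLTK instead.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "is", "of", "and"}  # tiny illustrative stop list

def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]

def tf_idf(docs):
    """Return one {term: tf-idf weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in tokenized for term in set(doc))
    weights = []
    for doc in tokenized:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights

docs = ["the cat sat", "the dog sat", "the cat ran"]
weights = tf_idf(docs)
```

Terms that appear in fewer documents (here, "dog") receive a higher weight than terms shared across documents (here, "sat"), which is exactly the intuition behind inverse document frequency.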
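Several of the evaluation metrics listed above (Precision, Recall, F1-score) follow directly from the confusion-matrix counts. A hand-rolled sketch for the binary case, assuming labels encoded as 0/1 (the sample labels below are invented for illustration):

```python
def precision_recall_f1(y_true, y_pred):
    """Compute binary-classification precision, recall, and F1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

p, r, f1 = precision_recall_f1([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])
```

BLEU and ROUGE, used for generation tasks, are more involved (n-gram overlap with brevity/length handling) and are usually taken from an existing implementation rather than written by hand.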
To be selected
Grading Method | Grading Percentage | Description
---|---|---
Exam | 30% | 
Work & Daily Performance | 70% | 