Text Mining, 6 credits
TDDE16
Main field of study
Information Technology, Computer Science and Engineering, Computer Science
Course level
Second cycle
Course type
Programme course
Examiner
Marco Kuhlmann
Director of studies or equivalent
Ann-Charlotte Hallberg
Education components
Preliminary scheduled hours: 0 h
Recommended self-study hours: 160 h
Code | Course offered for | Semester | Period | Timetable module | Language | Campus | ECV |
---|---|---|---|---|---|---|---|
6CDDD | Computer Science and Engineering, M Sc in Engineering | 9 (Autumn 2017) | 2 | 2 | English | Linköping, Valla | E |
6CDDD | Computer Science and Engineering, M Sc in Engineering (AI and Machine Learning) | 9 (Autumn 2017) | 2 | 2 | English | Linköping, Valla | E |
6CMJU | Computer Science and Software Engineering, M Sc in Engineering | 9 (Autumn 2017) | 2 | 2 | English | Linköping, Valla | E |
6CMJU | Computer Science and Software Engineering, M Sc in Engineering (AI and Machine Learning) | 9 (Autumn 2017) | 2 | 2 | English | Linköping, Valla | E |
6MDAV | Computer Science, Master's programme | 3 (Autumn 2017) | 2 | 2 | English | Linköping, Valla | E |
6MICS | Computer Science, Master's programme | 3 (Autumn 2017) | 2 | 2 | English | Linköping, Valla | E |
6CITE | Information Technology, M Sc in Engineering | 9 (Autumn 2017) | 2 | 2 | English | Linköping, Valla | E |
6CITE | Information Technology, M Sc in Engineering (AI and Machine Learning) | 9 (Autumn 2017) | 2 | 2 | English | Linköping, Valla | E |
Advancement level
A1X
Course offered for
- Computer Science and Software Engineering, M Sc in Engineering
- Computer Science and Engineering, M Sc in Engineering
- Information Technology, M Sc in Engineering
- Computer Science, Master's programme
Entry requirements
Note: Admission requirements for non-programme students usually also include the admission requirements for the programme and the threshold requirements for progression within the programme, or equivalent.
Prerequisites
Mathematical analysis, linear algebra, probability and statistics, machine learning, and basic programming.
Intended learning outcomes
The overall aim of the course is to provide an introduction to quantitative analysis of text, with special focus on applying machine learning methods to text documents. The student will learn the main steps of working with text: (i) efficient extraction of text, (ii) natural language processing that puts the text in a form suitable for (iii) statistical machine learning methods, which are subsequently used for (iv) text prediction.
After completing the course the student should be able to:
- use basic methods for information extraction and retrieval of textual data
- apply text processing techniques to prepare documents for statistical modelling
- apply relevant machine learning models for analyzing textual data and correctly interpret the results
- use machine learning models for text prediction
- evaluate the performance of machine learning models for textual data
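To make these outcomes concrete, the following is a minimal sketch of one possible workflow: tokenising documents, training a multinomial naive Bayes classifier with add-one smoothing, predicting labels, and measuring accuracy. It is an illustration only and not taken from the course material; the function names and the toy sentiment data are hypothetical, and the course may use other models and tools.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(docs, labels):
    """Collect class counts, per-class word counts, and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token in vocab:  # words never seen in training are ignored
                count = word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs, labels):
    """Return the fraction of documents whose predicted label is correct."""
    correct = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return correct / len(docs)


# Hypothetical toy data, invented for this sketch.
train_docs = [
    "great film, loved it",
    "wonderful acting and great story",
    "terrible plot, boring film",
    "awful and boring",
]
train_labels = ["pos", "pos", "neg", "neg"]
test_docs = ["great story", "boring and awful plot"]
test_labels = ["pos", "neg"]

model = train_naive_bayes(train_docs, train_labels)
print(accuracy(model, test_docs, test_labels))
```

On realistic data the evaluation step would use a held-out test set that is much larger than this toy example, and typically further measures such as precision and recall.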
Course content
Introduction and overview of quantitative text analysis and its applications. Information extraction. Web crawling. Information retrieval. Tf-idf. Vector space models. Text preprocessing. Bag of words. N-grams. Sparsity and smoothing for text. Document classification. Sentiment analysis. Model evaluation. Topic models.
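As a small illustration of two of the listed topics, tf-idf and vector space models, the sketch below weights each term by its raw frequency times log(N / df) and compares documents by cosine similarity. This is one common tf-idf variant among several; the function names and the three example documents are hypothetical and not part of the course material.

```python
import math
from collections import Counter


def tf_idf_vectors(tokenized_docs):
    """Return one {term: weight} dict per document (raw tf times log(N / df))."""
    n_docs = len(tokenized_docs)
    df = Counter()
    for tokens in tokenized_docs:
        df.update(set(tokens))  # count each term once per document
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized_docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical example documents, already split into tokens.
docs = [
    "text mining finds patterns in text".split(),
    "topic models describe text collections".split(),
    "web crawling collects pages".split(),
]
vectors = tf_idf_vectors(docs)
print(cosine(vectors[0], vectors[1]))  # positive: both contain "text"
print(cosine(vectors[0], vectors[2]))  # zero: no shared terms
```

Because "text" occurs in two of the three documents, it receives a lower idf weight than terms that occur in only one, which is the behaviour tf-idf is designed to produce.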
Teaching and working methods
The course consists of lectures, computer laboratory work and an individual project. The lectures introduce concepts and theories that students then apply to problem solving in the computer labs and in the project work.
Examination
Code | Name | Scope | Grading scale |
---|---|---|---|
PRA1 | Project | 3 credits | U, 3, 4, 5 |
LAB1 | Laboratory exercises | 3 credits | U, G |
LAB1 consists of computer exercises that test the students' ability to translate theoretical knowledge into practical problem solving in machine learning.
PRA1 is an individual project in which the student solves a real-world problem involving text. The project is documented in a written project report, which forms the basis for evaluation.
Grades
Four-grade scale, LiU, U, 3, 4, 5
Department
Department of Computer and Information Science (Institutionen för datavetenskap)
Regulations (apply to LiU in its entirety)
The university is a government agency whose operations are regulated by legislation and ordinances, which include the Higher Education Act and the Higher Education Ordinance. In addition to legislation and ordinances, operations are subject to several policy documents. The Linköping University rule book collects currently valid decisions of a regulatory nature taken by the university board, the vice-chancellor and faculty/department boards.
LiU’s rule book for education at first-cycle and second-cycle levels is available at http://styrdokument.liu.se/Regelsamling/Innehall/Utbildning_pa_grund-_och_avancerad_niva.
Note: The course matrix might contain more information in Swedish.
 | I | U | A | Modules | Comment |
---|---|---|---|---|---|
1. DISCIPLINARY KNOWLEDGE AND REASONING | | | | | |
1.1 Knowledge of underlying mathematics and science (G1X level) | X | X | X | LAB1, PRA1 | |
1.2 Fundamental engineering knowledge (G1X level) | X | X | X | LAB1, PRA1 | |
1.3 Further knowledge, methods, and tools in one or several subjects in engineering or natural science (G2X level) | | | | | |
1.4 Advanced knowledge, methods, and tools in one or several subjects in engineering or natural sciences (A1X level) | | | | | |
1.5 Insight into current research and development work | | | | | |
2. PERSONAL AND PROFESSIONAL SKILLS AND ATTRIBUTES | | | | | |
2.1 Analytical reasoning and problem solving | X | X | X | LAB1, PRA1 | |
2.2 Experimentation, investigation, and knowledge discovery | X | X | X | LAB1, PRA1 | |
2.3 System thinking | X | X | X | PRA1 | |
2.4 Attitudes, thought, and learning | | | | | |
2.5 Ethics, equity, and other responsibilities | | | | | |
3. INTERPERSONAL SKILLS: TEAMWORK AND COMMUNICATION | | | | | |
3.1 Teamwork | | | X | PRA1 | |
3.2 Communications | | | X | PRA1 | |
3.3 Communication in foreign languages | | | | | |
4. CONCEIVING, DESIGNING, IMPLEMENTING AND OPERATING SYSTEMS IN THE ENTERPRISE, SOCIETAL AND ENVIRONMENTAL CONTEXT | | | | | |
4.1 External, societal, and environmental context | X | | | | |
4.2 Enterprise and business context | | | | | |
4.3 Conceiving, system engineering and management | | | | | |
4.4 Designing | | | | PRA1 | |
4.5 Implementing | | | | | |
4.6 Operating | | | | | |
5. PLANNING, EXECUTION AND PRESENTATION OF RESEARCH DEVELOPMENT PROJECTS WITH RESPECT TO SCIENTIFIC AND SOCIETAL NEEDS AND REQUIREMENTS | | | | | |
5.1 Societal conditions, including economic, social, and ecological aspects of sustainable development for knowledge development | | | | | |
5.2 Economic conditions for knowledge development | | | | | |
5.3 Identification of needs, structuring and planning of research or development projects | | | | | |
5.4 Execution of research or development projects | | | | | |
5.5 Presentation and evaluation of research or development projects | | | | | |