Text Mining, 6 credits
732A92
Main field of study
Statistics
Course level
Second cycle
Course type
Single subject and programme course
Examiner
Marco Kuhlmann
Course coordinator
Marco Kuhlmann
Director of studies or equivalent
Peter Dalenius
Available for exchange students
Yes
Contact
Isak Hietala
Kostas Mitropoulos, international coordinator
Course offered for | Semester | Weeks | Timetable module | Language | Campus | ECV |
---|---|---|---|---|---|---|
Single subject course (Half-time, Day-time) | Autumn 2022 | 202244-202302 | 2 | English | Linköping, Valla | |
Advancement level
A1N
Entry requirements
- 180 ECTS credits passed including 90 ECTS credits in one of the following subjects:
  - statistics
  - mathematics
  - applied mathematics
  - computer science
  - engineering
- Passed courses in:
  - calculus
  - linear algebra
  - statistics
  - programming
- English corresponding to the level of English in Swedish upper secondary education (Engelska 6)
Exemption from Swedish
Intended learning outcomes
After completing the course, the student should be able to, at an advanced level:
- use basic methods for information extraction and retrieval of textual data,
- apply text processing techniques to prepare documents for statistical modelling,
- apply relevant statistical models for analyzing textual data and correctly interpret the results,
- use statistical models for prediction of textual information,
- evaluate the performance of statistical models for textual data.
Course content
The course presents how textual data can be retrieved, linguistically pre-processed and subsequently analyzed quantitatively using formal statistical methods and models. The course brings together expertise from the areas of database methodology, computational linguistics and statistics.
The following topics are covered:
Introduction and overview of quantitative text analysis and its applications; Information extraction; Web crawling; Information retrieval; Tf-idf; Vector space models; Text preprocessing; Bag of words; N-grams; Sparsity and smoothing for text; Document classification; Sentiment analysis; Model evaluation; Topic models.
Teaching and working methods
The teaching comprises lectures, lab exercises and a text mining project. The lectures are devoted to presentations of concepts and methods. The computer lab exercises are devoted to the practical application of text mining tools. In the project work, the student gets hands-on experience in solving a text mining problem. Homework and independent study are a necessary complement to the course.
Language of instruction: English.
Examination
Written report on the text mining project. Written reports on lab assignments. Detailed information about the examination can be found in the course’s study guide.
If special circumstances prevail, and if it is possible with consideration of the nature of the compulsory component, the examiner may decide to replace the compulsory component with another equivalent component.
If the LiU coordinator for students with disabilities has granted a student the right to an adapted examination for a written examination in an examination hall, the student has the right to it.
If the coordinator has recommended for the student an adapted examination or alternative form of examination, the examiner may grant this if the examiner assesses that it is possible, based on consideration of the course objectives.
An examiner may also decide on an adapted examination or alternative form of examination if the examiner assesses that special circumstances prevail and that this is possible while maintaining the objectives of the course.
Students failing an exam covering either the entire course or part of the course twice are entitled to have a new examiner appointed for the reexamination.
Students who have passed an examination may not retake it in order to improve their grades.
Grades
ECTS, EC
Other information
Planning and implementation of a course must take the wording of the syllabus as their starting point. The course evaluation included in each course must therefore address the question of how well the course agrees with the syllabus.
The course is carried out in such a way that both men's and women's experience and knowledge are made visible and developed.
If special circumstances prevail, the vice-chancellor may in a special decision specify the preconditions for temporary deviations from this course syllabus, and delegate the right to take such decisions.
Department
Institutionen för datavetenskap
Code | Name | Scope | Grading scale |
---|---|---|---|
PRA1 | Examination | 3 credits | EC |
LAB1 | Laboratory work | 3 credits | EC |