Data Mining - Clustering and Association Analysis, 6 credits

TDDD41

Main field of study

Information Technology, Computer Science and Engineering, Computer Science

Course level

Second cycle

Course type

Programme course

Examiner

Patrick Lambrix

Director of studies or equivalent

Patrick Lambrix

Education components

Preliminary scheduled hours: 26 h
Recommended self-study hours: 134 h
ECV = Elective / Compulsory / Voluntary
Course offered for | Semester | Period | Timetable module | Language | Campus | ECV
6CDDD Computer Science and Engineering, M Sc in Engineering | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CDDD Computer Science and Engineering, M Sc in Engineering (AI and Machine Learning) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CDDD Computer Science and Engineering, M Sc in Engineering (Programming and Algorithms) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CMJU Computer Science and Software Engineering, M Sc in Engineering | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CMJU Computer Science and Software Engineering, M Sc in Engineering (AI and Machine Learning) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CMJU Computer Science and Software Engineering, M Sc in Engineering (Programming and Algorithms Specialization) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6MICS Computer Science, Master's Programme | 2 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6MICS Computer Science, Master's Programme (AI and Data Mining) | 2 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIEI Industrial Engineering and Management - International, M Sc in Engineering - Chinese | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIEI Industrial Engineering and Management - International, M Sc in Engineering - Chinese (Specialization Computer Science and Engineering) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIEI Industrial Engineering and Management - International, M Sc in Engineering - French | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIEI Industrial Engineering and Management - International, M Sc in Engineering - French (Specialization Computer Science and Engineering) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIEI Industrial Engineering and Management - International, M Sc in Engineering - German | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIEI Industrial Engineering and Management - International, M Sc in Engineering - German (Specialization Computer Science and Engineering) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIEI Industrial Engineering and Management - International, M Sc in Engineering - Japanese | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIEI Industrial Engineering and Management - International, M Sc in Engineering - Japanese (Specialization Computer Science and Engineering) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIEI Industrial Engineering and Management - International, M Sc in Engineering - Spanish | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIEI Industrial Engineering and Management - International, M Sc in Engineering - Spanish (Specialization Computer Science and Engineering) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIII Industrial Engineering and Management, M Sc in Engineering | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CIII Industrial Engineering and Management, M Sc in Engineering (Computer Science and Engineering Specialization) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CITE Information Technology, M Sc in Engineering | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CITE Information Technology, M Sc in Engineering (AI and Machine Learning) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E
6CITE Information Technology, M Sc in Engineering (Programming and Algorithms Specialization) | 8 (Spring 2020) | 1 | 3 | English | Linköping, Valla | E

Advancement level

A1X

Course offered for

  • Master's Programme in Computer Science
  • Computer Science and Engineering, M Sc in Engineering
  • Industrial Engineering and Management - International, M Sc in Engineering
  • Industrial Engineering and Management, M Sc in Engineering
  • Information Technology, M Sc in Engineering
  • Computer Science and Software Engineering, M Sc in Engineering

Entry requirements

Note: Admission requirements for non-programme students usually also include admission requirements for the programme and threshold requirements for progression within the programme, or corresponding.

Prerequisites

The course requires thorough knowledge of programming, discrete mathematics, data structures and algorithms, and databases.

Intended learning outcomes

The course lays the foundation for professional work and research in which large amounts of data are explored, modified, modelled and assessed to uncover previously unknown patterns and trends. The course focuses on clustering and association analysis.
Having completed the course, the student should be able to:

  • understand and be able to use important terminology in data mining
  • understand and use the theory behind clustering and association analysis
  • use knowledge about techniques for clustering and association analysis
  • demonstrate insightful assessment of the quality of given data sets and the information content on which clustering and association analysis can be based
  • use and evaluate tools for clustering and association analysis

 

Course content

Association analysis: concepts and methods related to frequent item sets and association rules, such as the Apriori principle, FP-growth, and evaluation of association rules.
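As an informal illustration of the association analysis topics listed above, the following is a minimal sketch of frequent item set mining based on the Apriori principle (every subset of a frequent item set must itself be frequent), together with the confidence measure for rule evaluation. It is not taken from the course material: the function names `apriori` and `confidence`, the toy transactions and the support threshold are all assumptions chosen for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {item set: support count} for all frequent item sets.

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unions of frequent (k-1)-sets that contain exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (Apriori principle): all (k-1)-subsets must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count the surviving candidates against the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


def confidence(freq, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from support counts."""
    a = frozenset(antecedent)
    return freq[a | frozenset(consequent)] / freq[a]


# Hypothetical toy data set, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
freq = apriori(transactions, min_support=3)
print(confidence(freq, {"beer"}, {"diapers"}))
```

FP-growth finds the same frequent item sets without generating candidates; it is omitted here to keep the sketch short.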

Clustering: concepts and methods related to clustering, including partitional, hierarchical and density-based clustering methods, and cluster evaluation.
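As an informal illustration of a partitional clustering method and a simple cluster evaluation measure, the following is a minimal sketch of k-means with the sum of squared errors (SSE). It is not taken from the course material: the function names `kmeans` and `sse`, the toy points and the fixed initial centroids are assumptions chosen for illustration.

```python
def kmeans(points, centroids, max_iter=100):
    """Plain k-means: alternate assignment and update steps until stable.

    The initial centroids are passed in explicitly so that the run is
    deterministic (an assumption of this sketch).
    """
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


def sse(centroids, clusters):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, c))
        for c, cl in zip(centroids, clusters)
        for p in cl
    )


# Hypothetical toy data set, for illustration only.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(centroids, sse(centroids, clusters))
```

A lower SSE indicates more compact clusters for a fixed number of clusters; hierarchical and density-based methods are omitted here to keep the sketch short.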

Teaching and working methods

The teaching comprises lectures and computer laboratory work. Lectures are devoted to theory, concepts and techniques. The techniques are practised in the computer laboratory work.

Examination

Code | Name | Scope | Grading scale
LAB1 | Laboratory work | 2 credits | U, G
TEN1 | Written examination | 4 credits | U, 3, 4, 5

Grades

Four-grade scale, LiU, U, 3, 4, 5

Course literature

Jiawei Han, Micheline Kamber, Jian Pei, Data Mining - Concepts and Techniques, 3rd edition, Morgan-Kaufmann, 2011. ISBN: 978-0123814791.

Article collection.

Other information

About teaching and examination language

The teaching language is presented in the Overview tab for each course. The examination language relates to the teaching language as follows: 

  • If teaching language is Swedish, the course as a whole or in large parts, is taught in Swedish. Please note that although teaching language is Swedish, parts of the course could be given in English. Examination language is Swedish. 
  • If teaching language is Swedish/English, the course as a whole will be taught in English if students without prior knowledge of the Swedish language participate. Examination language is Swedish or English (depending on teaching language). 
  • If teaching language is English, the course as a whole is taught in English. Examination language is English. 

Other

The course is conducted in a manner where both men's and women's experience and knowledge are made visible and developed. 

The planning and implementation of a course should correspond to the course syllabus. The course evaluation should therefore be conducted with the course syllabus as a starting point.  

Department

Department of Computer and Information Science (Institutionen för datavetenskap)

Course website and other links

http://www.ida.liu.se/~TDDD41

Course literature

Books

  • Jiawei Han, Micheline Kamber, Jian Pei, (2011) Data Mining - Concepts and Techniques 3 Morgan-Kaufmann
    ISBN: 978-0123814791

Other

  • Article collection 2018.

Course syllabus

A syllabus must be established for each course. The syllabus specifies the aim and contents of the course, and the prior knowledge that a student must have in order to be able to benefit from the course.

Timetabling

Courses are timetabled after a decision has been made for this course concerning its assignment to a timetable module. 

Interrupting a course

The vice-chancellor’s decision concerning regulations for registration, deregistration and reporting results (Dnr LiU-2015-01241) states that interruptions in study are to be recorded in Ladok. Thus, all students who do not participate in a course for which they have registered must record the interruption, such that the registration on the course can be removed. Deregistration from a course is carried out using a web-based form: https://www.lith.liu.se/for-studenter/kurskomplettering?l=en. 

Cancelled courses

Courses with few participants (fewer than 10) may be cancelled or organised in a manner that differs from that stated in the course syllabus. The Dean is to deliberate and decide whether a course is to be cancelled or changed from the course syllabus. 

Guidelines relating to examinations and examiners 

For details, see Guidelines for education and examination for first-cycle and second-cycle education at Linköping University, http://styrdokument.liu.se/Regelsamling/VisaBeslut/917592.

An examiner must be employed as a teacher at LiU according to the LiU Regulations for Appointments (https://styrdokument.liu.se/Regelsamling/VisaBeslut/622784). For second-cycle courses, the following teachers can be appointed as examiner: Professor (including Adjunct and Visiting Professor), Associate Professor (including Adjunct), Senior Lecturer (including Adjunct and Visiting Senior Lecturer), Research Fellow, or Postdoc. For first-cycle courses, Assistant Lecturer (including Adjunct and Visiting Assistant Lecturer) can also be appointed as examiner in addition to those listed for second-cycle courses. In exceptional cases, a Part-time Lecturer can also be appointed as an examiner at both first-cycle and second-cycle level; see Delegation of authority for the Board of Faculty of Science and Engineering.

Forms of examination

Examination

Written and oral examinations are held at least three times a year: once immediately after the end of the course, once in August, and once (usually) in one of the re-examination periods. Examinations held at other times are to follow a decision of the board of studies.

Principles for examination scheduling for courses that follow the study periods:

  • courses given in VT1 are examined for the first time in March, with re-examination in June and August
  • courses given in VT2 are examined for the first time in May, with re-examination in August and October
  • courses given in HT1 are examined for the first time in October, with re-examination in January and August
  • courses given in HT2 are examined for the first time in January, with re-examination in March and in August.

The examination schedule is based on the structure of timetable modules, but there may be deviations from this, mainly in the case of courses that are studied and examined for several programmes and in the lower years of study (i.e. years 1 and 2).

Examinations for courses that the board of studies has decided are to be held in alternate years are held three times during the academic year in which the course is given, according to the principles stated above.

Examinations for courses that are cancelled or rescheduled such that they are not given in one or several years are held three times during the year that immediately follows the course, with examination scheduling that corresponds to the scheduling that was in force before the course was cancelled or rescheduled.

When a course is given for the last time, the regular examination and two re-examinations will be offered. Thereafter, examinations are phased out by offering three examinations during the following academic year at the same times as the examinations in any substitute course. If there is no substitute course, three examinations will be offered during re-examination periods during the following academic year. Other examination times are decided by the board of studies. In all cases above, the examination is also offered one more time during the academic year after the following, unless the board of studies decides otherwise.

If a course is given during several periods of the year (for programmes, or on different occasions for different programmes) the board or boards of studies determine together the scheduling and frequency of re-examination occasions.

Registration for examination

In order to take an examination, a student must register in advance at the Student Portal during the registration period, which opens 30 days before the date of the examination and closes 10 days before it. Candidates are informed of the location of the examination by email, four days in advance. Students who have not registered for an examination run the risk of being refused admittance to the examination, if space is not available.

Symbols used in the examination registration system:

  ** denotes that the examination is being given for the penultimate time.

  * denotes that the examination is being given for the last time.

Code of conduct for students during examinations

Details are given in a decision in the university’s rule book: http://styrdokument.liu.se/Regelsamling/VisaBeslut/622682.

Retakes for higher grade

Students at the Institute of Technology at LiU have the right to retake written examinations and computer-based examinations in an attempt to achieve a higher grade. This is valid for all examination components with code “TEN” and “DAT”. The same right may not be exercised for other examination components, unless otherwise specified in the course syllabus.

A retake is not possible on courses that are included in an issued degree diploma. 

Retakes of other forms of examination

Regulations concerning retakes of other forms of examination than written examinations and computer-based examinations are given in the LiU guidelines for examinations and examiners, http://styrdokument.liu.se/Regelsamling/VisaBeslut/917592.

Plagiarism

For examinations that involve the writing of reports, in cases in which it can be assumed that the student has had access to other sources (such as during project work, writing essays, etc.), the material submitted must be prepared in accordance with principles for acceptable practice when referring to sources (references or quotations for which the source is specified) when the text, images, ideas, data, etc. of other people are used. It is also to be made clear whether the author has reused his or her own text, images, ideas, data, etc. from previous examinations, such as degree projects, project reports, etc. (this is sometimes known as “self-plagiarism”).

A failure to specify such sources may be regarded as attempted deception during examination.

Attempts to cheat

In the event of a suspected attempt by a student to cheat during an examination, or when study performance is to be assessed as specified in Chapter 10 of the Higher Education Ordinance, the examiner is to report this to the disciplinary board of the university. Possible consequences for the student are suspension from study and a formal warning. More information is available at https://www.student.liu.se/studenttjanster/lagar-regler-rattigheter?l=en.

Grades

The grades that are preferably to be used are Fail (U), Pass (3), Pass not without distinction (4) and Pass with distinction (5). 

  1. Grades U, 3, 4, 5 are to be awarded for courses that have written examinations.
  2. Grades Fail (U) and Pass (G) may be awarded for courses with a large degree of practical components such as laboratory work, project work and group work.
  3. Grades Fail (U) and Pass (G) are to be used for degree projects and other independent work.

Examination components

  1. Grades U, 3, 4, 5 are to be awarded for written examinations (TEN).
  2. Examination components for which the grades Fail (U) and Pass (G) may be awarded are laboratory work (LAB), project work (PRA), preparatory written examination (KTR), oral examination (MUN), computer-based examination (DAT), home assignment (HEM), and assignment (UPG).
  3. Students receive grades either Fail (U) or Pass (G) for other examination components in which the examination criteria are satisfied principally through active attendance such as other examination (ANN), tutorial group (BAS) or examination item (MOM).
  4. Grades Fail (U) and Pass (G) are to be used for the examination components Opposition (OPPO) and Attendance at thesis presentation (AUSK) (i.e. part of the degree project).

For mandatory components, the following applies: If special circumstances prevail, and if it is possible with consideration of the nature of the compulsory component, the examiner may decide to replace the compulsory component with another equivalent component. (In accordance with the LiU Guidelines for education and examination for first-cycle and second-cycle education at Linköping University, http://styrdokument.liu.se/Regelsamling/VisaBeslut/917592). 

For written examinations, the following applies: If the LiU coordinator for students with disabilities has granted a student the right to an adapted examination for a written examination in an examination hall, the student has the right to it. If the coordinator has instead recommended for the student an adapted examination or alternative form of examination, the examiner may grant this if the examiner assesses that it is possible, based on consideration of the course objectives. (In accordance with the LiU Guidelines for education and examination for first-cycle and second-cycle education at Linköping University, http://styrdokument.liu.se/Regelsamling/VisaBeslut/917592).

The examination results for a student are reported at the relevant department.

Regulations (apply to LiU in its entirety)

The university is a government agency whose operations are regulated by legislation and ordinances, which include the Higher Education Act and the Higher Education Ordinance. In addition to legislation and ordinances, operations are subject to several policy documents. The Linköping University rule book collects currently valid decisions of a regulatory nature taken by the university board, the vice-chancellor and faculty/department boards.

LiU’s rule book for education at first-cycle and second-cycle levels is available at http://styrdokument.liu.se/Regelsamling/Innehall/Utbildning_pa_grund-_och_avancerad_niva. 

Note: The course matrix might contain more information in Swedish.

I = Introduce, U = Teach, A = Utilize
I U A Modules Comment
1. DISCIPLINARY KNOWLEDGE AND REASONING
1.1 Knowledge of underlying mathematics and science (G1X level)
X
Basic mathematical concepts
1.2 Fundamental engineering knowledge (G1X level)
X
X
LAB1
TEN1
Programming, modeling, database technology
1.3 Further knowledge, methods, and tools in one or several subjects in engineering or natural science (G2X level)
X
X
LAB1
TEN1

                            
1.4 Advanced knowledge, methods, and tools in one or several subjects in engineering or natural sciences (A1X level)

                            
1.5 Insight into current research and development work

                            
2. PERSONAL AND PROFESSIONAL SKILLS AND ATTRIBUTES
2.1 Analytical reasoning and problem solving
X
X
LAB1
TEN1
Finding good models from a data set
2.2 Experimentation, investigation, and knowledge discovery
X
X
LAB1
Labs
2.3 System thinking
X
X
LAB1
TEN1
Choosing solutions for problems
2.4 Attitudes, thought, and learning
X
X
LAB1
TEN1
Creative and critical thinking
2.5 Ethics, equity, and other responsibilities

                            
3. INTERPERSONAL SKILLS: TEAMWORK AND COMMUNICATION
3.1 Teamwork
X
LAB1
Labs in pairs
3.2 Communications
X
LAB1
Written reports for labs
3.3 Communication in foreign languages
X
LAB1
TEN1
Course given in English
4. CONCEIVING, DESIGNING, IMPLEMENTING AND OPERATING SYSTEMS IN THE ENTERPRISE, SOCIETAL AND ENVIRONMENTAL CONTEXT
4.1 External, societal, and environmental context

                            
4.2 Enterprise and business context

                            
4.3 Conceiving, system engineering and management

                            
4.4 Designing

                            
4.5 Implementing

                            
4.6 Operating

                            
5. PLANNING, EXECUTION AND PRESENTATION OF RESEARCH DEVELOPMENT PROJECTS WITH RESPECT TO SCIENTIFIC AND SOCIETAL NEEDS AND REQUIREMENTS
5.1 Societal conditions, including economic, social, and ecological aspects of sustainable development for knowledge development

                            
5.2 Economic conditions for knowledge development

                            
5.3 Identification of needs, structuring and planning of research or development projects

                            
5.4 Execution of research or development projects

                            
5.5 Presentation and evaluation of research or development projects

                            
