Big Data Engineering (5 cr)
Code: TT00CN70
Credits
5 cr
Objective
After completing the course the student can:
- describe basic solutions for data architectures and big data
- select and use suitable data architecture
- apply ETL process and tools for handling of big data
Content
Architecture and Components of Big Data Frameworks
ETL process with Big Data for batch and streaming
Practical work with suitable tools and frameworks
Enrollment
04.12.2024 - 14.01.2025
Timing
14.01.2025 - 30.04.2025
Number of ECTS credits allocated
5 cr
Mode of delivery
Contact teaching
Unit
Engineering and Business
Campus
Kupittaa Campus
Teaching languages
- English
Seats
0 - 80
Teachers
- Tommi Tuomola
Scheduling groups
- Group 1 (Size: 35. Open UAS: 0.)
- Group 2 (Size: 35. Open UAS: 0.)
Groups
- PTIETS23deai Data Engineering and Artificial Intelligence
- PTIVIS23I Data Engineering and Artificial Intelligence
Small groups
- Group 1
- Group 2
Materials
Teacher provided lecture material
Supporting public online material
All needed material (or at least a link to them) will be available in itslearning.
Teaching methods
Contact learning, practical exercises, independent study
Exam schedules
The exam is held in April, with a re-exam in May.
International connections
Given examples and exercises support each topic studied during the lectures. Additional material in the form of tutorials and reliable information sources is provided.
Student workload
Contact hours 44 h
Independent study 91 h, including:
- Studying the course material
- Completing exercises
- Exam
Content scheduling
- The basic idea of big data engineering methods and pipelines
- Different components and processes
- Integration of said components (MQ systems)
- Data engineering frameworks (Apache family)
- The goal of the course is to be able to build a data pipeline from start to finish and to understand both the process and the different components and their role.
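As an illustration of the pipeline idea outlined above, the following minimal sketch moves records through extract, transform and load stages connected by in-memory queues. It uses only the Python standard library: `queue.Queue` merely stands in for a real message queue system, and all function names and the sample record format are hypothetical rather than taken from the course material.

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream


def extract(out_q, raw_lines):
    """Extract: push raw text lines into the first queue."""
    for line in raw_lines:
        out_q.put(line)
    out_q.put(SENTINEL)


def transform(in_q, out_q):
    """Transform: parse 'name,value' lines into dictionaries."""
    while True:
        line = in_q.get()
        if line is SENTINEL:
            out_q.put(SENTINEL)
            break
        name, value = line.split(",")
        out_q.put({"sensor": name.strip(), "value": float(value)})


def load(in_q, sink):
    """Load: append parsed records to the target (here a plain list)."""
    while True:
        record = in_q.get()
        if record is SENTINEL:
            break
        sink.append(record)


def run_pipeline(raw_lines):
    """Run the three stages concurrently, one thread per stage."""
    raw_q, clean_q, sink = queue.Queue(), queue.Queue(), []
    stages = [
        threading.Thread(target=extract, args=(raw_q, raw_lines)),
        threading.Thread(target=transform, args=(raw_q, clean_q)),
        threading.Thread(target=load, args=(clean_q, sink)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return sink
```

Calling `run_pipeline(["a, 1.5", "b,2"])` returns the parsed records in their input order, because each stage is a single thread reading from a first-in, first-out queue.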
Further information
Itslearning and contact classes are the main communication channels used on this course.
The student is required to have a computer capable of running a simple Ubuntu virtual machine and basic skills for working on the Ubuntu command line.
Evaluation scale
H-5
Assessment methods and criteria
Homework exercises returned throughout the course
Small exam at the end of the course
11 sets of exercises, each worth 4 points, max 44 points.
Exam, max 22 points.
Required minimum to pass:
33 points in total.
22 points from the exercises.
11 points from the exam.
Assessment criteria, fail (0)
Fewer than 33 points in total,
or fewer than 22 points from the exercises,
or fewer than 11 points from the exam
Assessment criteria, satisfactory (1-2)
33-46 total points.
Assessment criteria, good (3-4)
47-59 total points.
Assessment criteria, excellent (5)
60-66 total points.
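The point thresholds above can be read as a small decision rule. The following is a minimal sketch in Python, assuming the three minimum requirements are checked first and the total then selects the grade band. The function name `grade_band` is hypothetical, and because the criteria do not state how grades are split inside the 1-2 and 3-4 bands, the function returns the band rather than a single grade.

```python
def grade_band(exercise_points: int, exam_points: int) -> str:
    """Map course points to a grade band using the listed thresholds."""
    # Exercises give at most 44 points, the exam at most 22 points.
    if not 0 <= exercise_points <= 44 or not 0 <= exam_points <= 22:
        raise ValueError("points out of range")

    total = exercise_points + exam_points

    # All three minimums must be met to pass.
    if total < 33 or exercise_points < 22 or exam_points < 11:
        return "fail (0)"
    if total <= 46:
        return "satisfactory (1-2)"
    if total <= 59:
        return "good (3-4)"
    return "excellent (5)"
```

For example, 30 exercise points and 17 exam points give 47 points in total, which falls in the good (3-4) band, while 40 exercise points and 10 exam points fail because the exam minimum is not met.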
Enrollment
04.12.2024 - 13.01.2025
Timing
13.01.2025 - 30.04.2025
Number of ECTS credits allocated
5 cr
Mode of delivery
Contact teaching
Unit
Engineering and Business
Campus
Kupittaa Campus
Teaching languages
- English
Seats
0 - 40
Teachers
- Tommi Tuomola
Groups
- PTIVIS22H Health Technology
Materials
Teacher provided lecture material
Supporting public online material
Teacher provided virtual machines
All needed material (or at least a link to them) will be available in itslearning.
Teaching methods
Contact learning, practical exercises, independent study
Exam schedules
There's no exam.
International connections
Given examples and exercises support each topic studied during the lectures. Additional material in the form of tutorials and reliable information sources is provided.
Student workload
Contact hours 44 h
Independent study 91 h, including:
- Studying the course material
- Completing exercises
- Small Personal Project
Content scheduling
- The basic idea of big data engineering methods and pipelines
- Different components and processes
- Integration of said components (MQ systems)
- Data engineering frameworks (Apache family)
- The goal of the course is to be able to build a data pipeline from start to finish and to understand both the process and the different components and their role.
Further information
Itslearning and contact classes are the main communication channels used on this course.
The student is required to have a computer capable of running a simple Ubuntu virtual machine and basic skills for working on the Ubuntu command line.
Evaluation scale
H-5
Assessment methods and criteria
Homework exercises returned throughout the course
Small project at the end of the course
Assessment criteria, satisfactory (1-2)
The student has a basic understanding of how big data engineering processes work, what components the systems consist of and how they are used. The student has an idea of what can be done with big data engineering systems.
Assessment criteria, good (3-4)
The student has a good understanding of big data engineering systems and processes. The student is able to install many of the components and understands how they work together in a pipeline.
Assessment criteria, excellent (5)
The student understands and is capable of designing big data engineering pipelines. The student is able to install and configure the components and understands what questions need to be considered when designing, deploying and implementing the system.
Enrollment
29.11.2023 - 18.01.2024
Timing
08.01.2024 - 30.04.2024
Number of ECTS credits allocated
5 cr
Mode of delivery
Contact teaching
Unit
Engineering and Business
Campus
Kupittaa Campus
Teaching languages
- English
Seats
10 - 50
Degree programmes
- Degree Programme in Information and Communication Technology
- Degree Programme in Business Information Technology
- Degree Programme in Information and Communications Technology
Teachers
- Tommi Tuomola
Teacher in charge
Tommi Tuomola
Groups
- PTIETS22deai PTIETS22 Data Engineering and Artificial Intelligence
- PTIVIS22I Data Engineering and AI
Materials
Teacher provided lecture material
Supporting public online material
Teacher provided virtual machines
All needed material (or at least a link to them) will be available in itslearning.
Teaching methods
Contact learning, practical exercises, independent study
International connections
Given examples and exercises support each topic studied during the lectures. Additional material in the form of tutorials and reliable information sources is provided.
Student workload
Contact hours 56 h
Independent study 79 h, including:
- Studying the course material
- Completing exercises
- Project
Content scheduling
- The basic idea of data engineering methods and pipelines
- Different components
- Integration of said components (MQ systems)
- Data engineering frameworks (Apache family)
- The goal of the course is to be able to build a data pipeline from start to finish
Further information
Itslearning and contact classes are the main communication channels used on this course.
The student is required to have a computer capable of running a simple Ubuntu virtual machine.
Evaluation scale
H-5
Assessment methods and criteria
Homework exercises returned throughout the course
Small project at the end of the course
Enrollment
02.12.2023 - 16.01.2024
Timing
01.01.2024 - 30.04.2024
Number of ECTS credits allocated
5 cr
Mode of delivery
Contact teaching
Unit
Engineering and Business
Campus
Kupittaa Campus
Teaching languages
- English
Seats
20 - 40
Degree programmes
- Degree Programme in Information and Communication Technology
- Degree Programme in Information and Communications Technology
Teachers
- Tommi Tuomola
Teacher in charge
Tommi Tuomola
Groups
- PTIVIS21H Health Technology
Materials
Teacher provided lecture material
Supporting public online material
Teacher provided virtual machines
All needed material (or at least a link to them) will be available in itslearning.
Teaching methods
Contact learning, practical exercises, independent study
International connections
Given examples and exercises support each topic studied during the lectures. Additional material in the form of tutorials and reliable information sources is provided.
Student workload
Contact hours 56 h
Independent study 79 h, including:
- Studying the course material
- Completing exercises
- Project
Content scheduling
- Introduction to data engineering
- The basic idea of data engineering methods and pipelines
- Different components
- Integration of said components (MQ systems)
- Data engineering frameworks (Apache family)
Further information
Itslearning and contact classes are the main communication channels used on this course.
The student is required to have a computer capable of running a simple Ubuntu virtual machine.
Evaluation scale
H-5
Assessment methods and criteria
Homework exercises returned throughout the course
Small project at the end of the course