Introduction to Data Engineering (5 cr)

Code: TT00CN68

Credits

5 cr

Objective

After completing the course the student is able to:
Understand and describe the data engineering process life cycle

Content

What is Data Engineering
Data Storage and Retrieval
Data Engineering Lifecycle
Extract, Transform and Load (ETL) process
Introduction to Big Data Frameworks

Enrollment

01.06.2023 - 14.09.2023

Timing

04.09.2023 - 15.12.2023

Number of ECTS credits allocated

5 cr

Mode of delivery

Contact teaching

Unit

Engineering and Business

Campus

Kupittaa Campus

Teaching languages
  • English
Seats

25 - 35

Teachers
  • Golnaz Sahebi
Groups
  • PTIETS22deai
    PTIETS22 Data Engineering and Artificial Intelligence
  • PTIVIS22I
    Data Engineering and AI

Materials

Material will be available via the learning environment (ITS).

Teaching methods

Weekly contact sessions of 3-4 hours covering theory and practical exercises.
Additionally, there are homework exercises.

International connections

The course includes approximately 11 theory sessions and guided exercise sessions in which students work on practical tasks.
Additionally, there are homework exercises, some of which will be demonstrated during the contact sessions.

Student workload

Contact hours
- Theory and practice sessions: 10 x 3.5 h = 35 hours
- Final projects and presentations: 25 hours

Homework: approximately 70 hours

Total: approximately 130 hours

Content scheduling

Course Topics and Scheduling (pre-planning):
Week 36: Course Overview and Introduction to Data Engineering
Week 37 - 38: The Data Engineering Ecosystem
Week 39: Big Data Platforms
Week 40: Exercise Demo (I)
Week 41: Apache Airflow
Week 43: Data Engineering Lifecycle - Data Wrangling
Week 44: Data Engineering Lifecycle - Data Wrangling and ETL in Airflow
Week 45: Data Engineering Lifecycle - Governance and Compliance
Week 46 - 47: Exercise demo and independent group work on the final projects
Week 48: Final Project presentations

Further information

ITS.

Evaluation scale

H-5

Assessment methods and criteria

The course is graded on a scale of 0-5.
You can earn a maximum of 60 points from six practical exercises (classroom and homework exercises) and a maximum of 40 points from the final project.
To pass the course, you need at least 30 points from the exercises and at least 20 points from the final project.

Assessment criteria, fail (0)

Fewer than 50 points in total from the exercises and the final project, or the final project not passed.

Assessment criteria, satisfactory (1-2)

50 - 69 points from the total points of the exercises and the final project

Assessment criteria, good (3-4)

70 - 89 points from the total points of the exercises and the final project

Assessment criteria, excellent (5)

90 - 100 points from the total points of the exercises and the final project
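
The point thresholds above can be sketched as a small function. This is only an illustrative sketch, not an official grading tool: the name `grade_band` is hypothetical, points are assumed to be whole numbers, and it assumes that missing either minimum (30 exercise points or 20 project points) fails the course regardless of the total. The criteria do not say how a grade is chosen inside the 1-2 and 3-4 bands, so the function returns the band rather than a single grade.

```python
def grade_band(exercise_points: int, project_points: int) -> str:
    """Map points to the grade band described in the assessment criteria."""
    if not 0 <= exercise_points <= 60:
        raise ValueError("exercise points must be between 0 and 60")
    if not 0 <= project_points <= 40:
        raise ValueError("project points must be between 0 and 40")

    # Assumption: missing either minimum fails the course, whatever the total.
    if exercise_points < 30 or project_points < 20:
        return "fail (0)"

    # Both minimums met, so the total is at least 50 here.
    total = exercise_points + project_points
    if total >= 90:
        return "excellent (5)"
    if total >= 70:
        return "good (3-4)"
    return "satisfactory (1-2)"


if __name__ == "__main__":
    # 45 exercise points + 30 project points = 75 points in total.
    print(grade_band(45, 30))
```

For example, 45 exercise points and 30 project points give 75 points, which falls in the good (3-4) band, while 60 exercise points with only 10 project points fails under the stated assumption even though the total is 70.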