Introduction to Data Engineering (5cr)
Code: TT00CN68-3005
General information
- Enrollment
- 15.05.2025 - 12.09.2025
- Registration for the implementation has ended.
- Timing
- 12.09.2025 - 21.12.2025
- Implementation is running.
- Number of ECTS credits allocated
- 5 cr
- Local portion
- 5 cr
- Mode of delivery
- Contact learning
- Unit
- Engineering and Business
- Campus
- Kupittaa Campus
- Teaching languages
- English
- Seats
- 25 - 70
- Teachers
- Golnaz Sahebi
- Tommi Tuomola
- Groups
- DEAI24A Data Engineering and Artificial Intelligence
- DEAI24B Data Engineering and Artificial Intelligence
- Course
- TT00CN68
Realization has 15 reservations. Total duration of reservations is 39 h 0 min.
Time | Topic | Location |
---|---|---|
Wed 10.09.2025 time 12:00 - 14:00 (2 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | ICT_B1026_Gamma (GAMMA) |
Tue 16.09.2025 time 12:00 - 15:00 (3 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | LEM_A173_Lemminkäinen (Lemminkäinen) |
Tue 23.09.2025 time 12:00 - 15:00 (3 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | LEM_A173_Lemminkäinen (Lemminkäinen) |
Tue 30.09.2025 time 12:00 - 15:00 (3 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | LEM_A173_Lemminkäinen (Lemminkäinen) |
Wed 08.10.2025 time 17:00 - 18:00 (1 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | Online |
Fri 10.10.2025 time 08:00 - 11:00 (3 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | |
Tue 21.10.2025 time 12:00 - 15:00 (3 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | |
Tue 28.10.2025 time 12:00 - 15:00 (3 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | |
Tue 04.11.2025 time 12:00 - 15:00 (3 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | ICT_C1042_Myy (MYY) |
Tue 04.11.2025 time 17:00 - 18:00 (1 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | Online |
Fri 14.11.2025 time 10:00 - 12:00 (2 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | ICT_B1026_Gamma (GAMMA) |
Fri 21.11.2025 time 12:00 - 15:00 (3 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | ICT_C1035_Delta (DELTA) |
Fri 28.11.2025 time 12:00 - 15:00 (3 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | ICT_C1042_Myy (MYY) |
Tue 02.12.2025 time 12:00 - 15:00 (3 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | ICT_B1041_Omega (OMEGA) |
Fri 05.12.2025 time 13:00 - 16:00 (3 h 0 min) | Introduction to Data Engineering TT00CN68-3005 | ICT_B1041_Omega (OMEGA) |
Evaluation scale
H-5
Content scheduling
Course Overview
This course provides an introduction to data engineering, combining theoretical concepts with practical applications. The course is divided into two main parts, each with a distinct focus:
- Part I: Theories and Practice
• Instructor-Led Sessions: Covering general topics in data engineering, taught and supervised by the instructors.
• Self-study tasks
- Part II: Optional AWS Academy Self-Paced Course
• Self-Paced Learning: Students have the option to independently complete the AWS Academy Data Engineering course, gaining in-depth knowledge and earning a certification. This can replace the requirement to complete standard homework assignments.
Student Responsibilities
1. Class Participation and Assignments:
• Active participation in classes, including the completion of in-class assignments.
2. Homework Assignments (students choose one of the following options):
• Option A: Complete the individual homework exercises, which are partially demonstrated during contact sessions.
• Option B: Complete the full AWS Academy Data Engineering course as a substitute for the homework assignments. To do this, students must follow the weekly schedule and upload their AWS Academy course certificate to the itslearning platform.
3. Final Project:
• A group project (3-4 students) to be completed over Weeks 47 and 48, culminating in a presentation in Week 49.
________________________________________
Additional Notes
• Flexibility: The option to replace homework with the AWS Academy course allows students to tailor their learning experience to their interests and career goals.
• Project Work: The group project encourages collaboration and the practical application of the skills learned throughout the course.
Objective
After completing the course the student is able to:
Understand and describe the data engineering process life cycle
Content
What is Data Engineering
Data Storage and Retrieval
Data Engineering Lifecycle
Extract, Transform and Load (ETL) process
Introduction to Big Data Frameworks
Materials
- The learning materials including slides and exercises will be prepared by the lecturer from various sources such as online courses and articles, books, videos, etc. The material will be introduced during the lectures and will be available via the learning environment (ITS).
- AWS Academy Data Engineering [91081] Course Materials
Teaching methods
- Participating in lectures (theory and practice)
- Learning through hands-on programming (classwork assignments)
- Completing homework assignments or AWS Academy Course
- Interacting with the teachers and classmates
- Enhancing knowledge through teamwork projects
Exam schedules
No exam
There is a final teamwork project where students must demonstrate their work during a presentation event in week 49.
Pedagogic approaches and sustainable development
- The course includes approximately 12 theory and practice sessions, where students engage with practical tasks.
- Additionally, there are 4 online Q&A sessions to provide extra support.
- Homework exercises will be assigned, with some parts demonstrated during contact sessions.
- Integration of Cloud-based data engineering through the AWS Academy course.
- A teamwork project, requiring students to apply their teamwork skills and the knowledge gained from the course to implement their final project.
Completion alternatives
Practical work and exercises are mainly carried out using VS Code, Jupyter Notebook, Apache Airflow, and AWS services.
Student workload
- Contact teaching:
• 12 weekly theory and practice sessions, each lasting 3 hours (36 hours).
• 4 additional online Q&A sessions, each lasting 1 hour (4 hours).
• Total contact teaching hours per course: 40 hours.
- Homework and teamwork assignment:
• Personal assignments (homework) and independent studies: 75 hours
• Teamwork assignment: 20 hours
Total: approximately 135 hours (5 x 27h)
Evaluation methods and criteria
Course Evaluation and Grading Criteria
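Your final grade in this course is determined by your performance across four weighted areas, which are grouped into three main components (Participation, Individual Assignments, and Team Project). To pass, you must meet a minimum performance standard in each main component.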
The overall grade is calculated based on the following weighted components:
- Lesson Participation: 20%
- Classwork Exercises: 20%
- Homework Exercises: 40%
- Team Project: 20%
1. Lesson Participation (20% of final grade)
Your percentage for this component is awarded based on attendance in live lectures.
- Full Credit (20%): Attendance in more than 70% of lectures (maximum of 3 absences).
- Half Credit (10%): Attendance in 50% – 70% of lectures.
- No Credit (0%): Attendance in less than 50% of lectures.
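The attendance tiers above can be sketched as a small function. This is an illustrative sketch only, not an official grading tool: the function name `participation_credit` is hypothetical, and it assumes that exactly 70% attendance falls into the half-credit tier, since full credit requires "more than 70%".

```python
def participation_credit(attended: int, total_sessions: int) -> int:
    """Return the participation share of the final grade, in percentage points.

    Assumption: exactly 70% attendance earns half credit, because full
    credit is stated as "more than 70%".
    """
    if total_sessions <= 0:
        raise ValueError("total_sessions must be positive")
    if not 0 <= attended <= total_sessions:
        raise ValueError("attended must be between 0 and total_sessions")

    # Integer comparisons avoid floating-point rounding at the tier boundaries.
    if attended * 10 > total_sessions * 7:   # more than 70% attendance
        return 20
    if attended * 2 >= total_sessions:       # 50% up to and including 70%
        return 10
    return 0                                 # less than 50%


if __name__ == "__main__":
    # With 12 sessions, 3 absences still earn full credit; 4 absences do not.
    for absences in (3, 4, 6, 7):
        print(absences, "absences ->", participation_credit(12 - absences, 12), "%")
```

With 12 sessions, 9 attended sessions (75%) earn full credit, which is consistent with the stated maximum of 3 absences.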
2. Individual Exercises (60% of final grade)
This component is divided into Classwork and Homework:
2.1. Classwork Exercises (20% of final grade)
These are weekly exercises completed during class time.
- Full Credit: Exercise is submitted on time.
- Half Credit: Exercise is submitted after the deadline.
2.2. Homework Exercises (40% of final grade)
For the homework component, you must choose one of the following two options.
(Note: The AWS Academy course is a complete replacement for all weekly homework assignments.)
Option A: Weekly Homework Assignments
These are weekly exercises to be completed outside of class.
- Full Credit: Assignment is submitted on time and you attend the demonstration session.
- Half Credit: Assignment is submitted after the deadline OR you do not attend the demonstration session.
Option B: AWS Academy Data Engineering Course
This option replaces all weekly homework assignments. You must enroll in and complete the required modules of the course, submitting your certificate of completion by the specified deadline.
- Full Credit (earns the full 40%): You complete 80% - 100% of the AWS course modules.
- Half Credit (earns 20%): You complete 50% - 79% of the AWS course modules.
- No Credit (earns 0%): You complete less than 50% of the AWS course modules.
Note: the required modules will be announced by the teachers during the first lectures.
3. Team Project (20% of final grade)
This is a final, team-based project that must be completed and presented to the instructors by the last week of the course (week 49). A detailed project description and grading rubric will be provided separately.
Passing Criteria and Final Grade
Passing Requirements:
To be eligible to pass the course, you must earn at least 50% of the available credit in each of the three main components separately:
- Participation
- Individual Assignments (Classwork and Homework/AWS scores).
- Group Assignment
Important: If you fail to achieve the 50% minimum in any one of these components, you will receive a failing grade (0) for the course, regardless of your total final percentage.
Final Grades:
If you have met the minimum passing requirements in all three components, your final grade (from 1 to 5) is determined by your total combined percentage.
Grade 5: 90% - 100% total
Grade 4: 80% - 89% total
Grade 3: 70% - 79% total
Grade 2: 60% - 69% total
Grade 1: 50% - 59% total
Grade 0 (Fail): Did not meet the minimum 50% requirement in at least one component.
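As a rough illustration of how the weights, the per-component minimums, and the grade bands combine, the rules above can be sketched as follows. This is an unofficial sketch under stated assumptions: the names `WEIGHTS` and `final_grade` are hypothetical, each argument is the fraction (0.0 to 1.0) of credit earned in that area, the Individual Assignments minimum is checked on the combined classwork and homework credit, and a fractional total such as 89.5% is assumed to fall into the lower band.

```python
# Weights in percentage points, taken from the component list above.
WEIGHTS = {"participation": 20, "classwork": 20, "homework": 40, "project": 20}


def final_grade(participation: float, classwork: float,
                homework: float, project: float) -> int:
    """Return the course grade (0-5) from the fraction of credit earned per area."""
    shares = {"participation": participation, "classwork": classwork,
              "homework": homework, "project": project}
    for name, share in shares.items():
        if not 0.0 <= share <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")

    points = {name: WEIGHTS[name] * share for name, share in shares.items()}

    # Assumption: classwork and homework form one "Individual Assignments"
    # component, and the 50% minimum applies to their combined credit.
    individual = points["classwork"] + points["homework"]
    individual_max = WEIGHTS["classwork"] + WEIGHTS["homework"]

    passed_minimums = (
        points["participation"] >= 0.5 * WEIGHTS["participation"]
        and individual >= 0.5 * individual_max
        and points["project"] >= 0.5 * WEIGHTS["project"]
    )
    if not passed_minimums:
        return 0  # fails regardless of the total percentage

    total = sum(points.values())
    # Assumption: a fractional total falls into the lower band.
    for grade, lower_bound in ((5, 90), (4, 80), (3, 70), (2, 60), (1, 50)):
        if total >= lower_bound:
            return grade
    return 0


if __name__ == "__main__":
    # Full participation, classwork and project, half credit for homework.
    print(final_grade(1.0, 1.0, 0.5, 1.0))
```

Note that a student with a high total can still receive grade 0: for example, full credit everywhere except 40% of the team project credit fails the project minimum.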
Failed (0)
Less than 50% in one or more of the components.
Assessment criteria, satisfactory (1-2)
1: 50% - 59% of the total points of the components
2: 60% - 69% of the total points of the components
Assessment criteria, good (3-4)
3: 70% - 79% of the total points of the components
4: 80% - 89% of the total points of the components
Assessment criteria, excellent (5)
5: 90% - 100% of the total points of the components
Further information
Use of AI in assignments and final project: USE OF AI REPORTED.
AI can be used in the creation of outputs, but students must clearly report its use. Failure to disclose the use of AI will be interpreted as fraud. The use of AI may affect the assessment.
-----------------------------------------------------------------------------------
Qualifications and Prerequisites:
Before taking an "Introduction to Data Engineering with Python" course, students typically need a foundational understanding of several key areas. Here are the mandatory and recommended prerequisite courses and topics.
1. Mandatory Prerequisites:
1.1. Programming:
1.1.1. Introduction to Programming: Knowledge of programming fundamentals, including concepts like variables, loops, conditionals, and functions.
1.1.2. Python Programming: Familiarity with Python, including basic syntax, data types, control structures, functions, and modules.
1.1.3. Error Handling
1.1.4. Object-oriented programming (OOP)
1.1.5. Data Manipulation: Skills in using the pandas library, including DataFrames and Series, and in reading, writing, filtering, and transforming data.
1.2. Databases: Knowledge of how databases work, including concepts like tables, keys, normalization, and indexing.
2. Recommended Topics:
2.1. Algorithms and Data Structures: Basic understanding of algorithms and data structures such as arrays, lists, trees, and graphs, which are crucial for data processing.
2.2. Cloud Services: Fundamental knowledge of cloud services, or completion of the Cloud Services course at TUAS (Lecturer: Ali Khan).
2.3. Version Control Systems: Basic understanding of tools like Git for version control.
2.4. Basic Algebra and Calculus: Fundamental math skills to handle data transformations and calculations.
2.5. Statistics: Understanding of basic statistical concepts like mean, median, standard deviation, and probability distributions.
2.6. Virtualization and Linux: Familiarity with VirtualBox and Ubuntu.