
Introduction to Data Engineering (5 ECTS)

Implementation code: TT00CN68-3006

Basic information about the implementation


Enrollment period
15.05.2025 - 11.09.2025
Enrollment for this implementation is open.
Timing
11.09.2025 - 21.12.2025
The implementation has not started yet.
Credits
5 ECTS
Contact teaching portion
5 ECTS
Mode of delivery
Contact teaching
Unit
Engineering and Business
Campus
Kupittaa Campus
Teaching languages
English
Seats
20 - 45
Teachers
Golnaz Sahebi
Groups
PTIVIS23H
Health Technology
Course unit
TT00CN68
No reservations were found for implementation TT00CN68-3006!

Grading scale

H-5 (H = hylätty, fail)

Content scheduling

Course Overview
This course provides an introduction to data engineering, combining theoretical concepts with practical applications. The course is divided into two main parts, each with a distinct focus:

- Part I: Theories and Practice
• Instructor-Led Sessions: Covering general topics in data engineering, taught and supervised by the instructors.
• Self-study tasks

- Part II: Optional AWS Academy Self-Paced Course
• Self-Paced Learning: Students have the option to independently complete the AWS Academy Data Engineering course, gaining in-depth knowledge and earning a certification. This can replace the requirement to complete standard homework assignments.


Student Responsibilities
1. Class Participation and Assignments:
• Active participation in all classes, including the completion of in-class assignments.
2. Homework Assignments: (students can choose one of the following options)
• Option A: Complete the individual homework exercises, partially demonstrated during contact sessions.
• Option B: Complete the full AWS Academy Data Engineering course as a substitute for the homework assignments. To do this, students must follow the weekly schedule and upload their AWS Academy course certificate to the Itslearning platform.
3. Final Project:
• A group project (3-4 students) to be completed over Weeks 47 and 48, culminating in a presentation in Week 49.
________________________________________
Additional Notes
• Flexibility: The option to replace homework with the AWS Academy course allows students to tailor their learning experience to their interests and career goals.

• Project Work: The group project encourages collaboration and the practical application of the skills learned throughout the course.

Objectives

After completing the course, the student is able to:
- Understand and describe the data engineering process life cycle

Content

What is Data Engineering
Data Storage and Retrieval
Data Engineering Lifecycle
Extract, Transform and Load (ETL) process
Introduction to Big Data Frameworks

Learning materials

- The learning materials, including slides and exercises, are prepared by the lecturer from various sources such as online courses, articles, books, and videos. The material is introduced during the lectures and made available via the learning environment (Itslearning).

- AWS Academy Data Engineering [91081] Course Materials

Teaching methods

- Participating in lectures (theory and practice)
- Learning through hands-on programming (classwork assignments)
- Completing homework assignments or AWS Academy Course
- Interacting with the teachers and classmates
- Enhancing knowledge through teamwork projects

Exam dates and retake opportunities

No exam

There is a final teamwork project where students must demonstrate their work during a presentation event in week 49.

Pedagogical approaches and sustainable development

- The course includes approximately 12 theory and practice sessions, where students engage with practical tasks.
- Additionally, there are 4 online Q&A sessions to provide extra support.
- Homework exercises will be assigned, with some parts demonstrated during contact sessions.
- Integration of Cloud-based data engineering through the AWS Academy course.
- A teamwork project, requiring students to apply their teamwork skills and the knowledge gained from the course to implement their final project.

Sustainability is integrated into the implementation topics.

Alternative completion methods

Practical work and exercises are mainly carried out using VS Code, Jupyter Notebook, Apache Airflow, and AWS services.

Student time use and workload

- Contact teaching:
• 12 weekly theory and practice sessions, each lasting 3 hours (36 hours)
• 4 online Q&A sessions, each lasting 1 hour (4 hours)
• Total contact teaching: 40 hours

- Homework and teamwork assignment:
• Personal assignments (homework) and independent study: 75 hours
• Teamwork assignment: 20 hours

Total: approximately 135 hours (5 x 27 h)
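
The workload figures above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and the 27 hours per credit is taken from the "5 x 27h" note rather than from any separately stated rule.

```python
# Workload figures as stated in the course description (all in hours).
theory_hours = 12 * 3        # 12 weekly theory and practice sessions, 3 hours each
qa_hours = 4 * 1             # 4 online Q&A sessions, 1 hour each
contact_hours = theory_hours + qa_hours

independent_hours = 75       # homework and independent study
teamwork_hours = 20          # group project

total_hours = contact_hours + independent_hours + teamwork_hours

# Assumption: 27 hours of work per credit, as implied by the "5 x 27h" note.
hours_per_credit = 27
ects_credits = total_hours / hours_per_credit

print(f"contact: {contact_hours} h, total: {total_hours} h, credits: {ects_credits}")
```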

Assessment methods and criteria

The course is graded on a scale from 0 to 5, based on the total points accumulated:

1. Lesson Participation (Approx. 20%)
- Full points: Attendance in more than 70% of lectures.
- Half points: Attendance in 50–70% of lectures.
- No points: Attendance in less than 50% of lectures.

2. Weekly Exercises (Approx. 60%)
- Includes classwork, homework, or AWS Academy Labs (as a substitute for homework):
- Full points: Submitted on time and demonstrated in the demonstration session.
- Half points: Submitted after the deadline, or not demonstrated in the demonstration session.

Note: Demonstrating homework exercises during contact sessions is mandatory. Failure to do so results in a 50% deduction of the respective exercise's points.

3. Team Project (Approx. 20%)
- A final team-based project to be completed by the end of the course.

Passing Criteria
To pass the course, students must earn at least 50% of the possible points in each of the following components:
- Lesson participation
- Exercises
- Final project

Grades are determined by the total points the student collects during the course, including the final project:
1: 50-59% (minimum to pass the course)
2: 60-69%
3: 70-79%
4: 80-89%
5: 90-100%
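
The grading rules above can be sketched as a small calculation. This is an illustration under stated assumptions, not the instructors' official formula: the "approximate" weights are treated as exactly 20/60/20, totals that fall between two bands (for example 69.5%) are assigned to the lower grade, and the function names are hypothetical.

```python
# Assumption: the approximate weights are treated as exact percentages.
WEIGHTS = {"participation": 20, "exercises": 60, "project": 20}


def participation_score(attendance_pct: float) -> float:
    """Map lecture attendance (0-100) to a participation score (0-100)."""
    if attendance_pct > 70:
        return 100.0   # full points: more than 70% of lectures attended
    if attendance_pct >= 50:
        return 50.0    # half points: 50-70% of lectures attended
    return 0.0         # no points: less than 50% of lectures attended


def course_grade(participation: float, exercises: float, project: float) -> int:
    """Return the grade (0-5) from per-component scores given as 0-100."""
    scores = {
        "participation": participation,
        "exercises": exercises,
        "project": project,
    }
    # Passing requires at least 50% of the points in every component.
    if any(score < 50 for score in scores.values()):
        return 0
    total = sum(WEIGHTS[name] * scores[name] for name in scores) / 100
    # Assumption: fractional totals fall into the lower band.
    for grade, lower_bound in ((5, 90), (4, 80), (3, 70), (2, 60), (1, 50)):
        if total >= lower_bound:
            return grade
    return 0


# Example: 80% attendance, 75% of exercise points, 90% of project points.
print(course_grade(participation_score(80), 75, 90))
```

In this sketch a student with a high total still fails if any single component is below 50%, which mirrors the passing criteria listed above.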

Fail (0)

Less than 50% of the total points of the assignments.

Assessment criteria, satisfactory (1-2)

1: 50-59% of the total points of the assignments

2: 60-69% of the total points of the assignments

Assessment criteria, good (3-4)

3: 70-79% of the total points of the assignments

4: 80-89% of the total points of the assignments

Assessment criteria, excellent (5)

5: 90-100% of the total points of the assignments

Additional information

Use of AI in assignments and final project: USE OF AI REPORTED.
AI can be used in the creation of outputs, but students must clearly report its use. Failure to disclose the use of AI will be interpreted as fraud. The use of AI may affect the assessment.

-----------------------------------------------------------------------------------
Qualifications and Prerequisites:
Before taking an "Introduction to Data Engineering with Python" course, students typically need a foundational understanding of several key areas. Here are the mandatory and recommended prerequisite courses and topics.

1. Mandatory Prerequisites:
1.1. Programming:
1.1.1. Introduction to Programming: Knowledge of programming fundamentals, including concepts like variables, loops, conditionals, and functions.
1.1.2. Python Programming: Familiarity with Python, including basic syntax, data types, control structures, functions, and modules.
1.1.3. Error Handling
1.1.4. Object-oriented programming (OOP)
1.1.5. Data Manipulation: Skills in using the Pandas library, including DataFrames and Series, and reading, writing, filtering, and transforming data.
1.2. Databases: Knowledge of how databases work, including concepts like tables, keys, normalization, and indexing.


2. Recommended Topics:
2.1. Algorithms and Data Structures: Basic understanding of algorithms and data structures such as arrays, lists, trees, and graphs, which are crucial for data processing.
2.2. Cloud Services: Fundamental knowledge of cloud services, or having passed the Cloud Services course at TUAS (Lecturer: Ali Khan).
2.3. Version Control Systems: Basic understanding of tools like Git for version control.
2.4. Basic Algebra and Calculus: Fundamental math skills to handle data transformations and calculations.
2.5. Statistics: Understanding of basic statistical concepts like mean, median, standard deviation, and probability distributions.
2.6. Tools: Familiarity with VirtualBox and Ubuntu.
