Introduction to Data Engineering and AI Technologies (5 cr)
Code: MS00CN43-3003
General information
- Enrollment
- 02.05.2025 - 16.09.2025
- Registration for the implementation has begun.
- Timing
- 01.09.2025 - 21.12.2025
- The implementation has not yet started.
- Number of ECTS credits allocated
- 5 cr
- Local portion
- 5 cr
- Mode of delivery
- Contact learning
- Unit
- Engineering and Business
- Campus
- Kupittaa Campus
- Teaching languages
- English
- Seats
- 20 - 36
- Degree programmes
- Master of Engineering, Data Engineering and AI
- Master of Business Administration, Data Engineering and AI
- Teachers
- Golnaz Sahebi
- Course
- MS00CN43
Realization has 4 reservations. Total duration of reservations is 16 h 0 min.
Time | Topic | Location |
---|---|---|
Tue 16.09.2025 time 08:00 - 12:00 (4 h 0 min) | Introduction to Data Engineering and AI Technologies MS00CN43-3003 | EDU_3029 Lovisa muunto byod |
Tue 07.10.2025 time 08:00 - 12:00 (4 h 0 min) | Introduction to Data Engineering and AI Technologies MS00CN43-3003 | EDU_3029 Lovisa muunto byod |
Tue 04.11.2025 time 08:00 - 12:00 (4 h 0 min) | Introduction to Data Engineering and AI Technologies MS00CN43-3003 | EDU_3029 Lovisa muunto byod |
Tue 02.12.2025 time 08:00 - 12:00 (4 h 0 min) | Introduction to Data Engineering and AI Technologies MS00CN43-3003 | EDU_3029 Lovisa muunto byod |
Evaluation scale
H-5
Content scheduling
Course Overview
This project-based course provides a hands-on introduction to data analytics and machine learning, with an emphasis on self-directed learning and collaborative work. Projects can be completed in teams of two or three students. Throughout the course, students will engage with key machine learning concepts including data preprocessing, supervised and/or unsupervised learning, and model tuning. Learning is structured around a practical team project, developed progressively over four sessions.
Each session combines short teacher-led lectures with teamwork on project phases. Starting from the second session, students will present their project progress to encourage peer feedback and shared learning. The course promotes independent exploration, supported by focused guidance, to help students develop both theoretical understanding and applied skills in real-world machine learning tasks.
Course Outline and Schedule (Preliminary Plan)
Session 1: Introduction to the Course, Data, and Preprocessing
• Overview of course structure and project requirements
• Kick-off of project phase 1: Data preprocessing and visualization (Assignment 1)
Session 2: Machine Learning I
• Student presentations on Assignment 1
• Start of project phase 2: Supervised and/or unsupervised learning (Assignment 2)
Session 3: Machine Learning II
• Student presentations on Assignment 2
• Project phase 3: Tuning and optimizing ML algorithms (Assignment 3)
Session 4: Final Presentations and Discussion
• Student presentations on Assignment 3
• Group discussion covering the full project cycle and reflections
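The three project phases in this outline (preprocessing, supervised learning, tuning) can be sketched in a few lines of scikit-learn, the library covered by the recommended course book. This is only an illustrative sketch, not the course's required workflow: the Iris dataset, the logistic regression model, and the parameter grid are assumptions chosen for brevity, and the visualization part of phase 1 is left out.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in dataset (assumption): in the course, each team works with its own data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Phase 1 (preprocessing) and phase 2 (supervised learning) combined in one pipeline.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Phase 3 (tuning): cross-validated grid search over the regularization strength C.
search = GridSearchCV(pipeline, param_grid={"model__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Evaluate the tuned model on data it has not seen during training or tuning.
test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Keeping the scaler inside the pipeline means it is refitted on each cross-validation fold, so no information from the validation folds leaks into preprocessing.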
Objective
After completing the course, the student can:
- describe basic concepts and processes related to Data Engineering and AI
Content
- Data Engineering process
- Basics of AI
- Fields and evolution of AI
- Big data
- Basics of Machine Learning
Materials
Recommended course book:
Aurélien Géron. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd edition. O'Reilly Media (October 15, 2019).
The course book can be read in electronic form from our institution's eBook Central database.
Additionally, some other materials will be available via the learning environment (ITS).
Teaching methods
- Short lectures are delivered by the teacher (theory and practice)
- Self-study tasks (theory and practice)
- Practical Teamwork Project (including three assignments)
Exam schedules
No exam
Final Project
Pedagogic approaches and sustainable development
- The course consists of four theory–practice sessions, each combining concise lectures with hands-on exercises that support project development.
- A central pedagogic strategy is project-based learning, encouraging students to apply theoretical knowledge through real-world data analysis and machine learning tasks.
- Team-based projects foster collaboration, peer learning, and problem-solving, helping students develop key competencies such as critical thinking, communication, and responsibility.
- The course promotes sustainable development by emphasizing ethical data use, reproducible analysis practices, and the long-term value of collaborative, self-directed learning.
Completion alternatives
Practical work and exercises are mainly carried out using Python, Python libraries for machine learning and data analytics, and Jupyter Notebook.
Student workload
Contact hours:
- 4 sessions of 3 h theory and practice: 4 x 3 h = 12 hours
Assignments for the final project: approximately 118 hours
Total: approximately 130 hours
Evaluation methods and criteria
The course is graded on a scale of 0-5.
- To pass the course, the student must receive an acceptable mark for each of the three assignments.
- The three project phases (assignments 1-3) are worth at most 40, 30, and 30 points respectively, for a maximum of 100 points in total.
- Passive group members will not earn assignment points.
Failed (0)
Less than 50% of the total points of the assignments (not passed)
Assessment criteria, satisfactory (1-2)
1: 50% - 59% of the total points of the assignments
2: 60% - 69% of the total points of the assignments
Assessment criteria, good (3-4)
3: 70% - 79% of the total points of the assignments
4: 80% - 89% of the total points of the assignments
Assessment criteria, excellent (5)
5: 90% - 100% of the total points of the assignments
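The mapping from assignment points to the final grade can be sketched as a small Python function. The name `course_grade` is hypothetical. The sketch assumes that a fractional total falling between two bands (for example 59.5) counts toward the lower band, which the criteria above do not specify. It also does not model the separate requirement of an acceptable mark in every assignment, since no per-assignment threshold is given.

```python
# Maximum points per project phase (assignments 1-3), as stated in the criteria.
MAX_POINTS = (40, 30, 30)

# Lower bound of each grade band in total points (equal to percent, since the maximum is 100).
GRADE_BANDS = [(90, 5), (80, 4), (70, 3), (60, 2), (50, 1)]


def course_grade(points):
    """Map the points earned in assignments 1-3 to a grade on the 0-5 scale."""
    if len(points) != len(MAX_POINTS):
        raise ValueError("expected one score per assignment")
    for earned, maximum in zip(points, MAX_POINTS):
        if not 0 <= earned <= maximum:
            raise ValueError(f"score {earned} is outside the range 0-{maximum}")

    total = sum(points)
    # Assumption: a total between two bands falls into the lower band.
    for minimum, grade in GRADE_BANDS:
        if total >= minimum:
            return grade
    return 0  # below 50 points: failed


print(course_grade((30, 25, 20)))  # 75 points in total
```

For example, 30 + 25 + 20 = 75 points falls in the 70% - 79% band, which corresponds to grade 3.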
Further information
Qualifications:
Before taking this course, students need a foundational understanding of the following areas:
1. Python Programming: Proficiency in basic Python syntax and programming constructs, and an understanding of Object-Oriented Programming (OOP) concepts.
2. Basic Linear Algebra: Understanding of vectors, matrices, and basic operations on them.
3. Statistics and Probability: Knowledge of descriptive statistics (mean, median, mode, variance, etc.), and familiarity with probability distributions.
Recommended: Data Management: Experience with data manipulation libraries such as Pandas for handling datasets. Data manipulation involves transforming data, cleaning it, organizing it, and preparing it for analysis.
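As a rough self-check of the prerequisite level, the short sketch below uses only the Python standard library to compute the descriptive statistics named above and a matrix-vector product. The sample values and the helper name `mat_vec` are made up for illustration. In the course itself, libraries such as NumPy and Pandas would typically be used instead.

```python
import statistics

# Descriptive statistics on a small, made-up sample.
sample = [2, 4, 4, 5, 7, 8]
summary = {
    "mean": statistics.mean(sample),
    "median": statistics.median(sample),
    "mode": statistics.mode(sample),
    "variance": statistics.variance(sample),  # sample variance (divides by n - 1)
}


def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


product = mat_vec([[1, 2], [3, 4]], [5, 6])

print(summary)
print(product)
```

Students who can predict the printed values by hand before running the code are likely to be comfortable with the statistics and linear algebra used in the course.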