ADA511 Data Science and data-driven engineering
Course description for academic year 2023/2024
Contents and structure
Data science plays an increasingly important role in the engineering and study of complex systems and processes, and it contributes to the growing focus on software engineering in traditional engineering disciplines and in industry. The course emphasizes the role of data science across multiple engineering disciplines and for a variety of tasks, such as predictive monitoring and maintenance, advanced manufacturing, and optimization of production processes, and explains what it means for predictions and solutions to be "data-driven".
The course will introduce the basics of inference and forecasting, probability theory, decision-making, and information theory, focusing on the role that data plays in them. These basics will clarify how particular characteristics of the data can lead to success in solving an engineering task, or to failure if the data are misused. The course will foster familiarity with these characteristics and explain how they can be assessed by a variety of methods, from visual to analytical. Some subtle, common pitfalls in such assessments will be analysed in detail, with concrete examples from different engineering fields and possibly through guest lectures by specialists.
The general method behind data-driven inference and optimization will be explained first. Several specific models, especially from machine learning (for instance deep networks, random forests, and support-vector machines), will then be introduced as approximations of the general method that are valid only when specific conditions apply. One of the main goals of the course is for students to understand the assumptions underlying widely used models, to assess whether a model is suited to the available data and the task of interest, and to choose the optimal model. This goal will be achieved through concrete and realistic hands-on examples and assignments.
An overview will also be given of how possible deficiencies in the data, such as biases, can be corrected, and of more advanced inference tasks, such as those involving time series.
The course will use a variety of programming languages, from Python and Matlab to R, and their ecosystems of packages suited to different data and applications.
• can explain the role of data science in the study of complex systems and processes from science and engineering
• knows basic concepts and techniques from data science, including data exploration and visualization, probability theory, and decision theory
• can give concrete examples of extract, transform, and load (ETL) pipelines relevant to engineering use cases
• understands which criteria are important to suitably couple data, models, and goals
• has knowledge of various sources of data in engineering systems
The student can...
• analyze and assess the applicability of data-science sources and techniques in concrete engineering applications
• apply data science techniques to study systems and processes arising in engineering
• design data-driven projects within engineering
• effectively communicate results obtained using data science and data-driven modelling to specialized and general audiences
The student can...
• work efficiently together with others in addressing data-science problems
• present problems, analyses, and conclusions based on data analysis and logical arguments to peers, both in writing and orally
Lectures, case studies, group presentations and group supervision related to course project work, guest lectures from specialists.
Compulsory learning activities
One obligatory assignment. In order to take the examination, the assignment must be approved.
1. Group project work with final report, counts for 50% of the final grade.
2. Oral examination based on the final report, counts for 50% of the final grade.
Both parts must be passed to obtain a final grade.
Grade scale A-F, where F is fail.
Examination support material
All support materials are permitted.