Trefoil Academy
Knowledge Graphs Track
Knowledge graphs for AI and applications to Data Management
Dr. Hicham Zmarrou: Head of Trefoil Academy
Cecilie Martinsen: Head of Governance and Data Quality Management
Jesper Doucet: Head of Data Architecture and Data Life Cycle
Semantic Web technologies and Knowledge Graphs form the basis of the current development of the Web. At the same time, they have gained significant attention in industry for use cases that require exploiting and exchanging large-scale, heterogeneous collections of data. For example, adding semantic understanding to tabular data is valuable for data integration, data cleaning, data mining, machine learning and knowledge discovery tasks.
The objective of this track is to train the new generation of data scientists with the necessary skills in semantic technologies to meet today’s demands from industry, where data (and knowledge) scientists are expected to semantically orchestrate diverse types of data sources.
This module offers a practical combination of web technology, data management technology, knowledge representation and artificial intelligence. More specifically, we aim to cover the topics below.
Who Is This Training For?
We developed this training specifically for Senior Data Scientists, Data Engineers, ML Engineers and DevOps Engineers: operational people who directly face the task of scaling machine learning (ML) applications. Because MLOps is a new field, we designed this course as a guide to building a successful MLOps technical environment, taking into account the organizational and governance challenges involved.
The MLOps Track gives you an efficient introduction to, and deep hands-on experience with, the design and operationalization of Machine Learning systems. You will work through many common topics, such as:
- The main principles of MLOps; advanced Python for operationalization, clean code and software design; reminders on storage and architecture choices; the major pitfalls to avoid when scaling up.
- Getting acquainted with the methodologies for operationalizing a data project and the principles of DevOps in Data Science, and taking control of the tools necessary for the deployment of an ML system.
- Managing the life cycle of models using the MLflow framework, and applying data drift detection techniques for post-production monitoring.
- Understanding governance as a set of controls that ensure the business delivers on its responsibilities to all stakeholders, from shareholders and employees to the public and national governments. These responsibilities are financial, legal, and ethical, and all are underpinned by the desire for fairness.
The MLOps Track
- Part time 1 day / week, 13 consecutive weeks and 150 hours
- Four training blocks
- 150 hours course combining different theoretical, practical and project formats
- Insights from industry experts
- An acc
- Certificate of participation from Trefoil Academy
- Discounts for groups and for Trefoil Community members
- Your investment: € 8180 (this includes 21% VAT)
Andrea Mörnas
Machine Learning Engineer | Graduated in 2022
Multimedia International N.V.
Haile Duck
Senior data scientist | Graduated in 2021
Alliander N.V.
Michele Morone
DevOps Engineer | Graduated in 2020
Heineken N.V.
Structure of the MLOps Track training courses
Machine Learning systems are not only a complex collection of technologies and data algorithms. They are a complex interaction between technology and data algorithms on one side, and governance, organization and social acceptance on the other. All too often, ML projects start by trying to implement a particular technical approach and, not surprisingly, front-line managers and employees don’t find it useful, so there’s no real adoption and no ROI. This is why we believe that ML systems should be approached as sociotechnical systems: this view considers the dynamic combination of technology (including data), governance, people and process. The sociotechnical approach also provides guidance on how to define and deal with the ethical issues in ML systems.
- Module 1: Fundamentals, Introduction to the Trefoil sociotechnical MLOps framework, ML Systems Design
- Module 2: DataOps, Advanced Python, DevOps for Data Science, Kubernetes for beginners, Airflow
- Module 3: ModelOps, Model Life-cycle management, MLflow
- Module 4: MLOps and Model Governance, Data & ML regulations (GDPR, EU Artificial Intelligence Act), Explainability and Auditability in ML, Agile philosophy and methodologies
Fundamentals
Trefoil MLOps framework
- What is MLOps?
- Why and when to use MLOps? Machine Learning use cases classification.
- MLOps principles and best practices: use cases portfolio management, individual use cases management.
- How to implement MLOps principles into your project.
ML Systems Design
- Solution architecture for the ML application
- Dataflow through the ML application
- What components will be involved in the ML application?
- Other considerations:
  - Performance efficiency
  - Scalability (do we need a Databricks cluster or a Kubernetes cluster, or is a VM enough?)
  - Cost optimization
DataOps
Advanced Python
- Why should you not underestimate the quality of your code?
- Principle of object-oriented programming in Python
- Why write clean code?
- Clean Code and python conventions (PEP)
- Python virtual environments
- Software design and design patterns in data science
- Learn to correctly architect your code (packaging)
DevOps for Data Science
- Collaborative Machine Learning
- Know how to write unit tests in Python
- Use an orchestrator (for example GitLab CI) to automate test execution (CI) and code deployment (CD)
- Fundamental concepts of application containerization and useful Docker commands
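As an illustration of the unit-testing topic above, here is a minimal sketch using Python's standard `unittest` module. The function `clip_probability` and the test class are hypothetical examples invented for this illustration; they are not part of the course material.

```python
import unittest


def clip_probability(p: float) -> float:
    """Clamp a model score into the closed interval [0, 1] (hypothetical helper)."""
    return max(0.0, min(1.0, p))


class ClipProbabilityTest(unittest.TestCase):
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clip_probability(0.3), 0.3)

    def test_values_outside_range_are_clamped(self):
        self.assertEqual(clip_probability(-0.2), 0.0)
        self.assertEqual(clip_probability(1.7), 1.0)


def run_tests() -> unittest.TestResult:
    """Load and run the test case programmatically, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClipProbabilityTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

In a CI pipeline the same tests would typically be discovered and run by a command such as `python -m unittest`, so that a failing test blocks the deployment step.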
Kubernetes
- Components of a Kubernetes cluster (pod, volume, configmap, secrets)
- Main functionalities: how to ensure the resilience of apps? Notions of replication, services, deployments, load balancing.
- Scalability: how does Kubernetes adapt to demand? Advice on choosing the size of the nodes; notions of pod autoscaler and cluster autoscaler.
- Conclusion: in which situations should Kubernetes be used? Discussion of serverless and of architectures built from microservices that interact with each other.
Airflow
- Understand what Airflow is and in which contexts it is used
- Master the different concepts (DAG, DAG run, task, task instance, operator etc.)
- Understand the architecture of Airflow (Webserver, Scheduler, Executor, Metadata Database)
- Know how to define a DAG: parameterization, definition of tasks, dependencies between tasks
- Manipulate the Airflow graphical interface
- Get to grips with certain advanced concepts (TriggerDagRunOperator, Connections, Hooks)
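The idea behind a DAG of tasks with dependencies can be sketched without Airflow itself. The example below is an assumption-laden illustration that uses only the standard-library `graphlib` module (Python 3.9+), not the Airflow API; the task names are hypothetical. It shows how declaring "which tasks must finish first" is enough to derive a valid execution order, which is the job a scheduler performs.

```python
from graphlib import CycleError, TopologicalSorter

# Map each (hypothetical) task to the set of tasks that must finish before it starts.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "train": {"transform"},
    "report": {"train"},
}


def execution_order(deps: dict) -> list:
    """Return one valid order in which the tasks can run.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    i.e. if the graph is not actually a DAG.
    """
    return list(TopologicalSorter(deps).static_order())


def is_valid_dag(deps: dict) -> bool:
    """Check that the dependency graph is acyclic."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False
```

In Airflow the same dependencies would be declared between operators inside a DAG definition, and the scheduler decides when each task instance runs.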
ModelOps
Model lifecycle management
- What is the model lifecycle?
- How to identify the life cycle of a use case?
- How to organize a life cycle project team?
- Why are standard performance metrics not sufficient in production?
- How to monitor a model in production?
- How to reduce the maintenance cost of a model in production?
- How to manage the evolution of a model in production?
- How to minimize the risks of a model in production?
MLflow
- What is MLflow?
- How to ensure the reproducibility of a modeling chain?
- How to version predictive models?
- How to navigate a history of models via a graphical interface?
- How to package and distribute predictive models?
- How to deploy a predictive model on a server?
- How do MLflow and Docker interface?
- How to manage a cloud deployment?
- How to measure and detect data drift?
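One common way to quantify data drift for a single numeric feature is the Population Stability Index (PSI). The sketch below is a minimal pure-Python version under stated assumptions: equal-width bins derived from the reference sample, a small `eps` floor to avoid taking the logarithm of zero, and values outside the reference range counted in the nearest edge bin. The function name and parameters are illustrative choices, not something prescribed by the course or by MLflow.

```python
import math


def population_stability_index(reference, current, bins=10, eps=1e-6):
    """PSI between a reference sample and a current sample of one numeric feature."""
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if the reference sample is constant.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = max(0, min(bins - 1, idx))  # clamp out-of-range values to edge bins
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))
```

A PSI of zero means the binned distributions are identical, and larger values indicate a larger shift. The alert threshold is a matter of convention and should be chosen per use case.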
MLOps and Model Governance
Principles of GDPR
- The fundamental principles and main actors of the GDPR
- The risks and issues of the protection of personal data
- The heavy sanctions imposed by the CNIL
- The risks of personal data breaches
- Anonymization and pseudonymization of personal data
- The concept of privacy by design
- Simulation workshops and testing of acquired knowledge
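To make the pseudonymization topic above concrete, here is a minimal sketch that replaces a direct identifier with a keyed hash, using the standard-library `hmac` and `hashlib` modules. The function name and the example key are hypothetical. This is pseudonymization rather than anonymization: whoever holds the key can recompute the mapping and re-link records, so the key must be stored separately and protected.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for an identifier using HMAC-SHA256.

    The same identifier and key always give the same pseudonym, so records
    can still be joined, but the original value is not stored.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical usage: the key below is a placeholder, not a real secret.
example_key = b"replace-with-a-key-from-a-secrets-manager"
token = pseudonymize("jane.doe@example.com", example_key)
```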
The AI Act: European law on artificial intelligence (AI)
- The AI risk categories
- Prohibition of unacceptable AI practices
- Regulation of high-risk AI systems
- Conformity assessment
- Transparency obligations for potentially deceptive AI systems
- Ex post market surveillance
- Governance and penalties regime
Explainability
- Definition of explainability and why it is necessary
- How explainability fits into a Machine Learning Project
- Overview of classical Machine Learning intelligibility methods
- Focus on SHAP and Shapley values
- Extension to the particular cases of Computer Vision and NLP models
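The Shapley value behind SHAP can be illustrated on a toy problem by brute force: average each feature's marginal contribution over every ordering of the features. The sketch below is only practical for a handful of features, and it is not how the SHAP library computes its estimates; the value function `toy_value` and the feature names are hypothetical.

```python
from itertools import permutations


def shapley_values(value_fn, players):
    """Exact Shapley values by enumerating all orderings of the players."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when it joins the current coalition.
            totals[p] += value_fn(with_p) - value_fn(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical model payoff: two additive effects plus one interaction."""
    value = 0.0
    if "age" in coalition:
        value += 10.0
    if "income" in coalition:
        value += 20.0
    if "age" in coalition and "income" in coalition:
        value += 6.0  # interaction term, shared equally by the two features
    return value


contributions = shapley_values(toy_value, ["age", "income"])
```

Here the interaction of 6 is split evenly, giving 13 for `age` and 23 for `income`; the contributions sum to the payoff of the full feature set, which is the efficiency property of Shapley values.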
Fairness
- Facilitating Fairness throughout the Machine Learning Lifecycle
- Ethical Matrix for Advanced Analytics & AI
- Choosing the right Fairness Metrics
- GDPR, calculating and mitigating bias
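As one example of a fairness metric, the sketch below computes the demographic parity difference: the gap between the highest and lowest rate of positive predictions across groups. It is a minimal illustration with hypothetical function names and made-up data; which metric is appropriate depends on the use case, and this is only one of several candidates.

```python
def selection_rate(predictions):
    """Share of positive (1) predictions in a list of 0/1 predictions."""
    return sum(predictions) / len(predictions)


def demographic_parity_difference(predictions, groups):
    """Return (gap, per-group rates) for binary predictions and group labels."""
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {g: selection_rate(p) for g, p in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


# Made-up example data: eight predictions for two groups "a" and "b".
preds = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap, rates = demographic_parity_difference(preds, groups)
```

A gap of zero means every group receives positive predictions at the same rate; in this made-up data group "a" is selected at 0.75 and group "b" at 0.25, so the gap is 0.5.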
The capstone project
Building a Real-time Recommendation API
Scenario: A media organization wants to provide movie or video recommendations to its users. By providing personalized recommendations, the organization meets several business goals, including increased click-through rates, increased engagement on its website, and higher user satisfaction. This solution is optimized for the retail industry and for the media and entertainment industries.
Benefits of joining the training
- Build up the case for state-of-the-art Machine Learning Operationalisation through conceptual foundations, hands-on training and ready-to-use templates.
- Trainers with a solid academic and practical background in data science, data engineering and ML engineering.
- Experts from companies of different sizes will share practical insights and report on challenges and solutions from their daily practice.
- The training can be adapted to your organization’s needs where necessary. For example, we may agree to teach advanced Spark instead of Advanced Python, or Kubeflow instead of MLflow.
- In small groups you directly apply the newly learned concepts in a well-chosen capstone project.
- Get inspired by questions from other participants and by confronting your ideas with practices from the others – coming from various industries and markets
- Strengthen your motivation and advance your thinking – cross-check your MLOps roadmap with other practitioners
- Extend your network! You will enjoy plenty of time for personal exchange during coffee and lunch breaks.
Participation Fees
- The participation fee is € 8180 (exclusive of VAT) for the entire MLOps Track training. If more than one participant from the same company attends the training, each additional participant receives a 15% discount.
- Members of the Trefoil Community receive a 15% reduction on the fee.
- Included in the fee are lunches, catering during breaks, the dinner on the first evening, and all necessary course documents. Travel expenses and accommodation are not included in the fee.
Reimbursement Policy
- In the event of cancellation of participation no later than twenty-eight days prior to the beginning of the course, the fee will be fully refunded. If participation is cancelled at a later date or registered participants do not attend, Trefoil invoices a cancellation fee of € 500.
- Please note that the program is subject to change without prior notice.