Course info
The course centers on Azure Databricks, covering its architecture,
collaborative workspace, data ingestion, transformation using Spark, and
integration with Delta Lake for scalable and reliable data pipelines.
Students learn how to orchestrate workflows, manage compute resources,
and deploy machine learning models in the Databricks environment.
The course also covers broader cloud data engineering concepts, including
storage solutions (e.g., Azure Data Lake), data movement (e.g., Azure Data Factory),
and monitoring. Complementary overviews of comparable services in AWS
(e.g., Glue, Redshift, SageMaker) and GCP (e.g., BigQuery, Dataflow,
Vertex AI) provide a cross-platform perspective, helping students
understand trade-offs in cloud service selection. The course combines
theoretical grounding with practical labs, giving students hands-on
experience building analytics solutions end-to-end in the cloud.
- Teacher: Damianos Chatziantoniou