Industry Academia Collaboration
Data Science & Engineering Lab

Department of Software Systems (PG)

From raw data to real insight. Engineer the future of information.

About the Industry Partner — KGM

KGM is a specialised industry training organisation with expertise in software development and data-focused education. Their data engineering trainers bring active field experience in pipeline design, Hadoop/Spark ecosystems, ETL workflows, and cloud data infrastructure — delivering this directly into the M.Sc. Software Systems lab at KGCAS across four active semesters, followed by internship facilitation and placement support.

About the Data Science & Engineering Lab

A postgraduate-level, industry-integrated lab at KGCAS, exclusively for M.Sc. Software Systems students, run entirely by KGM industry professionals. Students progress through Python, SQL, big data technologies, data warehousing, and ETL pipeline engineering — completing a production-grade capstone project, an industry internship, and receiving dedicated placement support via KGM's employer network.

Why This Course?

Industry trainers deliver every session, bringing real data engineering field experience

Structured curriculum: Python & SQL → Hadoop → Spark → Kafka → Cloud Pipeline Engineering

Postgraduate pace — designed for learners preparing to compete for technical data roles

Graduates leave with real pipeline projects, an internship record, and industry-aligned certification

KGM's network connects students directly with data engineering and analytics employers across India

Innovation Labs at a Glance

Semester | Focus Area
Semester I | Python Programming & Database Fundamentals
Semester II | Big Data Technologies
Semester III | Data Warehousing and Data Pipeline Engineering
Semester IV | Applied Data Engineering
Semester V | Industry Internship
Semester VI | Placement Support

Course Curriculum

Semester I
Python Programming & Database Fundamentals
Python for data engineering — scripting, data structures, file handling, and libraries
NumPy and Pandas for data manipulation and preprocessing
Relational databases — SQL, schema design, normalisation, and advanced querying
Python-to-database integration and connectivity
Data quality concepts — validation, cleaning, and consistency checks
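The Semester I themes of Python-to-database integration and data-quality checks can be sketched in a few lines using only the standard library. This is an illustrative example, not part of the syllabus: the table, column names, and validation rules are assumptions.

```python
import sqlite3

def load_valid_rows(rows):
    """Insert only rows that pass simple validation checks
    (non-empty name, integer score between 0 and 100)."""
    conn = sqlite3.connect(":memory:")  # in-memory relational database
    conn.execute("CREATE TABLE students (name TEXT NOT NULL, score INTEGER)")
    # Data-quality step: validate and clean before loading
    valid = [r for r in rows if r[0] and isinstance(r[1], int) and 0 <= r[1] <= 100]
    conn.executemany("INSERT INTO students VALUES (?, ?)", valid)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]

rows = [("Asha", 92), ("", 55), ("Ravi", 150), ("Meena", 78)]
print(load_valid_rows(rows))  # → 2 (two rows fail validation)
```

The same pattern scales up directly: swap sqlite3 for a production database driver and the validation list for a proper cleaning stage.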
Semester II
Big Data Technologies
Introduction to big data — the 5 Vs and the big data ecosystem
Hadoop framework — HDFS architecture, MapReduce, and distributed storage
Apache Spark — RDDs, DataFrames, Spark SQL, and Spark Streaming
Kafka fundamentals — message queuing, event streaming, and data ingestion pipelines
Cloud platforms for big data — AWS, GCP, or Azure data services
Big data use cases — log processing, clickstream analysis, and large-scale ETL
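The MapReduce model covered in Semester II can be illustrated conceptually in plain Python, no Hadoop cluster required. This is a teaching sketch of the three phases (map, shuffle, reduce); Hadoop's contribution is distributing exactly these phases across HDFS nodes.

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (key, 1) pair for every word
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all values by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values
    return {key: sum(values) for key, values in groups.items()}

logs = ["error timeout", "info ok", "error disk"]
counts = reduce_phase(shuffle(map_phase(logs)))
print(counts["error"])  # → 2
```

Log processing (one of the use cases above) is essentially this word count with richer keys, run over terabytes instead of three lines.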
Semester III
Data Warehousing and Data Pipeline Engineering
Data warehousing concepts — star schema, snowflake schema, and dimensional modelling
ETL pipeline design — extraction, transformation, loading, and orchestration
Apache Airflow — workflow automation and pipeline scheduling
Data lakehouse architecture — combining flexibility with structure
dbt (data build tool) — transformation workflows and data modelling best practices
Pipeline monitoring, logging, and error handling
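The Semester III combination of ETL and dimensional modelling can be sketched as a tiny star schema: a fact table referencing a dimension table, loaded by an extract-transform-load loop. The schema and sample data below are illustrative assumptions, using only the standard-library sqlite3 module.

```python
import sqlite3

# Extract: raw source records (stand-in for an operational system)
raw_sales = [
    {"product": "laptop", "amount": 55000},
    {"product": "laptop", "amount": 62000},
    {"product": "mouse", "amount": 700},
]

conn = sqlite3.connect(":memory:")
# Star schema: one dimension table, one fact table keyed to it
conn.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute("CREATE TABLE fact_sales (product_id INTEGER, amount INTEGER)")

for row in raw_sales:  # Transform + load
    conn.execute("INSERT OR IGNORE INTO dim_product (name) VALUES (?)", (row["product"],))
    pid = conn.execute("SELECT product_id FROM dim_product WHERE name = ?",
                       (row["product"],)).fetchone()[0]
    conn.execute("INSERT INTO fact_sales VALUES (?, ?)", (pid, row["amount"]))

total = conn.execute("""SELECT d.name, SUM(f.amount) FROM fact_sales f
                        JOIN dim_product d ON d.product_id = f.product_id
                        WHERE d.name = 'laptop' GROUP BY d.name""").fetchone()
print(total)  # → ('laptop', 117000)
```

In the course itself, tools like dbt and Airflow automate and schedule transformations of this shape rather than hand-written loops.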
Semester IV
Applied Data Engineering
End-to-end data engineering project — scoping, designing, building, and deploying a complete pipeline
Real-time data processing — streaming pipelines and event-driven architectures
Data governance, lineage, and cataloguing — managing data at scale
Cloud-native data engineering — deploying pipelines on cloud infrastructure
Introduction to MLOps — serving machine learning models through engineered pipelines
Capstone project — a production-grade data engineering solution for a real-world data problem
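The event-driven processing idea from Semester IV can be sketched in pure Python: events are consumed one at a time from a stream and the aggregate updates per event, rather than loading a full batch first. The event source and fields here are illustrative assumptions; in production the stream would come from a broker such as Kafka.

```python
def clickstream():
    """Simulated event source (stand-in for a Kafka topic or similar)."""
    events = [
        {"user": "u1", "action": "click"},
        {"user": "u2", "action": "view"},
        {"user": "u1", "action": "click"},
    ]
    yield from events  # generator: events arrive one at a time

def process(stream):
    """Running per-user click count, updated as each event arrives."""
    counts = {}
    for event in stream:  # each event handled on arrival, not batched
        if event["action"] == "click":
            counts[event["user"]] = counts.get(event["user"], 0) + 1
    return counts

print(process(clickstream()))  # → {'u1': 2}
```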
Semester V
Industry Internship
Placement at a technology organisation, data engineering team, or analytics firm
Application of pipeline, big data, and cloud data skills in a real workplace
Mentored and monitored by KGM trainers and college coordinators
Semester VI
Placement Support
Resume building, portfolio preparation, and technical profile development
Mock technical interviews — system design, SQL challenges, pipeline architecture, and HR rounds
Communication and workplace readiness training at the postgraduate level
Active placement facilitation through KGM's industry network

Certifications & Outcomes

Students completing the lab are prepared for industry-relevant certifications in data engineering, cloud platforms, and big data technologies; the specific certifications offered are announced at the start of each academic year.

Roles students are equipped for:

01 Data Engineer
02 Big Data Engineer
03 ETL / Pipeline Developer
04 Cloud Data Engineer
05 Data Infrastructure Analyst