Reza Rajan
Data Engineer - AWS Specialization
Experience
myKaarma - Data Engineer (Cloud)
Jul 2023 - Jan 2024
- Designed and contributed to a cloud data lakehouse architecture on AWS to centralize and scale analytics and reporting.
- Built and maintained ETL/ELT pipelines using Python, Apache Airflow, AWS S3, AWS Glue and Redshift to ingest, transform and store cleaned event and transactional data.
- Automated scheduled customer report generation and delivery using AWS Step Functions and Lambda with SLA monitoring and alerting.
- Delivered a PoC batch processing pipeline using Airflow + PySpark that reduced message data processing time by ~70% compared to a stored-procedure-based approach; validated performance and reliability metrics before handoff.
myKaarma - Data Engineer (Analytics)
Feb 2022 - Jul 2023
- Collaborated with external stakeholders to capture reporting requirements and translate them into actionable technical specifications for the data team.
- Developed stored procedures for the reporting warehouse, optimizing several with indexing strategies to reduce query execution time by up to 50%.
- Integrated multiple operational and external data sources into Power BI, enabling standardized dashboards that reduced manual reporting workload for the analytics team.
- Partnered with engineers to improve data quality checks and streamline ETL processes feeding the reporting warehouse.
Project Experience
Data Pipeline in AWS using Apache Airflow
Built a modular, production-grade ETL pipeline in Apache Airflow on AWS that supports reusable tasks, backfills, monitoring, and automated data quality checks.
Data Warehouse using AWS Redshift
Developed an ETL process to extract data from S3, stage in Redshift, and transform into dimensional star-schema tables for downstream analytics.
Data Modeling with Apache Cassandra
Designed and implemented an Apache Cassandra data model optimized for query performance and analytics use cases.
Certifications
AWS Certified Data Engineer
Amazon - In-Progress
Data Engineer
Udacity - 2024
Self-Driving Car Engineer
Udacity - 2021
Data Analyst
Udacity - 2020
Education
BSc. Mechanical Engineering
University of Waterloo
2015 - 2020
Technical Skills
- Core Competencies
- etl/elt, data lakehouse, pyspark, apache airflow, dbt, kafka, aws s3/glue/redshift, lambda, step functions, sql, data modeling, performance tuning, monitoring (grafana/prometheus), ci/cd for data pipelines
- Programming Languages
- python, golang, sql, bash, nix
- Data
- postgresql, mariadb, sqlite, redshift, influxdb, cassandra, mongodb, s3, flink, lambda, dbt, airflow, kafka, spark, elasticsearch, etl/elt
- Visualization
- grafana, power bi, looker studio, plotly
- Data Libraries
- pandas, scipy, numpy, tensorflow, pyspark
- Infrastructure
- linux, docker, kubernetes, aws, proxmox, ansible, terraform, opentofu
- GitOps
- git, github actions, ci/cd