Data Enthusiast
As a Data Engineer, I design robust, scalable data architectures, optimize queries, and ensure data integrity across large-scale systems. My expertise includes relational and NoSQL databases, real-time data pipelines with Apache Kafka, big data processing with Spark, and automated deployments on Kubernetes. I have built an email filtering system for secure communications and DiabetoWeb, a medical decision-support tool for diabetes risk assessment, and I deliver solutions for business intelligence and predictive analytics.
- ⚡ Optimized ETL pipeline with Apache Spark for real-time IoT analytics
- 📑 NLP model to classify legal documents
- 🔹 MLOps with MLflow and Kubeflow
- 🔹 BigQuery query optimization
- 📊 Open-source data visualization projects
- 🧠 LLM benchmarks
- 🚀 Advanced orchestration with Apache Airflow
- 📦 Feature store implementation
- 🧹 Data cleaning with Pandas
- 📐 Best practices in data modeling
- 💾 Database Design & Optimization: Relational & NoSQL schema design, query tuning, indexing, normalization
- 🔄 ETL & Data Processing: Apache Spark, Kafka, Airflow, data cleaning, transformation, pipeline orchestration
- 🤖 Machine Learning & AI: NLP models, clustering, classification, predictive analytics (Random Forest, Logistic Regression)
- ⚙️ Automation & DevOps for Data: Kubernetes, Docker, CI/CD for scalable deployments
- 📊 Data Visualization & BI: Tableau, Power BI, dashboard creation, KPI monitoring
| Speciality | Technologies |
|---|---|
| Data Engineering | |
| Data Science | |
| Others (Dev & Frameworks) | |