Brahma's Personal Blog

Writing about Data Engineering problems and SQL

View the Project on GitHub brahma19/blog

About Me

Hello! I’m a Data Engineer with a strong focus on Google Cloud Platform (GCP) and Hadoop services. I’m passionate about big data and cloud computing, and I enjoy solving complex data challenges with cutting-edge technology. I thrive as a problem solver who loves to work with data.

GCP Certifications

Leadership

Strategic Impact

Budget Management

Passion

In my free time, I enjoy playing chess. You can find me on Chess.com or Lichess playing blitz games.

Expertise

Projects

1. Real-Time Data Pipeline with GCP

Designed and implemented a real-time data pipeline using Google Cloud Pub/Sub, Dataflow, and BigQuery to process and analyze streaming data.

2. Hadoop Cluster Optimization

Optimized a large-scale Hadoop cluster, improving performance and reducing costs by fine-tuning resource allocation and applying best practices.

3. Data Lake on GCP

Migrated on-premises Hadoop workloads and data to GCP, integrating with BigQuery and Dataproc for analytics and machine learning.

4. BigQuery ELT for Wireless Customers

Developed an ELT process in BigQuery capable of processing 70 TB of data per hour within a 10-minute SLA, aggregating session-level information and calculating hourly KPIs for wireless customers.
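To give a feel for the aggregation step described above, here is a minimal sketch in plain Python. It is an in-memory stand-in for the idea of rolling session-level records up into hourly KPIs, not the BigQuery implementation itself. The field names (`customer_id`, `start`, `bytes`) and the three KPIs are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime


def hourly_kpis(sessions):
    """Roll session-level records up into per-hour KPIs.

    Each session is a dict with hypothetical keys:
    customer_id (str), start (datetime), bytes (int).
    """
    buckets = defaultdict(lambda: {"sessions": 0, "bytes": 0, "customers": set()})
    for s in sessions:
        # Truncate the session start time to the top of its hour.
        hour = s["start"].replace(minute=0, second=0, microsecond=0)
        bucket = buckets[hour]
        bucket["sessions"] += 1
        bucket["bytes"] += s["bytes"]
        bucket["customers"].add(s["customer_id"])

    # Emit one KPI row per hour, in chronological order.
    return {
        hour: {
            "sessions": b["sessions"],
            "total_bytes": b["bytes"],
            "unique_customers": len(b["customers"]),
        }
        for hour, b in sorted(buckets.items())
    }


# Illustrative sample data (made up for this sketch).
sample_sessions = [
    {"customer_id": "a", "start": datetime(2024, 1, 1, 10, 5), "bytes": 100},
    {"customer_id": "a", "start": datetime(2024, 1, 1, 10, 40), "bytes": 50},
    {"customer_id": "b", "start": datetime(2024, 1, 1, 11, 0), "bytes": 70},
]
sample_kpis = hourly_kpis(sample_sessions)
```

In a warehouse, the same shape would typically be a `GROUP BY` on the truncated session timestamp. The Python version only shows the logic at a small scale.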

5. Cost Optimization with Cloud FinOps

Applied FinOps principles to cut costs across GCP services. Migrating Spark/Hadoop workloads from long-running Dataproc clusters to Dataproc Serverless and native BigQuery ELT reduced project-level costs by 45%.

Contact Me

Feel free to reach out for collaboration or if you have questions about data engineering on GCP and Hadoop!

