Santosh Beora

GCP Data Engineer

5000 followers
Pune, Maharashtra, India

  • About me

    Data Engineer @Fractal | Ex TCS | Big Data Engineer | GCP Data Engineer | GCP PDE & ACE Certified | GCP | Python | SQL | Hadoop | Spark (PySpark) | Spark SQL | Hive | ETL/ELT | Airflow | Dataproc | Dataflow | BigQuery

  • Education

    • KG Engineering Institute

      2015 - 2018
      Diploma, Electronics and Telecommunication Engineering, 9.2 (87.8%)

      Activities and Societies: Volleyball

    • Kalyani Government Engineering College

      2018 - 2021
      B.Tech, Electronics and Communication Engineering, 8.76 (80.1%)
  • Experience

    • Tata Consultancy Services

      Oct 2021 - Jan 2024
      GCP Data Engineer

      PROJECT 2: DATABASE MIGRATION FROM IBM NETEZZA TO GCP BIGQUERY
      Tasks/Responsibilities:
      • Orchestrated ETL processes and batch data pipelines using Apache Airflow on Cloud Composer within GCP.
      • Contributed to transforming and processing large datasets by implementing BigQuery stored procedures.
      • Managed file ingestion and loading of data into target tables via a proprietary framework (TDF), using multiple GCP services, including Composer, Airflow, Dataflow, and BigQuery.
      • Implemented table partitioning and clustering in BigQuery to optimize query performance and reduce storage costs by approximately 20-30%, improving the efficiency of data processing workflows.
      • Monitored and addressed DAG errors, used Logs Explorer for troubleshooting, and resolved Dataflow job errors to ensure tasks executed successfully.
      • Ensured data quality and integrity by running comprehensive data validation with the Data Validation Tool (DVT).

      PROJECT 1: ON-PREMISE DATABASE MIGRATION FROM ORACLE TO GCP BIGQUERY
      Tasks/Responsibilities:
      • Designed and executed ETL processes and batch data pipelines using Apache Airflow on Cloud Composer within the GCP environment.
      • Processed and transformed large-scale datasets using Spark SQL on Dataproc clusters.
      • Optimized scripts, converted Oracle SQL scripts into Spark SQL in Jupyter Notebook, and validated the resulting data.
      • Optimized BigQuery tables by applying partitioning and clustering strategies, enabling efficient processing of terabytes of data and reducing storage costs by approximately 30%, while improving query performance for large-scale analytics.
      • Handled daily responsibilities, including working with BigQuery native tables, external tables, views, and Airflow (Composer) DAGs.
      • Conducted comprehensive data quality checks using BITS.

    • Fractal

      Feb 2024 - Present
      Data Engineer

      PROJECT: PEGA MESSAGE MIGRATION TO GCP
      Tasks/Responsibilities:
      1. Transferred messages between business and issue groups as per client requirements using the Pega platform.
      2. Created and tested messages, actions, and treatments in Pega, ensuring end-to-end (E2E) message flow and accuracy.
      3. Developed and ran Python scripts on Dataproc to process customer data, generating Pega PAR files with targeting and exclusion rules.
      4. Performed UAT on HSBC HK Cert APP, using Insomnia to validate message details and documenting results.

  • Licenses & Certifications

    • Programming Fundamentals

      Coursera
      May 2020
    • Introduction to Programming Using Java

      Programming Hub
      May 2021
    • Google Cloud Certified - Associate Cloud Engineer

      Google
      Jan 2022
    • Programming for Everybody (Getting Started with Python)

      Coursera
      Feb 2020
    • PCB Design and Fabrication

      CSIR-Central Mechanical Engineering Research Institute (CMERI)
      Oct 2017
    • Spoken English and Communication Skill

      MGUniversity
      Apr 2014
    • Machine Learning Using Python

      Globsyn Business School
      Feb 2020
    • Interview Skills

      TCS iON
      Apr 2020
    • Google Data Engineering Training

      Edureka
      May 2022
    • Python (Basic) Certificate

      HackerRank
      May 2022
  • Honors & Awards

    • 3 x Learning Achievement Awards

      Tata Consultancy Services
      May 2022
      I've been given the Learning Achievement Award in recognition of my great contribution to the company.
    • Special Initiative Award

      Tata Consultancy Services
      Jan 2022
      I've been given the Learning Achievement Award in recognition of my remarkable contribution to the organisation.