K Naveen

ETL Developer

Frisco, Texas, United States

  • About me

    Data Engineer at Alexion Pharmaceuticals - ⭐️ AWS Certified ⭐️

  • Education

    • Sri Mittapalli College of Engineering, NH-5, Tummalapalem, PIN-522 233

      2012 - 2016
      Bachelor's degree, Computer Science
    • University of North Texas

      2022 - 2023
      Master's degree, Information Systems and Technologies
  • Experience

    • Citi India

      Jul 2016 - Mar 2018
      ETL Developer

      • Implemented end-to-end ETL processes to ensure seamless data flow and integration across the organization's systems.
      • Utilized ETL best practices to design and optimize data extraction, transformation, and loading workflows.
      • Collaborated with cross-functional teams to define and implement data integration strategies for improved efficiency.
      • Conducted performance tuning of ETL processes to enhance data processing speed and reduce latency.
      • Ensured data consistency and accuracy by implementing robust error handling and data validation mechanisms within ETL workflows.
      • Worked closely with stakeholders to understand data requirements and implement ETL solutions aligned with business needs.
      • Implemented automation for business processes such as bulk inventory loading and order clearance using scheduler jobs.
      • Demonstrated proficiency in Descriptive Programming, file system operations, and Excel functions.
      • Played a key role in project planning, automation efforts, and estimation processes.
      • Conducted thorough code reviews in adherence to the client's technical standards for applications developed by other teams.
      • Designed and developed Informatica mappings for extracting, transforming, and loading source data into the master schema.
      • Utilized various PowerCenter transformations, including Source Qualifier, Aggregator, Filter, Router, Sequence Generator, Lookup, Rank, Joiner, Expression, Stored Procedure, and Update Strategy, to meet business logic requirements.
      • Implemented parameterization of hard-coded values in Informatica for enhanced flexibility.
      • Leveraged UNIX commands for system navigation, file content checks, and permissions management.
      • Conducted in-depth analysis of business requirements and existing source systems for effective project understanding.
      • Created DDL scripts (tables, views, indexes) based on the physical data model specifications.
      • Developed various PL/SQL objects.
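The error-handling and data-validation pattern mentioned above can be sketched in Python. This is illustrative only; the field names and rules are hypothetical, not taken from the actual Citi workflows:

```python
# Minimal sketch of row-level validation with an error channel,
# a common ETL pattern. Field names and rules are hypothetical.

REQUIRED_FIELDS = ("account_id", "amount", "posted_date")

def validate_row(row):
    """Return (is_valid, reason). A row must carry all required
    fields and a numeric, non-negative amount."""
    for field in REQUIRED_FIELDS:
        if not row.get(field):
            return False, f"missing field: {field}"
    try:
        amount = float(row["amount"])
    except (TypeError, ValueError):
        return False, "amount is not numeric"
    if amount < 0:
        return False, "negative amount"
    return True, ""

def split_batch(rows):
    """Route each row to a clean list or an error list (tagged with
    a reason), so bad records never reach the load step."""
    clean, errors = [], []
    for row in rows:
        ok, reason = validate_row(row)
        if ok:
            clean.append(row)
        else:
            errors.append({**row, "_error": reason})
    return clean, errors
```

Rejected rows typically land in an error table or file for later review rather than silently dropping out of the pipeline.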

    • ICICI Bank

      Apr 2018 - Jan 2019
      Data Analyst | ETL

      • Collaborated with stakeholders to understand data extraction needs from the Help Desk Ticketing System.
      • Defined requirements for ad-hoc reports, charts, and graphs for comprehensive analysis.
      • Ensured robust data consistency by designing a systematic approach for integrating data from various source systems, including flat files, XML, and SQL databases.
      • Leveraged SDLC principles to design and implement columnstore indexes on dimension and fact tables in the OLTP database, optimizing read operations.
      • Utilized SQL Server Integration Services (SSIS) for seamless integration and analysis of data from diverse information sources.
      • Developed reports and report models using SQL Server Reporting Services (SSRS) to facilitate user-friendly report building.
      • Implemented ETL processes using a combination of AWS services, including AWS Glue for data cataloging, transformation, and loading.
      • Leveraged AWS Lambda for serverless computing, automating tasks such as data loading/unloading and other data processing functions.
      • Utilized AWS Data Pipeline to orchestrate and automate the movement and transformation of data between different AWS services.
      • Designed and implemented a scalable Data Lake architecture using Amazon S3 for efficient storage of structured and unstructured data.
      • Utilized AWS Glue for data discovery and schema evolution, ensuring flexibility in handling diverse datasets.
      • Employed AWS Lambda to automate tasks such as Kubernetes deployment, Snow SQL script execution, and other continuous data flow processes.
      • Automated workflows and tasks using AWS Step Functions for orchestrating various operations, including Amazon SageMaker tasks.
      • Integrated Amazon Redshift for high-performance querying, employing Redshift Spectrum for seamless querying of data stored in Amazon S3.
      • Optimized and fine-tuned the Redshift environment so queries performed significantly faster for reporting tools like Tableau and SAS Visual Analytics.
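Data Lake layouts on Amazon S3, like the one described above, typically rely on date-partitioned object keys so that Athena and Redshift Spectrum can prune partitions at query time. A minimal sketch of the key scheme (the bucket prefix and dataset names are hypothetical):

```python
from datetime import date

def s3_partition_key(prefix, dataset, day, filename):
    """Build a Hive-style partitioned S3 object key
    (year=/month=/day=), so query engines such as Athena and
    Redshift Spectrum can skip partitions outside a date filter."""
    return (
        f"{prefix}/{dataset}/"
        f"year={day.year}/month={day.month:02d}/day={day.day:02d}/"
        f"{filename}"
    )
```

A daily load job would then write each batch under that day's key, and ad-hoc queries filtered on year/month/day scan only the matching folders.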

    • Mitsubishi Motors Corporation

      Feb 2019 - Feb 2020
      Data Engineer

      • Analyzed business requirements and translated them into logical specifications.
      • Designed and customized data models for a Data Warehouse supporting real-time data from multiple sources.
      • Developed ETL flows and source-to-target mappings for efficient data loading into the Snowflake Cloud Data Warehouse.
      • Automated data loading/unloading between the Snowflake Data Warehouse and AWS S3 using cloud-based task flows.
      • Configured a test environment on AWS and managed Hadoop log files.
      • Authored Snow SQL scripts in the Snowflake Data Warehouse to support business reporting needs.
      • Created Power BI reports with various visualizations for comprehensive data representation.
      • Implemented Spark-Kafka streaming for ingesting data from Kafka into the Spark pipeline.
      • Architected ETL transformation layers and crafted Spark jobs for efficient data processing.
      • Employed microservices and Postman to interact with Kubernetes DEV and Hadoop clusters.
      • Deployed various microservices (Spark, MongoDB, Cassandra) in Dockerized Kubernetes and Hadoop clusters.
      • Conducted comprehensive analysis of compiled data from various sources to derive actionable insights.
      • Utilized DAX functions for logical data visualization and Power Query for data transformation.
      • Engineered efficient, scalable ETL processes for loading, cleansing, and validating data.
      • Engaged in the full software development lifecycle using Agile methodologies.
      • Integrated Kafka with Spark Streaming for real-time data processing.
      • Scripted and devised an indexing strategy for migrating to Amazon Redshift from SQL Server and MySQL databases.
      • Utilized AWS Data Pipeline, Stash (Bitbucket Server), Airflow, EMR, and Snowflake for efficient data processing and control.
      • Configured AWS EMR clusters, employed AWS IAM for access control, and used AWS Glue for ETL processes.
      • Optimized and fine-tuned the Redshift environment for faster query performance.
      • Designed and developed ETL processes in AWS Glue.
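Automated Snowflake loads from S3, like those described above, are usually driven by generated COPY INTO statements executed via Snow SQL or a scheduler. A sketch of such a statement builder (the stage, table, and file-format names are hypothetical placeholders):

```python
def build_copy_into(table, stage, path, file_format="parquet_fmt"):
    """Render a Snowflake COPY INTO statement that loads files from
    an external S3 stage into a target table, failing fast on any
    bad record. Names are placeholders, not real objects."""
    return (
        f"COPY INTO {table} "
        f"FROM @{stage}/{path} "
        f"FILE_FORMAT = (FORMAT_NAME = '{file_format}') "
        f"ON_ERROR = 'ABORT_STATEMENT';"
    )
```

A task flow would render one such statement per arriving batch and submit it through the Snowflake connector; the reverse direction (unload) follows the same shape with COPY INTO @stage.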

    • New York Life Insurance Company

      Mar 2020 - Dec 2021
      Data Engineer

      • Collaborated with partners to design scalable solutions on GCP using Dataflow, Databricks, and BigQuery.
      • Implemented data transformations, aggregations, and joins on large datasets using Google Cloud Storage, Dataproc, and Dataflow.
      • Integrated BigQuery with Data Studio for interactive visualizations and analytics dashboards.
      • Collaborated on solution design, outlining the architecture and selecting appropriate GCP tools.
      • Automated tasks such as Kubernetes deployments and Snow SQL scripts for continuous data flow.
      • Validated the developed pipelines, ETL processes, and orchestration mechanisms for accuracy and reliability.
      • Executed migration strategies, creating architecture and data flow diagrams and estimating cluster sizes.
      • Configured and deployed orchestration pipelines for data loading and Cassandra management.
      • Contributed to Agile development by documenting Data Governance policies, the Data Dictionary, and metadata.
      • Documented AWS Glue services, Redshift queries, and Spark jobs for reference and knowledge sharing.
      • Monitored and optimized data processes, addressing any issues in the implemented solutions.
      • Ensured continuous improvement by refining ETL processes and adapting to evolving requirements.
      • Automated Snow SQL scripts for continuous data load/unload between Snowflake and AWS S3.
      • Used AWS Glue, S3, EMR, Lambda, RDS, and Athena for data processing and transformations.
      • Developed ETL processes in AWS Glue to migrate campaign data into AWS Redshift.
      • Utilized AWS Redshift, S3, Redshift Spectrum, and Athena for querying large datasets on S3, creating a virtual Data Lake.
      • Normalized data and performed cleansing, datatype modifications, and transformations using Spark, Scala, and AWS EMR.
      • Developed Sqoop jobs for importing data from Oracle to AWS S3.
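Sqoop imports like the Oracle-to-S3 jobs above are often wrapped in helpers that assemble the `sqoop import` command-line arguments per table. A sketch (the JDBC URL, table, and target directory are placeholders):

```python
def sqoop_import_args(jdbc_url, table, target_dir, num_mappers=4):
    """Assemble the argument list for a `sqoop import` run that
    copies one relational table into a target directory as Parquet.
    All connection details here are hypothetical placeholders."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,       # JDBC URL of the source database
        "--table", table,            # source table to import
        "--target-dir", target_dir,  # destination path (HDFS or s3://)
        "--num-mappers", str(num_mappers),  # parallel map tasks
        "--as-parquetfile",          # write columnar output
    ]
```

The resulting list would be handed to a scheduler or `subprocess.run`; one helper call per table keeps a multi-table nightly import uniform and auditable.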

    • Alexion Pharmaceuticals, Inc.

      Aug 2022 - Present
      Data Engineer

      • Migrating from on-premises systems to Google Cloud Platform (GCP), employing Apache Camel and StreamSets for seamless data transfers.
      • Transformed and loaded data from Oracle 19c to a Snowflake Data Warehouse.
      • Executed on-premises data migrations to GCP, utilizing Apache NiFi and Oracle 19c.
      • Architected multiple data pipelines and end-to-end ETL and ELT processes for data ingestion and transformation in GCP.
      • Ensured optimized Snowflake virtual warehouse sizing for various workloads.
      • Developed migration plans and selected GCP services for hosting Oracle databases, establishing a robust GCP Data Lake with Google Cloud Storage, BigQuery, and Bigtable.
      • Automated workflows using Apache Airflow for Change Data Capture (CDC) services, providing efficient and timely data processing.
      • Built data integration solutions with Oracle Data Integrator (ODI) and customized Talend components, optimizing Apache NiFi pipelines.
      • Utilized Apache NiFi for diverse data ingestion workflows into GCP.
      • Leveraged Kafka and Spark Streaming for data ingestion workflows into GCP.
      • Implemented scalable data solutions with Hadoop, including MapReduce programs, and built ETL pipelines in Databricks using Python.
      • Created Databricks Spark jobs with PySpark for various operations; extracted and analyzed data from data lakes and enterprise data warehouses.
      • Utilized SQL, Scala, Python, and Apache Spark to produce insightful reports.
      • Proficient in GCP, BigQuery, SQL, Python scripting, PySpark, Airflow, Kubernetes, Terraform, and data modeling.
      • Embraced infrastructure as code (IaC) and Cloud Shell usage throughout the development lifecycle.
      • Followed the complete SDLC process, including code reviews, source code management, and build processes.
      • Leveraged Git for version control, ensuring effective collaboration and change tracking.
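Airflow-driven Change Data Capture (CDC) workflows like the one mentioned above often reduce to watermark-based incremental pulls: each run selects rows changed since the last stored watermark, then persists a new one. A minimal sketch of that selection logic (the row shape and timestamp column are hypothetical):

```python
def incremental_rows(rows, watermark):
    """Return the rows changed since the last watermark, plus the
    new watermark to persist for the next CDC run. ISO-8601 strings
    compare correctly in lexicographic order, so no parsing needed.
    The 'updated_at' column name is a hypothetical example."""
    changed = [r for r in rows if r["updated_at"] > watermark]
    new_watermark = max((r["updated_at"] for r in changed), default=watermark)
    return changed, new_watermark
```

In an Airflow DAG, one task would read the stored watermark (e.g. from an Airflow Variable), a second would extract and load the changed rows, and a final task would commit the new watermark only after a successful load.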

  • Licenses & Certifications

    • AWS Certified Solutions Architect - Associate

      Amazon Web Services (AWS)
      Jul 2023
  • Volunteer Experience

    • Troop Captain

      Issued by The Bharat Scouts and Guides