Krishna Mohan Reddy Paleti

Dallas-Fort Worth Metroplex
1,000 followers
  • Timeline

  • About me

    Data Engineer | Big Data Architecture Specialist | Cloud Solutions Expert | ETL & Data Warehousing Professional | Agile Collaborator Driving Data-Driven Insights

  • Education

    • Sri Venkateswara University

      2017 - 2021
      Bachelor's degree, Engineering
    • Texas A&M University-Commerce

      2023 -
      Master's degree, Business Analytics

      Relevant Coursework: Data Warehousing, Database Management, Data Visualization, Big Data Technologies, Machine Learning, Business Analytics Programming, Project Management and Agile Methodologies.

  • Experience

    • Medanta

      May 2020 - Apr 2021
      Data Engineer

      • Built and architected multiple data pipelines, including end-to-end ETL/ELT processes for data ingestion and transformation in GCP.
      • Utilized AWS EC2, S3, and Lambda to manage cloud infrastructure and perform complex data operations.
      • Processed and loaded data from Google Pub/Sub to BigQuery using Cloud Dataflow with Python and R for real-time data handling.
      • Deployed and maintained containerized data services using Docker and Kubernetes for horizontal scalability.
      • Leveraged Talend for designing and automating ETL jobs to integrate data from various sources into AWS and GCP.
      • Applied machine learning techniques and statistical modeling using Python, Pandas, and NumPy to improve customer behavior analysis and subscription predictions.
      • Integrated Hive and Spark for processing large datasets on AWS EMR clusters, optimizing query performance for analytics.
      • Utilized Hadoop Distributed File System (HDFS) to efficiently manage storage and retrieval of large datasets.
      • Automated deployment and resource provisioning with Terraform, optimizing cloud infrastructure for cost efficiency.
      • Developed dashboards using Power BI for real-time monitoring of patient data and resource allocation across hospital operations.
      • Managed data replication and synchronization across cloud services using Kafka for low-latency data streaming.
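The Pub/Sub-to-BigQuery ingestion described above can be sketched as a minimal message-transform step. This is a hypothetical illustration only: the payload schema (`patient_id`, `ward`, `vitals`) and field names are assumptions, not taken from the actual pipeline.

```python
import json
from datetime import datetime, timezone

def transform_message(raw: bytes) -> dict:
    """Parse a Pub/Sub-style JSON payload into a flat row shaped for
    BigQuery loading. All field names here are illustrative."""
    record = json.loads(raw.decode("utf-8"))
    return {
        "patient_id": str(record["patient_id"]),          # normalize ID to string
        "ward": record.get("ward", "UNKNOWN").upper(),    # default for missing ward
        "heart_rate": int(record["vitals"]["heart_rate"]),
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }

row = transform_message(
    b'{"patient_id": 42, "ward": "icu", "vitals": {"heart_rate": 72}}'
)
```

In a real Dataflow job this transform would run inside an Apache Beam `DoFn` between the Pub/Sub source and the BigQuery sink.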

    • HDFC Bank

      May 2021 - Dec 2022
      AWS Data Engineer

      • Implemented Apache Airflow for authoring, scheduling, and monitoring ETL data pipelines with custom DAGs.
      • Architected and built infrastructure for Google Cloud Platform (GCP), including BigQuery and Cloud Dataflow, ensuring data flow automation.
      • Leveraged AWS Lambda and EMR for distributed data processing, using Spark, Hadoop, and Kafka to handle real-time data streams.
      • Developed Cassandra-based distributed databases to improve availability and scalability for key business services.
      • Integrated Sqoop for efficient data migration between Hadoop and RDBMS systems like MySQL and Oracle.
      • Applied Python and R for machine learning pipelines, handling data preprocessing and feature engineering.
      • Built data pipelines in AWS Glue and GCP Cloud Dataflow to manage structured and unstructured data with Scala and Python.
      • Implemented Snowflake for data warehousing solutions, optimizing query performance and data partitioning.
      • Developed machine learning models using Scikit-learn and TensorFlow, focusing on fraud detection and customer segmentation.
      • Automated infrastructure deployment using Ansible for configuration management across AWS and GCP environments.
      • Employed Docker and Kubernetes to containerize applications, ensuring smooth deployment across cloud environments.
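The feature-engineering step feeding the fraud-detection models mentioned above might look like the following minimal sketch. The transaction schema (`customer_id`, `amount`) and the chosen aggregates are hypothetical, shown only to illustrate the kind of per-customer features passed to a Scikit-learn model.

```python
from collections import defaultdict
from statistics import mean

def fraud_features(transactions: list[dict]) -> dict:
    """Aggregate per-customer features (transaction count, mean and max
    amount) of the kind fed into a fraud-detection classifier.
    The input schema is illustrative, not the bank's actual schema."""
    by_customer = defaultdict(list)
    for txn in transactions:
        by_customer[txn["customer_id"]].append(txn["amount"])
    return {
        cid: {
            "n_txns": len(amounts),
            "avg_amount": mean(amounts),
            "max_amount": max(amounts),
        }
        for cid, amounts in by_customer.items()
    }

feats = fraud_features([
    {"customer_id": "c1", "amount": 120.0},
    {"customer_id": "c1", "amount": 80.0},
    {"customer_id": "c2", "amount": 9500.0},
])
```

At production scale this aggregation would typically run in Spark or a Glue job rather than in-memory Python, but the feature logic is the same.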

    • Chubb

      Jul 2023 - Present
      Data Engineer

      • Built and managed AWS Data Pipelines using Lambda to facilitate data transfers between S3, DynamoDB, and Redshift for optimized ETL processes.
      • Developed complex T-SQL queries and stored procedures to fulfill business user requirements, integrating with Azure Synapse Analytics for real-time processing.
      • Integrated ERWIN and MB MDR for logical and physical data modeling, improving metadata management across projects.
      • Optimized data encryption processes in PySpark using hashing algorithms, ensuring client data security.
      • Designed and optimized Azure Data Factory pipelines for seamless data migration and ETL automation.
      • Implemented AWS EMR clusters for distributed processing of large datasets using Spark, Hive, and Hadoop.
      • Developed Python-based REST APIs for revenue tracking and analysis, automating report generation processes.
      • Used C++ for performance improvements in the data ingestion pipeline, optimizing low-level system operations.
      • Managed HBase clusters for high-performance storage and retrieval of sparse data.
      • Built comprehensive Tableau and Power BI dashboards for real-time KPI tracking and performance analysis.
      • Used Terraform to automate infrastructure provisioning for Kubernetes and Docker containers, enabling scalability across cloud platforms.
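The PySpark hashing work described above can be sketched as a salted SHA-256 pseudonymization of PII fields. This is a hedged illustration: the field names (`ssn`, `email`) and the hard-coded salt are assumptions for the example; in practice the salt would come from a secret store, and the function would run as a Spark UDF.

```python
import hashlib

SALT = b"example-salt"  # illustrative only; real deployments load this from a secret store

def hash_pii(value: str) -> str:
    """Deterministically pseudonymize a PII value with salted SHA-256."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()

def mask_record(record: dict, pii_fields=("ssn", "email")) -> dict:
    """Hash the assumed PII fields of a record, leaving other columns intact."""
    return {k: hash_pii(v) if k in pii_fields else v for k, v in record.items()}

masked = mask_record({"ssn": "123-45-6789", "email": "a@b.com", "state": "TX"})
```

Because the hash is deterministic for a fixed salt, masked values can still be joined across tables without exposing the raw identifiers.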

  • Licenses & Certifications