Krishna Mohan Reddy Paleti

Dallas-Fort Worth Metroplex
1,000 followers
  • Timeline

  • About me

    Data Engineer | Big Data Architecture Specialist | Cloud Solutions Expert | ETL & Data Warehousing Professional | Agile Collaborator Driving Data-Driven Insights

  • Education

    • Sri Venkateswara University

      2017 - 2021
      Bachelor's degree, Engineering
    • Texas A&M University-Commerce

      2023 -
      Master's degree, Business Analytics

      Relevant Coursework: Data Warehousing, Database Management, Data Visualization, Big Data Technologies, Machine Learning, Business Analytics Programming, Project Management and Agile Methodologies.

  • Experience

    • Medanta

      May 2020 - Apr 2021
      Data Engineer

      • Built and architected multiple data pipelines, including end-to-end ETL/ELT processes for data ingestion and transformation in GCP.
      • Utilized AWS EC2, S3, and Lambda to manage cloud infrastructure and perform complex data operations.
      • Processed and loaded data from Google Pub/Sub to BigQuery using Cloud Dataflow with Python and R for real-time data handling.
      • Deployed and maintained containerized data services using Docker and Kubernetes for horizontal scalability.
      • Leveraged Talend for designing and automating ETL jobs to integrate data from various sources into AWS and GCP.
      • Applied machine learning techniques and statistical modeling using Python, Pandas, and NumPy to improve customer behavior analysis and subscription predictions.
      • Integrated Hive and Spark for processing large datasets on AWS EMR clusters, optimizing query performance for analytics.
      • Utilized Hadoop Distributed File System (HDFS) to efficiently manage storage and retrieval of large datasets.
      • Automated deployment and resource provisioning with Terraform, optimizing cloud infrastructure for cost efficiency.
      • Developed dashboards using Power BI for real-time monitoring of patient data and resource allocation across hospital operations.
      • Managed data replication and synchronization across cloud services using Kafka for low-latency data streaming.
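The Pub/Sub-to-BigQuery ingestion described above can be sketched as a minimal message-transform step. This is a hypothetical illustration only: the payload schema (`patient_id`, `ward`, `vitals`) and field names are assumptions, not taken from the actual pipeline.

```python
import json
from datetime import datetime, timezone

def transform_message(raw: bytes) -> dict:
    """Parse a Pub/Sub-style JSON payload into a flat row shaped for
    BigQuery loading. All field names here are illustrative."""
    record = json.loads(raw.decode("utf-8"))
    return {
        "patient_id": str(record["patient_id"]),          # normalize ID to string
        "ward": record.get("ward", "UNKNOWN").upper(),    # default for missing ward
        "heart_rate": int(record["vitals"]["heart_rate"]),
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }

row = transform_message(
    b'{"patient_id": 42, "ward": "icu", "vitals": {"heart_rate": 72}}'
)
```

In a real Dataflow job this transform would run inside an Apache Beam `DoFn` between the Pub/Sub source and the BigQuery sink.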

    • HDFC Bank

      May 2021 - Dec 2022
      AWS Data Engineer

      • Implemented Apache Airflow for authoring, scheduling, and monitoring ETL data pipelines with custom DAGs.
      • Architected and built infrastructure for Google Cloud Platform (GCP), including BigQuery and Cloud Dataflow, ensuring data flow automation.
      • Leveraged AWS Lambda and EMR for distributed data processing, using Spark, Hadoop, and Kafka to handle real-time data streams.
      • Developed Cassandra-based distributed databases to improve availability and scalability for key business services.
      • Integrated Sqoop for efficient data migration between Hadoop and RDBMS systems like MySQL and Oracle.
      • Applied Python and R for machine learning pipelines, handling data preprocessing and feature engineering.
      • Built data pipelines in AWS Glue and GCP Cloud Dataflow to manage structured and unstructured data with Scala and Python.
      • Implemented Snowflake for data warehousing solutions, optimizing query performance and data partitioning.
      • Developed machine learning models using Scikit-learn and TensorFlow, focusing on fraud detection and customer segmentation.
      • Automated infrastructure deployment using Ansible for configuration management across AWS and GCP environments.
      • Employed Docker and Kubernetes to containerize applications, ensuring smooth deployment across cloud environments.
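The feature-engineering step feeding the fraud-detection models mentioned above might look like the following minimal sketch. The transaction schema (`customer_id`, `amount`) and the chosen aggregates are hypothetical, shown only to illustrate the kind of per-customer features passed to a Scikit-learn model.

```python
from collections import defaultdict
from statistics import mean

def fraud_features(transactions: list[dict]) -> dict:
    """Aggregate per-customer features (transaction count, mean and max
    amount) of the kind fed into a fraud-detection classifier.
    The input schema is illustrative, not the bank's actual schema."""
    by_customer = defaultdict(list)
    for txn in transactions:
        by_customer[txn["customer_id"]].append(txn["amount"])
    return {
        cid: {
            "n_txns": len(amounts),
            "avg_amount": mean(amounts),
            "max_amount": max(amounts),
        }
        for cid, amounts in by_customer.items()
    }

feats = fraud_features([
    {"customer_id": "c1", "amount": 120.0},
    {"customer_id": "c1", "amount": 80.0},
    {"customer_id": "c2", "amount": 9500.0},
])
```

At production scale this aggregation would typically run in Spark or a Glue job rather than in-memory Python, but the feature logic is the same.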

    • Chubb

      Jul 2023 - Present
      Data Engineer

      • Built and managed AWS Data Pipelines using Lambda to facilitate data transfers between S3, DynamoDB, and Redshift for optimized ETL processes.
      • Developed complex T-SQL queries and stored procedures to fulfill business user requirements, integrating with Azure Synapse Analytics for real-time processing.
      • Integrated ERWIN and MB MDR for logical and physical data modeling, improving metadata management across projects.
      • Optimized data encryption processes in PySpark using hashing algorithms, ensuring client data security.
      • Designed and optimized Azure Data Factory pipelines for seamless data migration and ETL automation.
      • Implemented AWS EMR clusters for distributed processing of large datasets using Spark, Hive, and Hadoop.
      • Developed Python-based REST APIs for revenue tracking and analysis, automating report generation processes.
      • Used C++ for performance improvements in the data ingestion pipeline, optimizing low-level system operations.
      • Managed HBase clusters for high-performance storage and retrieval of sparse data.
      • Built comprehensive Tableau and Power BI dashboards for real-time KPI tracking and performance analysis.
      • Used Terraform to automate infrastructure provisioning for Kubernetes and Docker containers, enabling scalability across cloud platforms.
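The PySpark hashing work described above can be sketched as a salted SHA-256 pseudonymization of PII fields. This is a hedged illustration: the field names (`ssn`, `email`) and the hard-coded salt are assumptions for the example; in practice the salt would come from a secret store, and the function would run as a Spark UDF.

```python
import hashlib

SALT = b"example-salt"  # illustrative only; real deployments load this from a secret store

def hash_pii(value: str) -> str:
    """Deterministically pseudonymize a PII value with salted SHA-256."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()

def mask_record(record: dict, pii_fields=("ssn", "email")) -> dict:
    """Hash the assumed PII fields of a record, leaving other columns intact."""
    return {k: hash_pii(v) if k in pii_fields else v for k, v in record.items()}

masked = mask_record({"ssn": "123-45-6789", "email": "a@b.com", "state": "TX"})
```

Because the hash is deterministic for a fixed salt, masked values can still be joined across tables without exposing the raw identifiers.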

  • Licenses & Certifications