Prashanth Kumar

Data Warehouse Developer

1000 followers

Jersey City, New Jersey, United States

Timeline
About me
Azure Data Engineer | Data Modeling | ETL | Data Pipelines | Python | Databricks | Spark | Kafka | Hadoop | Data Analytics | Snowflake | Terraform | Power BI | Data Warehousing
Education
- University at Albany, SUNY
  2012 - 2013
  Master's degree
  I pursued a comprehensive curriculum in Computer Science, laying a solid foundation in algorithms, data structures, and software engineering principles. This program equipped me with the skills to analyze complex problems and implement effective solutions. My coursework included database management systems, distributed systems, and machine learning, providing me with a strong understanding of the technologies essential for a career in data engineering. Relevant Courses: Database Management Systems, Distributed Systems, Algorithms and Data Structures, Stored Procedures, Machine Learning.
- CVR College of Engineering, Hyderabad
  2007 - 2011
  Bachelor's degree
  As a dedicated and detail-oriented professional with a strong foundation in Electronics and Communication Engineering, I have acquired a robust skill set that extends beyond my academic curriculum. My coursework not only provided me with a solid understanding of core engineering principles but also equipped me with the analytical and problem-solving skills essential for the dynamic field of data engineering. I organized events in the Electronics Club and held a role in the Student Government Association, contributing to an enriched campus experience. As a sports enthusiast engaged in cricket and volleyball, and a participant in technical symposiums, I showcased a commitment to holistic development, embracing teamwork, innovation, and community building. Courses: Databases, Data Structures, Digital Signal Processing, Communication Systems, Microprocessor and Microcontroller, VLSI Design.
Experience
- American Express
  Jan 2014 - Nov 2017
  Data Warehouse Developer
  Orchestrated the development of a robust Data Mart that served as a dependable data source for downstream reporting, and implemented a User Access Tool for ad-hoc reports and query execution within the proposed Cube. Developed comprehensive models in Erwin to establish efficient, structured data representations, enhancing Business Intelligence (BI) cubes and analytics capabilities. Conducted thorough data analysis for source and target systems, applying Data Warehousing concepts including Staging Tables, Dimensions, Facts, and Star/Snowflake Schemas. Enhanced the deployment and performance of SSIS Packages through meticulous job configurations, resulting in optimized execution. Deployed SSIS Packages to production, using various configuration options to export package properties and achieve environment independence. Reverse engineered Oracle Warehouse Builder (OWB) ETL processes and developed replacement Oracle Data Integrator (ODI) processes while maintaining consistent functionality. Engineered robust stored procedures and triggers to enforce data consistency and integrity during data entry operations. Designed and executed ETL code to transform and load source data from diverse formats into a SQL database, employing various transformation techniques. Built dashboards in Power BI and prepared user stories to convey actionable insights. Actively participated in Agile Scrum methodology, led daily stand-up meetings, and tracked project progress using Trello.
- Mayo Clinic
  Dec 2017 - Jun 2019
  Big Data Hadoop and Data Modeler
  Spearheaded the development of dimensional models and schemas using Erwin, structuring and organizing claims data for analytics in the Hadoop environment. Developed Spark applications using Scala and Python to handle data from various RDBMS and streaming sources. Led the transition from Oracle Warehouse Builder (OWB) ETL processes to Oracle Data Integrator (ODI) for enhanced efficiency. Implemented Spark Streaming applications to process real-time data from Kafka and store it in HDFS, HBase, and Cassandra. Designed partitioning and bucketing strategies to optimize data processing and enhance performance in Hive. Developed a PySpark data ingestion framework for data cleansing, aggregation, and de-duplication. Explored Spark for improving the performance and optimization of existing algorithms in Hadoop. Utilized ZooKeeper for managing configuration information and distributed synchronization. Leveraged Spark features such as in-memory processing and map-side joins for low-latency data preprocessing. Proficient in data profiling, mapping, cleaning, integration, metadata management, and Master Data Management (MDM). Managed data from various sources, maintained HDFS, and loaded structured and unstructured data. Developed comprehensive data pipelines using technologies such as Flume, Sqoop, Pig, Kafka, Oozie, and MapReduce. Utilized SSIS for constructing automated multi-dimensional cubes and Sqoop for data channeling. Automated data processing with Oozie for loading data into HDFS. Imported data from MongoDB using Sqoop, customized BI tools for query analytics, and estimated hardware requirements for Hadoop. Utilized advanced Machine Learning and PL/SQL techniques for data manipulation, including bulk select, bulk insert, arrays, and dynamic SQL. Employed advanced techniques such as combiners, partitioning, and distributed cache to optimize MapReduce job performance. Maintained source code in Git and GitHub repositories for version control.
- Nike
  Jul 2019 - Sept 2021
  Data Engineer
  Expertise in designing and developing ETL pipelines for seamless data movement across diverse data sources and warehouses. Designed normalized OLTP and dimensional data models using Erwin for Azure SQL Databases, optimizing for both transactional systems and analytics. Applied Kimball Dimensional Data Modeling methodologies using Erwin to design data warehouses tailored to business objectives, facilitating reporting and analytics. Proficient in managing databases across platforms such as MS SQL Server, MySQL, PostgreSQL, Oracle PL/SQL, and Teradata. Specialized in Azure Data Engineering roles, focusing on data standards, integrity, and master data management. Skilled in building Databricks notebooks for data extraction, cleansing, and loading into Azure SQL Database. Hands-on experience with Microsoft Azure services such as HDInsight Clusters, Blob Storage, and Azure Data Factory. Executed ETL tasks using Azure Databricks and migrated on-premises Oracle ETL processes to Azure Synapse Analytics. Designed and implemented SSIS packages for data validation and ETL processes. Proficient in Python scripting within Azure Databricks for data validations and quality assurance. Implemented CI/CD practices using Azure DevOps, Jenkins, and GitHub Actions. Developed enterprise-level solutions using batch processing and streaming frameworks such as Spark Streaming and Apache Kafka. Designed Snowflake stages and managed transient, temporary, and persistent Snowflake tables for efficient data processing. Extensive experience in creating pipelines in Azure Data Factory v2 and processing schema-oriented and non-schema-oriented data using Scala and Spark. Proficient in scheduling jobs using Airflow scripts and optimizing data processing workflows. Strong background in designing Power BI dashboards and supporting various reporting requirements. Collaborated with team members to resolve technical issues, manage resources, and mitigate project risks using Agile methodology.
- New York State Government
  Oct 2021 - now
  Azure Data Engineer
  Led end-to-end operations of ETL data pipelines on Azure Databricks and Apache Spark for large-scale transformations and advanced analytics. Implemented Kimball Dimensional Data Modeling principles to optimize data structures, ensuring efficient storage and retrieval of information. Built Azure cloud pipelines using technologies such as Delta Lake, Blob Storage, Data Factory, Cosmos DB, and Azure Key Vault. Spearheaded the development and implementation of MDM strategies, aligning them with organizational goals to improve efficiency and data management effectiveness. Demonstrated advanced proficiency in data processing across diverse database platforms within the Azure ecosystem. Extensive expertise in crafting robust data models for both OLTP and OLAP database systems. Demonstrated strong proficiency in data modeling, using ER diagrams, dimensional data modeling, and Star Schema and Snowflake modeling with tools such as Erwin and Embarcadero ER/Studio. Developed Azure Infrastructure as Code templates with Terraform for streamlined deployment. Implemented data cataloguing and metadata management using Azure Purview. Integrated Azure Logic Apps, Kubernetes, and Azure Data Factory Analytics for workflow automation. Crafted SQL queries and optimized database performance. Architected real-time data pipelines using Kafka, Spark Streaming, and Hive. Developed efficient Spark scripts for accelerated data processing. Leveraged Azure Data Lake Storage Gen2 for scalable data movement and processing. Ensured robust data governance and compliance using Azure Blob Storage and Azure AD authentication. Orchestrated Dynamics 365 implementation projects and Azure cloud migrations. Designed and developed SSIS packages for ETL processes. Developed complex SQL views and procedures for query performance improvement. Utilized JIRA and Agile methodologies for project management and delivery.
Licenses & Certifications
- Microsoft Certified: Azure Data Engineer Associate
  Microsoft Azure