Ashutosh Rastogi

Location: New York City Metropolitan Area
Followers: 4000
  • Timeline

  • About me

    Actively seeking contract opportunities as a Data Engineer or Data Analyst.

  • Education

    • Gurukula Kangri Vishwavidyalaya

      Bachelor of Technology (B.Tech), Computer Science and Engineering, 9.17/10
    • University at Buffalo

      Master's degree, Computer Science and Engineering, 3.6/4.0

      Pursuing a Computer Science and Engineering degree at University at Buffalo, specializing in data processing and software development for data scientist and SDE roles.

    • St. Anthony's High School, India

      10+2 (CBSE), PCMC, A
  • Experience

    • Integra Micro Software Service, India

      Nov 2015 - Jul 2019
      ETL Developer

      ● Imported data in various formats (JSON, Sequential, Text, CSV, Avro, and Parquet) into the HDFS cluster, with compression for optimization.
      ● Ingested data from RDBMS sources such as Oracle, SQL Server, and Teradata into HDFS using Sqoop.
      ● Loaded all datasets from source CSV files into Hive and Cassandra using Spark.
      ● Created an environment to access the loaded data via Spark SQL through JDBC/ODBC (via Spark Thrift Server).
      ● Developed real-time data ingestion and analysis using Kafka and Spark Streaming.
      ● Configured Hive and wrote Hive UDFs and UDAFs; also created static and dynamic partitions with bucketing as required.
      ● Wrote Scala programs using Spark on YARN for analyzing data.
      ● Managed and scheduled jobs on a Hadoop cluster using Oozie.
      ● Created Hive external tables, loaded data into them, and queried the data using HQL.

    • Lincoln Financial Group

      May 2020 - Aug 2021
      Data Engineer

      ● Participated in requirement grooming meetings to understand functional requirements from a business perspective and to provide estimates for converting them into software solutions: designing, developing, and delivering code to IT/UAT/PROD, and validating and managing data pipelines from multiple applications, using a fast-paced Agile methodology with sprints and the JIRA management tool.
      ● Checked data in DynamoDB tables and verified that EC2 instances were up and running across the DEV, QA, CERT, and PROD environments in AWS.
      ● Analyzed existing data flows and created high-level and low-level technical design documents for business stakeholders, confirming that the technical design aligned with business requirements.
      ● Created and deployed Spark jobs in different environments, loading data into the NoSQL database Cassandra as well as Hive and HDFS. Secured the data by implementing encryption.

    • PG&E

      Sept 2021 - Aug 2022
      Data Engineer

      ● Extensive experience working with the AWS cloud platform (EC2, S3, EMR, Redshift, Lambda, and Glue).
      ● Working knowledge of Spark RDD, DataFrame API, Dataset API, Data Source API, Spark SQL, and Spark Streaming.
      ● Developed Spark applications using Python and implemented an Apache Spark data processing project to handle data from various RDBMS and streaming sources.
      ● Worked with Spark to improve the performance and optimization of existing algorithms in Hadoop, using SparkContext, Spark SQL, Spark MLlib, DataFrame, pair RDD, and Spark on YARN.
      ● Used Spark Streaming APIs to perform transformations and actions on the fly to build a common learner data model, which gets data from Kafka in real time and persists it to Cassandra.
      ● Developed a Kafka consumer in Python for consuming data from Kafka topics.
      ● Consumed Extensible Markup Language (XML) messages using Kafka and processed the XML with Spark Streaming to capture user interface (UI) updates.

    • Nationwide Insurance

      Oct 2022 - Dec 2023
      Senior Data Engineer

      ● Responsible for building scalable distributed data solutions using Hadoop.
      ● Involved in the Agile development process (Scrum and sprint planning).
      ● Handled Hadoop cluster installations in a Windows environment.
      ● Migrated the on-premises environment to GCP (Google Cloud Platform).
      ● Experience building and deploying cloud infrastructure using Terraform.
      ● Migrated data warehouses to Snowflake.
      ● Defined virtual warehouse sizing in Snowflake for different types of workloads.
      ● Demonstrated knowledge of AWS, Azure, Google Cloud Platform, and other cloud providers.
      ● Involved in porting existing on-premises Hive code to GCP BigQuery.
      ● Ability to design, develop, and implement Terraform scripts for infrastructure automation.
      ● Proven understanding of the principles of Infrastructure as Code (IaC).

    • Thermo Fisher Scientific

      Jan 2023 - Jun 2024
      Senior Data Engineer

      ● Built data pipelines (ELT/ETL scripts) that extract data from different sources (DB2, AWS S3 files), transform it, and load it into the data warehouse (AWS Redshift).
      ● Added a REST API layer to ML models built using Python and Flask, and deployed the models to an AWS Elastic Beanstalk environment using Docker containers.
      ● Ability to debug and troubleshoot Terraform deployments.
      ● Developed and added a few analytical dashboards using Looker.
      ● Built aggregate and denormalized tables, populated via ETL, to improve Looker dashboard performance and to help data scientists and analysts speed up ML model training and analysis.
      ● Created new dashboards, reports, scheduled searches, and alerts using Splunk.
      ● Integrated PagerDuty with Splunk to generate incidents from Splunk.

    • Charter Communications

      Jul 2024 - Present
      Senior Data Engineer

      • Spearheaded the migration of over 50 TB of big data from on-premises SQL Server databases to Amazon S3, ensuring 99.9% uptime during the migration and reducing data retrieval time by 20% through optimized storage configurations.
      • Integrated migration policies for moving Oracle databases to Hive, implementing ETL processes using XML and shell scripts that reduced data migration time by 40% and facilitated the processing of over 100 million records daily in a Hadoop-based environment.

  • Licenses & Certifications

  • Honors & Awards

    • Mathematics Topper

      Central Board of Secondary Education, Jun 2011
      I scored 100/100 in Class 12 Mathematics in my higher secondary education, for which C.B.S.E., New Delhi awarded me a certificate of merit, placing me among the 0.1 percent of students across India who scored 100/100.