Ashutosh Rastogi

New York City Metropolitan Area

4000 followers

Timeline
About me
Actively seeking Contract opportunities as Data Engineer | Data Analyst
Education
- Gurukula Kangri Vishwavidyalaya
  Bachelor of Technology (B.Tech), Computer Science and Engineering, 9.17/10
- University at Buffalo
  Master's degree, Computer Science and Engineering, 3.6/4.0
  Pursuing a Computer Science and Engineering degree at University at Buffalo, specializing in Data Processing and Software Development for Data Scientist and SDE roles.
- St. Anthony's High School - India
  10+2 (CBSE), PCMC, A
Experience
- Integra Micro Software Service - India
  Nov 2015 - Jul 2019
  ETL developer
  ● Imported data in various formats (JSON, Sequential, Text, CSV, AVRO and Parquet) into the HDFS cluster, with compression for optimization.
  ● Ingested data from RDBMS sources such as Oracle, SQL Server and Teradata into HDFS using Sqoop.
  ● Loaded all datasets from source CSV files into Hive and Cassandra using Spark.
  ● Created an environment to access loaded data via Spark SQL through JDBC/ODBC (via Spark Thrift Server).
  ● Developed real-time data ingestion and analysis using Kafka and Spark Streaming.
  ● Configured Hive and wrote Hive UDFs and UDAFs; also created static and dynamic partitions with bucketing as required.
  ● Wrote Scala programs using Spark on YARN for analysing data.
  ● Managed and scheduled jobs on a Hadoop cluster using Oozie.
  ● Created Hive external tables, loaded data into them and queried the data using HQL.
- Lincoln Financial Group
  May 2020 - Aug 2021
  Data engineer
  ● Participate in requirement grooming meetings, which involve understanding functional requirements from a business perspective and providing estimates to convert those requirements into software solutions (design, develop and deliver the code to IT/UAT/PROD, and validate and manage data pipelines from multiple applications, following a fast-paced Agile development methodology using sprints with the JIRA management tool).
  ● Responsible for checking data in DynamoDB tables and verifying that EC2 instances are up and running for the DEV, QA, CERT and PROD environments in AWS.
  ● Analyze existing data flows and create high-level and low-level technical design documents for business stakeholders, confirming that the technical design aligns with business requirements.
  ● Create and deploy Spark jobs in different environments, loading data to the NoSQL database Cassandra, Hive and HDFS.
  ● Secure the data by implementing encryption based.
- PG&E
  Sept 2021 - Aug 2022
  Data engineer
  ● Extensive experience working with the AWS cloud platform (EC2, S3, EMR, Redshift, Lambda and Glue).
  ● Working knowledge of Spark RDD, DataFrame API, Dataset API, Data Source API, Spark SQL and Spark Streaming.
  ● Developed Spark applications in Python and implemented an Apache Spark data processing project to handle data from various RDBMS and streaming sources.
  ● Worked with Spark to improve the performance and optimization of existing algorithms in Hadoop, using SparkContext, Spark SQL, Spark MLlib, DataFrame, pair RDD and Spark on YARN.
  ● Used Spark Streaming APIs to perform transformations and actions on the fly for building a common learner data model, which gets data from Kafka in real time and persists it to Cassandra.
  ● Developed a Kafka consumer in Python for consuming data from Kafka topics.
  ● Consumed Extensible Markup Language (XML) messages using Kafka and processed the XML files using Spark Streaming to capture User Interface (UI) updates.
- Nationwide Insurance
  Oct 2022 - Dec 2023
  Senior data engineer
  ● As a Data Engineer, I am responsible for building scalable distributed data solutions using Hadoop.
  ● Involved in the Agile development process (Scrum and Sprint planning).
  ● Handled Hadoop cluster installations in a Windows environment.
  ● Migrated the on-premises environment to GCP (Google Cloud Platform).
  ● Experience building and deploying cloud infrastructure using Terraform.
  ● Migrated data warehouses to the Snowflake data warehouse.
  ● Defined virtual warehouse sizing in Snowflake for different types of workloads.
  ● Demonstrated knowledge of AWS, Azure, Google Cloud Platform and other cloud providers.
  ● Involved in porting existing on-premises Hive code to GCP (Google Cloud Platform) BigQuery.
  ● Ability to design, develop and implement Terraform scripts for infrastructure automation.
  ● Proven understanding of the principles of Infrastructure as Code (IaC).
- Thermo Fisher Scientific
  Jan 2023 - Jun 2024
  Senior data engineer
  ● Built data pipelines (ELT/ETL scripts) that extract data from different sources (DB2, AWS S3 files), transform it and load it into the data warehouse (AWS Redshift).
  ● Added a REST API layer to ML models built using Python and Flask, and deployed the models in an AWS Elastic Beanstalk environment using Docker containers.
  ● Ability to debug and troubleshoot Terraform deployments.
  ● Developed and added a few analytical dashboards using Looker.
  ● Built aggregate and de-normalized tables, populated through ETL, to improve Looker dashboard performance and to help data scientists and analysts speed up ML model training and analysis.
  ● Created new dashboards, reports, scheduled searches and alerts using Splunk.
  ● Integrated PagerDuty with Splunk to generate incidents from Splunk.
- Charter Communications
  Jul 2024 - now
  Senior data engineer
  • Spearheaded the migration of over 50 TB of big data from on-premises SQL Server databases to Amazon S3, ensuring 99.9% uptime during the migration process and reducing data retrieval time by 20% through optimized storage configurations.
  • Integrated the migration policies of Oracle databases to Hive, implementing ETL processes using XML and shell scripts that reduced data migration time by 40%, facilitating the processing of over 100 million records daily in a Hadoop-based environment.
Licenses & Certifications
- Algorithm Toolbox
  Coursera
  Aug 2020
- Algorithms on Graphs
  Coursera
  Aug 2020
- Core Java Programming
  Internshala
  Jul 2019
- Data Structures
  Coursera
  Aug 2020
- Problem Solving Through Programming in C
  NPTEL
  Jan 2019
- Ethical Hacking
  Internshala
  Mar 2019
- Programming with Python
  Internshala
  Jan 2019
- Deep Learning A-Z: Hands-On Artificial Neural Networks
  SuperDataScience
  Apr 2020
- Machine Learning A-Z: Hands-On Python & R in Data Science
  SuperDataScience
  Mar 2020
- MBA in a Box: Business Lessons from a CEO
  365 Careers
  Jul 2020
Honors & Awards
- Awarded to Ashutosh Rastogi
  Mathematics Topper, Central Board of Secondary Education, Jun 2011. I scored 100 percent in Mathematics in my higher secondary (class 12) examination, for which C.B.S.E. New Delhi awarded me a certificate of merit for being among the 0.1 percent of students across India to score 100/100 in class 12 Mathematics.