Gaurav Kumar

Stochastic modelling of Corrosion induced Degradation in Concrete Structures

796 followers
Bengaluru, Karnataka, India

  • Timeline

  • About me

    Staff Engineer at Samsung Electronics || IIT Roorkee 2018

  • Education

    • Greenway Modern School

      2007 - 2014
      Class 12, 93%

      Activities and Societies: Maths group, Physics club

    • Indian Institute of Technology, Roorkee

      2014 - 2018
      Bachelor’s Degree, Civil Engineering, Grade: 8.2

      Activities and Societies: Debating Club, Bookshelf, Star-Gazing club, Models and Robotics Section.

  • Experience

    • IIT Roorkee

      May 2016 - Nov 2016
      Stochastic modelling of Corrosion induced Degradation in Concrete Structures
    • Wipro

      May 2017 - Jul 2017
      Intern

      Developed a product comparison web app using the Django framework. Implemented user registration, authentication, and recording of user responses (a 0-5 score) for each category across different products, summarizing the strengths of each product in an interactive graphical interface. The app was hosted on an AWS server with a live database.

    • Wipro Limited

      Jun 2018 - Dec 2019
      Project Engineer

      Developed a web app with .NET Core (C#), Angular, and GoJS. Responsible for a component that used GoJS to build electrical circuits from various pieces of equipment and performed calculations to assess their reliability.

    • Samsung Semiconductor

      Jan 2020 - now

      2023 NPU Core Project & ADAS Projects:

      Worked on the NPU Simulator for ADAS SoCs and the SDK Toolchain for the Exynos NPU Compiler for the 2023 NPU Core.

      NPU Simulator: Developed a tool to approximate the latencies of NNs when executed on the NPU with different optimization levels of the compiler. Travelled to the Samsung Korea office for this work.

      • Created models to predict the HW execution latency of different operations.
      • Implemented compiler scheduling and SRAM allocation for the various optimization levels present within the compiler, along with other optimizations present within the compiler.
      • Designed the NPU Simulator with an exploration mode that tries to find stratified chains of layers with increasing possibility of feature forwarding in SRAM.

      SDK Toolchain:

      • Converting NN models from TFLite or ONNX formats to an internal format suitable for the Exynos NPU Compiler.
      • Optimizing models to improve their latency. These include both HW-independent graph optimizations and HW-dependent optimizations such as layer conversions.
      • Quantizing models to an Int8 or Float16 scheme depending on latency and accuracy requirements.

      Owner for delivering 8 QAT Int8 NNs to the compiler for the Geekbench benchmark for the 2023 NPU Core.

      Worked on LLMs for the Galaxy AI release for Samsung flagship SoCs. Focused on delivering Llama-based models to the compiler while ensuring high accuracy.

      Handled issues in the SDK for ADAS projects for clients such as Harman, BMW & Motional.

      Working on refactoring the SDK Toolchain for further SoCs, including the 2024 NPU Core project.

      2022 NPU Core Project:

      Worked on the NN Compiler for the Exynos NPU. Focused on bringing up the NPU Core for this generation. Collaborated closely with the Firmware & Device Driver team for the NPU HW. Handled the updated RISC-style instruction type.

      Owned 1 KPI NN (MobileNetV3) for this generation and used it to enable multiple features in the compiler.

      Played a major role in implementing the Batch Mode feature in the compiler. Handled SRAM memory allocation and instruction scheduling for the POC network to maximize FeatureMap forwarding within SRAM. Implemented an optimization that executes batched convolutions in a single pass by reinterpreting them as a single operation (since this was not natively supported by the HW). Travelled to the Samsung Korea office for this work.

      Fixed the high power requirements of some NNs by reducing DRAM transfers and maximizing FeatureMap forwarding within SRAM across layers.

      Worked on minimizing the DRAM footprint of multiple NNs for mid-tier or volume-segment SoCs, since they have less DRAM available than flagship SoCs. Implemented multiple allocation schemes to reduce the DRAM allocation size of the NNs, improving the size by up to 20% in some cases. These experiments culminated in the publication of the research paper: https://ieeexplore.ieee.org/document/10277304

      2021 NPU Core Project:

      Worked on the NN Compiler for the Exynos NPU. Owned 2 Key Performance Features for this project.

      Enabled the Data Transfer Compression HW feature, a zero-loss compression algorithm implemented in the Data Transfer HW from DRAM to SRAM and vice versa. Enabling it improved performance by ~12% for all KPI NNs.

      Worked on maximizing the cache utilization of Feature Maps (FM) for neural networks to improve read and write latencies. Implemented an FM allocation method focusing on multicore NN execution, which improved cache IO by around 25%. Overall, this feature gave an average latency gain of ~10% when executing the KPI NNs.

      • Staff Engineer

        Mar 2023 - now
      • Associate Staff Engineer

        Mar 2021 - Mar 2023
      • Senior Engineer

        Jan 2020 - Mar 2021
  • Licenses & Certifications

    • Machine Learning

      Coursera Course Certificates
      Jun 2016
    • Design and Analysis of Algorithms

      NPTEL
      Mar 2016
  • Honors & Awards

    • JEE Advanced 2014

      May 2014
      Achieved AIR 3060