Mukhamet Nurpeiissov

Data Scientist

Followers: 1000
Location: Abu Dhabi, Abu Dhabi Emirate, United Arab Emirates

  • About me

    Data Engineer @ M42 Health | Master's in Robotics

  • Education

    • Nazarbayev University

      2012 - 2017
      Bachelor's degree Mechatronics, Robotics, and Automation Engineering
    • Nazarbayev University

      2017 - 2019
      Master's degree Mechatronics, Robotics, and Automation Engineering
  • Experience

    • Institute of Smart Systems and Artificial Intelligence - Nazarbayev University

      Oct 2019 - May 2021
      Data Scientist

      ◦ WiFi-based Indoor Localization: Created and published an open-source, fine-grained WiFi dataset and implemented regression models that predict a user's position in buildings with sub-meter accuracy.
      ◦ WiFi and Inertial Sensor-based Indoor Localization: Collected a dataset of IMU and WiFi readings and developed LSTM/Transformer models in PyTorch for sensor fusion and regression. Achieved, and then improved on, state-of-the-art accuracy for WiFi localization in buildings.
      ◦ Kazakh Speech Corpus: Developed web-crawling scripts to collect news and articles in the Kazakh language. Collected the largest Kazakh-language dataset at the time.
      ◦ Epidemic Simulator (COVID-19): Contributed to the development of simulation software for a stochastic model with network transitions that aimed to predict the spread of COVID-19.

    • MixRank

      May 2021 - Jun 2022
      Data Engineer
    • CEX.IO

      Jun 2022 - Oct 2023
      Data Engineer

      • Developed and optimized ETL pipelines, increasing pipeline performance by 90%.
      • Integrated data from Google Analytics, HubSpot, and Intercom into the data warehouse.
      • Deployed Apache Superset to Kubernetes as the visualization/reporting layer.
      • Optimized deployment of machine learning models using FastAPI and CI/CD pipelines.

    • M42 Health

      Nov 2023 - now
      Data Engineer

      • Developed efficient, large-scale genomic data processing pipelines using Spark and Nextflow.
      • Optimized deployment to Kubernetes, reducing size by 40% (20 GB to 12 GB) and build time by roughly 90% (4 hours to 30 minutes).
      • Optimized and developed CI/CD pipelines for Apache Superset.
      • Developed a library for analytics on large-scale data using probabilistic data structures (Data Sketches) for building decision trees, association rule mining, and clustering.

  • Licenses & Certifications