Mikhail Pykhtin

Big Data Analyst-Engineer

79 followers

Serbia

About me
Engineering Manager / Team Lead at X5
Education
- ITMO University
  2016 - 2020
  Bachelor's degree, Faculty of Infocommunication Technologies, Intelligent Systems in Humanities (Data Science), 4.4
  Graduation thesis: Collecting and processing data to build a predictive model of subscriber churn for a mobile operator.
Experience
- VEON
  Aug 2019 - Oct 2020
  Big Data Analyst-Engineer
  Stack: Hadoop, HiveQL, SQL, Python, NiFi.
  Responsibilities:
  - organizing and facilitating meetings with business clients;
  - collecting and writing business requirements;
  - writing technical specifications for the development of data marts and ETL flows;
  - analyzing, researching and creating predictors (features) for machine learning models;
  - building data marts (HiveQL) and launching ETL flows;
  - testing ETL flows and validating data quality (data completeness, statistical analysis) on Hadoop;
  - executing ad-hoc requests using Python;
  - geo-analytics of subscriber data: analyzing the locations with the highest cellular subscriber activity.
  Projects:
  1. B2B team: internal projects for the sales and marketing department, delivering results to the customer (key and large business segments).
     a) Improving the B2B customer churn model.
  2. Center of Excellence team: external projects aimed at developing Big Data in the CIS countries.
     a) Creation and implementation of a predictive churn model at Beeline Uzbekistan;
     b) Creation and implementation of a credit scoring model for clients in Uzbekistan;
     c) Geoanalytics: building a geolayer and calculating base station coverage to identify the best places to install cellular base stations, build metro stations, place advertising, etc.
- Sberbank
  Oct 2020 - May 2024
  - Engineering Manager
    Jun 2021 - May 2024
    Responsibilities:
    - management of a team of 6 Data Engineers (5 projects with 5 different customers);
    - promotion and dismissal of employees, management of positions (staff, outsourcing) in the team;
    - annual, quarterly and sprint planning of team tasks and resources;
    - distribution, delegation and prioritization of tasks;
    - control of task deadlines;
    - project risk management;
    - tracking development metrics (Lead Time, Time to Market, Velocity, development time, time lost to distractions) and developing measures to improve them;
    - preparation of presentations on tasks, metrics and goals;
    - coordination of contracts with counterparties;
    - selection of employees and decision-making on candidates;
    - conducting quarterly performance reviews of team members;
    - conducting one-on-one sessions and feedback meetings with the team;
    - motivation of team members.
    Including, as Team Lead Data Engineer:
    - development of data marts;
    - implementation of integrations through file exchange;
    - optimization of queries and calculations;
    - code refactoring;
    - design of solution architecture and integrations;
    - team training;
    - conducting code reviews;
    - conducting technical interviews.
    Achievements:
    1. Built a team practically from scratch (growing it from 1 to 6 people).
    2. Took the team from outsider (10th place) to 1st place in the department on development metrics, deadline reliability and quality of delivered changes.
    3. Halved Lead Time by changing the team's established processes.
    4. Increased team stability (2 years without layoffs) by solving problems of low motivation and mis-hiring.
    5. Solved the problem of low-quality releases by replacing the tech stack, introducing new testing processes and developing a new training system within the team.
    6. Introduced the Code Review practice from scratch.
    7. Introduced development standards (naming, code style) within the team.
  - Lead Big Data Engineer
    Oct 2020 - May 2021
    Stack: Spark (PySpark), Python, Hive, SQL, Hadoop, Airflow, Jenkins, Git.
    Responsibilities:
    - organizing and facilitating meetings with business clients;
    - collection of business requirements;
    - validation, comparison and selection of correct data sources;
    - development of pipelines (ETL flows) in PySpark for building data marts;
    - implementation of integrations through file exchange;
    - optimization of queries and calculations;
    - code refactoring;
    - setting up ETL flow configurations;
    - testing ETL flows and validating data quality (data completeness, statistical analysis) on Hadoop;
    - coordination and validation of data with the business customer;
    - execution of ad-hoc requests;
    - writing documentation for releasing changes;
    - building the distribution package with Jenkins (CI/CD);
    - support for released changes in the production environment (3rd line of support).
    Achievements:
    1. Reduced Lead Time by a factor of 2.5 by adopting a new development framework.
    2. Cut the calculation time of the SberRating data mart by a factor of 3 through query optimization.
- X5 Digital
  May 2024 - now
  Engineering Manager
  Management of a team of 8 Data Engineers.
  Stack: Apache Airflow, PySpark, S3, Postgres, Oracle, MQ, API, Kafka
Licenses & Certifications
- Professional Certificate IBM Data Analyst
  Coursera
  Nov 2020
- Effective Motivation Skills
  Samolov Group
  Apr 2022
- Databases and SQL for Data Science
  Coursera
  Nov 2020
- Apache Spark for Data Engineering
  New Professions Lab
  Apr 2021
- Data Visualization & Dashboard Essentials
  Coursera
  Nov 2020
- Effective Delegation Skills
  Samolov Group
  Sep 2021