Sungmin Lee


135 followers
San Francisco Bay Area

  • About me

    Bioinformatics Scientist/Data Engineer. 🌱 → 🦠 → 🧬

  • Education

    • Purdue University

      2011 - 2014
      Bachelor's Degree, Biochemistry

      Activities and Societies: Phi Beta Kappa. BS in Biochemistry, 2014; graduated with highest distinction.

    • Rice University

      2014 - 2020
      Doctor of Philosophy (Ph.D.) in Systems, Synthetic, and Physical Biology, Molecular Phylogenetics/Virus Evolution
  • Experience

    • Purdue University

      May 2013 - May 2014

      Advisors: Drs. Uma Aryal, Mark Hall, and Daniel B. Szymanski
      Project: Development of a method for proteome-wide analysis of plant protein complexes.
      - Assisted in the development of a novel protocol for plant proteomics in Arabidopsis thaliana, using protein chromatography (FPLC and HPLC) and peptide mass analysis (MS).

      Advisors: Drs. Uma Aryal, Mark Hall, and Daniel B. Szymanski
      Project: Proteome-wide Analysis of Stable Plant Protein Complexes in the Cytosol by Correlation Profiling.
      - Optimized FPLC and HPLC conditions for isolating stable protein complexes from Arabidopsis thaliana cytosol.

      • Undergraduate research assistant in Cytosol proteomics project

        Aug 2012 - May 2014
      • Summer student in Howard Hughes Medical Institute summer program

        May 2013 - Aug 2013
    • Systems, Synthetic, and Physical Biology program

      Aug 2014 - Aug 2020
      Graduate research assistant

      Advisors: Drs. Jianpeng Ma and Qinghua Wang
      • Conducted comparative evolutionary analyses of human and avian influenza viruses using nucleotide sequences collected from the NCBI and GISAID databases.
      • Examined patterns of viral circulation and evolutionary dynamics for emerging and re-emerging infectious diseases using phylogenetic analyses, population genetics theory, and statistical analyses in Python.

    • Thermo

      Dec 2020 - Jun 2021
      Scientist III - Bioinformatics

      • Designed targeted genotyping-by-sequencing (tGBS) panel products for plant trait selection and animal breeding.
      • Delivered timely data analysis and product development reports for external customers and internal stakeholders.
      • Modified existing pipelines for new products and kept the production pipeline running smoothly for the design of highly multiplexed oligos in targeted sequencing panels.

    • Seegene Inc.

      Jul 2021 - Jul 2024

      • Designed and executed a bioinformatics data pipeline using Unix/Linux, Python, and external software in a hybrid environment (HPC, Azure) for processing genomic data from multiple external sources (e.g., NCBI, SRA).
      • Collaborated closely with multidisciplinary teams (R&D, manufacturing) to develop and optimize assays based on experimental data using machine learning techniques, contributing to robust diagnostic assays.
      • Conducted exploratory data analysis on genomic datasets (genomics, metagenomics) to uncover patterns and insights relevant to identifying diagnostic markers within diverse microbial environments.
      • Developed a relational database with comprehensive data modeling on the Azure platform to streamline data collection and statistical analysis, supporting multiple in-house projects.

      • Led a multidisciplinary team to create a web-based platform for bioinformatics pipelines using Docker, AWS Lambda, and AWS Batch.
      • Developed AWS Lambda-based API solutions with optimized query performance for end users, streamlining product development for software engineers and assay developers.
      • Led the development and lifecycle management of a nucleotide sequence database project using MySQL and MongoDB, supporting internal product development and quality assurance.
      • Executed the migration from the in-house database server to AWS RDS using AWS Database Migration Service, ensuring high database availability.

      • Staff Bioinformatics Data Engineer

        Feb 2023 - Jul 2024
      • Staff Bioinformatics Engineer

        Jul 2021 - Feb 2023
    • Profluent

      Jul 2024 - Oct 2024
      Bioinformatics Consultant

      • Developed and optimized bioinformatics pipelines for processing and analyzing large-scale genomic datasets.

    • Cepheid

      Nov 2024 - Present
      Senior Bioinformatics Scientist
  • Licenses & Certifications

    • Genomic Data Science Specialization

      Coursera
      Jan 2023
    • 2019 Rice Data Science Boot Camp - Introduction to Modern Regression and Cross Validation, Unsupervised and Supervised Learning, Cloud, Python, AWS, Hadoop and Spark

      Ken Kennedy Institute at Rice University
      Aug 2019
    • Bioinformatics

      University of Maryland Global Campus
      Apr 2020
    • Analyzing and Visualizing Data with Microsoft Power BI

      Microsoft
      Sep 2019
    • Querying Data with Transact-SQL

      Microsoft
      Jun 2019
    • Python for Everybody Specialization

      Coursera
      Oct 2020