Yi Chen (Scott) Wu

Data Scientist

1000 followers
Taipei, Taipei City, Taiwan

  • About me

    Data Engineer

  • Education

    • National Chengchi University

      2013 - 2017
      Bachelor of Arts - BA Public Finance
    • National Tsing Hua University

      2017 - 2019
      Master's degree - MS, Institute of Statistics

      - Program in Data Science
      - Thesis: Hierarchical Decomposition of Functional Diversity (Statistical Estimation and Software Development)
      - Thesis link: https://etd.lib.nctu.edu.tw/cgi-bin/gs32/hugsweb.cgi?o=dnthucdr&s=id=%22G021060245190%22.&searchmode=basic

  • Experience

    • AICS

      Feb 2020 - May 2020
      Data Scientist

      • Data cleansing and crawling for medical product development.
      • Defined and retrieved user feedback on the medical product and built a real-time dashboard for monitoring.
      • Normalized the English names of drugs from NHI using IDF, organizing 40,000 drugs into 2,000 categories.

    • Wisers Information Limited

      Sept 2020 - Mar 2022
      Data Scientist

      ◆◆◆ AI Projects & Research ◆◆◆
      • Developed a multi-label model for product reviews that outperformed the company’s labeling system, improving accuracy by 20% and reducing cost by 50%.
      • Developed a Simplified-to-Traditional Chinese conversion model that improved sentence-level accuracy by 10% compared with the company’s API and opencc.
      • Modified and re-trained the Chinese spellcheck language model proposed by ByteDance [arXiv:2005.07421], outperforming the paper's results (detection-level and correction-level F1 scores of 74.6 and 70.2 on the SIGHAN 2015 test set). The model can be used to correct typos (e.g. from OCR or ASR).
      • Designed an attention-based explainable model for classification/multi-label problems that returns both labels and the corresponding keywords in real time.

      ◆◆◆ Business Support ◆◆◆
      • Deployed an information extraction API for policy documents, providing long-term data delivery for an insurance group company.
      • Developed a generalized tabular extraction API for PDF documents. Users can access specific tables by entering keywords.
      • Created interactive websites of AI projects for business introductions and demonstrations.
      • Supported 3+ customer POC projects per month.

      ◆◆◆ Invention Patent ◆◆◆
      *Domain words extraction algorithm*
      The domain words extraction algorithm is a semi-supervised method: given several known domain words, it returns related words from documents. For example, given the words “fever” and “cough”, the algorithm returns “runny nose”, “sore throat”, and other related words from the given documents.
      • Redesigned and optimized the algorithm, improving the NDCG score by 50% and running 3 times faster.
      • Extended the algorithm so that it can be applied to multilingual documents/corpora.
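The seed-word idea described for the domain words extraction algorithm can be illustrated with a minimal sketch. This is not the patented method, whose details are not given here; it assumes a simple document-level co-occurrence score (pointwise mutual information), and the function name `expand_seed_words`, its parameters, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def expand_seed_words(documents, seeds, top_k=5):
    """Rank non-seed words by how strongly they co-occur with any seed word.

    documents: list of token lists (tokenization is assumed to be done already).
    seeds: known domain words supplied by the user.
    """
    seeds = set(seeds)
    doc_freq = Counter()  # documents containing each word
    co_freq = Counter()   # seed-containing documents that also contain each word
    n_seed_docs = 0

    for tokens in documents:
        vocab = set(tokens)
        doc_freq.update(vocab)
        if vocab & seeds:
            n_seed_docs += 1
            co_freq.update(vocab - seeds)

    n_docs = len(documents)
    p_seed = n_seed_docs / n_docs
    scores = {}
    for word, co in co_freq.items():
        # PMI between "word appears" and "at least one seed appears"
        p_joint = co / n_docs
        p_word = doc_freq[word] / n_docs
        scores[word] = math.log(p_joint / (p_word * p_seed))

    # Highest score first; ties broken alphabetically for determinism
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


# Toy corpus (made up for illustration only)
docs = [
    ["fever", "cough", "runny nose"],
    ["cough", "sore throat", "rest"],
    ["fever", "sore throat", "runny nose"],
    ["stock", "market", "rest"],
    ["market", "price", "stock"],
]
print(expand_seed_words(docs, ["fever", "cough"], top_k=2))
```

On this toy corpus the two top-ranked words are "runny nose" and "sore throat", mirroring the example in the description. A production system would likely need phrase detection, smoothing for rare words, and a ranking metric such as NDCG for evaluation.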

    • Binance

      Mar 2022 - Nov 2023
      Data Scientist

      • On-chain fund flow ETL and development of a blockchain address uncovering algorithm.
      • Development of an algorithm to reduce wallet exploit risk.
      • Development of a real-time regulatory announcement system.
      • Blockchain social media scraping.
      • KYC computer vision.
      • BI dashboards.

    • OpenNet Limited

      Mar 2024 - Present
      Data Engineer
  • Licenses & Certifications

    • 2021 Signboard (Beginner Level)

      Ministry of Education Artificial Intelligence Competition and Labeled Data Acquisition Project (AI CUP)
      Mar 2022
    • 2021 Signboard (Advanced Level)

      Ministry of Education Artificial Intelligence Competition and Labeled Data Acquisition Project (AI CUP)
      Mar 2022
  • Honors & Awards

    • T-Brain Deep Learning Competition (Computer Vision)

      AI CUP
      Sep 2021
      - Private Leaderboard: 6th/183 (Top 4%)
      - Description: Traditional Chinese Scene Text Recognition
      - Detection Model: Mask R-CNN (Backbone: ResNet152)
      - Recognition Model: Ensemble (ResNet50 + DenseNet201 + EfficientNet-B1)
      - Source Code: https://github.com/yichenwu05/Traditional-Chinese-Scene-Text-Recognition
    • T-Brain Deep Learning Competition (Computer Vision)

      AI CUP
      May 2021
      - Final Rank: 10th/341 (Top 4%)
      - Description: Traditional Chinese Scene Text Detection
      - Model: Mask R-CNN (ResNet152)
      - Source Code: https://github.com/yichenwu05/Traditional-Chinese-Scene-Text-Detection
    • AI CUP Competition (Natural Language Processing)

      AI CUP
      Dec 2020
      - Final Rank: 1st/148 (Top 1%)
      - Description: Label the category (Theoretical, Engineering, Empirical, Others) for abstracts of computer science related papers from arXiv. An abstract can have multiple categories.
      - Model: Fine-tuned the pre-trained BERT model (SciBERT).
      - Source Code: https://github.com/yichenwu05/AI-CUP-Competition
    • T-Brain Machine Learning Competition

      Trend Micro
      Sep 2018
      - Final Rank: 8th/495 (Top 2%)
      - Trained a LightGBM model with parameter tuning and cross-validation to predict policyholders' renewal premiums for car insurance.
      - Combined a classification model (predicting withdrawal probability) with a regression model to improve prediction accuracy.
      - Link: https://github.com/yichenwu05/Tbrain-competition