
Gabriel Ducrocq
Data Scientist Research Intern

About me
I am a researcher in AI, statistics, and machine learning applied to cryo-EM reconstruction.
Education

University Paris-Est Marne la Vallée
2014 - 2015
Master's Degree, Probability and Statistics (applied and theoretical)
Master 2 (fifth year of university studies in mathematics)
Followed courses:
- Big Data
- Time Series
- Simulations (Markov Chain Monte Carlo methods, Optimization, Copulas)
- Stochastic Processes
- Non-parametric Statistics
- Model Selection
- Stochastic Calculus
The following courses involved an IT project using R: Stochastic Processes, Simulations, Model Selection, Non-parametric Statistics.

NA
2013 - 2014
Hypnosis/Hypnotherapy
One-year break practicing and studying hypnotherapy.
Co-creation of Association France Hypnose.
Gave lectures on impromptu hypnosis at the association.
Association's website: http://afh-hypnose.com/

University Paris-Est Marne la Vallée
2012 - 2013
Master's Degree, Pure and Applied Mathematics
Master 1 (fourth year of university studies in mathematics)
Followed courses:
- Probability Theory
- Parametric Statistics
- Stochastic Processes
- Numerical Analysis applied to Partial Differential Equations
- Functional Analysis
- Algebra (Galois Theory)
- Distributions and Partial Differential Equations
- Final Paper
Cum laude distinction.

University of Lille1
2009 - 2012
Pure and Applied Mathematics
Licence (three years studying mathematics at university)
Followed courses:
- Parametric Statistics
- Probability
- Integration Theory
- Numerical Analysis
- Algebra
- Topology
- Differential Calculus
- Graph Theory
- Complex Analysis
Several courses involved an IT project using OCaml, Maple, and Scilab.

Ensae ParisTech
2015 - 2016
Specialized Master (sixth year of study), Data Science/Big Data
Followed courses:
- Machine Learning and Data Mining
- Databases and Web
- Computational Statistics (Markov Chain Monte Carlo)
- Econometrics of Marketing
- Statistical Analysis of Network Data
- Tools for the Analysis of Massive Databases
- Hadoop
- Bootstrapping and Resampling
- Bayesian Statistics
Experience

Laboratoire d'Analyse et de Mathématiques Appliquées
May 2015 - Oct 2015
Data Scientist Research Intern
As an intern, I developed a two-step method for the supervised classification of Stochastic Differential Equations (SDEs) using the Bayes classifier:
- First, we estimate the parameters of the SDEs by maximum likelihood.
- Second, using these estimates, we build an approximation of the Bayes decision function and classify based on it.
The paper is available on my GitHub: https://github.com/Gabriel-Ducrocq/Final_Paper/blob/master/Final_paper.pdf
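To make the two-step idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method from the paper: the SDE family (an Ornstein-Uhlenbeck process dX = -theta X dt + sigma dW with known sigma), the Euler discretization, the equal class priors, and all function names and parameter values are hypothetical choices made for this example.

```python
import math
import random


def simulate_path(theta, n_steps, dt, rng, x0=1.0, sigma=1.0):
    """Euler scheme for the assumed model dX = -theta * X dt + sigma dW."""
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        path.append(x - theta * x * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return path


def mle_theta(paths, dt):
    """Step 1: pooled maximum likelihood estimate of theta under the Euler scheme."""
    num = sum(x * (y - x) for p in paths for x, y in zip(p, p[1:]))
    den = sum(x * x for p in paths for x in p[:-1])
    return -num / (dt * den)


def log_likelihood(path, theta, dt, sigma=1.0):
    """Gaussian transition log-likelihood, up to a constant shared by all classes."""
    sq = sum((y - x + theta * x * dt) ** 2 for x, y in zip(path, path[1:]))
    return -sq / (2 * sigma ** 2 * dt)


def classify(path, theta_hats, priors, dt):
    """Step 2: plug-in Bayes rule, i.e. pick the class with the largest estimated posterior."""
    scores = [math.log(pr) + log_likelihood(path, th, dt)
              for th, pr in zip(theta_hats, priors)]
    return scores.index(max(scores))


def demo(seed=0, dt=0.01, n_steps=2000, n_train=40, n_test=50):
    rng = random.Random(seed)
    true_thetas = [0.5, 3.0]  # hypothetical class parameters
    # Estimate one theta per class from labelled training paths.
    theta_hats = [
        mle_theta([simulate_path(th, n_steps, dt, rng) for _ in range(n_train)], dt)
        for th in true_thetas
    ]
    # Classify fresh paths and measure accuracy.
    correct = 0
    for label, th in enumerate(true_thetas):
        for _ in range(n_test):
            new_path = simulate_path(th, n_steps, dt, rng)
            correct += classify(new_path, theta_hats, [0.5, 0.5], dt) == label
    return theta_hats, correct / (2 * n_test)


if __name__ == "__main__":
    estimates, accuracy = demo()
    print("estimated thetas:", estimates)
    print("test accuracy:", accuracy)
```

The true Bayes classifier would use the unknown class parameters; the sketch replaces them with their estimates, which is what "plug-in" refers to here.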

Cheerz.com
Jun 2016 - Dec 2016
Data Scientist Intern
During my six-month internship at Cheerz, I worked on several projects.
I looked for opportunities to increase conversion on the website and the app, working with various sources of data:
- The company's own databases
- The Google Analytics API (tracking data)
I also built an algorithm that gathers the accounts of potential "influencers" on Instagram (people potentially interested in Cheerz products, with enough followers), using keyword hashtags and the Instagram API.
I was in charge of running analyses on customer data and building dashboards to support the marketing department.
Finally, I implemented an A/B testing tool using a Bayesian framework instead of the usual frequentist approach. It was designed to avoid the adverse consequences of peeking and early stopping.
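For readers unfamiliar with Bayesian A/B testing, the following is a minimal sketch of the general idea, not the tool built at Cheerz. It assumes binary conversions, independent uniform Beta(1, 1) priors on each variant's conversion rate, and made-up counts; the function name is hypothetical. It does not reproduce any stopping rule, since the text does not describe one.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A | data) under Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # With a Beta(1, 1) prior, the posterior of a conversion rate is
        # Beta(1 + conversions, 1 + non-conversions).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws


if __name__ == "__main__":
    # Hypothetical counts: 120/1000 conversions for A, 150/1000 for B.
    print(prob_b_beats_a(120, 1000, 150, 1000))
```

The output is a posterior probability that variant B converts better than variant A, which can be read directly rather than through a p-value.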

La Javaness
Mar 2017 - Sept 2017
Research and Development Data Scientist
As an R&D Data Scientist, I was in charge of developing machine learning models that address clients' business needs:
- Natural Language Processing and development of an API enabling automated email customer service.
- Natural Language Processing with Deep Learning methods for extracting postal addresses from emails.
- Maintenance of a Spark ML model designed to target the right customer at the right time for a phone call.
- Development of a pricing model for discount offers during real-time negotiation.
Technologies:
- Python (pandas, scikit-learn, nltk)
- TensorFlow
- Spark
- JavaScript

Yubo
Jul 2018 - Sept 2018
Data Scientist
Natural language processing: topic emergence in live streams on the application.
Detection of spammer profiles on the application, using Dataflow, Google Bigtable, and MongoDB.

ENSAE Paris
Oct 2018 - May 2022
PhD Student
PhD in Bayesian/computational statistics with an application to the study of the Cosmic Microwave Background (CMB).
A cosmological model lets us build a statistical model which, given the cosmological parameters (dark matter quantity, dark energy quantity, Hubble constant, etc.), generates the CMB.
Taking a Bayesian stance and setting a prior on these cosmological parameters, the aim of my thesis is to sample from the posterior distribution given the observed CMB signal.
This is a difficult problem, since the CMB signal is roughly 10^6-dimensional and most algorithms require inverting a dense 10^6 x 10^6 matrix.
I chose to improve upon the Gibbs sampler used in the field so far. I improved its performance by a factor of 10 to 100 depending on the components, making this asymptotically exact method practical for practitioners. I published a paper in Physical Review D: https://doi.org/10.1103/PhysRevD.105.103501
I also developed the Cube method, which compresses the output of an MCMC algorithm using a geometrical survey sampling technique. I published a paper in Entropy: https://doi.org/10.3390/e23081017
My PhD developed my ability to work at the intersection of cutting-edge statistical concepts and efficient code writing. I ran my computations in a multi-CPU/multi-GPU environment and implemented all my ideas in Python and Cython, using Numba for efficiency.
Since my project was multidisciplinary, I am now comfortable discussing research ideas and statistics with people from very different scientific backgrounds.
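As a reminder of what a Gibbs sampler does, here is a toy sketch in Python. It is not the CMB sampler from the thesis: the model (Gaussian data with unknown mean and precision), the weak conjugate priors, and the function name are assumptions chosen only to show the alternating conditional draws.

```python
import random
import statistics


def gibbs_normal(data, n_iter=3000, burn_in=500, seed=0):
    """Toy Gibbs sampler for y_i ~ N(mu, 1/tau) with unknown mu and tau."""
    rng = random.Random(seed)
    n, total = len(data), sum(data)
    # Hypothetical weak priors: mu ~ N(0, 1/prec0), tau ~ Gamma(a0, rate=b0).
    prec0, a0, b0 = 1e-4, 0.01, 0.01
    mu, tau = 0.0, 1.0
    samples = []
    for it in range(n_iter):
        # Draw mu from its Gaussian conditional given tau and the data.
        post_prec = prec0 + n * tau
        mu = rng.gauss(tau * total / post_prec, post_prec ** -0.5)
        # Draw tau from its Gamma conditional given mu and the data.
        # random.gammavariate takes a scale parameter, so the rate is inverted.
        rate = b0 + 0.5 * sum((y - mu) ** 2 for y in data)
        tau = rng.gammavariate(a0 + 0.5 * n, 1.0 / rate)
        if it >= burn_in:
            samples.append((mu, tau))
    return samples


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(2.0, 0.5) for _ in range(200)]  # synthetic data
    draws = gibbs_normal(data)
    print("posterior mean of mu:", statistics.mean(m for m, _ in draws))
    print("posterior mean of tau:", statistics.mean(t for _, t in draws))
```

Each iteration updates one block of parameters conditionally on the others, which is the same alternating structure a high-dimensional sampler relies on, at a far smaller scale.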

Linköping University
May 2022 - now
Postdoctoral Researcher
I am applying deep learning to biology. More precisely, we tackle the problem of conformational heterogeneity of proteins.
We collect very noisy images of copies of the same protein in different shapes (conformations), and we want to recover the distribution of these conformations. I have done two things:
1/ I modified and used AlphaFold to sample more conformations and to take custom input.
2/ I used generative modelling (a variational auto-encoder structure) to learn the distribution of the deformations of an AlphaFold output needed to fit it to the different images.
I used Python, PyTorch, and a multi-GPU environment.
Since I am working with biologists, I am comfortable discussing research ideas and communicating with people from very different scientific backgrounds. See our project page: https://gabriel-ducrocq.github.io/cryosphere.github.io/
Licenses & Certifications

TOEFL
ETS, Jan 2018

Genes and the Human Condition (From Behavior to Biotechnology)
Coursera Course Certificates, Feb 2016

Python for Genomic Data Science
Coursera Course Certificates, Feb 2016

Introduction to Genomic Technologies
Coursera Course Certificates, Mar 2016
Languages
- English
- French