John H.

Geek Squad Double Agent

580 followers
United States

  • About me

    Staff Data Engineer at H-E-B

  • Education

    • Purdue University

      2004 - 2009
      Bachelor's Degree Computer Science
  • Experience

    • Best Buy

      Jan 2004 - Jan 2010
      Geek Squad Double Agent

      Drove the Geek Squad VW to customers' homes to perform in-home PC repairs and training sessions.

    • Stryker

      Jan 2010 - Dec 2014
      Associate Technical Lead, Business Intelligence

      * Provided technical leadership, design, and development for full stack business intelligence solutions that met internal or external needs.
      * Mentored other team members on technical concepts and provided guidance.

    • Blue Cross of Idaho

      Jan 2015 - May 2018
      Senior Business Intelligence Developer

      * Led the design, development, implementation, documentation, and support of full stack Business Intelligence solutions, including complete ETL pipelines, data models, and analysis artifacts such as SSRS reports and Analysis Services cubes.
      * Provided technical leadership and consulting for business users and IS professionals on the design, development, and utilization of BI tools, technologies, and processes.

    • Indeed

      May 2018 - Feb 2020
      Business Intelligence Developer

      * Designed, documented, and developed data models using Kimball methodologies.
      * Developed end-to-end ETL pipelines with Python, SQL, and Airflow from many varied sources (Amazon S3 storage, HDFS clusters, raw web log files, RESTful web services, MySQL databases, PostgreSQL databases, flat files).
      * Worked with Big Data using Hive, Spark, Docker, and Airflow.
      * Wrote a custom connector in Python that allowed our internal ETL framework to connect directly to our log data source system, providing access to petabytes of detailed data that was typically only available in aggregated form.

    • Workrise

      Feb 2020 - May 2021
      Staff Data Engineer

      * Built a custom data pipeline to consume non-bulk-export REST APIs using Python, Google Cloud Functions, and Airflow. The solution lets us scale horizontally and execute our extracts in parallel, decreasing runtime by 10 hours.
      * Built multiple data pipelines to consume many different SQL data sources using Fivetran and dbt.
      * Created a standard and related processes for securely hashing our data to protect our customers' PII.
      * Created a standard (contract) and related processes for our product engineers to push data to us in Avro files via Google Cloud Storage. This lets them create and push new objects as needed and lets us consume them dynamically using Snowflake external stages.
      * Created and implemented a code review process for my team and trained them on its use.
      * Built out a CI/CD pipeline using CircleCI for our GCP Functions and Airflow DAGs across multiple environments (dev/test/prod).
      * Led design and development of custom Airflow operators and related sensors for GCP Functions, dbt job execution via REST API, and Fivetran job execution via REST API.
      * Led a POC testing Prefect as our ELT orchestration tool. This included converting multiple Airflow DAGs to Prefect as well as converting custom Airflow operators/sensors to Prefect equivalents.
      * Led the production rollout of Prefect and the total conversion of our Airflow processes to Prefect. This work reduced our monthly costs by two-thirds while drastically improving our developer experience.

    • Amazon Web Services (AWS)

      May 2021 - Jun 2022
      Senior Business Intelligence Engineer, Foundational Security and Dedicated Clouds (InfoSec)

      * Led the design and development of the data pipeline, data model, and dashboard supporting an S-Team goal involving access reduction guidelines, using a combination of S3/Athena/Step Functions/Lambda. Also built a Tableau dashboard from the ground up to support reporting on this goal in numerous formats and metrics. Project MLP was delivered on a very short timeline, both for the amount of work and for my tenure in the position (started May 2021; automated pipelines and dashboards initially deployed in July 2021).
      * Was brought in to help a related team with their data engineering and dashboarding for the POC of a new company-wide program. Mentored junior engineers in general and on tooling at Amazon, data modeling, row-based security, AWS CDK apps, and more. Wrote the extract processes, built out the reporting data model used in the dashboard, and taught both to the engineers who would be working on the project long term. Designed and built out a row-level security implementation for use in Tableau and other analytics services.
      * To improve the delivery of critical vulnerability reporting, wrote a script to automate a previously manual pull of ticketing data from internal APIs. The process was written in Python as a reusable package and shared with other teams. It included extracts from the API, upload of data to S3 buckets, updating of Athena tables, etc.
      * Built a custom ETL framework using AWS Step Functions, Lambda, and S3 to automate some ad-hoc reporting processes and provide a quick and easy way to build basic ETL jobs with minimal code. It supports parameters/templating, custom date calculations, regex pattern matching and replacement, backfilling, and more.

    • Amazon

      Jun 2022 - Jun 2023
      Senior Data Engineer, Amazon Advertising - Machine Learning Optimization

      What we do: The MLO Analytics team’s charter is to support algorithm launch, drive product adoption, and develop and monitor system health metrics by centralizing analytics resources and infrastructure. We reduce duplicate data processes, facilitate cross-dataset analysis, and standardize performance/pacing/ranking evaluation measurement.

      What I do:
      * Build and maintain data pipelines and tools to enable the measurement of business and algorithm health, capable of handling hundreds of terabytes a day.
      * Work closely with Business Intelligence Engineers to dive deep into complex data issues in production and simulation.
      * Collaborate with Scientists to build robust monitoring and analysis for online experimentation.
      * Work with scientists, engineers, and product managers on high impact initiatives in Amazon’s Display Advertising.

      How I do it:
      * Python
      * EMR (Spark)
      * AWS CDK for CI/CD
      * Many AWS services, including AWS Glue, Amazon Athena, Amazon Redshift, Amazon S3, CloudWatch, Step Functions, and more

    • H-E-B

      Jun 2023 - now
      Staff Data Engineer
  • Licenses & Certifications

  • Honors & Awards

    • GQO Spotlight on Success Award

      Stryker Corporation, 2014
      Rewarded for outstanding contribution to the GQO portion of our business for my efforts on the global supplier data warehouse.

    • Global IT Game Changer Award

      Stryker Corporation, Dec 2011

    • Associate of the Month

      Stryker Corporation, Apr 2010