
Pooja B.
Java Developer

About me
AWS Certified Solutions Architect-Associate | Google Cloud Certified-Associate Cloud Engineer | Senior Data Engineer @ Capital One
Education

Alamuri Ratnamala Institute of Engineering and Technology
Bachelor's degree, Computer Science
Monroe College
Master's degree, Computer Science
Experience

EClinicalWorks
Java Developer · Oct 2013 - Dec 2014
• Developed use cases and class diagrams using Rational Rose/UML.
• Used ORM in the persistence layer and implemented DAOs to access data from Oracle and MySQL databases.
• Stored incoming SOAP messages in the JMS queue of WebSphere MQ (MQSeries).
• Developed data access beans and EJBs used to access data from the database.
• Used EJB to inject services and their dependencies.
• Wrote PL/SQL and SQL blocks for the application.
• Used core Java multithreading concepts to avoid concurrency issues.
• Used Log4j for logging, Ant for automated deployment, and JUnit for testing.
• Provided daily development status, weekly status reports, and weekly development summary and defect reports.
• Implemented the project according to the Software Development Life Cycle (SDLC).
• Implemented JDBC for mapping an object-oriented domain model to a traditional relational database.
• Created stored procedures to manipulate the database and apply business logic according to user specifications (see the sketch after this list).
• Developed generic classes covering frequently used functionality so that they are reusable.
• Implemented an exception management mechanism using exception handling application blocks.
• Designed and developed user interfaces using JSP, JavaScript, and HTML.
• Involved in database design and developed SQL queries and stored procedures on MySQL.
• Used CVS for maintaining the source code.
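
A minimal sketch of the stored-procedure call pattern described above. The original work used Java/JDBC; Python with mysql-connector-python is shown here for consistency with the later sketches, and the connection details, procedure name, and arguments are all hypothetical.

    import mysql.connector

    # Connect to the application database (credentials are placeholders).
    conn = mysql.connector.connect(
        host="localhost", user="app", password="secret", database="orders"
    )
    cur = conn.cursor()

    # callproc invokes the stored procedure that applies the business logic.
    cur.callproc("apply_business_rules", (42, "PENDING"))

    # Iterate over any result sets the procedure produced.
    for result in cur.stored_results():
        print(result.fetchall())

    conn.commit()
    cur.close()
    conn.close()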

Rolta India Limited
Hadoop and Spark Developer · Jan 2015 - Jul 2016
• Involved in requirement gathering in coordination with the BA.
• Worked closely with the BA and client to create technical documents such as high-level and low-level design specifications.
• Loaded and transformed large sets of structured, semi-structured, and unstructured data.
• Imported data from MySQL to HDFS on a regular basis using Sqoop.
• Developed RDDs for scheduling various Hadoop programs.
• Wrote Spark SQL queries for data analysis to meet the business requirements.
• Defined job flows.
• Provided cluster coordination services through Kafka and ZooKeeper.
• Serialized JSON data and stored it in tables using Spark SQL (see the sketch after this list).
• Wrote shell scripts to automate the process flow.
• Stored the extracted data in HDFS using Flume.
• Worked with multiple file formats including XML, JSON, CSV, and other compressed formats.
• Wrote Spark SQL queries in Scala.
• Communicated all issues and participated in weekly strategy meetings.
• Collaborated with the infrastructure, network, database, and application teams to ensure data quality and availability.
• Provided daily production support to monitor and troubleshoot Hadoop/Hive jobs.
• Supported and troubleshot Hive programs running on the cluster and fixed issues arising from duration testing.
• Prepared daily and weekly project status reports and shared them with the client.
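
A minimal PySpark sketch of the JSON-to-table flow described above. The original queries were written in Scala; the input path and table names here are hypothetical.

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("json-to-table")
             .enableHiveSupport()
             .getOrCreate())

    # Read semi-structured JSON and expose it to Spark SQL as a view.
    raw = spark.read.json("hdfs:///landing/events/")
    raw.createOrReplaceTempView("events")

    # Aggregate with Spark SQL and persist the result as a managed table.
    daily = spark.sql("""
        SELECT event_date, event_type, COUNT(*) AS cnt
        FROM events
        GROUP BY event_date, event_type
    """)
    daily.write.mode("overwrite").saveAsTable("analytics.daily_event_counts")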

Capital One
Hadoop Developer · Apr 2018 - Dec 2018
• Developed Spark scripts in Scala as per requirements.
• Analyzed large data sets to determine the optimal way to aggregate and report on them.
• Designed and implemented incremental imports into Hive tables.
• Developed Apache Pig and Hive scripts to process HDFS data.
• Involved in defining job flows and managing and reviewing log files.
• Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit.
• Supported MapReduce programs running on the cluster.
• Implemented solutions for ingesting data from various sources and processing data at rest using big data technologies such as Hadoop, MapReduce, HBase, Hive, Oozie, Flume, and Sqoop.
• Configured, deployed, and maintained multi-node dev and test Kafka clusters and implemented real-time data ingestion and handling using Kafka.
• Imported data from AWS S3 into Spark RDDs and performed transformations and actions on them (see the sketch after this list).
• Developed Spark scripts using Scala shell commands as per requirements.
• Imported bulk data into HBase using MapReduce programs.
• Performed analytics on time-series data in HBase using the HBase API.
• Involved in collecting, aggregating, and moving data from servers to HDFS using Apache Flume.
• Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
• Extracted data from Teradata into HDFS, databases, and dashboards using Spark Streaming.
• Responsible for continuous monitoring and management of the Elastic MapReduce cluster through the AWS console.
• Wrote multiple Java programs to pull data from HBase.
• Involved in file processing using Pig Latin.
• Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
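
A minimal PySpark sketch of loading S3 data into an RDD and applying transformations and actions, as described above. The bucket, prefix, and record layout are hypothetical, and s3a access assumes the Hadoop AWS credentials are already configured.

    from pyspark import SparkContext

    sc = SparkContext(appName="s3-rdd-sketch")

    # One RDD element per line of the S3 objects under this prefix.
    lines = sc.textFile("s3a://example-bucket/logs/2018/04/")

    errors = (lines
              .map(lambda line: line.split("\t"))            # transformation: parse fields
              .filter(lambda fields: fields[2] == "ERROR"))  # transformation: keep errors

    # Actions trigger the actual computation.
    print(errors.count())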

Walmart Global Tech
Sr. Big Data Engineer/Data Engineer · Jan 2019 - May 2020
• Implemented Apache Airflow for authoring, scheduling, and monitoring data pipelines.
• Designed several DAGs (directed acyclic graphs) for automating ETL pipelines (see the sketch after this list).
• Performed data migration to GCP.
• Responsible for data services and data movement infrastructure.
• Built ETL solutions and performed data modeling.
• Aggregated daily sales team updates into reports for executives and organized jobs running on Spark clusters.
• Designed and built infrastructure for the Google Cloud environment from scratch.
• Performed dimensional modeling (star schema, snowflake schema), transactional modeling, and SCD (slowly changing dimensions).
• Worked with Confluence and Jira.
• Designed and implemented a configurable data delivery pipeline, built with Python, for scheduled updates to customer-facing data stores.
• Implemented a big data pipeline with real-time processing using Python, PySpark, and the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Scala, Sqoop).
• Worked predominantly with Google Cloud Platform (GCP) services: Compute Engine hosting the .NET app on IIS (app server), Cloud SQL PostgreSQL (SSS DB and Lightbox DB) for databases, Internal Load Balancer to balance application server endpoints, HTTP Load Balancer, Stackdriver for logging and monitoring, VPC, other shared-services VPC, IAM, DNS, and KMS.
• Compiled data from various sources to perform complex analysis for actionable results.
• Measured efficiency of the Hadoop/Hive environment, ensuring SLAs were met.
• Developed PySpark code for saving data in Avro and Parquet formats and building Hive tables on top of them.
• Created and executed data pipelines on the GCP and AWS platforms.
• Hands-on experience with GCP: BigQuery, GCS, Cloud Functions, Cloud Dataflow, Pub/Sub, Cloud Shell, the gsutil and bq command-line utilities, and Dataproc.
• Implemented a continuous delivery pipeline with Docker, GitHub, and AWS.
• Built performant, scalable ETL processes to load, cleanse, and validate data.
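
A minimal Airflow DAG sketch for a scheduled ETL pipeline like the ones described above. The DAG id, schedule, and task callables are hypothetical placeholders.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pull source data")

    def transform():
        print("clean and aggregate")

    def load():
        print("write to the target store")

    with DAG(
        dag_id="daily_sales_etl",
        start_date=datetime(2020, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_load = PythonOperator(task_id="load", python_callable=load)

        # The >> operator defines the DAG edges: extract, then transform, then load.
        t_extract >> t_transform >> t_load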

Bank of America
Senior Big Data Engineer/Hadoop Developer · Jun 2020 - Jul 2022
• Involved in the complete implementation lifecycle, specializing in writing custom MapReduce, Pig, and Hive code.
• Used NiFi to transfer data from source to destination and was responsible for handling both batch and real-time Spark jobs through NiFi.
• Developed microservices using Python scripts with the Spark DataFrame API for the semantic layer.
• Developed Spark scripts in Scala as per requirements.
• Responsible for managing data coming from different sources; involved in HDFS maintenance and the loading of structured and unstructured data.
• Implemented big data analytics and advanced data science techniques to identify trends, patterns, and discrepancies in petabytes of data using Azure Databricks.
• Trained in QlikView and Splunk reporting and dashboards.
• Developed a data pipeline using Kafka, Spark, and Hive to ingest, transform, and analyze data (see the sketch after this list).
• Used Scala to convert Hive/SQL queries into RDD transformations in Apache Spark.
• Involved in data extraction, migration, validation, encryption, decryption, and replication from on-prem to GCP, including bi-directional replication.
• Created data ingestion processes to maintain a global data lake on GCP and BigQuery.
• Built the complete data ingestion pipeline using NiFi, which POSTs flow files through the InvokeHTTP processor to microservices hosted inside Docker containers.
• Used CloudFormation and the Cloud Development Kit (CDK) to define infrastructure resources and provision AWS resources in a repeatable and automated manner, ensuring consistency and reliability across environments.
• Built streaming services for real-time processing of 100,000 users using Java and Scala.
• Led migration of a legacy data warehouse from on-premises to AWS and Java/Spark.
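
A minimal PySpark Structured Streaming sketch of a Kafka ingestion pipeline in the spirit of the one described above. The broker, topic, and output paths are hypothetical; the original services were written in Java and Scala, and the spark-sql-kafka connector is assumed to be on the classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

    # Subscribe to a Kafka topic and decode the message payloads.
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")
              .option("subscribe", "user-events")
              .load()
              .select(col("value").cast("string").alias("payload")))

    # Continuously land the stream as Parquet for downstream Hive queries.
    query = (events.writeStream
             .format("parquet")
             .option("path", "hdfs:///warehouse/user_events/")
             .option("checkpointLocation", "hdfs:///checkpoints/user_events/")
             .start())
    query.awaitTermination()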

Freddie Mac
Senior Big Data Engineer · Aug 2022 - Nov 2023
• Implemented complete end-to-end ETL for the Rep and Warrant project using agile methodology and was responsible for risk and failure handling.
• The project involved cloud migration from Oracle (on-prem) to GCP; developed and automated the data migration.
• Developed Python scripts to load raw JSON files and derive the attribute values for the corresponding tables.
• Used Snowpark to extract data from the source and load it into enterprise Snowflake (see the sketch after this list).
• Developed Python scripts to parse embedded JSON files, derive attribute values, and load them into Snowflake tables.
• Implemented PySpark logic to transform and process various formats of data such as XLS, JSON, and TXT.
• Built scripts to load PySpark-processed files into Redshift and used diverse PySpark logic.
• Created Hive generic UDFs to process business logic that varies based on policy.
• Moved relational database data into Hive dynamic-partition tables using Sqoop and staging tables.
• Designed and implemented complex workflows and state machines using AWS Step Functions, orchestrating distributed systems and coordinating tasks across AWS services.
• Used Ansible for application deployment, continuous deployment, and automation.
• Implemented event-driven workflows using AWS EventBridge, enabling seamless integration and communication between various services and systems within the AWS ecosystem.
• Developed predictive analytics using Apache Spark APIs.
• Analyzed and mined business data to identify patterns and correlations among the various data points in Splunk.
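
A minimal Snowpark (Python) sketch of extracting from a source table and loading a transformed result into Snowflake, as described above. The connection parameters, table names, and filter are hypothetical.

    from snowflake.snowpark import Session
    from snowflake.snowpark.functions import col

    # Placeholder credentials; in practice these come from a secrets manager.
    session = Session.builder.configs({
        "account": "xy12345", "user": "etl_user", "password": "secret",
        "warehouse": "ETL_WH", "database": "STAGE", "schema": "RAW",
    }).create()

    # Pull the source rows, apply a simple transformation, and load the result.
    loans = session.table("RAW_LOANS").filter(col("STATUS") == "ACTIVE")
    loans.write.mode("overwrite").save_as_table("ENTERPRISE.CURATED.ACTIVE_LOANS")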

Capital One
Senior Data Engineer · Nov 2023 - Present
• Worked on enhancing the features for the circuit breaker functionality on the Overdraft (OD) UI using Python and Scala; ensured seamless data integration and feature updates to improve user experience and system reliability.
• Loaded and managed data in DynamoDB, ensuring high availability and performance for real-time data access and updates.
• Utilized AWS Step Functions to orchestrate various processes, including inclusion, exclusion, and segmentation steps, ensuring efficient workflow management and automation of complex business logic (see the sketch after this list).
• Developed Scala-based projects to aggregate various rules and check whether user data falls under specific metrics, which included implementing aggregation logic to evaluate complex rule sets, ensuring scalability and performance of rule evaluations, and integrating with AWS services for data processing and storage.
• Leveraged AWS Glue jobs for ETL processes and AWS Lambda for serverless compute to run real-time data processing tasks, which included creating Glue jobs to transform and load data efficiently, using Lambda functions to trigger specific steps within workflows, and ensuring seamless integration between Glue and other AWS services.
• Enhanced the circuit breaker feature to dynamically handle system loads and prevent failures, ensuring robustness and reliability of the application.
• Employed a comprehensive technology stack including Python, Scala, AWS Step Functions, DynamoDB, AWS Glue, and AWS Lambda to deliver high-quality, scalable solutions.
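
A minimal boto3 sketch of starting a Step Functions workflow and recording a result in DynamoDB, in the spirit of the orchestration described above. The state machine ARN, table name, and payload are hypothetical.

    import json

    import boto3

    sfn = boto3.client("stepfunctions")
    ddb = boto3.resource("dynamodb")

    # Kick off one run of the segmentation state machine with a sample input.
    sfn.start_execution(
        stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:od-segmentation",
        input=json.dumps({"customer_id": "c-001", "step": "inclusion"}),
    )

    # Record the outcome for real-time lookups by the application.
    table = ddb.Table("od_segments")
    table.put_item(Item={"customer_id": "c-001", "segment": "eligible"})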
Licenses & Certifications

Associate Cloud Engineer
Google Cloud · Feb 2024
AWS Certified Solutions Architect – Associate
Amazon Web Services (AWS) · Feb 2024
Recommendations

Tejas Saykar
Software Developer at Spitertech Solutions LLP | React Js | Node Js | MongoDB | MERN Stack Develope... · Nashik, Maharashtra, India
Francesca Concio
Senior Civil Structural Engineer at newcleo · Genoa, Liguria, Italy
Chidimma Sylvia Anyanwu, MCIB, MCILRM
Branch Manager at Accion Microfinance Bank Ltd · Nigeria
Dheeru Sharma
Security at Sterling Auxiliaries Pvt Ltd. · Bharuch, Gujarat, India
Liberty Smith
Program Manager at ACC Premiere · Montoursville, Pennsylvania, United States
Ahmad Ashrif A Bakar
Professor at Universiti Kebangsaan Malaysia · Bandar Baru Bangi, Selangor, Malaysia
Pratama Siregar
Performance Marketing Specialist | Lead-Gen Focused-Campaign | Paid Media | Media Planning & Budgeti... · Jakarta, Jakarta, Indonesia
Tyson Seburn
Author, How to Write Inclusive Materials / AD International Programs, UofT · Greater Toronto Area, Canada
Keith Koh
Insurance Loss Adjusting / Civil / Geotechnical Engineering · Singapore
Muhammad Naufan Azhari
People Development | Training & Learning Development · Jakarta, Indonesia
Alexander Marquez, BSN, RN
Emergency Room Nurse for The Mayo Clinic Phoenix Campus · Miami, Florida, United States
Yolanda Silveira
Content Writer · Goa, India
Muhammad Shahroz
Engineer | Data Analyst | WordPress Dev | Google-Certified | UET Alumni | PepsiCo X Amal Talent '23 ... · Lahore, Punjab, Pakistan
Joshua Bennett
Information Technology Business System Analyst at Tarrytown Expocare Pharmacy · Austin, Texas, United States
Blanca R. Jimenez
📈 · Fort Lauderdale, Florida, United States
Maryam Ghane
Iran
Jérôme Cayolle
Data investigator | Product and process sentinel | Root Cause Analysis Specialist | SQL | Big Data |... · Ireland
Tina Liu
Sales Manager at Suzhou Dahua Ship Co., Ltd · China
Suvendu Bikash Deb
Head of Finance & Supply Chain · Bangladesh
Iuliia Zherdieva
Senior Software Engineer · Canada
...