
Satya Sai Teja Jasthi
ETL Developer

About me
Sr Data Engineer
Education

Bradley University - Master's degree
Lovely Professional University - Bachelor's degree
Experience

Hexaware Technologies
Nov 2014 - Apr 2016
ETL Developer
Created ETL pipelines to move data from legacy systems to a Hadoop cluster on a reporting project. Also involved in data preprocessing, data cleansing, business-requirement validation, functional specification design for schema and table construction, and Hive DWH query performance optimization.
Responsibilities:
• Worked with the Hortonworks distribution; installed, configured, and maintained a Hadoop cluster to meet organizational needs.
• Created new mapping designs using Informatica Designer tools, including Source Analyzer, Warehouse Designer, Mapplet Designer, and Mapping Designer.
• Built mappings to technical specifications using the appropriate transformations in the Informatica tool.
• Developed complex mappings implementing business logic to feed data into the staging area.
• Developed mappings and sessions using Informatica PowerCenter for data loading, reusing Informatica components at different stages of development.
• Designed, built, and deployed an ETL process using IICS Data Integration.
• Applied performance-tuning techniques extensively while loading data into Azure Synapse with IICS.
• Performed data manipulations with Informatica transformations such as Filter, Expression, Aggregator, Update Strategy, Normalizer, Joiner, Router, Sorter, and Union.
• Wrote Bash scripts to retrieve log files from FTP servers and ran Hive jobs to process and analyze them.
• Developed processes for transferring data from all source systems into the data warehousing system.
• Set up the staging environment and populated it with data gathered from various sources.
• Analyzed business process activities to help develop ETL processes for moving data from source to target systems.
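A minimal sketch of the staging-area mapping pattern this role describes, expressed as a Filter plus Expression transformation in plain Python. The field names ("cust_id", "amount") and the threshold are hypothetical, not from the actual project:

```python
# Toy Filter + Expression transformation: reject incomplete rows, trim strings,
# and derive a flag column before rows land in the staging area.
def filter_and_transform(rows):
    """Drop rows missing a customer id, trim string fields, derive a flag."""
    out = []
    for row in rows:
        if not row.get("cust_id"):   # Filter transformation: reject incomplete rows
            continue
        # Expression transformation: cleanse values and derive a new field
        cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        cleaned["high_value"] = cleaned["amount"] > 1000
        out.append(cleaned)
    return out
```

In a real PowerCenter mapping these steps would be separate Filter and Expression transformation objects; here they are collapsed into one function for illustration.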

Sonata Software
May 2016 - Feb 2017
Jr Data Engineer
As a Data Engineer at Target, I played a key role in setting up and managing the Hadoop ecosystem on GCP and oversaw application migrations using Google Dataflow. I partnered with product teams to build store-level metrics and data pipelines, using tools such as Sqoop, PySpark, and Airflow for automation and data processing.
Responsibilities:
• Implemented and managed data transformations in the Azure environment using PySpark.
• Wrote T-SQL scripts to synchronize and migrate data between multiple database systems.
• Enhanced Tableau data models to meet complex business needs while maintaining data integrity and accuracy.
• Built custom Talend routines and components to solve difficult integration problems and complex data transformations.
• Developed and refined complex stored procedures, functions, and queries in PL/SQL.
• Created Unix shell scripts to automate Azure-based data integration workflows.
• Used Spark Streaming to divide streaming data into batches for input to the Spark engine.
• Wrote Spark applications for data validation, cleansing, transformation, and custom aggregation.
• Developed REST APIs in Python with Flask and Django for integration with various data sources.
• Used Apache Spark with Python to build Big Data analytics and machine learning applications.
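The Spark Streaming bullet above describes the micro-batch model: an unbounded stream is split into small batches that are each handed to the batch engine. A pure-Python sketch of that batching step (batch_size stands in for the streaming interval; this is an analogy, not Spark's API):

```python
# Mimic Spark Streaming's micro-batching: group an unbounded stream of records
# into fixed-size batches for downstream batch processing.
def micro_batches(stream, batch_size):
    """Yield fixed-size batches from an iterable of records."""
    batch = []
    for record in stream:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:            # flush the final partial batch
        yield batch
```

In actual Spark Streaming, the interval is time-based rather than count-based, and each batch becomes an RDD processed by the Spark engine.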

PLZ Corp
Apr 2017 - Feb 2019
Big Data Engineer
PLZ Corp specializes in the development, manufacturing, packaging, and distribution of a comprehensive range of private-label products. I handled the company's data closely, making sure it was accurately gathered, processed, and stored. My duties included creating and refining data pipelines to support business administration and payment integrity. The data was then loaded into centralized data warehouses, enabling thorough reporting and analytics for risk evaluation, premium computation, and client-specific insights.
Responsibilities:
• Designed and executed end-to-end data solutions on Cloudera, Hortonworks, MapR, Snowflake, and Apache Airflow, leveraging Hadoop, Hive, and Pig.
• Developed and oversaw distributed data solutions and ETL pipelines using Big Data technologies including AWS, GCP, Azure cloud services, the Databricks platform, and Hadoop ecosystem components.
• Developed and managed cloud infrastructure as code (IaC) with Terraform to automate provisioning of AWS services such as EC2, S3, and VPCs.
• Performed data migration, profiling, ingestion, cleansing, transformation, and export with ETL tools such as Talend Open Studio for Big Data.
• Developed and optimized data solutions using SQL Server, MSBI, and Azure cloud.
• Worked with Azure Cosmos DB, Azure Synapse Analytics, Azure Data Factory, Azure Data Lake Storage, and Azure analytical services.
• Applied dimensional data modeling, including star-join schema and snowflake modeling, with tools such as ER/Studio, Erwin, and Sybase PowerDesigner.
• Maintained a thorough understanding of the AWS platform and its features, including CloudFormation, CloudWatch, CloudTrail, EBS, VPC, RDS, and IAM.
• Worked with CloudFront, CloudFormation, S3, Athena, SNS, SQS, Glue, RDS, DynamoDB, EC2, ECS, Elastic Beanstalk, Lambda, and Elastic Load Balancing.
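The dimensional-modeling bullet above refers to star schemas, where a fact table holds measures plus surrogate keys into dimension tables. A rough sketch of that split in plain Python; the table and column names are invented for illustration:

```python
# Split raw order records into a customer dimension table and a fact table,
# assigning surrogate keys the way a star-schema load would.
def build_star_schema(orders):
    """Return (dimension rows, fact rows) for a list of raw order records."""
    customer_dim = {}            # natural key -> surrogate key
    dim_rows, fact_rows = [], []
    for o in orders:
        nk = o["customer"]
        if nk not in customer_dim:
            customer_dim[nk] = len(dim_rows) + 1      # next surrogate key
            dim_rows.append({"customer_sk": customer_dim[nk], "name": nk})
        fact_rows.append({"customer_sk": customer_dim[nk], "amount": o["amount"]})
    return dim_rows, fact_rows
```

A snowflake schema would further normalize the dimension table into sub-dimensions; the surrogate-key lookup step stays the same.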

Intetics
Mar 2019 - Sept 2021
GCP Data Engineer
At Intetics, our team's project was to develop a new application for MedForward, a business partner of Intetics. MedForward helps pharmacies drive more traffic to their stores, mostly through its pharmacy finder, a gateway to MTM (medication therapy management). These pharmacies make money by performing MTM cases. Eventually, MedForward will start offering products to pharmacies directly through the application. Our team developed a new site for a partner company to use internally to manage its clients, as well as an external site for the partner company's clients.
Responsibilities:
• Created and implemented end-to-end data pipelines for processing large volumes of bank transaction data using GCP technologies.
• Demonstrated proficiency in SQL and BigQuery, creating and refining complex queries to retrieve valuable insights from terabytes of transactional data.
• Developed a comprehensive understanding of clinical workflows, patient data management, and healthcare processes.
• Used SSIS, SSAS, and SSRS to improve the accuracy and efficiency of data processing in the project.
• Developed complex ETL procedures using Python, Hadoop, and PySpark to convert unstructured transaction data into a structured format for subsequent analysis.
• Used Spark to read data from various sources, such as files and RDBMS, and process it through actions and transformations.
• Diagnosed issues, debugged, and tuned SQL and PL/SQL code to maximize application performance.
• Constructed Python DAGs in Apache Airflow to orchestrate complete data pipelines for diverse uses.
• Used Apache Spark-based analytics with Azure Databricks, enabling collaborative data science and engineering.
• Created enterprise-level solutions with streaming frameworks (Apache Kafka, Spark Streaming, and Flink) and batch processing (Apache Pig).
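The Airflow bullet above relies on DAG semantics: a task runs only after all its upstream tasks complete. A toy scheduler sketch of that ordering in plain Python (task names are hypothetical; a real pipeline would use airflow.DAG and operators instead):

```python
# Compute a valid run order for tasks with upstream dependencies,
# the core guarantee an Airflow DAG provides.
def topo_order(deps):
    """deps maps task -> set of upstream tasks; return a valid run order."""
    order, done = [], set()
    while len(done) < len(deps):
        ready = [t for t, ups in deps.items() if t not in done and ups <= done]
        if not ready:
            raise ValueError("cycle detected in DAG")
        for t in sorted(ready):   # deterministic order for ties
            order.append(t)
            done.add(t)
    return order
```

In Airflow the same dependencies would be declared with `extract >> transform >> load`, and the scheduler performs this ordering (plus retries, scheduling intervals, and parallelism) for you.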

American Express
Oct 2021 - Nov 2022
Cloud Engineer
As a cloud engineer at American Express, I work on cloud services and use ETL tools such as Informatica to ensure the smooth flow and transformation of financial data. Working in the field of financial planning and advisory services, I create and carry out data integration procedures that gather relevant financial data from several sources, modify it in accordance with predetermined business standards and guidelines, and load it into data warehouse and analytical systems.
Responsibilities:
• Designed and implemented end-to-end data solutions using Hadoop, Hive, and Pig on Big Data platforms including Cloudera, Hortonworks, MapR, Snowflake, and Apache Airflow.
• Built and managed distributed data solutions and ETL pipelines leveraging Big Data technologies such as Hadoop ecosystem components, the Databricks platform, and AWS, GCP, and Azure cloud services.
• Developed and managed cloud infrastructure as code (IaC) using Terraform to automate provisioning of AWS resources such as EC2, S3, and VPCs.
• Performed data migration, profiling, ingestion, cleansing, transformation, and export using ETL tools such as Talend Open Studio for Big Data.
• Developed and optimized data solutions with SQL Server, MSBI, and Azure cloud.
• Worked with Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, Azure analytical services, and Azure Cosmos DB.
• Applied dimensional data modeling with tools such as ER/Studio, Erwin, and Sybase PowerDesigner, including star-join schema and snowflake modeling.
• Maintained in-depth knowledge of the AWS platform and its features, including IAM, EC2, EBS, VPC, RDS, CloudWatch, CloudTrail, and CloudFormation.
• Worked with EC2, ECS, Elastic Beanstalk, Lambda, Glue, RDS, DynamoDB, CloudFront, CloudFormation, S3, Athena, SNS, SQS, and Elastic Load Balancing (ELB).
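The transform step above ("modify it in accordance with predetermined business standards") can be sketched as a rule chain applied to each record before load. Rule names and fields are assumptions for illustration, not actual American Express logic:

```python
# Run each record through a chain of business rules before loading.
# A rule returns the (possibly modified) record, or None to reject it.
def apply_rules(records, rules):
    """Return the records that survive every rule, transformed in order."""
    loaded = []
    for rec in records:
        for rule in rules:
            rec = rule(rec)
            if rec is None:       # rule rejected the record; stop the chain
                break
        if rec is not None:
            loaded.append(rec)
    return loaded

# Example (hypothetical) rules
normalize_currency = lambda r: {**r, "amount": round(r["amount"], 2)}
drop_negative = lambda r: r if r["amount"] >= 0 else None
```

Keeping each rule as a small pure function makes the chain easy to reorder, unit-test, and map onto the transformation stages an Informatica mapping would express.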

OSF HealthCare
Dec 2022 - now
My primary responsibility as a Data Engineer in the OSF HealthCare Solutions division is to develop and deploy data solutions that improve the effectiveness and efficiency of healthcare procedures. I work closely with healthcare data to ensure accurate collection, processing, and storage. My duties include developing and refining data pipelines to facilitate patient-centered treatment, payment integrity, and healthcare administration. By applying modern data modeling and integration strategies, I help information flow seamlessly across healthcare systems, promoting better decision-making and operational efficacy.
Responsibilities:
• Designed and implemented scalable data ingestion pipelines using Azure Data Factory, ingesting data from sources such as SQL databases, CSV files, and REST APIs.
• Developed data processing workflows using Azure Databricks, leveraging Spark for distributed data processing and transformation tasks.
• Ensured data quality and integrity by performing data validation, cleansing, and transformation operations using Azure Data Factory and Databricks.
• Designed and implemented a cloud-based data warehouse solution using Snowflake on Azure, leveraging its scalability and performance capabilities.
• Created and optimized Snowflake schemas, tables, and views to support efficient data storage and retrieval for analytics and reporting.
• Collaborated with data analysts and business stakeholders to understand their requirements and implemented appropriate data models and structures in Snowflake.
• Developed and optimized Spark jobs to perform data transformations, aggregations, and machine learning tasks on big data sets.
• Leveraged Azure Synapse Analytics to integrate big data processing and analytics capabilities, enabling seamless data exploration and insight generation.
• Configured event-based triggers and scheduling mechanisms to automate data pipelines and workflows.
Azure Data Engineer
Dec 2022 - now
Azure Snowflake Data Engineer
Dec 2022 - now
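The data-validation bullet above can be sketched as a required-field schema check applied to each ingested record. The schema and field names here are illustrative assumptions, not OSF's actual data model:

```python
# Validate ingested records against a simple required-field schema,
# the kind of quality gate a pipeline runs before loading to the warehouse.
REQUIRED = {"patient_id": str, "visit_date": str, "charge": float}

def validate(record):
    """Return (True, cleaned_record) on success, else (False, error list)."""
    errors, cleaned = [], {}
    for field, ftype in REQUIRED.items():
        value = record.get(field)
        if value is None:
            errors.append(f"missing {field}")
        elif not isinstance(value, ftype):
            errors.append(f"{field} should be {ftype.__name__}")
        else:
            cleaned[field] = value
    return (not errors, cleaned if not errors else errors)
```

In Azure Data Factory or Databricks this check would typically be a mapping data flow assertion or a PySpark filter, with rejected rows routed to a quarantine table for review.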
Licenses & Certifications

Academy Accreditation - Generative AI Fundamentals
Databricks, Aug 2024

Microsoft Certified: Azure Data Engineer Associate
Microsoft, Jun 2024

Professional Data Engineer Certification
Google, May 2024