ANUSHA RAMA

Software Engineer Intern

316 followers
Overland Park, Kansas, United States

  • About me

Data engineering enthusiast actively seeking full-time Big Data, Data Engineer, Hadoop Engineer, and Spark Engineer positions

  • Education

    • University of Missouri-Kansas City

      2023 - 2024
Master's degree, Computer Science (GPA: 3.5)
    • CMR Group Of Institutions

      2014 - 2018
Bachelor's degree, Electrical, Electronics and Communications Engineering (GPA: 7.2/10)
  • Experience

    • Krifal IT Ventures Pvt. Ltd.

      Aug 2017 - Jun 2018
      Software Engineer Intern

      • Assist in data quality assurance tasks, including data cleansing, validation, and anomaly detection, to ensure data accuracy and reliability.
      • Apply programming languages such as Python, SQL, and scripting languages for data processing, analysis, and automation tasks.
      • Assist in developing and maintaining data pipelines and ETL processes to extract, transform, and load data from various sources into databases or data lakes.
      • Participate in data-related projects and initiatives, such as data migrations, data warehousing, and data visualization projects, under mentorship and guidance.
      • Learn and apply best practices in data engineering, data governance, and data security to ensure compliance and data privacy.

    • Mindtree

      Aug 2018 - Sept 2021
      Software Engineer

      Responsibilities:
      • Developed end-to-end ETL pipelines using Spark and Hive to perform various business-specific transformations.
      • Built applications and automated pipelines in Spark for bulk loads as well as incremental loads of various datasets.
      • Developed various Spark applications using PySpark to perform enrichments of user behavioral data (clickstream data) merged with user profile data.
      • Developed Python scripts to fetch data from various data sources via their APIs; worked closely with the team's data scientists and consumers to shape the datasets to their requirements.
      • Scheduled jobs and workflows using Apache Airflow and Oozie; worked extensively with Sqoop for importing data from Oracle.
      • Involved in creating Hive tables, loading and analyzing data, and implementing partitioning, dynamic partitioning, and bucketing.
      • Good experience with continuous integration and deployment of applications using GitLab and Jenkins.

      Environment: Spark, PySpark, Spark SQL, Hive, HDFS, Hadoop, SQL, Linux, Sqoop, GitLab, Jenkins.

    • Wipro

      Sept 2021 - Dec 2022
      Software Engineer

      Responsibilities:
      • Design, build, and maintain data pipelines to ingest, process, and transform sales data from various sources, such as Hive tables and SQL servers, into centralized target databases using Databricks and PySpark.
      • Implement data transformation and cleansing operations using Spark to convert raw data into a clean, structured format suitable for data analysis and modeling.
      • Utilized the distributed computing capabilities of Spark to process and analyze large volumes of sales and customer data efficiently, taking advantage of parallel processing.
      • Designed, built, scheduled, and monitored complex data pipelines using Apache Airflow, ensuring reliable execution and successful data processing.
      • Implement data partitioning techniques in Spark to optimize processing performance and reduce data shuffling during transformations.
      • Collaborate with data scientists to prepare and transform data for ML models, ensuring the data is in the appropriate format for training and scoring.
      • Used Athena to perform ad-hoc querying on data stored in S3 for analytics and exploration.
      • Optimize Spark jobs, configurations, and resource allocation to achieve the best performance and resource utilization.
      • Implement security measures to safeguard sensitive data and ensure compliance with data privacy regulations to protect customer data.
      • Strong experience using Python and PySpark to build data pipelines and write Python scripts that automate them.
      • Experienced in handling large datasets using Spark's in-memory capabilities, broadcast variables, and effective, efficient joins and transformations.
      • Worked extensively with Sqoop for importing data from Oracle.
      • Experience working with EMR clusters in AWS and with S3.
      • Good experience with continuous integration of applications using GitHub and Jenkins.

      Environment: Databricks, PySpark, Python, Spark SQL, Hive, Athena, Sqoop, Linux, AWS, GitHub, Jenkins.

  • Licenses & Certifications

    • Academy Accreditation - Generative AI Fundamentals

      Databricks
      May 2024
    • Academy Accreditation - Databricks Lakehouse Fundamentals

      Databricks
      Mar 2024