Nagireddy Ravi Kumar Reddy

Bengaluru, Karnataka, India
156 followers
  • About me

    Data Engineer | Hadoop, HDFS, Sqoop, Hive, Spark, SQL, Scala, AWS | Apache Spark, Spark-SQL | Expertise in Optimizing Spark Jobs & Cost Reduction | Actively Seeking New Opportunities

  • Education

    • Kalasalingam University

      2017 - 2021
      Bachelor of Technology, Computer Science (CGPA: 8.6)
  • Experience

    • Tata Consultancy Services

      Jun 2021 - Feb 2023
      Assistant System Engineer

      • Loaded data into HDFS from sources such as SQL Server and AWS S3 using Sqoop, and loaded it into Hive tables.
      • Created Hive tables and loaded data from different data sources, HDFS locations, and other Hive tables.
      • Created Sqoop jobs and scheduled them to handle incremental loads from RDBMS into HDFS, then applied Spark transformations.
      • Created Hive external tables to perform ETL on data generated on a daily basis.
      • Developed Spark code in Scala and Python (PySpark) and deployed it on AWS EMR.
      • Optimized Spark SQL and Hive queries, reducing project costs.
      • Monitored, managed, and troubleshot Hadoop and Spark log files.
      • Worked on Hadoop within Cloudera Data Platform, running services through Cloudera Manager.
      • Followed Agile methodology with daily Scrum meetings and sprint planning.
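      The incremental-load pattern mentioned above (as in Sqoop's `--incremental append` mode, which tracks a last-value watermark on a check column and imports only newer rows) can be sketched in plain Python; the table contents and column names here are purely illustrative:

      ```python
      # Minimal sketch of Sqoop-style incremental ("append") loading:
      # track the max value of a check column in the target, then pull
      # only source rows beyond that watermark. Data is illustrative.

      def incremental_load(source_rows, target_rows, check_column="id"):
          """Append rows from source whose check column exceeds the
          last value already present in the target."""
          last_value = max((r[check_column] for r in target_rows), default=None)
          new_rows = [
              r for r in source_rows
              if last_value is None or r[check_column] > last_value
          ]
          target_rows.extend(new_rows)
          return len(new_rows)

      # First run imports everything; later runs import only the delta.
      target = []
      source = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
      incremental_load(source, target)             # imports 2 rows
      source.append({"id": 3, "name": "c"})
      imported = incremental_load(source, target)  # imports only id=3
      ```

      Real Sqoop persists the last value in the saved-job metastore between scheduled runs; the watermark-and-filter logic is the same idea.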

    • Ipsos

      Mar 2023 - Sep 2023
      Software Engineer

      • Collaborated with data modeling teams, stakeholders, and data analysts to understand data requirements and translate them into technical specifications and structured data representations.
      • Developed Spark applications in Scala for data cleansing, event enrichment, aggregation, and preparation to meet business requirements.
      • Implemented data quality checks and validation processes to ensure accuracy, consistency, and completeness of data.
      • Worked with various data formats, including Avro, SequenceFile, JSON, Parquet, and XML.
      • Fine-tuned Spark applications to improve overall pipeline processing time.
      • Created Hive tables, loaded data, and wrote Hive queries to process it; created partitions and used bucketing on Hive tables, with the required parameters to improve performance.
      • Debugged common issues with Spark RDDs and DataFrames, resolved production issues, and ensured seamless data processing in production environments.
      • Stored Spark-processed data in HDFS/S3 in appropriate file formats per business requirements.
      • Imported and exported data between HDFS and Hive using Sqoop and managed data within the environment.
      • Created EC2 instances and EMR clusters for Spark code development and testing.
      • Performed step execution in EMR clusters to deploy Spark jobs as required.
      • Used Agile Scrum (Scrum Alliance) methodology for development.
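      The bucketing technique mentioned above can be sketched in plain Python: Hive assigns each row to bucket `hash(key) % num_buckets`, so equal keys always co-locate, which speeds up joins and sampling. This sketch uses Python's built-in `hash` as a stand-in for Hive's hash function, and the column names are illustrative:

      ```python
      # Minimal sketch of Hive-style bucketing: rows are distributed
      # into a fixed number of buckets by hashing the bucketing column.

      def bucket_of(key, num_buckets):
          """Deterministic bucket index for a key (Python's hash stands
          in for Hive's hash function here)."""
          return hash(key) % num_buckets

      def bucketize(rows, key, num_buckets):
          """Group rows into num_buckets buckets by the given column."""
          buckets = [[] for _ in range(num_buckets)]
          for row in rows:
              buckets[bucket_of(row[key], num_buckets)].append(row)
          return buckets

      # 12 rows over 4 distinct user keys, split into 4 buckets.
      rows = [{"user": f"u{i % 4}", "value": i} for i in range(12)]
      buckets = bucketize(rows, "user", 4)
      # Every row with the same user lands in the same bucket, so a
      # bucketed join on "user" only has to match bucket-to-bucket.
      ```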

    • Societe Generale

      Apr 2024 - Present
      Big Data Engineer
  • Licenses & Certifications