Data Engineer

Petrofac

Petrofac is a leading international service provider to the energy industry, with a diverse client portfolio including many of the world’s leading energy companies.

We design, build, manage, and maintain infrastructure for our clients. We recruit, reward, and develop our people based on merit, regardless of race, nationality, religion, gender, age, sexual orientation, marital status, or disability. We value our people and treat everyone who works for or with Petrofac fairly and without discrimination.

The world is re-thinking its energy supply and energy security needs and planning for a phased transition to alternative energy sources. We are here to help our clients meet these evolving energy needs.

This is an exciting time to join us on this journey.

Are you ready to bring the right energy to Petrofac and help us deliver a better future for everyone? 

JOB TITLE: Data Engineer

 

KEY RESPONSIBILITIES:

 

·       Architect and define data flows for big data and data lake use cases.

·       Excellent knowledge of implementing the full life cycle of data management principles, including Data Governance, Architecture, Modelling, Storage, Security, Master Data, and Quality.

·       Act as a coach, providing consultancy and advice to data engineers by offering technical guidance and ensuring architecture principles, design standards, and operational requirements are met.

·       Participate in the Technical Design Authority forums.

·       Collaborate with analytics and business stakeholders to improve data models that feed BI tools, increasing data accessibility and fostering data-driven decision making across the organization.

·       Work with a team of data engineers to deliver tasks and achieve weekly and monthly goals, and guide the team to follow best practices and improve deliverables.

·       Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability.

·       Estimate cluster and core sizes, and monitor and troubleshoot the Databricks cluster and Analysis Services server to provide optimal capacity for data ingestion and compute.

·       Deliver master data cleansing and improvement efforts, including automated and cost-effective solutions for processing, cleansing, and verifying the integrity of data used for analysis.

·       Expertise in securing the big data environment, including encryption, tunnelling, access control, and secure isolation.

·       Guide and build highly efficient OLAP cubes using data modelling techniques to cover all required business cases and mitigate the limitations of Power BI in Analysis Services.

·       Deploy and maintain highly efficient CI/CD DevOps pipelines across multiple environments, such as development, staging, and production.

·       Strictly follow a Scrum-based Agile development approach, working from allocated stories.

·       Comprehensive knowledge of extracting, transforming, and loading data from various sources such as Oracle, Hadoop HDFS, flat files, JSON, Avro, Parquet, and ORC.

·       Experience defining, implementing, and maintaining a global data platform.

·       Experience building robust and impactful data visualisation solutions and driving their adoption.

·       Extensive experience onboarding various data sources using real-time, batch, or scheduled loads; sources may be cloud-based, on-premises, SQL, NoSQL, or API-based.

·       Expertise in extracting data through JSON, OData, REST APIs, web services, and XML.

·       Expertise in data ingestion platforms such as Apache Sqoop, Apache Flume, Amazon Kinesis, Fluentd, Logstash, etc.

·       Hands-on experience using Databricks, Pig, Scala, Hive, Azure Data Factory, Python, and R.

·       Operational experience with big data technologies and engines, including Presto, Spark, Hive, and Hadoop environments.

·       Experience with various databases, including Azure SQL DB, Oracle, MySQL, Cosmos DB, and MongoDB.

·       Experience supporting and working with cross-functional teams in a dynamic environment.

 

ESSENTIAL QUALIFICATION & SKILLS:

  • Bachelor’s degree (Master’s preferred) in Computer Science, Engineering, or another technology-related field.

  • 10+ years of experience with data analytics platforms, hands-on experience with ETL and ELT transformations, and strong SQL programming knowledge.

  • 5+ years of hands-on experience in big data engineering, distributed storage, and processing massive datasets into a data lake using Scala or Python.

  • Proficient knowledge of the Hadoop and Spark ecosystems, including HDFS, Hive, Sqoop, Oozie, Spark Core, and Spark Streaming.

  • Experience with programming languages such as Scala, Java, and Python, and with shell scripting.

  • Proven experience pulling data through REST APIs, OData, XML, and web services.

  • Experience with Azure product offerings and the Azure data platform.

  • Experience in data modelling (data marts, snowflake/star schemas, normalization, SCD Type 2).

  • Ability to architect and define data flows and build highly efficient, scalable data pipelines.

  • Work in tandem with the Enterprise and Domain Architects to understand business goals and vision, and contribute to enterprise roadmaps.

  • Strong troubleshooting and problem-solving skills for any issues blocking business progress.

  • Coordinate with multiple business stakeholders to understand requirements and deliver.

  • Conduct continuous audits of data management system performance, refine as required, and immediately report any breaches or loopholes to stakeholders.

  • Allocate tasks to team members, track their status, and report on activities to management.

  • Understand physical and logical execution plans and optimize the performance of data pipelines.

  • Extensive background in data mining and statistical analysis.

  • Able to understand various data structures and common methods in data transformation.

  • Ability to work with ETL tools, with strong knowledge of ETL concepts.

  • Strong focus on delivering outcomes.

  • Data management:  modelling, normalization, cleaning, and maintenance

  • Understand data architectures and data warehousing principles, and be able to participate in the design and development of conventional data warehouse solutions.

Additional Information

    To apply for this job, please visit petrofac.referrals.selectminds.com.