Data Engineer
Technology Stack
- Apache Kafka
- Hadoop v2, HDFS, and MapReduce
- Storm, Spark Streaming
- Pig, Hive, Spark
- Protobuf, Thrift, Avro
Job Description
- Design, develop and maintain an infrastructure for streaming, processing, and storage of data. Build tools for effective maintenance and monitoring of the data infrastructure
- Contribute to key data pipeline architecture decisions and lead the implementation of major initiatives
- Work closely with stakeholders to develop scalable and performant solutions for their data requirements, including extraction, transformation, and loading of data from a range of data sources
- Develop the team’s data capabilities – share knowledge, enforce best practices and encourage data-driven decisions
- Develop data retention policies and backup strategies, and ensure that the firm’s data is stored redundantly and securely
Requirements
- Solid Computer Science fundamentals, excellent problem-solving skills, and a strong understanding of distributed computing principles
- At least 3 years of experience in a similar role, with a proven track record of building scalable and performant data infrastructure
- Expert SQL knowledge and deep experience working with relational and NoSQL databases
- Advanced knowledge of Apache Kafka and demonstrated proficiency in Hadoop v2, HDFS, and MapReduce
- Experience with stream-processing systems (e.g. Storm, Spark Streaming), big data querying tools (e.g. Pig, Hive, Spark), and data serialization frameworks (e.g. Protobuf, Thrift, Avro)
- Bachelor’s or Master’s degree in Computer Science or a related field from a top university
- Able to work within the GMT+8 time zone
About our Client
Our client is a leading Singapore-based organization that engages in B2B services
Company: George Bernard (Pvt) Ltd
Company email: [email protected]
Job Location: Colombo
Job Category: Software Development / Web / QA / Data / GIS
Job Type: Full Time
