Job description

Big Data Engineer


We are seeking an outstanding engineer to play a pivotal role in helping analysts and business users make decisions using data and visualizations. You will partner with key players across the engineering, analytics, and business teams as you design and build query-friendly data structures. The ideal candidate is a self-motivated teammate, skilled in a broad set of data processing techniques, who can adapt and learn quickly, deliver results with limited direction, and choose the best possible data processing solution for the task at hand.

The Big Data Engineer will be responsible for designing, building, and maintaining our large-scale data processing systems. Working with our team of data scientists and software engineers, the engineer will design efficient and reliable data pipelines that ingest, process, and store data from a variety of sources, and will optimize these systems for performance and scalability.

Key Qualifications

  • 5+ years of professional experience with Big Data systems, pipelines, and data processing
  • Practical hands-on experience with technologies such as Apache Hadoop, Apache Pig, Apache Hive, Apache Sqoop, and Apache Spark
  • Ability to understand API specs, identify relevant API calls, extract data, and implement data pipelines and SQL-friendly data structures
  • Ability to define data validation rules and alerts based on data publishing specifications, for data integrity and anomaly detection
  • Understanding of distributed file formats such as Apache Avro and Apache Parquet, and of common data transformation methods
  • Expertise in Python, Unix shell scripting, and dependency-driven job schedulers
  • Expertise in core Java, Oracle, Teradata, and ANSI SQL
  • Familiarity with rule-based tools and APIs for multi-stage data correlation on large data sets is a plus
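As an illustration of the rule-based data validation mentioned in the qualifications, the sketch below shows one simple way such checks might look in Python. All field names, thresholds, and the function itself are hypothetical, not part of this role's actual codebase.

```python
# Minimal sketch of rule-based data validation for a pipeline record.
# Field names ("event_id", "timestamp", "amount") and the range threshold
# are illustrative assumptions, not taken from any real specification.

def validate_record(record):
    """Apply simple data-integrity rules; return a list of violations."""
    violations = []
    # Rule 1: required fields must be present and non-empty.
    for field in ("event_id", "timestamp", "amount"):
        if not record.get(field):
            violations.append(f"missing required field: {field}")
    # Rule 2: numeric range check, a basic form of anomaly detection.
    amount = record.get("amount")
    if isinstance(amount, (int, float)) and not (0 <= amount <= 1_000_000):
        violations.append(f"amount out of expected range: {amount}")
    return violations

clean = {"event_id": "e1", "timestamp": "2024-01-01T00:00:00Z", "amount": 42.5}
bad = {"event_id": "", "timestamp": "2024-01-01T00:00:00Z", "amount": -5}
print(validate_record(clean))  # []
print(validate_record(bad))    # two violations: missing field, out-of-range amount
```

In production, rules like these would typically run inside a pipeline stage (for example, a Spark job) and route violating records to an alerting or quarantine path rather than printing them.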