Data Engineer

About the Role:

We are seeking a passionate and skilled Data Engineer to join our growing team. In this role, you will be the cornerstone of our data infrastructure, responsible for building, optimizing, and maintaining the robust and scalable data platforms that drive our business decisions and power our products. You will have a direct impact on the reliability, efficiency, and intelligence of our data ecosystem.

Key Responsibilities

Data Architecture & Modeling:

Participate in the design and implementation of our data warehouse models, ensuring architectural stability, scalability, and adherence to best practices.

Data Pipeline Development:

Build, manage, and optimize core data pipelines, including event tracking specification, data ingestion, and ETL/ELT processes, ensuring timely and accurate data flow.

Data Governance & Quality:

Establish and enforce mechanisms for data quality monitoring, task alerting, and operational maintenance to guarantee data accuracy, consistency, and completeness.

Business Collaboration & Insight:

Partner with cross-functional teams to translate business logic into scalable data models and a unified metrics system. Utilize visualization tools to effectively present business insights and support ad-hoc analytical requests.

Performance & Innovation:

Proactively identify performance bottlenecks and optimization opportunities within the data platform. Lead and implement technical improvements in storage architecture, computing engines, and real-time data processing capabilities.

Qualifications & Experience

  • Bachelor's degree in Computer Science, Engineering, or a related field.

  • 3+ years of hands-on experience in database or large-scale data platform development.

  • Preferred: Experience with advertising platforms and B2B business data processing.

Technical Stack Requirements:

Cloud & Big Data Platforms:

Hands-on experience with AWS big data services (e.g., EMR, Redshift, S3, Glue) or equivalents on other cloud platforms is a strong plus.

Batch Processing & SQL:

Proficiency in Hadoop, HDFS, and Hive architecture. Expert knowledge of SQL tuning and data modeling methodologies.

Distributed Computing:

Deep understanding of Spark core architecture (RDD/DataFrame/Dataset) and proven experience in performance optimization.

OLAP Databases:

Expertise in using, optimizing, and managing columnar storage databases like ClickHouse.

Relational Databases:

Solid understanding of the design and optimization of mainstream RDBMS such as MySQL/PostgreSQL.

Real-time Processing:

Familiarity with real-time streaming technologies like Flink, Kafka, or Pulsar; practical implementation experience is highly desirable.

Programming:

Proficiency in at least one of the following: Python, Java, or Scala, with strong foundational programming skills.

Structure your enterprise decision system.