Job Description
- Design and manage robust data pipelines to support AI/ML systems.
- Ingest, clean, transform, and store large-scale data from various sources (APIs, files, streams).
- Build and optimize ETL/ELT workflows using tools like Airflow, dbt, or Spark.
- Work with structured, semi-structured, and unstructured data (JSON, audio, video, logs).
- Maintain data lakes, warehouses, and real-time streaming architectures.
- Ensure data quality, lineage, versioning, and schema evolution across the lifecycle.
- Integrate AI/ML-ready datasets into model training and inference pipelines.
- Secure sensitive data using access control, encryption, and compliance best practices.
- Collaborate with AI, product, and analytics teams to align data models with business goals.
- Monitor data systems for bottlenecks, latency, and reliability issues.
Job Information
Country
Worldwide
Industry
Technology
Job Type
Full-time / Part-time