Senior Big Data Engineer – AI-Focused Data

Apply Before: 2025-07-31

Job Summary

No. of Vacancy: N/A
Job Type: Full Time
Offered Salary: Negotiable
Gender: Any
Career Level: Top Level

Job Description

We are seeking a Senior Big Data Engineer with a strong background in managing structured and unstructured data pipelines who thrives in a fast-paced, AI-focused environment. You will be instrumental in building and scaling our data lake architecture, which supports a system designed to fuel intelligent AI agents for data collection, labeling, and analytical reasoning. The role includes integrating vector databases and optimizing retrieval-augmented generation (RAG) workflows deployed on AWS Bedrock and other AI stacks.

Design and implement scalable ingestion pipelines for structured/unstructured data using AWS and Databricks Unity Catalog.
Build and maintain high-throughput ETL/ELT pipelines with Apache Airflow and Databricks.
Architect and manage data modeling, storage, and indexing strategies in PostgreSQL and RDS, ensuring compatibility with AI retrieval systems.
Integrate and manage vector databases to support fast semantic and embedding-based search in RAG pipelines.
Collaborate with AI engineers to ensure seamless compatibility with LangGraph and LangSmith agent systems.
Implement robust data validation, lineage, and governance systems using Unity Catalog.
Optimize performance across distributed compute environments (Databricks, EC2).
Deploy and maintain Lambda-based microservices for scalable, real-time data ingestion and enrichment.

Job Specification

Required Skills

5+ years working with big data systems in production environments.
Proven expertise with Databricks, Unity Catalog, and Apache Spark.
Proficiency in Airflow, the AWS stack (Lambda, EC2, RDS), and cloud-based data lake architectures.
Strong SQL and database design skills (PostgreSQL preferred).
Working knowledge of vector databases (Chroma, Pinecone, FAISS).
Solid understanding of data lifecycle management in ML/AI contexts.
Bonus: Familiarity with LangGraph, LangSmith, LangChain, or similar agent orchestration tools.

Preferred Qualifications

Experience with AI agent pipelines or large-scale ML model support.
Emphasis on data observability, security, and lineage tracking.
Hands-on experience with RAG architecture, including vector storage and semantic retrieval.
Exposure to AWS Bedrock and model deployment orchestration.

IT Outsourcing Company

IT project outsourcing company
