LLM Data Engineer United States Fully Remote
We are seeking an experienced AI/LLM Data Engineer to build and maintain the data pipeline for our Generative AI platform. The ideal candidate will be well-versed in the latest Large Language Model (LLM) technologies and have a strong background in data engineering, with a focus on Retrieval-Augmented Generation (RAG) and knowledge-base techniques. This role sits in the AI COE within DX Tech & Digital. As an AI/LLM Data Engineer, you will report to the Director, AI Solutions & Development, who oversees the AI COE.
You will work on highly visible strategic projects, collaborating with cross-functional teams to define requirements and deliver high-quality AI solutions.
The ideal candidate will have a passion for Generative AI and LLMs, with a proven track record of delivering innovative AI applications.
Responsibilities
• Design, implement, and maintain an end-to-end multi-stage data pipeline for LLMs, including Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) data processes
• Identify, evaluate, and integrate diverse data sources and domains to support the Generative AI platform
• Develop and optimize data processing workflows for chunking, indexing, ingestion, and vectorization for both text and non-text data
• Benchmark and implement various vector stores, embedding techniques, and retrieval methods
• Create a flexible pipeline supporting multiple embedding algorithms, vector stores, and search types (e.g., vector search, hybrid search)
• Implement and maintain auto-tagging systems and data preparation processes for LLMs
• Develop tools for text and image data crawling, cleaning, and refinement
• Collaborate with cross-functional teams to ensure data quality and relevance for AI/ML models
• Work with data lakehouse architectures to optimize data storage and processing
• Integrate and optimize workflows using Snowflake and various vector store technologies
Requirements
• Master's degree in Computer Science, Data Science, or a related field
• 3-5 years of work experience in data engineering, preferably in AI/ML contexts
• Proficiency in Python, JSON, HTTP, and related tools
• Strong understanding of LLM architectures, training processes, and data requirements
• Experience with RAG systems, knowledge base construction, and vector databases
• Familiarity with embedding techniques, similarity search algorithms, and information retrieval concepts
• Hands-on experience with data cleaning, tagging, and annotation processes (both manual and automated)
• Knowledge of data crawling techniques and associated ethical considerations
• Strong problem-solving skills and ability to work in a fast-paced, innovative environment
• Familiarity with Snowflake and its integration in AI/ML pipelines
• Experience with various vector store technologies and their applications in AI
• Understanding of data lakehouse concepts and architectures
• Excellent communication, collaboration, and problem-solving skills
• Ability to translate business needs into technical solutions
• Passion for innovation and a commitment to ethical AI development
• Experience building LLM pipelines using frameworks such as LangChain, LlamaIndex, Semantic Kernel, and OpenAI functions
• Familiarity with LLM parameters such as temperature, top-k, and repeat penalty, and with metrics and methodologies for evaluating LLM outputs
Preferred Skills
• Experience with popular LLM/RAG frameworks
• Familiarity with distributed computing platforms (e.g., Apache Spark, Dask)
• Knowledge of data versioning and experiment tracking tools
• Experience with cloud platforms (AWS, GCP, or Azure) for large-scale data processing
• Understanding of data privacy and security best practices
• Practical experience implementing data lakehouse solutions
• Proficiency in optimizing queries and data processes in Snowflake or Databricks
• Hands-on experience with different vector store technologies
Benefits
• US employee benefits package