Job Title: LLM / Prompt Engineer – AI Stewardship Squad
Location: Remote from LATAM | Full-time contractor
Company: Inallmedia.com
About the Role
As part of Inallmedia's AI Stewardship Squad, you'll be responsible for turning state-of-the-art LLMs (such as GPT-4 Turbo, Claude 3, or Llama 3) into production-ready, measurable features. Your mission will include selecting or fine-tuning foundation models, implementing scalable Retrieval-Augmented Generation (RAG) pipelines, versioning prompts, and evangelizing best practices in prompt engineering across product teams.
You'll also play a key role in ensuring model reliability, safety, and auditability through prompt monitoring, bias mitigation, and secure deployment workflows.
Key Responsibilities
- Select, fine-tune, or deploy LLMs for product-specific needs
- Design and optimize RAG architectures with observability and vector search integration
- Implement and monitor prompt versioning using version control and CI/CD pipelines
- Track and analyze key metrics: hallucination rate, grounding score, latency, token cost
- Detect and mitigate bias in prompts (cultural, gender, linguistic)
- Support red teaming activities to assess model robustness and safety
- Collaborate with MLOps and platform teams to deploy models in production environments
- Promote internal standards for LLMOps, prompt governance, and model observability
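To make the RAG responsibilities concrete, here is a minimal, dependency-free sketch of the retrieval-and-grounding step. The `TinyVectorStore` class, the hard-coded two-dimensional embeddings, and the `build_grounded_prompt` helper are all illustrative assumptions; in production this role would use LangChain with a managed vector database such as Pinecone, Weaviate, or Milvus.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class TinyVectorStore:
    """Toy in-memory stand-in for a vector database."""

    def __init__(self):
        self.docs = []  # list of (embedding, text) pairs

    def add(self, embedding, text):
        self.docs.append((embedding, text))

    def top_k(self, query_embedding, k=2):
        # Rank stored chunks by similarity to the query and return the best k.
        ranked = sorted(self.docs,
                        key=lambda d: cosine(d[0], query_embedding),
                        reverse=True)
        return [text for _, text in ranked[:k]]

def build_grounded_prompt(question, context_chunks):
    """Assemble the augmented prompt sent to the LLM (the 'G' in RAG)."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return ("Answer using ONLY the context below. If the answer is not in "
            f"the context, say so.\n\nContext:\n{context}\n\n"
            f"Question: {question}")

store = TinyVectorStore()
store.add([1.0, 0.0], "Invoices are processed within 30 days.")
store.add([0.0, 1.0], "Support is available 24/7 via chat.")

# Retrieve the most relevant chunk and build the grounded prompt.
chunks = store.top_k([0.9, 0.1], k=1)
prompt = build_grounded_prompt("How long does invoice processing take?", chunks)
```

The same shape (embed, retrieve, augment, generate) carries over when the toy store is swapped for a real vector database and the prompt is sent to a hosted model.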
Ideal Candidate
- 2–4 years of hands-on experience with LLMs and Retrieval-Augmented Generation (RAG)
- 5–8 years in machine learning or software engineering roles
- Proven experience building RAG pipelines with LangChain or equivalent frameworks
- Familiarity with vector databases such as Pinecone, Weaviate, or Milvus
- Strong understanding of prompt lifecycle management: versioning, evaluation, rollback
- Experience implementing metrics and monitoring pipelines: hallucination rate, grounding, token usage
- Demonstrated ability to detect and reduce prompt-related bias
- Comfortable working with LLMs in production environments using CI/CD and containerization
- Exposure to red teaming techniques for foundation model testing
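The prompt lifecycle management expectation (versioning, evaluation, rollback) can be sketched as a tiny in-memory registry. The `PromptRegistry` class below is a hypothetical illustration; in practice these responsibilities would be backed by Git and CI/CD pipelines rather than a Python dict.

```python
class PromptRegistry:
    """Toy registry illustrating prompt versioning with rollback."""

    def __init__(self):
        self._versions = {}  # prompt name -> list of (version, template)
        self._active = {}    # prompt name -> index of the active version

    def register(self, name, template):
        """Store a new template version and make it active."""
        history = self._versions.setdefault(name, [])
        version = f"v{len(history) + 1}"
        history.append((version, template))
        self._active[name] = len(history) - 1
        return version

    def active(self, name):
        """Return the (version, template) pair currently in use."""
        version, template = self._versions[name][self._active[name]]
        return version, template

    def rollback(self, name):
        """Revert to the previous version, e.g. after a failed evaluation."""
        if self._active[name] == 0:
            raise ValueError(f"{name} has no earlier version")
        self._active[name] -= 1
        return self.active(name)

reg = PromptRegistry()
reg.register("summarize", "Summarize: {text}")
reg.register("summarize", "Summarize in 3 bullets: {text}")

# A regression in evaluation metrics would trigger a rollback like this:
version, template = reg.rollback("summarize")
```

In a CI/CD setting, the `register` step maps to merging a prompt change, evaluation gates decide whether it stays active, and `rollback` maps to reverting the commit.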
Recommended Stack
- LLMs: GPT-4/5, Claude, Gemini, Perplexity
- Frameworks: LangChain, LlamaIndex, Hugging Face Transformers
- Vector DBs: Pinecone, Weaviate, Milvus
- Languages: Python
- DevOps: GitHub Actions, GitLab CI, Docker, Kubernetes (basic for deployment)
- Monitoring & Evaluation: OpenAI Evals, TruLens, Prometheus, OpenTelemetry
- Infra: Azure, AWS, GCP
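As a flavor of the evaluation side of the stack, here is one crude grounding metric: the fraction of answer tokens that also appear in the retrieved context. This token-overlap heuristic is an illustrative assumption only; tools such as TruLens and OpenAI Evals compute far more robust variants.

```python
def grounding_score(answer: str, context: str) -> float:
    """Fraction of answer tokens that appear in the retrieved context.

    A score near 1.0 suggests the answer is grounded in the context;
    a low score can flag a possible hallucination for review.
    """
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

score = grounding_score(
    "invoices take 30 days",
    "invoices are processed within 30 days",
)
# 3 of the 4 answer tokens appear in the context, so the score is 0.75
```

Metrics like this would be logged alongside latency and token cost, and exported through Prometheus or OpenTelemetry for dashboards and alerting.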
Infrastructure & Environment
- 100% remote across LATAM
- GitHub/GitLab for code and prompt versioning
- CI/CD pipelines and Terraform-based IaC
- Secure VPN/VPC access with MFA
- Focus on AI safety: input validation, RBAC, encryption-at-rest