
Senior LLM Prompt Engineer

In all Media
Full-time
Remote
South America
Technology

📌 Job Title: LLM / Prompt Engineer – AI Stewardship Squad

Location: Remote from LATAM | Full-time contractor
Company: Inallmedia.com

🚀 About the Role

As part of Inallmedia's AI Stewardship Squad, you'll be responsible for turning state-of-the-art LLMs (such as GPT‑4‑turbo, Claude 3, or Llama 3) into production-ready, measurable features. Your mission will include selecting or fine-tuning foundation models, implementing scalable Retrieval-Augmented Generation (RAG) pipelines, versioning prompts, and evangelizing best practices in prompt engineering across product teams.

You’ll also play a key role in ensuring model reliability, safety, and auditability through prompt monitoring, bias mitigation, and secure deployment workflows.
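To make the day-to-day concrete, here is a deliberately minimal sketch of the kind of RAG flow described above: retrieve supporting passages, assemble a versioned prompt, call the model, and record basic metrics. Everything here (the toy embedding, the `generate` callable, the `PROMPT_V2` template) is an illustrative assumption, not Inallmedia's actual pipeline; in production the retriever would be a vector database query and the model a hosted LLM.

```python
# Minimal RAG sketch: retrieve supporting passages, assemble a versioned prompt,
# call a placeholder LLM client, and record basic latency metadata.
# All names (embed, generate, PROMPT_V2) are illustrative assumptions.
import time
from math import sqrt

def embed(text: str) -> list[float]:
    # Placeholder embedding: character-frequency vector. A real pipeline would
    # call an embedding model and store vectors in Pinecone/Weaviate/Milvus.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

PROMPT_V2 = (  # prompts are versioned artifacts, tracked in Git like code
    "Answer strictly from the context below.\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)

def answer(question: str, corpus: list[str], generate) -> dict:
    start = time.perf_counter()
    q_vec = embed(question)
    # Retrieve the two most similar passages (stand-in for vector search).
    top = sorted(corpus, key=lambda p: cosine(q_vec, embed(p)), reverse=True)[:2]
    prompt = PROMPT_V2.format(context="\n".join(top), question=question)
    completion = generate(prompt)  # swap in a GPT-4 / Claude / Llama client here
    return {
        "answer": completion,
        "grounding_passages": top,
        "latency_s": round(time.perf_counter() - start, 3),
        "prompt_version": "v2",
    }
```

With a stub such as `generate = lambda p: p[:80]`, the function runs end to end; the same shape plugs into LangChain or LlamaIndex retrievers and a hosted model client.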

🎯 Key Responsibilities

  • Select, fine-tune, or deploy LLMs for product-specific needs
  • Design and optimize RAG architectures with observability and vector search integration
  • Implement and monitor prompt versioning using version control and CI/CD pipelines (a minimal file-based sketch follows this list)
  • Track and analyze key metrics: hallucination rate, grounding score, latency, token cost
  • Detect and mitigate bias in prompts (cultural, gender, linguistic)
  • Support red teaming activities to assess model robustness and safety
  • Collaborate with MLOps and platform teams to deploy models in production environments
  • Promote internal standards for LLMOps, prompt governance, and model observability
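As a rough illustration of the prompt-versioning responsibility referenced in the list, one common pattern is to treat prompts as immutable, Git-tracked release files and make rollback a config change. The file layout and field names below are assumptions for the sake of the sketch, not a prescribed format.

```python
# Sketch of file-based prompt versioning: prompts live in Git as JSON, each
# release is immutable, and rollback is just pinning an earlier version.
import json
from pathlib import Path

PROMPT_DIR = Path("prompts/support_bot")  # e.g. prompts/support_bot/1.3.0.json

def load_prompt(version: str) -> dict:
    """Load one immutable prompt release, e.g. load_prompt('1.3.0')."""
    record = json.loads((PROMPT_DIR / f"{version}.json").read_text())
    assert {"template", "version", "changelog"} <= record.keys()
    return record

def render(version: str, **variables) -> str:
    # Rendering stays pure, so the same version + inputs always reproduce
    # the exact prompt that was evaluated in CI before rollout.
    return load_prompt(version)["template"].format(**variables)

# Rollback = change a single config value, no code deploy needed.
ACTIVE_PROMPT_VERSION = "1.3.0"  # CI gate: evals must pass before bumping
```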

🧠 Ideal Candidate

  • 2–4 years of hands-on experience with LLMs and Retrieval-Augmented Generation (RAG)
  • 5–8 years in machine learning or software engineering roles
  • Proven experience building RAG pipelines with LangChain or equivalent frameworks
  • Familiarity with vector databases such as Pinecone, Weaviate, or Milvus
  • Strong understanding of prompt lifecycle management: versioning, evaluation, rollback
  • Experience implementing metrics and monitoring pipelines: hallucination rate, grounding, token usage (a toy grounding-score example follows this list)
  • Demonstrated ability to detect and reduce prompt-related bias
  • Comfortable working with LLMs in production environments using CI/CD and containerization
  • Exposure to red teaming techniques for foundation model testing
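For candidates unfamiliar with the metrics wording, here is a toy version of a grounding score: the fraction of answer sentences that share enough content words with the retrieved context. Real evaluation would use an LLM judge or tooling such as TruLens or OpenAI Evals; the lexical heuristic and threshold below are assumptions for illustration only.

```python
# Toy grounding check: the fraction of answer sentences that share enough
# content words with the retrieved context. The overlap threshold is an
# arbitrary assumption; production pipelines use stronger judges.
import re

def _content_words(text: str) -> set[str]:
    return {w for w in re.findall(r"[a-z']+", text.lower()) if len(w) > 3}

def grounding_score(answer: str, context: str, min_overlap: int = 2) -> float:
    ctx_words = _content_words(context)
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    if not sentences:
        return 0.0
    grounded = sum(
        1 for s in sentences if len(_content_words(s) & ctx_words) >= min_overlap
    )
    return grounded / len(sentences)

# A batch hallucination rate is then 1 - mean(grounding_score) over an eval
# set, logged per prompt version so regressions surface in CI.
```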

πŸ› οΈ Recommended Stack

  • LLMs: GPT‑4/5, Claude, Gemini, Perplexity
  • Frameworks: LangChain, LlamaIndex, Hugging Face Transformers
  • Vector DBs: Pinecone, Weaviate, Milvus
  • Languages: Python
  • DevOps: GitHub Actions, GitLab CI, Docker, basic Kubernetes (for deployment)
  • Monitoring & Evaluation: OpenAI Evals, TruLens, Prometheus, OpenTelemetry
  • Infra: Azure, AWS, GCP

☁️ Infrastructure & Environment

  • 100% remote across LATAM
  • GitHub/GitLab for code and prompt versioning
  • CI/CD pipelines and Terraform-based IaC
  • Secure VPN/VPC access with MFA
  • Focus on AI safety: input validation, RBAC, and encryption at rest (a minimal input-validation sketch follows)
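As a simplistic illustration of the input-validation point above, the sketch below gates user text before it ever reaches a model: a length cap, control-character stripping, and a small denylist of obvious injection phrases. The limits and patterns are assumptions, not the team's actual policy, and they complement rather than replace RBAC, rate limiting, and model-side safety.

```python
# Naive pre-LLM input guard. Limits and patterns are illustrative only;
# production systems layer RBAC, rate limiting, and model-side safety on top.
import re

MAX_CHARS = 4_000
INJECTION_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"reveal (the )?system prompt",
]

def validate_user_input(text: str) -> str:
    # Strip non-printable control characters, then enforce basic limits.
    cleaned = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", text).strip()
    if not cleaned:
        raise ValueError("empty input")
    if len(cleaned) > MAX_CHARS:
        raise ValueError("input exceeds length limit")
    lowered = cleaned.lower()
    if any(re.search(p, lowered) for p in INJECTION_PATTERNS):
        raise ValueError("input flagged for review")
    return cleaned
```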