Swift Engineer (5+ YOE) – AI / LLM Code Evaluation (Remote, Contract)

Posted 2026-05-06
Remote · Contract · Immediate start

Company: Mercor.
Type: Contract (Full-time or Part-time).
Location: Remote (Worldwide).
Language: Professional English required.

    Compensation:
  • $30–$90/hour USD (depending on experience & evaluation performance).
  • Weekly payments via Stripe or Wise.
  • Flexible workload (project-based, scalable hours).
    Mission:
  • Work directly with leading AI teams to improve how large language models reason about code, systems design, and technical problem-solving.
  • You will evaluate and refine AI-generated responses, making them more accurate, reliable, and aligned with real-world engineering standards.
    Responsibilities:
  • Evaluate AI-generated answers to coding and system design problems.
  • Execute and validate code outputs.
  • Identify bugs, inefficiencies, and incorrect reasoning.
  • Assess code quality & readability.
  • Assess algorithmic correctness.
  • Assess system design logic.
  • Annotate responses with structured, actionable feedback.
  • Follow defined evaluation frameworks and quality benchmarks.
    Required Skills:
  Core:
  • Swift (expert level).
  • Software Engineering (5+ years).
  • Data Structures & Algorithms.
  • Systems Design.
  • Debugging & Code Review.
  • Problem Solving (Medium–Hard level).
  Technical:
  • Code Execution & Testing.
  • API Design & Backend Logic.
  • Performance Optimization.
  • Version Control (Git).
  AI / Evaluation Context:
  • Experience using LLMs in development workflows.
  • Ability to evaluate reasoning, not just outputs.
    Nice-to-Have Skills:
  • RLHF / AI Model Evaluation.
  • Competitive Programming.
  • Open-source contributions (merged PRs).
  • Multi-language experience (Python, JS, etc.).
  • Technical writing / explaining complex concepts.
    Ideal Candidate:
  • Degree in Computer Science or related field (BS/MS/PhD).
  • Strong real-world engineering background.
  • Detail-oriented and highly analytical.
  • Comfortable identifying subtle logic flaws and edge cases.
  • Able to work independently in async environments.
    What You Will Achieve:
  • Improve the quality and reasoning of AI-generated code.
  • Influence how AI systems assist developers globally.
  • Deliver high-quality evaluation outputs that directly impact model performance.


