Swift Engineer (5+ YOE) – AI / LLM Code Evaluation (Remote, Contract)
Posted 2026-05-06
Immediate Start
Company: Mercor.
Type: Contract (Full-time or Part-time).
Location: Remote (Worldwide).
Language: Professional English required.
- Compensation:
- USD $30–$90/hour (depending on experience & evaluation performance).
- Weekly payments via Stripe or Wise.
- Flexible workload (project-based, scalable hours).
- Mission:
- Work directly with leading AI teams to improve how large language models reason about code, systems design, and technical problem-solving.
- You will evaluate and refine AI-generated responses, making them more accurate, reliable, and aligned with real-world engineering standards.
- Responsibilities:
- Evaluate AI-generated answers to coding and system design problems.
- Execute and validate code outputs.
- Identify bugs, inefficiencies, and incorrect reasoning (see the illustrative sketch after this list).
- Assess code quality & readability.
- Assess algorithmic correctness.
- Assess system design logic.
- Annotate responses with structured, actionable feedback.
- Follow defined evaluation frameworks and quality benchmarks.
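To give a feel for this work, here is a purely illustrative sketch (the function names, scenario, and bug are hypothetical, not taken from any actual evaluation task): an AI-generated binary search whose loop condition skips the final candidate index, followed by the correction and test case an evaluator might include in structured feedback.

```swift
// Hypothetical AI-generated Swift an evaluator might review.
// Claim: returns the index of `target` in a sorted array, or nil if absent.
func findIndexBuggy(of target: Int, in sorted: [Int]) -> Int? {
    var low = 0
    var high = sorted.count - 1
    while low < high {                   // Bug: `<` excludes the case low == high,
        let mid = (low + high) / 2       //      so the last remaining candidate is never checked.
        if sorted[mid] == target { return mid }
        if sorted[mid] < target { low = mid + 1 } else { high = mid - 1 }
    }
    return nil
}

// Correction an evaluator might propose alongside their annotation.
func findIndex(of target: Int, in sorted: [Int]) -> Int? {
    var low = 0
    var high = sorted.count - 1
    while low <= high {                  // `<=` keeps the final candidate in range
        let mid = low + (high - low) / 2 // avoids overflow for very large index sums
        if sorted[mid] == target { return mid }
        if sorted[mid] < target { low = mid + 1 } else { high = mid - 1 }
    }
    return nil
}

// A minimal failing case that demonstrates the flaw:
assert(findIndexBuggy(of: 7, in: [7]) == nil) // buggy version misses a valid target
assert(findIndex(of: 7, in: [7]) == 0)        // corrected version finds it
```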
- Required Skills:
- Core:
- Swift (expert level).
- Software Engineering (5+ years).
- Data Structures & Algorithms.
- Systems Design.
- Debugging & Code Review.
- Problem Solving (Medium–Hard level).
- Technical:
- Code Execution & Testing.
- API Design & Backend Logic.
- Performance Optimization.
- Version Control (Git).
- AI / Evaluation Context:
- Experience using LLMs in development workflows.
- Ability to evaluate reasoning, not just outputs.
- Nice-to-Have Skills:
- RLHF / AI Model Evaluation.
- Competitive Programming.
- Open-source contributions (merged PRs).
- Multi-language experience (Python, JS, etc.).
- Technical writing / explaining complex concepts.
- Ideal Candidate:
- Degree in Computer Science or related field (BS/MS/PhD).
- Strong real-world engineering background.
- Detail-oriented and highly analytical.
- Comfortable identifying subtle logic flaws and edge cases.
- Able to work independently in async environments.
- What You Will Achieve:
- Improve the quality and reasoning of AI-generated code.
- Influence how AI systems assist developers globally.
- Deliver high-quality evaluation outputs that directly impact model performance.