Job Description
This is a remote position.
Job Summary
Our customer is one of the world’s fastest-growing AI companies, accelerating the advancement and deployment of powerful AI systems.
They help customers in two ways: working with the world’s leading AI labs to advance frontier model capabilities in thinking, reasoning, coding, agentic behavior, multimodality, multilinguality, STEM, and frontier knowledge; and leveraging that work to build real-world AI systems that address mission-critical priorities for companies.
As a Software Engineering Evaluator, you will create cutting-edge datasets for training, benchmarking, and advancing large language models, collaborating closely with researchers. This includes curating code examples, providing precise solutions, and making corrections in Python, JavaScript (including ReactJS), C/C++, Java, Rust, and Go; evaluating and refining AI-generated code for efficiency, scalability, and reliability; and working with cross-functional teams to enhance enterprise-level AI-driven coding solutions.
Job Responsibilities
- Work on AI model training initiatives by curating code examples, building solutions, and correcting code in Python, JavaScript (including ReactJS), C/C++, Java, Rust, and Go.
- Evaluate and refine AI-generated code to ensure that it is efficient, scalable, and reliable.
- Collaborate with cross-functional teams to improve AI-driven coding solutions, measured against industry performance benchmarks.
- Build agents that can verify the quality of the code and identify error patterns.
- Form hypotheses about the steps in the software engineering cycle (prototyping, architecture design, API design, production implementation, launch, experiments, monitoring, operational maintenance) and evaluate model capabilities on each of them.
- Design verification mechanisms that can automatically verify a solution to a software engineering task.
Essential Skills
Required Skills
- Several years of software engineering experience (5+ years), including 2+ years of continuous full-time experience at a top-tier product company (e.g., Google, Stripe, Amazon, Apple, Meta, Netflix, Microsoft, Datadog, Dropbox, Shopify, PayPal, IBM Research).
- Strong expertise in building full-stack applications and deploying scalable, production-grade software using modern languages and tools.
- Deep understanding of software architecture, design, development, debugging, and code quality / review assessment.
- Excellent oral and written communication skills for clear, structured evaluation rationales.
Other Details
- Location: work fully remotely from the US, Canada, Australia, Singapore, or Western Europe (UK, France, Germany, Switzerland, Denmark, Finland, Netherlands, Sweden, Iceland, Italy, Austria, Ireland, Norway).
- Commitment: flexible engagement, minimum 10 hrs/week, up to 40 hrs/week (partial PST overlap required).
- Type: Contractor (no medical / paid leave).
- Duration: 1 month (starting next week; potential extensions based on performance and fit).
- Background check required.
Hiring Process
The selection process involves:
1. ICF (Candidate Interest Form)
2. Vetsmith Automated Coding Challenge (30–45 mins)
3. AI Interview on Qode (20 mins) after passing the coding challenge.
LinkedIn requirement: Profile authenticity will be checked, including the date the LinkedIn profile was created, the number of connections, and recent activity. Ensure your LinkedIn profile is up to date and that the profile link is included on your resume.
Important: The submitted resume must be in PDF format only.
Requirements
Full-stack, Python, Java, JavaScript, TypeScript, Go, Rust, C++, ReactJS