Talent.com
AI Engineer, Evaluation and Reliability

Mice Groups • Redwood City, CA, US
Job type: Permanent

Senior Engineer, AI Evaluation and Reliability / Contract-to-Hire or Direct Hire / Redwood City / Hybrid, onsite 3 days per week / This position pays $70-80/hr. W2 for contract, $140-190K annually upon conversion / US Citizens and Green Card holders only

Summary: Our client is looking for a Senior Engineer, AI Evaluation & Reliability to lead the design and execution of evaluation, quality assurance, and release gating for our agentic AI features. You'll develop the pipelines, datasets, and dashboards that measure and improve agent performance across real-world SOC workflows, ensuring every release is safe, reliable, efficient, and production-ready. You will guarantee that our agentic AI features operate at full production scale, ingesting and acting on millions of SOC alerts per day, with measurable impact on analyst productivity and risk mitigation. This role partners closely with the Product team to deliver operational excellence and trust in every AI-driven capability.

Responsibilities:
  • Define quality metrics: Translate SOC use cases into measurable KPIs (e.g., precision/recall, MTTR, false-positive rate, step success, latency/cost budgets).
  • Build continuous evaluations: Develop offline/online evaluation pipelines, regression suites, and A/B or canary tests; integrate them into CI/CD for release gating.
  • Curate and manage datasets: Maintain gold-standard datasets and red-team scenarios; establish data governance and drift monitoring practices.
  • Ensure safety, reliability, and explainability: Partner with Platform and Security Research to encode guardrails, policy enforcement, and runtime safety checks. Expand adversarial test coverage (prompt injection, data exfiltration, abuse scenarios). Ensure explainability and auditability of agent decisions, maintaining traceability and compliance of AI-driven workflows.
  • Production reliability & observability: Monitor and maintain reliability of agentic AI features post-release: define and uphold SLIs/SLOs, establish alerting and rollback strategies, and conduct incident post-mortems. Design and implement infrastructure to scale evaluation and production pipelines for real-time SOC workflows across cloud environments.
  • Drive agentic system engineering: Experiment with multi-agent systems, tool-using language models, retrieval-augmented workflows, and prompt orchestration. Manage the model and prompt lifecycle: track versions, rollout strategies, and fallbacks; measure impact through statistically sound experiments.
  • Collaborate cross-functionally: Work with Product, UX, and Engineering to prioritize high-leverage improvements, resolve regressions quickly, and advance overall system reliability.

Required Skills:
  • 6+ years building evaluation or testing infrastructure for ML/LLM systems or large-scale distributed systems.
  • Proven ability to translate product requirements into measurable metrics and test plans.
  • Strong Python skills.
  • Strong experience with modern data tooling.
  • Hands-on experience running A/B tests, canaries, or experiment frameworks.
  • Experience defining and maintaining operational reliability metrics (SLIs/SLOs) for AI-driven systems.
  • Familiarity with large-scale distributed or streaming systems serving AI/agent workflows (millions of events or alerts per day).
  • Excellent communication skills; able to clearly convey technical results and trade-offs to engineers, PMs, and analysts.

Pay for this position is based on market location and may vary depending on job-related knowledge, skills, and experience. As a contractor you may also be eligible for benefits such as health, dental, and vision coverage, as well as access to a 401K plan.
A sign-on payment and restricted stock units may be provided as part of the compensation package, in addition to a full range of medical, financial, and/or other benefits, dependent on the position offered by our client.

Applicants should apply via The Mice Groups Inc. website (www.micegroups.com) or through this careers site posting.

We are an equal opportunity employer and value diversity at The Mice Groups Inc. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records. Pursuant to the Los Angeles Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

The Mice Groups Inc. values your privacy. Please consult our Candidate Privacy Notice for information about how we collect, use, and disclose personal information of our candidates.

Privacy Policy

One of the basic principles The Mice Groups follows in designing and operating this website is that we ask for only the information we need to provide the service you’ve requested. The Mice Groups does not currently collect personally identifying information via its website except (i) to the extent that you provide this information in an online job application and (ii) to the extent that your web browser provides personally identifying information. The Mice Groups will use your personally identifying information solely for the purpose for which you submitted the information. The Mice Groups may, however, aggregate certain elements of your personally identifying information with the information of other users of our website to analyze the usefulness and popularity of various web pages on its website. The Mice Groups reserves the right to change this policy at any time by posting a new privacy policy at this location.

Questions regarding this statement should be directed to info@micegroups.com.

Related jobs
AI Research Engineer, Enterprise Evaluations

Scale AI • San Francisco, CA, United States
Job type: Full-time
AI Research Engineer, Enterprise Evaluations. Scale AI is seeking a technically rigorous and driven. This high-impact role is critical to our mission of delivering the industry's leading. You will be ...
Lead Generative AI Engineer

Madison-Davis, LLC • San Francisco, CA, United States
Job type: Full-time
We’re supporting a major global financial technology organization that’s making significant investments in AI innovation. They’re scaling their engineering teams across North America to drive develo...
Applied AI Engineer – Generative AI

Kodiak • San Francisco, CA, United States
Job type: Full-time
The company has developed an artificial intelligence (AI) powered technology stack purpose-built for commercial trucking and the public sector. The company delivers freight daily for its customers a...
AI Engineer

Factory • San Francisco, CA, United States
Job type: Full-time
Factory is looking for innovative AI Engineers to build and evolve cutting-edge AI systems that transform how software organizations accelerate their productivity and innovation. Innovate at the for...
Applied AI Engineer : Lead LLM Agent Evaluations (Remote)

Canvas Medical • San Francisco, CA, United States
Job type: Full-time • Remote
A healthcare technology firm is looking for an Applied AI Software Engineer to lead evaluations of AI agents in clinical operations. The ideal candidate has over 5 years of experience in machine lea...
AI Research Engineer, Enterprise Evaluations

Scale AI, Inc. • San Francisco, CA, United States
Job type: Full-time
Scale AI is seeking a technically rigorous and driven. This high-impact role is critical to our mission of delivering the industry's leading. You will be a hands-on contributor to the core systems th...
Lead AI Engineer

Woebot • San Francisco, CA, United States
Job type: Full-time
We’re a mission-driven startup reinventing the way people find peace and inspiration through dazzling digital experiences. Our team blends engineering, design, and product minds with a shared passio...
AI / Generative AI Engineer

Andiamo • San Francisco, CA, United States
Job type: Permanent
Generative AI Engineer - Pre-IPO Tech Company. AI-powered applications at a high-growth, pre-IPO technology company. You will work on app...
Founding AI Engineer

HartleyCo • San Francisco, CA, United States
Job type: Full-time
This range is provided by HartleyCo. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more. Direct message the job poster from HartleyCo. Headhunter & Di...
Applied AI Engineer at Innovative AI simulation startup

Jack & Jill / External ATS • San Francisco, California, USA
Job type: Full-time
This is a job that we are recruiting for on behalf of one of our customers. He's an AI agent that sends you unmissable jobs and then helps you ace the interview. He'll make sure you are considered for ...
Founding AI Engineer

Second Nature Computing • San Francisco, CA, United States
Job type: Full-time
San Francisco, CA (HQ), in-person most days with flexibility when you need it. Must be authorized to work in the U. We believe everyday life shouldn’t have us juggling apps. Second Nature Computing i...
Lead AI Engineer

1Five • San Francisco, CA, United States
Job type: Full-time
This is a leadership role at the intersection of AI, technical architecture, and company vision. ML engineering and model development. Backflip’s core model, including architecture, data, training, a...
AI Engineer

Anything • San Francisco, California, USA
Job type: Full-time
Anything is the AI product engineer for the next wave of entrepreneurs. It's an AI agent that turns English into apps. Everything you need make money on the internet built in - mobile web design AI b...
Lead AI Engineer

Backflip • San Francisco, CA, United States
Job type: Full-time
Mechanical design, the work done in CAD, is the rate-limiter for progress in the physical world. However, there are only 2-4 million people on Earth who know how to CAD. But what if hundreds of milli...
Research Engineer : AI for Production & Reliability

Resolve.Ai • San Francisco, CA, United States
Job type: Full-time
Join a forward-thinking company at the forefront of AI innovation! As part of a dynamic team, you'll develop cutting-edge AI workflows that transform software engineering and production systems. You...
Principal Engineer, AI Behaviors - Remote Lead & Innovate

Protingent • Hillsborough, CA, United States
Job type: Full-time • Remote
A leading staffing firm is seeking a Principal Engineer specializing in behaviors to drive team and technology direction. This remote position requires a strong background in Machine Learning and so...
Senior AI Engineer — Agentic LLMs Lead & Mentor (Remote)

LiveRamp • San Francisco, CA, United States
Job type: Full-time • Remote
A leading data collaboration platform in San Francisco is seeking a Senior Staff AI Engineer to lead the development of advanced AI agents. This role entails mentoring a talented team and collaborat...
Applied Evals Engineer — Build Reliable AI Pipelines

OpenAI • San Francisco, CA, United States
Job type: Full-time
A leading AI research company in San Francisco is seeking a Software Engineer to design and build evaluation systems for advanced AI. The role involves collaboration across research and product team...