AI Model Evaluation Specialist

Inizio Partners • New York, NY, United States



Job type: Full-time


About the job: AI Model Evaluation Specialist

Key Responsibilities:

Perform scoring and qualitative evaluations of LLM-generated responses across multiple use cases.
Develop and maintain scoring guidelines and rubrics to ensure consistency and objectivity.
Collaborate with data scientists, product managers, and engineering teams to align scoring with project goals.
Assist in the creation and labeling of high-quality evaluation datasets for prompt tuning or model fine-tuning.
Utilize NLP-based metrics and tools (e.g., ROUGE, BLEU, cosine similarity) for automated scoring support.
Document scoring patterns, common model errors, and improvement opportunities.
Contribute to prompt experimentation and help compare the effectiveness of different prompt strategies.

Qualifications:

Prior experience with LLMs (e.g., GPT, Claude, LLaMA) or AI / NLP projects is highly preferred.
Strong analytical skills and attention to detail, especially in assessing language quality.
Familiarity with prompt engineering, generative AI, or conversational AI tools is a plus.
Hands-on experience with Python, Jupyter, or evaluation libraries (optional but desirable).
Experience working with evaluation frameworks or annotation tools (Label Studio, Prodigy, etc.) is a bonus.
Excellent written and verbal communication skills.

