Full-time
Description
About the Role -
We’re seeking a Data Scientist for Agentic AI to help build AI agents that drive yield improvement, root-cause discovery, and process optimization for leading semiconductor companies.
You’ll combine modern data science and machine learning with LLM- and agent-based techniques to power the next generation of agentic data systems in semiconductor manufacturing and product engineering. Your work will turn messy, high-dimensional data into agents that can continuously monitor, diagnose, and recommend actions across the semiconductor lifecycle.
You will work closely with process, product, and test engineers, as well as data engineers and platform developers, to ingest complex data across the semiconductor lifecycle and build analytical pipelines, models, and agents that surface insights and take action automatically.
Direct semiconductor yield and process experience is not required, but familiarity with yield analysis and semiconductor manufacturing concepts is a strong plus.
Requirements
Key Responsibilities -
Data Integration & Management
- Ingest and unify data from diverse sources across the semiconductor lifecycle (process, test, assembly, and field / system-level data).
- Handle structured data (SQL, CSV, metrology logs, time-series) and unstructured data (reports, images, logs).
- Develop pipelines to clean, align, and normalize large datasets across manufacturing stages and product lines.
Analytics, Modeling & Agent Intelligence
- Design and implement algorithms for excursion and anomaly detection, trend monitoring, and yield / quality loss pattern recognition.
- Perform advanced correlation analysis across process parameters, test metrics, and design variables.
- Support root-cause analysis by developing interpretable statistical and machine learning models to identify likely drivers of excursions and quality drift.
- Develop feature extraction and dimensionality-reduction methods suitable for high-dimensional industrial and manufacturing data.
- Collaborate with ML and agent engineers to design agents that can call tools, query data, run analyses, and iteratively refine their own hypotheses.
Visualization & Insight Delivery
- Create interactive dashboards and visualizations (e.g., wafer maps, trend charts, Pareto analyses) to communicate findings to process and product engineers.
- Work with UX and platform teams to define how analytics outputs are surfaced to human users and to AI agents.
- Help design human-in-the-loop review and feedback flows so engineers can guide and improve agent behavior over time.
Research, Innovation & Agentic AI
- Evaluate and implement modern AI / ML approaches for yield optimization, root-cause analysis, and reliability monitoring (e.g., graph-based methods, representation learning, LLM-assisted analysis).
- Prototype and evaluate agents that combine LLM reasoning with structured analytics, simulation tools, and domain-specific algorithms.
- Contribute ideas and feedback into our agent evaluation, safety, and observability frameworks.
End-to-End Productionization of AI Agents
- Translate offline analyses and models into production-grade code that powers our agentic data systems.
- Design, implement, and maintain data and inference pipelines that connect enterprise data sources to AI agents.
- Work with platform and infrastructure teams to deploy, monitor, and continuously improve AI agents in real-world semiconductor operations.
Education -
- M.S. or Ph.D. in Computer Science, Electrical Engineering, Applied Physics, Statistics, Applied Mathematics, or a related quantitative field (or equivalent practical experience).
Experience & Skills
- 4+ years of experience in applied data science, machine learning, or advanced analytics.
- Experience working with large, complex, multi-source datasets (e.g., time-series, sensor, manufacturing, or hardware data).
- Proficiency in Python (e.g., pandas, scikit-learn, NumPy) and common ML / data science workflows.
- Strong experience with databases and querying (SQL and NoSQL).
- Experience with data visualization tools and frameworks (e.g., Plotly or similar).
- Strong grasp of statistical methods such as outlier detection, correlation analysis, regression, clustering, causal inference, and time-series analysis.
Soft Skills
- Comfortable working with both engineering and data teams, as well as domain experts.
- Strong analytical thinking, curiosity, and ability to explain complex findings clearly to non-specialists.
- Practical mindset: focused on building usable analytics and agents that deliver real operational impact.
Preferred Qualifications -
- Familiarity with semiconductor yield analysis and manufacturing processes (e.g., yield metrics, bin analysis, wafer maps, process flows).
- Experience with wafer-level and bin-level data analysis and wafer map pattern recognition.
- Experience using or developing yield management systems (YMS) or manufacturing data platforms.
- Experience with multi-modal or foundation models (LLMs, VLMs) applied to industrial or manufacturing data.
- Exposure to LLM-assisted data analysis, AI-based anomaly detection, or agent-based systems in industrial contexts.
- Experience with cloud data environments (e.g., GCP, AWS, Databricks) and modern data pipelines / orchestration (e.g., Airflow, Argo, dbt, Beam, Dagster).
About Emergence AI
Emergence AI builds an agentic data platform that lets enterprises create, orchestrate, and operate networks of AI agents across their data and systems. Our platform focuses on automating complex, mission-critical data workflows — from ingestion and transformation through analysis, decisioning, and action.
We work with customers in semiconductors, life sciences, and other data-intensive industries to turn fragmented, high-stakes data into trustworthy, continuously operating AI agents.
Why Join Us
- Build real AI agents, not just dashboards. You’ll work on agents that plan, reason, and act across real production environments (not toy demos) with a direct line to customer impact.
- Work at the intersection of agentic AI and semiconductors. Help some of the world’s most advanced semiconductor companies flag issues earlier, accelerate root-cause analysis, and optimize yield and quality by giving their engineers powerful autonomous assistants.
- Own problems end-to-end. You’ll have the autonomy to take problems from messy data and ambiguous requirements all the way through modeling, deployment, and iteration in production.
- Join a small, senior team. Collaborate with experienced engineers, researchers, and product leaders who are deeply invested in pushing the state of the art in agentic AI and shipping it responsibly.
- Shape the platform. Your work will directly influence how our agentic data platform evolves: the types of agents we build, the data abstractions we expose, and the capabilities we deliver to customers.
If you’re excited about building AI agents that operate on real, high-stakes data, and want to help redefine how semiconductor companies understand and act on their data, we’d love to talk.