Talent.com

Infrastructure engineer • Richmond, CA

Infrastructure Engineer

FAR.AI • Berkeley, California, United States
Full-time
FAR.AI is a non-profit AI research institute dedicated to ensuring advanced AI is safe and beneficial for everyone. Our mission is to facilitate breakthrough AI safety research, advance global understan...

Senior Software Engineer, Compute Infrastructure

QxBranch • Berkeley, CA
Full-time
Rigetti Computing is building the world’s most powerful computers to solve humanity’s most pressing problems. We believe this technology will fundamentally change the world for the better and will a...

Lead Infrastructure DevOps Engineers - 24 months contract

Resource Informatics Group Inc • Oakland, CA, US
Temporary
This position is with a healthcare client that our partner has had a lot of success with in the past. MUST be local to the Bay Area (commuting distance to Oakland) – they are working remote right now but...

Assistant or Associate Professor - Integrated, Autonomous Construction and Infrastructure

InsideHigherEd • Berkeley, California, United States
Full-time
Assistant or Associate Professor - Integrated, Autonomous Construction and Infrastructure - Civil and Environmental Engineering. Assistant Professor in Civil and Environmental Engineering / Associate...

Machine Learning Infrastructure Engineer

Astera Institute • Emeryville, CA, US
Temporary
The Diffuse Project is dedicated to advancing our understanding of protein motion through the use of diffuse scattering – a signal in X-ray crystallography that is currently under-utilized or...

Infrastructure & Systems Engineer

VIGILENT CORPORATION • Oakland, CA, US
Full-time +1
Vigilent is looking for world-class talent to help us achieve our mission of improving facility operations while creating a more sustainable planet. Vigilent applies machine learning, AI and expert ...

Staff AI Infrastructure Engineer (Software Engineering Focus)

WEX • Bay Area, CA
Full-time
This is a remote position; however, the candidate must reside within 30 miles of one of the following locations: Portland, ME; Boston, MA; Chicago, IL; San Francisco Bay Area, CA; Dallas, TX; and S...

Senior Project Manager - Infrastructure

Samprasoft • Oakland, CA, US
Full-time
The selected consultant must be able to attend an onsite orientation and work onsite in Oakland, California when regular operations resume.

Project Manager - Electric Vehicle Infrastructure

Jacobs Solutions • Oakland, CA, US
Full-time
Project Manager Electric Vehicle Infrastructure. At Jacobs, we're challenging today to reinvent tomorrow by solving the world's most critical problems for thriving cities, resilient environments, mi...

Integrations and Infrastructure Product Team Senior Manager

Boeing • Berkeley, California, USA
Full-time +2
Integrations and Infrastructure Product Team Senior Manager. Boeing Defense Space and Security (BDS) Sapphire is looking for a dynamic Integrations and Infrastructure Product Team Senior Manager. Ber...

Senior Infrastructure Engineer

Crunchbase • California, United States
Full-time
Crunchbase helps over 75 million people around the world connect with the companies and people that matter. Powered by best-in-class proprietary data, Crunchbase is democratizing access to opportuni...

DevOps Infrastructure Engineer IV

Great Places to Work • Oakland, CA, US
Full-time
Great Place To Work® is the global authority on workplace culture. Our mission is to help every place become a Great Place To Work® for all. We give leaders and organizations the recognition and tool...

Business Development Manager - Power & Infrastructure

ENERCON • Emeryville, CA, US
Full-time
Our Corporate Business Development Group is seeking a Business Development Manager for our Power & Infrastructure team. As a trusted partner to key clients, you’ll lead relationship-building efforts...

Infrastructure Engineer

VirtualVocations • Oakland, California, United States
Full-time
A company is looking for an Infrastructure Engineer - Remote. Key Responsibilities: Provide primary support and engineering for Azure customer experiences, resolving complex issues. Act as the voic...

IT Infrastructure Engineer

Robert Half • Oakland, CA, US
Full-time
Are you a tech-savvy problem solver with a passion for cloud infrastructure and identity access management? Do you thrive in a collaborative, mentorship-driven environment where cultural fit and te...

IT Security & Infrastructure Engineer

Atomic Machines • Emeryville, California, United States
Full-time
Atomic Machines is ushering in a new era in micromanufacturing with its Matter Compiler (MC) technology. The MC enables new classes of micromachines to be designed and built by offering manufacturin...

Supervisory Civil Engineer - Transportation Infrastructure

Parsons Corporation • USA CA Oakland
Full-time
Supervising Civil Engineer - Transportation Infrastructure – San Francisco / Oakland. Parsons has a challenging and rewarding opportunity for a motivated Supervising Civil Engineer to join our team ...

Structural Engineer | Design Engineer

Degenkolb Engineers • Oakland, CA, US
Full-time
Founded in 1940 and headquartered in San Francisco, Degenkolb Engineers has more than eight decades of commitment to innovation, client service, and life-long learning. We deliver customized structu...

Azure Cloud Infrastructure Engineer Consultant

Olivine, Inc. • Berkeley, CA, US
Full-time
The Cloud Infrastructure Engineer (Azure) is responsible for designing, implementing, managing, and optimizing cloud infrastructure and services within Microsoft Azure. This role is responsible for ...

Senior HPC Engineer, Infrastructure Specialist Team

NVIDIA • Remote, CA, US
Remote
Full-time
NVIDIA is looking for a Senior HPC Engineer to join its Professional Services team. NVIDIA products to revolutionize deep learning and data analytics, and to power data centers. Join the team buildin...

Infrastructure Engineer

FAR.AI • Berkeley, California, United States
Job type: Full-time
Job description

About FAR.AI

FAR.AI is a non-profit AI research institute dedicated to ensuring advanced AI is safe and beneficial for everyone. Our mission is to facilitate breakthrough AI safety research, advance global understanding of AI risks and solutions, and foster a coordinated global response.

Since our founding in July 2022, we've grown quickly to 30+ staff, produced over 40 influential academic papers, and established the leading AI safety events for research and international cooperation. Our work is recognized globally, with publications at premier venues such as NeurIPS, ICML, and ICLR, and features in the Financial Times, Nature News, and MIT Technology Review.

We drive practical change through red-teaming with frontier model developers and government institutes. Additionally, we help steer and grow the AI safety field through developing research roadmaps with renowned researchers such as Yoshua Bengio, running FAR.Labs, an AI safety-focused co-working space in Berkeley housing 40 members, and supporting the community through targeted grants to technical researchers.

About FAR.Research

Our research team likes to move fast. We explore promising research directions in AI safety and scale up only those showing a high potential for impact. Unlike other AI safety labs that take a bet on a single research direction, FAR.AI aims to pursue a diverse portfolio of projects.

Our current focus areas include:

Investigating deception in AI (e.g. lie detectors can either induce honesty or evasion)

Building a science of robustness (e.g. finding vulnerabilities in superhuman Go AIs)

Advancing model evaluation techniques (e.g. inverse scaling, codebook features, and learned planning).

We also put our research into practice through red-teaming engagements with frontier AI developers, and collaborations with government institutes.

About the Role

We’re seeking an Infrastructure Engineer to develop and manage scalable infrastructure to support our research workloads. You will own our existing Kubernetes cluster, deployed on top of bare-metal H100 cloud instances. You will oversee and enhance the cluster to support 1) new workloads, such as multi-node LoRA training; 2) new users, as we double the size of our research team in the next twelve to eighteen months; and 3) new features, such as fine-grained experiment compute usage tracking.

You will be the point person for cluster-related work. You will work on the Foundations team alongside experienced engineers, including those who designed and built the cluster, who can provide guidance and backup. However, as our first dedicated infrastructure hire, you will need to work autonomously, design solutions to varied and complex problems, and communicate with researchers who are technically skilled but less knowledgeable about our cluster and infrastructure.

This is an opportunity to build the technical foundations of the largest independent AI safety research institute, with one of the most varied research agendas. You will be working directly with both the Foundations team and researchers across the organization to enable bleeding-edge research workloads across our research portfolio.

Responsibilities

Build and Maintain

You will deliver a scalable and easy-to-use compute cluster to support impactful research by:

Empowering the research team to solve their own day-to-day compute problems, such as debugging simple issues and streamlining recurring tasks (e.g. running batch experiments, launching an interactive devbox, etc.).

Maintaining and developing in-cluster services, such as backups, experiment tracking, and our in-house LLM-based cluster support bot.

Maintaining adequate cluster stability to avoid interfering with research workloads (currently >95% uptime outside of planned maintenance windows).

Maintaining situational awareness of the cloud GPU market and assisting leadership with vendor comparisons to ensure we are using the most effective compute platforms.

Support Security

We often collaborate with partners with stringent security requirements (e.g. governments, frontier developers) and handle sensitive information (e.g. non-public exploits, CBRN datasets). You will implement security measures aimed at:

Securing the cluster against insider threats (architecting it to have adequate isolation to provide data confidentiality and integrity for sensitive workloads) and external threats (through minimizing the attack surface, and ensuring security updates are promptly installed).

Making secure workflows the default, e.g. streamlining the deployment of internal web dashboards behind an OAuth reverse proxy.

Championing security across the FAR.AI team, including maintaining and extending our mobile device management (MDM) system.

Bleeding-edge Workloads

You will work with the Foundations team and specific research teams to support novel ML workloads (e.g. fine-tuning a new open-weight model release) by:

Architecting our Kubernetes cluster to flexibly support novel workloads.

Assisting projects with bespoke requirements, designing and implementing effective infrastructure solutions, and sharing your infrastructure wisdom with ML researchers.

Improving observability over cluster resources and GPU utilization to allow us to rapidly diagnose and work around hardware issues or software bugs that may only arise on novel workloads.

About You

It is essential that you

Have Kubernetes or other system administration experience.

Have a curiosity and willingness to rapidly learn the needs of a new space.

Are self-directed and comfortable with ambiguous or rapidly evolving requirements.

Are willing to be on-call during waking hours for cluster issues ahead of major deadlines (for a few weeks a quarter).

Are interested in improving our security posture through identifying, implementing and administering security policies.

It is preferable that you

Have experience supporting ML / AI workloads.

Have previously worked in research environments or startups.

Are experienced in administering compute or GPU clusters.

Are able to adopt a security mindset.

Are willing to be part of an eventual on-call rotation, if required.

Example Projects

Configure the cluster and user-space development environments to support InfiniBand nodes for high-performance multi-node training.

Improve our default devbox K8s pod template to incorporate best-practice workflows for our researchers.

Roll out a new mobile device management system to ensure corporate devices meet our security requirements.

Streamline onboarding to the cluster for new starters (possibly in different timezones), and candidates on time-limited work trials.

Be “holder of the keys”, managing permissions and access control for FAR.AI’s team members to technical systems, including streamlining / automating (e.g. via SAML, SCIM) where appropriate.

Analyze storage patterns and propose infrastructure improvements for backups, disaster recovery, and usability.

Logistics

You will be a full-time employee of FAR.AI, a 501(c)(3) research non-profit.

Location: Both remote and in-person (Berkeley, CA) are possible, though 2 hours of overlap with the Berkeley time zone is required. We sponsor visas for CA in-person employees, and can also hire remotely in most countries.

Hours: Full-time (40 hours / week).

Compensation: $100,000-$175,000 / year depending on experience and location. We will also pay for work-related travel and equipment expenses. We offer catered lunch and dinner at our offices in Berkeley.

Application process: A programming assessment, a short screening call, two 1-hour interviews, and a 1-week paid work trial.

If you have any questions about the role, please reach out at talent@far.ai. If you don't have questions, the best way to ensure a proper review of your skills and qualifications is by applying directly via the application form. Please don't email us to share your resume (it won't have any impact on our decision). Thank you!