Job Description
Overview :
We are seeking an AWS ML Cloud Engineer to design, deploy, and optimize cloud-native machine-learning systems that power our next-generation predictive-automation platform. You will blend deep ML expertise with hands-on AWS engineering, turning data into low-latency, high-impact insights. The ideal candidate has a strong command of statistics, coding, and DevOps, and thrives on shipping secure, cost-efficient solutions at scale.
Objectives of this role :
- Design and productionize cloud ML pipelines (SageMaker, Step Functions, EKS) that advance our predictive-automation roadmap
- Integrate foundation models via Bedrock and Anthropic LLM APIs to unlock generative-AI capabilities
- Optimize and extend existing ML libraries / frameworks for multi-region, multi-tenant workloads
- Partner cross-functionally with data scientists, data engineers, architects, and security teams to deliver end-to-end value
- Detect and mitigate data-distribution drift to preserve model accuracy in real-world traffic
- Stay current on AWS, MLOps, and generative-AI innovations; drive continuous improvement
Responsibilities :
- Transform data-science prototypes into secure, highly available AWS services; choose and tune the appropriate algorithms, container images, and instance types
- Run automated ML tests / experiments; document metrics, cost, and latency outcomes
- Train, retrain, and monitor models with SageMaker Pipelines, Model Registry, and CloudWatch alarms
- Build and maintain optimized data pipelines (Glue, Kinesis, Athena, Iceberg) feeding online / offline inference
- Collaborate with product managers to refine ML objectives and success criteria; present results to executive stakeholders
- Extend or contribute to internal ML libraries, SDKs, and infrastructure-as-code modules (CDK / Terraform)
Skills and qualifications :
Primary technical skills :
- AWS SDK, SageMaker, Lambda, Step Functions
- Machine-learning theory and practice (supervised / deep learning)
- DevOps & CI / CD (Docker, GitHub Actions, Terraform / CDK)
- Cloud security (IAM, KMS, VPC, GuardDuty)
- Networking fundamentals
- Java, Spring Boot, JavaScript / TypeScript & API design (REST, GraphQL)
- Linux administration and scripting
- Bedrock & Anthropic LLM integration
Secondary / tool skills :
- Advanced debugging and profiling
- Hybrid-cloud management strategies
- Large-scale data migration
- Impeccable analytical and problem-solving ability; strong grasp of probability, statistics, and algorithms
- Familiarity with modern ML frameworks (PyTorch, TensorFlow, Keras)
- Solid understanding of data structures, modeling, and software architecture
- Excellent time-management, organizational, and documentation skills
- Growth mindset and passion for continuous learning
Preferred qualifications :
- 10+ years of software experience
- 3+ years in an ML-engineering or cloud-ML role (AWS focus)
- Proficient in Python (core), with working knowledge of Java or R
- Outstanding communication and collaboration skills; able to explain complex topics to non-technical peers
- Proven record of shipping production ML systems or contributing to OSS ML projects
- Bachelor’s (or higher) in Computer Science, Data Engineering, Mathematics, or a related field
- AWS Certified Machine Learning – Specialty and / or AWS Solutions Architect – Associate a strong plus