Job Description
We are seeking a skilled Senior Data Platform Engineer to join our Data Platform team in Kansas City, MO. In this role, you will be instrumental in designing, implementing, and optimizing a cloud-native data stack built on best-in-class open-source tools. The ideal candidate will design, build, and maintain an opinionated, resilient, and scalable data platform in a private cloud environment—enabling data-driven decision-making, analytics, and machine learning, while providing deep insights out of the box. This role blends data engineering, software development, and infrastructure management, using tools such as Apache Iceberg, Airflow, Spark, Kafka, and Superset.
Core Responsibilities
Data Platform Architecture
- Design scalable and secure architecture for data storage, processing, and access within a private cloud.
- Select appropriate technologies (e.g., Apache Iceberg, Spark, Superset) to meet business and technical requirements.
- Define flexible and scalable data schemas using Apache Iceberg.
Metadata Management & Data Governance
- Evaluate and implement metadata management platforms such as DataHub, Apache Atlas, or OpenMetadata to support data cataloging, lineage, and governance use cases.
- Collaborate with data stakeholders to align metadata solutions with organizational needs.
- Define and enforce governance policies related to data quality, privacy, and compliance (e.g., GDPR, CCPA).
- Implement fine-grained access controls, encryption, and auditing with a focus on regulatory compliance and data traceability.
Infrastructure Management
- Manage private cloud infrastructure to provision and maintain data lake / data mesh solutions, processing engines, and databases.
- Ensure high availability, scalability, disaster recovery, and infrastructure cost-efficiency.
- Enforce strong encryption standards and access controls in line with industry best practices.
- Design for growth in data volume, velocity, and variety.
Automation & CI/CD
- Automate data workflows, infrastructure provisioning, and deployments using tools like Airflow, Ansible, Salt, and Kubernetes.
- Implement CI/CD pipelines for data platform updates and enhancements.
Performance Optimization
- Optimize data storage and queries using Apache Iceberg and Spark to ensure high performance and low-latency access.
- Identify and address performance bottlenecks; implement partitioning, caching, and indexing strategies.
Monitoring and Alerting
- Monitor data platform health using tools such as Prometheus and Grafana dashboards.
- Configure real-time alerts to proactively detect and resolve pipeline failures or data issues.
- Troubleshoot and resolve platform outages and data incidents promptly.
Collaboration
- Work closely with data scientists, analysts, and engineers to understand data needs and deliver performant, scalable solutions.
- Collaborate with cross-functional teams (Cloud Engineering, Network, and DevOps / Solutions Engineering) to troubleshoot and resolve infrastructure issues.
Qualifications
Education
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
Experience
- 5–8+ years of experience in data engineering, with a strong focus on cloud-based data platforms.
Technical Skills
- Strong programming skills in Python or Java.
- Deep knowledge of Apache Iceberg, Spark, Airflow, Superset, and Kafka.
- Familiarity with metadata management platforms like DataHub, Apache Atlas, or OpenMetadata, and experience with their evaluation or implementation.
- Experience with cloud-native infrastructure tools such as Kubernetes, Ansible, Salt, etc.
Soft Skills
- Strong analytical and problem-solving skills.
- Effective communication and collaboration with cross-functional teams.