Job Title : Data Platform Engineer
Location : Remote
Work Authorization : USC / GC Holder / H1B
Job Type : 4-5 month contract-to-hire
Position Overview :
We are seeking a skilled Data Platform Engineer with a strong focus on pipeline development and data integration. The ideal candidate will have experience designing, building, and maintaining data pipelines, managing data warehouses, and implementing APIs for real-time data access. This role requires a deep understanding of data infrastructure and a commitment to delivering high-quality solutions.
Responsibilities :
- Pipeline Development : Design, develop, and maintain data pipelines to efficiently move data in and out of the data warehouse.
- Data Warehouse Management : Oversee the management of the data warehouse, ensuring optimal performance and reliability.
- Real-Time Data Access : Implement and manage APIs to provide real-time data access for internal use cases, handling private and sensitive information.
- Automation & Scripting : Use Python to automate and script data pipeline tasks, perform complex data transformations and cleansing, orchestrate workflows, and troubleshoot pipeline issues.
- Data Transformation & Querying : Utilize SQL for data cleansing, transformation, and querying within large datasets.
- Database Management : Administer relational databases for storing and querying large datasets, design schemas, and support efficient data warehousing operations to move data in and out of pipelines.
- Maintenance & Support : Provide support for existing data pipelines and systems, addressing issues and performing necessary maintenance.
Qualifications :
Experience : 3-7 years of experience in data engineering, with a focus on pipeline development, data warehousing, and API integration.
Technical Skills :
- Python for automating and scripting data pipeline tasks, performing complex data transformations and cleansing, orchestrating workflows, and troubleshooting pipeline issues.
- SQL for data cleansing, transformation, and querying within large datasets.
- Relational databases for storing, managing, and querying large datasets, designing schemas, and enabling efficient data warehousing operations that support the movement of data in and out of pipelines.
- Proficiency in AWS services (e.g., Redshift, DynamoDB).
- Experience with data pipeline tools (e.g., Prefect, Airbyte, dbt).