Cloud Data Engineer (AWS)
At Collective[i], we value diversity of experience, knowledge, backgrounds and people who share a commitment to building a company and community on a mission to help people be more prosperous. We recruit extraordinary individuals and provide them the platform to contribute their exceptional talents and the freedom to work from wherever they choose. Our company is a wonderful place to learn and grow alongside an incredible and tenacious team.
Collective[i] was founded by three entrepreneurs with over $1B of prior exits. Their belief in the power of Artificial Intelligence to transform life as we know it and improve economic outcomes at massive scale drove the decision to invest over $100m in the company, which has created a state-of-the-art platform for prosperity that helps companies generate sales and helps people expand their professional connections. In the last decade, Collective[i] has grown into a powerful community of scientists, engineers, creative talent and more, working together to help people succeed in business.
We are seeking an experienced Senior Data Engineer with a strong background in AWS DevOps and data engineering to join our team. In this role, you will manage and optimize our data infrastructure, covering both data engineering and DevOps responsibilities. A key aspect of this role involves deploying machine learning models to AWS using SageMaker, so expertise with AWS and SageMaker is essential. Experience with Snowflake is highly desirable, as our data environment is built around Snowflake for analytics and data warehousing.
Responsibilities:
- Design, develop, and maintain ETL pipelines to ensure reliable data flow and high-quality data for analytics and reporting.
- Build and optimize data models, implementing best practices to handle large volumes of data efficiently in Snowflake.
- Create and maintain complex SQL queries and transformations for data processing and analytics.
- Conduct orchestration and scheduling through Apache Airflow.
- Document data pipelines, architecture, and processes, maintaining clear and updated technical documentation.
- Architect, build, and maintain infrastructure for data science data and models on AWS, focusing on scalability, performance, and cost-efficiency.
- Collaborate with Data Scientists to deploy machine learning models on AWS SageMaker, optimizing model performance and ensuring secure deployments.
- Automate deployment and monitoring of ML models using CI/CD pipelines and infrastructure-as-code (IaC) tools such as Terraform or AWS CloudFormation.
- Handle AWS-specific tasks (EC2, S3, RDS, VPC, CloudFormation, AutoScaling, CodePipeline, CodeBuild, CodeDeploy, ECS/EKS, cost management, etc.).
- Set up and manage monitoring solutions (e.g., CloudWatch) to ensure data pipelines and deployed models are operating effectively.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- 5+ years of experience in Data Engineering, with at least 3 years working in AWS environments.
- Strong knowledge of AWS services, specifically SageMaker, Lambda, Glue, and Redshift.
- Hands-on experience deploying machine learning models in AWS SageMaker.
- Proficiency in DevOps practices, including CI/CD pipelines, containerization (Docker, ECS, EKS), and infrastructure-as-code (IaC) tools like Terraform or CloudFormation.
- Advanced SQL skills and experience in building and maintaining complex ETL workflows.
- Proficiency in Python, with additional skills in Java or Scala.
- Practical experience with Airflow for DAG management and data orchestration.
- Proficiency in version control (Git) and containerized deployment with Docker and managed services such as AWS Fargate, ECS, or EKS.
- Effective communication and a results-oriented approach.
$100,000 - $170,000 a year
Salary ranges can vary significantly based on a multitude of factors, reflecting the diverse and complex nature of today's job market. These factors encompass a wide range of elements, including industry, experience, education, and geographic location.
Who you are working for - About Collective[i]:
Collective[i] is on a mission to help people and companies prosper. Backed by over 20 patents and developed by a team of world-renowned entrepreneurs, engineers, scientists, and business leaders, Collective[i] is an Economic Foundation Model (“EFM”) that studies how the world does business. Collective[i]’s advisors include a world-renowned economist, the former Vice Chair of the Federal Reserve, founders of Comcast, Instagram, and MySQL, and former executives from Tesla, NewsCorp, USANetworks, and others.
Harnessing insights from more than a decade of data collection, our EFM has been trained on trillions of dollars of data to unearth successful buying and selling patterns. With Collective[i], any person or company can plug in their own data and receive customized insights that help them maximize economic opportunity and adapt to changing market conditions.
Founded and managed by the early teams behind LinkShare (purchased for $425m) and Overstock (NASDAQ:OSTK), Collective[i] is a private, 100% remote company.
Our core values help shape our culture: We are curious. We are direct. We deliver. We succeed together. We strive for the extraordinary. If you enjoy a challenge, thrive in an innovative environment and welcome the opportunity to work with amazing humans operating on the bleeding edge of technology, Collective[i] is the place for you.
Recent press:
Forbes: Stephen Messer: Amazon Missed The AI Boom
CNBC: Harvard professor on A.I. job risks: We need to upskill and update business models
ZDNet: Why open source is essential to allaying AI fears
Information about the founders:
Tad Martin
Stephen Messer
Heidi Messer