Lead Infrastructure Engineer

KoboToolbox

Cambridge, Massachusetts


Job Details

Contract


Full Job Description

KoboToolbox has an immediate opening for a Lead Infrastructure Engineer to fill a full-time position of approximately 35-40 hours per week, for a commitment of at least 1 year. As a member of our team, you will share in the challenge and excitement of maintaining infrastructure used by over 15,000 nonprofit organizations around the world. These humanitarian, development, and environmental organizations create data-driven change through the collection and analysis of more than 20 million surveys per month.

Only senior candidates who already have experience working on large web applications will be considered. You must have a professional history of working on systems that support substantial volumes of traffic and data.

Beyond technical acumen, we are seeking a team member who demonstrates curiosity, initiative, and a cooperative approach to problem solving and decision-making.

If you're passionate about leveraging technology to make a positive impact, we want to hear from you!

Responsibilities

  • Lead a small but growing team of infrastructure engineers (currently 1.5 FTEs besides yourself), providing direction and mentorship, with opportunities to expand the team as the organization grows.
  • Manage team schedules to maximize round-the-clock coverage, and implement an on-call rotation.
  • Develop metrics and thresholds for acceptable, marginal, and unacceptable infrastructure performance. Document both anticipatory and reactive plans to keep all services running at the acceptable level.
  • Proactively oversee large, public instances of KoboToolbox to maximize availability and performance, reduce manual labor, and control costs.
  • Supervise hosting operations for numerous smaller instances of KoboToolbox, maintaining and streamlining existing infrastructure-as-code processes.
  • Consult with self-hosting clients, guiding them through best practices and assisting them with maintenance of their own infrastructure.
  • Serve as a skilled practitioner of Linux systems administration, Docker containerization, and Kubernetes orchestration.
  • Document configuration and processes, including recurring tasks and how to assign them.
  • Work closely with the software development team to ensure smooth deployments and good performance in production, e.g. schema migrations for multi-terabyte PostgreSQL databases.
  • Assist with maintenance and improvement of development infrastructure, such as CI pipelines and automated staging deployments.
  • Act as the focal point for external auditors to ensure compliance with applicable standards such as ISO 27001.
  • Attend regular videoconference check-ins with other members of the technical team.
  • Communicate with the public in conjunction with our support staff or directly through forums, issue trackers, etc.

Requirements

Required Qualifications

  • At least 3 years of experience in infrastructure engineering, systems administration, or a related field.
  • Successful history of leading a technical team while continuing to engage with hands-on technical work.
  • Thorough grasp of the technologies underlying web applications, including networking, hardware, virtualization, operating systems, and databases.
  • Excellent verbal and written English communication skills. Friendly and effective written communication is crucial given that we are a fully remote organization.
  • Mastery of Debian-based Linux as a server operating system.
  • Experience with Terraform (or OpenTofu), Helm, Kubernetes, and Docker in a production setting.
  • Ability to write reusable scripts in Python and Bash.
  • Extensive knowledge of basic AWS services like EC2 and S3; working familiarity with others such as EKS, RDS (PostgreSQL), DocumentDB, ElastiCache, Route 53, and SES.
  • Working familiarity with Azure and Google Cloud.
  • Interest in data collection (surveying), particularly in humanitarian emergencies and other challenging contexts, and a desire to improve our platform for our users.
  • Working hours in the Eastern time zone.
  • Average availability of at least 30 hours per week, preferably 35 hours or more.

Preferred Skills

  • Strong grasp of cloud infrastructure security best practices.
  • Experience administering large PostgreSQL, MongoDB, and Redis databases.
  • Good understanding of GitHub Actions and GitLab CI/CD.
  • Familiarity with XLSForm, ODK XForm, and OpenRosa.
  • Past work in a non-profit or mission-driven organization.
  • Experience with web application development in Python/Django, React, and (separately) Node.js.

Benefits

General Benefits:

  • Genuine Impact: Contribute directly to projects that affect millions globally, working with international humanitarian organizations and community-based partners in 200 countries.
  • Meaningful Work Environment: Join a team that tackles global challenges through innovative data collection tools that create lasting change.
  • Diverse Team: Be part of a globally diverse, inclusive team that values equity and inclusion across all spectrums.
  • Flexible Work Culture: Enjoy mutual flexibility, with a culture prioritizing work-life balance.
  • Professional Development: Access generous professional development opportunities.

Employee Benefits (U.S. candidates only):

  • Health & Wellness: 5 medical insurance options, dental, and vision (up to 80% premium covered), plus life insurance.
  • Financial Security: 401(k) retirement plan with 100% match up to 2%.
  • Work-Life Balance: 20 days paid time off, 10 floating holidays, unlimited sick days, and paid parental leave.
