Software Engineer, Infrastructure (Experienced)
JupiterOne
Job Details
Full-time
Full Job Description
JupiterOne is a cyber asset attack surface management (CAASM) platform company providing visibility and security into your entire cyber asset universe. Using graphs and relationships, JupiterOne provides a contextual knowledge base for an organization's cyber asset operations. With JupiterOne, teams can discover, monitor, understand, and act on changes in their digital environments. Cloud resources, ephemeral devices, identities, access rights, code, pull requests, and much more are collected, graphed, and monitored automatically.
JupiterOne’s Platform team is dedicated to developing resilient, efficient, and reliable data ingestion processes that empower our customers to understand their cyber assets and potential risks. With a product centered around graph-based security, the role of a Software Engineer, Infrastructure involves leveraging software and systems engineering expertise to build a robust foundation for our data platform. Your responsibilities will include enhancing system reliability and fault tolerance, automating processes for system continuity and recovery, implementing event-based automated workflows, and contributing to the development of next-generation platform infrastructure features and capabilities. Additionally, we value engineers who proactively identify opportunities for improvement, voice their insights, and are committed to challenging the status quo through relentless system improvement and innovation.
Tech Stack:
- Node.js (TypeScript) & Go
- AWS (EC2, ECS, EKS, Lambda, S3, Kinesis)
- Kubernetes (EKS)
- ArgoCD, Argo Workflows, Argo Events
- Terraform
- GitHub Actions
- Helm
- Neo4j (Cypher, Java)
- New Relic & OpenTelemetry (OTel)
Requirements
What You Will Do:
- Collaborate with and report to the lead of the Platform Engineering Infrastructure team.
- Serve as a subject matter expert in backend cloud-native architecture patterns, AWS services and the infrastructure required to support and deploy our graph-based data platform.
- Develop and enhance event-driven platform automation capabilities at JupiterOne.
- Contribute to building and implementing next-generation infrastructure features and promoting their adoption across the platform.
- Instrument Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to enhance core platform reliability.
- Improve our ability to perform unattended infrastructure-related updates through infrastructure test automation.
- Create proactive monitoring systems that identify potential issues based on early symptoms rather than waiting for outages.
- Participate in an on-call rotation and the incident response process, continuously improving procedures and tools.
Who You Are:
- You are collaborative, easy to work with and open to feedback and direction.
- You stay current with industry trends and best practices, maintaining a strong foundation in Software Engineering and approaching challenges with a software-first mindset.
- With over 5 years of experience in Software Engineering roles, you specialize in cloud infrastructure to address Site Reliability and/or Platform Engineering challenges.
- You advocate for cloud-native architecture, promoting its benefits over proprietary cloud solutions.
- You have hands-on experience making Kubernetes a seamless PaaS for software engineers.
- You can talk in depth about your homelab.
- You are equally adept at writing code and managing infrastructure, possessing an in-depth understanding of Linux fundamentals, networking, and system architecture from both a theoretical and practical perspective.
- Your expertise includes troubleshooting and resolving Kubernetes deployment issues efficiently and running high-performance applications that are scalable, highly available, and resilient.
- You are proficient in coding with languages such as TypeScript, Go, or Rust, and in related infrastructure test automation.
- You excel in whiteboarding complex ideas and coming up with appropriate architectures given the problem domain.
- You have experience managing cloud infrastructure in AWS, Azure, or Google Cloud and are familiar with observability tools such as New Relic, Prometheus, OpenTelemetry, Grafana, Datadog, CloudWatch, or equivalent for diagnosing production issues.
- You have experience using security tools to keep infrastructure and services secure.
Bonus:
- You are proficient in managing infrastructure through GitOps with ArgoCD.
- You are proficient in writing Argo Workflows.
- You are proficient in building Helm charts and in automating their tests.
- You have experience with writing Kubernetes Operators, running Kubernetes across multiple platforms, and running data pipelines in Kubernetes using Argo Workflows and Argo Events.
- You have experience with NATS and/or Kafka.
- You have experience with AI and fine-tuning large language models (LLMs).
- You have experience with running and hosting LLMs within Kubernetes.
- You have experience with graph databases (e.g. Neo4j).
- You have an active Certified Kubernetes Administrator (CKA) Certification and/or equivalent experience that makes you an expert in this area.
- You have an active Certified Kubernetes Application Developer (CKAD) and/or equivalent experience that makes you an expert at deploying applications to Kubernetes.
- You have contributed to one or more CNCF-related projects on GitHub.
Benefits
- Medical, Dental, Vision Insurance, etc.
- Flexible PTO
- Maternity & Paternity Paid Leave
- Reimbursement for Gym Memberships and/or Fitness Equipment
- Wellness Program Offerings
- 401(k), Life Insurance, Short and Long Term Disability
- Paid Holidays, including JupiterOne Day on July 21st
- Generous Employee Referral Program
- & SO much more!