
Composable Data Stack Python Engineer

dltHub

N/A


Job Details

Full-time


Full Job Description

Who We Are

We are looking for a Software or Data Engineer who is experienced in high-performance Python data processing libraries (often referred to as the Composable Data Stack). You will collaborate directly with our CTO and be part of the core product team.

dlt is an open-source library that automatically creates datasets from messy, unstructured data sources. You can use the library to move data from almost anywhere into the most well-known SQL and vector stores, data lakes, storage buckets, or local engines such as DuckDB, Arrow or delta-rs. The library automates many cumbersome data engineering tasks and can be used by anyone who knows Python. You can see more details in this Hacker News article.

dltHub is based in Berlin and New York City. It was founded by data and machine learning veterans. We are backed by Foundation Capital, Dig Ventures, and many technical founders from companies such as Datadog, Instana, Hugging Face, MotherDuck, Mesosphere, Matillion, Miro, and Rasa.

Your Tasks and Responsibilities:

dlt is the missing link between the traditional Modern Data Stack and the emerging Pythonic Composable Data Stack: a gateway that creates datasets which the other components can then process. Our mission is to integrate dlt fully with this new, emerging ecosystem in a way that our users love. This means we respect their time, effort and previous investments in the modern data stack when designing features. To support this mission, your tasks and responsibilities include:

  • You design and implement OSS features that make dlt a gateway to the composable data stack: integrating query engines, transformation frameworks, and table formats with our library
  • You listen to our users, always paying attention to what they need to go to production with dlt.
  • You work with our customers in commercial projects where dlt is combined with existing “modern data stack” infrastructure
  • You maintain the open source project with the team (e.g., review PRs, resolve issues, talk with community contributors, etc.)

Requirements

Who You Are

If you are fascinated by the emerging ecosystem of data libraries in Python, which allows you to do on a single machine what until recently was possible only in the cloud, then you’ll enjoy working with us.

  • You know what duckdb, arrow, datafusion, lancedb, delta-rs, ibis, pyiceberg, sqlglot, kedro, hamilton and similar Python libraries / pip installable components do and know when to apply them.
  • You have experience in building data apps or products based on the composable data stack
  • … or you have contributed code to any of the projects above (or similar ones)
  • You know what the so-called “Modern Data Stack” is and appreciate certain aspects of it (i.e., maturity, fitting into enterprise workflows, etc.)
  • … and in fact you are interested in combining both worlds.
  • You really like Python and are fluent in writing Python code (e.g., Python typing, unit testing, writing docstrings, etc.)
  • You have a degree in computer science or data science, or equivalent experience
  • You are familiar with GitHub workflows (e.g., pull requests, code reviews, CI/CD services, etc.)

Nice to Have:

  • You are based in Berlin and willing to work in our office regularly
  • You have a hacker nature and you love to optimize things
  • Experience with DevOps (e.g., CI systems like GitHub Actions, Docker, Kubernetes, AWS/GCP/Digital Ocean, etc.)
  • Experience with machine learning (e.g., the toolset, the workflows, practical applications, etc.)

Benefits

What We Offer

In our work culture, we value each other’s autonomy and efficiency. We have set hours for communication and deep work. We like automation, so we automate our work before we automate the work of others.

  • We are an office-first company but give you plenty of opportunities for deep work and work from home. We have dedicated "no meeting days" to help the team focus on their most impactful work.
  • As we often work from the Berlin office, we cover your public transportation ticket.
  • We are deeply committed to your personal and professional growth, so we have an annual budget for learning and development.
  • We offer regular subsidized team lunches and Urban Sports Club membership.
  • We also have an ESOP for employees, depending on their role and dedication. We provide an option to increase your ESOP if you grow with us.
