Sanctuary Computer: LLM Data Engineer

Headquarters: New York City

URL: https://sanctuary.computer

Sanctuary Computer is hiring a Remote LLM Data Engineer

About garden3d

We are a worker-owned creative collective, innovating on everything from brands and IRL communities to IoT devices and cross-platform apps. We share profit, open source everything, spin out new businesses, and invest in exciting ideas through financial and/or in-kind contributions.

Our client roster includes Google, Stripe, Figma, Hinge, Black Socialists in America, ACLU, Pratt, Parsons, Mozilla, The Nobel Prize, MIT, Gnosis, Etsy & Gagosian. We’re the software team behind innovative products like The Light Phone & Mill, and we run a global, decentralized community space collective called Index Space.

We think of garden3d as a collective for creative people, prioritizing a happy, talented, and diverse studio culture. We work on projects that bring value to our world, and we balance deep care for the work we do with a genuine curiosity about life outside of our jobs.

Who we’re looking for:

Right now, we’re looking for a Data Engineer with expertise in data pipelines, workflow orchestration, and data integration. Our current approach to data ingestion relies on an in-house application that defines lightweight workflows in code and executes them with queues. We’re now expanding this process by adopting Prefect as an orchestration tool, enabling us to manage pipelines for structured and unstructured data from diverse sources, including web crawls and scrapers.

In this role, you’ll work on a variety of client projects to find cost-effective, high-quality, pragmatic solutions to complex problems. Responsibilities will include:

  • Monitoring and maintaining data pipelines, troubleshooting new errors, and addressing format drift
  • Extracting and enriching additional data elements from diverse sources
  • Reprocessing and validating large datasets in batch workflows
  • Designing and integrating new data sources into existing pipelines
  • Aligning and integrating extracted data with the core application data model to ensure consistency and usability
  • Participating in code reviews, providing constructive feedback to teammates and ensuring adherence to best practices
  • Contributing to project success by keeping a close eye on team velocity, project scope, budget, and timeline
  • Negotiating with clients to align project scope with budget and timeline, if needed

The person we’re looking for is happy, relaxed and easy to get along with. They’re flexible on anything except concessions that will lower their usually outstanding work quality. They work “smart” by carefully managing their workflow and intelligently staggering features that have dependencies. They prefer deep work but are OK coming up to the surface now and then for top-level/strategic conversations.

We believe people with backgrounds or interests in design, art, music, food or fashion tend to have a well-rounded sense of design & quality, so a variety of hobbies or side projects is a big nice-to-have!

Compensation

Our pay scale ranges from $85 to $130 per hour depending on seniority (& team leadership experience), and our projects are rarely less than 8 full-time weeks at 40 hours per week. Additionally, we pay discretionary bonuses for going above and beyond, like training & coaching others, winning new business, speaking at conferences, etc.

We prefer long-standing relationships with highly accountable and communicative team members, so we encourage candidates to expect longer-term engagements. A Data Engineer working 40 to 45 full-time weeks in a year may take home $150k to $200k USD.
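
As a rough sanity check on that range, here is a minimal sketch of the arithmetic. It assumes every week is billed at exactly 40 hours and ignores discretionary bonuses; the function name annual_gross is purely illustrative.

```python
# Rough check of the annual take-home range quoted above.
# Assumptions: each billed week is a full 40 hours; bonuses are excluded.
HOURS_PER_WEEK = 40


def annual_gross(hourly_rate: int, weeks: int, hours_per_week: int = HOURS_PER_WEEK) -> int:
    """Gross pay in USD for `weeks` full-time weeks at `hourly_rate`."""
    return hourly_rate * hours_per_week * weeks


low = annual_gross(85, 40)    # lowest rate, fewest weeks
high = annual_gross(130, 45)  # highest rate, most weeks
print(f"${low:,} to ${high:,}")  # prints "$136,000 to $234,000"
```

The quoted annual figures sit inside this $136,000 to $234,000 envelope.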

Must Have Competencies:

  • Strong proficiency in Python
  • Experience with data/workflow orchestration tools (e.g., Prefect, Dagster, Airflow)
  • A thorough understanding of ETL & data transformation for ingestion by industry-standard LLMs (OpenAI, Claude, etc.)
  • Familiarity with Large Language Models (LLMs)
  • Skilled in interfacing with APIs (OpenAI, Google Gemini/Vertex, etc.) using wrapper libraries such as Instructor, LiteLLM, etc.
  • Practical experience in prompt engineering
  • Ability to work with structured outputs and potentially tool calling
  • 5+ years of general experience in backend (Ruby on Rails, Elixir Phoenix, Python Django, or Node Express) and/or native app development (React Native, Flutter, Android, AOSP, Kotlin/Java).

Nice to Have Competencies:

We’re always pitching for new and exciting technology niches. Some of the areas below are relevant to us!

  • Experience with Google Cloud Platform (GCP), particularly Cloud Run and Cloud Tasks
  • Knowledge of search technologies, including embeddings and vector databases for semantic search, as well as keyword-based search (BM25)
  • Familiarity with PySpark for batch data processing
  • Experience working with LLMs, Vector Databases, and other generalist AI-enabled application patterns
  • Client-facing experience: working directly with customers to gather requirements and provide technical solutions
  • Product management experience: defining product roadmaps and collaborating closely with stakeholders
  • Engineering management experience: leading teams, setting technical direction, and mentoring developers

How we interview:

Our interview process starts with a call where you get to meet a few members of our team. From there we’ll ask appropriate candidates to take part in a technical exercise which helps illustrate skill level and comfort.

→ It’s also a great way to see what it’s like to work with us and help support folks who may not have the ‘right title’ but have the experience and technical know-how for the role.

How we work:

We believe that there’s a better balance between the poles of freelancing & full time, and for that reason Sanctuary works differently to most shops:

  • Transparency & Ownership: We release our Profit & Loss statements to the community each year, open source our best ideas, and talk business & money with everyone in the company. We’re proud to run our business with integrity, and for that reason we share everything with our team & community.
  • 150% Carbon Negative: Our studio offsets 150% of the carbon we use to do business each year, dating back to our founding in 2015. We turn down work that is not in line with our morals, and we encourage our peers to do the same. We were certified climate neutral earlier this year.
  • Strong Morals: Since our founding, we’ve turned down somewhere between $1mm – $2mm of work that didn’t meet our moral standards. (Most of that was DTC brands that can’t show a valid sustainability initiative).
  • Async & Decentralized: We use tools optimized for calm, thoughtful communication, and opt for async whenever possible. We fight hard to maintain our focus time.
  • Remote Friendly: Our company is fluent in remote work, making our workplace more decentralized and democratized in the process.
  • Ideas & Products: In our spare studio time, we work to build our own open source or internal products to diversify & bolster our income. We create amazing technology products for our clients, so why not for the studio?

→ Read more on our Substack or our Medium.

Quick tip! Adding a Loom recording to your profile in our form to showcase your skillset can really make your application stand out!

Please click the link below or copy the URL (same as the apply button) to proceed with your application by telling us a bit more about your interest in the role:

garden3d Creative Form

To apply: https://weworkremotely.com/remote-jobs/sanctuary-computer-llm-data-engineer
