Careers
Build the trust standard for AI agents.
Pipkin is assembling a team of evaluators, researchers, engineers, and communicators who believe that AI agents making real decisions deserve real oversight. We are not building a product. We are building an institution, one that will define how trust is measured for autonomous AI systems.
The people who join Pipkin early will shape not just the organization, but the standard itself. If you believe that independence matters, that rigor is non-negotiable, and that the absence of a rating is not the absence of risk, we want to hear from you.
Our Values
Independence Above All
We exist because independence matters. Every person at Pipkin understands that the moment we compromise our independence is the moment we cease to have value. Independence is not a feature. It is the product. Every decision, every hire, every partnership is evaluated against one question: does this protect or erode our ability to publish honest ratings without influence?
Rigor Over Speed
We would rather publish one thoroughly evaluated rating than ten superficial ones. The methodology is not optional. The process is not a guideline. Rigor is the product. Every evaluation follows the same documented procedure. Every score is reproducible. If we cannot defend a rating with evidence, we do not publish it.
Transparency as Default
Our framework is published. Our methodology is documented. Our conflicts are disclosed. If we cannot explain a decision publicly, we should not make it privately. Opacity is the enemy of trust, and trust is what we evaluate. We hold ourselves to the same standard we apply to the agents we rate.
Intellectual Honesty
If the data contradicts our hypothesis, the data wins. If an agent we expected to score well scores poorly, we publish the score. Conclusions follow evidence. We do not cherry-pick results, adjust rubrics after the fact, or rationalize predetermined outcomes. The methodology produces the rating, not the other way around.
Long-Term Thinking
We are building an institution, not a startup. Every decision is evaluated against a 10-year horizon. The standard we are establishing will outlast any individual product cycle. Short-term revenue is never worth long-term credibility damage. The trust standard for AI agents must be built to endure.
Future Roles
Pipkin is not currently hiring for these positions. They represent the team we intend to build as the organization scales. Express interest early to be considered when roles open.
AI Agent Evaluator
Design and administer trust evaluations against the Pipkin Framework. Conduct adversarial testing, score agent performance across all five pillars, and produce detailed findings for published ratings.
Requirements
- Deep understanding of large language model capabilities and failure modes
- Experience with prompt engineering, red-teaming, or AI safety research
- Ability to design and execute systematic evaluation protocols
- Strong written communication for documenting findings
- Rigorous, methodical approach to testing with attention to edge cases
What You Would Do
- Administer the Standard Core Battery to AI agents under evaluation
- Design new adversarial test vectors as agent capabilities evolve
- Score agent performance using calibrated rubrics across all five pillars
- Document findings with evidence and reproducible methodology
- Contribute to framework evolution based on evaluation experience
Research Analyst
Transform evaluation data into sector-level insights, regulatory analysis, and published research. Monitor the global regulatory landscape and identify alignment opportunities between the Pipkin Framework and emerging AI governance requirements.
Requirements
- Strong quantitative reasoning and analytical writing skills
- Understanding of AI regulation (EU AI Act, NIST AI RMF, ISO 42001)
- Experience producing published research or policy analysis
- Ability to translate technical evaluation data into strategic insights
- Attention to detail and comfort with ambiguity
What You Would Do
- Write research publications for the Pipkin Insights library
- Map the Pipkin Framework to evolving regulatory requirements globally
- Analyze cross-agent trends from evaluation data
- Produce the Pipkin Brief and contribute to public commentary
- Support enterprise clients with compliance crosswalk documentation
Platform Engineer
Build the infrastructure behind trust evaluations: test administration platforms, scoring systems, data pipelines, the rating API, and PipkinRated.com. You will be building systems that must be as reliable and rigorous as the ratings they produce.
Requirements
- Proficiency in TypeScript, Next.js, and modern web development
- Experience with database design (PostgreSQL, Supabase, or Prisma)
- Understanding of API design and data pipeline architecture
- Security-conscious development practices
- Comfort working on a small, high-impact team
What You Would Do
- Build and maintain the evaluation administration platform
- Develop the Pipkin API for programmatic access to ratings data
- Architect scoring systems that enforce methodology consistency
- Maintain and improve PipkinRated.com
- Build internal tools for evaluators and analysts
Content and Communications
Write research publications, manage the Pipkin Brief, produce press materials, and maintain the institutional voice across all Pipkin communications. You will be responsible for ensuring that every word published under the Pipkin name meets the standard of precision and authority the organization demands.
Requirements
- Exceptional writing with an institutional, authoritative tone
- Understanding of AI technology at a level sufficient to write about it accurately
- Experience with editorial processes and publication standards
- Ability to translate technical findings into clear, accessible language
- Comfort with a brand voice that is deliberately not startup-y
What You Would Do
- Write and edit research publications, commentary, and analysis
- Produce and distribute the Pipkin Brief
- Draft press releases and manage media materials
- Maintain consistency of voice and quality across all published content
- Support the founder with thought leadership and public communications
How to Express Interest
Send your resume and a note about why independent AI oversight matters to you. We read every submission. There is no application form and no automated screening. Tell us what you believe, what you have built, and why this work matters.