When I started Aspen Biosciences in 2009, I had one thing in mind: to make the process of drug discovery easier and more manageable for the people doing it. Having spent time at mid-size biotech companies like Lexicon Pharmaceuticals and at large pharmaceutical companies like Pfizer and Takeda, I had seen first-hand many of the challenges that researchers faced, and the shortcomings of many of the available technologies in addressing those challenges.
The origin of Pipeline, our Program Management Platform, can be traced back to a simple request from the Site Head of Pfizer La Jolla. She wanted a tool that would enable the entire site to visualise the drug discovery pipeline and recognise the individuals responsible for advancing their programs. The initial solution was a simple program that generated a static website, which met the immediate need within the technical constraints of the time.
How Pipeline Got Its Name
The name “Pipeline” has a bit of a double meaning: both a drug pipeline and a term for a particular type of wave known to most surfers. In 2006, when I worked at Pfizer, I was surprised to discover that “board meeting” similarly had a dual meaning. I was waiting for a meeting with the site head, and someone mentioned that there was a board meeting in progress. One of the people in the waiting room asked, “Suit or wetsuit?” The reply was “wetsuit”. Seeing the puzzled look on my face, someone said, “Some of the CEOs and site heads surf one of the local point breaks together. It gives them an opportunity to talk informally. It beats golf.”
The idea of managing drug discovery programs using a Pipeline diagram stuck with me – it was a natural fit for the way in which people thought of drug discovery: a stage-gated, recursive discovery process. After starting Aspen, I decided to create a web application to take that idea to the next level. This caught the attention of Carl Drinkwater, the CIO of Fibrogen, who asked if we could build some basic project management capabilities into the system.

We realised, however, that simply creating a planning tool that was biology- and chemistry-aware wasn’t enough. A plan without a means of actually executing it was a bit of an empty promise. And if you couldn’t track what was happening at each step in the process, you might end up adding to the workload rather than alleviating it. Scientists and project managers needed to be able to plan drug discovery programs, execute those plans, and manage and track each step of the process.
In practical terms, this meant that scientists would need to be able to:
- identify drug targets from literature, view their structure, understand the known ligands for the target, and understand the role of the target in the context of an indication.
- create projects from those targets. Whenever someone new joined the project, they needed to have one place where they could go to understand what was going on, and who to contact if they had questions.
- design custom protein target forms for screening campaigns, order them from CROs, and track and manage that process.
- design libraries of compounds, and manage the process of synthesising those compounds either internally or through CROs.
- register proteins, compounds and cell lines, and place them in inventory.
- plan assay cascades and screen those compounds to identify and characterise the hits.
And this was only scratching the surface…
Design Principles
Throughout this time, we continued to do a lot of bespoke software development for customers, everything from custom tools for fragment-based drug discovery to platforms designed to support DEL-based drug discovery. From those experiences, three design principles emerged that would guide the development of Pipeline:
- Everything is Configurable
- Everything is a Plugin
- Everything is Computable
Everything is Configurable
Everybody likes to say that they do drug discovery “the canonical way”, and perhaps at the 50,000 ft view that’s correct, but when it comes down to street level, everyone does it differently. There are different instruments, different drug discovery platforms and workflows, different software, and different modalities all at work in a biotech company.
This meant that we needed to provide customers with an unprecedented level of configurability, not only to help them meet the challenges of today, but to make it possible to meet challenges that aren’t even on their radar yet.
This in turn meant providing people with the ability to:
- configure assays in the Assay Request Management module,
- configure workflows and steps in the Protein Production and Chemistry modules, and
- configure the registries and inventories.
Across all of these modules, we needed to make sure that users could configure security rules to control access to parts of the system, and configure notifications so that everyone on the team would get the notifications that were appropriate to their role on the team.
They needed to be able to use spreadsheets and SD files to load data into the system, and they needed a way to map the fields in those files to the fields in the database. They also needed to save that mapping, so that a data file didn’t have to be remapped each time it was uploaded.
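To picture what a saved mapping looks like in practice, here is a minimal Python sketch. It is an illustration only, not Pipeline’s actual implementation: the column headers, database field names, compound IDs and the `apply_mapping` helper are all made up for the example.

```python
import csv
import io
import json

# Hypothetical saved mapping: spreadsheet column header -> database field name.
# In a real system this would be stored once and reused for every upload.
SAVED_MAPPING_JSON = (
    '{"Compound ID": "compound_id", "IC50 (nM)": "ic50_nm", '
    '"Assay Date": "assay_date"}'
)


def apply_mapping(csv_text, mapping):
    """Rename mapped columns to database field names and list unmapped columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    unmapped = [h for h in (reader.fieldnames or []) if h not in mapping]
    records = [
        {mapping[col]: value for col, value in row.items() if col in mapping}
        for row in reader
    ]
    return records, unmapped


mapping = json.loads(SAVED_MAPPING_JSON)

# Made-up upload: two mapped columns plus one the mapping does not know about.
upload = "Compound ID,IC50 (nM),Notes\nASP-001,12.5,ok\nASP-002,340,retest\n"
records, unmapped = apply_mapping(upload, mapping)

print(records)
print(unmapped)
```

Reporting the unmapped columns, rather than silently dropping them, gives the person uploading the file a chance to extend the mapping once and save it again.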
To make this level of configurability possible, we made metadata the cornerstone of the application. This made it possible for people to use industry-standard constrained vocabularies and ontologies such as those from the Allotrope Foundation, the EDAM ontology, the NCI Metathesaurus and others. When defining an assay, the user can upload a sample spreadsheet, and the system will automatically map the columns in the spreadsheet to the terms in the ontology. This removes ambiguities that are a source of confusion when sharing data internally, with CROs, or with collaborators.
But we don’t just use metadata in the Assay Request Management module. We use it everywhere. In the Protein Production and Chemistry modules, it’s used to define the data that are collected at each step in their workflows. In the Registration and Inventory modules, it’s used to define the data that are collected whenever substances are registered and stored. It’s used to map data from spreadsheets into each of those modules, to assign parameters to calculations, and to export data in different file formats.
To accomplish this, we created the Field Library, a metadata repository that’s part of the Foundation module of Pipeline.
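The general idea of a metadata repository that maps spreadsheet columns to defined fields can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Field Library itself: the `FieldDefinition` and `FieldLibrary` classes are hypothetical, matching is done by a naive normalised-label lookup, and the `EX:` identifiers are placeholders rather than real ontology terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    name: str             # database field name
    term_id: str          # placeholder identifier, NOT a real ontology term
    data_type: type       # expected Python type of the value
    synonyms: tuple = ()  # alternative column headers seen in spreadsheets


def normalise(label):
    """Lower-case a label and drop everything except letters and digits."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


class FieldLibrary:
    """Hypothetical repository of field definitions, indexed by label."""

    def __init__(self, definitions):
        self._index = {}
        for definition in definitions:
            for label in (definition.name, *definition.synonyms):
                self._index[normalise(label)] = definition

    def map_columns(self, headers):
        """Return {header: FieldDefinition or None} for each column header."""
        return {h: self._index.get(normalise(h)) for h in headers}


library = FieldLibrary([
    FieldDefinition("ic50_nm", "EX:0001", float, ("IC50 (nM)", "IC50")),
    FieldDefinition("compound_id", "EX:0002", str, ("Compound ID", "Cmpd ID")),
])

mapped = library.map_columns(["Cmpd ID", "IC50 (nM)", "Operator"])
for header, definition in mapped.items():
    print(header, "->", definition.name if definition else "unmapped")
```

Because every module looks fields up in the same place, a column called “Cmpd ID” in one spreadsheet and “Compound ID” in another resolves to the same definition, which is what removes the ambiguity.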

Everything is a Plugin
Because so many of our customers already had software that addressed parts of their workflows, we needed to figure out how to make the process of incorporating those tools into Pipeline easier.
To begin with, it meant treating our own solutions as “default plugins” – plugins that could be replaced by other software as needed. We studied the APIs of some of the most popular tools in the research informatics space to make sure that the APIs we were developing aligned with the models that software vendors were using, thus making it easier to create plugins for them.
We created Test Compatibility Kits (TCKs), so that we could easily retest plugin implementations whenever a vendor shipped a new release.
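The pattern of a plugin contract plus a reusable compatibility kit can be illustrated with a short Python sketch. Everything named here is an assumption made for the example: the `RegistryPlugin` interface, the `InMemoryRegistry` stand-in for a “default plugin”, the ID format and the three checks do not come from Pipeline’s actual plugin API.

```python
from abc import ABC, abstractmethod
from typing import Optional


class RegistryPlugin(ABC):
    """Hypothetical contract that every compound-registry plugin must satisfy."""

    @abstractmethod
    def register(self, structure: str) -> str:
        """Register a structure and return its ID."""

    @abstractmethod
    def lookup(self, compound_id: str) -> Optional[str]:
        """Return the structure for an ID, or None if the ID is unknown."""


class InMemoryRegistry(RegistryPlugin):
    """Stand-in for a 'default plugin' that a vendor plugin could replace."""

    def __init__(self):
        self._by_id = {}
        self._by_structure = {}

    def register(self, structure):
        if structure in self._by_structure:
            return self._by_structure[structure]
        compound_id = f"CMP-{len(self._by_id) + 1:04d}"
        self._by_id[compound_id] = structure
        self._by_structure[structure] = compound_id
        return compound_id

    def lookup(self, compound_id):
        return self._by_id.get(compound_id)


def run_compatibility_kit(plugin_factory):
    """Run the same contract checks against any implementation; return failures."""
    failures = []
    plugin = plugin_factory()
    first_id = plugin.register("CCO")
    if plugin.lookup(first_id) != "CCO":
        failures.append("lookup must return the registered structure")
    if plugin.register("CCO") != first_id:
        failures.append("re-registering a structure must return the same ID")
    if plugin.lookup("no-such-id") is not None:
        failures.append("unknown IDs must return None")
    return failures


failures = run_compatibility_kit(InMemoryRegistry)
print(failures or "all checks passed")
```

The kit takes a factory rather than an instance, so the same checks can be rerun against a fresh copy of any implementation, including a new vendor release.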
Everything is Computable
One of the key challenges that we faced on a number of projects was integrating applications that were never designed to be integrable. In many cases, these were applications where the APIs were an afterthought, added later in order to make KNIME workflows possible. In some cases, the APIs would have undocumented breaking changes from one release to the next, and no API versioning.
Unlike those applications, Pipeline was designed from the ground up to be API-first.
To make it easy for partners to integrate with us, we provided API documentation that clearly spelled out how to do it. You can use our web APIs with KNIME workflows or scripts to get the information that you need to drive your business.
But more than that, we wanted to make it possible for customers to extend the capabilities of Pipeline in ways that we had never envisioned. We live in an age where AI/ML and other technologies are rapidly evolving. We wanted to make it easy for customers to experiment with those technologies and integrate them into their workflows, for example:
- To perform property calculations or toxicity predictions for new compounds as you design them.
- To apply customised tractability scoring functions to new targets that you’re contemplating adding to your pipeline.
- To extract target, pathway, and disease associations using custom text mining algorithms on papers in the Literature Mining module.
And a million other things we haven’t thought of yet. So we built the Calculation Engine to make it possible for you to create your own scripts that are triggered by events within the application. This event-driven architecture means that you can hook your code into Pipeline to support novel functionality.
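As a rough illustration of the event-driven idea, the Python sketch below registers a script against a named event and runs it when that event fires. It is a generic sketch, not the Calculation Engine’s real API: the `EventBus` class, the `compound.designed` event name and the payload fields are all hypothetical. The checks themselves use two of the well-known rule-of-five thresholds (molecular weight over 500, logP over 5).

```python
class EventBus:
    """Hypothetical in-process event bus: handlers subscribe to event names."""

    def __init__(self):
        self._handlers = {}

    def on(self, event_name):
        """Decorator that registers a function as a handler for an event."""
        def decorator(func):
            self._handlers.setdefault(event_name, []).append(func)
            return func
        return decorator

    def emit(self, event_name, payload):
        """Call every handler for the event and collect their results."""
        return [handler(payload) for handler in self._handlers.get(event_name, [])]


bus = EventBus()


@bus.on("compound.designed")  # hypothetical event name
def flag_rule_of_five(payload):
    """User-supplied script: flag two rule-of-five violations."""
    flags = []
    if payload["mol_weight"] > 500:
        flags.append("mol_weight > 500")
    if payload["logp"] > 5:
        flags.append("logp > 5")
    return {"compound_id": payload["compound_id"], "flags": flags}


# Made-up payload standing in for whatever the application would send.
results = bus.emit(
    "compound.designed",
    {"compound_id": "CMP-0001", "mol_weight": 612.7, "logp": 3.2},
)
print(results)
```

The application only needs to know the event name; what runs in response is entirely up to the scripts that have subscribed to it.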

---
If you’d like to find out more about Pipeline, contact us at info@aspen.bio for a demonstration.
