How Australian AI-based femtech systems are improving IVF outcomes for women around the world

Adel Foda, Head of Science at Silverpond, talks to Dr Jonathan Hall, co-founder and Director of Australian startup Presagen, about the transformative role that their scalable AI-based systems are playing in IVF clinics around the world.

Adel: Can you tell us a bit about what Presagen does and what your role is there?

Jonathan: Presagen is an AI company working in the women’s health space, with an initial focus on fertility. Life Whisperer was Presagen’s first product. It provides clinical decision support for people going through IVF by using AI to select the healthiest embryo. We want to help others by partnering with clinics and hospitals and turning their data into products that can be used so people get real-life benefits from them. We do this through Presagen’s AI Open Projects platform, which enables any clinic in any country to co-create medical AI products at scale and at low cost.

I am one of the founders and a Director of Presagen and its Chief AI Science Officer. I work with AI techniques and technologies, but my role is also about solving business challenges such as understanding how AI can solve a problem and how to bring that product solution to market.

Adel: Essentially, you look at the data and what you can do with it then develop products from that. You already have the Life Whisperer decision support system. Are you also looking at other applications?

Jonathan: We have a pipeline of aligned products that encompass other areas in women’s health beyond IVF, including obstetrics and gynecology, and are not solely about embryo viability (which is whether the embryo will lead to a pregnancy). Our second product, which supports embryo genetic analysis, is almost complete and is about to go for regulatory approval.

However, I would argue that Life Whisperer is not our main product. Our main product is the AI engine that produces it – Presagen AI Open Projects global cloud platform. We have a kick-ass workflow where we can get data in a privacy-compliant way. We understand all the regulatory compliance. We all understand AI and how to productionise it. We have the internal workflow and the whiz-bang products we make. Then we have the other part of the equation, which is delivering everything through an easy-to-use, collaborative global portal. Users log in and the portal delivers what they need to the web browser of the clinic that wants to use it. All this is our AI engine.

Adel: We have a similar model to yours at Silverpond. We combine the product with consulting, which is where we work out what the problem is and see where the value lies. Then we use the product to help deliver the solution to that problem. Does consulting play a similar role in your organisation, where you have to try and find out what the right problem formulation is and establish the value proposition?

Jonathan: We are purely a product company, so we don’t do consulting. Our AI products are really exciting for clinics and hospitals that meet the specifications because they solve the problems they want to solve. We work very closely with our clients’ domain experts, such as a lab scientist or a Senior Scientific Director, to understand their problems. But I wouldn’t say we position ourselves as consultants and we do not do project-based work as consultants – our projects are about building globally scalable products. We find partners, like a large clinical chain, a large hospital, or a large group that has data. These people are busy doctors and scientists. They do not have the time to productionise any of this stuff, but they understand their organisation’s problems very well.

We get the knowledge we need from them to form a partnership, then use their data to make products that help not just them, but the rest of the industry. We work out the actual AI that solves their particular problem across multiple clinics or hospitals, so it is scalable. That means that when we do a sales cycle, we have a scalable product. So it is a scalable business model rather than doing one-off consulting engagements.

“We want to help others by partnering with clinics and hospitals and turning their data into products that can be used so people get real-life benefits from them.” 

Adel: That makes sense. When you are developing the product, are you working with a single partner? Or do you have multiple partners, where you try to get a sense of the variance between different clinics, for instance?

Jonathan: We like to work with multiple partners. We start with one partner, ideally a very big or reputable one with opinion leadership in the field. We might even start with one to get a proof of concept and work with them alone, but typically there are half a dozen or so that we might work with as we build out the first product. A big partner could have multiple sites, then we might add a few others that represent different countries or demographics.

Once a product is complete, we lock it down as version 1.0. Then we can go out to a broader base of clinics that are not partners but might be customers.

For our Life Whisperer application, we worked with about 10 partners. They eventually became customers but started out as partners. They helped us build the data, collect the data, understand the problem, understand how it could generalise from one clinic to another, one camera type to another and so on. It meant we could get a sense of what was really going on.

Adel: What is the commercial model there? Are your initial partners compensated for helping you with that early development?

Jonathan: The model we use is one we worked out with our first main clinical partner. Under that model, we adopt an open projects approach. The clinics come on board and get a royalty share proportional to how they have helped us develop that product.

We say to the clinics that if they use this portal and prepare everything in the right way, they can track how well the product is doing, how well it is contributing to the overall data set. Then they can get a potential revenue share based on royalties of the product sales and their relative contribution. So that is how we position it to partners. But we also have a business model where product sales flow back into the company.
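The proportional royalty share described above can be sketched in a few lines of Python. This is only an illustration: the interview does not say how contribution is measured, so the contribution scores, the clinic names and the size of the royalty pool below are all hypothetical.

```python
from typing import Dict


def royalty_shares(contributions: Dict[str, float], royalty_pool: float) -> Dict[str, float]:
    """Split a royalty pool in proportion to each clinic's contribution score.

    The contribution score is a hypothetical stand-in for whatever measure
    of relative contribution the partners have agreed on.
    """
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("total contribution must be positive")
    return {
        clinic: royalty_pool * score / total
        for clinic, score in contributions.items()
    }


# Hypothetical scores and pool size, for illustration only.
shares = royalty_shares(
    {"clinic_a": 50.0, "clinic_b": 30.0, "clinic_c": 20.0},
    royalty_pool=1000.0,
)
print(shares)
```

Whatever the real measure is, the shares always add up to the pool, and a clinic that contributes twice as much receives twice the share.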

Adel: How about the consumer after the product has been developed — is that a subscription model?

Jonathan: We had to think about how clinics are going to use it. For example, in the IVF space, we have a cloud-based interface for our Life Whisperer product. A clinician logs in and drags and drops embryo images so they can decide which is the healthiest using a matrix. Then they can select that embryo and get on with the rest of the IVF process.

Some patients have one embryo, some have 6, some have 10. You never really know how many it will be, but we charge per cycle. Some clinics pass the cost on to the patient. Some choose to absorb it. It is up to them and we do not control what they choose to do. It is not an ongoing recurring subscription, but a pay-per-cycle model. There are no additional up-front costs or ongoing maintenance fees for the clinic.

“The clinics come on board and get a royalty share proportional to how they have helped us develop that product.”

Adel: Part of what Silverpond does with clients is to analyse the value the AI system adds to the process. That would be hard to quantify for a product like Life Whisperer. How do you make that decision about what you believe the value of the system is to the clinic?

Jonathan: It is a hard question. I think it depends a lot on the country and how its value is perceived there. Initially, we talk to our partners, who have experienced both sides of the commercial equation, about what they think the value is. In their country, it might be more, it might be less. It is not a consumer application, it is a regulated medical product, so it must cost more than a consumer application.

So you have these two things. It is a regulated medical product, and it is an enterprise sale for a clinic that will use it and possibly mark up the cost. First of all, we thought about how accurate Life Whisperer is, how efficacious it is, how much does it help people? We saw we were improving clinical pregnancy rates by about 20%. So we thought that around 2-4% of the IVF cycle fee was a reasonable markup for usages per cycle, depending on the country.
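As a rough worked example of the pricing arithmetic, the sketch below computes a per-cycle charge range from the 2-4% figure mentioned above. The cycle fee used here is a made-up number for illustration and is not a figure from the interview.

```python
from typing import Tuple


def per_cycle_charge_range(
    cycle_fee: float, low_rate: float = 0.02, high_rate: float = 0.04
) -> Tuple[float, float]:
    """Return the (low, high) per-cycle charge as a fraction of the IVF cycle fee."""
    if cycle_fee < 0:
        raise ValueError("cycle_fee must be non-negative")
    return cycle_fee * low_rate, cycle_fee * high_rate


# Hypothetical IVF cycle fee, in any currency.
low, high = per_cycle_charge_range(10_000.0)
print(f"per-cycle charge between {low:.2f} and {high:.2f}")
```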

Adel: The outcome the IVF patient is looking for is a successful pregnancy, so I guess that is something you can measure and compare. It’s about how many cycles you can save a person, and the cost savings of that.

Jonathan: Yes. How can we give really great value that is fairly costed and even a bit generous? I am not saying everything is perfect. We are just trying to find the best ballpark figure in each market.

Adel: You mentioned that Life Whisperer is a decision support system. Can you expand on that? What workflow does it support?

Jonathan: In the IVF process, you typically have a doctor who does the consultation and coordination but there is also a scientist in the lab called an embryologist. An embryologist is a scientist who is often a key decision-maker in the company. They are very expensive to train and have many years of expertise. Basically, they turn the eggs into embryos by fertilising them. Say there are 5 to 10 embryos and the embryologist looks at the health of all of them. Some might not have fertilised properly. Some might have fertilised a lot better. The embryologist has to make a decision, using a microscope and their eyes, and maybe a scoresheet of agreed criteria. They select what they think are the healthiest embryos at that stage. Then they choose a maximum of two embryos to transfer back to the patient, depending on the country and its laws. It is often one.

Because they are usually only transferring one embryo, “Which one is the healthiest one?” is one of the hardest questions in the field. It is also one of the biggest weaknesses in that workflow. The embryologist has done all this work. They must decide which is the healthiest embryo to implant. If they get it wrong, the other embryos might be frozen and the patient might come back weeks or months later for another go, and another go after that. The embryologist wants to get the best outcome for the patient, to shorten that time to pregnancy. Because there is a huge exponential emotional cost for the patient. How many times do they have to be heartbroken until there is finally an embryo transfer that works? Maybe it is the first time, or maybe it is never because there is an underlying clinical issue.

This is the workflow where we saw potential. You have these embryo images. You are already taking photos of them to show the patient and now you must decide which one gives them the best chance. No one quite knows the best way of doing this.

What we say to the embryologist is: “Okay. Since you are already taking photos and uploading them to the computer, what if you open a browser window and drag and drop those images onto our product interface? Then it will sort them for you and tell you which ones are the healthiest, and you can click for more information if you want it.” It gives the embryologist an easy interface based on the most sophisticated machine learning we can provide.

Life Whisperer fits into their workflow. We do not try and disrupt it or demand retraining. Embryologists can drag and drop, get an answer then move on to the next one. Then they can get on with their other tasks. It also helps them to sort patient records, so they are not doing it manually. We add a lot of value to their clinical workflow.

“‘Which one is the healthiest one?’ is one of the hardest questions in the field.”

Adel: Is there any kind of feedback mechanism in Life Whisperer where a clinician can report the result of implanting an embryo so it feeds into the machine learning system?

Jonathan: There is the ability for clinics to provide feedback and they often use it. We encourage it, so we know whether the Life Whisperer process resulted in a pregnancy and any other information they choose to provide. We use that knowledge to help improve the product, provided it has been agreed that the data can be used and everything has been signed off. We track the version of it very carefully and validate each study.

We have a very robust medical ISO procedure for looking at the data, making sure we select it in the right way, back-validating every single study we have done in the past. Then when we are ready, we do a software release as an AI update. We are very clear with our customers about what has changed and how the reports and metrics have changed, if at all.

But Life Whisperer does not learn passively. That would be quite dangerous in a medical setting, because there would not be many controls around whether this would change its score.

Adel: One thing I have found is that I can train an AI system and it performs at a 95% level of accuracy when I sample uniformly from some input distribution. But it may be possible there are pockets of the input space where the system still performs extremely poorly and makes bad decisions. Is that something you have run into?

Jonathan: This is our bread and butter, and I must say you are so right. It is something that people do not say out loud enough. They do not consider it when they talk about accuracy alone. It could be a very misleading number. You can have a very high accuracy and a very low accuracy arbitrarily. Also, the number can change because you have basically calculated it on one distribution then validated it on another distribution. There is not a lot of explanation out there as to why.

When we look at how AI is performing while it is training, we do not look at just accuracy. We look at a whole bunch of different metric quantities. We develop some in-house for generalisability. With some, we also look at how the data is distributed and not just how it is housed. If it is supervised learning, how is the label distributed, how is the image distributed, how many images are not helpful, and what does that embryo look like?

When we say the number is low or high, we say: “It is low or high in this region for this type of image, this subset of this camera.” And when we know why as well, we can say: “It is low here, but we know it is low because the patient had an in-utero disease and there is no way that image could have known that.” This is a good example. Someone might have a medical condition and will never get a positive IVF outcome without surgery. The image cannot be expected to know that. Mapping these things together means we can understand what the actual AI distribution output is like, what the distribution of scores is, how the scores map to the subsets of the data. These are things that a lot of medical regulated products need to do to be able to pass the necessary bars.
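The point about a headline accuracy hiding weak pockets can be illustrated with a small Python sketch. It is not Presagen's in-house method: the grouping by camera type, the toy labels and the function names are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

# Each record is (group, true_label, predicted_label).
Record = Tuple[Hashable, int, int]


def overall_accuracy(records: List[Record]) -> float:
    """Fraction of records where the prediction matches the label."""
    return sum(int(t == p) for _, t, p in records) / len(records)


def accuracy_by_group(records: List[Record]) -> Dict[Hashable, float]:
    """Accuracy computed separately for each group (for example, camera type)."""
    correct: Dict[Hashable, int] = defaultdict(int)
    total: Dict[Hashable, int] = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


# Toy data: 18 correct predictions on camera_a, 1 right and 1 wrong on camera_b.
records: List[Record] = (
    [("camera_a", 1, 1)] * 9
    + [("camera_a", 0, 0)] * 9
    + [("camera_b", 1, 0)]
    + [("camera_b", 0, 0)]
)

print(overall_accuracy(records))   # 0.95
print(accuracy_by_group(records))  # {'camera_a': 1.0, 'camera_b': 0.5}
```

In this toy data the overall figure is 95%, yet the model is no better than a coin toss on the under-represented camera. That is the kind of subset a single accuracy number would hide.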

“When we look at how AI is performing while it is training, we do not look at just accuracy. We look at a whole bunch of different metric quantities.”

Adel: Would it be fair to say that understanding all the dimensions of the input space may affect the predictive power the model has? That this is an ongoing process? So every year, you learn about new pockets and new things in what is a very complicated biological phenomenon?

Jonathan: We definitely find we need to do post-market clinical follow-up. So we will contact clinics that have been using our product for a while and we get trend reports and understand what is going on. If it is not an in-market clinic, we will analyse how the clinic works before they use our product and understand what kind of camera they are using based on what we have seen before. Mostly, our technology scales correctly because we are using the most robust techniques to account for different camera designs. If the product will be based on a new camera, we do a check to make sure it is okay before committing.

We do a lot of pre-market analysis and post-market follow-up so we can understand the data and see if anything different has been done that might throw it off. If it is something we have not looked at before, we will create documentation for different studies we will complete pre-market or in clinical follow-ups. So if something works better in one space and worse in another space, those numbers and percentages are there to guide users into getting the best out of our product. I think that is a challenge as well, especially when you have a medical endpoint which is not directly related to the image. For example, you have to assume they are going to transfer the embryo to the patient right away. If they leave it sitting around, it will no longer be healthy and that may impact the outcome.

Adel: How much R&D time has been put into a product like Life Whisperer?

Jonathan: The R&D and regulatory side of things take up a significant amount of our time. There is a lot of IP generation, so we have a rich store of IP and continually do new R&D projects. We are really an R&D house putting together excellent IP. We currently have 5 patents at different stages that cover both specific health areas and general data analysis.

We brought Life Whisperer to market quite quickly — about 2 to 3 years. The timeframe depends on what point you say the product is actually in the market, because you are continually upgrading it and trying to increase sales. With Life Whisperer, we built a lot of our AI factory technology as we went along. We could bring it to market so much more quickly now, maybe a matter of months, because all the infrastructure is there. In fact, we have done this with new products because of our behind-the-scenes AI factory capability. We have those networks in place and a lot of the internal tools we need. I think we work fairly fast and it would be a real challenge, if not impossible, for anyone else to bring a regulated medical product to market in a shorter timeframe.

“I think we work fairly fast and it would be a real challenge, if not impossible, for anyone else to bring a regulated medical product to market in a shorter timeframe.”

Adel: I’d like to get your opinion on model errors, because there is no such thing as a perfect AI system. In the case of Life Whisperer, it sounds like the cost of an error is not that significant because at worst, you have lost a cycle. And if things are not working with the system, the embryologist can always go back to the human process. Is that how you see it?

Jonathan: I would say we take any issues of misclassification very seriously. It is a regulated medical product, so a lot of our risk analysis centres on misclassification and other potential errors. We produce extensive study documentation and publish in the peer-reviewed scientific literature, and make sure we monitor the total percentages and the positive predictive power and negative predictive power of the model, along with many other commonly quoted and understood metrics, as well as in-house metrics we develop.
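For readers unfamiliar with the two metrics named above, positive predictive value (PPV) is the fraction of positive predictions that are correct, and negative predictive value (NPV) is the fraction of negative predictions that are correct. The sketch below computes both from binary labels; the toy data is invented and the code is a generic illustration, not the company's monitoring pipeline.

```python
from typing import Sequence, Tuple


def predictive_values(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (PPV, NPV) for binary labels where 1 is positive and 0 is negative."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    # Guard against division by zero when a class is never predicted.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Invented toy labels: 3 true positives, 1 false positive,
# 3 true negatives, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
ppv, npv = predictive_values(y_true, y_pred)
print(ppv, npv)  # 0.75 0.75
```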

Misclassification could result in a healthy embryo being frozen when it did not need to be, or even potentially discarded. That creates a lot of trauma for a prospective mother going through multiple cycles. This is probably the largest risk we face — that Life Whisperer says no to an embryo that is actually healthy. We go through a lot of effort to make sure that the model is likely to have more false positives and treat false negatives as high risk.
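One common way to make a classifier prefer false positives over false negatives is to lower its decision threshold until the false negative rate on validation data falls under an agreed limit. The sketch below shows that idea only; the interview does not say this is how Life Whisperer does it, and the scores, labels and `max_fnr` limit are hypothetical.

```python
from typing import Sequence


def false_negative_rate(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> float:
    """Fraction of truly positive cases scored below the threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not positives:
        raise ValueError("need at least one positive example")
    missed = sum(1 for s in positives if s < threshold)
    return missed / len(positives)


def pick_threshold(
    scores: Sequence[float], labels: Sequence[int], max_fnr: float
) -> float:
    """Return the highest observed score usable as a threshold with FNR <= max_fnr.

    A case is predicted positive when its score is >= the threshold.
    """
    for threshold in sorted(set(scores), reverse=True):
        if false_negative_rate(scores, labels, threshold) <= max_fnr:
            return threshold
    return 0.0  # only reached if max_fnr is negative


# Hypothetical validation scores (higher means more likely viable) and outcomes.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

print(pick_threshold(scores, labels, max_fnr=0.25))  # 0.7
print(pick_threshold(scores, labels, max_fnr=0.0))   # 0.4
```

With the stricter limit the threshold drops from 0.7 to 0.4. No viable case is missed any more, but the negative case scored 0.6 now counts as a false positive. That is the trade-off being described.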

Adel: At Silverpond, we have seen situations when bringing the product out in the way it was envisioned became basically impossible because the relative cost of a failure was so high, even with a really low false negative rate of 0.00001. The economics of it would not work out. The way we eventually got around that was by designing different workflows and making the workflow less automated and more human-driven.

Jonathan: It is such a tricky thing because lots of human-related activities are dangerous. Even with driving to work, the chance of death is not zero. It is actually quite high compared to almost any other activity. I understand that. But if a robot is driving, is that equally risky or less risky than a human driver?

Adel: Exactly. Where does liability sit and how is it perceived? I think the industry is struggling with this question.

Jonathan: Definitely. I do not think there is an easy, one-size-fits-all answer.

Adel: What guarantees do you provide to customers? Are they statistical guarantees based on your clinical trials?

Jonathan: Yes, we have statistically significant study results, although we do not claim a guarantee. Other types of benchmarks are incorporated and called the indications for use. The medical industry is very specific around the terminology we can use for all these benchmarks. You have indications for use that have certain performance metrics and you are free to work out what is reasonable for these things based on what you would like. But you must be very clear about what you intend it to do. If it is going to be a diagnostic, that sometimes attracts more stringent regulatory burdens, because you are diagnosing something. Whereas for us, our products are decision support, not diagnosing, so they are not supposed to be used in isolation of the doctor’s knowledge of the patient’s medical history. Instead, they provide another lens, another data point the doctor can take into account on top of their knowledge. So when we position the benchmarks, we specify what we think is reasonable, what provides enough value to people, the class being regulated and what the risks are. We work out those numbers then specify these things in our indications for use. They state how the product should be used and what you can and cannot do.

“We go through a lot of effort to make sure that the model is likely to have more false positives and treat false negatives as high risk.”

Adel: Maybe there is an analogy with pharmacology products. Every drug lists good indications and contraindications and potential side-effects.

Jonathan: You are right. We use the same language because it is a regulated medical product, and the same framework is used by the pharmaceutical industry. It depends on the classification — if it is high regulation, or up regulate, or down regulate. So yes, we use the same kind of language. We outline the kind of studies we use, like whether they are retrospective studies or clinical trials or other kinds of study designs.

Adel: Life Whisperer was your first product. What others are you developing?

Jonathan: Our next product is for embryo genetic integrity analysis. We are really excited about it. It is approaching market readiness and about to go for regulatory approval. We already have Life Whisperer in the market and a foothold with lots of different industry partners and connections. It makes sense for our next product to be aligned so we can package them together. It would be quite similar to Life Whisperer in terms of the decision support aspect of it, but with its own separate metrics.

Adel: Are you always exploring what kinds of problems your industry partners and connections have so you can continue building additional solutions like this using the AI factory?

Jonathan: Yes. We have tabled a good number of product ideas internally. There are many opportunities in image analysis for the obstetrics and gynecology pipeline that will help with different diseases and issues. There is a huge amount of value that we could bring by creating or co-creating products, especially with image analysis and decision support.

Adel: With the Life Whisperer Genetics product, I am thinking again about model error. What does that mean in this context and what are the misclassification risks?

Jonathan: I use genetic integrity as a colloquial term. The correct term is aneuploidy, which means the embryo has genetic errors. Euploidy means the embryo is chromosomally normal. One of the risks is classifying an embryo as healthy when it is unhealthy or unhealthy when it is healthy. Whether it is a false negative or a false positive, there are risks either way.

When a patient goes through IVF, an additional in-field process is pre-implantation genetic testing for aneuploidy (PGT-A). The embryologist takes the embryo to a laboratory and does a biopsy so they can do a genetic analysis to work out if it is healthy. This is a risky and often expensive process that might damage the embryo. What we are hoping to do is provide a pre-screening tool, something that may not tell you everything about the embryo yet in the same way as full genetic sequencing but can pre-screen the embryos. The embryologist can then choose to remove the unhealthy ones and may do full genetic tests and select the healthiest ones for implantation.

If an embryo is misclassified, you can imagine the risks. If it is unhealthy but classified as healthy and frozen then implanted, the patient might not get a baby from it and there could also be a higher disease risk. If an embryo is misclassified as unhealthy, a healthy embryo may be frozen and the patient will end up spending more time and money on IVF. So we are trying to find a solution that provides value by making the embryologist confident about their decisions.

We would not necessarily encourage people to ignore other kinds of genetic testing. We are just saying that we have a system that can help people who might want to avoid some of the risks of the current approach and increase their chances of a healthy baby.

“There is a huge amount of value that we could bring by creating or co-creating products, especially with image analysis and decision support.”

Adel: Do you have a project management methodology for developing AI systems for different applications?

Jonathan: We do not adopt a particular textbook methodology for project management at this stage. We are still a small company. We do have a process for scouting out new opportunities and forming partnerships. But when we are having initial conversations with partners, it is not really from a project management perspective but more around what the problem is and what the product solution might look like for them.

Once we get into the process of productionising, that is much more robust in terms of process, because we have done it before, and it has to be very regulated. The process burden increases as we go along the productionisation pipeline. It becomes automated and we are very careful with the paperwork. Once we get into the prototyping phase and what the product development lifecycle will look like, there is even more process.

Adel: It sounds like a lot of that is around meeting regulatory obligations and testing.

Jonathan: It is, but there is also compliance. I see regulatory as being about respecting the country’s laws. But compliance is also traceability in case we are asked questions. If a customer asked for something and it was important, did we do a risk analysis? Did their request get ticketed? Did it go to a person who could put it into the customer’s app? Where are the designs for it? What version of the app did it end up being rolled into? So the traceability is something we take very seriously. We have ISO 13485, which is the medical ISO, for Life Whisperer.

Adel: Are your workflow support tools something you are selling, or intend to sell, to a customer who wants to manage their own AI applications, or do you just use them internally?

Jonathan: Just for internal use for now. It is always possible to change that if something we are doing provides a lot of value and other people are interested. For now, we do it for ourselves. AI engineers need to be thinking about the science and what is really going on with all this data. So having in-house tools we built ourselves to automate machines and AI training is really useful. It means they can check results much more quickly than they ever could wading through terabytes of data with just a laptop. How do you query, what do you query, how well is it going? All the important information that they want to be able to query easily is there. By making it easier to do this, our AI engineers can productionise ideas faster.

Adel: Do you see anything on the horizon in terms of promising new AI technology?

Jonathan: We always try to keep our eye on any new algorithms, but the real stuff we do here is not so much using a special custom algorithm. It is more about the way an algorithm is combined with the workflow to create a robust sequence of different models that do the right thing. New techniques for looking at data analysis are of more interest to us. We like conferences and new articles that shed light on how data can be analysed. We contribute to that space ourselves with articles and some of our patents on data analysis and data cleaning.

What we have found really powerful is understanding what clean data is. It has been a wildly successful technique that has really helped us out and is a big area that should be talked about more. So that is where our focus has been. Not necessarily the algorithm itself with the data, but how does the algorithm query the data and even criticise itself? Because the data needs to be treated carefully. Don’t just throw data at an algorithm and hope it will work. That’s a recipe for failure because there is no understanding of which data is really important and why.

Adel: The data ingestion and cleaning process is something you are trying to develop standards for so people can feed it into the AI factory. Why is this important?

Jonathan: We have developed a very crisp methodology for this. It has been a key focus for us. I would even argue that it was wildly successful in the way that we created data. We tried many different approaches when looking at data and they did not necessarily work well. But then we put our finger on what was really going on, and it was that we needed to understand the composition of the data itself. That really made a huge difference to our AI training.

At the moment, we have a patent that is about to go to PCT stage, so the methodology should be publicly available. We are planning to publish those results and already have the manuscript in preprint. We are working hard to get it published because we are really excited about it and want to get the word out there. We have also done a few blog posts about it and will probably put out more.

“Understanding what clean data is…has been a wildly successful technique that has really helped us out and is a big area that should be talked about more.”

As co-founder of Presagen, Dr Jonathan Hall is helping to transform global medicine through AI-enabled products, with an emphasis on women’s health.
Presagen’s AI Open Projects is an online platform that safely and privately connects data from clinics around the world to co-create medical AI products that are scalable, unbiased, accessible and affordable for all patients globally. Presagen’s first product is called Life Whisperer, which uses AI to better identify viable embryos for IVF and ultimately improve pregnancy rates for couples struggling with fertility.
Presagen has received multiple industry accolades, including the AIIA’s Startup of the Year and Machine Learning Innovation of the Year awards. Life Whisperer was named Global Winner — One to Watch for the Asia Pacific region at Talent Unleashed (where judges included Richard Branson and Steve Wozniak) and reached the top 5 finalists in TechCrunch’s Startup Battlefield Australia.
Jonathan has PhDs in theoretical particle physics and nanotechnology (specialising in biosensing in embryos). His personal awards include being one of MIT Technology Review’s Innovators Under 35 (Asia Pacific) and an InDaily 40 Under 40 Business Leader (South Australia). He also won InDaily’s Entrepreneurial Award.