
Graphs for Causal AI

2025/5/24

Data Skeptic

Topics
Utkarshni Jaimini: I work on causal neuro-symbolic AI, aiming to make artificial intelligence systems more explainable, so that they can understand cause and effect the way humans do rather than relying only on statistical correlations. Statistics falls short in safety-critical applications such as healthcare, where relying on correlation alone can lead to wrong conclusions and endanger human lives. We therefore need to bring causal reasoning into AI systems to ensure their decisions are reliable and safe. I represent causal relations with knowledge graphs and use neuro-symbolic methods to transform the knowledge graph into a vector space, enabling tasks such as link prediction. I also pay attention to the spurious relations that may exist in knowledge graphs, and have proposed methods such as CausalLP-Back to mitigate their effect. My ultimate goal is a holistic framework that integrates Bayesian networks and domain knowledge into one space, enabling more reliable and more explainable AI systems.

Utkarshni Jaimini: To understand causal relations more deeply, I study the backdoor path problem. A backdoor path is a non-causal path connecting a cause variable to its effect variable, and it produces spurious correlations. For example, there is a backdoor path between smoking and lung cancer: the smoking gene. The smoking gene leads people to smoke and also leads people to develop cancer, so one cannot simply say that smoking causes cancer; when studying the smoking-causes-cancer relation, the smoking-gene backdoor path must be taken into account. To address this, I developed the CausalLP-Back method, which performs link prediction while taking backdoor paths into account, removing them from the evaluation space to improve link-prediction accuracy. I have also studied the role of mediators in causal relations and proposed representing mediators with hyper-relational graphs. Through this work, I hope to represent causal relations more completely and accurately and apply them to real-world problems.


Chapters
The episode begins with a discussion on how network science predicted the election of the new Pope, using data from who ordained whom, official co-membership, and informal relationships among cardinals.
  • Network science was used to predict the election of the new Pope.
  • A research team from Italy used three main sources: who ordained whom, official co-membership, and informal relationships among cardinals.
  • Robert Prevost (Leo XIV) had the highest eigenvector centrality in the network.

Transcript


You're listening to Data Skeptic: Graphs and Networks, the podcast exploring how the graph data structure has an impact in science, industry, and elsewhere. I'm sorry, you know, to redirect, but I think we have something important to talk about that we forgot to mention. Yeah. Is that I think you should say Habemus Papam. We have a new pope. Ah.

The reason I mention it is because there was kind of a prediction here based on network science on who would be the next pope. Fascinating. Tell me more. A day before the pope was chosen, a research team from Italy, from Bocconi University, published a cardinal network, or a Vatican network, you could say. They used three main sources. They used who ordained whom.

Okay, because there's a kind of patronage system there; official co-membership in church institutions; and informal relationships, like two cardinals both being mentioned in an article, or something like that. So...

They defined three criteria of prominence for the cardinals. The first is status, which they measured using eigenvector centrality. And I think we've mentioned it before, but eigenvector centrality is a centrality where you get a high score if you are in the neighborhood of nodes that are themselves highly connected.

They used betweenness centrality to capture information control, measuring how much a node bridges between sides of the network. And they tried to measure a cardinal's ability to build coalitions using the local clustering coefficient: how many triads in the neighborhood of the node are closed, right?
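If you want to play with these three measures, here's a minimal sketch on a toy graph using networkx; the node names and edges are invented stand-ins, not the actual Bocconi data:

```python
import networkx as nx

# Toy stand-in for the cardinal network; edges mark ordination ties,
# co-membership, or reported informal relationships.
G = nx.Graph([
    ("Prevost", "A"), ("Prevost", "B"), ("Prevost", "C"),
    ("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"),
])

eig = nx.eigenvector_centrality(G, max_iter=1000)  # status
btw = nx.betweenness_centrality(G)                 # information control
clu = nx.clustering(G)                             # coalition building

for node in G:
    print(f"{node}: eigenvector={eig[node]:.3f}, "
          f"betweenness={btw[node]:.3f}, clustering={clu[node]:.3f}")
```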

So they used these three measures, and the article came out the day before, and they found that Robert Prevost, or should I say Leo XIV, had the highest eigenvector centrality in the network.

I wish I would have known this before. I would have put some money on him. Yeah. But fascinating. Did they explore if that technique could have predicted previous popes? Great question. But the reason I know they didn't or the reason I guess it got traction is because it succeeded, right? If it

Sure. If it didn't succeed, we wouldn't – There were 10 other papers predicting it with other methods where it didn't work and we don't know. I'm sure there was. I'm sure there were. Well, it's great to know that they reverse engineered the great papal algorithm, right? Don't the cardinals go into the room and run some software and that's how they decide? Yeah.

The machine overheats and the black smoke comes out. The white smoke. Well, that's once it succeeds, yeah. Yeah, exactly. Well, I'm glad that was resolved then. They have a new leader. And network science predicted it. So it will be interesting to follow the next round of this and see if the same techniques can yield a fruitful prediction. I wish Leo the 14th a long life.

And a healthy and long life. People are pointing out that he's a Chicagoan like myself, so that's interesting. Anyway, today we're getting into a topic we've touched on before, which is knowledge graphs, but I think in a new context. It ties into issues of causality and Bayesian reasoning and these sorts of things. Asaf, what did you think of the paper we're going to discuss today? Actually, it was very interesting.

But, you know, you don't have to convince me that networks help to, you know, explain stuff, right? So the whole idea behind modeling our data as a network is to simplify, understand, and explain the data, right? Sure. But when I think of a knowledge graph, well, actually, first, an aside from my notes: one of the interesting insights she shares is that knowledge graphs are inherently incomplete. There's always a link missing, right? We don't know the full...

Indeed. Indeed.

My name is Utkarshni Jaimini, and I'm currently a grad student at the University of South Carolina, at the Artificial Intelligence Institute. This is my last semester. I'm graduating in summer, and I'm hoping to be an assistant professor at a university pretty soon in the fall. Can you share a few details on what you're studying? What's the research you're doing in grad school?

I work on causal neuro-symbolic AI, which lies at the intersection of causality and neuro-symbolic AI. And my goal in research is: how do I make artificial intelligence more explainable to people? Can we do causality? We as humans understand cause and effect.

But these AI systems do not. They're mostly learning based on correlations or statistical relations. So I want AI systems to learn human-like causal understanding, human-like causal reasoning. So that's my goal. Why does statistics fall short? We know software is really good at doing statistics and statistics can find correlation. What else do we need? Correlation doesn't get you causation.

I'm sure people have heard about spurious correlations, where we see that ice cream sales go up and, at the same time, more people are drowning. Statistics at that stage leads to the conclusion that higher ice cream sales lead to more people drowning in the pool or, let's say, the ocean. But we need to dig into what the cause behind it is.

There are safety-critical applications of AI, such as healthcare, where we don't want to just rely on correlation. Let's say there's a disease which has a high correlation with a symptom. You don't want to say that this is the cause.

But it's not a cause. It's just one statistical relation that you have learned. It is just correlation. And if we think about using AI in these safety-critical applications, such as healthcare or industrial manufacturing or autonomous driving, we want to move a step further into causation. Because if we make a wrong claim, there's a human life at stake.

In the statistics world, we've got standard methods. Everybody knows the t-test and the chi-square test and these sorts of things. Are there established methodologies for examining causality? So the t-test, the independence tests, they are a good predictor for how likely something could be a possible cause. And we do have similar criteria in causality. I follow Judea Pearl's

philosophy of do-calculus. In that framework we have something called an intervention, and to measure the effect of interventions there are parameters such as the total causal effect, which estimates the effect of an intervention on the treatment variable and sees how it affects the outcome variable.

Similarly, if there's a mediator involved, think about it as a serial chain: A causes B, B causes C, so B is a mediator. Now you want to estimate how the mediator influences the causal relation from A to C. For that, we have two more types of effect estimation, the causal direct effect and the causal indirect effect. So those would be the three, I would say, estimates.
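To make those estimates concrete, here is a toy simulation of a linear structural causal model with a mediator, assuming made-up coefficients; it illustrates intervention-based effect estimation in general, not the guest's specific models:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Linear SCM: A -> B -> C plus a direct edge A -> C; B is the mediator.
b_AB, b_BC, b_AC = 0.8, 0.5, 0.3

def mean_outcome(do_A=None, do_B=None):
    A = rng.normal(size=n) if do_A is None else np.full(n, float(do_A))
    B = (b_AB * A + rng.normal(size=n)) if do_B is None else np.full(n, float(do_B))
    C = b_BC * B + b_AC * A + rng.normal(size=n)
    return C.mean()

# Total causal effect: intervene on A and compare outcomes.
tce = mean_outcome(do_A=1) - mean_outcome(do_A=0)                  # ~0.7
# Controlled direct effect: vary A while holding the mediator fixed.
cde = mean_outcome(do_A=1, do_B=0) - mean_outcome(do_A=0, do_B=0)  # ~0.3
# Indirect effect through the mediator (in this linear model).
ide = tce - cde                                                    # ~0.4

print(f"total={tce:.2f}, direct={cde:.2f}, indirect={ide:.2f}")
```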

Well, I'm pretty familiar with certain symbolic techniques, like the historical ones, maybe first-order logic and this kind of stuff. But you said specifically neuro-symbolic. How does it differ? So neuro-symbolic does have logics in it, yes. And when I talk about neuro-symbolic, I'm talking about knowledge graphs specifically. So everything is in the form of a triple.

The link from A to B is a triple in a knowledge graph, represented using RDF, or description logic if you want to use that. There are different types of knowledge graphs. People have used labeled property graphs; my research deals with triple-based knowledge graphs, RDF and RDF-star knowledge graphs. So everything would be in the form of a triple, A causes B. That's the symbolic part of neuro-symbolic, that's the knowledge graph.

and we transform this knowledge graph into a vector space or an embedding space. So every node would be represented or every relation would be represented as a vector.

and that becomes the neuro part of it, and combining those two becomes neuro-symbolic. Once you have converted this knowledge graph into the vector space, you can do certain AI applications or downstream tasks, such as link prediction. So given a partial link, A causes what, I want to predict: does A cause B, or does A cause some other node, C or D, for example.
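As a rough illustration of that tail-prediction task, here's a sketch with random, untrained TransE-style vectors; in practice the embeddings would be learned from the triples, and the entity and relation names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 32
entities = ["A", "B", "C", "D"]

ent_vec = {e: rng.normal(size=dim) for e in entities}  # entity embeddings
causes = rng.normal(size=dim)                          # "causes" relation vector

def score(head, tail):
    # TransE intuition: head + relation should land near the true tail,
    # so a smaller distance means a more plausible triple.
    return -np.linalg.norm(ent_vec[head] + causes - ent_vec[tail])

# Link prediction for the partial triple (A, causes, ?):
ranked = sorted((e for e in entities if e != "A"),
                key=lambda t: score("A", t), reverse=True)
print("Predicted tails, most plausible first:", ranked)
```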

Then some other tasks could be clustering, so clustering similar nodes in the knowledge graph together, or classifying nodes in the knowledge graph. Well, I hope most listeners will be familiar with text embeddings, like what goes on behind the scenes in large language models. You give it some text, you get a vector out, and that vector has these spooky properties, you know, king minus man plus woman equals queen kind of stuff. There's context to it.

Are there equivalents that you find in the embeddings you do, or are there nuances to that process to get a proper knowledge graph embedding?

In today's data-driven world, the ability to extract value from data isn't just an advantage, it's essential. Mastering analytics can transform both your career and the organization you work for. It's your turn to transform your career and drive organizational success through analytics. Let me tell you about the Scheller College of Business' Business Analytics Graduate Certificate at Georgia Tech.

It's 100% online. Scheller College ranks in the top 10 U.S. business schools. For busy business analytics professionals, they have a world-class faculty that can help you graduate in as little as a year.

But maybe you're busy like me and you want to take it a little slower. You can combine flexibility with rigorous education. Scheller's Graduate Certificate Program adapts to your life, not the other way around. Their program is designed for professionals like us who want to leverage data and solve real-world business challenges, but need flexibility with their time and schedule.

That's why you can schedule your classes in a way that makes sense to you. On top of that, you're not just earning a certificate. You're potentially opening doors to Georgia Tech's prestigious MBA programs. Now is the time to become a data-savvy leader with Georgia Tech's Business Analytics Graduate Certificate. Applications are open for spring 2026.

Visit techgradcertificates.com to learn more and apply before the August 1st deadline at techgradcertificates.com.

Delete.me makes it easy, quick, and safe to remove your personal data online at a time when surveillance and data breaches are common enough to make everyone vulnerable. Your data is a commodity. Anyone on the web can buy your private details. This can lead to identity theft, phishing attempts, or harassment. But now you can protect your privacy. That's why I've been using Delete.me. One of the best things about the service is when you first sign up, they give you the flexibility to start with just basic information. You choose what details you want them to protect.

I started conservatively, but after seeing their detailed removal reports and experiencing their service firsthand, I felt confident enough to expand my protection. The peace of mind that comes with Delete.me's service is invaluable. Knowing that a dedicated team of human privacy experts is actively working to protect your personal information lets you focus on what matters most in your life.

Some removals happen within 24 hours, while others might take a few weeks. But Delete.me manages it all. They keep you informed throughout the process and their quarterly reports show you exactly what they're doing to protect your privacy. Take control of your data and keep your private life private by signing up for Delete.me now at a special discount for our listeners. Today get 20% off your Delete.me plan by texting DATA to 64000. The only way to get 20% off is to text DATA

to 64000. That's DATA to 64000. Message and data rates may apply.

Well, I hope most listeners will be familiar with text embeddings. Are there equivalents that you find in the embeddings you do, or are there nuances to that process to get a proper knowledge graph embedding? There is. So there are different knowledge graph embedding techniques that you can use. So as you clearly mentioned, the king and queen. So we have similar distance-based embedding space.

there are like matrix-based loss functions. So the embedding space could be very similar to the text embeddings. And is there a lot of work to be done there, deciding how big the embedding is or customizing and tweaking that, or is it kind of off the shelf?

Right now it's off-the-shelf methods, but I'm definitely looking into how I can improve this embedding space. Another parameter that we are looking into is called causal effects. As I said, we have these three different types of causal effects,

and how to incorporate these effects into the embedding space directly. At present, I'm using a seminal work which has been done in this area, a paper they call FocusE, where they have incorporated the weights into the loss function.

But currently what I'm looking into is can I incorporate weights in the embedding space directly and use it for the downstream tasks further? So that's an open research area that I'm interested in personally. What are some of the common applications of knowledge graphs?

So one common application, as I said, is link prediction. Then knowledge graphs are also used for explanations, because they are considered to be the domain knowledge. For example, there are biomedical knowledge graphs and gene knowledge graphs out there, which basically means everything that has to be known about a gene or any particular disease or symptom is in that knowledge graph.

So they can be used for explanation. Let's say a neural network gives you an outcome, I'm taking a healthcare example, that you have this particular disease. You can map that disease to a node in the knowledge graph, trace the path, and see how the disease is linked to a symptom or to a certain gene that you might have observed in your data set.
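That path-tracing idea is easy to sketch with networkx; the nodes and relation labels here are hypothetical placeholders, not from a real biomedical knowledge graph:

```python
import networkx as nx

# Toy biomedical knowledge graph as a directed, labeled graph.
kg = nx.DiGraph()
kg.add_edge("GeneX", "DiseaseY", relation="associated_with")
kg.add_edge("DiseaseY", "SymptomZ", relation="causes")

# Trace how a predicted disease connects to an observed symptom or gene.
path = nx.shortest_path(kg, "GeneX", "SymptomZ")
for u, v in zip(path, path[1:]):
    print(u, f"--{kg.edges[u, v]['relation']}-->", v)
```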

The problem of link prediction, why we do link prediction, is based on the assumption that a knowledge graph is inherently incomplete. We do say that it has domain knowledge, but we also make the assumption that there is still information or knowledge we have not incorporated, or relations which do not yet exist in the knowledge graph. And that's why we have this application of link prediction.

So it makes sense that the knowledge graph would be incomplete. Is there a worry that it has some, I guess, false positives as well, edges that are sort of spurious? That's an interesting question. That basically boils down to what I work on. That's the backdoor path, which deals with the spurious relations that might exist. Inherently, it should not have them; it depends on how you're creating the knowledge graph. Let's say you use an LLM to create a knowledge graph,

and then there are high chances that these spurious relations exist in it. But on the other hand, for the knowledge graphs which are created by a domain expert, the assumption is that since they are the domain expert and those knowledge graphs are based on ontologies, they are less likely to have those spurious relations. But in the case where you do have these spurious relations, that's where my work fits in: how can we use CausalLP-Back

and the backdoor paths in the knowledge graph itself to mitigate the effect of those spurious relations in the downstream tasks. What is a backdoor path? A backdoor path is basically a non-causal path which connects the cause variable to its effect variable

in a way that influences how the cause variable affects the effect variable; the nodes along it are confounders. So if I take the example of A causes B, there might be some common variable, let's say a variable D, which is affecting both A and B at the same time.

So, to just separate out the relation A to B, we have to make sure that we are taking the variable D into account. And do you know that from just context or the existing network, or is it something that can be learned? That can be learned. Also, we do get to know this in terms of context. So, I'm again going back to the healthcare example because it is easy for everybody to understand. So, there might be a disease which is causing a symptom.

There might be a context in which that disease would cause a symptom. So medical experts or clinicians do take that into account when they are treating a certain patient. They might not give the same decision or the same diagnosis to every other patient that they see. So just taking an example of, let's say, age affecting, you know, the blood pressure, normal blood pressure level.

So what is normal for me might be different than what is normal for somebody who is 60 years or 80 years old. So there's a context that you have to take in account. It can be learned. So in a way, we have causal network learning algorithms, which are known as causal structure learning algorithms.

So once you learn the structure of a causal Bayesian network, there are existing methods which will tell us, for a given pair of nodes, which backdoor paths exist between those two nodes.
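Here is a minimal sketch of that idea on a toy causal DAG; it anticipates the smoking example discussed next. Note the simplification: it only checks that a path leaves the cause through an incoming arrow, and a full treatment would also account for colliders blocking a path:

```python
import networkx as nx

# Toy causal DAG: Gene -> Smoking, Gene -> Cancer, Smoking -> Cancer.
dag = nx.DiGraph([("Gene", "Smoking"), ("Gene", "Cancer"),
                  ("Smoking", "Cancer")])

def backdoor_paths(dag, cause, effect):
    """Paths between cause and effect that start with an edge INTO cause."""
    skeleton = dag.to_undirected()
    return [p for p in nx.all_simple_paths(skeleton, cause, effect)
            if dag.has_edge(p[1], p[0])]  # first hop points at the cause

print(backdoor_paths(dag, "Smoking", "Cancer"))
# prints [['Smoking', 'Gene', 'Cancer']], the spurious route via the gene
```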

Is it possible to do a real-world example, or maybe a hand-wavy one, of a backdoor path, like in the medical case or perhaps something else? What would one look like? So a backdoor could be thought of as a common cause or a confounder. There's a classic causal Bayesian network with the relation between smoking and lung cancer.

There's a common gene. Smoking does not always cause lung cancer. There's a smoking gene that causes people to smoke, and there's also a relation from that smoking gene to people developing cancer.

So that's the backdoor path. Once you're looking at the effect of smoking causing cancer, you are missing this context of the smoking gene. So you cannot simply say that smoking causes cancer; it has to be in the context of the smoking gene. So we have to take this backdoor path

into account when actually looking at this relation from smoking to cancer. Why does the backdoor path become a problem? It leads to spurious correlations, in a way. As I said, if somebody is smoking, are we going to say, oh, this person is going to have cancer?

That's not entirely true. They might or might not. Smoking is not necessarily the cause of somebody having cancer. Not necessarily, right? Not necessarily, yes. There might be other parameters which we have to take into account before we make this claim. With that concept in place, we can probably start talking about CausalLP-Back. What is the purpose of this architecture? What are the goals?

Let me give a little bit of background before I actually dig into CausalLP-Back: why did I even start looking into causal neuro-symbolic AI? I hope that will make sense of CausalLP-Back. I started my PhD working on a healthcare project. I was coordinating my advisor's NIH R01 where we were dealing with pediatric asthma patients.

And we wanted to develop a personalized Bayesian inference model. Asthmatic patients are highly affected by outdoor parameters. Right now, pollen is way high in most of the cities, and that tends to flare up people's asthma. You can see there's a correlation. So there's a predictor there, so you can have a high conditional probability. But can I say that pollen is the only cause?

There might be other outdoor environmental parameters going on. So we have ozone, we have AQI. They might be going up all together at the same time. So what I was interested in was to pinpoint what is the probable cause.

To practically do so, you would have to take that patient and put them into just the pollen environment, just the AQI environment, or just the ozone environment. And that is practically not feasible. That's why we do randomized controlled trials, but I cannot run randomized controlled trials on these pediatric patients. So that inspired me to look into causality. So how do I do that?

And I felt knowledge graphs, on the other hand, gave me a lot of domain knowledge. It's practically not possible to go to a doctor and tell them, hey, I have all this data, analyze it for me, give me all your domain expertise. They don't have the time to do that. But if you have a knowledge graph of a disease, or of any domain that you're looking into, you

you can in a way supplement or complement the actual domain expertise using the knowledge graph. So I thought, why don't I combine these two domains together? That's how causal neuro-symbolic came about. Then I started looking into how people have represented causality. In the causal domain, causality is represented using structural causal models or causal Bayesian networks,

and I wanted to see how people are representing causality in knowledge graphs, which are simple and triple-based. So A causes B, and then B causes C if it's a serial relation. And I found a gap there. So I thought, okay, people are representing causality, that is great, that's a good start. But what happens if there's a serial relation, A causes B, B causes C? They would separate it into two different triples, which would be A causes B, then B causes C,

and then maybe a third triple which says A causes C. But when you represent this triple, A causes C, you miss the context, which is B, and that is very important.

My goal was first to enrich the neuro-symbolic AI space with the representation of causality which exists in the causal Bayesian network. I wanted to take whatever is in the causal Bayesian network and put it into the neuro-symbolic AI space. So we created a causal ontology which is a one-to-one mapping from the Bayesian network to the neuro-symbolic AI space, the knowledge graph space.

So it's not just about mapping the structure, but it is also about mapping the concepts from the causal Bayesian network space into the neuro-symbolic AI space. So I've mapped the weights. I can do that. I can incorporate them into the embedding space. I can use them in the loss function to do my downstream tasks. The second concept that exists in a causal Bayesian network is Markov condition.

or causal Markov assumption, which basically means that any particular node is directly affected only by its parents, its direct causes, and, given those parents, it is independent of the rest of its non-descendants. So I also wanted to make sure that we have all of that in one space, which would be the neuro-symbolic AI space.

I've used a methodology called the Markov-based split when we are evaluating that scenario.

Then comes the backdoor path. So there are some confounders, and we have links in the knowledge graph. If you do normal link prediction, because of those confounders, we saw that our results tend to show inflated performance, which looks good for us. But practically, if you think from a causal standpoint, those are not the correct results. My basic goal was to have a holistic framework where

I don't need just the Bayesian network to do all the tasks that I do. So I have this one framework where, along with the Bayesian network, I also have the domain knowledge, everything incorporated into one space.

So that's what motivated me to look into what we call CausalLP-Back: doing link prediction, but taking the backdoor paths into account. So we remove the backdoor paths from the evaluation space when we are doing link prediction tasks.
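As a hypothetical sketch of that evaluation-time filtering (the idea, not the paper's exact procedure), one could drop test triples whose endpoints are connected by a known backdoor path:

```python
# Evaluation triples for link prediction (head, relation, tail).
test_triples = [
    ("Smoking", "causes", "Cancer"),
    ("IceCreamSales", "causes", "Drowning"),  # confounded by season
]

# Pairs connected by a backdoor path, e.g. found from the causal network.
backdoor_pairs = {("IceCreamSales", "Drowning")}

filtered = [t for t in test_triples if (t[0], t[2]) not in backdoor_pairs]
print(filtered)  # spurious, confounded links no longer inflate the scores
```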

Can you remind listeners what the weights represent? The weights are basically telling you how much. So given this relation, A causes B, we have something known as an intervention. Let's say I have this relation that taking aspirin

causes my headache to go away. Now I want to intervene and say, no, I did not take an aspirin. Now tell me, does my headache go away or not? And how does that affect my headache being cured? This number is what we call the causal weight of the relation. So how much is taking Advil or aspirin actually curing my headache?

So I think a traditional approach would be to use those weights directly in machine learning, in the loss function. A mistake on something with a small weight is not so bad, but if it has a big weight, you want to get that one correct. Exactly. Why is that not sufficient? It depends on what task you are doing. In terms of link prediction, this is not a well-known strategy that people have used.

Typically people just take the causal relation into account. But there might be different relations. So there might be a relation from A to B and from A to C. If you're doing link prediction, all you're doing is taking the relations A causes B and A causes C into account. However, the weight of, let's say, A causes B is maybe, let's say,

5%, versus the weight from A to C, which is, let's say, 10%. So you would want your algorithm, for A causes what, to rank C as more likely than B. That's where the weights come into play in the link prediction task.

Can you talk more about how you get that included in the embedding itself? How does your technique differ? We are using some of the existing work that has been done on bringing weights into the embedding space, from a paper on incorporating weights into link prediction tasks, where they call the embedding technique FocusE.

And they have included it in the loss function. You could use it with any embedding algorithm. Currently, they have implemented it with four commonly used embedding methods: TransE, DistMult, ConvE, and, I guess, HolE. And the way they're doing it is, let's say something has a

loss function; let's take the distance-based loss function. You would want to take this number into account when you are calculating the distance between node A and node C, or node B.
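A rough sketch in the spirit of FocusE (simplified, not its exact formulation): the causal weight scales how much each triple contributes to a margin-based ranking loss, so heavily weighted triples are pushed harder to rank correctly:

```python
def weighted_margin_loss(pos_dist, neg_dist, causal_weight,
                         margin=1.0, beta=1.0):
    """Margin ranking loss scaled by a numeric edge weight.

    beta interpolates between ignoring the weight (0) and fully
    trusting it (1); the details here are a simplification.
    """
    w = beta * causal_weight + (1.0 - beta)
    return w * max(0.0, margin + pos_dist - neg_dist)

# A strongly weighted triple (0.9) is penalized more for a bad ranking
# than a weakly weighted one (0.1).
print(weighted_margin_loss(0.8, 1.2, causal_weight=0.9))  # 0.54
print(weighted_margin_loss(0.8, 1.2, causal_weight=0.1))  # 0.06
```

Could we revisit the Markov-based split? What mistakes would you make if you didn't do that?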

Some of the mistakes you would end up making if you don't include the Markov-based split: you've already trained on some of the links and you are testing on those links. So you're biasing your test set.

Is that like leakage? That is actually data leakage. When you split the causal network, if you do a random split, most likely some of the nodes you have included in the testing set will have links which are already included in the training set, or will already have a link to their parent in the training set. So when you're doing link prediction,

that introduces bias, because you've already learned those links from your training set. So now when you do link prediction, obviously you'll have better performance. And we saw

that if we don't include the Markov-based split, the mean reciprocal rank, which is a common link prediction metric in the knowledge graph embedding space, increased by 42.3%. Good evidence that it needs to be there. Especially if you're doing causality, because we want to incorporate causality into the neuro-symbolic AI space, so we have to take into consideration these phenomena which exist in the causal Bayesian network space.
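One way to implement that idea, as a sketch rather than the paper's exact Markov-based split, is to split triples by child node instead of by edge, so no test node has some of its incoming links memorized during training:

```python
import random

# Toy causal edges (parent -> child).
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]

# A naive edge-level split can leak: ("B", "D") in train and ("C", "D")
# in test share the child D, so the test link is partially memorized.
# Splitting by child keeps all edges into a node in the same fold.
random.seed(0)
children = sorted({child for _, child in edges})
test_children = set(random.sample(children, k=len(children) // 3))

train = [e for e in edges if e[1] not in test_children]
test = [e for e in edges if e[1] in test_children]
print("train:", train)
print("test: ", test)
```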

So we've gone through some of the steps, starting from the causal relationships into the network, removing those backdoor paths. I guess it all amounts then to predicting new causal links in that knowledge graph.

Is that with traditional methods or have you done something innovative there as well? So that's one of the traditional methods that people tend to do. One innovative way I have incorporated is using weights in doing causal link prediction. The other work that we have looked into is incorporating mediators.

So this is another property of a Bayesian network that you have mediators. So the serial relation of A causes B, B causes C. B is a mediator in this case. And in that scenario or situation, we have used hyperrelational graphs.

So those are not simple triples. The triple A causes C would have another hyper-relation attached to the causes relation itself, let's say has mediator, pointing to the mediating node B.
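In a hyper-relational, RDF-star-style representation, the mediator rides along as a qualifier on the base triple instead of being flattened away. A toy rendering, with a hypothetical has-mediator qualifier name:

```python
# Statement-level qualification: the base triple keeps its context.
# In RDF-star Turtle this might read:
#   << :A :causes :C >> :hasMediator :B .
fact = {
    "triple": ("A", "causes", "C"),
    "qualifiers": [("hasMediator", "B")],  # mediator context preserved
}

# A plain triple store would flatten this into (A, causes, B),
# (B, causes, C), (A, causes, C) and lose that B mediates A -> C.
print(fact)
```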

So this is another innovative piece of work that we have looked into for causal neuro-symbolic AI, incorporating causality into neuro-symbolic AI. And we saw good results, a significant improvement, when you take the mediator into account while doing a link prediction task. And what sort of metrics or evaluations do you use to look at the quality of the output?

There are two metrics that we commonly use. One is the mean reciprocal rank, the MRR score. The other one is hits at K. We typically use hits at 1, 3, and 5, sometimes at 10, which basically tells you, when you do the link prediction task, how likely you are to get the correct result within the top K.

So how likely is the very first result the correct one? How likely is the correct result within your top three, for k equals 3?
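Both metrics are simple to compute from the rank the model assigns to the true entity in each query; a minimal sketch:

```python
def mrr(ranks):
    """Mean reciprocal rank: average of 1/rank of the correct entity."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of queries where the correct entity is in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

# Ranks of the true tail entity across four link-prediction queries.
ranks = [1, 3, 2, 8]
print(f"MRR={mrr(ranks):.3f}, "
      f"Hits@1={hits_at_k(ranks, 1):.2f}, "
      f"Hits@3={hits_at_k(ranks, 3):.2f}")
```

And are there any standard data sets you can use as benchmarks? Absolutely. So we are using a causal reasoning benchmark data set. That data set is not so common in the knowledge graph space.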

But we have transformed that data set, a visual causal reasoning benchmark called CLEVRER-Humans, and converted it into a knowledge graph. The data set itself consists of videos; there are 700-plus videos of objects of different shapes, sizes, and materials coming into the frame and colliding with each other,

and they have defined at least 27 different types of collision events. Every collision event is different; hit has different semantics than push, although they are all collisions. And we have converted this data set into a causal CLEVRER-Humans knowledge graph,

which we are using as a benchmark to test on. But if you ask whether there are causal, knowledge-graph-based benchmarks available out there, there is still work to be done in that domain, and that's something we are moving forward on.

One other causal knowledge graph that people have used is a Wikipedia-based causal knowledge graph called Wiki-Causal. But one area we are actively looking into is creating benchmarks to test our methodologies. With the one you have, can you compare your method to others? Is it apples to apples in some way that's useful? We start by looking from a causal network standpoint.

The other causal knowledge graphs, like the Wiki-Causal knowledge graph, and one more called CauseNet, look at causality from a text point of view. They learn causal relations from textual data. So it's not exactly an apples-to-apples comparison there, but we can definitely work out something where we compare how Wiki-Causal, CauseNet, and other causal knowledge graphs perform on the methodologies that we have implemented.

But we come from the baseline. As you said, our goal was to learn new causal relations. Another problem that exists in the causal space is causal discovery, where, given observational data, you want to learn a causal network.

And our assumption, supported by some of the studies we have seen, is that observational data tends to be incomplete. It is just practically impossible to observe everything in the world. If you use traditional structure learning algorithms on observational data, then because of the incompleteness of the observed or collected data, the network you learn is going to be incomplete. So we come from that perspective.

My goal, in a way, is this: I have this incomplete causal network, which I have either learned, or a domain expert has told me, okay, from this observational data, this could be the possible network you would have learned, or at least this is a snapshot, a small network, that we can give you.

Another problem with structure learning algorithms, and this is especially true for real-world examples, is that at times they learn relations which are not true, because they are just looking at conditional probabilities or some statistical relation between the variables.

So you definitely need a domain expert in the loop to tell you whether these relations actually make sense or not. And that's where the knowledge graph helps. So in a way, we have transformed this problem of causal discovery into a causal link prediction problem in the neuro-symbolic AI space.

That's a good description. Yeah, that makes it possible then really. Exactly. So now you can start with your simple or first with your causal network, which you have either learned using the existing algorithm or a domain expert has provided to you. Now we map this into the knowledge graph space and use the knowledge graph techniques to learn it. So we come from bottom up.

So starting with the network and then enriching it. However, the other causal knowledge graphs out there are more at a step-two level, I would say. It's not about the causal network; it's that, given some text, they've learned causal triples from it, and then they do causal link prediction.

Are there any particular problems, or maybe industries or things like that, where you think the CausalLP-Back architecture can make a big impact?

Oh, absolutely. Since I keep coming back to the healthcare example: safety-critical industries and applications are where causality should be implemented head-on at the moment. So healthcare is one area, where we can ask what is the cause of a certain disease or a certain condition. The other area that we have looked at briefly,

and are collaborating with, is smart manufacturing. At the University of South Carolina, we have the McNair Aerospace Center and a NEXT team, which works on creating a smart assembly pipeline where we assemble toy rockets. And there are anomalies happening in the assembly pipeline; an anomaly would mean that the assembled toy rocket is missing the nose, the top of the rocket, the base, or one of the body parts of the rocket.

So doing causality there, so looking into root cause analysis in the manufacturing setting. So that's definitely one application area that I see. The other area could be that we have also briefly looked into is autonomous driving. Can we actually reason about the action that the car has taken, the autonomous vehicle has taken? So these are some of the top areas where I would like to see the application of causality.

And how do the techniques benefit explainable AI? Since we are using knowledge graphs in this case, explainability is one of the benefits of knowledge graphs: they are inherently explainable for humans. So we can definitely explain by connecting and tracing back

our results to the other nodes and the paths in the knowledge graph, to explain why the link prediction algorithm or the neural network gave us a certain result. What's next for you?

I have a couple of offers lined up for a position as assistant professor. So that's my next step and I'm very excited for it. Some of the future work that we have talked about would be part of my proposals and maybe PhD students who I guide in my journey would be working on those exciting topics.

I like interdisciplinary work. So my future work will mostly be an extension of what I'm currently doing, but in different and exciting domains and with some exciting domain experts. And is there anywhere listeners can follow you online? They can always follow me on my LinkedIn page.

If anybody's interested in collaborating or working on the same project or a similar area, they can always send me a message on LinkedIn. They can look up my website. So I have my website mentioned on my LinkedIn page. So there's another way that they can always look me up or send me a message or send me an email. Sounds good. We'll have links in the show notes for listeners to follow up.

Thank you so much for taking the time to come on and share your work. Oh, thank you so much for having me. It was lovely talking to you.