This is something you won't like.
But here everyone is a liar.
Don't take it too personally.
What I mean is that lying is very common
and it is now well-established
that we lie on a daily basis.
Indeed, scientists have estimated
that we tell around two lies per day,
although, of course, it's not that easy
to establish those numbers with certainty.
And, well, let me introduce myself.
I'm Riccardo, I'm a psychologist
and a PhD candidate,
and for my research project I study
how good people are at detecting lies.
Seems cool, right? But I'm not joking.
And you might wonder
why a psychologist was then invited
to give a TED Talk about AI.
And well, I'm here today
because I'm about to tell you
how AI could be used to detect lies.
And you will be very surprised
by the answer.
But first of all,
when is it relevant to detect lies?
A first clear example
that comes to my mind
is in the criminal investigation field.
Imagine you are a police officer
and you want to interview a suspect.
And the suspect is providing
some information to you.
And this information actually guides
the next steps of the investigation.
We certainly want to understand
if the suspect is reliable
or if they are trying to deceive us.
Then another example comes to my mind,
and I think this really affects all of us.
So please raise your hands
if you would like to know
if your partner cheated on you.
(Laughter)
And don't be shy because I know.
(Laughter)
Yeah. You see?
It's very relevant.
However, I have to say
that we as humans
are very bad at detecting lies.
In fact, many studies
have already confirmed
that when people are asked to judge
if someone is lying or not
without knowing much
about that person or the context,
people's accuracy is no better
than the chance level,
about the same as flipping a coin.
You might also wonder
if experts, such as police officers,
prosecutors
and even psychologists
are better at detecting lies.
And the answer is complex,
because experience alone
doesn't seem to be enough
to help detect lies accurately.
It might help, but it's not enough.
To give you some numbers:
in a well-known meta-analysis
from 2006,
scholars found that naive judges' accuracy
was on average around 54 percent.
Experts performed only slightly better,
with an accuracy rate of around 55 percent.
(Laughter)
Not that impressive, right?
And ...
Those numbers actually come
from the analysis
of the results of 108 studies,
meaning that these findings
are quite robust.
And of course, the debate is also
much more complicated than this
and also more nuanced.
But here the main take-home message
is that humans are not good
at detecting lies.
But what if we created an AI tool
with which everyone could detect
if someone else is lying?
This is not possible yet,
so please don't panic.
(Laughter)
But this is what we tried to do
in a recent study
that I did together
with my brilliant colleagues
whom I need to thank.
And actually, to help you understand
what we did in our study,
I need to first introduce you
to some technical concepts
and to the main characters of this story:
Large language models.
Large language models are AI systems
designed to generate outputs
in natural language
in a way that closely mimics
human communication.
If you are wondering how we teach
these AI systems to detect lies,
here is where something called
fine-tuning comes in.
But let's use a metaphor.
Imagine large language models
as students
who have gone through years of school,
learning a little bit about everything,
such as language, concepts, facts.
But when it's time for them to specialize,
like in law school or in medical school,
they need more focused training.
Fine-tuning is that extra education.
And of course, large language models
don't learn as humans do.
But this is just to give you
the main idea.
Then, just as training students
requires books, lectures and examples,
training large language models
requires datasets.
And for our study
we considered three datasets,
one about personal opinions,
one about past autobiographical memories
and one about future intentions.
These datasets were already available
from previous studies
and contained both truthful
and deceptive statements.
Typically, you collect
these types of statements
by asking participants to tell the truth
or to lie about something.
For example, if I were a participant
in the truthful condition,
and the task was
"tell me about your past holidays,"
then I would tell the researcher
about my previous holidays in Vietnam,
and here we have a slide to prove it.
For the deceptive condition,
the researchers would randomly pick
some of you who have never been to Vietnam,
and ask you to make up a story
and convince someone else
that you've really been to Vietnam.
And this is how it typically works.
And as in all university courses,
as you might know,
after lectures you have exams.
And likewise, after training our AI models,
we want to test them.
And the procedure that we followed,
which is actually the standard one,
is the following.
So we randomly picked some statements
from each dataset
and set them aside.
So the model never saw these statements
during the training phase.
And only after the training was completed,
we used them as a test, as the final exam.
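In code, this hold-out procedure could look something like the minimal sketch below; the load_opinions_dataset loader and the 80/20 split are hypothetical stand-ins for the actual datasets and proportions.

```python
# A minimal sketch of the hold-out procedure described above.
# load_opinions_dataset() is a hypothetical loader; the real statements
# come from the datasets collected in previous studies.
from sklearn.model_selection import train_test_split

# Each item is a statement paired with a "truthful"/"deceptive" label.
statements, labels = load_opinions_dataset()  # hypothetical loader

# Randomly set aside statements the model never sees during training.
train_texts, test_texts, train_labels, test_labels = train_test_split(
    statements, labels, test_size=0.2, stratify=labels, random_state=42
)
# Fine-tune only on the training split; once training is complete,
# the held-out test split serves as the "final exam".
```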
But who was our student then?
In this case, it was
a large language model
developed by Google
and called FLAN-T5.
Flanny, for friends.
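For the technically curious, fine-tuning Flanny for this task could look roughly like the sketch below, using the Hugging Face transformers library. This is a minimal illustration, not our exact pipeline: the prompt wording, the "base" model size, the hyperparameters, and the train_texts and train_labels variables (from a held-out split like the one sketched earlier) are all placeholder assumptions.

```python
# A rough sketch (not our exact pipeline) of fine-tuning FLAN-T5 to
# generate the word "truthful" or "deceptive" for each statement.
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

def encode(statement, label):
    # Frame lie detection as text-to-text: a prompt goes in,
    # and the label word ("truthful" or "deceptive") comes out.
    example = tokenizer(
        "Is this statement truthful or deceptive? " + statement,
        truncation=True, max_length=512,
    )
    example["labels"] = tokenizer(label, truncation=True).input_ids
    return example

# train_texts / train_labels come from a held-out split
# like the one sketched earlier.
train_data = [encode(t, l) for t, l in zip(train_texts, train_labels)]

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="flan-t5-lie-detection",  # illustrative settings
        num_train_epochs=3,
        per_device_train_batch_size=8,
    ),
    train_dataset=train_data,
    # Pads inputs and labels per batch (label padding is ignored by the loss).
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```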
And now that we have all the pieces
of the process together,
we can actually dig deep into our study.
Our study consisted
of three main experiments.
For the first experiment,
we fine-tuned our model, our FLAN-T5,
on each dataset separately.
For the second experiment,
we fine-tuned our model
on two pairs of datasets together,
and we tested it
on the third remaining one,
and we used all three
possible combinations.
For the third and final experiment,
we fine-tuned the model
on a new, larger training set
that we obtained by combining
all three datasets together.
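Schematically, the three setups could be summarized as below; opinions_data, memories_data, intentions_data, finetune and evaluate are hypothetical stand-ins for the real data and pipeline.

```python
import random

# finetune() and evaluate() are hypothetical wrappers around the
# FLAN-T5 training and testing steps sketched above.
def split(data, test_frac=0.2):
    data = data[:]                      # copy, then shuffle
    random.shuffle(data)
    cut = int(len(data) * (1 - test_frac))
    return data[:cut], data[cut:]       # (train, held-out test)

datasets = {
    "opinions": opinions_data,          # personal opinions
    "memories": memories_data,          # past autobiographical memories
    "intentions": intentions_data,      # future intentions
}

# Experiment 1: fine-tune and test within each dataset separately.
for name, data in datasets.items():
    train, test = split(data)
    evaluate(finetune(train), test)

# Experiment 2: fine-tune on two datasets, test on the third,
# for all three combinations (cross-context generalization).
for held_out in datasets:
    train = sum((d for n, d in datasets.items() if n != held_out), [])
    evaluate(finetune(train), datasets[held_out])

# Experiment 3: fine-tune on a combined, larger training set drawn
# from all three datasets, test on the combined held-out statements.
train, test = split(sum(datasets.values(), []))
evaluate(finetune(train), test)
```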
The results were quite interesting
because what we found
was that in the first experiment,
FLAN-T5 achieved an accuracy
ranging between 70 and 80 percent.
However, in the second experiment,
FLAN-T5's accuracy dropped
to almost 50 percent.
And then, surprisingly,
in the third experiment,
FLAN-T5 rose back to almost 80 percent.
But what does this mean?
What can we learn from these results?
From experiments one and three,
we learn that language models
can effectively classify
statements as deceptive,
outperforming human benchmarks
and aligning with the
machine learning
and deep learning models
that previous studies trained
on the same datasets.
However, from the second experiment,
we see that language models struggle
to generalize this learning
across different contexts.
And this is apparently because
there is no single
universal rule of deception
that we can easily apply in every context,
but linguistic cues of deception
are context-dependent.
And from the third experiment,
we learned that language models
actually can generalize well
across different contexts,
as long as they have been
previously exposed to examples
during the training phase.
And I think this sounds like good news.
But while this means that language models
could be effectively applied
to real-life applications
in lie detection,
more replication is needed,
because a single study is never enough.
So no, from tomorrow we cannot all have
these AI systems on our smartphones
and start detecting other people's lies.
But as a scientist,
I have a vivid imagination
and I would like to dream big.
And I would like to bring you with me
on this futuristic journey for a while.
So please imagine with me
living in a world
where this lie detection technology
is well-integrated in our life,
making everything from national security
to social media a little bit safer.
And imagine having this AI system
that could actually spot fake opinions.
From tomorrow, we could tell
when a politician
is actually saying one thing
but truly believes something else.
(Laughter)
And what about the border security context,
where people are asked
about their intentions and reasons
for crossing borders
or boarding planes?
Well, with these systems,
we could actually spot
malicious intentions
before they are even acted on.
And what about the recruiting process?
(Laughter)
We heard about this already.
But actually, companies
could employ this AI
to distinguish those
who are really passionate about the role
from those who are just trying
to say the right things to get the job.
And finally, we have social media.
Scammers trying to deceive you
or to steal your identity.
All gone.
And someone may raise
the issue of fake news.
Well, language models
could automatically read the news,
flag stories as deceptive or fake,
and we could even provide users
with a credibility score
for the information they read.
It sounds like a brilliant future, right?
(Laughter)
Yes, but ...
all great progress comes with risks.
As much as I'm excited about this future,
I think we need to be careful.
If we are not cautious, in my view,
we could end up in a world
where people might just
blindly believe AI outputs.
And I'm afraid this means
that people will be more likely
to accuse others of lying
simply because an AI says so.
And I'm not the only one with this view,
because another study has already shown this.
In addition, if we rely entirely
on this lie detection technology
to decide whether someone is lying,
we risk losing another
key value in society.
We lose trust.
We won't need to trust people anymore,
because what we will do
is just ask an AI to double check for us.
But are we really willing
to blindly believe AI
and give up our critical thinking?
I think that's the future
we need to avoid.
My hope for the future
is more interpretability.
And I'm about to tell you what I mean.
It's similar to when we look at reviews online:
we can look at the total number
of stars a place has,
but we can also look in more detail
at the positive and negative reviews,
and try to understand
what the positive sides are,
but also what might have gone wrong,
to eventually form
our own personal idea
of whether that is the place where we want to go,
where we want to be.
Likewise, imagine a world
where AI doesn't just offer conclusions,
but also provides clear
and understandable explanations
behind its decisions.
And I envision a future
where this lie detection technology
wouldn't just provide us
with a simple judgment,
but also with clear explanations
for why it thinks someone else is lying.
And I would like a future where, yes,
this lie detection technology,
and AI technology in general,
is integrated into our lives,
but where, at the same time,
we are still able to think critically
and decide when we want
to trust an AI's judgment
and when we want to question it.
To conclude,
I think the future of using AI
for lie detection
is not just about
technological advancement,
but about enhancing our understanding
and fostering trust.
It's about developing tools
that don't replace human judgment
but empower it,
ensuring that we remain at the helm.
Let's not step into a future
of blind reliance on technology.
Let's commit to deep understanding
and ethical use,
and we'll pursue the truth.
(Applause)
Thank you.