Chapter 11: Artificial Intelligence
an isolated decision made by the organization.
Consider, for example, a hypothetical AI
algorithm used during the COVID-19 pandemic
to determine which patients should receive
treatment. Certain countries or regions would
prefer to allocate their scarce resources to
patients who are predicted to have the highest
chance of survival (accuracy prevails), whereas
others may prefer to apply a fair allocation of
resources rather than considering a patient’s age
and gender (fairness prevails). Such national
and regional differences require a handshake
between the company’s ethics framework and
that at the policy level. To accommodate
national and regional differences, algorithms
can be designed so that they can be adjusted
by the ethics committees at hospitals or at the
regional level to meet the level of accuracy ver-
sus fairness desired.
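Such an adjustable design can be reduced to a single tunable parameter exposed to the hospital or regional ethics committee. The sketch below is a hypothetical illustration only, not a recommended triage policy: the field names, the use of a lottery to model fair (prediction-blind) allocation, and the blending rule are all assumptions for the sake of the example.

```python
import random

def allocate(patients, capacity, fairness_weight, seed=0):
    """Select `capacity` patients for treatment.

    fairness_weight = 0.0 -> rank purely by predicted survival
                             (accuracy prevails)
    fairness_weight = 1.0 -> a pure lottery that ignores predictions
                             (fairness modeled as equal chance)
    Intermediate values blend the survival score with a random draw,
    letting an ethics committee pick a point on the tradeoff curve.
    """
    rng = random.Random(seed)  # seeded for reproducibility/auditability

    def score(p):
        return ((1.0 - fairness_weight) * p["survival_prob"]
                + fairness_weight * rng.random())

    ranked = sorted(patients, key=score, reverse=True)
    return ranked[:capacity]
```

A committee preferring accuracy would set `fairness_weight` near 0; one preferring prediction-blind allocation would set it near 1. Keeping the parameter outside the model, rather than retraining per region, is what makes the "handshake" between the company's framework and local policy practical.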
Ethics committees emerged in health-
care in the 1970s at the very heart of hospi-
tals. Initially, they focused on research ethics
for clinical investigations on human subjects.
Such committees exist throughout Europe
and are regulated by law. Today, many of
these committees have organized themselves
at the regional or national level and also
focus on clinical ethics. They have evolved
into democratic platforms of public debate
on medicine and human values. There is no
single model. Every country has created its
own organizational landscape, according
to its moral and ideological preferences,
adapted to the political structure of its
health system, and its method of financing
healthcare.54 This complex reality and the
lack of a unified approach make it challeng-
ing for companies to engage with ethics
committees and find a single working point
on the tradeoff curve between accuracy and
fairness that is acceptable worldwide.
Bias
From a scientific point of view, bias is the ten-
dency of a statistic to overestimate or underes-
timate a parameter.55 From a legal perspective,
bias is any prejudiced or partial personal or
social perception of a person or group.56 It is
beyond this chapter’s scope to discuss the differ-
ences between the scientific and legal definitions;
suffice it to say that we attribute a broader
meaning to the legal definition, mainly because
the scientific definition generally is understood
to refer to systematic estimation errors.57 In
contrast, the legal definition also can apply to
one-off errors in perception.
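The scientific definition can be made concrete with a quick numerical experiment: the classic example is estimating a population variance by dividing by n, which systematically underestimates the true parameter, whereas dividing by n − 1 (Bessel's correction) does not. The helper below simply averages each estimator over many small samples from a known population.

```python
import random
import statistics

def mean_of_estimates(estimator, n_samples=20000, sample_size=5, seed=1):
    """Average an estimator over many small samples drawn from a
    standard normal population (true variance = 1.0)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        total += estimator(sample)
    return total / n_samples

def biased_variance(xs):
    # Divides by n: systematically underestimates the true variance.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def unbiased_variance(xs):
    # statistics.variance divides by n - 1 (Bessel's correction).
    return statistics.variance(xs)
```

Averaged over many samples of size 5, the n-divisor estimator lands near 0.8 (a systematic shortfall of (n − 1)/n), while the corrected estimator lands near the true value of 1.0. It is this *systematic* tendency, not any single sample's error, that the scientific definition of bias captures.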
Aiming for software to be unbiased is desir-
able. Zero bias is, however, impossible to achieve.
Bias may enter the AI development chain at
different stages (see Figure 11-9). We humans
all have our blind spots. Therefore, any data set
that relies on humans making decisions will have
some form of bias. Every hospital is different,
every country is different, and every patient is
different. You can never achieve zero bias when
extrapolating. In the medical world, clinical
investigations of devices for adults have histori-
cally underrepresented women, minority racial or
ethnic groups, and to some extent, patients over
age 65.58 AI can maintain or even amplify such
bias through its decisions. In trying to optimize
its objective function, an AI might ignore a
minority that looks different from the general
population in order to perform better on the majority.
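This masking effect is easy to surface by comparing aggregate accuracy with per-group accuracy. The sketch below (group names, record format, and the 0.8 acceptability floor are all assumptions for illustration) flags any group whose accuracy falls below a chosen threshold, even when the aggregate number looks acceptable.

```python
from collections import defaultdict

def accuracy_by_group(records, min_acceptable=0.8):
    """records: iterable of (group, prediction, ground_truth) triples.

    Returns overall accuracy, accuracy per group, and the list of
    groups falling below the acceptability floor.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, pred, truth in records:
        totals[group] += 1
        hits[group] += int(pred == truth)
    per_group = {g: hits[g] / totals[g] for g in totals}
    overall = sum(hits.values()) / sum(totals.values())
    flagged = [g for g, acc in per_group.items() if acc < min_acceptable]
    return overall, per_group, flagged
```

If a majority group dominates the data, a model can score well overall while failing a minority badly; only the per-group breakdown makes that visible. The hard part, as discussed next, is knowing which groupings to check in the first place.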
To establish that bias is minimized, manufac-
turers need to assess the AI for bias and, where
that bias is considered harmful, ensure it has
been minimized. For example, they can compare the
software output against an independent reference
standard (e.g., a ground truth, biopsy, or expert
consensus). They can perform specific sanity
tests of accuracy on every group that can be
efficiently identified from the data.59 The challenge is for
computer scientists to get an accurate picture of
which group(s) could be potentially biased. The
difference might not show up in aggregate, but
only when focusing on a sub-population within
that group, where no test can exhaustively cover
the space of all permutations. A big challenge to
address bias is that sufficient and complete data
must be available, which is rarely possible under