# Common Knowledge

*First published Tue Aug 28, 2001; substantive revision Tue Dec 6, 2005*

A proposition *A* is *mutual knowledge* among a set of
agents if each agent knows that *A*. Mutual knowledge by itself
implies nothing about what, if any, knowledge anyone attributes to
anyone else. Suppose each student arrives for a class meeting knowing
that the instructor will be late. That the instructor will be late is
mutual knowledge, but each student might think only she knows the
instructor will be late. However, if one of the students says openly
"Peter told me he will be late again," then each student knows that
each student knows that the instructor will be late, each student knows
that each student knows that each student knows that the instructor
will be late, and so on, *ad infinitum.* The announcement made
the mutually known fact *common knowledge* among the students.

Common knowledge is a phenomenon which underwrites much of social life. In order to communicate or otherwise coordinate their behavior successfully, individuals typically require mutual or common understandings or background knowledge. Indeed, if a particular interaction results in "failure", the usual explanation for this is that the agents involved did not have the common knowledge that would have resulted in success. If a married couple are separated in a department store, they stand a good chance of finding one another because their common knowledge of each others' tastes and experiences leads them each to look for the other in a part of the store both know that both would tend to frequent. Since the spouses both love cappuccino, each expects the other to go to the coffee bar, and they find one another. But in a less happy case, if a pedestrian causes a minor traffic jam by crossing against a red light, she explains her mistake as the result of her not noticing, and therefore not knowing, the status of the traffic signal that all the motorists knew. The spouses coordinate successfully given their common knowledge, while the pedestrian and the motorists miscoordinate as the result of a breakdown in common knowledge.

Given the importance of common knowledge in social interactions, it
is remarkable that only quite recently have philosophers and social
scientists attempted to analyze the concept. David Hume (1740) was
perhaps the first to make explicit reference to the role of mutual
knowledge in coordination. In his account of convention in *A
Treatise of Human Nature*, Hume argued that a necessary condition
for coordinated activity was that agents all know what behavior to
expect from one another. Without the requisite mutual knowledge, Hume
maintained, mutually beneficial social conventions would disappear.
Much later, J. E. Littlewood (1953) presented some examples of
common-knowledge-type reasoning, and Thomas Schelling (1960) and John
Harsanyi (1967-1968) argued that something like common knowledge is
needed to explain certain inferences people make about each other. Yet
it was David Lewis (1969) who first gave an explicit analysis of common
knowledge in the monograph *Convention.* Stephen Schiffer
(1972), Robert Aumann (1976), and Gilbert Harman (1977) independently
gave alternate definitions of common knowledge. Jon Barwise (1988,
1989) gave a precise formulation of Harman's intuitive account.
Schiffer's analysis of common knowledge as a *hierarchy* of
epistemic claims has become standard in the philosophical and social
science literature. Lewis', Aumann's, and Barwise's accounts all imply
the hierarchical account. In some contexts, Schiffer's, Aumann's, and
Barwise's definitions of common knowledge are more convenient to use
than Lewis' original definition. More recently, Margaret Gilbert (1989)
proposed a somewhat different account of common knowledge which she
argues is preferable to the standard account. Others have developed
accounts of mutual knowledge, *approximate common knowledge*,
and *common belief* which require less stringent assumptions
than the standard account, and which serve as more plausible models of
what agents know in cases where strict common knowledge seems
impossible (Brandenburger and Dekel 1987, Stinchcombe 1988, Monderer
and Samet 1989, Rubinstein 1992). The analysis and applications of
common knowledge and related multi-agent knowledge concepts has become
a lively field of research.

The purpose of this essay is to overview of some of the most
important results stemming from this contemporary research. The topics
reviewed in each section of this essay are as follows: Section 1 gives
motivating examples which illustrate a variety of ways in which the
actions of agents depend crucially upon their having, or lacking,
certain common knowledge. Section 2 discusses alternative analyses of
common knowledge. Section 3 reviews applications of multi-agent
knowledge concepts, particularly to *game theory* (von Neumann
and Morgenstern 1944), in which common knowledge assumptions have been
found to have great importance in justifying *solution concepts*
for mathematical games. Section 4 discusses skeptical doubts about the
attainability of common knowledge. Finally, Section 5 discusses the
*common belief* concept which result from weakening the
assumptions of Lewis' account of common knowledge.

- 1. Motivating Examples
- 2. Alternative Accounts of Common Knowledge
- 3. Applications of Mutual and Common Knowledge
- 4. Is Common Knowledge Attainable?
- 5. Coordination and Common
*p*-Belief - Bibliography
- Other Internet Resources
- Related Entries

## 1. Motivating Examples

Most of the examples in this section are familiar in the common knowledge literature, although some of the details and interpretations presented here are new. Readers may want to ask themselves what, if any, distinctive aspects of mutual and common knowledge reasoning each example illustrates.### 1.1. The Clumsy Waiter^{[1]}

A waiter serving dinner slips, and spills gravy on a guest's white silk
evening gown. The guest glares at the waiter, and the waiter declares
"I'm sorry. It was my fault." Why did the waiter say that he was at
fault? He knew that he was at fault, and he knew from the guest's angry
expression that she knew he was at fault. However, the sorry waiter
wanted assurance that the guest *knew that he knew*he was at fault. By saying openly that he was at fault, the waiter knew that the guest knew what he wanted her to know, namely, that he knew he was at fault. Note that the waiter's declaration established at least three levels of nested knowledge.

Certain assumptions are implicit in the preceding story. In
particular, the waiter must know that the guest knows he has spoken the
truth, and that she can draw the desired conclusion from what he says
in this context. More fundamentally, the waiter must know that if he
announces "It was my fault" to the guest, she will interpret his
intended meaning correctly and will infer what his making this
announcement ordinarily implies in this context. This in turn implies
that the guest must know that if the waiter announces "It was my fault"
in this context, then the waiter indeed knows he is at fault. Then on
account of his announcement, the waiter knows that the guest knows that
he knows he was at fault. The waiter's announcement was meant to
generate *higher-order* levels of knowledge of a fact each
already knew.

Just a slight strengthening of the stated assumptions results in
even higher levels of nested knowledge. Suppose the waiter and the
guest each know that the other can infer what he infers from the
waiter's announcement. Can the guest now believe that the waiter does
not know that she knows that he knows he is at fault? If the guest
considers this question, she reasons that if the waiter falsely
believes it is possible that she does not know that he knows he is at
fault, then the waiter must believe it to be possible that she cannot
infer that he knows he is at fault from his own declaration. Since she
knows she *can* infer that the waiter knows he is at fault from
his declaration, she knows that the waiter knows she can infer this, as
well. Hence the waiter's announcement establishes the fourth-order
knowledge claim: The guest knows that the waiter knows that she knows
that he knows he is at fault. By similar, albeit lengthier, arguments,
the agents can verify that corresponding knowledge claims of even
higher-order must also obtain under these assumptions.

### 1.2 The Barbecue Problem

This is a variation of an example first published by Littlewood
(1953), although he notes that his version of the example was already
well-known at the
time.^{[2]}
*N* individuals enjoy a picnic supper
together which includes barbecued spareribs. At the end of the meal,
*k* ≥ 1 of these diners have barbecue sauce on their faces.
No one wants to continue the evening with a messy face. No one wants to
wipe her face if it's not messy, for this would make her appear
neurotic. And no one wants to take the risk of being thought rude by
telling anyone else that he has barbecue sauce on his face. Since no
one can see her own face, none of the messy diners makes a move to
clean her face. Then the cook who served the spareribs returns with a
carton of ice cream. Amused by what he sees, the cook rings the dinner
bell and makes the following announcement: "At least one of you has
barbecue sauce on her face. I will ring the dinner bell over and over,
until anyone who is messy has wiped her face. Then I will serve
dessert." For the first *k* − 1 rings, no one does
anything. Then, at the *k*^{th} ring, each of the messy
individuals suddenly reaches for a napkin, and soon afterwards, the
diners are all enjoying their ice cream.

How did the messy diners finally realize that their faces needed
cleaning? The *k* = 1 case is easy, since in this case, the lone
messy individual will realize he is messy immediately, since he sees
that everyone else is clean. Consider the *k* = 2 case next. At
the first ring, messy individual *i*_{1} knows that one
other person, *i*_{2}, is messy, but does not yet know
about himself. At the second ring, *i*_{1} realizes that
he must be messy, since had *i*_{2} been the only messy
one, *i*_{2} would have known this after the first ring
when the cook made his announcement, and would have cleaned her face
then. By a symmetric argument, messy diner *i*_{2} also
concludes that she is messy at the second ring, and both pick up a
napkin at that time.

Let's next consider *k* = 3. Again at the first ring, each of
the messy diners *i*_{1}, *i*_{2}, and
*i*_{3} knows the status of the other diners, but not
her own. The situation is apparently unchanged after the second ring.
But on the third ring, *i*_{1} realizes that she is
messy. For if *i*_{2} and *i*_{3} were
the only messy ones, then they would have discovered this after the
second ring by the argument of the previous paragraph. Since
*i*_{1} can see that all of the diners other than
*i*_{2} and *i*_{3} are clean, she
concludes that she must be messy. *i*_{2} and
*i*_{3} draw similar conclusions at the third ring, and
all clean their faces at that time.

The general case follows by induction. Suppose that if *k* =
*j*, then each of the *j* messy diners can determine that
he is messy after *j* rings. Then if *k* = *j* +
1, then at the *j* + 1^{st} ring, each of the *j*
+ 1 individuals will realize that he is messy. For if he were not
messy, then the other *j* messy ones would have all realized
their messiness at the *j*^{th} ring and cleaned
themselves then. Since no one cleaned herself after the
*j*^{th} ring, at the *j* + 1^{st} ring
each messy person will conclude that someone besides the other
*j* messy people must also be messy, namely, himself.

The "paradox" of this argument is that for *k* > 1, like
the case of the clumsy waiter of Example 1.1, the cook's announcement
told the diners something that each already knew. Yet apparently the
cook's announcement also gave the diners useful information. How could
this be? By announcing a fact already known to every diner, the cook
made this fact *common knowledge* among them, enabling each of
them to eventually deduce the condition of his own face after
sufficiently many rings of the bell. Note that the inductive argument
the agents run through depends upon the conclusions they each draw from
several *counterfactual conditionals.* In general, the
consequences of agents' common knowledge are intimately related to how
they evaluate subjunctive and counterfactual
conditionals.^{[3]}

### 1.3 The Farmer's Dilemma

Does meeting one's obligations to others serve one's self-interest? Plato and his successors recognized that in certain cases, the answer seems to be "No." Hobbes (1651, pp. 101-102) considers the challenge of a "Foole", who claims that it is irrational to honor an agreement made with another who has already fulfilled his part of the agreement. Noting that in this situation one has gained all the benefit of the other's compliance, the Foole contends that it would now be best for him to break the agreement, thereby saving himself the costs of compliance. Of course, if the Foole's analysis of the situation is correct, then would the other party to the agreement not anticipate the Foole's response to agreements honored, and act accordingly?

Hume (1740, pp. 520-521) takes up this question, using an example: Two neighboring farmers each expect a bumper crop of corn. Each will require his neighbor's help in harvesting his corn when it ripens, or else a substantial portion will rot in the field. Since their corn will ripen at different times, the two farmers can ensure full harvests for themselves by helping each other when their crops ripen, and both know this. Yet the farmers do not help each other. For the farmer whose corn ripens later reasons that if she were to help the other farmer, then when her corn ripens he would be in the position of Hobbes' Foole, having already benefited from her help. He would no longer have anything to gain from her, so he would not help her, sparing himself the hard labor of a second harvest. Since she cannot expect the other farmer to return her aid when the time comes, she will not help when his corn ripens first, and of course the other farmer does not help her when her corn ripens later.

The structure of Hume's *Farmers' Dilemma* problem can be
summarized using the following tree diagram:

This tree is an example of a

Figure 1.1a

*game in extensive form.*At each stage

*i*, the agent who moves can either choose

*C*

^{i}, which corresponds to helping or

*cooperating*, or

*D*

^{i}, which corresponds to not helping or

*defecting*. The relative preferences of the two agents over the various outcomes are reflected by the ordered pairs of

*payoffs*each receives at any particular outcome. If, for instance, Fiona chooses

*C*

^{i}and Alan chooses

*D*

^{i}, then Fiona's payoff is 0, her worst payoff, and Alan's is 4, his best payoff. In a game such as the Figure 1.1.a game, agents are

*(Bayesian) rational*if each chooses an act that maximizes her expected payoff, given what she knows.

In the Farmers' Dilemma game, following the
*C*^{1},*C*^{2}-path is strictly better
for both farmers than following the
*D*^{1},*D*^{2}-path. However, Fiona
chooses *D*^{1}, as the result of the following simple
argument: "If I were to choose *C*^{1}, then Alan, who
is rational and who knows the payoff structure of the game, would
choose *D*^{2}. I am also rational and know the payoff
structure of the game. So I should choose *D*^{1}."
Since Fiona knows that Alan is rational and knows the game's payoffs,
she concludes that she need only analyze the *reduced* game in
the following figure:

Figure 1.1b

In this reduced game, Fiona is certain to gain a strictly higher
payoff by choosing *D*^{1} than if she chooses
*C*^{1}, so *D*^{1} is her unique best
choice. Of course, when Fiona chooses *D*^{1}, Alan,
being rational, responds by choosing *D*^{2}. If Fiona
and Alan know: (i) that they are both rational, (ii) that they both
know the payoff structure of the game, and (iii) that they both know
(i) and (ii), then they both can predict what the other will do at
every node of the Figure 1.1.a game, and conclude that they can rule
out the *D*^{1},*C*^{2}-branch of the
Figure 1.1.b game and analyze just the reduced game of the following
figure:

Figure 1.1c

On account of this *mutual knowledge*, both know that Fiona
will choose *D*^{1}, and that Alan will respond with
*D*^{2}. Hence, the
*D*^{1},*D*^{2}-outcome results if the
Farmers' Dilemma game is played by agents having this mutual knowledge,
though it is suboptimal since both agents would fare better at the
*C*^{1},*C*^{2}-branch.^{[4]}
This
argument, which in its essentials is Hume's argument, is an example of
a standard technique for solving sequential games known as
*backwards
induction.*^{[5]}
The basic idea behind backwards induction is
that the agents engaged in a sequential game deduce how each will act
throughout the entire game by ruling out the acts that are not
payoff-maximizing for the agents who would move last, then ruling out
the acts that are not payoff-maximizing for the agents who would move
next-to-last, and so on. Clearly, backwards induction arguments rely
crucially upon what, if any, mutual knowledge the agents have regarding
their situation, and they typically require the agents to evaluate the
truth values of certain subjunctive conditionals, such as "If I (Fiona)
were to choose *C*^{1}, then Alan would choose
*D*^{2}".

### 1.4 The Centipede

The mutual knowledge assumptions required to construct a backwards
induction solution to a game become more complex as the number of
stages in the game increases. To see this, consider the sequential
*Centipede* game depicted in the following figure:

At each stage

Figure 1.2

*i*, the agent who moves can either choose

*R*

^{i}, which in the first three stages gives the other agent an opportunity to move, or

*L*

^{i}, which ends the game.

Like the Farmers' Dilemma, this game is a commitment problem for the
agents. If each agent could trust the other to choose
*R*^{i} at each stage, then they would each
expect to receive a payoff of 3. However, Alan chooses
*L*^{1}, leaving each with a payoff of only 1, as the
result of the following backwards induction argument: "If node
*n*_{4} were to be reached, then Fiona, (being rational)
would choose *L*^{4}. I, knowing this, would (being
rational) choose *L*^{3} if node *n*_{3}
were to be reached. Fiona, knowing *this*, would (being
rational) choose *L*^{2} if node *n*_{2}
were to be reached. Hence, I (being rational) should choose
*L*^{1}." To carry out this backwards induction
argument, Alan implicitly assumes that: (i) he knows that Fiona knows
he is rational, and (ii) he knows that Fiona knows that he knows she is
rational. Put another way, for Alan to carry out the backwards
induction argument, at node *n*_{1} he must know what
Fiona must know at node *n*_{2} to make
*L*^{2} her best response should *n*_{2}
be reached. While in the Farmer's Dilemma Fiona needed only
*first-order* knowledge of Alan's rationality and
*second-order* knowledge of Alan's knowledge of the game to
derive the backwards induction solution, in the Figure 1.2 game, for
Alan to be able to derive the backwards induction solution, the agents
must have *third-order mutual knowledge* of the game and
*second-order mutual knowledge* of rationality, and Alan must
have *fourth-order* knowledge of this mutual knowledge of the
game and *third-order* knowledge of their mutual knowledge of
rationality. This argument also involves several counterfactuals, since
to construct it the agents must be able to evaluate conditionals of the
form, "If node *n*_{i} were to be reached, Alan
(Fiona) would choose *L*^{i}
(*R*^{i})", which for *i* > 1 are
counterfactual, since third-order mutual knowledge of rationality
implies that nodes *n*_{2}, *n*_{3}, and
*n*_{4} are never reached.

The method of backwards induction can be applied to any sequential
game of *perfect information*, in which the agents can observe
each others' moves in turn and can recall the entire history of play.
However, as the number of potential stages of play increases, the
backwards induction argument evidently becomes harder to construct.
This raises certain questions: (1) What precisely are the mutual or
common knowledge assumptions that are required to justify the backwards
induction argument for a particular sequential game? (2) As a
sequential game increases in complexity, would we expect the mutual
knowledge that is required for backwards induction to start to
fail?

### 1.5 The Department Store

When a man loses his wife in a department store without any prior understanding on where to meet if they get separated, the chances are good that they will find each other. It is likely that each will think of some obvious place to meet, so obvious that each will be sure that it is "obvious" to both of them. One does not simply predict where the other will go, which is wherever the first predicts the second to predict the first to go, and soad infinitum.Not "What would I do if I were she?" but "What would I do if I were she wondering what she would do if she were wondering what I would do if I were she … ?"Thomas Schelling,The Strategy of Conflict

Schelling's department store problem is an example of a *pure
coordination problem*, that is, an interaction problem in which the
interests of the agents coincide perfectly. Schelling (1960) and Lewis
(1969), who were the first to make explicit the role common knowledge
plays in social coordination, were also among the first to argue that
coordination problems can be modeled using the analytic vocabulary of
game theory. A very simple example of such a coordination problem is
given in the next figure:

Figure 1.3

The matrix of Figure 1.3 is an example of a *game in strategic
form.* At each outcome of the game, which corresponds to a cell in
the matrix, the row (column) agent receives as payoff the first
(second) element of the ordered pair in the corresponding cell.
However, in strategic form games, each agent chooses without first
being able to observe the choices of any other agent, so that all must
choose as if they were choosing simultaneously. The Figure 1.3 game is
a game of *pure coordination* (Lewis 1969), that is, a game in
which at each outcome, each agent receives exactly the same payoff. One
interpretation of this game is that Schelling's spouses, Liz and
Robert, are searching for each other in the department store with four
floors, and they find each other if they go to the same floor. Four
outcomes at which the spouses coordinate correspond to the strategy
profiles
(*s*_{j}, *s*_{j}),
1 ≤ *j* ≤ 4, of the Figure 1.3 game. These four profiles
are strict *Nash equilibria* (Nash 1950, 1951) of the game, that
is, each agent has a decisive reason to follow her end of one of these
strategy profiles provided that the other also follows this
profile.^{[6]}

The difficulty the agents face is trying to select an equilibrium to
follow. For suppose that Robert hopes to coordinate with Liz on a
particular equilibrium of the game, say
(*s*_{2}, *s*_{2}). Robert reasons
as follows: "Since there are several strict equilibria we might follow,
I should follow my end of
(*s*_{2}, *s*_{2}) if, and only if,
I have sufficiently high expectations that Liz will follow her end of
(*s*_{2}, *s*_{2}). But I can only
have sufficiently high expectations that Liz will follow
(*s*_{2}, *s*_{2} ) if she has
sufficiently high expectations that I will follow
(*s*_{2}, *s*_{2}). For her to have
such expectations, Liz must have sufficiently high (second-order)
expectations that I have sufficiently high expectations that she will
follow (*s*_{2}, *s*_{2}), for if
Liz doesn't have these (second-order) expectations, then she will
believe I don't have sufficient reason to follow
(*s*_{2}, *s*_{2}) and may therefore
deviate from (*s*_{2}, *s*_{2}) herself.
So I need to have sufficiently high (third-order) expectations that Liz
has sufficiently high (second-order) expectations that I have
sufficiently high expectations that she will follow
(*s*_{2}, *s*_{2} ). But this
implies that Liz must have sufficiently high (fourth-order)
expectations that I have sufficiently high (third-order) expectations
that Liz has sufficiently high (second-order) expectations that I have
sufficiently high expectations that she will follow
(*s*_{2}, *s*_{2}), for if she
doesn't, then she will believe I don't have sufficient reason to follow
(*s*_{2}, *s*_{2}), and then she
won't, either. Which involves me in fifth-order expectations regarding
Liz, which involves her in sixth-order expectations regarding me, and
so on." What would suffice for Robert, and Liz, to have decisive reason
to follow (*s*_{2}, *s*_{2}) is that
they each *know* that the other *knows* that …
that the other will follow (*s*_{2},
*s*_{2}) for any number of levels of knowledge, which is
to say that between Liz and Robert it is common knowledge that they
will follow (*s*_{2}, *s*_{2}). If agents
follow a strict equilibrium in a pure coordination game as a
consequence of their having common knowledge of the game, their
rationality and their intentions to follow this equilibrium, and no
other, then the agents are said to be following a
*Lewis-convention* (Lewis 1969).

Lewis' theory of convention applies to a more general class of games than pure coordination games, but pure coordination games already model a variety of important social interactions. In particular, Lewis models conventions of language as equilibrium points of a pure coordination game. The role common knowledge plays in games of pure coordination sketched above of course raises further questions: (1) Can people ever attain the common knowledge which characterizes a Lewis-convention? (2) Would less stringent epistemic assumptions suffice to justify Nash equilibrium behavior in a coordination problem?

## 2. Alternative Accounts of Common Knowledge

- 2.1 The Hierarchical Account
- 2.2 Lewis' Account
- 2.3 Aumann's Account
- 2.4 Barwise's Account
- 2.5 Gilbert's Account

*A*is

*mutually known*among a set of agents if each agent knows that

*A*. Mutual knowledge by itself implies nothing about what, if any, knowledge anyone attributes to anyone else. Suppose each student arrives for a class meeting knowing that the instructor will be late. That the instructor will be late is mutual knowledge, but each student might think only she knows the instructor will be late. However, if one of the students says openly "Peter told me he will be late again," then the mutally known fact is now

*commonly known.*Each student now knows that the instructor will be late, and so on,

*ad infinitum.*The agents have common knowledge in the sense articulated informally by Schelling (1960), and more precisely by Lewis (1969) and Schiffer (1972). Schiffer uses the formal vocabulary of

*epistemic logic*(Hintikka 1962) to state his definition of common knowledge. Schiffer's general approach was to augment a system of sentential logic with a set of knowledge operators corresponding to a set of agents, and then to define common knowledge as a hierarchy of propositions in the augmented system. Bacharach (1992) and Bicchieri (1993) adopt this approach, and develop logical theories of common knowledge which include soundness and completeness theorems. One can also develop alternate formal accounts of common knowledge in set-theoretic terms, which is the approach taken in this article.

^{[7]}

### 2.1 The Hierarchical Account

Monderer and Samet (1988) and Binmore and Brandenburger (1989) give
a particularly elegant set-theoretic definition of common knowledge. I
will review this definition here, and then show that it is logically
equivalent to the ‘*i* knows that *j* knows that
… *k* knows that A’ hierarchy that Lewis (1969) and
Schiffer (1972) argue characterizes common
knowledge.^{[8]}

Some preliminary notions must be stated first. Following C. I. Lewis
(1943-1944) and Carnap (1947), propositions are formally subsets of a
set Ω of *state descriptions* or *possible worlds.*
One can think of the elements of Ω as representing Leibniz's
possible worlds or Wittgenstein's possible states of affairs. Some
results in the common knowledge literature presuppose that Ω is
of finite cardinality. If this admittedly unrealistic assumption is
needed in any context, this will be explicitly stated in this essay,
and otherwise one may assume that Ω may be either a finite or an
infinite set. A distinguished actual world ω_{α} is
an element of Ω. A proposition *A* ⊆ Ω obtains
(or is true) if the actual world ω_{α} ∈
*A*. In general, we say that *A* *obtains at* a
world ω ∈ Ω if ω ∈ *A*. What an
agent *i* knows about the possible worlds is stated formally in
terms of a *knowledge operator*
**K**_{i}. Given a proposition *A*
⊆ Ω, **K**_{i}(*A*)
denotes a new proposition, corresponding to the set of possible worlds
at which agent *i* knows that A obtains.
**K**_{i}(*A*) is read as
‘*i* knows (that) *A* (is the case)’. The
knowledge operator **K**_{i} satisfies
certain axioms, including:

K1:K_{i}(A) ⊆AK2: Ω ⊆

K_{i}(Ω)K3:

K_{i}(∩_{k}A_{k}) = ∩_{k}K_{i}(A_{k})K4:

K_{i}(A) ⊆K_{i}K_{i}(A)^{[9]}

In words, K1 says that if *i* knows *A*, then
*A* must be the case. K2 says that *i* knows that some
possible world in Ω occurs no matter which possible world ω
occurs. K3 says that *i* knows a conjunction if, and only if,
*i* knows each conjunct. K4 is a *reflection axiom*,
which says that if *i* knows *A*, then *i* knows
that she knows *A*. Note that by K3, if *A* ⊆
*B* then **K**_{i}(*A*)
⊆ **K**_{i}(*B*), by K1 and
K2, **K**_{i}(Ω) = Ω, and by
K1 and K4, **K**_{i}(*A*) =
**K**_{i}**K**_{i}(*
A*). Any system of knowledge satisfying K1 - K4 corresponds to the
modal system S4 (Kripke 1963). If one drops the K1 axiom and retains
the others, the resulting system would give a formal account of what an
agent *believes*, but does not necessarily *know.*

A useful notion in the formal analysis of knowledge is that of a
*possibility set.* An agent i's possibility set at a state of
the world Ω is the smallest set of possible worlds that
*i* thinks could be the case if ω is the actual world.
More precisely,

Definition 2.1

Agenti'spossibility set_{i}(ω) at ω ∈ Ω is defined asThe collection of sets_{i}(ω) ≡ ∩{E| ω ∈K_{i}(E) }is_{i}= ∪_{ω∈Ω}_{i}(ω)i'sprivate information system.

Since in words,
_{i}(ω) is the
intersection of all propositions which *i* knows at
ω,
_{i}(ω) is the smallest
proposition in Ω that *i* knows at ω. Put another
way,
_{i}(ω) is the most
specific information that *i* has about the possible world
ω. The intuition behind assigning agents private information
systems is that while an agent *i* may not be able to perceive
or comprehend every last detail of the world in which *i* lives,
*i* does know certain facts about that world. The elements of
*i*'s information system represent what *i* knows
immediately at a possible world. We also have the following:

Proposition 2.2

K_{i}(A) = { ω |_{i}(ω) ⊆A}

In many formal analyses of knowledge in the literature, possibility
sets are taken as primitive and Proposition 2.2 is given as the
definition of knowledge. If one adopts this viewpoint, then the axioms
K1 - K4 follow as consequences of the definition of knowledge. In many
applications, the agents' possibility sets are assumed to
*partition*^{[10]}
the set, in which case
_{i} is called i's *private
information partition.*

To illustrate the idea of possibility sets, let us return to the Barbecue Problem described in Example 1.2. Suppose there are three diners: Cathy, Jennifer and Mark. Then there are 8 relevant states of the world, summarized by Table 2.1:

Table 2.1

Each diner knows the condition of the other diners' faces, but not
her own. Suppose the cook makes no announcement, after all. Then none
of the diners knows the true state of the world whatever ω ∈
Ω the actual world turns out to be, but they do know *a
priori* that certain propositions are true at various states of the
world. For instance, Cathy's information system before any announcement
is made is depicted in Figure 2.1a:

Figure 2.1a

In this case, Cathy's information system is a partition
_{1} of Ω defined
by

_{1}= {H_{CC},H_{CM},H_{MC},H_{MM}}

where

H_{CC}= {ω_{1}, ω_{2}} (i.e., Jennifer and Mark are both clean)

H_{CM}= {ω_{4}, ω_{6}} (i.e., Jennifer is clean and Mark is messy)

H_{MC}= {ω_{3}, ω_{5}} (i.e., Jennifer is messy and Mark is clean)

H_{MM}= {ω_{7}, ω_{8}} (i.e., Jennifer and Mark are both messy)

Cathy knows immediately which cell
_{1}(ω) in her partition is the
case at any state of the world, but does not know which is the true
state at any ω ∈ Ω.

If we add in the assumption stated in Example 1.2 that if there is at least one messy diner, then the cook announces the fact, then Cathy's information partition is depicted by Figure 2.1b:

Figure 2.1b

In this case, Cathy's information system is a partition
_{1} of Ω defined
by

_{1}= {H_{CCC},H_{MCC},H_{CM},H_{MC},H_{MM}}

where

H_{CCC}= {ω _{1}}(i.e., Jennifer, Mark, and I are all clean) H_{MCC}= {ω _{2}}(i.e., Jennifer and Mark are clean and I am messy) H_{CM}= {ω _{4},ω_{6}}(i.e., Jennifer is clean and Mark is messy) H_{MC}= {ω _{3},ω_{5}}(i.e., Jennifer is messy and Mark is clean) H_{MM}= {ω _{7},ω_{8}}(i.e., Jennifer and Mark are both messy)

In this case, Cathy's information partition is a *refinement*
of the partition she has when there is no announcement, for in this
case, then Cathy knows *a priori* that if ω_{1} is
the case there will be no announcement and will know immediately that
she is clean, and Cathy knows *a priori* that if
ω_{2} is the case, then she will know immediately from
the cook's announcement that she is messy.

A slightly more complex case occurs if we alter the Barbecue problem so that the cook makes an announcement only if he sees at least two messy diners. Cathy's possibility set is now depicted by the diagram in Figure 2.1c:

Figure 2.1c

This time, Cathy's information system does not partition Ω.
For Cathy knows *a priori* that at ω_{5}, the cook
will make his announcement, and since at ω_{5} Jennifer
is messy and Mark is clean, Cathy will realize immediately that she is
messy. However, Cathy also knows *a priori* that at
ω_{3}, either ω_{3} or ω_{5}
could be the case, since at ω_{3} she does not know in
advance whether or not the cook will make an announcement. Hence
_{1}(ω_{5})
= {ω_{5}}, but
_{1}(ω_{3}) =
{ω_{3}, ω_{5}}. Similarly,
_{1}(ω_{6})
= {ω_{6}}, but
_{1}(ω_{4}) =
{ω_{4}, ω_{6}}. Jennifer's and Mark's
information systems given any of the above three scenarios are derived
similarly to Cathy's information system, and the details of this are
left as an exercise for the reader.

We can now define mutual and common knowledge as follows:

Definition 2.3

Let a set Ω of possible worlds together with a set of agentsNbe given.1. The proposition that

Ais(first levelorfirst order)mutual knowledge for the agents ofN,K^{1}_{N}(A), is the set defined byK^{1}_{N}(A) ≡ ∩_{i∈N}K_{i}(A).2. The proposition that

Aism^{th}level(orm^{th}order)mutual knowledge among the agents of N,K^{m}_{N}(A), is defined recursively as the setK^{m}_{N}(A) ≡ ∩_{i∈N}K_{i}(K^{m−1}_{ N}(A)).3. The proposition that

Aiscommon knowledgeamong the agents ofN,K^{*}_{N}(A), is defined as the set

K^{*}_{N}(A) ≡∞

∩

^{m=1}K^{m}_{N}(A).

As a consequence of Proposition 2.2, the agents' private information
systems determine an *a priori* structure of propositions over
the space of possible worlds regarding what they can know, including
what mutual and common knowledge they potentially have. The world
ω ∈ Ω which obtains determines *a posteriori*
what individual, mutual and common knowledge agents in fact have.
Hence, one can read ω ∈
**K**_{i}(*A*) as
‘*i* knows *A* at (possible world) ω’,
ω ∈
**K**^{m}_{N}(*A*)
as ‘*A* is *m*^{th} level mutual knowledge
for the agents of *N* at ω’, and so on. If ω
obtains, then one can conclude that *i* does know *A*,
that *A* is *m*^{th} level mutual knowledge, and
so on. Common knowledge of a proposition *E* implies common
knowledge of all that *E* implies, as is shown in the
following:

Proposition 2.4

If ω ∈K^{*}_{N}(E) andE⊆F, then ω ∈K^{*}_{N}(F).

Note that
(**K**^{m}_{N}(*E*))_{
m≥1} is a decreasing sequence of events, in the sense
that
**K**^{m+1}_{N}(*E*)
⊆
**K**^{m}_{N}(*E*),
for all *m* ≥ 1. It is also easy to check that if everyone
knows *E*, then *E* must be true, that is,
**K**^{1}_{N}(*E*) ⊆
*E*. If Ω is assumed to be finite, then if *E* is
common knowledge at ω, this implies that there must be a finite
*m* such that

K^{m}_{N}(E) =∞

∩

^{n=1}K^{n}_{N}(E).

The following result relates the set-theoretic definition of common
knowledge to the hierarchy of ‘*i* knows that *j*
knows that … knows *A*’ statements.

Proposition 2.5

ω ∈K^{m}_{N}(A) iff(1) For all agentsHence, ω ∈i_{1},i_{2}, … ,i_{m}∈N, ω ∈K_{i1}K_{i2}…K_{im}(A)K^{*}_{N}(A) iff (1) is the case for eachm≥ 1.

The condition that ω ∈
**K**_{i1}**K**_{i2}
…
**K**_{im}(*A*)
for all *m* ≥ 1 and all *i*_{1},
*i*_{2}, … , *i*_{m}
∈ *N* is Schiffer's definition of common knowledge, and is
often used as the definition of common knowledge in the literature.

### 2.2 Lewis' Account

Lewis is credited with the idea of characterizing common knowledge
as a hierarchy of ‘*i* knows that *j* knows that
… knows that *A*’ propositions. However, Lewis is
aware of the difficulties that such an infinitary definition raises. On
the one hand, there is the question of whether it is possible to reduce
the infinity inherent in the hierarchical account into a workable
finite definition. On the other hand, there is the issue that finite
agents cannot entertain the infinite amount of epistemic states which
is necessary for common knowledge to obtain. Lewis tackles both
problems, yet his presentation is informal and occasionally lacking in
detail. It is probably for this reason that Aumann is often credited
with presenting the first finitary method of generating the common
knowledge hierarchy (Aumann 1976). Recently, Cubitt and Sugden (2003)
have argued that Aumann's and Lewis' accounts of common knowledge are
radically different and irreconcilable.

First, it is important to see that, although Lewis introduced the
technical term ‘common knowledge,’ his analysis is about
belief, rather than knowledge. Indeed, Lewis offers his solution to the
second problem mentioned above by introducing a distinction between
*actual belief* and *reason to believe*. Reasons to
believe are interpreted as potential beliefs of agents, so that the
infinite hierarchy of epistemic states becomes harmless, consisting in
an infinity of potential belief states. The solution to the second
problem is given by providing a finite set of conditions that, if met,
generate the infinite series of reasons to believe. Such conditions
taken together represent Lewis' official definition of common
knowledge. Notice that it would be more appropriate to speak of
‘common reason to believe,’ or, at least, of ‘common
belief.’ Lewis himself later acknowledges that "[t]hat term
[common knowledge] was unfortunate, since there is no assurance that it
will be knowledge, or even that it will be true." Cf. (Lewis 1978, p.
44, n.13) The reader should be warned that, with Lewis and many others,
we will at times abuse terminology and speak of ‘common
knowledge’ *tout court*. Although conceptually persuasive,
Lewis distinction between reasons to believe and actual beliefs remains
informal, and it is debatable whether a mathematically precise account
of Lewis' characterization of common knowledge can be given in the
framework introduced by (Aumann 1976). Following (Vanderschraaf 1998)
we give the details of such an account here, and show that Lewis'
analysis does result in the common knowledge hierarchy following from a
finite set of axioms. It is however debated whether a possible worlds
approach can properly render the subtelties of Lewis' characterization.
Cubitt and Sugden (2003), for example, abandon the possible worlds
framework altogether and propose a different formal interpretation of
Lewis in which, among other elements, the distinction between reasons
to believe and actual belief is taken in account. An attempt to
reconcile the two positions can be found in (Sillari 2005), where
Lewis' characterization is formalized in a richer possible worlds
semantic framework where the distinction between reasons to believe and
actual believe is represented.

Lewis presents his account of common knowledge on pp. 52-57 of
*Convention*. Lewis does not specify what account of knowledge
is needed for common knowledge. As it turns out, Lewis' account is
satisfactory for any formal account of knowledge in which the knowledge
operators **K**_{i}, *i* ∈
*N*, satisfy K1, K2, and K3. A crucial assumption in Lewis'
analysis of common knowledge is that agents know they share the same
"rationality, inductive standards and background information" (Lewis
1969, p. 53) with respect to a state of affairs *A*′, that
is, if an agent can draw any conclusion from *A*′, she
knows that all can do likewise. This idea is made precise in the
following:

Definition 2.6

Given a set of agentsNand a propositionA′ ⊆ Ω, the agents ofNaresymmetric reasoners with respect toA′ (orA′-symmetric reasoners) iff, for eachi,j∈Nand for any propositionE⊆ Ω, ifK_{i}(A′) ⊆K_{i}(E) andK_{i}(A′) ⊆K_{i}K_{j}(A′), thenK_{i}(A′) ⊆K_{i}K_{j}(E).^{[11]}

The definiens says that for each agent *i*, if *i* can
infer from *A*′ that *E* is the case and that
everyone knows that *A*′ is the case, then *i* can
also infer that everyone knows that *E* is the case.

Definition 2.7

A propositionEisLewis-common knowledge atω ∈ Ω among the agents of a setN= {1, … ,n} iff there is a propositionA* such that ω ∈A*, the agents ofNareA*-symmetric reasoners, and for everyi∈N,L1: ω ∈K_{i}(A*)L2:

K_{i}(A*) ⊆K_{i}(∩_{j∈N}K_{j}(A*))L3:

K_{i}(A*) ⊆K_{i}(E)

A* is abasisfor the agents' common knowledge.L*_{N}(E) denotes the proposition defined by L1 - L3 for a setNofA*-symmetric reasoners, so we can say thatEis Lewis-common knowledge for the agents ofNiff ω ∈L*_{N}(E).

In words, L1 says that *i* knows *A** at ω. L2
says that if *i* knows that *A** obtains, then *i*
knows that everyone knows that *A** obtains. This axiom is meant
to capture the idea that common knowledge is based upon a proposition
*A** that is *publicly known*, as is the case when agents
hear a public announcement. If the agents' knowledge is represented by
partitions, then a typical basis for the agents' common knowledge would
be an element
(ω)
in
the
meet^{[12]}
of their partitions. L3 says that
*i* can infer from *A** that *E*.

A human agent obviously cannot work her way mentally through an infinite mutual knowledge hierarchy. Lewis argues that this is not a problem for his analysis of common knowledge, since the mutual knowledge claims of a common knowledge hierarchy for a chain of logical consequences, not a series of steps in anyone's actual reasoning. Lewis uses an example to show how his definition of common knowledge generates the first few levels of mutual knowledge. In fact, Lewis' definition implies the entire common knowledge hierarchy, as is shown in the following result.

Proposition 2.8

L*_{N}(E) ⊆K*_{N}(E), that is, Lewis-common knowledge ofEimplies common knowledge ofE.

As mentioned above, it has recently come into question whether a
formal rendition of Lewis' definition as the one given above adeguately
represents all facets of Lewis' approach. Cubitt and Sugden (2003)
argue that it does not, their critique hinging on two specific features
of Lewis' analysis that are lost in a possible world framework. First,
as we mentioned at the onset of this subsection, in a possible worlds
account, the distinction between reasons to believe and actual belief
is lost. To avoid that, Cubitt and Sugden concern themselves with
operators representing reasons to believe rather than knowledge. The
other feature of Lewis' analysis missing in the possible worlds
rendition of it is the 3-place relation of *indication* used by
Lewis. The definition of indication can be found at pp. 52-53 of
*Convention*:

Definition 2.9

A state of affairsAindicatesEto agenti(A ind) if and only if, if_{i}Eihad reason to believe thatAheld,iwould thereby have reason to believe thatE

The wording of Lewis' definition and the use he makes of the
indication relation in the definitory clauses for common knowledge,
suggest that Lewis is careful to distinguish indication and material
implication. Cubitt and Sugden (2003) incorporate such distinction in
their formal reconstruction. Paired with their interpretation of
"*i* has reason to believe *x*" as "*x* is yielded
by some logic of reasoning that i endorses," we have that, if *A
ind _{i} x*, then

*i*'s reason to believe

*A*provides

*i*with reason to believe

*x*as well. Given that Lewis does want to endow agents with deductive reasoning, (Cubitt and Sugden 2003) list the following axioms, claiming that they capture the desired properties of indication. (To simplify the exposition, we overlook the distinction between state of affairs and propositions, which is not essential here.) For all agents

*i, j*,

CS1: (R_{i}A∧Aind_{i}x) →R_{i}x.CS2: (

AentailsB) →Aind_{i}BCS3: (

Aind_{i}x∧Aind_{i}y) →Aind_{i}(x∧y)CS4: (

A ind∧_{i}BB ind) →_{i}xA ind_{i}xCS5: ((

A ind_{i}R) ∧_{j}BR(_{i}B ind)) →_{j}xA ind_{i}R_{j}x

The first axioms captures the intuition behind indication. It says
that if an agent has reason to believe that *A* holds, then, if
*A* indicates *x* to her, she has reason to believe
*x* as well. CS2 says that indication incorporates material
implication. CS3 says that if two propositions *x* and
*y* are indicated to an agent by a proposition *A*, then
*A* indicates to her also the conjunction of *x* and
*y*. The next axiom states that indication is transitive. CS5
says that if a proposition *A* indicates to *i* that
agent *j* has reason to believe *B*, and *i* has
reason to believe that *B* indicates *x* to *j*,
then *A* indicates to *i* also that *j* has reason
to believe *x*.

Armed with these axioms, it is possible to give the following definition.

Definition 2.10

In any given populationPa propositionAis areflexive common indicator that xif and only if, for alli, j∈Pand all propositionsx, y, the following four conditions hold:RCI1:

A→R_{i}ARCI2:

A ind_{i}R_{j}ARCI3:

A ind_{i}xRCI4:

A ind→_{j}(R_{i}A ind)_{j}y

Clauses RCI1-RCI3 above render L1-L3 of definition 2.7 above in the formal language that underlies axioms CS1-CS5; while RCI4 affirms (cf. definition 2.6 above) that agents are symmetric reasoners, i.e. that if a proposition indicates another proposition to a certain agent, then it does so to all agents in the population.

The following proposition shows that RCI1-RCI4 are sufficient conditions for ‘common reason to believe’ to arise:

Proposition 2.11 shows that a finite definition of ‘common reason to believe’ can be given. A straighforward extension to ‘common belief’ is easily obtained once we say that an agentProposition 2.11

IfAholds, and ifAis a common reflexive indicator in the populationPthatx, then there is common reason to believe inPthatx.

*i reasons faultlessy*if she believes everything she has reason to believe. The following proposition characterizes then common belief in the approach of Cubitt and Sugden:

Proposition 2.12

Consider a populationPand a propositionAsuch that (i)Ais a reflexive common indicator inPthatxand (ii)Ais a reflexive common indicator inPthat each member ofPreasons faultlessy. SupposeAholds and each agent inPreasons faultlessly. Then there is common (actual) belief inPthatx.

Is it possible to take formally in account the insights of Lewis'
definition of common knowledge without abandoning the possible world
framework? (Sillari 2005) puts forth an attempt to give a postive
answer to that question by articulating in a possible world semantics
the distinction between actual belief and reason to believe. As in
(Cubitt and Sugden 2003), the basic epistemic operator represents
reasons to believe. The idea is then to impose an *awareness
structure* over possible worlds. Simply put, an awareness structure
associates to each agent, for every possible world, a set of events of
which the agent is said to be aware. An agent entertains an actual
belief that a certain event occurs if and only if she has reason to
believe that the event occurs *and* such event is in her
awareness set at the world under consideration.

### 2.3 Aumann's Account

Aumann (1976) gives a different characterization of common knowledge
which gives another simple algorithm for determining what information
is commonly known. Aumann's original account assumes that the each
agent's possibility set forms a private information partition of the
space Ω of possible worlds. Aumann shows that a proposition C is
common knowledge if, and only if, C contains a cell of the meet of the
agents' partitions. One way to compute the meet
of the partitions
_{i}, *i* ∈
*N* is to use the idea of "reachability".

Definition 2.13

A state ω′ ∈ Ω isreachablefrom ω ∈ Ω iff there exists a sequence ω=ω_{0}, ω_{1}, ω_{2}, … , ω_{m}=ω′ such that for eachk∈ {0,1, … ,m−1}, there exists an agenti_{k}∈Nsuch that_{ik}(ω_{k}) =_{ik}(ω_{k+1}).

In words, ω′ is reachable from ω if there exists a sequence or "chain" of states from ω to ω′ such that two consecutive states are in the same cell of some agent's information partition. To illustrate the idea of reachability, let us return to the modified Barbecue Problem in which Cathy, Jennifer and Mark receive no announcement. Their information partitions are all depicted in Figure 2.1d:

Figure 2.1d

One can understand the importance of the notion of reachability in
the following way: If ω′ is reachable from ω, then if
ω obtains then some agent can reason that some other agent thinks
that ω′ is possible. Looking at Figure 2.1d, if ω =
ω_{1} occurs, then Cathy (who knows only that
{ω_{1}, ω_{2}} has occurred) knows that
Jennifer thinks that ω_{5} might have occurred (even
though Cathy knows that ω_{5} did not occur). So Cathy
cannot rule out the possibility that Jennifer thinks that Mark thinks
that that ω_{8} might have occurred. And Cathy cannot
rule out the possibility that Jennifer thinks that Mark thinks that
Cathy believes that ω_{7} is possible. In this sense,
ω_{7} is reachable from ω_{1}. The chain of
states which establishes this is ω_{1}, ω
_{2}, ω_{5}, ω_{8},
ω_{7}, since
_{1}(ω_{1}) =
_{1}(ω_{2}),
_{2}(ω_{2})
=
_{2}(ω_{5}),
_{3}(ω_{5})
=
_{3}(ω_{8}), and
_{1}(ω_{8})
=
_{1}(ω_{7}). Note that one
can show similarly that in this example any state is reachable from any
other state. This example also illustrates the following immediate
result:

Proposition 2.14

ω′ is reachable from ω iff there is a sequencei_{1},i_{2}, … ,i_{m}∈Nsuch that(1) ω′ ∈_{im}( … (_{i2}(_{i1}(ω))))

One can read (1) as: ‘At ω, *i*_{1}
thinks that *i*_{2} thinks that … ,
*i*_{m} thinks that ω′ is
possible.’

We now have:

andLemma 2.15

ω′ ∈ (ω) iff ω′ is reachable from ω.

andLemma 2.16

(ω) is common knowledge for the agents ofNat ω.

Proposition 2.17(Aumann 1976)

Let be the meet of the agents' partitions_{i}for eachi∈N. A propositionE⊆ Ω is common knowledge for the agents ofNat ω iff (ω) ⊆E. (In Aumann (1976),Eisdefinedto be common knowledge at ω iff (ω) ⊆E.)

If *E* =
**K**^{1}_{N}(*E*), then
*E* is a *public event* (Milgrom 1981) or a *common
truism* (Binmore and Brandenburger 1989). Clearly, a common truism
is common knowledge whenever it occurs, since in this case *E* =
**K**^{1}_{N}(*E*) =
**K**^{2}_{N}(*E*) =
… , so *E* =
**K**^{*}_{N}(*E*). The proof of
Proposition 2.17 shows that the common truisms are precisely the
elements of
and unions of
elements of
,
so any
commonly known event is the consequence of a common truism.

### 2.4 Barwise's Account

Barwise (1988) proposes another definition of common knowledge that
avoids explicit reference to the hierarchy of ‘*i* knows
that *j* knows that … knows that *A*’
propositions. Barwise's analysis builds upon an informal proposal by
Harman (1977). Consider the situation of the guest and clumsy waiter in
Example 1 when he announces that he was at fault. They are now in a
setting where they have heard the waiter's announcement and know that
they are in the setting. Harman adopts the circularity in this
characterization of the setting as fundamental, and propses a
definition of common knowledge in terms of this circularity. Barwise's
formal analysis gives a precise formulation of Harman's intuitive
analysis of common knowledge as a *fixed point.* Given a
function *f, A* is a fixed point of *f* if
*f(A)=A*. Now note that

So we have established that

K^{1}_{N}(E∩∞

∩

^{m=1}K^{m}_{N}(E) )=

K^{1}_{N}(E) ∩K^{1}_{N}(∞

∩

^{m=1}K^{m}_{N}(E) )=

K^{1}_{N}(E) ∩ (∞

∩

^{m=1}K^{1}_{N}(K^{m}_{N}(E) ) )=

K^{1}_{N}(E) ∩ (∞

∩

^{m=2}K^{m}_{N}(E) )=

∞

∩

^{m=1}K^{m}_{N}(E)

**K**

^{*}

_{N}(

*E*) is a fixed point of the function

*f*

_{E}defined by

*f*

_{E}(

*X*) =

**K**

^{1}

_{N}(

*E*∩

*X*).

*f*

_{E}has other fixed points. For instance, any contradiction

*B*∩

*B*= ø is a fixed point of

^{c}*f*

_{E}.

^{[13]}Note also that if

*A*⊆

*B*, then

*E*∩

*A*⊆

*E*∩

*B*and so

that is,f_{E}(A) =K^{1}_{N}(E∩A) ⊆K^{1}_{N}(E∩B) =f_{E}(B)

*f*

_{E}is

*monotone.*(We saw that

**K**

^{1}

_{N}is also monotone in the proof of Proposition 2.4.) Barwise's analysis of common knowledge can be developed using the following result from set theory:

This proposition establishes thatProposition

A monotone functionfhas a unique fixed pointCsuch that ifBis a fixed point off, thenB⊆C.Cis thegreatest fixed point of f.

*f*has a greatest fixed point, which characterizes common knowledge in Barwise's account. As Barwise himself observes, the fixed point analysis of common knowledge is closely related to Aumann's partition account. This is easy to see when one compares the fixed point analysis to the notion of common truisms that Aumann's account generates. Some authors regard the fixed point analysis as an alternate formulation of Aumann's analysis. Barwise's fixed point analysis of common knowledge is favored by those who are especially interested in the applications of common knowledge to problems in logic, while the hierarchical and the partition accounts are favored by those who wish to apply common knowledge in social philosophy and social science. When knowledge operators satisfy the axioms (K1)-(K4), the Barwise account of common knowledge is equivalent to the hierarchical account.

_{E}Proposition 2.18

LetC^{*}_{N}be the greatest fixed point off_{E}. ThenC^{*}_{N}(E) =K^{*}_{N}(E). ( In Barwise (1988, 1989),Eisdefinedto be common knowledge at ω iff ω ∈C^{*}_{N}(E).)

Barwise argues that in fact the fixed point analysis is more
flexible and consequently more general than the hierachical account.
This may surprise readers in light of Proposition 2.18, which shows
that Barwise's fixed point definition is *equivalent* to the
hierarchical account. Indeed, while Barwise (1988, 1989) proves a
result showing that the fixed point account implies the hierarchical
account and gives examples that satisfy the common knowledge hierarchy
but fail to be fixed points, a number of authors who have written after
Barwise have given various proofs of the equivalence of the two
definitions, as was shown in Proposition 2.18. In fact, there is not a
true controversy, at least with respect to the analytical results.
Barwise's fixed point account is indeed equivalent to the hierarchical
and the partition accounts given the account of knowledge characterized
by (K1)-(K4) that most practitioners accept. Barwise does not make
explicit which axioms of (K1)-(K4) he accepts, but he wishes to analyze
a weaker notion of knowledge that is not closed under logical
implication, and so he is committed to rejecting (K3). By doing so,
Barwise is able to prove the nonequivalence between the fixed point and
the hierarchical account he claims. But Barwise's result comes at a
price most analysts are not willing to pay. To formulate his results
given his very weak conception of knowledge, Barwise must use
*non-well-founded set theory* (Aczel 1988) in order to allow him
to make the necessary circular definitions. As we have seen in this
section, when one adopts the conventional analysis of knowledge that
satisfies (K1)-(K4), the equivalence of the hierarchical and the fixed
point accounts follows without the need to introduce non-well-founded
set-theoretic concepts.

### 2.5 Gilbert's Account

Gilbert (1989, Chapter 3) presents an alternative account of common knowledge, which is meant to be more intuitively plausible than Lewis' and Aumann's accounts. Gilbert gives a highly detailed description of the circumstances under which agents have common knowledge.

Definition 2.19

A set of agentsNare in acommon knowledge situation(A) with respect to a propositionAif, and only if, ω ∈Aand for eachi∈N,

G _{1}:iisepistemically normal, in the sense thatihas normal perceptual organs which are functioning normally and has normal reasoning capacity.^{[14]}G _{2}:ihas the concepts needed to fulfill the other conditions.G _{3}:iperceives the other agents ofN.G _{4}:iperceives that G_{1}and G_{2}are the case.G _{5}:iperceives that the state of affairs described byAis the case.G _{6}:iperceives that all the agents ofNperceive thatAis the case.

Gilbert's definition appears to contain some redundancy, since
presumably an agent would not perceive A unless A is the case. Gilbert
is evidently trying to give a more explicit account of single agent
knowledge than Lewis and Aumann give. For Gilbert, agent *i*
knows that a proposition *E* is the case if, and only if,
ω ∈ *E*, that is, *E* is true, and either
*i* perceives that the state of affairs *E* describes
obtains or *i* can infer *E* as a consequence of other
propositions *i* knows, given sufficient inferential
capacity.

Like Lewis, Gilbert recognizes that human agents do not in fact have
unlimited inferential capacity. To generate the infinite hierarchy of
mutual knowledge, Gilbert introduces the device of an agent's
*smooth-reasoner counterpart.* The smooth-reasoner counterpart
*i*′ of an agent *i* is an agent that draws every
logical conclusion from every fact that *i* knows. Gilbert
stipulates that *i*′ does not have any of the constrains
on time, memory, or reasoning ability that *i* might have, so
*i*′ can literally think through the infinitely many
levels of a common knowledge hierarchy.

Definition 2.20

If a set of agentsNare in a common knowledge situation_{N}(A) with respect toA, then the corresponding setN′ of their smooth-reasoner counterparts is in aparallel situation′_{N′}(A) if, and only if, for eachi′ ∈N,

G _{1}′:i′ can perceive anything that the counterpartican perceive.G _{2}′:G _{2}- G_{6}obtain fori′ with respect toAandN′, same as for the counterpartiwith respect toAandN.G _{3}′:i′ perceives that all the agents ofN′ are smooth-reasoners.

From this definition we get the following immediate consequence:

Proposition 2.21

If a set of smooth-reasoner counterparts to a setNof agents are in a situation ′_{N′}(A) parallel to a common knowledge situation_{N}(A) ofN, thenfor allConsequently,m∈ and for anyi_{1}′, … ,i_{m}′,K_{i1′}K_{i2′}…K_{im′}(A).K^{m}_{N′}(A) for anym∈ .

Gilbert argues that, given
′_{N′}(*A*),
the smooth-reasoner counterparts of the agents of *N* actually
satisfy a much stronger condition, namely mutual knowledge
**K**^{α}_{N′}(*A*)
to the level of any ordinal number α, finite or infinite. When
this stronger condition is satisfied, the proposition *A* is
said to be *open* to the agents of* *N*. With the concept
of open*-ness, Gilbert gives her definition of common knowledge.

Definition 2.22

A propositionE⊆ Ω isGilbert-common knowledgeamong the agents of a setN= {1, … ,n}, if and only if,

G _{1}*:Eis open* to the agents ofN.G _{2}*:For every i∈N,K_{i}(G_{1}*).G_{N}*(E) denotes the proposition defined by G_{1}* and G_{2}* for a setNofA*-symmetric reasoners, so we can say thatEis Lewis-common knowledge for the agents ofNiff ω ∈G_{N}*(E).

One might think that an immediate corollary to Gilbert's definition
is that Gilbert-common knowledge implies the hierarchical common
knowledge of Proposition 2.5. However, this claim follows only on the
assumption that an agent knows all of the propositions that her
smooth-reasoner counterpart reasons through. Gilbert does not
explicitly endorse this position, although she correctly observes that
Lewis and Aumann are committed to something like
it.^{[15]}
Gilbert
maintains that her account of common knowledge expresses our intuitions
with respect to common knowledge better than Lewis' and Aumann's
accounts, since the notion of open*-ness presumably makes explicit that
when a proposition is common knowledge, it is "out in the open", so to
speak.

## 3. Applications of Mutual and Common Knowledge

Readers primarily interested in philosophical applications of common knowledge may want to focus on the No Disagreement Theorem and Convention subsections. Readers interested in applications of common knowledge in game theory may continue with the Strategic Form Games, and Games of Perfect Information subsections.- 3.1 The "No Disagreement" Theorem
- 3.2 Convention
- 3.3 Strategic Form Games
- 3.4 Games of Perfect Information
- 3.5 Communication Networks

### 3.1 The "No Disagreement" Theorem

Aumann (1976) originally used his definition of common knowledge to prove a celebrated result that says that in a certain sense, agents cannot "agree to disagree" about their beliefs, formalized as probability distributions, if they start with common prior beliefs. Since agents in a community often hold different opinions and know they do so, one might attribute such differences to the agents' having different private information. Aumann's surprising result is that even if agents condition their beliefs on private information, mere common knowledge of their conditioned beliefs and a common prior probability distribution implies that their beliefs cannot be different, after all!

Proposition 3.1

Let Ω be a finite set of states of the world. Suppose thatThen

- Agents
iandjhave a common prior probability distribution μ(·) over the events of Ω such that μ(ω) > 0, for each ω ∈ Ω, and- It is common knowledge at ω that
i's posterior probability of eventEisq_{i}(E) and thatj's posterior probability ofEisq_{j}(E).q_{i}(E) =q_{j}(E).Proof.

[Note that in the proof of this proposition, and in the sequel, μ(·|B) denotes conditional probability; that is, given μ(B)>0, μ(A|B) = μ(A∩B)/μ(B).]

In a later article, Aumann (1987) argues that the assumptions that
Ω is finite and that μ(ω) > 0 for each
ω ∈ Ω reflect the idea that agents only regard as
"really" possible a finite collection of salient worlds to which they
assign positive probability, so that one can drop the states with
probability 0 from the description of the state space. Aumann also
notes that this result implicitly assumes that the agents have common
knowledge of their partitions, since a description of each possible
world includes a description of the agents' possibility sets. And of
course, this result depends crucially upon (i), which is known as the
*common prior assumption* (CPA).

Aumann's "no disagreement" theorem has been generalized in a number
of ways in the literature (McKelvey and Page 1986, Monderer and Samet
1989, Geanakoplos 1994). However, all of these "no disagreement"
results raise the same philosophical puzzle raised by Aumann's original
result: How are we to explain differences in belief? Aumann's result
leaves us with two options: (1) admit that at some level, common
knowledge of the agents' beliefs or how they form their beliefs fails,
or (2) deny the CPA. For instance, agents in the real world often do
not express their opinions probabilistically. If one agent
announces‘I believe that *E* is the case’ while
another announces‘I doubt that *E* is the case’,
then they might attribute their divergent opinions to a lack of common
knowledge of each other's true posteriors for *E*. Even if
agents do assign precise posterior probabilities to an event, Aumann
shows that if they have merely first-order mutual knowledge of the
posteriors, they can "agree to disagree". Suppose the following all
hold:

Ω = {ωThen if_{1}, ω_{2}, ω_{3}, ω_{4}},

_{1}= {{ω_{1},ω_{2}}, {ω_{3},ω_{4}}}

_{2}= {{ω_{1},ω_{2},ω_{3}}, {ω_{4}}}

μ(ω_{i}) = 1/4

*E*= {ω

_{1},ω

_{4}}, then at ω

_{1}, we have:

q_{1}(E) = μ(E| {ω_{1},ω_{2}}) = 1/2, and

q_{2}(E) = μ(E| {ω_{1},ω_{2},ω_{3}}) = 1/3

Moreover, at ω = ω_{1}, Agent 1 knows
that
_{2}(ω) =
{ω_{1},ω_{2},ω_{3}}, so she
knows that *q*_{2}(*E*) = 1/3. Agent 2 knows
at ω_{1} that either
_{1}(ω) =
{ω_{1},ω_{2}} or
_{1}(ω) =
{ω_{3},ω_{4}}, so either way he knows that
*q*_{1}(*E*) = 1/2. Hence the agents' posteriors
are mutually known, and yet they are unequal. The reason for this is
that the posteriors are not common knowledge. For Agent 2 does not
know what Agent 1 thinks *q*_{2}(*E*) is,
since if ω = ω_{3}, which is consistent with what
Agent 2 knows, then Agent 1 will believe that
*q*_{2}(*E*) = 1/3 with probability 1/2 (if
ω = ω_{3}) and *q*_{2}(*E*) =
1 with probability 1/2 (if ω = ω_{4}).

Aumann's result could fail if the agents' partitions are not common
knowledge. For suppose in the example just given, the agents do not
know each other's partitions. Then at ω = ω_{1}, if
their posteriors are common knowledge, then Agent 1, who knows
that ω ∈ {ω_{1},ω_{2}}, can
explain Agent 2's posterior as the result of Agent 2 having
observed either
{ω_{1},ω_{2},ω_{3}},
{ω_{1},ω_{2},ω_{4}},
{ω_{1},ω_{3},ω_{4}} or
{ω_{2},ω_{3},ω_{4}}. Still
another way Aumann's result might fail is if agents do not have common
knowledge that they update their beliefs by Bayesian
conditionalization. Then clearly, agents can explain divergent opinions
as the result of others having modified their beliefs in the "wrong"
way. However, there are cases in which none of these explanations will
seem convincing. For instance, odds makers sometimes publicly announce
different probabilities for an event, such as a particular winner of a
prize at a forthcoming Academy Awards presentation, and they will know
that none of them have *any* private information regarding the
event. In cases such as this, the agents have common knowledge that
they all have the same information structure and common knowledge of
their posteriors. And knowing that they are all competent odds makers,
they have common knowledge that they update by Bayesian
conditionalization. Still, the odds makers' beliefs violate the
conclusion of Aumann's result. More generally, denying the requisite
common knowledge seems a rather *ad hoc* move. For instance, to
deny that agents have common knowledge of information structures is
simply to deny that agents can all infer the same conclusions regarding
possible worlds as Aumann defines them. To deny that agents have common
knowledge that they update their beliefs by Bayesian conditionalization
is to assert that some believe that some might be updating their
beliefs *incoherently*, in the sense that their belief updating
leaves them open to a *Dutch book* (Skyrms 1984). As just noted,
these failures of agents' beliefs in each others' competence do not
fail in all cases. Why should one think that such failures of common
knowledge provide a general explanation for divergent beliefs?

What of the second option, that is, denying the
CPA?^{[16]}
The main
argument put forward in favor of the CPA is that any differences in
agents' probabilities should be the result of their having different
information only, that is, there is no reason to think that the
different beliefs that agents have regarding the same event are the
result of anything other than their having different information.
However, one can reply that this argument amounts simply to a
restatement of the Harsanyi
Doctrine.^{[17]}
And while defenders
of the Harsanyi Doctrine may be right in thinking that there is
apparently no compelling reason to think that agents' priors can be
different, neither is there compelling reason to think they must be the
same! In any event, while the controversy over the Harsanyi Doctrine
remains unresolved, we can conclude that the "no disagreement" results
have interesting implications for the viability of common knowledge and
the very nature of probability. Defenders of the CPA take an
*objectivist* view of probability, and by virtue of the "no
disagreement" results are evidently committed to the view that common
knowledge of agents beliefs and how they are formed is a rare
phenomenon in the world. Those who are prepared to deny the CPA allow
for a genuinely *subjectivist* conception of probability. They
take the view that common knowledge of agents' beliefs and how they
come by them can be a commonplace phenomenon, and that differences in
opinion can stem from differences in (subjective) prior
probabilities.

### 3.2 Convention

Schelling's Department Store problem of Example 1.5 is a very simple
example in which the agents "solve" their coordination problem
appropriately by establishing a *convention.* Using the
vocabulary of game theory, Lewis (1969) defines a convention as a
*strict coordination equilibrium* of a game which agents follow
on account of their common knowledge that they all prefer to follow
this coordination equilibrium. A coordination equilibrium of a game is
a strategy combination such that no agent is better off if any agent
unilaterally deviates from this combination. As with equilibria in
general, a coordination equilibrium is *strict* if any agent who
deviates unilaterally from the equilibrium is strictly worse off. The
strategic form game of Figure 1.3 summarizes Liz's and Robert's
situation. The Department Store game has four Nash equilibrium outcomes
in pure strategies: (*s*_{1}, *s*_{1}),
(*s*_{2}, *s*_{2}),
(*s*_{3}, *s*_{3}), and
(*s*_{4},
*s*_{4}).^{[18]}
These four equilibria
are all strict coordination equilibria. If the agents follow either of
these equilibria, then they coordinate successfully. For agents to be
following a Lewis-convention in this situation, they must follow one of
the game's coordination equilibria. However, for Lewis to follow a
coordination equilibrium is not a sufficient condition for agents to be
following a convention. For suppose that Liz and Robert fail to analyze
their predicament properly at all, but Liz chooses
*s*_{2} and Robert chooses *s*_{2}, so
that they coordinate at (*s*_{2},
*s*_{2}) by sheer luck. Lewis does not count accidental
coordination of this sort as a convention.

Suppose next that both agents are Bayesian rational, and that part
of what each agent knows is the payoff structure of the Intersection
game. If the agents expect each other to follow
(*s*_{2}, *s*_{2}) and they consequently
coordinate successfully, are they then following a convention? Not
necessarily, contends Lewis, in a subtle argument on p. 59 of
*Convention.* For while each knows the game and that she is
rational, she might not attribute like knowledge to the other agent. If
each agent believes that the other agent will follow her end of the
(*s*_{2}, *s*_{2}) equilibrium
mindlessly, then her best response is to follow her end of
(*s*_{2}, *s*_{2}). But in this case the
agents coordinated as the result of their each falsely believing that
the other acts like an automaton, and Lewis thinks that any proper
account of convention must require that agents have *correct*
beliefs about one another. In particular, Lewis requires that each
agent involved in a convention must have mutual expectations that each
is acting with the aim of coordinating with the other, that is, that
each knows that:

A_{1}: Both are rational,

A_{2}: Both know the payoff structure of the game, and

A_{3}: Both intend to follow (s_{2},s_{2}), and not some other strategy combination.

Suppose that the agents' beliefs are appropriately augmented so that
each agent knows that *A*_{1}, *A*_{2},
and *A*_{3} are the case. Again they coordinate on
(*s*_{2}, *s*_{2}). Are they following a
convention this time? Still not necessarily, says Lewis. For what if it
turns out that Liz thinks that Robert does not know that they are both
rational? Then Liz has a false belief about Robert. Beyond this, there
are two other points which Lewis does not himself raise in this
argument, but which clearly support his view. First, it would be
counterintuitive, at the very least, to suppose that any agent
following a convention believes that he has reasoning abilities that
the other agents lack. If Liz has determined that
*A*_{1}, *A*_{2}, and *A*
_{3} are the case, then if they are following a convention she
should expect that Robert has arrived at the same conclusion. Second,
what could explain Liz's knowledge of *A*_{3}? The most
natural explanation for Liz's expectation that Robert will follow his
end of (*s*_{2}, *s*_{2}) is that Liz
knows that Robert knows that *A*_{1},
*A*_{2}, and *A*_{3} are the case. So
convention evidently involves agents having at least
*second-order* mutual knowledge of *A*_{1},
*A*_{2}, and *A*_{3}, that is, Robert
(Liz) must know that Liz (Robert) knows that *A*_{1},
*A*_{2}, and *A*_{3} are the case. But
this raises the question: Can *third-order* mutual knowledge
that *A*_{1}, *A* _{2}, and
*A*_{3} obtain fail? No, argues Lewis. For if Robert
thought that Liz did not know that Robert knew that
*A*_{1}, *A*_{2}, and
*A*_{3} were the case, then Robert would have a false
belief about Liz. The additional supporting points also kick in again:
If Robert has second-order mutual knowledge that
*A*_{1}, *A*_{2}, and
*A*_{3} obtain, then he should conclude that Liz also
has this second-order mutual knowledge. To conclude otherwise would
require Robert to assume, counterintuitively, that he has analyzed
their deliberations in this situation in a way that Liz cannot. And how
did Robert get his second-order mutual knowledge of
*A*_{3}? The most obvious way to account for Robert's
second-order mutual knowledge would be to attribute to Robert the
knowledge that Liz has second-order mutual knowledge that
*A*_{1}, *A*_{2}, and
*A*_{3} are the case. So convention requires third-order
mutual knowledge that *A*_{1}, *A*_{2},
and *A* _{3} are the case. And the argument can be
continued for any higher level of mutual knowledge.

Lewis concludes that a necessary condition for agents to be following a convention is that their preferences to follow the corresponding coordination equilibrium be common knowledge. So on Lewis' account, a convention for a set of agents is a coordination equilibrium which the agents follow on account of their common knowledge of their rationality, the payoff structure of the relevant game and that each agent follows her part of the equilibrium.

A regularityRin the behavior of members of a populationPwhen they are agents in a recurrent situationSis aconventionif and only if it is true that, and it is common knowledge inPthat, in any instance ofSamong the members ofP,where

- everyone conforms to
R;- everyone expects everyone else to conform to
R;- everyone has approximately the same preferences regarding all possible combinations of actions;
- everyone prefers that everyone conform to
R, on condition that at least all but one conform to R;- everyone would prefer that everyone conform to
R′, on condition that at least all but one conform toR′,R′ is some possible regularity in the behavior of members ofPinS, such that no one in any instance ofSamong members ofPcould conform both toR′ and toR.

(Lewis 1969, p. 76)^{[19]}

Lewis includes the requirement that there be an alternate
coordination equilibrium *R*′ besides the equilibrium
*R* that all follow in order to capture the fundamental
intuition that how the agents who follow a convention behave depends
crucially upon how they expect the others to behave. In the Department
Store game, the (*s*_{2}, *s*_{2})
equilibrium is a Lewis-convention when Liz and Robert have common
knowledge of *A*_{1}, *A*_{2}, and
*A*_{3}. Had their expectations been different, so
either had believed that the other would not follow
(*s*_{2}, *s*_{2}), then the outcome
might have been very different.

Sugden (1986) and Vanderschraaf (1998) argue that it is not crucial
to the notion of convention that the corresponding equilibrium be a
coordination equilibrium. Lewis' key insight is that a convention is a
pattern of mutually beneficial behavior which depends on the agents'
common knowledge that all follow *this* pattern, and no other.
Vanderschraaf gives a more general definition of convention as a
*strict* equilibrium together with common knowledge that all
follow this equilibrium and that all would have followed a different
equilibrium had their beliefs about each other been different. An
example of this more general kind of convention is given below in the
discussion of the Figure 3.1 example.

Cubitt and Sugden (2003) provide an insightful study of Lewis'
account of convention. They notice that Lewis seems to be the first
author to analyze games played recurrently in a population under the
assumption that such games are not necessarily played by the
*same* agents. This differs from the traditional approach to
repeated games which considers games repeatedly played by the same
agents and treats them as a single, self-contained supergame. Lewis'
approach differs, however, also from the framework that game theorists
have subsequently used to analyze recurrent games in a population. The
methods adopted in such developments were inspired from the work of
theoretical biologists and focus, rather than on the role of reasoning,
on "blind or adaptive mechanisms of selection" (Cubitt and Sugden 2003,
p. 177). For example, Skyrms (1996) recasts Lewis' analysis of
convention in the framework of evolutionary game theory, dispensing
with assumptions that are central to Lewis' argument, among which that
of common knowledge. Building on their formal reconstruction of Lewis'
definition of common knowledge (see section 2.2 for details,) Cubitt
and Sugden claim that the regularity in the behavior of a population
*P* lying at the core of Lewis' definition of convention matches
the clauses of the definition if it does function as a basis for common
knolwedge of the fact that it will keep to be a regularity in the
future. Inductive reasoning plays a central role in Cubitt and Sugden's
account, letting them conclude that "Lewis' approach offers an
alternative both to the modelling strategy of classical game theory, in
which self-contained games are played by hyper-rational agents, and to
that of evolutionary game theory, in which players' behaviour is the
product of blind processes of selection." (cf. Cubitt and Sugden 2003,
p. 204).

### 3.3 Strategic Form Games

Lewis formulated the notion of common knowledge as part of his
general account of conventions. In the years following the publication
of *Convention*, game theorists have recognized that any
explanation of a particular pattern of play in a game depends crucially
on mutual and common knowledge assumptions. More specifically,
*solution concepts* in game theory are both motivated and
justified in large part by the mutual or common knowledge the agents in
the game have regarding their situation.

To establish the notation that will be used in the discussion that follows, the usual definitions of a game in strategic form, expected utility and agents' distributions over their opponents' strategies, are given here:

Definition 3.2

AgameΓ is an ordered triple (N,S,) consisting of the following elements:u

- A finite set
N= {1,2, … ,n}, called theset of agentsorplayers.- For each agent
k∈N, there is a finite setS_{k}= {s_{k1},s_{k2}, … ,s_{knk}}, called thealternative pure strategiesfor agentk. The Cartesian productS=S_{1}× … ×S_{n}is called thepure strategy setfor the game Γ.- A map
:uS→^{n}, called theutilityorpayoff functionon the pure strategy set. At each strategy combination= (ss_{1j1}, … ,s_{njn}) ∈S, agentk's particular payoff or utility is given by thek^{th}component of the value of, that is, agentuk's utilityu_{k}atis determined byswhereu_{k}() =sI_{k}((us_{1j1}, … ,s_{njn}))I_{k}() projectsx∈x^{n}onto itsk^{th}component.

The subscript ‘-*k*’ indicates the result of
removing the *k*^{th} component of an *n*-tuple
or an *n*-fold Cartesian product. For instance,

S_{-k}=S_{1}× … ×S_{k−1}×S_{k+1}× … ×S_{n}

denotes the pure strategy combinations that agent *k*'s
opponents may play.

Now let us formally introduce a system of the agents' beliefs into
this framework.
Δ_{k}(*S*_{-k}) denotes
the set of probability distributions over the measurable space
(*S*_{-k},
_{k}), where
_{k} denotes the Boolean algebra
generated by the strategy combinations
*S*_{-k}. Each agent *k* has a
probability distribution μ_{k} ∈
Δ_{k}(*S*_{-k}), and
this distribution determines the *(Savage) expected utilities*
for each of *k*'s possible acts:

E(u_{k}(s_{k j})) =∑

^{A-k ∈ S-k}u_{k}(s_{kj},s_{-k}) μ_{k}(s_{-k}),j= 1, 2, … ,n_{k}

If *i* is an opponent of *k*, then *i*'s
individual strategy *s*_{i j} may
be characterized as a union of strategy combinations
∪{
*s*_{-k} |
*s*_{ij} ∈
*s*_{-k} } ∈
_{k}, and so
*k*'s marginal probability for *i*'s strategy
*s*_{i j} may be calculated as
follows:

μ

μ _{k}(s_{ij}) =∑

^{{ s-k | sij ∈ s-k }}μ _{k}(s_{-k})

_{k}(· |

*A*) denotes

*k*'s conditional probability distribution given a set

*A*, and

*E*(· |

*A*) denotes

*k*'s conditional expectation given μ

_{k}(· |

*A*).

Suppose first that the agents have common knowledge of the full
payoff structure of the game they are engaged in and that they are all
rational, and that no other information is common knowledge. In other
words, each agent knows that her opponents are expected utility
maximizers, but does not in general know exactly which strategies they
will choose or what their probabilities for her acts are. These common
knowledge assumptions are the motivational basis for the solution
concept for noncooperative games known as *rationalizability*,
introduced independently by Bernheim (1984) and Pearce (1984). Roughly
speaking, a *rationalizable strategy* is any strategy an agent
may choose without violating common knowledge of Bayesian rationality.
Bernheim and Pearce argue that when only the structure of the game and
the agents' Bayesian rationality are common knowledge, the game should
be considered "solved" if every agent plays a rationalizable strategy.
For instance, in the "Chicken" game with payoff structure defined by
Figure 3.1,

if Joanna and Lizzi have common knowledge of all of the payoffs at every strategy combination, and they have common knowledge that both are Bayesian rational, then any of the four pure strategy profiles is rationalizable. For if their beliefs about each other are defined by the probabilities

Figure 3.1

αthen_{1}= μ_{1}(Joanna playss_{1}), and

α_{2}= μ_{2}(Lizzi playss_{1})

andE(u_{i}(s_{1})) = 3α_{i}+ 2(1 − α_{i}) = α_{i}+ 2

E(u_{i}(s_{2})) = 4α_{i}+ 0(1 − α_{i}) = 4α_{i},i= 1, 2

so each agent maximizes her expected utility by playing
*s*_{1} if α_{i} + 2
≥ 4α_{i} or
α_{i} ≤ 2/3 and maximizes her
expected utility by playing *s*_{2} if
α_{i} ≥ 2/3. If it so happens that
α_{i} > 2/3 for both agents, then
both conform with Bayesian rationality by playing their respective ends
of the strategy combination
(*s*_{2},*s*_{2}) *given their
beliefs*, even though each would want to defect from this strategy
combination were she to discover that the other is in fact going to
play *s*_{2}. Note that the game's pure strategy Nash
equilibria, (*s*_{1}, *s*_{2}) and
(*s*_{2}, *s*_{1}), are rationalizable,
since it is rational for Lizzi and Joanna to conform with either
equilibrium given appropriate distributions. In general, the set of a
game's rationalizable strategy combinations contains the set of the
game's pure strategy Nash equilibria, and this example shows that the
containment can be proper.

To show that rationalizability is a nontrivial notion, consider the 2-agent game with payoff structure defined by Figure 3.2a:

Figure 3.2a

In this game, *s*_{1} and *s*_{3}
strictly dominate *s*_{2} for Lizzi, so Lizzi cannot
play *s*_{2} on pain of violating Bayesian rationality.
Joanna knows this, so Joanna knows that the only pure strategy profiles
which are possible outcomes of the game will be among the six profiles
in which Lizzi does not choose *s*_{2}. In effect, the
3 × 3 game is reduced to the 2 × 3 game
defined by Figure 3.2b:

Figure 3.2b

In this reduced game, *s*_{2} is strictly dominated
for Joanna by *s*_{1}, and so Joanna will rule out
playing *s*_{2}. Lizzi knows this, and so she rules out
strategy combinations in which Joanna plays *s*_{2}. The
rationalizable strategy profiles are the four profiles that remain
after deleting all of the strategy combinations in which either Lizzi
or Joanna play *s*_{2}. In effect, common knowledge of
Bayesian rationality reduces the 3 × 3 game of Figure 3.2a to the
2 × 2 game defined by Figure 3.2c:

since Lizzi and Joanna both know that the only possible outcomes of the game are (

Figure 3.2c

*s*

_{1},

*s*

_{1}), (

*s*

_{1},

*s*

_{3}), (

*s*

_{3},

*s*

_{1}), and (

*s*

_{3},

*s*

_{3}).

Rationalizability can be defined formally in several ways. A variation of Bernheim's original (1984) definition is given here.

Definition 3.3

Given that each agentk∈Nhas a probability distribution μ_{k}∈ Δ_{k}(s_{-k}), the system of beliefsisμ= (μ_{1}, … , μ_{n}) ∈ Δ_{1}(S_{-1}) × … × Δ_{n}(S_{-n})Bayes concordantif and only if,and (3.i) is common knowledge. A pure strategy combination

(3.i) For i≠k, μ_{i}(s_{kj}) > 0 ⇒s_{kj}maximizesk's expected utility for some σ_{k}∈ Δ_{k}(s_{-k}),= (ss_{1j1}, … ,s_{njn}) ∈Sisrationalizableif and only if the agents have a Bayes concordant systemμof beliefs and, for each agentk∈N,

(3.ii) E(u_{k}(s_{kjk})) ≥E(u_{k}(s_{kik})), fori_{k}≠j_{k}.^{[20]}

The following result shows that the common knowledge restriction on the distributions in Definition 3.1 formalizes the assumption that the agents have common knowledge of Bayesian rationality.

Proposition 3.4

In a game Γ, common knowledge of Bayesian rationality is satisfied if, and only if, (3.i) is common knowledge.

When agents have common knowledge of the game and their Bayesian
rationality only, one can predict that they will follow a
rationalizable strategy profile. However, rationalizability becomes an
unstable solution concept if the agents come to know more about one
another. For instance, in the Chicken example above with
α_{i} > 2/3, *i* = 1, 2, if either
agent were to discover the other agent's beliefs about her, she would
have good reason not to follow the
(*s*_{2},*s*_{2}) profile and to revise
her own beliefs regarding the other agent. If, in the other hand, it so
happens that α_{1} = 1 and α_{2} = 0, so
that the agents maximize expected payoff by following the
(*s*_{2}, *s*_{1}) profile, then should
the agents discover their beliefs about each other, they will still
follow (*s*_{2}, *s*_{1}). Indeed, if
their beliefs are common knowledge, then one can predict with certainty
that they will follow (*s*_{2},*s*_{1}).
The Nash equilibrium (*s*_{2},*s*_{1}) is
characterized by the belief distributions defined by
α_{1} = 1 and α_{2} = 0.

The Nash equilibrium is a special case of *correlated equilibrium
concepts*, which are defined in terms of the belief distributions
of the agents in a game. In general, a correlated
equilibrium-in-beliefs is a system of agents' probability distributions
which remains stable given common knowledge of the game, rationality
and the *beliefs themselves*. We will review two alternative
correlated equilibrium concepts (Aumann 1974, 1987; Vanderschraaf
1995, 2001), and show how each generalizes the Nash equilibrium concept.

Definition 3.5

Given that each agentk∈Nhas a probability distribution μ_{k}∈ Δ_{k}(s_{-k}), the system of beliefs

μ* = (μ_{1}*, … , μ_{n}* ) ∈ Δ_{1}(s_{-1}) × … × Δ_{n}(s_{-n})is an

endogenous correlated equilibriumif, and only if,(3.iii) Fori≠k, μ_{i}*(s_{kj}) > 0 ⇒s_{kj}maximizesk's expected utility given μ_{k}*.If

μ* is an endogenous correlated equilibrium a pure strategy combination* = (ss_{1}*, … ,s_{n}* ) ∈ S is anendogenous correlated equilibrium strategy combination givenμ* if, and only if, for each agentk∈N,(3.iv)E(u_{k}(s_{k}*)) ≥E(u_{k}(s_{ki})) fors_{ki}≠s_{k}*.

Hence, the endogenous correlated equilibrium **μ***
restricts the set of strategies that the agents might follow, as do the
Bayes concordant beliefs of rationalizability. However, the endogenous
correlated equilibrium concept is a proper refinement of
rationalizability, because the latter does not presuppose that
condition (3.iii) holds with respect to the beliefs one's opponents
actually have. If exactly one pure strategy combination
** s*** satisfies (3.iv) given

**μ***, then

**μ*** is a

*strict equilibrium*, and in this case one can predict with certainty what the agents will do given common knowledge of the game, rationality and their beliefs. Note that Definition 3.5 says nothing about whether or not the agents regard their opponents' strategy combinations as probabilistically independent. Also, this definition does not require that the agents' probabilities are

*consistent*, in the sense that agents' probabilities for a mutual opponent's acts agree. A simple refinement of the endogenous correlated equilibrium concept characterizes the Nash equilibrium concept.

Definition 3.6

A system of agents' beliefsμ* is aNash equilibriumif, and only if,

- condition (3.iii) is satisfied,
- For each
k∈N, μ_{k}* satisfies probabilistic independance, and- For each
s_{kj}∈s_{k}, ifi,l≠kthen μ_{i}*(s_{kj}) = μ_{l}*(s_{kj}).

In other words, an endogenous correlated equilibrium is a Nash equilibrium-in-beliefs when each agent regards the moves of his opponents as probabilistically independent and the agents' probabilities are consistent. Note that in the 2-agent case, conditions (b) and (c) of the Definition 3.6 are always satisfied, so for 2-agent games the endogenous correlated equilibrium concept reduces to the Nash equilibrium concept. Conditions (b) and (c) are traditionally assumed in game theory, but Skyrms (1991) and Vanderschraaf (1995, 2001) argue that there may be good reasons to relax these assumptions in games with 3 or more agents.

Brandenburger and Dekel (1988) show that in 2-agent games, if the
beliefs of the agents are common knowledge, condition (3.iii)
characterizes a Nash equilibrium-in-beliefs. As they note, condition
(3.iii) characterizes a Nash equilibrium in beliefs for the
*n*-agent case if the probability distributions are consistent
and satisfy probabilistic independence. Proposition 3.7 extends
Brandenburger and Dekel's result to the endogenous correlated
equilibrium concept by relaxing the consistency and probabilistic
independence assumptions.

In addition, we have:Proposition 3.7

Assume that the probabilitiesare common knowledge. Then common knowledge of Bayesian rationality is satisfied if, and only if,μ= (μ_{1},…,μ_{n}) ∈ Δ_{1}(s_{-1}) × … × Δ_{n}(s_{-n})μis an endogenous correlated equilibrium.

Corollary 3.8(Brandenburger and Dekel, 1988)

Assume in a 2-agent game that the probabilitiesμ= (μ_{1},μ_{2}) ∈ Δ_{1}(s_{-1}) × Δ_{2}(s_{-2})are common knowledge. Then common knowledge of Bayesian rationality is satisfied if, and only if,

μis a Nash equilibrium.

Proof.

The endogenous correlated equilibrium concept reduces to the Nash equilibrium concept in the 2-agent case, so the corollary follows by Proposition 3.7.

If **μ*** is a strict equilibrium, then one can
predict which pure strategy profile the agents in a game will follow
given common knowledge of the game, rationality and
**μ***. But if **μ*** is such that
several distinct pure strategy profiles satisfy (3.iv) with respect to
**μ***, then one can no longer predict with certainty
what the agents will do. For instance, in the Chicken game of Figure
3.1, the belief distributions defined by α_{1} =
α_{2} = 2/3 together are a Nash equilibrium-in-beliefs.
Given common knowledge of this equilibrium, either pure strategy is a
best reply for each agent, in the sense that either pure strategy
maximizes expected utility. Indeed, if agents can also adopt randomized
or *mixed* strategies at which they follow one of several pure
strategies according to the outcome of a chance experiment, then any of
the infinitely mixed strategies an agent might adopt in Chicken is a
best reply given
**μ***.^{[21]}
So the endogenous
correlated equilibrium concept does not determine the exact outcome of
a game in all cases, even if one assumes probabilistic consistency and
independence so that the equilibrium is a Nash equilibrium.

Another correlated equilibrium concept formalized by Aumann (1974,
1987) does give a determinate prediction of what agents will do in a
game given appropriate common knowledge. To illustrate Aumann's
correlated equilibrium concept, let us consider the Figure 3.1 game
once more. If Joanna and Lizzi can tie their strategies to their
knowledge of the possible worlds in a certain way, they can follow a
system of correlated strategies which will yield a payoff vector they
both prefer to that of the mixed Nash equilibrium and which is itself
an equilibrium. One way they can achieve this is to have their friend
Ron play a variation of the familiar shell game by hiding a pea under
one of three walnut shells, numbered 1, 2 and 3. Joanna and Lizzi both
think that each of the three relevant possible worlds corresponding to
ω_{k} = {the pea lies under shell *k*} is
equally likely. Ron then gives Lizzi and Joanna each a private
recommendation, based upon the outcome of the game, which defines a
system of strategy combinations f as follows

() f(ω) =^{}

( s_{1},s_{1}) if ω_{k}= ω_{1}( s_{1},s_{2}) if ω_{k}= ω_{2}( s_{2},s_{1}) if ω_{k}= ω_{3}

*f* is a *correlated* strategy system because the
agents tie their strategies, by following their recommendations, to the
same set of states of the world Ω. *f* is also a strict
*Aumann correlated equilibrium*, for if each agent knows how Ron
makes his recommendations, but knows only the recommendation he gives
her, either would do strictly worse were she to deviate from her
recommendation.^{[22]}
Since there are several strict equilibria of
Chicken, *f* corresponds to a convention as defined in
Vanderschraaf (1998). The overall expected payoff vector of *f*
is (3,3), which lies outside the convex hull of the payoffs for the
game's Nash equilibria and which Pareto-dominates the expected payoff
vector (4/3, 4/3), of the mixed Nash equilibrium defined by
α_{1} = 2/3, *i* = 1,
2.^{[23]}
The correlated equilibrium
f is characterized by the probability distribution of the agents' play
over the strategy profiles, given in Figure 3.3:

Figure 3.3

Aumann (1987) proves a result relating his correlated equilibrium concept to common knowledge. To review this result, we must give the formal definition of Aumann correlated equilibrium.

Definition 3.9

Given a game Γ = (N,S,) together with a finite set of possible worlds Ω, the vector valued functionuf: Ω →Sis acorrelated n-tuple.Iff(ω) = (f_{1}(ω), … ,f_{n}(ω)) denotes the components offfor the agents ofN, then agentk'srecommended strategyat ω isf_{k}(ω).fis anAumann correlated equilibriumifffor eachE(u_{k}f) ≥E(u_{k}(f_{-k},g_{k})),k∈Nand for any functiong_{k}that is a function off_{i}.

The agents are at Aumann correlated equilibrium if at each possible
world ω ∈ Ω, no agent will want to deviate from his
recommended strategy, given that the others follow their recommended
strategies. Hence, Aumann correlated equilibrium uniquely specifies the
strategy of each agent, by explicitly introducing a space of possible
worlds to which agents can correlate their acts. The deviations
*g*_{i} are required to be functions of
*f*_{i}, that is, compositions of some other
function with *f*_{i}, because *i* is
informed of *f*_{i}(ω) only, and so can
only distinguish between the possible worlds of Ω that are
distinguished by *f*_{i}. As noted already, the
primary difference between Aumann's notion of correlated equilibrium
and the endogenous correlated equilibrium is that in Aumann's
correlated equilibrium, the agents correlate their strategies to some
event ω ∈ Ω that is external to the game. One way to
view this difference is that agents who correlate their strategies
exogenously can calculate their expected utilities conditional on their
own strategies.

In Aumann's model, a description of each possible world ω
includes descriptions of the following: the game Γ, the agent's
private information partitions, and the actions chosen by each agent at
ω, and each agent's prior probability distribution
μ_{k}(·) over
Ω. The basic idea is that conditional on ω, everyone knows
everything that can be the object of uncertainty on the part of any
agent, but in general, no agent necessarily knows which world ω
is the actual world. The agents can use their priors to calculate the
probabilities that the various act combinations
** s** ∈

*S*are played. If the agents' priors are such that for all

*i*,

*j*∈

*N*, μ

_{i}(ω) = 0 iff μ

_{j}(ω) = 0, then the agents' priors are

*mutually absolutely continuous*. If the agents' priors all agree, that is, μ

_{1}(ω) = … = μ

_{n}(ω) = μ(ω) for each ω ∈ Ω, then it is said that the

*common prior assumption*, or CPA, is satisfied. If agents are following an Aumann correlated equilibrium

*f*and the CPA is satisfied, then

*f*is an

*objective*Aumann correlated equilibrium. An Aumann correlated equilibrium is a Nash equilibrium if the CPA is satisfied and the agents' distributions satisfy probabilistic independence.

^{[24]}

Let *s*_{i}(ω) denote the strategy
chosen by agent *i* at possible world ω. Then
*s*: Ω → *S* defined by
*s*(ω) =
(*s*_{1}(ω),…,*s*_{n}(ω))
is a correlated *n*-tuple. Given that
_{i} is a partition of
Ω,^{[25]}
the function *s*_{i}: Ω →
*s*_{i} defined by *s* is
_{i}-*measurable* if for
each
_{ij} ∈
_{i},
*s*_{i}(ω′) is constant for each
ω′ ∈
_{ij}.
_{i}-measurability is a formal
way of saying that *i* knows what she will do at each possible
world, given her information.

Definition 3.10

AgentiisBayes rationalwith respect to ω ∈ Ω (alternatively, ω-Bayes rational) iffs_{i}is_{i}-measurable andfor anyE(u_{i}s|_{i})(ω) ≥E(u_{i}(v_{i},s_{-i}) |_{i})(ω)_{i}-measurable functionv_{i}: Ω →s_{i}.

Note that Aumann's definition of ω-Bayesian rationality
implies that μ_{i}(_{i}(ω)) > 0, so that
the conditional expectations are defined. Aumann's main result, given
next, implicitly assumes that μ_{i}(_{i}(ω))
> 0 for every agent *i* ∈ *N* and every possible
world ω ∈ Ω. This poses no technical difficulties if
the CPA is satisfied, or even if the priors are only mutually
absolutely continuous, since if this is the case then one can simply
drop any ω with zero prior from consideration.

Proposition 3.11(Aumann 1987)

If each agenti∈Nis ω-Bayes rational at each possible world ω ∈ Ω, then the agents are following an Aumann correlated equilibrium. If the CPA is satisfied, then the correlated equilibrium is objective.

Part of the uncertainty the agents might have about their situation
is whether or not all agents are rational. But if it is assumed that
all agents are ω-Bayesian rational at each ω ∈
Ω, then a description of this fact forms part of the description
of each possible ω and thus lies in the meet of the agents'
partitions. As noted already, descriptions of the agents' priors, their
partitions and the game also form part of the description of each
possible world, so propositions corresponding to these facts also lie
in the meet of the agents' partitions. So another way of stating
Aumann's main result is as follows: *Common knowledge of*
ω*-Bayesian rationality at each possible world implies that
the agents follow an Aumann correlated equilibrium.*

Propositions 3.7 and 3.11 are powerful results. They say that common
knowledge of rationality and of agents beliefs about each other,
quantified as their probability distributions over the strategy
profiles they might follow, implies that the agents' beliefs
characterize an equilibrium of the game. Then if the agents' beliefs
are unconditional, Proposition 3.7 says that the agents are rational to
follow a strategy profile consistent with the corresponding endogenous
correlated equilibrium. If their beliefs are conditional on their
private information partitions, then Proposition 3.11 says they are
rational to follow the strategies the corresponding Aumann correlated
equilibrium recommends. However, we must not overestimate the
importance of these results, for they say nothing about the
*origins* of the common knowledge of rationality and beliefs.
For instance, in the Chicken game of Figure 3.1, we considered an
example of a correlated equilibrium in which it was *assumed*
that Lizzi and Joanna had common knowledge of the system of recommended
strategies defined by
(). Given this
common knowledge, Joanna and Lizzi indeed have decisive reason to
follow the Aumann correlated equilibrium f. But where did this common
knowledge come from? How, in general, do agents come to have the common
knowledge which justifies their conforming to an equilibrium?
Philosophers and social scientists have made only limited progress in
addressing this question.

### 3.4 Games of Perfect Information

In extensive form games, the agents move in sequence. At each stage,
the agent who is to move must base her decisions upon what she knows
about the preceding moves. This part of the agent's knowledge is
characterized by an *information set*, which is the set of
alternative moves that an agent knows her predecessor might have
chosen. For instance, consider the extensive form game of Figure
3.4:

When Joanna moves she is at her information set

Figure 3.4

*I*

^{22}= {

*C*

^{1},

*D*

^{1}}, that is, she moves knowing that Lizzi might have chosen either

*C*

^{1}or

*D*

^{1}, so this game is an extensive form representation of the Chicken game of Figure 3.1.

In a game of perfect information, each information set consists of a
single node in the game tree, since by definition at each state the
agent who is to move knows exactly how her predecessors have moved. In
Example 1.4 it was noted that the method of backwards induction can be
applied to any game of perfect
information.^{[26]}
The backwards induction
solution is the unique Nash equilibrium of a game of perfect
information. The following result gives sufficient conditions to
justify backwards induction play in a game of perfect information:

Proposition 3.12(Bicchieri 1993)

In an extensive form game of perfect information, the agents follow the backwards induction solution if the following conditions are satisfied for each agentiat each information setI^{ik}:Proof.

iis rational,iknows this andiknows the game, and- At any information set
I^{jk + 1}that immediately followsI^{ik},iknows atI^{ik}whatjknows atI^{jk + 1}.

Proposition 3.12 says that far less than common knowledge of the
game and of rationality suffices for the backwards induction solution
to obtain in a game of perfect information. All that is needed is for
each agent at each of her information sets to be rational, to know the
game and to know what the next agent to move knows! For instance, in
the Figure 1.2 game, if *R*_{1} (*R*_{2})
stands for "Alan (Fiona) is rational" and
**K**_{i}(Γ ) stands for
"*i* knows the game Γ", then the backwards induction
solution is implied by the following:

- At
*I*^{24},*R*_{2}and**K**_{2}(Γ). - At
*I*^{13},*R*_{1},**K**_{1}(Γ),**K**_{1}(*R*_{2}), and**K**_{1}**K**_{2}(Γ). - At
*I*^{22},**K**_{2}(*R*_{1}),**K**_{2}**K**_{1}(*R*_{ 2}), and**K**_{2}**K**_{1}**K**_{ 2}(Γ). - At
*I*^{11},**K**_{1}**K**_{2}(*R*_{ 1}),**K**_{1}**K**_{2}**K**_{ 1}(*R*_{2}), and**K**_{1}**K**_{2}**K**_{ 1}**K**_{2}(Γ).^{[27]}

*classical argument*for the backwards induction solution. Many game theorists continue to accept the classical argument, but in recent years, the argument has come under strong challenge, led by the work of Reny (1987, 1992), Binmore (1987) and Bicchieri (1989, 1993). The basic idea underlying their criticisms of backwards induction can be illustrated with the Figure 1.2 game. According to the classical argument, if Alan and Fiona have common knowledge of rationality and the game, then each will predict that the other will follow her end of the backwards induction solution, to which his end of the backwards induction solution is his unique best response. However, what if Fiona reconsiders what to do if she finds herself at the information set

*I*

^{22}? If the information set

*I*

^{22}is reached, then Alan has of course not followed the backwards induction solution. If we assume that at

*I*

^{22}, Fiona knows only what is stated in (iii), then she can explain her being at

*I*

^{22}as a failure of either

**K**

_{1}

**K**

_{2}

**K**

_{ 1}(

*R*

_{2}) or

**K**

_{1}

**K**

_{2}

**K**

_{ 1}

**K**

_{2}(Γ) at

*I*

^{11}. In this case, Fiona's thinking that either ∼

**K**

_{1}

**K**

_{2}

**K**

_{1}(

*R*

_{2}) or ∼

**K**

_{1}

**K**

_{2}

**K**

_{1}

**K**

_{2}(Γ) at

*I*

^{11}is compatible with what Alan in fact does know at

*I*

^{11}, so Fiona should not necessarily be surprised to find herself at

*I*

^{22}, and given that what she knows there is characterized by (iii), following the backwards induction solution is her best strategy. But if rationality and the game are common knowledge, or even if Fiona and Alan both have just have mutual knowledge of the statements characterized by (iii) and (iv), then at

*I*

^{22}, Fiona knows that

**K**

_{1}

**K**

_{2}

**K**

_{ 1}(

*R*

_{2}) or

**K**

_{1}

**K**

_{2}

**K**

_{ 1}

**K**

_{2}(Γ) at

*I*

^{11}. Hence given this much mutual knowledge, Fiona no longer can explain why Alan has deviated from the backwards induction solution, since this deviation contradicts part of what is their mutual knowledge. So if she finds herself at

*I*

^{22}, Fiona does not necessarily have good reason to think that Alan will follow the backwards induction solution of the subgame beginning at

*I*

^{22}, and hence she might not have good reason to follow the backwards induction solution, either. Bicchieri (1993), who along with Binmore (1987) and Reny (1987, 1992) extends this argument to games of perfect information with arbitrary length, draws a startling conclusion: If agents have strictly too few or

*strictly too many*levels of mutual knowledge of rationality and the game relative to the number of potential moves, one cannot predict that they will follow the backwards induction solution. This would undermine the central role backwards induction has played in the analysis of extensive form games. For why should the number of levels of mutual knowledge the agents have depend upon the length of the game?

The classical argument for backwards induction implicitly assumes
that at each stage of the game, the agents discount the preceding moves
as strategically irrelevant. Defenders of the classical argument can
argue that this assumption makes sense, since by definition at any
agents' decision node, the previous moves that led to this node are now
fixed. Critics of the classical argument question this assumption,
contending that when reasoning about how to move at any of his
information sets, *including those not on the backwards induction
equilibrium path*, part of what an agent must consider is what
conditions might have led to his being at that information set. In
other words, agents should incorporate reasoning about the reasoning of
the previous movers, or *forward induction* reasoning, into
their deliberations over how to move at a given information set.
Binmore (1987) and Bicchieri (1993) contend that a backwards induction
solution to a game should be consistent with the solution a
corresponding forward induction argument recommends. As we have seen,
given common knowledge of the game and of rationality, forward
induction reasoning can lead the agents to an apparent contradiction:
The classical argument for backwards induction is predicated on what
agents predict they would do at nodes in the tree that are never
reached. They make these predictions based upon their common knowledge
of the game and of rationality. But forward induction reasoning seems
to imply that if any off-equilibrium node had been reached, common
knowledge of rationality and the game must have failed, so how could
the agents have predicted what would happen at these nodes?

This section has barely scratched the surface of this controversy over common knowledge and backwards induction. The key unresolved issue is of course explaining what happens at the off-equilibrium information sets. To date, there is not a generally accepted theory of what agents having certain mutual or common knowledge will do at off-equilibrium nodes. However, we can at least repeat one generally accepted conclusion: In a game of perfect information, mutual knowledge of rationality and the game which falls far short of common knowledge can suffice to explain why agents follow the game's Nash equilibrium, the backwards induction solution. On the other hand, unlike other examples we have considered in which agents have mutual and even common knowledge without having to reason through levels of knowledge, backwards induction arguments in games of perfect information require that at each information set, the agent who would move were the information set to be reached must reason her way through at least as many levels of knowledge as there are remaining potential moves in the game.

### 3.5 Communication Networks

Situations in which a member of a population *P* is willing
to engage in a certain course of action provided that a large enough
portion of *P* engages in some appropriate behavior are typical
problems of *collective action*. Consider the case of an agent
who is debating whether to join a revolt. Her decision to join or not
to join will depend on the number of other agents whom she expects to
join the revolt. If such a number is too low, she will prefer not to
revolt, while if the number is sufficiently large, she will prefer to
revolt. Michael Chwe proposes a model where such a situation is modeled
game-theoretically. Players' knowledge about other players' intentions
depends on a *social network* in which players are located. The
individual ‘thresholds’ for each player (the number of
other agents that are needed for that specific player to revolt) are
only known by the immediate neighbors in the network. Besides the
intrinsic value of the results obtained by Chwe's analysis regarding
the subject of collective action, his model also provides insights
about both the relation between social networks and common knowledge
and about the role of common knowledge in collective action. For
example, in some predicaments, first-order knowledge of other agents'
personal thresholds is not sufficient to motivate an agent to take
action, whereas higher-order knowledge or, in the limit, common
knowledge is.

We present Chwe's model following (Chwe 1999) and (Chwe 2000).
Suppose there is a group *P* of *n* people, and each
agent has two strategies: *r* (revolt, that is participating in
the collective action) and *s* (stay home and not participate).
Each agent has her own individual *threshold* θ ∈ (1,
2,..., *n*+1) and she prefers *r* over *s* if and
only if the total number of players who revolt is greater that or equal
to her threshold. An agent with threshold 1 always revolts; an agent
with threshold 2 revolts only if another agent does; an agent with
threshold *n* revolts only if all agents do; an agent with
threshold *n*+1 never revolts, etc. The agents are located in a
social network, represented by a binary relation → over
*P*. The intended meaning of *i → j* is that agent
*i* ‘talks’ to agent *j*, that is to say,
agent *i knows* the threshold of agent *j*. If we define
*B(i)* to be the set {*j ∈ P : j → i*}, we can
interpret *B(i)* as *i*'s ‘neighborhood’ and
say that, in general, *i* knows the thresholds of all agents in
her neighborood. A further assumption is that, for all *j,k ∈
B(i)*, *i* knows whetehr *j → k* or not, that
is, every agent knows whether her neighbors are communicating with each
other. The relation → is taken to be reflexive (one knows her own
threshold).

Players' knowledge is represented as usual in a possible worlds
framework. In possible worlds belonging to the same cell of a player's
partition, that players chooses the same action. Consider for example
the case in which there are two agents, both with thresholds either 1,
2 or 3. There are nine possible worlds represented by ordered pairs of
numbers, representing the first and second player's individual
thresholds respectively: 11, 12, 13,..., 32, 33. If the players do not
communicate, each knows her own threshold only. Player 1's information
partition reflects her ignorance about player's 2 threshold and it
consists of the sets {11, 12, 13}, {21, 22, 23} and {31, 32, 33};
whereas, similarly, player 2's partition consists of the sets {11, 21,
31}, {12, 22, 32}, {13, 23, 33}. If player 1's threshold is 1, she
revolts no matter what player 2's threshold is. Hence, player 1 revolts
in {11, 12, 13}. If player 1's threshold is 3, she never revolts.
Hence, she plays *s* in {31, 32, 33}. If her threshold is 2, she
revolts only if the other player revolts as well. Since in this example
we are assuming that there is no communication between the agents,
player 1 cannot be sure of player's 2 action, and chooses the non-risky
*s* in {21, 22, 23} as well. Similarly, player 2 playes
*r* in {11, 21, 31} and *s* otherwise. Consider now the
case in which 1 → 2 and 2 → 1. Both players have now the
finest information partitions. Thresholds of 1 and 3 yield *r*
and *s*, respectively, for both players again. However, in
player 1's cells {21} and {22}, she knows that player 2 will revolt,
and, having threshold 2, she revolts as well. Similarly for player 2 in
his cells {12} and {22}. Note, that the case in which both players have
threshold 2, yields both the equilibrium in which both players revolt
and the equilibrium in which each player stays home. It is assumed that
in the case of multiple equilibria, the one which results in the most
revolt will obtain.

Figure 3.5

The analysis of the example above applies to general networks with
*n* agents. Consider for example the three person network 1
→ 2, 2 → 1, 2 → 3, represented in figure 3.5a (notice
that symmetric links are represented by a line without arrowheads) and
assume that each player has threshold 2. The network between players 1
and 2 is the same as the one above, hence if they have threshold 2,
they both revolt regardless of the threshold of player 3. Player 3, on
the other hand, knows her own threshold and player 2's. Hence, if they
all have threshold 2, she cannot distinguish between the possibilities
in the set {122, 222, 322, 422}. At 422, in particular, neither player
1 nor player 2 revolt, hence player 3 cannot take the risk and does not
revolt, *even if* she has a neighbor who revolts. Since she does
not know whether 1 revolts or not, she cannot take the risk to revolt
herself. Adding the link 1 → 3 to the network (cf. figure 3.5b) we
provide player 3 with knowledge about player 1's action, hence in this
case, if they all have threshold 2, they all revolt. Notice that if we
break the link between players 1 and 2 (so that the network is 1 →
3 and 2 → 3), player 3 knows that 1 and 2 cannot communicate and
hence not revolt at 222, and she chooses *s* as well. Knowledge
of what other players know about other players is crucial.

Figure 3.6

The next example reveals that in some cases not even first-order
knowledge is sufficient to trigger action, and higher levels of
knowledge are necessary. Consider four players, each with threshold 3,
in the two different networks represented in figure 3.6
(‘square’, in figure 3.6a, and ‘kite’, in
figure 3.6b.) In the *square* network, player 1 knows that both
2 and 4 have threshold 3. However, she does not know about player 3's
threshold. If player 3 has threshold 5, then player 2 will never
revolt, since he does not know about player 4's threshold and it is
then possible for him that player 4 has threshold 5 as well. Player 1's
uncertainty about player 3 together with player 1's knowledge of player
2's uncertainty about player 4 force her not to revolt, although she
has threshold 3 and two neighbors with threshold 3 as well. Similar
reasoning applies to all other players, hence in the square no one
revolts. Consider now the *kite* network. Player 4 ignores
player 1's and player 2's thresholds, hence he does not revolt.
However, player 1 knows that players 2 and 3 have threshold 3, that
they know that they do, and that they know that player 1 knows that
they do. This is enough to trigger action *r* for the three of
them, and indeed if players 1, 2 and 3 all revolt in all states in
{3331, 3332, 3333, 3334, 3335}, this is an equilibrium since in all
states at least three people revolt each with threshold three.

The difference between the square and the kite networks is that,
although in the square enough agents are willing to revolt for a revolt
to actually take place, and they all individually know this, no agent
knows that others know it. In the kite, on the other hand, agents in
the triangle not only know that there are three agents with threshold
3, but they also know that they all know it, know that they all know
that they all know it, and so on. There is common knwoledge of such
fact among them. It is interesting to notice that in Chwe's model,
common knowledge obtains without there been a *publicly known*
fact (cf. section 2.2). The proposition "players 1, 2 and 3 all have
threshold 3" (semantically: the event {3331, 3332, 3333, 3334, 3335})
is known by players 1, 2 and 3 because of the network structure, and
becomes common knolwedge because the network structure is known by the
players. To be sure, the network structure is not just simply known,
but it is actually commonly known by the players. Player 1, for
example, does not only know that players 2 and 3 communicate with each
other. She also knows that players 2 and 3 know that she knows that
they communicate with each other, and so on.

In *complete* networks (networks in which all players
communicate with everyone else, as in the triangle in the kite network)
the information partitions of the players coincide, and they are the
finest partitions of the set of possible worlds. Hence, if players have
sufficiently low thresholds, such fact is commonly known and there is
an equilibrium in which all players revolt.

Definition 3.13

We say that → is asufficient networkif there is an equilibrium such that all players choose to revolt.

For a game in which all players have sufficiently low thresholds,
the complete network is clearly sufficient. Is the complete network
necessary to obtain an equilibrium in which all players revolt? It
turns out that it is not, and that a crucial role is played by
structures of the same kind as the ‘triangle’ group in the
kite network. In these structures (called *cliques*)
‘local’ common knowledge (that is, limited to the players
part of the structure) arises naturally. (Chwe 2000) proves an
interesting proposition that illustrates properties of networks that
are minimally sufficient for revolt to take place. Before stating
Chwe's result, we need two more definitions. The first defines a
*minimal* sufficient network, that is a network in which there
is sufficient but not superfluous communication for revolt. The second
defines a *clique*, that is a subset of the population in which
each member of the subset speaks with everyone else in the subset.

Definition 3.14

We say that → is aminimal sufficient networkif it is a suffcient network and, if →′ is a sufficient network, then →′ = →.

Definition 3.15

Acliqueof a network → is a setM⊆_{k}Psuch thati→jfor alli,j∈ M_{k}.

The following proposition is telling about the nature of minimal sufficient networks. For its proof, we refer the reader to (Chwe 2000).

Proposition 3.16(Chwe 2000)

Say that → is a minimal sufficient network. Then there exist cliquesM_{1},M_{2},...,M_{z}which coverPand a binary relation →* defined overM_{1},M_{2},...,M_{z}such that (1)i → jiff there exists anM_{k}to whichibelongs and aM_{l}to whichjbelongs andM_{k}→*M_{l}and (2) ifM_{iy−1}→*M_{iy}, then there exists a totally ordered setM_{i1},M_{i2}, … ,M_{iy−1},M_{iy}, whereM_{i1}is maximal.

The cliques (which need not be a partition of *P*) cover the
entire population, that is their union is equivalent to *P*. The
first part of the proposition shows that there is a relation among
cliques such that player *i* talks to player *j* if and
only if they belong to cliques *M _{k}* and

*M*respectively and all members of

_{l}*M*speak to all members of

_{k}*M*. In other words, if one clique speaks to another, then every member of one clique speaks to every member of the other clique. Moreover if two cliques do not speak to each other, there is no communication at all between the members of one and the members of the other. The second part of the proposition says that for every two cliques such that one is talking to the other, there exists a ‘chain’ of cliques with a starting element. In other words, every pair of cliques in the relation are part of a chain (of length at least 2) of cliques with a starting element (a

_{l}*leading*clique.) Revolt propagates in the network moving from ‘leading adopters’ to ‘followers’, according to the

*social role hierarchy*defined by the cliques and their relation. Consider the following example, in which cliques are represented by circles and numbers represent the thresholds of individual players:

Figure 3.7

Here the threshold 3 clique is the leading clique, igniting revolt in the threshold 5 follower clique. In turn, the clique of a single threshold 3 element follows. Notice that although she does not need to know that the leading clique actually revolts to be willing to revolt, that information is needed to ensure that the threshold 5 clique does revolt, and hence that it is safe for her to join the revolt. While in each clique information about thresholds and hence willingness to revolt is common knowledge, in a chain of cliques information is ‘linear’; each clique knows about the clique of which it is a follower, but does not know about earlier cliques.

Analyzing Chwe's models for collective action under the respect of weak versus strong links (cf. both Chwe 1999 and Chwe 2000) provides further insights about the interaction between communication networks and common knolwedge. A strong link, roughly speaking, joins close friends, whereas a weak link joins acquaintances. Strong links tend to increase more slowly than weak ones, since people have common close friends more often than they share acquaintances. In terms of spreading information and connecting society, then, weak links do a better job than strong links, since they traverse society more quickly and have therefore larger reach. What role do strong and weak links play in collective action? Adding a dynamic component to the static model described above shows that the answer to the question depends on the specific strategic interaction under consideration. A simple dynamics amounts to assume that, at time 1, one player's neighborhood consists of players one link away from her; at time 2 it consists of players two links away from her, and so on. In such a setup, strong links fare better when thresholds are low, whereas weak links are better when players' thresholds are higher. Intuitively, one sees that strong links tend to form small cliques right away (because of the symmetry instrinsic in them: my friends' friends tend to be my friends also); common knowledge arises quickly at the local level and, if thresholds are low, there is a better chance that a group tied by a strong link becomes a leading clique initiating revolt. If, on the other hand, thresholds are high, local common knowledge in small cliques remains fruitless, and weak links, reaching further more quickly, speed up communication and the building of the large cliques needed to sparkle collective action. Such considerations shed some light on the relation between social networks and common knowledge. While it is true that knowledge spreads faster in networks in which weak links predominate, higher-order knowledge (and, hence, common knowledge) tends to arise more slowly in this kind of networks. Networks with a larger number of strong links, on the other hand, facilitate the formation of common knowledge at the local level.

## 4. Is Common Knowledge Attainable?

Lewis formulated an account of common knowledge which generates the
hierarchy of‘*i* knows that *j* knows that …
*k* knows that *A*’ propositions in order to ensure
that in his account of convention, agents have correct beliefs about
each other. But since human agents obviously cannot reason their way
through such an infinite hierarchy, it is natural to wonder whether any
group of people can have full common knowledge of any proposition. More
broadly, the analyses of common knowledge reviewed in §3 would be
of little worth to social scientists and philosophers if this common
knowledge lies beyond the reach of human agents.

Fortunately for Lewis' program, there are strong arguments that
common knowledge is indeed attainable. Lewis (1969) and Schiffer (1972)
argue that the common knowledge hierarchy should be viewed as a chain
of implications, and not as steps in anyone's actual reasoning. They
give informal arguments that the common knowledge hierarchy is
generated from a finite set of axioms. We saw in §2 that it is
possible to formulate Lewis' axioms precisely and to derive the common
knowledge hierarchy from these axioms. Again, the basic idea behind
Lewis' argument is that for a set of agents, if a proposition A is
publicly known among them and each agent knows that everyone can draw
the same conclusions from A that she can, then A is common knowledge.
These conditions are obviously context dependent, just as an
individual's knowing or not knowing a proposition is context dependent.
Yet there are many cases where it is natural to assume that Lewis'
conditions are satisfied. If, for instance, a group of English speaking
persons in an automobile are listening to the radio, and the following
special news announcement, "The Pope has abdicated", is audibly
broadcast, then one may safely conclude that it is common knowledge for
this group that the Pope has abdicated. If one has skeptical doubts
about the agents' common knowledge in this situation, then one would
have to explain the failure of common knowledge as the result of some
circumstance that would be quite surprising in this context. Common
knowledge could fail if some of the people failed to hear the
announcement, or if some of them believed that some of the others could
not understand the announcement, but circumstances such as these would
be quite peculiar given the stated assumptions in this story. In this
context, skeptical doubt about common knowledge is certainly possible,
but such doubt relies upon *ad hoc* assumptions similar to those
that are needed to explain failure of *individual* knowledge,
not with the attainability of common knowledge in principle.

Aumann (1976) gives an alternate finitary procedure for generating
the common knowledge hierarchy in the special case in which the
relevant number of possible worlds in Ω is finite and each
agent's information system partitions Ω. To be sure, knowledge
does not always come so neatly packaged, but in many applications a
finite state space together with partitions is a good model of the
actual situation agents face. Aumann shows that a proposition
*A* is common knowledge for a set *N* of agents at
ω, if and only if,
(ω)
⊆ *A* where
(ω)
is the element in the
meet of the agents' private information partitions containing ω.
In words, anything implied by the agents' common information partition
is common knowledge. If the set Ω is finite, then the meet
of the agents' partitions
_{i}, *i*
∈ *N*, can be computed in finitely many steps. In a certain
sense, the issue of skepticism regarding common knowledge never arises
in Aumann's model. Common knowledge is built into Aumann's model, as a
result of the agents' having private knowledge which is defined by
*partitions* over the possible worlds. Put another way, common
knowledge could fail in Aumann's model only if at some ω ∈
Ω, some individual i's knowledge of
_{i}(ω) in i's private
information partition could fail, which reinforces the point made in
the previous paragraph. To reiterate, if one accepts Lewis' and
Aumann's analysis of common knowledge, then common knowledge is in
principle no more problematic than individual knowledge.

Nevertheless, care must be taken in ascribing common knowledge to a
group of human agents. Common knowledge is a phenomenon highly
sensitive to the agents' circumstances. The following section gives an
example that shows that in order for *A* to be a common truism
for a set of agents, they ordinarily must perceive an event which
implies *A* *simultaneously* and *publicly.*

## 5. Coordination and Common *p*-Belief

In certain contexts, agents might not be able to achieve common
knowledge. Might they achieve something "close"? One weakening of
common knowledge is of course *m*^{th} level mutual
knowledge. For a high value of *m*,
**K**^{m}_{N}(*A*)
might seem a good approximation of
**K**^{*}_{N}(*A*).
However, the following example, due to Rubinstein (1989, 1992), shows
that simply truncating the common knowledge hierarchy at any finite
level can lead agents to behave as if they had no mutual knowledge at
all.^{[28]}

### 5.1 The E-mail Coordination Example

Lizzi and Joanna are faced with the coordination problem summarized in the following figure:In Figure 5.1, the payoffs are dependent upon a pair of possible worlds. World ω

Figure 5.1

_{1}occurs with probability μ(ω

_{1}) = .51, while ω

_{2}occurs with probability μ(ω

_{2}) = .49. Hence, they coordinate with complete success by both choosing

*A*(

*B*) only if the state of the world is ω

_{1}(ω

_{2}).

Suppose that Lizzi can observe the state of the world, but Joanna
cannot. We can interpret this game as follows: Joanna and Lizzi would
like to have a dinner together prepared by Aldo, their favorite chef.
Aldo alternates between *A* and *B*, the two branches of
Sorriso, their favorite restaurant. State ω_{i}
is Aldo's location that day. At state ω_{1}
(ω_{2}), Aldo is at *A* (*B*). Lizzi, who
is on Sorriso's special mailing list, receives notice of
ω_{i}. Lizzi's and Joanna's best outcome occurs
when they meet where Aldo is working, so they can have their planned
dinner. If they meet but miss Aldo, they are disappointed and do not
have dinner after all. If either goes to *A* and finds herself
alone, then she is again disappointed and does not have dinner. But
what each really wants to avoid is going to *B* if the other
goes to *A*. If either of them arrives at *B* alone, she
not only misses dinner but must pay the exorbitant parking fee of the
hotel which houses *B*, since the headwaiter of *B*
refuses to validate the parking ticket of anyone who asks for a table
for two and then sits alone. This is what Harsanyi (1967) terms a game
of *incomplete information*, since the game's payoffs depend
upon states which not all the agents know.

*A* is a "play-it-safe" strategy for both Joanna and
Lizzi.^{[29]}
By choosing *A* whatever the state
of the world happens to be, the agents run the risk that they will fail
to get the positive payoff of meeting where Aldo is, but each is also
sure to avoid the really bad consequence of choosing *B* if the
other chooses *A*. And since only Lizzi knows the state of the
world, neither can use information regarding the state of the world to
improve their prospects for coordination. For Joanna has no such
information, and since Lizzi knows this, she knows that Joanna has to
choose accordingly, so Lizzi must choose her best response to the move
she anticipates Joanna to make regardless of the state of the world
Lizzi observes. Apparently Lizzi and Joanna cannot achieve expected
payoffs greater than 1.02 for each, their expected payoffs if they
choose (*A*, *A*) at either state of the world.

If the state ω were common knowledge, then the conditional
strategy profile (*A*, *A*) if ω =
ω_{1} and (*B*, *B*), if ω =
ω_{2} would be a strict Nash equilibrium at which each
would achieve a payoff of 2. So the obvious remedy to their predicament
would be for Lizzi to tell Joanna Aldo's location in a face-to-face or
telephone conversation and for them to agree to go where Aldo is, which
would make the state ω and their intentions to coordinate on the
best outcome given ω common knowledge between them. Suppose for
some reason they cannot talk to each other, but they prearrange that
Lizzi will send Joanna an e-mail message if, and only if,
ω_{2} occurs. Suppose further that Joanna's and Lizzi's
e-mail systems are set up to send a reply message automatically to the
sender of any message received and viewed, and that due to technical
problems there is a small probability, ε > 0, that any
message can fail to arrive at its destination. Then if Lizzi sends
Joanna a message, and receives an automatic confirmation, then Lizzi
knows that Joanna knows that ω_{2} has occurred. If
Joanna receives an automatic confirmation of Lizzi's automatic
confirmation, then Joanna knows that Lizzi knows that Joanna knows that
ω_{2} occurred, and so on. That ω_{2} has
occurred would become common knowledge if each agent received
infinitely many automatic confirmations, assuming that all the
confirmations could be sent and received in a finite amount of
time.^{[30]}
However, because of the probability
ε of transmission failure at every stage of communication, the
sequence of confirmations stops after finitely many stages with
probability one. With probability one, therefore, the agents fail to
achieve full common knowledge. But they do at least achieve something
"close" to common knowledge. Does this imply that they have good
prospects of settling upon (*B*, *B*)?

Rubinstein shows by induction that if the number of automatically
exchanged confirmation messages is finite, then *A* is the only
choice that maximizes expected utility for each agent, given what she
knows about what they both know.

Rubinstein's Proof

So even if agents have "almost" common knowledge, in the sense that
the number of levels of knowledge in "Joanna knows that Lizzi knows
that … that Joanna knows that ω_{2} occurred" is
very large, their behavior is quite different from their behavior given
common knowledge that ω_{2} has occurred. Indeed, as
Rubinstein points out, given merely "almost" common knowledge, the
agents choose as if no communication had occurred at all! Rubinstein
also notes that this result violates our intuitions about what we would
expect the agents to do in this case. (See Rubinstein 1992, p. 324.) If
*T*_{i} = 17, wouldn't we expect agent
*i* to choose *B*? Indeed, in many actual situations we
might think it plausible that the agents would each expect the other to
choose *B* even if *T*_{1} =
*T*_{2} = 2, which is all that is needed for Lizzi to
know that Joanna has received her original message and for Joanna to
know that Lizzi knows this!

### 5.2 Common *p*-Belief

The example in Section 5.1 hints that mutual knowledge is not the only
weakening of common knowledge that is relevant to coordination.
Brandenburger and Dekel (1987), Stinchcombe (1988) and Monderer and
Samet (1989) explore another option, which is to weaken the properties
of the **K**

^{*}

_{N}operator. Monderer and Samet motivate this approach by noting that even if a mutual knowledge hierarchy stops at a certain level, agents might still have higher level mutual

*beliefs*about the proposition in question. So they replace the knowledge operator

**K**

_{i}with a

*belief operator*

**B**

^{p}

_{i}:

Definition 5.1

If μ_{i}(·) is agenti's probability distribution over Ω, thenB^{p}_{i}(A) = { ω | μ_{i}(A|_{i}(ω)) ≥p}

**B**^{p}_{i}(*A*)
is to be read ‘*i* believes *A* (given *i*'s
private information) with probability at least *p* at
ω’, or ‘*i* *p*-believes
*A*’. The belief operator
**B**^{p}_{i} satisfies
axioms K2, K3, and K4 of the knowledge operator.
**B**^{p}_{i} does not
satisfy K1, but does satisfy the weaker property

μ_{i}(A|B^{p}_{i}(A)) ≥p

that is, if one believes *A* with probability at least
*p*, then the probability of *A* is indeed at least
*p*.

One can define *mutual* and *common p-beliefs*
recursively in a manner similar to the definition of mutual and common
knowledge:

Definition 5.2

Let a set Ω of possible worlds together with a set of agentsNbe given.(1) The proposition that

Ais(first level or first order) mutualp-belief for the agents ofN,B^{p}_{N1}(A), is the set defined by(2) The proposition thatB^{p}_{N1}(A) ≡ ∩_{i∈N}B^{p}_{i}(A).Aism^{th}level(orm^{th}order)mutualp-belief among the agents ofN,B^{p}_{Nm}(A), is defined recursively as the set(3) The proposition thatB^{p}_{Nm}(A) ≡ ∩_{i∈N}B^{p}_{i}(B^{p}_{Nm−1}(A))Aiscommon p-beliefamong the agents ofN,B^{p}_{N*}(A), is defined as the set

B^{p}_{N*}(A) ≡∞

∩

^{m=1}B^{p}_{Nm}(A).

If *A* is common (or *m*^{th} level mutual)
knowledge at world ω, then *A* is common
(*m*^{th} level) *p*-belief at ω for every
value of *p*. So mutual and common *p*-beliefs formally
generalize the mutual and common knowledge concepts. However, note that
**B**^{1}_{N*}(*A*) is not
necessarily the same proposition as
**K**^{*}_{N}(*A*), that
is, even if *A* is common 1-belief, *A* can fail to be
common knowledge.

Common *p*-belief forms a hierarchy similar to a common
knowledge hierarchy:

Proposition 5.3

ω ∈B^{p}_{Nm}(A) iff(1) For all agentsHence, ω ∈i_{1},i_{2}, … ,i_{m}∈N, ω ∈B^{p}_{i1}B^{p}_{i2}…B^{p}_{im}(A)B^{p}_{N*}(A) iff (1) is the case for eachm≥ 1.

Proof. Similar to the Proof of Proposition 2.5.

One can draw several morals from the e-mail game of Example 5.1.
Rubinstein (1987) argues that his conclusion seems paradoxical for the
same reason the backwards induction solution of Alan's and Fiona's
perfect information game might seem paradoxical: Mathematical induction
does not appear to be part of our "everyday" reasoning. This game also
shows that in order for A to be a common truism for a set of agents,
they ordinarily must perceive an event which implies A
*simultaneously* in each others' presence. A third moral is that
in some cases, it may make sense for the agents to employ some solution
concept weaker than Nash or correlated equilibrium. In their analysis
of the e-mail game, Monderer and Samet (1989) introduce the notions of
*ex ante* and *ex post ε-equilibrium.* An *ex
ante* equilibrium *h* is a system of strategy profiles such
that no agent *i* expects to gain more than ε-utiles if
*i* deviates from *h*. An *ex post* equilibrium
*h*′ is a system of strategy profiles such that no agent
*i* expects to gain more than ε-utiles by deviating from
*h*′ given *i*'s private information. When
ε = 0, these concepts coincide, and *h* is a Nash
equilibrium. Monderer and Samet show that, while the agents in the
e-mail game can never achieve common knowledge of the world ω, if
they have common *p*-belief of ω for sufficiently high
*p*, then there is an *ex ante* equilibrium at which they
follow (*A*,*A*) if ω = ω_{1} and
(*B*,*B*), if ω = ω_{2}. This
equilibrium turns out not to be *ex post.* However, if the
situation is changed so that there are no replies, then Lizzi and
Joanna could have at most first order mutual knowledge that ω =
ω_{2}. Monderer and Samet show that in this situation,
given sufficiently high common *p*-belief that ω =
ω_{2}, there is an *ex post* equilibrium at which
Joanna and Lizzi choose (*B*,*B*) if ω =
ω_{2}! So another way one might view this third moral of
the e-mail game is that agents' prospects for coordination can
sometimes improve dramatically if they rely on their common beliefs as
well as their mutual knowledge.

## Bibliography

### Annotations

Lewis (1969) is the classic pioneering study of common knowledge and its potential applications to conventions and game theory. As Lewis acknowledges, parts of his work are foreshadowed in Hume (1740) and Schelling (1960).
Aumann (1976) gives the first mathematically rigorous formulation of
common knowledge using set theory. Schiffer (1972) uses the formal
vocabulary of *epistemic logic* (Hintikka 1962) to state his
definition of common knowledge. Schiffer's general approach is to
augment a system of sentential logic with a set of knowledge operators
corresponding to a set of agents, and then to define common knowledge
as a hierarchy of propositions in the augmented system. Bacharach
(1992), Bicchieri (1993) and Fagin, *et al*. (1995) adopt this
approach, and develop logical theories of common knowledge which
include soundness and completeness theorems. Fagin, et. al. show that
the syntactic and set-theoretic approaches to developing common
knowledge are logically equivalent.

Aumann (1995) gives a recent defense of the classical view of backwards induction in games of imperfect information. For criticisms of the classical view, see Binmore (1987), Reny (1992), Bicchieri (1989) and especially Bicchieri (1993). Brandenburger (1992) surveys the known results connecting mutual and common knowledge to solution concepts in game theory. For more in-depth survey articles on common knowledge and its applications to game theory, see Binmore and Brandenburger (1989), Geanakoplos (1994) and Dekel and Gul (1996). For her alternate account of common knowledge along with an account of conventions which opposes Lewis' account, see Gilbert (1989).

Monderer and Samet (1989) remains one of the best resources for the study of common p-belief.

### References

- Alberucci, Luca and Jaeger, Gerhard. 2005, "About cut elimination
for logics of common knowledge",
*Annals of Pure and Applied Logic*133(1-3): 73-99. - Aumann, Robert. 1974, "Subjectivity and Correlation in Randomized
Strategies",
*Journal of Mathematical Economics*1, 67-96. - Aumann, Robert. 1976, "Agreeing to Disagree",
*Annals of Statistics*4, 1236-9. - Aumann, Robert. 1987, "Correlated Equilibrium as an Expression of
Bayesian Rationality",
*Econometrica*55, 1-18. - Aumann, R. 1995. "Backward Induction and Common Knowledge of
Rationality",
*Games and Economic Behavior*8: 6-19. - Bacharach, Michael. 1989. "Mutual Knowledge and Human Reason", mimeo.
- Barwise, Jon. 1988. "Three Views of Common Knowledge", in
*Proceedings of the Second Conference on Theoretical Aspects of Reasoning About Knowledge*, ed. M.Y. Yardi. San Francisco: Morgan Kaufman, pp. 365-379. - Barwise, Jon. 1989.
*The Situation in Logic.*Stanford: Center for the Study of Language and Information. - Bernheim, B. Douglas. 1984. "Rationalizable Strategic Behavior",
*Econometrica*, 52: 1007-1028. - Bicchieri, Cristina. 1989. "Self Refuting Theories of Strategic
Interaction: A Paradox of Common Knowledge",
*Erkenntnis*, 30: 69-85. - Bicchieri, Cristina. 1993. Rationality and Coordination. Cambridge: Cambridge University Press.
- Binmore, Ken. 1987. "Modelling Rational Players I",
*Economics and Philosophy*, 3: 179-241. - Binmore, Ken. 1992.
*Fun and Games.*Lexington, Massachusetts: D. C. Heath. - Binmore, Ken and Brandenburger, Adam. 1988, "Common knowledge and Game theory" ST/ICERD Discussion Paper 88/167, London School of Economics.
- Bonanno, Giacomo and Battigalli, Pierpaolo. 1999, "Recent results
on belief, knowledge and the epistemic foundations of game theory",
*Research in Economics*53(2): 149-225. - Brandenburger, Adam. 1992. "Knowledge and Equilibrium in Games",
*Journal of Economic Perspectives*6: 83-101. - Brandenburger, Adam, and Dekel, Eddie. 1987. "Common knowledge with
Probability 1",
*Journal of Mathematical Economics*16, 237-245. - Brandenburger Adam and Dekel, Eddie. 1988. "The Role of Common
Knowledge Assumptions in Game Theory", in
*The Economics of Missing Markets, Information and Games*, ed. Frank Hahn. Oxford: The Clarendon Press: 46-61. - Carnap, Rudolf. 1947,
*Meaning and Necessity: A Study in Semantics and Modal Logic*, Chicago, University of Chicago Press. - Chwe, Michael. 1999, "Structure and Strategy in Collective
Action.",
*American Journal of Sociology*105: 128-56. - Chwe, Michael. 2000, "Communcation and Coordination in Social
Networks",
*Review of Economic Studies*67: 1-16. - Chwe, Michael. 2001,
*Rational Ritual*. Princeton, NJ: Princeton University Press - Cubitt, Robin and Sugden, Robert. 2003, "Common Knowledge, Salience
and Convention: A Reconstruction of David Lewis' Game Theory",
*Economics and Philosophy*19: 175-210. - Dekel, Eddie and Gul, Faruk. 1996. "Rationality and Knowledge in Game Theory", working paper, Northwestern and Princeton Universities.
- Fagin, Ronald, Halpern, Joseph Y., Moses, Yoram and Vardi, Moshe Y.
1995.
*Reasoning About Knowledge.*Cambridge, Massachusetts: MIT Press. - Geanakoplos, John. 1994. "Common Knowledge", in
*Handbook of Game Theory*, Volume 2, ed. Robert Aumann and Sergiu Hart. Elsevier Science B.V.: 1438-1496. - Gilbert, Margaret. 1989,
*On Social Facts*, Princeton University Press, Princeton. - Halpern, Jospeh. 2001, "Alternative Semantics for Unawareness",
*Games and Economic Behavior*37(2): 321-339 - Harsanyi, J. 1967. "Games with incomplete information played by
"Bayesian" players, I: The basic model."
*Management Science*14: 159-82. - Harsanyi, J. 1968a. "Games with incomplete information played by
"Bayesian" players, II: Bayesian equilibrium points."
*Management Science*14: 320-324. - Harsanyi, J. 1968b. "Games with incomplete information played by
"Bayesian" players, III: The basic probability distribution of the
game."
*Management Science*14: 486-502. - Hintikka, Jaako. 1962.
*Knowledge and Belief.*Ithaca, New York: Cornell University Press. - Hume, David. (1740, 1888) 1976,
*A Treatise of Human Nature*, ed. L. A. Selby-Bigge. rev. 2nd. ed., ed. P. H. Nidditch. Clarendon Press, Oxford. - Lewis, C. I. 1943, "The Modes of Meaning",
*Philosophy and Phenomenological Research*, 4, 236-250. - Lewis, David. 1969,
*Convention: A Philosophical Study*, Harvard University Press, Cambridge, Massachusetts. - Lewis, David. 1978, "Truth in Fiction",
*American Philosophical Quarterly*15, 37-46. - Littlewood, J. E. 1953.
*Mathematical Miscellany*, ed. B. Bollobas. - Mckelvey, Richard and Page, Talbot, "Common knowledge, consensus
and aggregate information",
*Econometrica*54: 109-127. - Meyer, J.-J.Ch. and van der Hoek, Wiebe. 1995,
*Epistemic Logic for Computer Science and Artificial Intelligence*Cambridge Tracts in Theoretical Computer Science 41, Cambridge University Press. - Milgrom, Paul. 1981. "An axiomatic characterization of common
knowledge",
*Econometrica*49: 219-222. - Monderer, Dov and Samet, Dov. 1989, "Approximating Common Knowledge
with Common Beliefs",
*Games and Economic Behavior*1, 170-190. - Nash, John. 1950, "Equilibrium points in n-person games".
*Proceedings of the National Academy of Sciences of the United States*36, 48-49. - Nash, John. 1951, "Non-Cooperative Games".
*Annals of Mathematics*54, 286-295. - Pearce, David. 1984. "Rationalizable Strategic Behavior and the
Problem of Perfection".
*Econometrica*, 52: 1029-1050. - Reny, Philip. 1987. "Rationality, Common Knowledge, and the Theory of Games", working paper, Department of Economics, University of Western Ontario.
- Reny, Philip. 1992. "Rationality in Extensive Form Games",
*Journal of Economic Perspectives*, 6: 103-118. - Rubinstein, Ariel. 1987. "A Game with "Almost Common Knowledge": An
Example", in
*Theoretical Economics*, D. P. 87/165. London School of Economics. - Schelling, Thomas. 1960,
*The Strategy of Conflict*, Harvard University Press, Cambridge, Massachusetts. - Schiffer, Stephen. 1972,
*Meaning*, Oxford University Press, Oxford. - Sillari, Giacomo. 2005, "A Logical Framework for Convention",
*Synthese*147(2): 379-400. - Skyrms, Brian. 1984,
*Pragmatics and Empiricism*, Yale University Press, New Haven. - Stinchcombe, Max. 1988. "Approximate Common Knowledge", mimeo, University of California, San Diego.
- Sugden, Robert. 1986,
*The Economics of Rights, Cooperation and Welfare*, Basil Blackwell, New York. - Vanderschraaf, Peter. 1995, "Endogenous Correlated Equilibria in
Noncooperative Games",
*Theory and Decision*38: 61-84. - Vanderschraaf, Peter. 1998, "Knowledge, Equilibrium and
Convention",
*Erkenntnis*49: 337-369. - Vanderschraaf, Peter. 2001.
*A Study in Inductive Deliberation*, Routledge, New York. - von Neumann, John and Morgenstern, Oskar. 1944.
*Theory of Games and Economic Behavior*, Princeton University Press, Princeton.

## Other Internet Resources

## Related Entries

game theory | prisoner's dilemmaPeter Vanderschraaf <

*peterv@cyrus.andrew.cmu.edu*>

Giacomo Sillari <

*gsillari@andrew.cmu.edu*>