Will the Digital Computer Transform Classical Mathematics?
Phil. Trans. R. Soc. Lond. A (2003) 361, 1675 -1690
Technology is permeated by mathematics. Every aspect of it depends at some level
of its conception, design, and technical description, on mathematical ideas,
knowledge, practices and formalisms. Nowhere is this more so than for those
particular localized implementations of technology we call machines, which in
extreme cases are little more than fragments of mathematical structures that
engineers have found ways to physically realize. The reverse relation, less
obviously, is also the case: mathematics is permeated by technology, by the
traces and effects of machines and apparatuses that have surounded and impinged
on it throughout its history. Machines and mathematics are engaged in a two-way
traffic that forms a co-evolutionary loop, so that, for example, a machine like
the lever depends on the mathematics of ratios and conversely the theory of
ratios and rational numbers is consolidated and motivated by the concrete representations
levers provide. Of course, for an early machine such as the wheel-and-axle,
the traffic would initially be mostly one-way: the mathematical ideas of circles,
axes, angularity, and rotation being rooted in practical engagement with the
machine long before such mathematics was applied to the technologies of circular
and cyclic movement.
A materially framed history of mathematics would take this co-evolution as a
datum. Such a genealogy of the mathematics-machine nexus starting with the wheel
and lever and taking in the pump, the abacus, clock, printing press, sliderule,
loom, steam-engine, the camera, electric motor, typewriter, gramophone, radio,
and computer would uncover many particular versions of two-way traffic. Because
mathematics is augmented and permanently altered by its encounters with machines
whereas machines are simply isolated implementations of this cumulative resource,
the machines dependence on mathematics would be dominant and the reverse
traffic confined to local and minor effects - a mathematical topic connected
to the particular machine would get emphasized, certain abstract mathematical
relations created from their machinic realizations, and so on.
With the advent of the digital computer the entire co-evolutionary dynamic underwent
a phase shift taking the machine/mathematics traffic to an entirely new, radically
productive level. The digital computer was spawned by modern mathematics and
metamathematics. With the exception of the abacus (its precursor), it is the
most uncompromisingly mathematical of all machines ever created, being theoretically
based on a mathematical idealization of a definite procedure
the Turing machine and implemented as an electronic realization of the
binary logic and formal systems of mathematical logic.
The aim of this paper is to suggest that the effect of the computer on mathematics
(pure and applied) will be correspondingly far-reaching and radical; that the
computer will ultimately reconfigure the mathematical matrix from which it sprang
and will do this not only by affecting changes in content and method over a
wide mathematical terrain, but more importantly by altering the practice and
overall nature, and perhaps the very conception we have of mathematics.
To justify such a claim we need to employ a framework able to articulate the
deeper significance of the changes brought into play by the digital computer,
we need, in other words, a general characterization or abstract model of what
it means to do mathematics, to engage in the activity we call mathematics or,
which comes to the same thing, what it means to be and function as a mathematician.
A model of mathematical activity
Behind the various construals of mathematics as an activity investigation
of abstract patterns, symbolic reasoning about ideal entities, study of number
and space, and so on lie three distinct, fundamental theoretical discourses
that enter into the subject, namely: idea, symbol, and procedure. By idea is
meant the domain of human thought, mentation, and concepts from individual ideas
to the most elaborate narratives and products of the imagination; by symbol
is meant the domain of language, signs, symbols, and communicational and semiotic
practices from notational devices to entire linguistic systems; and by procedure
is meant the domain of action, process, and operations from a single calculation,
computation or verification to an arbitrarily complex program of instructions.
Mathematics can be understood as a vast and particular complex of activities
across these domains. The figure of the mathematician, the one who
carries out these activities, who knows and does mathematics, is on this understanding
an assemblage of agencies, each of which implements the characteristic activity
of a domain. So that far from being a monolithic or indivisible actor, the mathematician
can be seen as a plurality of parallel activities, a trio of different sources
of action as follows.
The actor corresponding to the domain of the idea I call the Person. This agency
has extra-mathematical physical and cultural presence, is immersed in natural
language and the subjectivity it makes available, has insight and hunches, provides
motivation for and is the source of intuitions behind concepts and proofs. Abstracted
from the Person is the agency corresponding to the domain of the symbol, which
I call the Subject. The Subject is the source of the intersubjectivity embedded
in mathematical languages, but is without the Persons capacity of self-reference,
since it is circumscribed by languages which by definition lack indexical markers
of time and physical place. Finally, there is the Agent, the actor associated
with the domain of procedure who functions as the delegate for the Person through
the mediation of the Subject; the Agent executes a mathematically idealized
version of the actions imagined by the Person and it does so formally since
it lacks the Subjects access to meaning and significance.
The model presented here, then, comprehends the mathematician as a tripartite
being enacting certain kinds of intersubjective thought-experiments, imagined
procedures conceived to take place in idealized times and spaces. The Person
imagines an idea X associated with some procedure Y the intuitive idea
of number and the action of counting, for example. The Subject thinks X and
Y within mathematical signs; for example, via a notation system which describes
and determines what number signifies mathematically and presupposes
what is meant by counting. The Agent, for whom X and Y are formal
symbols and procedures, without meaning or significance, executes the action
imagined by the Person; for example, the Agent carries out a calculation or
other operation on a list of numerical expressions.
One of the activities performed by the Person is giving proofs of assertions.
The Person makes a claim about an imagined task or procedure counting,
inverting a matrix, etc. that the Agent will execute; the task is articulated
via an appropriate linguistic apparatus supplied by the Subject. Then, by tracking
the Agent via this apparatus, that is by shadowing the Subject, the Person is
persuaded as to the truth of the claim.
These three agencies are by no means absolute or fixed. This is obvious in the
case of the Person who is manifestly a cultural-historical product. But it is
also true of the Subject and the Agent: they are constructs: what constitutes
them the rules and protocols of legitimate sign-usage, permitted definitions,
legitimate procedures, agreements about valid reasoning and proof, and so on
have been shaped over time by innovation, conceptual reconfiguration,
philosophical criticism, explicit programs of rigor, and the like. Having said
this, however, one should note their relative stability: they do not change
easily and when they do they are obliged do so in relation to each other in
order to preserve the coherence of the mathematician they jointly facilitate.
The latter constraint implies that a major change in one of these agencies is
likely to have significant consequences for the others, and hence for the character
of the mathematician and mathematics as a whole. This indeed looks to be the
case at the present historical juncture. The computer is providing an Agent
of incomparable power and flexibility, able to carry out procedures whose results
were previously unsuspected. This frees up the Person to enlarge the scope and
imaginative possibilities of mathematical thought and its significance. This
in turn demands a Subject together with a suitably rich and powerful mathematical
language for communicating between the Person and Agent. The result, as we shall
see, is that a reconfigured triad of agencies able to operate in new and complimentary
ways is being put in place.
With this triadic model as framework we can return to the digital computers
effect on classical mathematics. In order to focus the question on essentials
we shall restrict attention to what is essential and inescapable about the computer:
it is a rule-governed machine; it is a digital as opposed to an analog machine;
it is a material machine subject to and impinging on physical reality. The machinic
aspect will result in local effects on mathematics involving reasoning and proof
as well as the nature of iteration. The digital aspect will impinge on the notion
of continuity which, in the form of the linear continuum, is the site where
the analog is thought within classical mathematics. The material aspect will
impinge on the supposed purity of pure mathematics, that is, on its separation
from what mathematicians call the real world. Since these notions
form the cornerstone of post-Renaissance mathematics, it is likely that any
change in their status will represent be significant.
The remarks about mathematics and computers that follow fall into two parts.
The first is descriptive, outlining some of the varied ways the computer is
currently altering the logical status, content, practice, and evolution of mathematics,
the main thrust of which is a displacement of the infinite and the continuous
by the discrete and finite along with the openings such displacement bring into
play. Part two is more discursive, examining a sense in which infinity might
be a philosophical or theoretical problem for computer science. Specifically,
I look at a contention by cryptographers that computer sciences infinity-based
idealization of a feasible algorithm is irrelevant to practical,
real-world, real-time computational tasks.
Part I local and global transformation
Local effects: machine reasoning
As indicated, every machine will have natural and expected local connections
with the mathematics contiguous to it, that is those mathematical objects, practices,
structures, and topics it naturally requires and facilitates and in some cases
invents. For the lever it might have been rational numbers, for clocks and calendars
counting and the arithmetic of repetition, for the pendulum the concept of a
differential equation, and so on. For the computer the mathematics natural to
it ranges widely since it embraces two core aspects of the subject: the computational/calculational
process itself and the notion of proof in the form of mathematical logic and
metamathematics.
The local effects that flow from the computational process have to do with mathematical
content, with kinds of mathematics that are inspired by, demand, are amenable
to, or are directly fostered by computational and calculational procedures.
Many areas of mathematics are covered by such a description including much of
graph theory, finite combinatorics, cryptography, and the entire theory and
practice of cellular automata (of which more below); in addition there are fields
such as chaos theory, non-linear dynamics and fractal geometry, which embryonically
pre-existed computational methods but only flourished with the entry into mathematics
of the computer and which are now inseparable from the computational processes
permeating them. And, more at the periphery of mainstream mathematics, there
are those areas such as the investigation of symbolic systems, formal and artificial
languages, the theory of algorithms, and finite model theory which, whilst not
computer-driven, relate naturally to the investigation of computational questions.
Indeed, a subfield of one of these that of algorithmic complexity
gave rise to the P = NP question, now considered one of the major open problems
of contemporary mathematics. We shall examine an aspect of this problem in Part
Two.
The local effects associated with the logical, metamathematical aspect of computers
are oriented not to content but to method: to questions of reasoning and syntax,
modes of mathematical argument, deduction, inference, proof and validation,
and the mechanics of formal systems. One form this has taken has been the development
of computer-generated proofs, computer-aided reasoning, and theorem-proving
by machine. In a sense there is nothing new to such an enterprise. Ever since
Pascals invention of a calculating machine and Leibniz celebrated
call to replace all disputation and rational argument by formal, symbolic means,
there have been numerous and varied efforts to machanize reasoning. The contemporary
versions of these range from having the computer find new proofs of existing
theorems to the more impressive goal of generating proofs or disproofs of open
questions as part of efforts to rival traditional, entirely symbolic methods
of mathematical proof and deduction by computational validation. Some of these
have yielded positive results; one such theorem-proving procedure successfully
settled a question, the so-called Robbins conjecture about certain algebras
that had been open for sixty years [McClure, 1997).
Less ambitiously is the use of machine reasoning to augment rather than replace
the traditional symbolic modes of mathematical argument. The main need for augmentation
is the difficulty of managing large volumes of data and information involved
in the handling of cases, circumstances, and effects so numerous and heterogeneous
as to put them beyond the capacity of an individual mathematicians cognitive
wherewithal, not to say lifetime, to examine. For example, the theorems underlying
the classification of all simple finite groups relies on a computer-compiled
database too massive and unwieldy and internally complex for a single mathematician
to process. (Soloman, 1995) Though ostensibly more mundane than using computers
to prove theorems outright, this augmentative approach raises a fundamental
issue as to the nature and assumed constitution of the mathematician.
A famous example is the four-colour problem which asked whether every map drawn
in the plane could be coloured using only four colours. This was settled positively
by K. Appel and W. Haken in 1976 by a proof which relied on a computer program
to check thousands of intricate map configurations too complex to be verified
in any other way. A different kind of departure from traditional methods of
logical validation made possible by the computer occurs with so-called probabilistic
proof procedures which run a series of spot checks on randomly selected fragments
of long proofs and estimate to a given degree of confidence the correctness
of the whole. (Goldreich, 1995) The question raised by probabilistic proofs,
by the example of finite groups, and more pointedly by the Appel-Haken argument
is simple and profound: what is or should count as a proof? Traditionally, a
proof had always been a persuasive argument, a logical narrative which could
be checked and assented to, step by step, by a (in principle, any) individual
mathematician. Evidently, such is not possible for the collection of all simple
finite groups nor for the ensemble of maps that arise and need to be examined
in relation to colouring. Plainly, these examples of augmented proof call for
the idea of the mathematician to be made explicit in a way this
has not been necessary hitherto. Specifically, two issues arise: who
or what are mathematical arguments addressed to and how is this addressee to
be persuaded by a putative proof? In other words, what is or should
be the constitution of the Subject, i.e. what cognitive capacities and technologies
of inscription and representation should be available to it? How acxcording
to what principles -- might persuasion take place, once it is mediated by a
computer program and so removed from the familiar (but in truth largely unexamined)
orbit of an individual assenting to logical steps? Both these questions are
about the couplings between the triad of agencies constituting the mathematician,
with the second requiring the idea of persuasion as a dynamic involving the
Person and Agent to be made explicit (Rotman 1993).
Global effects: simulation not proof
Local effects, then, represent computer-inflected changes in mathematical content
and forms of proof. In terms of our model of mathematical activity, they are
principally the concern of the mathematical Subject and its relation to the
computational Agent. More far-reaching are the epistemic changes involving imagining,
understanding, conceiving analogies, and having intuitions, insight, and hunches
which concern the character of the Person and its relation to the Agent. However
the chief effect here that of digital simulation -- reaches beyond mathematics
to cover the entire field of contemporary technological and scientifico-mathematical
practice and is so widely operative and productive that it has been hailed as
an entirely new, third method of scientific research, a twentieth-century, specifically
computational mode different from and not reducible to the two existing research
modes -- the experimental/observational that founded science in the seventeenth
century and the much older theoretical/deductive mode basic to mathematical
reasoning. (Kaufmann and Smarr 1993)
To describe the effects and scope of digital simulation it will be useful to
note mathematics relation to physics. From the time of Galileo until well
into the twentieth century the bulk of external problems and questions put to
mathematics came from the natural sciences, principally physics. This coupled
with assumptions within physics of the smoothness and continuity of nature,
inseparable from the infinite divisibility of space and time, legitimated and
fostered a mathematics dominated by an infinite continuum of real numbers, infinite
series, and the differential equations of calculus allied to these notions.
But in order to operate at all, computers need to discretize calculus by converting
these equations into finite difference equations. But even without the effects
of such discretization, physics employment of infinitary, continuum-based
mathematics is no longer secure. Well before the advent of the computer, quantum
physics was born on the overthrow of the assumption of infinite divisibility
of the quantity physicists call action and its replacement by the
formalism of discrete quanta. Moreover, in empirical terms physics is not merely
non-infinitistic but boundedly finite: such easily named quantities as 10^(-100)
meters or 10^(-100) seconds or 10^(100) particles or 10^(100) light years have
no physical meaning in the universe we inhabit. Since the introduction of computational
methods the move against continuum-based mathematics has strengthened: various
projects over the past three decades have argued why and how physics needs to
be finitized (Greenspan 1973, Feynman 1982, Wheeler 1988, Landauer
1991). One of the most extreme and uncompromising finitization projects is Edward
Fredkins hypothesis of "Finite Nature" according to which "at
some scale, space and time are discrete and that the number of possible states
of every finite volume of space-time is finite". (Fredkin 1992) Along with
this goes the insistence that the entire physical universe be understood
somehow -- as a single device, a digital computing machine whose computations
are the processes of the universe and whose software constitutes what we call
the laws of physics.
With this in mind let us return to simulation -- by which is meant a digital
mock-up or electronic model of a given physical or mathematical process. To
mimic a process by computation it is necessary to create a digital version of
a typical state of the process together with a rule for changing states with
the idea that iterating the rule through repeated recursion will give a simulation
of the original process. To witness the mimicry the results are visualized,
that is plotted, graphed, or displayed so they present a picturable world, a
visual, on-screen image whose electronic behaviour can be software manipulated.
A simulation is created, then, in two phases, discretization and visualization
which together constitute a virtual world close enough to the original one (physical
or mathematical) to be a valuable investigative and creative tool.
All of the above applies to mathematical as well as physical worlds. However,
in the case of mathematics, where there is apparently no external world to be
simulated and which is already in the business of creating, representing, and
studying virtual worlds, the importance and novelty of the simulational mode
is easily missed. One reason is that the visualization it offers seems merely
an extension (enormously refined and powerful to be sure) of time-honoured methods
of plotting and curve-drawing which, however useful as cognitive aids, are eliminable
-- adding nothing substantive to mathematics. Such a viewpoint, held unproblematically
by most mathematicians, is misleading for several reasons.
In the first place the very premise that diagram and curve-drawing in mathematics
are epiphenomenal, of undoubted psychological and practical value but of no
epistemological worth, is a mistaken result of programmes of rigor, such as
that of Bourbakis rewriting of mathematics within a set-theoretical framework,
that neither know nor are interested in how mathematics is actually made or
changes. (Rotman 2000). Second, plotting and visualization techniques in the
past were severely limited in scope and epistemic importance by the difficulty
of performing more than a small number of the relevant calculations, a fact
which made it easy to naturalize their results into a mere picturing of pre-existing
ideas, to be jettiosoned once the idea had been symbolized. Quite to the contrary,
the computer creates complex topological surfaces, fractal functions, and iteration-based
entities which were previously not only invisible but unimagined, unconceived.
The consequent increase in mathematical knowledge has been considerable, ranging
from the discovery of new topological surfaces to the opening up of novel research
topics. (Friedhoff & Benzon 1991, Munzner 1995). Clearly, it would be a
reductive misapprehension to think of simulation as mere visualization.
Thirdly, the very focus on the visualization phase allows digital simulation
to be reduced to the business of picturing and so occludes the radically transformative
changes set in train by the prior phase of discretization and the real-time
iteration of a rule it facilitates both of which engage the finite discretum
and not the infinite continuum.
But, leaving visualization aside, the style and character of a simulation-inflected
mathematical science - pragmatic, material, experimental -- breaks with
mathematics traditional understanding of itself as a pure theoretical
and deductive science. The advent of an electronically modified Agent de-destabilizes
the triadic assemblage of actors that has held the pure mathematician
in place since classical times. In the wake of this empirical that is
impure -- Agent comes a digitally refigured Person who can intuit, imagine and
recognize the new sorts of mathematical objects, simulations and iterations,
delivered by computation; and at the same time a digitally refigured Subject
is required with the language and writing system the appropriate mathematical
software necessary to mediate the traffic between Person
and Agent.
Traditonal mathematics proves the existence of its abstract objects and establishes
claims about them by constructing logically controlled narratives of imagined
procedures using an Agent who is an ideal, infinite possibility. Simulational
mathematics, substitutes empirical observation for logical validation and exhibits
rather than proves the existence of its material objects via actual, real-time
computations; its Agent is a material, finite actuality.
What, then, are the consequences for future mathematics of this difference?
As weve already observed, there a change in practical methods of mathematical
enquiry: empirical mathematics, frees the Person of the burden of proof and
thereby liberates other, previously uninteresting or unrecognized ways of mathematical
thought and activity. This has implications for the different kinds of objects
and processes that the two styles can engage. Essentially, mathematical assertions
are predictions about what will happen if certain specified operations are carried
out. By insisting that its assertions be logically proved, classical pure mathematics
restricts itself to investigating procedures whose outcomes, in principle at
least, can be predicted. It is not equipped to investigate the fate of uncompressible
processes whose outcomes cannot be foretold; programs that have to be run in
order to reveal their final state; sequences of 0s and 1s which are only specifiable
by a rule equal in length to them. These are precisely the sort of mathematical
object simulational-empirical mathematics handles well. And since there are
in fact vastly more of such uncompressible processes and irreducible sequences
than the predictable kind, the territory of empirical mathematics is potentially
huge compared to that of the classical variety. This is essentially the claim
Stephen Wolfram makes for his "new science" based on simulational
methods (computational iteration of simple programs such as cellular automata)
which represents a "major generalization of mathematics with new
ideas and methods, and a vast new areas to be explored." (Wolfram 2002).
Finally, to put the matter ontologically, in terms of what exists and is observable
mathematically by virtue of the very materiality of computation, we can note
the importance of actual versus ideal counting and iteration procedures: simulational
mathematics achieves its difference and power precisely because it has access
to effects the actual results of iterating real-world operations -
unavailable to the idealized, necessarily internal, that is rigorously separate
from the world, regularities of classical reasoning.
Part II Infinity in Computer Science
a. An intractable problem
As weve observed, for a physics increasingly committed to a discrete,
quantized universe the use of infinity within its mathematical apparatus, via
continuity, unlimited divisibility, and real numbers, raises certain problems.
For theoretical computer science such a consideration seems irrelevant, since
its objects and processes and their mathematical formulations are discrete from
the start.
But this doesnt settle the question of theoretical computer sciences
relation to infinity: the science rests on the integers which form an infinite
progression and this very attribute seems as intrinsic to computing as infinite
divisibility was until recently to physics. In light of this, one can legitimately
ask whether the presence of infinity within its mathematical formalism is, or
might turn out to be, a problem for the science of computing.
I shall respond by concentrating on a combinatorial difficulty that goes back
to the early days of the subject, which has subsequently given rise to a mathematized
and highly generalised metaproblem, a problem about problems, namely the notoriously
difficult P = NP question, "considered to be one of the most important
problems in contemporary mathematics and theoretical computer science."
(Sipser 1992)
A famous examples of the difficulty that P = NP mathematizes is the Travelling
Salesman Problem. A salesman has to visit each of N cities just once and return
to his starting city. He has a table of all the intercity distances and wants
to calculate the length of the shortest circuit that will accomplish this. How
should he proceed? The problem seems ideal for the mindless number-crunching
computers excel at: the salesman tells the computer to find the length of every
circuit in turn and output the length of the shortest. But this involves an
extravagant computational cost as N increases. Thus, for small values, say N=10,
the number of circuits to check is (N-1)!= 9! or 362,880 circuits, a computational
fleabite for presentday computers. For a slightly bigger value, say N=20, the
number is approximately 10^(17), a very large collection of circuits but checkable
in a few months by an array of state-of-the-art computers. For N=50, or so,
no presentday computer or any conceivable improvement thereof could complete
a search through all possible circuits, whilst for N=100 the number is outside
the realm of the possible.
Though it features distances and circuits through cities the underlying issue
here is not tied to these notions. The difficulty facing the Salesman surfaces
in a vast web of seemingly unrelated problems, numbering several thousand, that
arise naturally in graph theory, scheduling, routing, and timetabling, in cryptography,
in Boolean logic, and so on over a wide range of disparate combinatorial situations.
Some, like the Salesmans question are optimization problems: they ask
for the largest or smallest value of a numerical parameter such as circuit length,
flow through a network, the size of a subset, etc. Others are in the form of
decision problems which simply demand a yes or no answer, such as the Boolean
Satisfaction Problem: given a Boloean expression in N variables, is there an
assignment of 1(Truth) or 0(False) to the variables such that the whole expression
has value 1? The two forms are for the most part convertible into each other.
Regardless of their diverse contexts and origins, all these problems can be
seen as posing the same basic question: Is there a way of finding the optimum
value or the yes/no answer that does not resort to the brute-force method of
checking through all the possibilities? In computational terms, is there an
algorithm accomplishing the relevant task that runs in a feasible amount of
time, one that doesnt, in other words, require an impossibly large number
of computational steps for a small or reasonably small value of its input variable
N.
To make this question meaningful, theoretical computer science requires a mathematical
definition of feasible. Now, one cannot expect such a definition
to address particular numerical facts (the computational leap from N=10 to N=50
we saw above) or, by the same token, to address what reasonably small
might mean. As a consequence, the question of feasibility is perforce posed
generically, in terms of an unrestricted number variable N, and takes the form
of a comparison between one type of mathematical function of N and another for
all values of N. Thus, since brute force requires an exponential number of computational
steps ((N-1)! for the salesmans problem), one has to define feasible
to mean a number of steps bounded above by some function f(N) which must be
guaranteed to be smaller than any exponential function of N. The decision by
the computer science community to choose f to be a polynomial function was very
natural: polynomials are a robust class of functions, closed with respect to
addition, multiplication, and composition whose well-known properties include
the crucially relevant mathematical fact that any polynomial function is eventually
smaller than any exponential function.
Defining feasible as polynomial time was the founding move of the theory of
algorithms, and though it attracted certain criticisms, the choice made possible
the development of complexity theory, an "elegant and useful theory that
says something meaningful about practical computation". (Papadimitriou
1994) Elegant certainly, but just how meaningfully related to practical computation
is as we shall see a matter of dispute.
For theoretical purposes the yes/no decision problem formulation rather than
the optimization formulation is more convenient. The class of all decision problems
solvable in polynomial time is denoted by P. Observe that if a problem D belongs
to P, that is, if there exists a feasible algorithm finding a solution s answering
D, then D has the additional property that any proposed solution s to D can
be checked for correctness in polynomial time. This is because one can run the
algorithm solving D and compare the result with s to determine whether s is
or is not a solution to D, all of which requires no more than polynomial time.
The class of all such decision problems which can be checked polynomially is
denoted by NP. Our observation amounts to the fact that P is automatically a
subclass of NP. The great open question enshrined in the equation P = NP thus
asks for the reverse inclusion: is NP contained in P? In other words, if one
can verify a possible solution to a decision problem D in polynomial time can
one find a solution to D in polynomial time?
Evidently, the concept of polynomial is the organizing idea behind the P = NP
problematic at every level; without it the question would collapse. This is
so for several reasons. First, the problem itself, as a mathematization of specific
combinatorial situations such as the Travelling Salesmans problem, depends
totally on the polynomial-based definition of feasibility; in fact the problem
is an artefact of this definition, brought into being with it and impossible
to conceive without it. Second, its encapsulation of the many thousands of diverse
concrete instances, of which the Salesmans problem is merely one, into
a single meta problem depends repeatedly on the unbounded freedom that results
from polynomial composition, namely if f( ) and g( ) are polynomial function
then so is f(g( )). Last, and by no means least, the formulation would make
no sense in terms of its thousands of motivating concrete instances, all of
which are resolvable by brute exponential search routines, were
it not for the mathematical fact that any polynomial function is smaller than
any exponential function.
But it is this last, quite crucial fact that is precisely the site of a dissatisfaction
with complexity theory, the point at which its practical relevance has been
challenged; and not coincidentally, as we shall now see, it is where infinity
enters the mathematical understanding of algorithms put in place by computer
science.
b. asymptotic growth
Defining a feasible computation as one with a polynomial run-time rests, then,
on a basic inequality that polynomial functions are smaller than exponential
ones. The guarantee behind the inequality lies in an asymptotic definition of
smaller than which refers to behaviour of functions at infinity.
Specifically, a function p( ) is (asymptotically) smaller than a function q(
) if there exists some number M such that p(N) < q(N) for all values of N
greater than M. In other words, the definition refers, in a way that cannot
be eliminated, to sufficiently large N. Thus, only in the limit, as N goes to
infinity, is the relative size of two functions resolvable, so as to make it
true that any polynomial function p is smaller than any exponential function
q.
But suppose allowing passage to the limit, that is, invoking sufficiently large
values of N, is ruled out? What if the values of N are constrained to lie below
some fixed bound b? In that case, the guarantee vanishes, however large b is:
one can, in other words, no longer assume, in the presence of a bound, that
polynomials are automatically smaller than exponentials. In fact, under the
condition of boundedness, the reverse can always be made to happen, polynomials
can be larger than exponentials: for any exponential function q( ), it is always
possible to find a polynomial p( ) such that p(N) > q(N) for all N below
b.
The definition of feasibility is supposed to ensure achievable computation times.
But what, in the real, finite world where the demand for it originates, is to
be meant by achievable? For, on any presently conceivable understanding of physical
reality the existence of a bound b is inescapable. This will follow from local
limits, such as those imposed by computer scientists not being able to wait
for unspecified periods for algorithms to terminate, to global ones, such as
the universe we inhabit appearing not to permit an arbitrarily large sequence
of physical actions to be executed.
The existence of a bound obviously disallows any appeal to an infinite range
of values of N and so prevents the theory of complexity from operating. Indeed,
the presence of a bound brings into relief a clutch of questions: What sense
does it make sense for a science of computing to formulate its fundamental concept
of feasibility in terms of an infinitistic criterion that necessarily appeals
to numbers outside the reach of its computational processes? Pragmatically,
what is the relevance of the P = NP question for categorizing (let alone understanding)
the possibilities and limits of computation? Why would its solution have anything
to say about real feasibility, about the possibility of finding or not being
able to find algorithms that worked within the limits and freedoms of the real
world rather than those of the idealized, boundless universe of infinitistic
mathematics?
These questions are far from idle. In cryptography, for example, one has to
ensure that an encrypted message is safe from attack, that no method of decryption
with a practically achievable running time exists, regardless of how such a
method might be mathematically characterized. In such a context, equating practically
achievable with the existence of a polynomial-time algorithm makes little
sense, and the entire P = NP problematic begins to appear beside the point.
Unsurprisingly, this is precisely the judgement voiced by many cryptographers
and engineers more concerned with practicalities than mathematical elegance.
Thus, for example, about the central cryptographic task of finding the factors
of an integer, Steve Morgan observes "Even if factoring is polynomial,
it isnt necessarily practical
A polynomial algorithm of say the
order of N^20 is essentially intractable even for small values of N. Conversely,
an algorithm that ran in the order of 2^(0.1N) would make factoring billion
bit numbers easy." (quoted Reinhold 1995)
Moreover, it is not even necessary to invoke high-degree polynomials to make
the point; exactly the same objection arises from the possibility of large multiplicative
constants being present. Thus if k is such a constant, 10^(30) say, then an
algorithm running in linear time, that is of order kN, will be intractable for
small values of N, whereas an exponential algorithm of the order of 2^(N/K)
would be tractable for all values of N that could possibly occur in practice.
These examples illustrate that not only is excluding exponential running times
and identifying feasibility with polynomial time not to the point, but it runs
counter to the underlying intuition relating to smallness, namely
that if (the input to) a problem is small, one wants a feasible computation
solving the problem to be small or at least near small. But whilst 10, say,
is by common agreement a small value of N one could not claim that 10^20 (one
hundred billion billion), let alone 10^30, is small or nearly so. These objections
have been long known and long shrugged off by complexity theorists as the price
of doing (asymptotic) business. If prodded the only response they can offer
is that N^20, though undeniably a polynomial function, is artificial, that in
practice, most familiar algorithms are of the order N^n where n is possibly
as large as 5 or 6 but certainly a number much smaller than 20, and that likewise
large constants such as k instanced above are simply not natural. But this appeal
is hardly in good-faith: the concepts of natural, and in practice,
as well as a notion of scale or size, and the opposition
of small/large, are precisely what have been expunged
from the theorys asymptotic account of feasibility.
In fact, the P = NP question and along with it the entire apparatus of algorithmic
complexity and the asymptotic characterization of feasible that
engendered it, is not merely irrelevant to cryptography, but has been deemed
conceptually deceptive. For A. Reinhold, the apparatus is a "naked emperor",
whose effect has been that "a whole generation of computer scientists wrongly
believes that complexity theory illuminates computation and that the P = NP
problem is the missing link to a theoretical basis for cryptography." (Rheinhold
1995)
Nor is it the case that cryptographers have special, ungeneralizable needs.
Their criticism is sharpened perhaps by the financial, security, and military
stakes involved in questions of decryption, but their overall objection stems
from unavoidable pragmatic demands; as such it arises in any technological or
engineering arena that looks to computer science for algorithms to solve real-world
numerical problems in real-world times.
The presence of a bound, then, not only eliminates the possibility of any asymptotically-based,
and in particular polynomial, criterion of feasibility, but it offers a basis
to start re-thinking the question of feasible computation. Thus, as has been
suggested by several researchers, one can divide the integers into two separate
regions, H1 and H2, below and above b respectively, and work inside H1 within
a limited universe of discourse situated far below b. Thus, Reinhold, for example,
suggesting the value 2^(2^1000) for b, goes on to comment that "the undue
attention paid to classical complexity theory arises from an inclination to
assume results that are true in H2 must somehow be true in H1. But there is
neither a theoretical not practical basis for this belief." Op. Cit.
Indeed, there is not, and working below b takes a first step toward re-thinking
the feasibility question. It achieves this by making explicit the counter-productive
nature of letting N go to infinity in a directly practical way.
More significantly perhaps, taking b as a dividing point does induce a numerical
scale or size into the proceedings and hence the possibility of defining small
and large in relation to it. But as it stands the approach is crude:
the bound b is imposed from outside and its value, chosen deliberately to be
so large as to be remote from any actualizable computation, is for this very
reason unconnected to anything real or observable. Any scale thus induced will
be extrinsic to such computations and their limits and unable to state (let
alone address) the expectation, attached to the intuitive notion of feasibility,
that computational time of a solution be of the same order of magnitude
(league, ballpark, size,numerical
reach) as the problem; that is, the expectation that for small
N the solution time of an N size problem should also be small or nearly so but
not in any event large
.
The irrelevance to cryptography of the P = NP question foregrounds a disconnect
between an infinitistic mathematical concept of feasibility and an intuitive,
pragmatic one; a gulf between the demands of real-world computation and the
idealized model associated with the classical sequence of integers. It will
be obvious that the gulf is a re-occurrence, posed in a particularly dramatic
and focussed form, of the theoretical difference between EMPIRICAL and classical
mathematics identified earlier. It would seem natural, then, to attempt to tackle
the question of feasibiity and hence perhaps the P = NP problematic anew using
the resources and possibilities put into play by simulational methods. Whether
this attempt to use empirical mathematics can yield dividends is surely knowable
only by experiment, but it is not clear that the attempt makes much sense as
an approach to the P = NP problem itself: the very status of this question as
a pre-eminent unsolved problem of contemporary pure mathematics indicates how
deeply folded it is within the world-view of classical mathematics. In which
case the problem might be the baby that empirical mathematics throws out along
with the infinitistic bath water.
Granting this, is there a different mathematical framework for thinking feasibility,
one which tries to navigate between the classical and empirical routes? If there
is, finding it would seem to entail overcoming a large and difficult obstacle:
how to effect a conceptual escape from the great attractor of the classical
integers. No easy matter, since it necessitates refusing the core belief of
classical mathematics, the time-honoured "Dogma of Natural Numbers"
(Rashevskii 1973), namely their naturality; their objective, non-human,
existence as the given, true and only idealization of
actual counting and iteration
.
The integers embody a formalism and an idea, a syntax and a semantics; one can
identify two sorts of attempts to escape from this dogma depending on which
aspect is given priority. The first makes questions of significance and meaning
subsidiary by starting from a symbolic apparatus based on formal definitions
and concepts able to be worked to produce mathematical results, and only subsequently
does it evaluate the meanings thereby facilitated and imposed. The second reverses
the procedure and insists on an intuition first, a convincing alternative picture
of what counting and iteration are to mean before any particular formalism is
offered. Examples of both can be found. The first approach has been pursued
by Vladimir Sazanov who introduces the inequality loglog N < 10 as a formal
axiom constraining the value of any N considered to be a feasible number, and
proves various metamathematical theorems on the internal consistency of the
effect of such a definition (Sazonov 1995, 1998). The second has been pursued
by the present writer who focusses on the semiotics of the process of repetition
inherent to the integers to refuse their naturality, and outlines
an alternative, non-classical understanding of counting and arithmetic (Rotman
1993, 1998). Whether either of these approaches will succeed in overcoming the
dogma of their naturality remains to be seen; evidently, the stakes in challenging
the notion are high.
Conclusion
Yes, of course, the computer will transform classical mathematics, surely this
is no longer in question. And it is reasonable to predict that it will do so
along the same lines machinic, digital, material which have determined
the changes it has so far wrought. Let me summarise what these are.
The computer is a machine for investigating mathematical reality; it is reconfiguring
the mathematical imaginary and reasoning in relation to repetition
as radically as the microscope/telescope reconfigured vision and seeing
in relation to scale; in its wake mathematical thought will never be the same.
Its machinic ability to recursively repeat the application of a rule far beyond
unaided human capacity enables it to transcend the knowledge and proof available
to such capacity, principally by engendering hitherto unknown and unsuspected
mathematical objects, thus confirming the fact that repetition of the same can
produce difference. In this it appears to replay on a fantastically more
complex plane the primal creative act of mathematics, namely counting,
which produces an endless stream of new, ever different numbers through the
recursive iteration of the selfsame operation of adding the unit.
The computer is a machine whose conception, operation, construction, and infrastructure
are digital: it is antagonistic to the analog and to notions of smoothness and
unbroken continuity associated with it. As a result it presents a conceptual
and methodological opposition to the real number continuum which is the formalization
within classical mathematics of these notions; and, less directly, to the concept
of infinity folded into the very concept of real number. A major effect of this
opposition to continuum-based mathematics, aided by independent moves within
physics to quantize all the parameters measuring physical reality, as weve
seen, is to push mathematics away from infinite methods in the direction of
the discretely finite
The computer is a material machine as against the ideal, immaterial object of
classical mathematics, the Turing machine, it is modelled on. Two consequences
flow from this fact. One, the topic of part two of the present essay, concerns
a difficulty that arises when this difference is elided and the ideal, potentially
infinite mathematics of the model is used to describe and investigate real-world
computations. How seriously theoretical computer science takes this difficulty
remains to be seen. The second consequence, more pertinent to the future of
mathematics, stems from a kind of reversal of the first, namely capitalizing
on the difference between real and imagined counting. The reversal works by
actually running a program on a material machine and observing the result rather
than theoretically running it on an ideal machine and predicting the result,
a procedure which delivers into mathematics effects from the so-called real
world, effects mathematical objects and relations -- available to the
mathematical Person in no other way. The result is the opening up of mathematics
by a powerful new Agent in the direction of an empirical science; a development
that not only goes against its time-honoured status as a purely theoretical
pursuit, but is a dramatic and radical reversal of the last two centuries
efforts to eliminate all traces of the physical world from mathematics
definitions and methods in the name of an abstract programme of mathematical
rigor.
Finally, a theme of this essay has been the computers impact on classical
mathematics core concept of infinity. This should not be misunderstood.
Certainly, the computers ongoing colonization of mathematics, particularly
in the direction of empirical simulations, is bringing about a widespread de-emphasis
of the concept. But to say this is not to decry the concept as such or interpret
the computers impact as destroying, eliminating or deconstructing it.
What is involved is not an attack on it, or a repudiation of its mathematical
legitimacy and usefulness, or a refusal of the validity of infinitary mathematics
pursued within its own terms, but rather a fresh realization or re-cognition
of its deeply ideal status in the light of computational methods. In the light,
that is, of an uncompromisingly finite Agent who, as a result of this very finitude
is able to sanction use of the concept on the level of the Subject and Person,
but as a convenient way of speaking, a permitted abuse of language, that despite
it literal meaning has no empirical content. However, it would be
naïve to think that such a re-cognition will not impinge ultimately on
the status and unchallenged cultural prestige of classical, infinitistic mathematics.
Keywords
Agent, classical, computer, continuum, digital, discrete, finite, infinite,
integer, machine, material, mathematics, Person, polynomial, procedure, Subject
It is a pleasure to acknowledge my gratitude to Prof. Alistair MacFarlane for
his illuminating comments and suggestions on an earlier draft of this essay;
without them it would have been a poorer thing
.
References and Bibliography
Feynman, R.: Simulating physics with computers, International Journal of Theoretical
Physics, Volume 21, (6-7), pp.467-468, 1982.
Fredkin, E.: Finite nature (1992)
http://www.im.lcs.mit.edu/poc/fredkin/Finite-Nature
Friedhoff, R and Benzon, W.: Visualization: the second computer revolution,
(W.H.Freeman, 1991).
Goldreich, O.: Electronic Colloquium on Computational Complexity, Lecture Notes
Series, 1995
http://www.eccc.uni-trier.de/eccc-local/ECCC-LectureNotes/goldreich/oded.html
Greenspan, D.: Discrete models (Addison Wesley, 1973).
Kaufmann,W. and Smarr, L.: Supercomputing and the transformation of science,
(Scientific American Library, 1993).
Landauer, R.: Information is physical, Physics Today, May 1991, pp. 23-29
Leivant, D.: (Editor) Logic and computational complexity, LNCS, Volume 960,
1995, pp.30-51.
Papadimitriou, C.: Computational complexity (Addison Wesley, 1994).
McClure, W.: Solution of the Robbins problem, J. Automatic Reasoning, Volume
19(3), 1997, pp. 263-76.
Munzner, T. et al: Interactive methods for visualizable geometry, (1995)
http://www.geom.umn.edu/~munzner/ieee94/ieee/ieee.html
Rashevskii, P.K.: On the dogma of natural numbers, Russian Mathematical Surveys,
Volume 28 (4), 1973, pp. 143-48.
Rheinhold, A.: P = NP doesnt affect cryptography (1995)
http://world.std.com/~reinhold/p=np.txt
Rotman, B.: Ad infinitum
the ghost in Turings machine, (Stanford
University Press, 1993).
Rotman, B.: Mathematics as sign, (Stanford University Press, 2000).
Rotman, B.: Response to Patrick Peccatte, Foundations of Mathematics List, 5/14/1998
http://ftp.psu.edu/simpson/fom/postings
Sazonov, V.: On Feasible Numbers, in (Leivant, D. 1995)
Sazonov, V.: Reply to Rotman, Foundations of Mathematics List, 8/31/98
http://ftp.psu.edu/simpson/fom/postings
Sipser, M.: The History and Status of the P versus NP question, 24th Annual
ACM, May 1992, pp.603-618.
Smale, S.: Mathematical problems for the next century, The Mathematical Intelligencer,
Volume 20 (2), 1998, pp. 7-15.
Soloman, R.: On finite simple groups and their classification, Notices of the
AMS, 1995, pp. 231-39.
Wheeler, J.H.: World as system self-synthesized by quantum networking, IBM Journal
of Research and Development, Voume 32 (1), 1988, pp.4-15.
Wolfram, S.: A new kind of science, (Wolfram Media Inc., 2002)