Introduction to Robotics
The history of industrial automation is characterized by periods of rapid change in
popular methods. Either as a cause or, perhaps, an effect, such periods of change in
automation techniques seem closely tied to world economics. Use of the industrial
robot, which became identifiable as a unique device in the 1960s, along with
computer-aided design (CAD) systems and computer-aided manufacturing (CAM)
systems, characterizes the latest trends in the automation of the manufacturing
process. These technologies are leading industrial automation through another
transition, the scope of which is still unknown.
In North America, there was much adoption of robotic equipment in the early
1980s, followed by a brief pull-back in the late 1980s. Since that time, the market has
been growing, although it is subject to economic swings, as are all markets. Industry statistics track the number of robots installed per year in the major
industrial regions of the world. Note that Japan reports numbers somewhat differently
from the way that other regions do: they count some machines as robots
that in other parts of the world are not considered robots (rather, they would be
simply considered "factory machines"). Hence, the numbers reported for Japanare
somewhat inflated.
A major reason for the growth in the use of industrial robots is their declining
cost. Through the decade of the 1990s, robot prices dropped
while human labor costs increased. Also, robots are not just getting cheaper, they
are becoming more effective—faster, more accurate, more flexible. If we factor
these quality adjustments into the numbers, the cost of using robots is dropping even
faster than their price tag is. As robots become more cost effective at their jobs,
and as human labor continues to become more expensive, more and more industrial
jobs become candidates for robotic automation. This is the single most important
trend propelling growth of the industrial robot market. A secondary trend is that,
economics aside, as robots become more capable they become able to do more and
more tasks that might be dangerous or impossible for human workers to perform.
The applications that industrial robots perform are gradually getting more
sophisticated, but it is still the case that, in the year 2000, approximately 78%
of the robots installed in the US were welding or material-handling robots.
A more challenging domain, assembly by industrial robot, accounted for 10% of
installations.
This book focuses on the mechanics and control of the most important form
of the industrial robot, the mechanical manipulator. Exactly what constitutes an
industrial robot is sometimes debated. Jointed mechanical manipulators of the kind treated in this book are
always included, while numerically controlled (NC) milling machines are usually
not. The distinction lies somewhere in the sophistication of the programmability of
the device—if a mechanical device can be programmed to perform a wide variety
of applications, it is probably an industrial robot. Machines which are for the most
part limited to one class of task are considered fixed automation. For the purposes
of this text, the distinctions need not be debated; most material is of a basic nature
that applies to a wide variety of programmable machines.
By and large, the study of the mechanics and control of manipulators is
not a new science, but merely a collection of topics taken from "classical" fields.
Mechanical engineering contributes methodologies for the study of machines in
static and dynamic situations. Mathematics supplies tools for describing spatial
motions and other attributes of manipulators. Control theory provides tools for
designing and evaluating algorithms to realize desired motions or force applications.
Electrical-engineering techniques are brought to bear in the design of sensors
and interfaces for industrial robots, and computer science contributes a basis for
programming these devices to perform a desired task.
The following sections introduce some terminology and briefly preview each of the
topics that will be covered in the text.
In the study of robotics, we are constantly concerned with the location of objects in
three-dimensional space. These objects are the links of the manipulator, the parts
and tools with which it deals, and other objects in the manipulator's environment.
At a crude but important level, these objects are described by just two attributes:
position and orientation. Naturally, one topic of immediate interest is the manner
in which we represent these quantities and manipulate them mathematically.
In order to describe the position and orientation of a body in space, we will
always attach a coordinate system, or frame, rigidly to the object. We then proceed
to describe the position and orientation of this frame with respect to some reference
coordinate system. Any frame can serve as a reference system within which to express the
position and orientation of a body, so we often think of transforming or changing
the description of these attributes of a body from one frame to another. Chapter 2
discusses conventions and methodologies for dealing with the description of position
and orientation and the mathematics of manipulating these quantities with respect
to various coordinate systems.
Developing good skills concerning the description of position and orientation of
rigid bodies is highly useful even in fields outside of robotics.
Kinematics is the science of motion that treats motion without regard to the forces
which cause it. Within the science of kinematics, one studies position, velocity, acceleration, and all higher order derivatives of the position variables (with respect
to time or any other variable(s)). Hence, the study of the kinematics of manipulators
refers to all the geometrical and time-based properties of the motion.
Manipulators consist of nearly rigid links, which are connected by joints that
allow relative motion of neighboring links. These joints are usually instrumented
with position sensors, which allow the relative position of neighboring links to be
measured. In the case of rotary or revolute joints, these displacements are called
joint angles. Some manipulators contain sliding (or prismatic) joints, in which the
relative displacement between links is a translation, sometimes called the joint
offset.
The number of degrees of freedom that a manipulator possesses is the number
of independent position variables that would have to be specified in order to locate
all parts of the mechanism. This is a general term used for any mechanism. For
example, a four-bar linkage has only one degree of freedom (even though there
are three moving members). In the case of typical industrial robots, because a
manipulator is usually an open kinematic chain, and because each joint position is
usually defined with a single variable, the number of joints equals the number of
degrees of freedom.
At the free end of the chain of links that make up the manipulator is the end-effector.
Depending on the intended application of the robot, the end-effector could
be a gripper, a welding torch, an electromagnet, or another device. We generally
describe the position of the manipulator by giving a description of the tool frame,
which is attached to the end-effector, relative to the base frame, which is attached
to the nonmoving base of the manipulator.

A very basic problem in the study of mechanical manipulation is called forward
kinematics. This is the static geometrical problem of computing the position and
orientation of the end-effector of the manipulator. Specifically, given a set of joint angles, the forward kinematic problem is to compute the position and orientation of
the tool frame relative to the base frame. Sometimes, we think of this as changing
the representation of manipulator position from a joint space description into a
Cartesian space description.
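To make this concrete, here is a minimal sketch (in Python) of forward kinematics for a hypothetical two-link planar arm; the link lengths l1, l2 and joint angles theta1, theta2 are illustrative assumptions, not taken from the text:

import numpy as np

def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    # Position of the tool-frame origin relative to the base frame
    x = l1 * np.cos(theta1) + l2 * np.cos(theta1 + theta2)
    y = l1 * np.sin(theta1) + l2 * np.sin(theta1 + theta2)
    # Orientation of the tool frame (in the plane, a single angle suffices)
    phi = theta1 + theta2
    return x, y, phi

print(forward_kinematics(np.pi / 4, np.pi / 6))  # joint space -> Cartesian space

The same computation for a spatial manipulator is carried out with the frames and transformations described in Chapter 2.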
Introduction to Artificial Intelligence
"Artificial intelligence" is the ability of machines to do things
that people would say require intelligence. Artificial intelligence (AI)
research is an attempt to discover and describe aspects of human intelligence
that can be simulated by machines. For example, at present
there are machines that can do the following things:
1. Play games of strategy (e.g., Chess, Checkers, Poker) and
(in Checkers) learn to play better than people.
2. Learn to recognize visual or auditory patterns.
3. Find proofs for mathematical theorems.
4. Solve certain, well-formulated kinds of problems.
5. Process information expressed in human languages.
The extent to which machines (usually computers) can do these
things independently of people is still limited; machines currently exhibit
in their behavior only rudimentary levels of intelligence. Even so, the
possibility exists that machines can be made to show behavior indicative
of intelligence, comparable or even superior to that of humans.
Alternatively, AI research may be viewed as an attempt to develop
a mathematical theory to describe the abilities and actions of things
(natural or man-made) exhibiting "intelligent" behavior, and serve as a
calculus for the design of intelligent machines. As yet there is no "mathematical
theory of intelligence," and researchers dispute whether there
ever will be.
This book serves as an introduction to research on machines that display intelligent behavior. Such machines will sometimes
be called "artificial intelligences," "intelligent machines," or "mechanical
intelligences."
The inclination in this book is toward the first viewpoint of AI research,
without forsaking the second. Since AI research is still in its
infancy, it is prudent to withhold estimates of its future. It is
best to begin with a summation of present knowledge, considering such
questions as:
1. What is known about natural intelligence?
2. When can we justifiably call a machine intelligent?
3. How and to what extent do machines currently simulate intelligence
or display intelligent behavior?
4. How might machines eventually simulate intelligence?
5. How can machines and their behavior be described mathematically?
6. What uses could be made of intelligent machines?
Each of these questions will be explored in some detail in this
book. The first and second questions are covered in this chapter. It is
hoped that the six questions are covered individually in enough detail
so that the reader will be guided to broader study if he is so inclined.
For parts of this book, some knowledge of mathematics (especially sets,
functions, and logic) is presupposed, though much of the book is understandable
without it.
TURING'S TEST
A basic goal of AI research is to construct a machine that exhibits
the behavior associated with human intelligence, that is, comparable to
the intelligence of a human being. It is not required that the
machine use the same underlying mechanisms (whatever they are) that
are used in human cognition, nor is it required that the
machine go through stages of development or learning such as those
through which people progress.
The classic experiment proposed for determining whether a machine
possesses intelligence on a human level is known as Turing's test (after
A. M. Turing, who pioneered research in computer logic, undecidability theory, and artificial intelligence). This experiment has yet to be performed
seriously, since no machine yet displays enough intelligent
behavior to be able to do well in the test. Still, Turing's test is the basic
paradigm for much successful work and for many experiments in
machine intelligence, from Samuel's Checkers Player to "semantic-information
processing" programs such as Colby's PARRY or Raphael's SIR.
Basically, Turing's test consists of presenting a human being, A,
with a typewriter-like or TV-like terminal, which he can use to converse
with two unknown (to him) sources, B and C. The
interrogator A is told that one terminal is controlled by a machine and
that the other terminal is controlled by a human being whom A has
never met. A is to guess which of B and C is the machine and which is
the person. If A cannot distinguish one from the other with significantly
better than 50% accuracy, and if this result continues to hold no matter
what people are involved in the experiment, the machine is said to
simulate human intelligence.
Some comments on Turing's test are in order. First, the nature
of Turing's test is such that it does not permit the interrogator A to observe
the physical natures of B and C; rather, it permits him only to
observe their "intellectual behavior," that is, their ability to communicate
with formal symbols and to "think abstractly." So, while the test
does not enable A to be prejudiced by the physical nature of either
B or C, neither does it give a way to compare those aspects of an
entity's behavior that reflect its ability to act nonabstractly in the real
world, that is, to be intelligent in its performance of concrete operations
on objects. Can the machine, for example, fry an egg or clean
a house?
Second, one possible achievement of AI research would be to produce
a complete description of a machine that can successfully pass
Turing's test, or to find a proof that no machine can pass it. The complete
description must be of a machine that can actually be constructed.
A proof that there is no such constructible machine (it might say, e.g.,
"The number of parts in such a machine must be greater than the
number of electrons in the universe.") is consequently to be regarded
as a proof of the "no machine" alternative.
Third, it may be that more than one type of machine can pass
Turing's test. In this case, AI research has a secondary problem of
creating a general description of all machines that will successfully pass
Turing's test.
Fourth, if a machine passes Turing's test, it means in effect that
there is at least one machine that can learn to solve problems as well as
a human being. This would lead to asking if a constructible machine can
be described which would be capable of learning to solve not only those
problems that people can usually solve, but also those that people create
but can only rarely solve. That is, is it possible to build mechanical
intelligences that are superior to human intelligence?
It is not yet possible to give a definite answer to any of these
questions. Some evidence exists that AI research may eventually attain
at least the goal of a machine that passes Turing's test.
It is clear that the intellectual capabilities of a human being are
directly related to the functioning of his brain, which appears to be a
finite structure of cells. Moreover, people have succeeded in constructing
machines that can "learn" to produce solutions to certain specific
intellectual problems, which are superior to the solutions people can
produce. The most notable example is Samuel's Checkers Player, which
has learned to play a better game of Checkers than its designer, and
which currently plays at a championship level.
Software And Hardware II
Output Devices
Once data are processed, output devices translate the language of bits into a form
humans can understand. Output devices are divided into two basic categories: those
that produce hard copy, including printers and plotters; and those that produce
soft (digital) copy, including monitors (the most commonly used output device).
Soft copy is also produced by speakers that produce speech, sound, or music.
Secondary Storage Devices
The memory we have discussed so far is temporary or volatile. To save your work
permanently, you need secondary storage devices. Magnetic disks, magnetic tape,
and optical disks are used as secondary storage media. Magnetic media (disk,
diskette, tape, and high-capacity Zip disks) store data and programs as magnetic spots or electromagnetic charges. High-capacity optical disks (compact disks [CDs]
or digital video disks [DVDs]) store data as pits and lands burned into a plastic disk.
Solid-state memory devices include flash memory cards used in notebooks, memory
sticks, and very compact key chain devices; these devices have no moving parts,
are very small, and have a high capacity. USB flash drives in particular can store
very large amounts of information.
SOFTWARE
Software refers to the programs—the step-by-step instructions that tell the hardware
what to do. Without software, hardware is useless. Software falls into two general categories:
system software and application software.
System Software
System software consists of programs that let the computer manage its resources.
The most important piece of system software is the operating system. The operating
system is a group of programs that manage and organize resources of the computer.
It controls the hardware, manages basic input and output operations, keeps track
of your files saved on disk and in memory, and directs communication between
the CPU and other pieces of hardware. It coordinates how other programs work
with the hardware and with each other. Operating systems also provide the user
interface—that is, the way the user communicates with the computer. For example,
Windows provides a graphical user interface, pictures or icons that you click on with
a mouse. When the computer is turned on, the operating system is booted or loaded
into the computer’s RAM. No other program can work until the operating system is
booted.
Application Software
Application software allows you to apply computer technology to a task you need
done. There are application packages for many needs.
Word-processing software allows you to enter text for a paper, report, letter, or
memo. Once the text is entered, you can format it, that is, make it look the way you
want it to look. You can change the size, style, and face of the type. In addition, margins
and justification can be set to any specifications. Style checkers can help you
with spelling and grammar. Word-processing software also includes thesauri, headers
and footers, index generators, and outlining features.
Electronic spreadsheets allow you to process numerical data. Organized into
rows and columns intersecting to form cells, spreadsheets make doing arithmetic
almost fun. You enter the values you want processed and the formula that tells the
software how to process them and the answer appears. If you made a mistake entering
a value, just change it and the answer is automatically recalculated. Spreadsheet
software also allows you to create graphs easily—just by indicating what cells you
want graphed. Electronic health records (EHRs) can use spreadsheets to graph a
series of a patient’s blood values over time.
Database management software permits you to manage large quantities of data
in an organized fashion. Information in a database is organized in tables. The database
management software makes it easy to enter data, edit data, sort or organize
data, search for data that meets a particular criterion, and retrieve data. Once the
structure of the table is defined and the data entered, that data can be used for a
variety of purposes without being retyped. Eye-pleasing, businesslike reports can
easily be generated by simply defining their structure.
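As a small sketch of these operations, using Python's built-in sqlite3 module (the table, columns, and values below are hypothetical, chosen only to illustrate entering, searching, and sorting data):

import sqlite3

conn = sqlite3.connect(":memory:")  # a throwaway in-memory database
cur = conn.cursor()
cur.execute("CREATE TABLE patients (name TEXT, glucose REAL, drawn TEXT)")
cur.executemany("INSERT INTO patients VALUES (?, ?, ?)",
                [("Doe", 5.4, "2024-01-02"), ("Roe", 6.1, "2024-01-03")])
# search for rows meeting a criterion, sorted, without retyping the data
for row in cur.execute("SELECT * FROM patients WHERE glucose > 5.5 ORDER BY name"):
    print(row)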
There are also specialized software packages used in specific fields such as medicine.
For example, there are specialized accounting programs used in medical
offices. Microsoft is considering developing a new software package for the health
care industry. Communications software includes Web browsers, such as Internet Explorer.
These programs allow you to connect your computer to other computers in a
network.
Software And Hardware
Hardware
The physical components of a computer are called hardware. Pieces of hardware may
be categorized according to the functions each performs: input, process, output, and
storage. As you recall, inside the computer, all data are represented by the binary
digits (bits) 1 (one) and 0 (zero). To translate data into 1s and 0s is to digitize.
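For example, a tiny Python sketch of digitized text, printing the 8-bit pattern (the standard ASCII code) behind each character:

for ch in "Hi":
    print(ch, "->", format(ord(ch), "08b"))
# prints: H -> 01001000
#         i -> 01101001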
Input Devices
Input devices function to take data that people understand and translate those data
into a form that the computer can process. Input devices may be divided into two
categories: keyboards and direct-entry devices.
Direct-entry devices include pointing devices, scanning devices, smart and optical
cards, speech and vision input, touch screens, sensors, and human-biology input
devices.
The pointing device with which you are most familiar is the mouse, which you
can use to position the insertion point on the screen, or make a choice from a
menu. Other pointing devices are variations of the mouse. Light pens, digitizing
tablets, and pen-based systems allow you to use a pen or stylus to enter data. The
marks you make or letters you write are digitized.
Most scanning devices digitize data by shining a light on an image and measuring
the reflection. Bar-code scanners read the universal product codes; optical mark
recognition devices can recognize a mark on paper; optical character recognition
devices can recognize letters. Special scanning equipment called magnetic ink character
recognition (MICR) is used by banks to read the numbers at the bottoms of
checks. You are familiar with fax machines, which scan images, digitize them, and
send them over telecommunication lines. Some scanning devices, called image
scanners, scan and digitize whole pages of text and graphics. One scanning device of
particular interest to those with impaired eyesight is the Kurzweil scanner—hardware
and software—which scans printed text and reads it aloud to the user.
Radio frequency identification (RFID) tags (input devices) are now used to
identify anything from the family dog to the sponge the surgeon left in your body, by
sending out radio waves. One medical insurance company is conducting a two-year
trial with chronically ill patients who will have an RFID the size of a grain of rice
implanted. The RFID will contain their medical histories. It can transmit up to 30 feet without
the person’s knowledge. In 2006, one U.S. company implanted chips in two of
its employees “as a way of controlling access to a room where it holds security video
footage for government agencies and police.”

Several different kinds of cards are used as input devices: your automated teller
machine (ATM) card or charge card contains a small amount of data in the magnetic stripe. A smart card can hold more data and contains a microprocessor. Smart cards
have been used as debit cards. Several states now use smart cards as driver’s licenses.
The card includes a biometric identifier and may include other personal information
as well. Privacy advocates fear that there is so much information on the cards that they
can become a target for identity thieves. An optical card holds about two thousand
pages. The optical card may be used to hold your entire medical history, including
test results and X-rays. If you are hospitalized in an emergency, the card—small
enough to carry in your wallet—would make this information immediately available.
Vision input systems are currently being developed and refined. A computer
uses a camera to digitize images and stores them. The computer “sees” by having the
camera take a picture of an object. The digitized image of this object is then compared
to images in storage. This technology can be used in adaptive devices, such as
in glasses that help Alzheimer’s patients. The glasses include a database of names
and faces; a camera sees a face, and if it “recognizes” the face, it gives the wearer the
name of the subject.
Speech input systems allow you to talk to your computer, and the computer
processes the words as data and commands. A speech-recognition system contains a
dictionary of digital patterns of words. You say a word and the speech-recognition
system digitizes the word and compares the word to the words in its dictionary. If it
recognizes the word, the command is executed. There are speech dictation packages
tailored to specific professions. A system geared toward medicine would
include an extensive vocabulary of digitized medical terms and would allow the
creation of patient records and medical reports. This system can be used as an
input device by physicians who, in turn, can dictate notes, even while, for example,
operating. Speech recognition is also especially beneficial as an enabling technology,
allowing those who do not have the use of their hands to use computers. In
English, many phrases and words sound the same, for example, hyphenate and -8
(hyphen eight). Speech-recognition software allows mistakes such as these to be
corrected by talking. The newest speech-recognition software does not need training
and gets “smarter” as you use it. It looks at context to get homophones (to, too,
two) correct.

Of particular interest to health professionals are input devices called sensors.
A sensor is a device that collects data directly from the environment and sends those
data to a computer. Sensors are used to collect patient information for clinical
monitoring systems, including physiological, arrhythmia, pulmonary, and obstetrical/
neonatal systems. In critical care units, monitoring systems make nurses aware of
any change in a patient’s condition immediately. They detect the smallest change in
temperature, blood pressure, respiration, or any other physiological measurement.
The newest kinds of input devices are called human-biology input devices. They
allow you to use your body as an input device. They include biometrics, which are
being used in security systems to protect data from unauthorized access. Biometrics
identify people by their body parts. Biometrics include fingerprints, hand prints,
face recognition, and iris scans. Once thought to be almost 100 percent accurate,
biometric identification systems are now recognized as far from perfect.

Line-of-sight input allows the user to look at a keyboard displayed on a screen
and indicate the character selected by looking at it. Implanted chips have allowed
locked-in stroke patients (a syndrome caused by stroke where a person cannot
respond, although he or she knows what is going on) to communicate with a computer
by focusing brain waves (brain wave input); this is experimental; research is
continuing.
Processing Hardware and Memory
Once data are digitized, they are processed. Processing hardware is the brain of the
computer. Located on the main circuit board (or motherboard), the processor or
system unit contains the central processing unit (CPU) and memory. The CPU has
two parts: the arithmetic-logic unit, which performs arithmetic operations and logical
operations of comparing; and the control unit, which directs the operation of
the computer in accordance with the program’s instructions.
The CPU works closely with memory. The instructions of the program being
executed must be in memory for processing to take place. Memory is also located
on chips on the main circuit board. The part of memory where current work is temporarily
stored during processing is called random-access memory (RAM). It is temporary
and volatile. The other part of memory is called read-only memory (ROM)
or firmware; it contains basic start-up instructions, which are burned into a chip at
the factory; you cannot change the contents of ROM.
Many computers have open architecture that allows you to add devices. The system
board contains expansion slots, into which you can plug expansion boards for
additional hardware. The board has sockets on the outside, called ports. You can
plug a cable from your new device into the port. The significance of open architecture
is the fact that it enables you to add any hardware and software interfaces to
your existing computer system. This means you can not only expand the memory of
your computer but also add devices that make your computer more amenable to
uses in medicine. Expansion boards also allow the use of virtual reality simulators,
which help in teaching certain procedures.
Introduction to Information Technology
The term information technology (IT) includes not only the use of computers but
also communications networks and computer literacy—knowledge of how to use
computer technology. As in other fields, the basic tasks of gathering, allocating, controlling,
and retrieving information are the same. The push to use IT in all aspects
of health care, from the electronic health record (EHR) to integrated hospital
information technology (HIT) systems, makes it crucial for health care professionals
to be familiar with basic computer concepts. In this chapter, we will focus on
computer literacy, computers, and networks. Currently, computer literacy involves
several aspects. A computer literate person knows how to make use of a computer in
his or her field to make tasks easier and to complete them more efficiently, has a
knowledge of terminology, and understands in a broad, general fashion what a computer
is and what its capabilities are. Computer literacy involves knowledge of the Internet and the World Wide Web and the ability to take advantage of their
resources and to critically judge the information.
A computer is an electronic device that accepts data (raw facts) as input, processes
or alters them in some way, and produces useful information as output. A computer
manipulates data by following step-by-step instructions called a program. The
program, the data, and the information are temporarily stored in memory while
processing is going on, and then permanently stored on secondary storage media for
future use. Computers are accurate, fast, and reliable.
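As a toy illustration of this input-process-output cycle (the data values below are made up), a three-line Python program:

readings = [98.6, 99.1, 100.4]                    # input: raw facts
average = sum(readings) / len(readings)           # processing: step-by-step instructions
print("Average temperature:", round(average, 1))  # output: useful information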
“Technology will provide no miracles that feel like miracles for long,” observes editor
and historian Frederick Allen. Adds science-fiction writer Bruce Sterling, “We should
never again feel all mind boggled at anything that human beings create. No matter
how amazing some machine may seem, the odds are very high that we’ll outlive it.”
The personal computer is over two decades old. The Internet has been familiar to the
public for over 10 years. It has been more than five years since the now commonplace
“www” for World Wide Web began appearing in company ads. And, like
cars, elevators, air-conditioning, and television—all of which have wrought
tremendous changes on society and the landscape—they are rapidly achieving what
technology is supposed to do: become ordinary. They are becoming part of the
wallpaper of our lives, almost invisible.
When computer
and communications technologies are combined, the result is information
technology, or “info tech”: technology that merges computing with
high-speed communications links carrying data, sound, and video.
Note that there are two parts to this definition: computers and communications.
A computer is a programmable, multiuse machine that accepts data (raw facts
and figures) and processes, or manipulates, them into information we can use,
such as summaries, totals, or reports.
Communications technology, also called telecommunications technology,
consists of electromagnetic devices and systems for communicating over long
distances. Online means using a computer or other information device,
connected through a voice or data network, to access information and services
from another computer or information device.
Hardware and Software
To understand the myriad uses of IT in health care, you need to familiarize yourself
with computer terminology, hardware, and software applications. Every computer
performs similar functions. Specific hardware is associated with each function.
Input devices take data that humans understand and digitize those data, that is,
translate them into binary forms of ones and zeroes, ons and offs that the computer
processes; a processing unit manipulates data; output devices produce information
that people understand; memory and secondary storage devices hold information,
data, and programs.
Although all computers perform similar functions, they are not the same.
There are several categories based on size, speed, and processing power: supercomputers
are the largest and most powerful. Supercomputers are used for scientific
purposes, such as weather forecasting and drug design. Supercomputers take complex
mathematical data and create simulations of epidemics, pandemics, and other
disasters. Mainframes are less powerful and are used in business for input/output
intensive purposes, such as generating paychecks or processing medical insurance
claims. Minicomputers are scaled-down mainframes; they are multiuser computers
that are used by small businesses. Microcomputers (personal computers) are powerful
enough for an individual’s needs in word processing, spreadsheets, and database
management. Small handheld computers called personal digital assistants (PDAs)
originally could hold only a notepad, a calendar, and an address book. Today,
sophisticated PDAs are used throughout the health care system. Physicians can write
prescriptions on PDAs, consult online databases, and capture patient information
and download it to a hospital computer. PDAs also hold reference manuals and are
used in public health to gather information and help track diseases and epidemics.
The embedded computer is a single-purpose computer on a chip of silicon, which is embedded in anything from appliances to humans. An embedded computer may
help run your car, microwave, pacemaker, or watch. A chip embedded in a human
being can dispense medication, among other things.
Why Digital?
Comparing the block diagrams for analog and digital communication, we see that the digital communication system involves far more processing. However, this is not an obstacle for modern transceiver design, due to the exponential increase in the computational power of low-cost silicon integrated circuits. Digital communication has the following key advantages.
Optimality
For a point-to-point link, it is optimal to separately optimize source coding and channel coding, as long as we do not mind the delay and processing incurred in doing so. Due to this source-channel separation principle, we can leverage the best available source codes and the best available channel codes in designing a digital communication system, independently of each other. Efficient source encoders must be highly specialized. For example, state-of-the-art speech encoders, video compression algorithms, and text compression algorithms are very different from each other, and each is the result of significant effort over many years by a large community of researchers. However, once source encoding is performed, the coded modulation scheme used over the communication link can be engineered to transmit the information bits reliably, regardless of what kind of source they correspond to, with the bit rate limited only by the channel and transceiver characteristics. Thus, the design of a digital communication link is source-independent and channel-optimized. In contrast, the waveform transmitted in an analog communication system depends on the message signal, which is beyond the control of the link designer, hence we do not have the freedom to optimize link performance over all possible communication schemes. This is not just a theoretical observation: in practice, huge performance gains are obtained by switching from analog to digital communication.
Scalability
While the source-channel separation principle is stated for a single digital communication link between source encoder and decoder, there is nothing preventing us from inserting additional links, with the source encoder and decoder remaining at the end points. This is because digital communication allows ideal regeneration of the information bits; hence, every time we add a link, we can focus on communicating reliably over that particular link. (Of course, information bits do not always get through reliably, hence we typically add error recovery mechanisms such as retransmission, at the level of an individual link or “end-to-end” over a sequence of links between the information source and sink.) Another consequence of the source-channel separation principle is that, since information bits are transported without interpretation, the same link can be used to carry multiple kinds of messages. A particularly useful approach is to chop the information bits up into discrete chunks, or packets, which can then be processed independently on each link. These properties of digital communication are critical for enabling massively scalable, general-purpose communication networks such as the Internet.
Such networks can have large numbers of digital communication links, possibly with different characteristics, independently engineered to provide “bit pipes” that can support specified data rates. Messages of various kinds, after source encoding, are reduced to packets, and these packets are switched along different paths through the network, depending on the identities of the source and destination nodes and the loads on different links in the network. None of this would be possible with analog communication: link performance in an analog communication system depends on message properties, and successive links incur noise accumulation, which limits the number of links that can be cascaded.
The preceding makes it clear that source-channel separation, and the associated bit pipe abstraction, is crucial to the formation and growth of modern communication networks. However, there are some important caveats worth noting. Joint source-channel design can provide better performance in some settings, especially when there are constraints on delay or complexity, or if multiple users are being supported simultaneously on a given communication medium. In practice, this means that “local” violations of the separation principle (e.g., over a wireless last hop in a communication network) may be a useful design trick. Similarly, the bit pipe abstraction used by network designers is too simplistic for the design of wireless networks at the edge of the Internet: physical properties of the wireless channel such as interference, multipath propagation, and mobility must be taken into account in network engineering.
Why analog design remains important?
While we are interested in transporting bits in digital communication, the physical link over which these bits are sent is analog. Thus, analog and mixed-signal (digital/analog) design play a crucial role in modern digital communication systems. Analog design of digital-to-analog converters, mixers, amplifiers, and antennas is required to translate bits into physical waveforms to be emitted by the transmitter. At the receiver, analog design of antennas, amplifiers, mixers, and analog-to-digital converters is required to translate the physical received waveforms into digital (discrete valued, discrete time) signals that are amenable to the digital signal processing at the core of modern transceivers. Analog circuit design for communications is therefore a thriving field in its own right, which this textbook makes no attempt to cover. However, the material in Chapter 3 on analog communication techniques is intended to introduce digital communication system designers to some of the high-level issues addressed by analog circuit designers. The goal is to establish enough of a common language to facilitate interaction between system and circuit designers. While much of digital communication system design can be carried out by abstracting away the intervening analog design (as done in Chapters 4 through 8), closer interaction between system and circuit designers becomes increasingly important as we push the limits of communication systems, as briefly indicated in the epilogue.
Digital Communication II
Channel
The channel distorts and adds noise, and possibly interference, to the transmitted signal. Much of our success in developing communication technologies has resulted from being able to optimize communication strategies based on accurate mathematical models for the channel. Such models are typically statistical, and are developed with significant effort using a combination of measurement and computation. The physical characteristics of the communication medium vary widely, and hence so do the channel models. Wireline channels are typically well modeled as linear and time-invariant, while optical fiber channels exhibit nonlinearities. Wireless mobile channels are particularly challenging because of the time variations caused by mobility, and due to the potential for interference due to the broadcast nature of the medium. The link design also depends on system-level characteristics, such as whether or not the transmitter has feedback regarding the channel, and what strategy is used to manage interference.
Example: Consider communication between a cellular base station and a mobile device. The electromagnetic waves emitted by the base station can reach the mobile’s antennas through multiple paths, including bounces off streets and building surfaces. The received signal at the mobile can be modeled as multiple copies of the transmitted signal with different gains and delays. These gains and delays change due to mobility, but the rate of change is often slow compared to the data rate, hence over short intervals, we can get away with modeling the channel as a linear time-invariant system that the transmitted signal goes through before arriving at the receiver.
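A minimal discrete-time sketch of such a model in Python (the path gains and delays below are made-up values, not from any measured channel):

import numpy as np

gains  = [1.0, 0.6]   # assumed path gains
delays = [0, 3]       # assumed path delays, in samples

def multipath_channel(x):
    # Superpose delayed, scaled copies of the transmitted signal
    y = np.zeros(len(x) + max(delays))
    for g, d in zip(gains, delays):
        y[d:d + len(x)] += g * np.asarray(x, dtype=float)
    return y

print(multipath_channel(np.ones(5)))  # the two copies overlap and add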
Demodulator
The demodulator processes the signal received from the channel to produce bit estimates to be fed to the channel decoder. It typically performs a number of signal processing tasks, such as synchronization of phase, frequency and timing, and compensating for distortions induced by the channel.
Example: Consider the simplest possible channel model, where the channel just adds noise to the transmitted signal. In our earlier example of sending ±s(t) to send 0 or 1, the demodulator must guess, based on the noisy received signal, which of these two options is true. It might make a hard decision (e.g., guess that 0 was sent), or hedge its bets, and make a soft decision, saying, for example, that it is 80% sure that the transmitted bit is a zero. There are a host of other aspects of demodulation that we have swept under the rug: for example, before making any decisions, the demodulator has to perform functions such as synchronization (making sure that the receiver’s notion of time and frequency is consistent with the transmitter’s) and equalization (compensating for the distortions due to the channel).
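A toy sketch of the decision step alone, for an antipodal scheme observed as r = ±1 plus noise (the noise level sigma is an assumed parameter, and synchronization and equalization are ignored here):

import numpy as np

sigma = 0.5  # assumed noise standard deviation

def demodulate(r, soft=False):
    if not soft:
        return 0 if r >= 0 else 1  # hard decision: pick the closer of +1 and -1
    # soft decision: posterior probability that bit 0 (i.e., +1) was sent,
    # assuming equally likely bits and Gaussian noise of std dev sigma
    return 1.0 / (1.0 + np.exp(-2.0 * r / sigma**2))

print(demodulate(0.35))             # -> 0 (hard decision)
print(demodulate(0.35, soft=True))  # -> about 0.94: fairly sure it is a 0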
Channel decoder
The channel decoder processes the imperfect bit estimates provided by the demodulator, and exploits the controlled redundancy introduced by the channel encoder to estimate the information bits.
Example: The channel decoder takes the guesses from the demodulator and uses the redundancies in the channel code to clean up the decisions. In our simple example of repeating every bit three times, it might use a majority rule to make its final decision if the demodulator is putting out hard decisions. For soft decisions, it might use more sophisticated combining rules with improved performance.
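A sketch of the majority rule for the three-fold repetition code, operating on hard decisions (one information bit per triple):

def decode_repetition(triple):
    # Majority vote over the three hard decisions for one information bit
    return 1 if sum(triple) >= 2 else 0

print(decode_repetition([1, 0, 1]))  # -> 1: the lone flipped copy is outvoted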
While we have described the demodulator and decoder as operating separately and in sequence for simplicity, there can be significant benefits from iterative information exchange between the two. In addition, for certain coded modulation strategies in which channel coding and modulation are tightly coupled, the demodulator and channel decoder may be integrated into a single entity.
Source decoder
The source decoder processes the estimated information bits at the output of the channel decoder to obtain an estimate of the message. The message format may or may not be the same as that of the original message input to the source encoder: for example, the source encoder may translate speech to text before encoding into bits, and the source decoder may output a text message to the end user.
Example: For the example of a digital image considered earlier, the compressed image can be translated back to a pixel-by-pixel representation by taking the inverse spatial Fourier transform of the coefficients that survived the compression. We are now ready to compare analog and digital communication, and discuss why the trend towards digital is inevitable.
Digital Communication
Source Encoder
As already discussed, the source encoder converts the message signal into a sequence of information bits. The information bit rate depends on the nature of the message signal (e.g., speech, audio, video) and the application requirements. Even when we fix the class of message signals, the choice of source encoder is heavily dependent on the setting. For example, video signals are heavily compressed when they are sent over a cellular link to a mobile device, but are lightly compressed when sent to a high-definition television (HDTV) set. A cellular link can support a much smaller bit rate than, say, the cable connecting a DVD player to an HDTV set, and a smaller mobile display device requires lower resolution than a large HDTV screen. In general, the source encoder must be chosen such that the bit rate it generates can be supported by the digital communication link we wish to transfer information over. Other than this, source coding can be decoupled entirely from link design (we comment further on this a bit later).
Example: A laptop display may have a resolution of 1024×768 pixels. For a grayscale digital image, the intensity of each pixel might be represented by 8 bits. Multiplying by the number of pixels gives us about 6.3 million bits, or about 0.8 Mbyte (a byte equals 8 bits). However, for a typical image, the intensities of neighboring pixels are heavily correlated, which can be exploited to significantly reduce the number of bits required to represent the image without noticeably distorting it. For example, one could take a two-dimensional Fourier transform, which concentrates most of the information in the image at lower frequencies, and then discard many of the high-frequency coefficients. There are other possible transforms one could use, and also several more processing stages, but the bottom line is that, for natural images, state-of-the-art image compression algorithms can provide 10X compression (i.e., reduction in the number of bits relative to the original uncompressed digital image) with hardly any perceptual degradation. Far more aggressive compression ratios are possible if we are willing to tolerate more distortion. For video, in addition to the spatial correlation exploited for image compression, we can also exploit temporal correlation across successive frames.
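A crude sketch of this transform-coding idea in Python (the keep fraction and the random stand-in image are assumptions; a real codec would also quantize and entropy-code the surviving coefficients):

import numpy as np

def compress_fft(img, keep=0.15):
    # Keep only low-frequency 2D-FFT coefficients; zero out the rest
    F = np.fft.fft2(img)
    h, w = img.shape
    kh, kw = int(h * keep), int(w * keep)
    G = np.zeros_like(F)
    # low frequencies live in the corners of the unshifted FFT array
    G[:kh, :kw], G[:kh, -kw:] = F[:kh, :kw], F[:kh, -kw:]
    G[-kh:, :kw], G[-kh:, -kw:] = F[-kh:, :kw], F[-kh:, -kw:]
    return G

def reconstruct(G):
    return np.real(np.fft.ifft2(G))  # the inverse transform at the receiver side

img = np.random.rand(64, 64)         # stand-in for a grayscale image
approx = reconstruct(compress_fft(img))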
Channel encoder
The channel encoder adds redundancy to the information bits obtained from the source encoder, in order to facilitate error recovery after transmission over the channel. It might appear that we are putting in too much work, adding redundancy just after the source encoder has removed it. However, the redundancy added by the channel encoder is tailored to the channel over which information transfer is to occur, whereas the redundancy in the original message signal is beyond our control, so that it would be inefficient to keep it when we transmit the signal over the channel.
Example: The noise and distortion introduced by the channel can cause errors in the bits we send over it. Consider the following abstraction for a channel: we can send a string of bits (zeros or ones) over it, and the channel randomly flips each bit with probability 0.01 (i.e., the channel has a 1% error rate). If we cannot tolerate this error rate, we could repeat each bit that we wish to send three times, and use a majority rule to decide on its value. Now, we only make an error if two or more of the three bits are flipped by the channel. It is left as an exercise to calculate that an error now happens with probability approximately 0.0003 (i.e., the error rate has gone down to 0.03%). That is, we have improved performance by introducing redundancy. Of course, there are far more sophisticated and efficient techniques for introducing redundancy than the simple repetition strategy just described.
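The exercise mentioned above can be checked in a few lines of Python; the majority vote fails only when two or all three of the copies are flipped:

from math import comb

p = 0.01  # bit-flip probability of the raw channel
p_err = comb(3, 2) * p**2 * (1 - p) + comb(3, 3) * p**3
print(p_err)  # ~0.000298, matching the roughly 0.03% error rate quoted above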
Modulator
The modulator maps the coded bits at the output of the channel encoder to a transmitted signal to be sent over the channel. For example, we may insist that the transmitted signal fit within a given frequency band and adhere to stringent power constraints in a wireless system, where interference between users and between co-existing systems is a major concern. Unlicensed WiFi transmissions typically occupy 20-40 MHz of bandwidth in the 2.4 or 5 GHz bands. Transmissions in fourth generation cellular systems may often occupy bandwidths ranging from 1-20 MHz at frequencies ranging from 700 MHz to 3 GHz. While these signal bandwidths are being increased in an effort to increase data rates (e.g., up to 160 MHz for emerging WiFi standards, and up to 100 MHz for emerging cellular standards), and new frequency bands are being actively explored (see the epilogue for more discussion), the transmitted signal still needs to be shaped to fit within certain spectral constraints.
Example: Suppose that we send bit value 0 by transmitting the signal s(t), and bit value 1 by transmitting −s(t). Even for this simple example, we must design the signal s(t) so it fits within spectral constraints (e.g., two different users may use two different segments of spectrum to avoid interfering with each other), and we must figure out how to prevent successive bits of the same user from interfering with each other. For wireless communication, these signals are voltages generated by circuits coupled to antennas, and are ultimately emitted as electromagnetic waves from the antennas.
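To make the ±s(t) example concrete, a baseband sketch in Python (the rectangular pulse and the samples-per-bit value T are simplifying assumptions; a real s(t) would be shaped to meet the spectral constraints just described):

import numpy as np

T = 8              # assumed samples per bit interval
s = np.ones(T)     # placeholder pulse; real designs use bandlimited pulse shapes

def modulate(bits):
    # Map bit 0 -> +s(t), bit 1 -> -s(t), one pulse per bit interval
    return np.concatenate([(1 - 2 * b) * s for b in bits])

x = modulate([0, 1, 1])  # +s, -s, -s placed end to end so successive bits do not overlap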
The channel encoder and modulator are typically jointly designed, keeping in mind the anticipated channel conditions, and the result is termed a coded modulator.
Analog & Digital
Even without defining information formally, we intuitively understand that speech, audio, and video signals contain information. We use the term message signals for such signals, since these are the messages we wish to convey over a communication system. In their original form, both during generation and consumption, these message signals are analog: they are continuous time signals, with the signal values also lying in a continuum. When someone plays the violin, an analog acoustic signal is generated (often translated to an analog electrical signal using a microphone). Even when this music is recorded onto a digital storage medium such as a CD, when we ultimately listen to the CD being played on an audio system, we hear an analog acoustic signal. The transmitted signals corresponding to physical communication media are also analog. For example, in both wireless and optical communication, we employ electromagnetic waves, which correspond to continuous time electric and magnetic fields taking values in a continuum.
Analog Communication
Figure 1.1 shows the block diagram for an analog communication system: the modulator transforms the message signal into the transmitted signal; the channel distorts and adds noise to the transmitted signal; and the demodulator extracts an estimate of the message signal from the received signal arriving from the channel.
Given the analog nature of both the message signal and the communication medium, a natural design choice is to map the analog message signal (e.g., an audio signal, translated from the acoustic to electrical domain using a microphone) to an analog transmitted signal (e.g., a radio wave carrying the audio signal) that is compatible with the physical medium over which we wish to communicate (e.g., broadcasting audio over the air from an FM radio station). This approach to communication system design, depicted in Figure 1.1, is termed analog communication. Early communication systems were all analog: examples include AM (amplitude modulation) and FM (frequency modulation) radio, analog television, first generation cellular phone technology (based on FM), vinyl records, audio cassettes, and VHS or beta videocassettes. While analog communication might seem like the most natural option, it is in fact obsolete. Cellular phone technologies from the second generation onwards are digital, vinyl records and audio cassettes have been supplanted by CDs, and videocassettes by DVDs. Broadcast technologies such as radio and television are often slower to upgrade because of economic and political factors, but digital broadcast radio and television technologies are either replacing or sidestepping (e.g., via satellite) analog FM/AM radio and television broadcast. Let us now define what we mean by digital communication, before discussing the reasons for the inexorable trend away from analog and towards digital communication.
Digital Communication
The conceptual basis for digital communication was established in 1948 by Claude Shannon, when he founded the field of information theory. There are two main threads to this theory:
Source coding and compression: Any information-bearing signal can be represented efficiently, to within a desired accuracy of reproduction, by a digital signal (i.e., a discrete time signal taking values from a discrete set), which in its simplest form is just a sequence of binary digits (zeros or ones), or bits. This is true whether the information source is text, speech, audio or video. Techniques for performing the mapping from the original source signal to a bit sequence are generically termed source coding. They often involve compression, or removal of redundancy, in a manner that exploits the properties of the source signal (e.g., the heavy spatial correlation among adjacent pixels in an image can be exploited to represent it more efficiently than a pixel-by-pixel representation).
Digital information transfer: Once the source encoding is done, our communication task reduces to reliably transferring the bit sequence at the output of the source encoder across space or time, without worrying about the original source and the sophisticated tricks that have been used to encode it. The performance of any communication system depends on the relative strengths of the signal and noise or interference, and the distortions imposed by the channel. Shannon showed that, once we fix these operational parameters for any communication channel, there exists a maximum possible rate of reliable communication, termed the channel capacity. Thus, given the information bits at the output of the source encoder, in principle, we can transmit them reliably over a given link as long as the information rate is smaller than the channel capacity, and we cannot transmit them reliably if the information rate is larger than the channel capacity. This sharp transition between reliable and unreliable communication differs fundamentally from analog communication, where the quality of the reproduced source signal typically degrades gradually as the channel conditions get worse.
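To make the notion of channel capacity concrete with a standard special case (not derived in the discussion above): for an ideal bandlimited channel with additive white Gaussian noise, Shannon's formula gives C = W log2(1 + SNR) bits per second, where W is the channel bandwidth and SNR is the signal-to-noise ratio; reliable communication is possible at any rate below C and impossible at any rate above it.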
A block diagram for a typical digital communication system based on these two threads is shown in Figure 1.2. We now briefly describe the role of each component, together with simplified examples of its function.
Communication Systems
Progress in telecommunications over the past two decades has been nothing short of revolutionary, with communications taken for granted in modern society to the same extent as electricity. There is therefore a persistent need for engineers who are well-versed in the principles of communication systems. These principles apply to communication between points in space, as well as communication between points in time (i.e., storage). Digital systems are fast replacing analog systems in both domains. This book has been written in response to the following core question:
what is the basic material that an undergraduate student with an interest in communications should learn, in order to be well prepared for either industry or graduate school? For example, a number of institutions only teach digital communication, assuming that analog communication is dead or dying. Is that the right approach? From a purely pedagogical viewpoint, there are critical questions related to mathematical preparation: how much mathematics must a student learn to become well-versed in system design, what should be assumed as background, and at what point should the mathematics that is not in the background be introduced? Classically, students learn probability and random processes, and then tackle communication. This does not quite work today: students increasingly (and I believe, rightly) question the applicability of the material they learn, and are less interested in abstraction for its own sake. On the other hand, I have found from my own teaching experience that students get truly excited about abstract concepts when they discover their power in applications, and it is possible to provide the means for such discovery using software packages such as Matlab. Thus, we have the opportunity to get a new generation of students excited about this field: by covering abstractions “just in time”
to shed light on engineering design, and by reinforcing concepts immediately using software experiments in addition to conventional pen-and-paper problem solving, we can remove the lag between learning and application, and ensure that the concepts stick.
This textbook represents my attempt to act upon the preceding observations, and is an outgrowth of my lectures for a two-course undergraduate elective sequence on communication at UCSB, which is often also taken by some beginning graduate students. Thus, it can be used as the basis for a two-course sequence in communication systems, or a single course on digital communication, at the undergraduate or beginning graduate level. The book also provides a review or introduction to communication systems for practitioners, easing the path to study of more advanced graduate texts and the research literature. The prerequisite is a course on signals and systems, together with an introductory course on probability. The required material on random processes is included in the text.
We define communication as the process of information transfer across space or time. Communication across space is something we have an intuitive understanding of: for example, radio waves carry our phone conversation between our cell phone and the nearest base station, and coaxial cables (or optical fiber, or radio waves from a satellite) deliver television from a remote location to our home. However, a moment’s thought shows that communication across time, or storage of information, is also an everyday experience, given our use of storage media such as compact discs (CDs), digital video discs (DVDs), hard drives and memory sticks. In all of these instances, the key steps in the operation of a communication link are as follows (a minimal simulation of these steps is sketched after the list):
(a) insertion of information into a signal, termed the transmitted signal, compatible with the physical medium of interest;
(b) propagation of the signal through the physical medium (termed the channel) in space or time;
(c) extraction of information from the signal (termed the received signal) obtained after propagation through the medium.
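The following Python sketch simulates steps (a)-(c) for a binary antipodal (BPSK-style) mapping over an additive Gaussian noise channel. The amplitude, noise level, and function names are illustrative assumptions of ours, not taken from the text.

```python
import random

random.seed(1)

def transmit(bits, amplitude=1.0):
    """(a) Insert information into a signal: map bit b to +/- amplitude."""
    return [amplitude if b else -amplitude for b in bits]

def channel(signal, noise_std=0.4):
    """(b) Propagate through the medium, which adds Gaussian noise."""
    return [s + random.gauss(0.0, noise_std) for s in signal]

def receive(received):
    """(c) Extract information: decide each bit by the sign of the sample."""
    return [1 if r > 0 else 0 for r in received]

bits = [random.randint(0, 1) for _ in range(10000)]
estimates = receive(channel(transmit(bits)))
errors = sum(b != e for b, e in zip(bits, estimates))
print(f"bit error rate: {errors / len(bits):.4f}")
```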
Communications systems can not only link people or systems at great distances via audio, visual, computer, or other messages, but may also link the various parts within systems, and even within single semiconductor chips. They may communicate information in two directions, or only one way, and they may involve one node broadcasting to many, one node receiving from many, or a finite set of nodes communicating among themselves in a network. Even active measurement and remote sensing systems can be regarded as communications systems. In this case the transmitted signals are designed to be maximally sensitive to the channel characteristics, rather than insensitive, and the receiver’s task is to extract these channel characteristics knowing what was transmitted.
A two-node, one-way communication system consists of the channel that conveys the waves, together with a modulator and a demodulator. All communications systems can be regarded as aggregates of these basic two-node units. The modulator transforms the signal or symbol to be transmitted into the signal that is propagated across the channel; the channel may add noise and distortion. The task of the demodulator is to analyze the channel output and to make the best possible estimate of the exact symbol transmitted, accounting for the known channel characteristics and any user concerns about the relative importance of different sorts of errors. A sequence of symbols constitutes a message. A complete communications system is formed by combining many two-node, one-way systems in the desired configuration.
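The phrase "best possible estimate" can be made precise. For the antipodal example sketched above, with equal priors and Gaussian noise of standard deviation sigma, the likelihood ratio test reduces to comparing the received sample against a threshold t = (sigma^2 / (2A)) ln(C10 / C01), where C10 is the cost of deciding "1" when "0" was sent and C01 the reverse. The sketch below (our own worked illustration, with arbitrary cost values) shows how unequal error costs shift the decision threshold away from zero.

```python
import math

def bayes_threshold(amplitude, noise_std, cost_false_one, cost_false_zero):
    """Decision threshold for +/-amplitude signaling in Gaussian noise with
    equal priors: decide '1' when the received sample exceeds
    t = (sigma^2 / (2A)) * ln(cost_false_one / cost_false_zero)."""
    ratio = cost_false_one / cost_false_zero
    return (noise_std ** 2 / (2 * amplitude)) * math.log(ratio)

# Equal costs recover the symmetric, minimum-distance rule: threshold 0.
print(bayes_threshold(1.0, 0.4, 1.0, 1.0))   # 0.0

# If wrongly declaring '1' is 10x as costly, the threshold moves up (~0.184),
# so the receiver demands stronger evidence before deciding '1'.
print(bayes_threshold(1.0, 0.4, 10.0, 1.0))
```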