BOLD 2003: Development and the Internet


Architecture

Ethan Zuckerman & Andrew McLaughlin
with Teaching Fellow Nandan Kamath

 

Table of Contents

Introduction
An Introduction to Internet Infrastructure
Access and Connectivity in Developing Countries
Interconnection in Developing Countries
Discussion Questions
Special Event

Introduction

Welcome to the introductory module of “Development and the Internet”. In this module we will guide you through, and encourage you to think about, the Internet’s Architecture and Infrastructure.

First, using simple examples, you will be introduced to the way the Internet works, the processes involved in keeping it running, and the entities that put it all together and continue to maintain it. You are particularly encouraged to follow the links available in Parts 2 and 3 of the first section, "An Introduction to Internet Infrastructure." Familiarity with these materials will help you appreciate the complexity of the network architecture as well as the degree of coordination needed to complete even the most basic Internet transaction. As you read, ask yourself what this complexity, and the intense need for coordination among competitors, means for developing countries.

The second half of this module will discuss the challenges to achieving wider Internet connectivity in the developing world. Much of the global population still has no access to the Internet, and many of those who do manage to get online receive very poor quality of service. Across the developing world, we'll find a wide range of approaches to the problem of expanding connectivity. We will introduce you to some of these, and then examine one particular approach that could change the connectivity landscape: fostering Internet Exchange Points (IXPs) to reduce costs and improve quality of service.

An Introduction to Internet Infrastructure

X-Originating-IP: [209.198.247.19]
From: "Ethan Zuckerman" yaoobruni@hotmail.com
To: mclaughlin@pobox.com
BcHotmail:
Subject: 70 hops
Date: Fri, 14 Mar 2003 13:13:31 -0500
X-OriginalArrivalTime: 14 Mar 2003 11:13:31.0535 (UTC)
FILETIME=[471B39F0:01C2D6B0] X-Loop-Detect: 1

Hey Andrew -

Checking Hotmail from my office in Accra - just got your email from Mongolia. Glad you're enjoying Ulaanbaatar. If I'm counting correctly, receiving and reading your email involved a minimum of 70 computers in 5 nations - makes you realize just how cool the net really is! Take care,

-E

Reply-To: mclaughlin@pobox.com
From: "Andrew McLaughlin" mclaughlin@pobox.com
To: "Ethan Zuckerman" ethan@geekcorps.org
Subject: 70 hops
Date: Fri, 14 Mar 2003 08:59:23 -0500
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
X-Loop-Detect: 1

Ethan -

70 computers - no kidding! That translates into dozens upon dozens of organizations or entities, from the technical architects to the ISPs, located in at least 20 different legal jurisdictions.

No wonder lawyers think the Internet is so scary...

--andrew

Fifty years ago, communication between Ghana and Mongolia would have taken months and transpired via mail. Ten years ago, it would have involved international phone calls costing several US dollars a minute and required the intervention of international operators to connect the two telephones. Today, Ethan and Andrew are able to communicate over immense distances, across dozens of national borders, with near-zero cost, no human assistance and mere seconds of lag-time between the transmission and receipt of the message.

What happened? And how is this possible?

PART 1 - An Introduction to the Internet Protocol

The Internet, and its attendant communication miracles, result from a fundamental principle of network engineering: Keep It Simple, Stupid (KISS). Every computer connected to the Internet is capable of doing a few, very simple tasks very quickly. By linking millions of comparatively simple systems together, complex functionality is achieved. The Internet is an ingenious communications network in large part because it is so simple.

At the heart of any Internet transmission - sending an email, viewing a web page, or downloading an audio or video file - is the Internet Protocol (IP). Invented in 1974 by Vint Cerf and Robert Kahn, IP is a communications scheme that defines how data is sent across networks. IP has two key standardized elements that are involved in every transmission: (1) a common method for breaking each transmission down into small chunks of data known as "packets", and (2) a unified global addressing system. IP gives every computer connected to the Internet a unique address, and a description of the packets of data that can be delivered to these addresses. [Note 1]

The Internet Protocol boils down to two simple rules:

  1. Every computer connected to the Internet must be reachable via a numerical address of a specific form: four eight-bit numbers separated by periods -- e.g., A.B.C.D, where A, B, C, and D are each between 0 and 255 (that's because each eight-bit string has 2^8 = 256 different combinations). This address is called an "Internet Protocol address," or "IP address" for short. For example, the IP address for Google's homepage is 216.239.51.100. As far as most Internet computers are concerned, an IP address is all you really need -- as a test, try typing this URL into your browser: http://216.239.51.100. (A bit later on, we'll talk about the use of names as convenient substitutes for IP addresses.) [Note 2]
  2. Every computer connected to the Internet must be able to accept packets that have a 20 to 60 byte header and a total packet size of up to 576 bytes. The header contains information on the origin and destination address of each packet and the total size of the packet.

And that's it. Build a device capable of following those two rules and you can connect it to the Internet. Which goes a long way towards explaining how people have connected Coke machines and coffee pots to the Net (but provides little assistance in understanding why...).
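To make the first rule concrete, here is a minimal Python sketch, offered purely as an illustration. It shows that a dotted address such as 216.239.51.100 is simply a readable way of writing one 32-bit number, eight bits at a time. The function names `ip_to_int` and `int_to_ip` are invented for this example and are not part of any standard.

```python
def ip_to_int(address: str) -> int:
    """Pack a dotted address (A.B.C.D) into a single 32-bit integer."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("expected four numbers separated by periods")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError(f"{octet} does not fit in eight bits")
        # Shift the bits collected so far and append the next eight.
        value = (value << 8) | octet
    return value


def int_to_ip(value: int) -> str:
    """Unpack a 32-bit integer back into dotted form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


google = ip_to_int("216.239.51.100")
print(google)             # the same address written as one number
print(int_to_ip(google))  # and converted back to dotted form
print(2 ** 32)            # how many distinct addresses four eight-bit numbers allow
```

In everyday Python code you would not write this by hand; the standard library's `ipaddress` module performs the same conversion.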

Because IP is so simple, there are lots of useful features not included in the protocol. Perhaps the most important of these missing features is "guaranteed delivery". Using "pure" IP, a message sent from one computer to another is first broken up into small packets, each labeled with the address of the destination machine; the sending computer then passes those packets along to the next connected Internet machine, which looks at the destination address and passes them along to the next connected Internet machine, which looks at the destination address and passes them along, and so forth, until the packets (we hope) reach the destination machine. IP is thus a "best efforts" communication service, meaning that it does its best to deliver the sender's packets to the intended destination, but it can't make any guarantees.

By itself, IP can't ensure that the packets arrived in the correct order, or even that they arrived at all. That's the job of another protocol: TCP (Transmission Control Protocol). TCP sits "on top" of IP and ensures that all the packets sent from one machine to another are received and assembled in the correct order. Should any of the packets get "dropped" during transmission, the destination machine uses TCP to request that the sending machine resend the lost packets, and to acknowledge them when they arrive.
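The division of labor described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not real TCP or IP: packets are plain dictionaries, the eight-byte payload size is arbitrary, and the helper names (`packetize`, `reassemble`, `missing`) are hypothetical. It shows why a sequence number on each packet is enough for the receiver to restore the original order and to notice what has been dropped.

```python
import random


def packetize(message: bytes, destination: str, payload_size: int = 8):
    """Split a message into numbered packets, each labeled with its destination."""
    return [
        {"dst": destination, "seq": seq, "data": message[start:start + payload_size]}
        for seq, start in enumerate(range(0, len(message), payload_size))
    ]


def reassemble(packets):
    """Put packets back in sequence order and join their payloads."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["data"] for p in ordered)


def missing(packets, total):
    """List the sequence numbers the receiver would ask the sender to resend."""
    seen = {p["seq"] for p in packets}
    return [seq for seq in range(total) if seq not in seen]


message = b"Glad you're enjoying Ulaanbaatar."
packets = packetize(message, "216.239.51.100")

# "Best efforts" delivery: packets may arrive in any order...
arrived = packets[:]
random.Random(0).shuffle(arrived)
print(reassemble(arrived) == message)

# ...or not at all. Here the packet with sequence number 2 is dropped.
incomplete = [p for p in arrived if p["seq"] != 2]
print(missing(incomplete, total=len(packets)))
```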

Terminology note: TCP and IP are used together so often that they are commonly referred to as the "TCP/IP protocol suite" or just "TCP/IP". A software implementation of TCP/IP is usually called a "stack" -- meaning that, for example, your computer's operating system almost certainly includes a TCP/IP stack. In engineer-speak, Internet traffic "passes through the TCP/IP stack" at both the sending and receiving ends of a data transmission -- meaning that the sender's Internet protocol software converts data (email, web pages, audio/video files, whatever) into packets, and the receiver's Internet protocol software recombines it back into its original format at the destination.

Why not just build delivery guarantees into IP, combining TCP and IP? Oddly enough, there are applications where it's less important that you receive all the data than it is that you receive the data as quickly as possible. If you're receiving streamed audio or video, you'd prefer to have a decrease in the quality of your signal than have the stream stop altogether while dropped packets get resent. Early Net architects were smart enough to anticipate this sort of situation and created a TCP alternative called UDP (User Datagram Protocol). While orders of magnitude less common than TCP, it's an important part of core Internet protocols.
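For a feel of how little ceremony UDP involves, the sketch below sends a single datagram between two sockets on the same machine using Python's standard `socket` module. It is an illustration only: the function name `send_one_datagram`, the two-second timeout, and the payload are assumptions made for the example. Note what is absent: no connection is set up, nothing is acknowledged, and if the datagram never arrives the receiver simply gives up.

```python
import socket


def send_one_datagram(payload: bytes, timeout: float = 2.0):
    """Send one UDP datagram over the loopback interface and try to receive it."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0 lets the operating system pick a free port
        receiver.settimeout(timeout)      # UDP promises nothing, so do not wait forever
        sender.sendto(payload, receiver.getsockname())  # no handshake beforehand
        data, _source = receiver.recvfrom(2048)
        return data
    except socket.timeout:
        return None                       # a lost datagram is simply gone
    finally:
        sender.close()
        receiver.close()


print(send_one_datagram(b"audio frame 1"))
```

A streaming application built this way would keep playing whatever frames do arrive rather than pausing to ask for the missing ones.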

A very informative tutorial on IP, TCP, UDP and the basics of IP routing is available in RFC 1180. While it was written in the "pre-web" Internet (1991), IP has not changed substantially since it was first invented, so the document is still a terrific introduction. [Note 3]

 

Still confused?

Think of TCP/IP this way: Sending a communication (an email or web page or video file or whatever) via Internet Protocol packets is like sending a book by postcard. Figuratively speaking, the Internet Protocol allows your computer to take your book, cut out the pages, and glue each page onto a postcard. In order to allow the destination computer to reassemble the pages properly, your computer writes a number on each postcard -- after all, there is no guarantee that the mailman will deliver the postcards in the exact right order.

Here's where it gets interesting. Because there's a danger that some postcards will be lost in the mail, your computer keeps a copy of each one, just in case it needs to resend a missing postcard. How will your computer know if it needs to resend some of the postcards? That's where TCP does its ingenious thing. TCP tells the destination computer to send a periodic confirmation postcard back to your computer, telling it that all postcards up to number X have been received. When your computer gets a confirmation postcard like that, it knows that it is safe to throw out the retained duplicate postcards up to number X.

TCP also instructs your computer that, if no confirmation is received by a certain time, it should start to resend the postcards. The lack of a confirmation may mean that some postcards are missing, or that the confirmation itself got lost along the way. Your computer is not too worried about sending unnecessary duplicates, because it knows that the destination computer is smart enough to recognize and ignore duplicates. In other words, TCP says that it's better to err on the side of oversending.

TCP also helps computers to deal with the fact that there is a limit to how many postcards can be stuffed into a mailbox at one time. It allows the two computers to agree that the sender will only send perhaps 100 postcards and await a postcard confirming receipt of the first 100 before sending the next group.

Thus, TCP gives the sending and receiving computers a way to exchange information about the status of a communication -- which packets have been received, which ones are missing. And it helps the two computers manage the rate of packet traffic, so as not to get overwhelmed.
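The postcard exchange can be acted out in a short Python simulation. This is a deliberately simplified sketch, not an implementation of TCP: time is modeled as rounds, the window is four postcards instead of 100, and the "mailman" loses exactly the postcards listed in the hypothetical `lost` argument. It does capture the three ideas above: cumulative confirmations, resending when confirmation does not come, and a limit on how much is in the mail at once.

```python
def transfer(pages, window=4, lost=frozenset()):
    """Simulate the postcard exchange between a sender and a receiver.

    `lost` holds (postcard number, attempt number) pairs that go missing.
    Returns the pages as reassembled by the receiver, plus the total
    number of postcards the sender put in the mail.
    """
    received = {}   # the receiver's pile, keyed by postcard number
    attempts = {}   # how many times each postcard has been mailed
    base = 0        # lowest postcard number not yet confirmed
    sent = 0
    while base < len(pages):
        # Mail (or re-mail) every unconfirmed postcard the window allows.
        for seq in range(base, min(base + window, len(pages))):
            attempts[seq] = attempts.get(seq, 0) + 1
            sent += 1
            if (seq, attempts[seq]) in lost:
                continue                           # lost in the mail
            received.setdefault(seq, pages[seq])   # duplicates are ignored
        # Confirmation postcard: everything below `ack` has arrived.
        ack = base
        while ack in received:
            ack += 1
        base = ack   # the sender may now discard its copies below `ack`
    return [received[i] for i in range(len(pages))], sent


pages = [f"page {n}" for n in range(10)]
delivered, sent = transfer(pages, lost={(2, 1), (7, 1), (7, 2)})
print(delivered == pages)   # the book arrives complete and in order
print(sent)                 # more postcards mailed than pages in the book
```

Notice that the sender re-mails some postcards the receiver already holds. That is the "err on the side of oversending" behavior: wasteful, but harmless, because the receiver ignores duplicates.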


Okay, so that's how TCP/IP works. Why has the protocol gained such widespread acceptance? And how does it help us get an email from Mongolia to Ghana? Let's dig a bit deeper.

Three features make IP stand out: efficiency, medium independence, and application support.

Efficiency

When we think of communications, we tend to think of the telephone. In telephony, we open a "circuit" between two people. This circuit allows communication in both directions - i.e., I can speak, and I can hear you speak at the same time. With certain exceptions, it's private, and assuming nothing fails, it's got guaranteed availability for an unlimited period of time. All of these things are desirable, especially when you may be calling a loved one halfway across the world.

These desirable features are a big part of the reason circuit-based communications are, from a networking standpoint, incredibly inefficient. To set up a telephone call, you have to commandeer a piece of wire (or, more likely, a piece of a fiber optic cable) connecting you and the other party. No one else gets to use those wires for the time you're tying them up. Even worse, you're not transmitting data the whole time! When you're listening to the other person talk, you're not taking advantage of the circuit's capability to carry data bi-directionally. And during pauses between sentences, words or phonemes, you're not transmitting data at all. How selfish of you!

In comparison to telephony, IP is an extremely efficient protocol. On the same underutilized piece of copper carrying a phone call, hundreds of email exchanges can occur in the same period of time. Because Internet traffic is packetized, there's no need to occupy a circuit for the full duration of an exchange. Instead, you use the circuit just for the milliseconds needed to transmit each packet. And because each packet carries its source and destination addresses in the header, packets from many conversations can be interleaved on the same circuit without interfering with one another.
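A small Python sketch can show how several conversations share one link. It is a toy model under stated assumptions: the host labels "accra" and "ulaanbaatar", the six-character payload size, and the function names are all invented for illustration. Packets from two conversations take turns on the same link, and the addresses in each header are all that is needed to sort them out again at the far end.

```python
from collections import defaultdict
from itertools import zip_longest


def to_packets(src, dst, text, size=6):
    """Break one conversation's text into addressed, numbered packets."""
    return [
        {"src": src, "dst": dst, "seq": i, "data": text[pos:pos + size]}
        for i, pos in enumerate(range(0, len(text), size))
    ]


def share_link(*conversations):
    """Interleave packets from several conversations onto one link, in turns."""
    link = []
    for batch in zip_longest(*conversations):
        link.extend(packet for packet in batch if packet is not None)
    return link


def demultiplex(link):
    """Separate the conversations again using the addresses in each header."""
    flows = defaultdict(list)
    for packet in link:
        flows[(packet["src"], packet["dst"])].append(packet)
    return {
        flow: "".join(p["data"] for p in sorted(pkts, key=lambda p: p["seq"]))
        for flow, pkts in flows.items()
    }


a = to_packets("accra", "ulaanbaatar", "Hey Andrew, greetings from Accra")
b = to_packets("ulaanbaatar", "accra", "70 computers - no kidding!")
link = share_link(a, b)

print([p["src"] for p in link[:4]])   # packets from the two senders alternate
for flow, text in demultiplex(link).items():
    print(flow, text)
```

No circuit is reserved for either conversation; the link is busy only while a packet is actually on it.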

One way to understand just how efficient packetizing data can be is to consider Voice over IP (VoIP), which essentially means telephone conversations over the Internet. By packetizing and compressing voice traffic, VoIP is able to provide up to six voice circuits in the same bandwidth as a traditional telephone line (56 kbps). (Check out this VoIP bandwidth calculator for a clearer sense of the parameters involved with compressing voice traffic.)


Medium Independence

We've been talking about using Internet Protocol over phone lines. And, indeed, most Internet traffic is carried over copper or fiber optic phone lines. But IP is completely medium-independent. The Internet Protocol can be implemented "on top of" any means of communication. Internet links via radio and microwave are becoming increasingly common. Much of the developing world receives Internet connectivity via satellite links. WiFi links have become standard equipment at many US universities and businesses. Less common, but fascinating, is the practice of transmitting data via lasers and "open air optics" - i.e., through the air, rather than through glass fiber. Lawrence Livermore National Laboratory recently announced a system capable of transmitting 2.5 Gbps (the equivalent of 40,000 simultaneous phone calls) over a single laser beam spanning 28 kilometers.

For proof of the fact that IP can run on absolutely ANY communications infrastructure, it's useful to consider RFC 1149, titled "A Standard for the Transmission of IP Datagrams on Avian Carriers" -- in other words, instructions for running an Internet using carrier pigeons. A successful implementation of the Carrier Pigeon Internet Protocol (CPIP) was recently carried out by network administrators in Bergen, Norway. While no one is suggesting that CPIP is likely to be a major factor in the growth of the global Internet, it's helpful in demonstrating that IP is interoperable with pretty much any existing network.

Application Support

The fact that IP is efficient and medium-independent wouldn't matter to us if there weren't so many useful applications built to run on top of it. Every application we think of as an Internet service is built on top of IP: email, FTP, web browsing, peer-to-peer file sharing. By building new applications that rely on IP, developers are able to greatly hasten the development process. If Shawn Fanning had needed to design the networking protocols that made Napster possible, it's unlikely the application would ever have been created. And, without hundreds of millions of potential users already connected to the Internet, it's unlikely that a network-based application like Napster would ever have reached critical mass. The importance of the ease of creating applications that rely on IP, and of the ability to leverage an existing user base, should not be underestimated.


PART 2 - Follow the Header (or "Around the World in 900 Milliseconds")

Armed with our new understanding of TCP/IP, we turn to our story of globetrotting technologists, Ethan (in Ghana) and Andrew (in Mongolia).

In order to understand how Internet communication looks, feels, and actually works in developing countries, we're going to look closely at the path of an email exchange between these two users -- a case study of TCP/IP in action.

On to our email tale then: Click here for Part 2 - Follow the Header. (Required Reading)



PART 3 - Alphabet Soup (or "Who Are All These Mysterious Internet Elves, Anyway?")

The heart of the Internet is not a place or an organization, but a principle: cooperation. The Internet is not a single network, but a vast network of networks that voluntarily choose to interconnect with each other. Internet networks are united by two universally shared features: (1) they transmit data using the Internet protocol, meaning that they take all data (email, web pages, audio/video files, etc.) and convert it into small packets in the format defined by the Internet protocol, and (2) they use the same unified addressing system, known as Internet protocol addresses, to route each packet toward its destination.

The story of the emails between Andrew in Mongolia and Ethan in Ghana shows how a single communication runs through many dozens of machines and ranges across multiple national borders, all in the blink of an eye. Each of the machines and organizations involved in that email exchange operates on the basis of voluntary cooperation, becoming part of a global network by implementing a set of common technical standards defined over the past three decades.

Who sets these standards? Who implements them? Who uses them? Click here for Part 3 - Alphabet Soup. (Required Reading)

So: Armed with your new understanding of who the “Internet elves” are, are you surprised by the number, variety and type of organizations that play a part in making the Internet happen? No kings, no presidents, just lots of (creatively tense) cooperation.


Access and Connectivity in Developing Countries

Now that we've taken a good, hard look at the way the Internet works, and the many organizations that play (or have played) a part in making it work, let's turn to the Internet connectivity problems that plague developing countries.

Listen to Andrew talk about the digital divide and a few startling connectivity statistics.

Please view the extremely informative and interesting 2002 Africa Connectivity Map (hover your mouse over the various countries to compare detailed connectivity statistics).

Listen to Ethan talk about the challenges on the ground in developing countries.

As you listen and read, make a list of some of the obstacles to connectivity for individuals in Africa: physical, technological, financial, political, and so forth.


Reading assignment:

You are now getting more familiar with some of the leading approaches to dealing with access to communications technology in developing countries. We will come back to the issue of “universal access” in our Discussion Groups. Please let the issues we have just considered frame the way you think about the rest of the materials in this module.


PART 4 - Solutions in the Architecture

Next, we will consider, in some detail, one important element of Internet infrastructure: “network interconnection” through the deployment of Internet Exchange Points (IXPs). Improved interconnection among networks, enabled by IXPs, can dramatically change the connectivity landscape in developing countries by lowering costs and improving quality.

Interconnection in Developing Countries (or "The Missing Links")

Nearly all developing countries currently suffer from Internet connectivity that is expensive and slow compared with that of developed countries. To a large extent, this is because virtually all developing-country Internet networks and service providers rely - directly or indirectly - on international satellite links to larger foreign upstream providers. Consequently, nearly all Internet traffic in nearly every developing country must travel across multiple satellite hops to be routed and exchanged via a backbone in another country before it reaches its destination. In other words, nearly all Internet traffic in developing countries - even traffic from one Internet service provider (ISP) to another ISP in the same country - is routed overseas, most often via the United States or Europe. The result is that developing-country Internet connections are significantly slower, less reliable, and more expensive than those in developed countries.
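A rough back-of-the-envelope calculation shows why satellite hops alone make connections slow. The Python sketch below rests on assumptions that are ours, not the text's: that the links use geostationary satellites (about 35,786 km above the equator), that each ground station sits directly beneath its satellite, and that local traffic makes two hops each way (up and out to a foreign backbone, then up and back into the country). It counts only the time the signal spends traveling at the speed of light, so real delays would be higher.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # in a vacuum
GEO_ALTITUDE_KM = 35_786            # assumed: geostationary orbit


def satellite_delay_ms(hops: int = 1) -> float:
    """Minimum one-way propagation delay for `hops` ground-satellite-ground trips."""
    up_and_down_km = 2 * GEO_ALTITUDE_KM
    return hops * up_and_down_km / SPEED_OF_LIGHT_KM_S * 1000


one_hop = satellite_delay_ms()
one_way = satellite_delay_ms(hops=2)   # out to a foreign backbone and back in
round_trip = 2 * one_way               # a request plus its reply

print(f"one satellite hop: {one_hop:.0f} ms")
print(f"one way, two hops: {one_way:.0f} ms")
print(f"round trip:        {round_trip:.0f} ms")
```

Under these assumptions, a request and reply between two ISPs in the same city would spend close to a full second in transit before any routing or queuing delay is counted, whereas a local exchange over terrestrial links avoids the satellite trips entirely.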

One of the most effective mechanisms to enable local exchange of local Internet traffic - thereby producing both cost and service improvement - is the Internet Exchange Point (IXP). An IXP is a shared switching facility that allows ISPs to exchange Internet traffic with each other. Click here to continue reading Part 4 - Solutions in the Architecture (Required Reading).

 


Discussion Questions:


First listen to Andrew’s hypothetical argument about the digital divide. (Or read an edited transcript here.)

Then read the ITU's “Istanbul Declaration” from the World Telecommunication Development Conference 2002.

Question 1: What are your views on the “universal access” debate? Is connecting everyone to the Internet a pressing and immediate priority, especially in developing countries? What are the other legitimate competing concerns that we ought to consider? Are the benefits of achieving universal access worth the costs?

Question 2: Are you persuaded that interconnection is a serious problem in developing countries? Where would you place the development/establishment of IXPs in the context of other measures that are capable of bridging the access divide? Given what you know about how the Internet works, and the players involved, what do you think should be done to improve Internet connectivity in developing countries? What sort of initiatives would you support and who would you expect to undertake them? Why? Do you think your answers are tied to your responses to the difficult queries posed in Discussion Question 1?



Special Event


DIALOGUE WITH KENYAN IXP LEADER

Now that you've had a chance to learn all about IXPs, it's time to talk to a real, live African IXP leader: Brian Longwe. Brian is a super-talented network engineer whose resume includes little things like being CTO of a Kenyan ISP, leading the technical design and implementation of Kenya's IXP, working to assist with the creation of IXPs in places like Nepal and Mozambique, and much, much more. He's affiliated with Packet Clearing House, a very cool non-profit research institute that supports investigation and operations in the areas of Internet traffic exchange and routing economics.

Over the next weeks, Ethan and Andrew are going to conduct an online dialogue with Brian about the real-world challenges and opportunities for IXPs and interconnection in developing countries. Meanwhile, they'd like your questions for Brian! If you've got questions you'd like them to pose, please send them to Nandan Kamath (our Teaching Fellow).

contact: BOLD@cyber.law.harvard.edu