Friday, August 2, 2013

How it works: The basic tech behind the Internet | Data Explosion - InfoWorld

We all use the Internet, but few of us have an understanding of the fundamentals that propel it 


 
Pretty much everyone in the developed world today has at least passing knowledge of how to use the Internet. From a consumer perspective, using the Internet might be as simple as logging your iPad into a wireless network in a coffee shop, then opening a Web browser. IT pros know there's a lot more to it: access points, wireless controllers, a couple of DNS servers, a good firewall, and the all-important upstream Internet connection supplied by an ISP.
However, even for many people in the business, that online connection is the outer limit of our knowledge of how the Internet lives and breathes. Although it's true that unless you actually work for an ISP you probably don't have to know how the Internet sausage is really made, there are tons of ways in which it can be extremely helpful to have a general understanding.
In an age where more and more of an enterprise's resources depend on external connectivity -- say, to reach a cloud service -- it's crucial to know how traffic actually gets where it's going on the Internet. Most people know that the Internet is really just an interconnected set of smaller networks, but exactly how are they connected? For example, how did the computer you're using right now know how to get to InfoWorld's website? Knowing that requires a passing understanding of DNS and the BGP routing protocol.
At the beginning of Internet access is DNS
When you fired up your computer or mobile device earlier today, it attached itself to the local network you're on right now. If you're using a PC on a typical enterprise network, your machine likely used a protocol called DHCP (Dynamic Host Configuration Protocol) to request its network settings. A DHCP server on the network gave you a few critical pieces of information: an IP address and netmask, a default gateway IP address, and IP addresses for one or more DNS servers.
Because of how IP networks are built, that is the sum total of the configuration that your computer needs to get anywhere on the Internet worldwide -- the rest it can figure out on its own by asking for help from the network. In simple terms, the netmask defines the size of the local IP network you're on, and the address defines your place within it. The default gateway is the address of a device on your local network that can get your traffic to addresses outside the local network. Finally, the DNS (Domain Name System) servers are responsible for translating human-readable DNS names (such as www.infoworld.com) into IP addresses that are actually used when you make connections across the Internet.
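As a minimal sketch of how the address and netmask fit together, the Python below derives the local network from the two values. The address, netmask, and gateway shown are hypothetical private-range examples, not settings from any real network.

```python
import ipaddress

# Hypothetical settings a DHCP server might hand out (illustrative values only).
ADDRESS = "192.168.10.57"
NETMASK = "255.255.255.0"
DEFAULT_GATEWAY = "192.168.10.1"


def describe_interface(address: str, netmask: str) -> dict:
    """Derive the local network and this host's place in it."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    network = iface.network
    return {
        # The netmask fixes the size of the local network...
        "network": str(network),
        "total_addresses": network.num_addresses,
        # ...and the address fixes this host's position inside it.
        "host_offset": int(iface.ip) - int(network.network_address),
    }


info = describe_interface(ADDRESS, NETMASK)
print(info)
```

The default gateway has to sit inside the same network, which is why the hypothetical gateway above shares the 192.168.10 prefix with the host.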
Figuring out how to get you to your destination
For you to have gotten here and read this, the first thing that needed to happen was that your computer needed to translate www.infoworld.com into an IP address. It did that by sending a very simple query to one of the DNS servers it received from the local DHCP server earlier. The DNS server then responded with the IP address of InfoWorld's website (right now, 65.214.57.178).
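From an application's point of view, that query is usually hidden behind a single call to the operating system's resolver. The sketch below uses Python's standard `socket.getaddrinfo`; the helper name `resolve_ipv4` is made up for illustration, and the addresses returned for a real name will be whatever DNS serves at the time.

```python
import socket


def resolve_ipv4(hostname: str) -> list:
    """Ask the operating system's resolver for the IPv4 addresses of a name."""
    results = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        # sockaddr is (address, port) for IPv4; keep each address once.
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


# Example call (needs working DNS, so it is left commented out here):
# print(resolve_ipv4("www.infoworld.com"))
```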
That seems straightforward, but how did your DNS server know that that address was the one to InfoWorld's site? The interesting answer here is that the DNS server probably didn't know until you asked. Instead, it seeks that address once you ask, in a process called DNS resolution.
DNS resolution works by using a hierarchical set of publicly accessible DNS servers that allow your local DNS server to resolve any name into an IP address. That hierarchy works backward through the DNS name you typed in to figure out which other DNS servers on the network to contact for information. It uses the information it gets from servers that are responsible (or "authoritative") for the first levels of that name to find the information for subsequent levels until it finally gets the answer it's looking for. After it has found (or "resolved") an address, it hands it back to your computer.
That whole process starts by sending a query to the Internet's root DNS servers. These servers are responsible for knowing which DNS servers are authoritative for all of the TLDs (top-level domains), such as .com, .net, and .org. Every DNS server in the world that needs to resolve addresses for clients needs to be seeded with the names and IP addresses of these 13 logical root servers. Because of this, their names and IP addresses are kept relatively static. However, to enhance reliability and response times, many of these servers are not single physical servers -- instead, they are large, geographically diverse collections of physical servers that use a routing technique called anycast to appear to be in more than one place at a time.
In the end, your local DNS server had to perform three lookups before it could give your computer the IP address it had asked for: one lookup to the root DNS servers to find the addresses for servers that handle .com, one to the generic TLD servers that handle .com to find the servers responsible for infoworld.com, and one to the servers authoritative for infoworld.com to find the address for www.infoworld.com. If you want to see this work, use dig +trace www.infoworld.com on a Linux computer or on any of a number of Web-based Dig tools out there.
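The three lookups can be sketched as a toy simulation that follows referrals from the root down to the authoritative server. It makes no network calls. The server labels and zone data are hypothetical stand-ins, and only the final address is taken from the article.

```python
# Toy model of iterative DNS resolution (no real queries are sent).
# Server labels are hypothetical; the address is the one quoted in the article.
ZONES = {
    "root": {"referrals": {"com": "gtld-server"}},
    "gtld-server": {"referrals": {"infoworld.com": "infoworld-ns"}},
    "infoworld-ns": {"answers": {"www.infoworld.com": "65.214.57.178"}},
}


def resolve(name: str, zones: dict = ZONES):
    """Follow referrals from the root until a server returns an answer."""
    name = name.rstrip(".")
    labels = name.split(".")
    server = "root"
    queried = []
    while True:
        queried.append(server)
        zone = zones[server]
        if name in zone.get("answers", {}):
            return zone["answers"][name], queried
        # Find the longest suffix of the name this server can refer us for.
        for i in range(len(labels)):
            suffix = ".".join(labels[i:])
            if suffix in zone.get("referrals", {}):
                server = zone["referrals"][suffix]
                break
        else:
            raise LookupError(f"{server} has no data for {name}")


address, servers_asked = resolve("www.infoworld.com")
print(address, servers_asked)
```

A real resolver also caches what it learns, so later lookups for names under .com or infoworld.com can skip the earlier steps.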

Getting information from your Internet destination
Now that your computer knows that www.infoworld.com resolves to 65.214.57.178, it can start sending traffic there. Right away, your computer knows this IP address isn't on the local network, so it can't speak to it directly. Instead, it addresses its traffic to the default gateway's IP address that it learned from the DHCP server. From there, traffic is routed from router to router until it arrives at the server that actually serves up InfoWorld's site. That's where a lot of interesting things happen.
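The local-or-not decision is simple enough to sketch in a few lines. The interface address and gateway below are hypothetical, and `next_hop` is an illustrative helper rather than any operating system's actual API.

```python
import ipaddress


def next_hop(destination: str, local_interface: str, default_gateway: str) -> str:
    """Return the address the host should hand the packet to first."""
    iface = ipaddress.ip_interface(local_interface)
    dest = ipaddress.ip_address(destination)
    if dest in iface.network:
        # Same local network: talk to the destination directly.
        return str(dest)
    # Anything else goes to the default gateway.
    return default_gateway


# Hypothetical host 192.168.10.57/24 with gateway 192.168.10.1.
print(next_hop("65.214.57.178", "192.168.10.57/24", "192.168.10.1"))
print(next_hop("192.168.10.20", "192.168.10.57/24", "192.168.10.1"))
```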
The first stop is the default gateway. Imagine that you're plugged into a large enterprise network and your default gateway is a core router on the network. That core router might be attached to a series of routers and firewalls that make up your company's network. If your company's network is large or complex enough, it probably uses an IGP (Interior Gateway Protocol), such as OSPF or EIGRP, to dynamically share routing information among all the routers on the network. This way, each router on the network only has to know what networks it is directly attached to -- it learns how to get to other networks that aren't adjacent from the routers on the network that are adjacent. As links between them go up or down, the routing tables on each dynamically update to reflect the changes in best paths between them.
However, these routers only know how to get to networks in your company -- an IGP won't teach them anything about how to get to InfoWorld's site. Instead, the corporate network will have its own default gateway that tells the core router to forward its traffic to the devices guarding the connection to your company's ISP.
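One way to picture this is a routing table lookup in which the most specific matching prefix wins and a catch-all 0.0.0.0/0 entry plays the role of the default gateway. The prefixes and next-hop names below are hypothetical, and the sketch ignores how the entries were learned.

```python
import ipaddress

# Hypothetical routing table for a core router: (prefix, next hop).
ROUTES = [
    ("10.1.0.0/16", "branch-router"),            # learned via the IGP
    ("10.1.5.0/24", "datacenter-router"),        # learned via the IGP
    ("0.0.0.0/0", "internet-edge-firewall"),     # default: everything else
]


def lookup(destination: str, routes=ROUTES) -> str:
    """Pick the next hop using longest-prefix match."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, hop in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, hop))
    if not matches:
        raise LookupError(f"no route to {destination}")
    # The longest (most specific) prefix wins.
    return max(matches, key=lambda match: match[0])[1]


print(lookup("10.1.5.9"))
print(lookup("65.214.57.178"))
```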
After the ISP receives the traffic, it routes it through its network (probably using its own IGP) until it reaches a router that is running an EGP (Exterior Gateway Protocol). The EGP that predominantly runs the Internet is called BGP (Border Gateway Protocol). BGP works by allowing a network -- typically ISPs, but also corporations that connect to more than one ISP -- to advertise which public IP address blocks or prefixes they are responsible for to their peers.
Critical to this process is the ASN (Autonomous System Number) that identifies each organization's network. As one organization advertises IP prefixes to another, its routers prepend its own ASN to the AS path attached to each route. This continues to happen from one interconnected organization's network to the next until every BGP router on the Internet has every single active prefix on the Internet in its routing table along with the AS (Autonomous System) paths to get to them.
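The way AS paths build up can be sketched with a small simulation. The topology is hypothetical and uses ASNs from the range reserved for documentation (64496 to 64511). As a simplification, every loop-free path is passed along here, whereas a real BGP router advertises only its best path and applies policy.

```python
from collections import deque

# Hypothetical topology: each key is an ASN, each value lists its BGP peers.
PEERS = {
    64496: [64497],
    64497: [64496, 64498, 64499],
    64498: [64497, 64500],
    64499: [64497, 64500],
    64500: [64498, 64499],
}


def propagate(origin_asn: int, peers: dict) -> dict:
    """Return, for each AS, the AS paths it learns toward the origin's prefix."""
    learned = {asn: [] for asn in peers}
    queue = deque()
    # The origin advertises its prefix with a path containing only itself.
    for peer in peers[origin_asn]:
        queue.append((peer, [origin_asn]))
    while queue:
        receiver, path = queue.popleft()
        if receiver in path:
            continue  # loop prevention: reject paths that already contain us
        learned[receiver].append(path)
        # Re-advertise to peers, prepending our own ASN to the path.
        for peer in peers[receiver]:
            queue.append((peer, [receiver] + path))
    return learned


paths = propagate(64496, PEERS)
for asn, as_paths in paths.items():
    print(asn, as_paths)
```

Each path reads from the nearest AS on the left to the originating AS on the right.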
Given that almost every BGP network is normally attached to multiple peer networks, most prefixes have more than one valid path by which you can reach them. Path optimization is typically accomplished by choosing the path with the shortest list of ASes that must be passed through to get to the target network, although individual organizations can influence the paths their routers use to optimize the use of their peering links with other organizations.  
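A rough sketch of that choice follows: prefer the shortest AS path unless local policy says otherwise. The real BGP decision process has more tie-breaking steps than this. The `local_pref` mapping and its default of 100 are assumptions made for illustration, and the ASNs come from the documentation range.

```python
DEFAULT_LOCAL_PREF = 100  # assumed default; real values are a matter of local policy


def best_path(candidates, local_pref=None):
    """Choose among AS paths: highest local preference, then shortest path.

    candidates: list of AS paths, each a list of ASNs (nearest AS first).
    local_pref: optional dict mapping a neighbor ASN to a preference value.
    """
    local_pref = local_pref or {}

    def sort_key(path):
        preference = local_pref.get(path[0], DEFAULT_LOCAL_PREF)
        # Negate the preference so that higher values sort first.
        return (-preference, len(path))

    return min(candidates, key=sort_key)


candidates = [
    [64497, 64496],
    [64500, 64498, 64497, 64496],
]
print(best_path(candidates))                       # shortest AS path
print(best_path(candidates, {64500: 200}))         # policy overrides length
```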
To get some visibility into how this happens in practice, you either need direct access to a router running BGP that's actively peered with other routers on the Internet (which most people don't have) or you can use a "looking glass" server. These servers are a Web-based way to run routing queries on an ISP's routers. With them, you can see which path the traffic takes to cross the Internet from various points that you might not have access to.
For example, by getting on Bell Canada's looking glass server and performing a BGP query for 65.214.57.178, you can see that traffic leaving Bell Canada's network (AS577) crosses through a peering link with Tata Communications (AS6453), another link with the old UUNET network (AS701, subsequently MCI and now Verizon Business Services), and then finally another link to Verizon Online (AS11486). If you then run a traceroute for the same IP address, you'll see the results of that path being exercised in action, but with each router listed.
For an even better picture of how the Internet is tied together, you can use a tool like Hurricane Electric's BGP Toolkit. Although it won't let you see actual paths in use by individual ISPs (only a looking glass server that directly interrogates the ISP's routers can do that), it does allow you to perform lookups on individual prefixes to see who owns them, to see how well connected a network is, and to see what other prefixes they might advertise. The tool also features cool visualizations of that connectivity.
Although it's certainly true that fully understanding DNS and BGP takes a lot of studying and practical hands-on experience, even a passing understanding can really help fill in the blanks about how the Internet works. For example, that knowledge makes comprehensible this analysis of the network outages following Hurricane Sandy or this chart of the growth of the Internet by the number of actively advertised IPv4 routes. This knowledge can also help tremendously when making choices about whom to purchase Internet connectivity from when trying to improve latency to a cloud service.
