Welcome to Network4Dev!

Network4Dev is a community-driven initiative to explain networking infrastructure concepts and challenges to a wider audience, for example (but not only) developers and sysadmins.

Introduction

Welcome to Network4Dev! Network4Dev is a community-driven, open source networking site. The site was founded by Daniel Dib in 2018 and all work behind the scenes is done by Cristian Sirbu. At the time of this writing (2018-08-19), the two maintainers of the GitHub repository are Cristian and Daniel.

As infrastructure moves towards more elastic, composable and automated setups, infrastructure engineers must have a broader set of skills while still remaining relevant within their speciality. This means that networking people must understand compute, storage, applications, systems and coding at a level where they can interact with the other teams. Conversely, it means that people working mainly with systems and apps must have some understanding of networking. While there are plenty of sites and courses for networking people to learn about automation, there are far fewer resources for people working with systems and apps to learn about networking. The goal of N4D is to fill this gap.

Because they are written for people who don't have a strict networking background, the articles are kept short, concise and to the point. The articles will not be deep dives but will instead focus on fundamentals and the WHY of networking.

By making this site open source and community driven, more people can contribute, keeping the site alive and more frequently updated than a site where only one or a couple of people contribute.

In order to contribute to this site, read the Contributing guide.

Please note that all contributions to this site will be reviewed and edited by the maintainers for quality, accuracy and clarity both in terms of language and technical level in order to maintain our goals.

Audience

The target audience for N4D is mainly people working as developers or system administrators. However, this site will be useful for anyone who wants to learn more about networking, especially junior networking people, because it aims to explain networking concepts in a clear and concise manner, although not in depth.

Contributing

This site was created to be open for contributions from the open-source community. This is achieved by committing files to GitHub in .rst format.

In order to contribute to this site, the following is needed:

  • GitHub account
  • Git client
  • Text editor
  • (optional) Local install of Sphinx

By leveraging GitHub, multiple people can write documents for the site without having to manage many different accounts on a platform like WordPress.

Architectures

N4D chapters are based on architectures. At the start these are:

  • Cloud
  • Datacenter
  • Design
  • Networking
  • Security
  • Services

These are quite self-explanatory but the following adds more context to each architecture:

Cloud - Focus on networking in the public cloud, such as AWS, Azure and GCP.

Datacenter - Focus on on-premises datacenter technologies and design.

Design - Focus on networking design principles such as scalability, fault domains and more.

Networking - Focus on “standard” networking such as LAN and WAN.

Security - Focus on security such as firewalls, IPSec and more.

Services - Focus on networking related services such as DNS and DHCP.

N4D structure

N4D is hosted on Read the Docs and therefore uses Sphinx as the backend, with the following directory structure:

  • docs - All the documents go here
  • _static - All static content, such as images, goes here

In order to keep things sorted, there are also subdirectories based on the different architectures. For the docs directory these are:

  • Cloud (docs/cloud)
  • Datacenter (docs/datacenter)
  • Design (docs/design)
  • Networking (docs/networking)
  • Security (docs/security)
  • Services (docs/services)

The _static directory also has the same layout:

  • Cloud (docs/_static/cloud)
  • Datacenter (docs/_static/datacenter)
  • Design (docs/_static/design)
  • Networking (docs/_static/networking)
  • Security (docs/_static/security)
  • Services (docs/_static/services)

Workflow

This site uses a forking workflow, meaning that each contributor forks the official repository to create their own server-side copy of the project. This lets the maintainers keep write access to the official repository to themselves and instead approve incoming Pull Requests.

Setting up the environment

The first step to contributing is to create a fork of the N4D repository. This is done by clicking the fork icon in the top right corner.

_images/Fork.png

When this is done, clone the forked repo to a local directory.

daniel@demo:~/n4d-fork$ git clone https://github.com/ddib-ccde/n4d.git
Cloning into 'n4d'...
remote: Counting objects: 22, done.
remote: Compressing objects: 100% (16/16), done.
remote: Total 22 (delta 2), reused 18 (delta 1), pack-reused 0
Unpacking objects: 100% (22/22), done.

When using the forking workflow, the upstream repository needs to be added.

git remote add upstream https://github.com/network4dev/n4d.git

The command git remote -v can be used to verify the origin and upstream repo.

daniel@demo:~/n4d-fork/n4d$ git remote -v
origin  https://github.com/ddib-ccde/n4d.git (fetch)
origin  https://github.com/ddib-ccde/n4d.git (push)
upstream        https://github.com/network4dev/n4d.git (fetch)
upstream        https://github.com/network4dev/n4d.git (push)

Contributing text

Before committing any text, it’s recommended to make sure that your local copy of the repo is up to date. This is accomplished by fetching from the upstream master and merging it with your local copy.

If you are already synced, it will look like in the demo below.

daniel@demo:~/n4d-fork/n4d$ git fetch upstream
From https://github.com/network4dev/n4d
* [new branch]      master     -> upstream/master
daniel@demo:~/n4d-fork/n4d$ git checkout master
Already on 'master'
Your branch is up to date with 'origin/master'.
daniel@demo:~/n4d-fork/n4d$ git merge upstream/master
Already up to date.

Before working on a document, create a branch and name it appropriately, such as osi-model-text, with git checkout -b osi-model-text. Using your text editor, edit the file and then commit the changes with git commit -a -m "Wrote about the OSI model". Then push the commit to origin with git push origin osi-model-text. When pushing to GitHub, you will need to supply your credentials.

The demo below shows an update to the scrollbar for N4D:

daniel@demo:~/n4d-fork/n4d$ git checkout -b fix-issue-5-edit-scrollbar
Switched to a new branch 'fix-issue-5-edit-scrollbar'
daniel@demo:~/n4d-fork/n4d$ git commit -a -m "Updated index.rst to set caption to main page"
[fix-issue-5-edit-scrollbar d5ac8bf] Updated index.rst to set caption to main page
1 file changed, 1 insertion(+), 1 deletion(-)
daniel@demo:~/n4d-fork/n4d$ git push origin fix-issue-5-edit-scrollbar
Username for 'https://github.com': ddib-ccde
Password for 'https://ddib-ccde@github.com':
Counting objects: 3, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 324 bytes | 324.00 KiB/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To https://github.com/ddib-ccde/n4d.git
* [new branch]      fix-issue-5-edit-scrollbar -> fix-issue-5-edit-scrollbar

Once the commit has been pushed to your forked repo on GitHub, you can submit a PR to upstream. When you navigate to your forked repo on GitHub, GitHub will see that you have pushed a new branch. To create a PR from the branch, click on compare & pull request.

_images/Branch-pushed.PNG

When this is done, it will look like the picture below.

_images/Open-PR.PNG

If this PR is a fix to an issue, this can be indicated in the text, as it is in this example with “Fixes #5”. Issues can be referenced by putting the hash sign followed by the number of the issue.

When the PR has been opened, it will look like the picture below.

_images/PR-opened.PNG

The PR will then have to be approved by one of the maintainers before it will be merged to upstream.

Brief intro to reStructuredText

reStructuredText (reST) is the default plaintext markup language used by Sphinx. People familiar with Markdown will recognize some of the syntax.

First in every .rst document is the header. This is created by putting equal signs above and below the text, like below:

=================
This is a heading
=================

Inline markup

  • one asterisk: *text* for italics
  • two asterisks: **text** for boldface
  • backquotes: ``text`` for code samples

Below is a sample of what it looks like:

  • italics
  • boldface
  • code sample

Images

Images are referenced with the image directive, like below:

.. image:: _static/my_image.png

Remember to put a blank line before and after this directive.

Code samples

Different kinds of code examples can be highlighted, for example console output, with:

.. code-block:: console

It looks like the output below:

daniel@demo:~/n4d-fork/n4d/docs$ ls
audience.rst  changelog.rst  contributing.rst  design  infrastructure.rst  networking  services
_build        cloud          datacenter        images  introduction.rst    security    _static

It is also possible to highlight programming languages such as Python.
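As a hedged illustration, a Python snippet would be highlighted like this (the code shown is only an example):

.. code-block:: python

   def greet(name):
       # Build a simple greeting string
       return "Hello, " + name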

For a more complete guide, refer to the reStructuredText Primer.

To test your syntax, it's possible to use the Online reStructuredText editor.

If unsure of the syntax, check one of the existing files in the docs directory.

Local Sphinx build

It is also possible to install Sphinx locally to test out your changes; this process is described here.

Note that you should clone your forked repo if you want to be able to push changes upstream.
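As a rough sketch, assuming the docs directory is the Sphinx source directory (it contains _build and _static) and that Python and pip are already available, a local build might look like this:

daniel@demo:~/n4d-fork/n4d/docs$ pip install sphinx
daniel@demo:~/n4d-fork/n4d/docs$ sphinx-build -b html . _build/html

Open _build/html/index.html in a browser to review the result before pushing.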

Issues

If there is an issue with the site or an article, or if you need support, open an issue by clicking New issue.

Infrastructure

This site uses Read the Docs, which is a platform commonly used for technical documentation. Read the Docs uses Sphinx to build the documents. The source files are kept in a GitHub repository, which leverages webhooks to push any updates to Read the Docs.

By leveraging GitHub for the source files, this project can receive contributions from multiple parties, and when a PR has been approved, the update is automatically pushed to Read the Docs.

Changelog

v1.0 - Work in progress

General Networking

This part of the site is about general networking, i.e. networking technologies that are general enough to be used in many parts of the network.

OSI Model

Physical

Ethernet

Discovery Protocols - CDP and LLDP

Switching

IPv4

NAT44

Network Address Translation is a technique used to change, or translate, the source or destination IPv4 address. This is not an exclusive “or” as both addresses can be translated. This document is iterative in that each use case is described and demonstrated sequentially and in small batches.

Definitions

Master these terms before continuing:

  • inside: The network behind the NAT point, often where source addresses are translated from private to public.
  • outside: The network in front of the NAT point. Often packets in this network have publicly routable source IP addresses.
  • state: The information stored in the NAT device about each translation. NAT state is typically stored in a table containing pre and post NAT addressing, and sometimes layer-4 port/protocol information as well.
  • pool: A range of addresses which can be used as outputs from the NAT process. Addresses in a pool are dynamic resources which are allocated on-demand when hosts traverse a NAT point and require different IPs.
  • ALG: Application Level Gateway is a form of payload inspection that looks for embedded IP addresses inside the packet’s payload, which traditional NAT devices cannot see. ALGs are NAT enhancements on a per-application basis (yuck) to translate these embedded addresses. Applications such as active FTP and SIP require ALGs.
  • unsolicited traffic: In the context of NAT, this refers to traffic originating from the outside network and destined for some device on the inside network. It is called “unsolicited” because the inside host did not initiate the communication.

In Cisco-speak, some additional terms are defined. These seem confusing at first, but are a blessing in disguise because they are unambiguous.

  • Inside local: IP on the inside network from the perspective of other hosts on the inside network. This is often a private RFC 1918 IP address and is seen by other hosts on the inside as such.
  • Inside global: IP on the inside network from the perspective of hosts on the outside network. This is often a publicly routable address which represents the inside host to outside hosts.
  • Outside global: IP on the outside network from the perspective of other hosts on the outside network. This is often a publicly routable address and is seen by other hosts on the outside as such.
  • Outside local: IP on the outside network from the perspective of hosts on the inside network. This is usually the same as the outside global, since all hosts, whether inside or outside, view outside hosts as having a single, public IP address (but not always).
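To tie these terms together, Cisco IOS displays the NAT state table with the show ip nat translations command. Below is an illustrative, hand-built sample (formats vary by platform; the addresses mirror examples used later in this document). The first entry is a static 1:1 mapping with no layer-4 state, while the second carries per-connection state including ports:

Pro Inside global       Inside local        Outside local       Outside global
--- 203.0.113.1         192.0.2.1           ---                 ---
tcp 203.0.113.12:57186  192.0.2.1:57186     198.51.100.1:23     198.51.100.1:23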

Notes

Things to keep in mind:

  • All demonstrations assume Cisco Express Forwarding (CEF) has been disabled which allows IP packet debugging (routing lookups) to be shown. This is not supported on newer Cisco routers, so older devices running 12.4T are used in these demonstrations.
  • This document describes NAT, but does not prescribe NAT. NAT designs are almost never the best overall solutions and should be considered short-term, tactical options. Software should avoid using duplicate and/or hardcoded IPs, then relying on NAT later to fix the blunder.
  • Mastery of the NAT order of operations is required for any developer building an application that may traverse a NAT device. This affects which addresses are revealed (and when), impacts DNS, impacts packet payloads (ie, embedded IPs), and more. The basic inside source NAT order of operations is described below.
    1. Packet arrives on inside
    2. Router performs route lookup to destination
    3. Router translates source address
    4. Reply arrives on outside
    5. Router translates the destination address back to the original source
    6. Router performs route lookup to destination

Static One-to-one (1:1) Inside Source NAT

The simplest type of NAT is 1:1 NAT where a single source IP is mapped to another. This technique is useful in the following cases:

  • Inside host should not or cannot share a source IP with other hosts
  • Custom IP protocols are used (not TCP, UDP, or ICMP)
  • Outside-to-inside unsolicited traffic flow is required
  • Address traceability is required, typically for security audits

When traffic traverses NAT, only the source IP changes. In this example, inside hosts 192.0.2.1 and 192.0.2.2 are mapped to 203.0.113.1 and 203.0.113.2, respectively. Cisco IOS syntax is below:

# ip nat inside source static (inside_local) (inside_global)
ip nat inside source static 192.0.2.1 203.0.113.1
ip nat inside source static 192.0.2.2 203.0.113.2
_images/nat44-static11.png

Enabling NAT (debug ip nat) and IP packet (debug ip packet) debug reveals how this technology works. Follow the order of operations:

  1. Initial packet arrives from 192.0.2.1 to 198.51.100.1 (not in debug)
  2. NAT device performs routing lookup towards 198.51.100.1
  3. NAT device translates source from 192.0.2.1 to 203.0.113.1
  4. NAT device forwards packet to 198.51.100.1 in the outside network
  5. Return packet arrives from 198.51.100.1 to 203.0.113.1 (not in debug)
  6. NAT device translates destination from 203.0.113.1 to 192.0.2.1
  7. NAT device performs routing lookup towards 192.0.2.1
  8. NAT device forwards packet to 192.0.2.1 in the inside network

Some information has been hand-edited from the debug for clarity:

IP: tableid=0, s=192.0.2.1, d=198.51.100.1, routed via RIB
NAT: s=192.0.2.1->203.0.113.1, d=198.51.100.1
IP: s=203.0.113.1, d=198.51.100.1, g=203.0.113.253, len 100, forward
NAT*: s=198.51.100.1, d=203.0.113.1->192.0.2.1
IP: tableid=0, s=198.51.100.1, d=192.0.2.1, routed via RIB
IP: s=198.51.100.1, d=192.0.2.1, g=192.0.2.253, len 100, forward

See if you can work through the following example on your own, explaining each step, for the second host 192.0.2.2. Say each step aloud or hand-write it on paper. Mastering the NAT order of operations is essential:

IP: tableid=0, s=192.0.2.2, d=198.51.100.2, routed via RIB
NAT: s=192.0.2.2->203.0.113.2, d=198.51.100.2
IP: s=203.0.113.2, d=198.51.100.2, g=203.0.113.253, len 100, forward
NAT*: s=198.51.100.2, d=203.0.113.2->192.0.2.2
IP: tableid=0, s=198.51.100.2, d=192.0.2.2, routed via RIB
IP: s=198.51.100.2, d=192.0.2.2, g=192.0.2.253, len 100, forward

The main drawbacks of this solution are scalability and manageability. A network with 10,000 hosts would require 10,000 dedicated addresses with no sharing, which limits scale. Managing all these NAT statements, even when made massively simpler with modern network automation, is still burdensome, as someone must maintain the mapping source of truth.

Dynamic One-to-one (1:1) Inside Source NAT

Rather than identify every inside global address and its corresponding mapping to an inside local address, one can use a NAT pool to define a collection of addresses for dynamic allocation when traffic traverses the NAT. It has the following benefits:

  • Arbitrary pool size coupled with arbitrary inside host list
  • Easy management and configuration
  • Dynamically-allocated state

When traffic traverses NAT, only the source IP changes. In this example, inside hosts within 192.0.2.0/25 are dynamically mapped to an available address in the pool 203.0.113.0/27. Note that the two network masks do not need to match, and that the ACL can match any inside local address. Cisco IOS syntax is below:

# ACL defines the inside local addresses (pre NAT source)
ip access-list standard ACL_INSIDE_SOURCES
 permit 192.0.2.0 0.0.0.127

# Pool defines the inside global addresses (post NAT source)
ip nat pool POOL_203 203.0.113.12 203.0.113.31 prefix-length 27

# Binds the inside local list to the inside global pool for translation
ip nat inside source list ACL_INSIDE_SOURCES pool POOL_203
_images/nat44-dynamic11.png

The output from NAT and IP packet debugging shows an identical flow from the previous section. The process has not changed, but the manner in which inside local addresses are mapped to inside global addresses is fully dynamic:

IP: tableid=0, s=192.0.2.1, d=198.51.100.1, routed via RIB
NAT: s=192.0.2.1->203.0.113.12, d=198.51.100.1
IP: s=203.0.113.12, d=198.51.100.1, g=203.0.113.253, len 100, forward
NAT: s=198.51.100.1, d=203.0.113.12->192.0.2.1
IP: tableid=0, s=198.51.100.1, d=192.0.2.1, routed via RIB
IP: s=198.51.100.1, d=192.0.2.1, g=192.0.2.253, len 100, forward

Like the static 1:1 solution, this solution still requires a large number of publicly routable (post NAT) addresses. Though the administrator does not have to manage the addresses individually, simply obtaining public IPv4 addresses is a challenge given the impending exhaustion of IPv4. The solution can optionally allow outside-to-inside traffic, but only after the NAT state has already been created. The solution does not provide traceability between inside local and inside global unless the administrator specifically captures the NAT state table through shell commands, logging, or some other means.

Static One-to-one (1:1) Inside Network NAT

This option is a hybrid of the previous static and dynamic 1:1 NAT techniques. The solution requires one inside local prefix and one inside global prefix, both of the same prefix length, to be mapped at once. The solution can be repeated for multiple inside local/global prefix pairs.

It has the best of both worlds:

  • Inside host should not or cannot share a source IP with other hosts
  • Custom IP protocols are used (not TCP, UDP, or ICMP)
  • Outside-to-inside unsolicited traffic flow is required
  • Address traceability is required, typically for security audits
  • Easy management and configuration
  • Dynamically-allocated state

When traffic traverses NAT, only the source IP changes. In this example, inside hosts within 192.0.2.0/25 are statically mapped to their matching address in the pool 203.0.113.0/25. The network masks must match. Cisco IOS syntax is below:

# ip nat inside source static network (inside_local) (inside_global) (pfx_len)
ip nat inside source static network 192.0.2.0 203.0.113.0 /25
_images/nat44-network11.png

The output from NAT and IP packet debugging shows an identical flow from the previous section:

IP: tableid=0, s=192.0.2.1, d=198.51.100.1, routed via RIB
NAT: s=192.0.2.1->203.0.113.1, d=198.51.100.1
IP: s=203.0.113.1, d=198.51.100.1, g=203.0.113.253, len 100, forward
NAT: s=198.51.100.1, d=203.0.113.1->192.0.2.1
IP: tableid=0, s=198.51.100.1, d=192.0.2.1, routed via RIB
IP: s=198.51.100.1, d=192.0.2.1, g=192.0.2.253, len 100, forward

IP: tableid=0, s=192.0.2.2, d=198.51.100.2, routed via RIB
NAT: s=192.0.2.2->203.0.113.2, d=198.51.100.2
IP: s=203.0.113.2, d=198.51.100.2, g=203.0.113.253, len 100, forward
NAT: s=198.51.100.2, d=203.0.113.2->192.0.2.2
IP: tableid=0, s=198.51.100.2, d=192.0.2.2, routed via RIB
IP: s=198.51.100.2, d=192.0.2.2, g=192.0.2.253, len 100, forward

Static Many-to-one (N:1) Inside Source NAT (Port Forwarding)

This technique is used to provide overloaded outside-to-inside access using TCP or UDP ports. It’s particularly useful for reaching devices typically hidden behind NAT that need to receive unsolicited traffic.

In these examples, telnet (TCP 23) and SSH (TCP 22) access is needed from the outside network towards 192.0.2.1. To reach this inside local address, outside clients will target 203.0.113.1 using ports 9022 for SSH and 9023 for telnet, respectively. Cisco IOS syntax is below:

ip nat inside source static tcp 192.0.2.1 22 203.0.113.1 9022
ip nat inside source static tcp 192.0.2.1 23 203.0.113.1 9023

Generating traffic to test this solution requires more than a simple ping. From the outside, a user telnets to 203.0.113.1 on port 9023 which the NAT device “forwards” to 192.0.2.1 on port 23. That is how this feature earned the name “port forwarding”. The inside local device sees the session coming from 198.51.100.1, the outside local address, which in this example is untranslated:

R3#telnet 203.0.113.1 9023
Trying 203.0.113.1, 9023 ... Open
Username: nick
Password: nick

R1#who
    Line       User       Host(s)   Idle       Location
   0 con 0                idle      00:00:47
* 98 vty 0     nick       idle      00:00:00 198.51.100.1
_images/nat44-staticn1.png

Enabling NAT (debug ip nat) and IP packet (debug ip packet) debug reveals how this technology works. The NAT process is similar but starts from the outside.

  1. Initial packet arrives from 198.51.100.1 to 192.0.2.1 (not in debug)
  2. NAT device translates destination port from 9023 to 23
  3. NAT device translates destination from 203.0.113.1 to 192.0.2.1
  4. NAT device performs routing lookup towards 192.0.2.1
  5. NAT device forwards packet to 192.0.2.1 in the inside network
  6. Return packet arrives from 192.0.2.1 to 198.51.100.1 (not in debug)
  7. NAT device performs routing lookup towards 198.51.100.1
  8. NAT device translates source port from 23 to 9023
  9. NAT device translates source from 192.0.2.1 to 203.0.113.1
  10. NAT device forwards packet to 198.51.100.1 in the outside network

Device output:

NAT: TCP s=37189, d=9023->23
NAT: s=198.51.100.1, d=203.0.113.1->192.0.2.1
IP: tableid=0, s=198.51.100.1, d=192.0.2.1, routed via RIB
IP: s=198.51.100.1, d=192.0.2.1, g=192.0.2.253, len 42, forward
IP: tableid=0, s=192.0.2.1, d=198.51.100.1, routed via RIB
NAT: TCP s=23->9023, d=37189
NAT: s=192.0.2.1->203.0.113.1, d=198.51.100.1
IP: s=203.0.113.1, d=198.51.100.1, g=203.0.113.253, len 42, forward

The next example is almost identical except it uses SSH. Use this opportunity to test your understanding by following the debug output and reciting the NAT order of operations:

R3#ssh -l nick -p 9022 203.0.113.1

Password: nick

R1#who
    Line       User       Host(s)              Idle       Location
   0 con 0                idle                 00:02:23
* 98 vty 0     nick       idle                 00:00:00 198.51.100.1

  Interface    User               Mode         Idle     Peer Address

NAT: TCP s=20534, d=9022->22
NAT: s=198.51.100.1, d=203.0.113.1->192.0.2.1 [21542]
IP: tableid=0, s=198.51.100.1, d=192.0.2.1, routed via RIB
IP: s=198.51.100.1, d=192.0.2.1, g=192.0.2.253, len 40, forward
IP: tableid=0, s=192.0.2.1, d=198.51.100.1, routed via RIB
NAT: TCP s=22->9022, d=20534
NAT: s=192.0.2.1->203.0.113.1, d=198.51.100.1
IP: s=203.0.113.1, d=198.51.100.1, g=203.0.113.253, len 268, forward

Dynamic Many-to-one (N:1) Inside Source NAT

This type of NAT is the most commonly deployed. Almost every consumer Internet connection has a LAN network for wired/wireless access. All hosts on this segment are translated to a single source IP address by using layer-4 source port overloading. The changed source port serves as the demultiplexer to translate return traffic back to the proper source IP address.

This solution is also called NAT overload, Port Address Translation (PAT), or Network Address/Port Translation (NAPT). The solution can consume a NAT pool (range) but can reuse inside global addresses in the pool across many inside local addresses:

# ACL defines the inside local addresses (pre NAT source)
ip access-list standard ACL_INSIDE_SOURCES
 permit 192.0.2.0 0.0.0.127

# Pool defines the inside global addresses (post NAT source)
ip nat pool POOL_203 203.0.113.12 203.0.113.31 prefix-length 27

# Binds the inside local list to the inside global pool for translation
# Addresses allocated from the pool can be re-used
ip nat inside source list ACL_INSIDE_SOURCES pool POOL_203 overload
_images/nat44-dynamicn1.png

In order to see this solution, debug ip nat detailed is used to more explicitly show the inside and outside packets and their layer-4 ports. The first example uses telnet from 192.0.2.1 to 198.51.100.1.

  1. Initial packet arrives from 192.0.2.1 to 198.51.100.1 (not in debug)
  2. NAT device performs routing lookup towards 198.51.100.1
  3. NAT device identifies inside translation using source port 57186.
  4. NAT device translates source from 192.0.2.1 to 203.0.113.12
  5. NAT device forwards packet to 198.51.100.1 in the outside network
  6. Return packet arrives from 198.51.100.1 to 203.0.113.12 (not in debug)
  7. NAT device identifies outside translation using destination port 57186.
  8. NAT device translates destination from 203.0.113.12 to 192.0.2.1
  9. NAT device performs routing lookup towards 192.0.2.1
  10. NAT device forwards packet to 192.0.2.1 in the inside network

Device output below explains how this works. Note that TCP port 23 is irrelevant for this type of NAT because only the source port matters:

IP: tableid=0, s=192.0.2.1, d=198.51.100.1, routed via RIB
NAT: i: tcp (192.0.2.1, 57186) -> (198.51.100.1, 23)
NAT: s=192.0.2.1->203.0.113.12, d=198.51.100.1
IP: s=203.0.113.12, d=198.51.100.1, g=203.0.113.253, len 42, forward
NAT: o: tcp (198.51.100.1, 23) -> (203.0.113.12, 57186)
NAT: s=198.51.100.1, d=203.0.113.12->192.0.2.1
IP: tableid=0, s=198.51.100.1, d=192.0.2.1, routed via RIB
IP: s=198.51.100.1, d=192.0.2.1, g=192.0.2.253, len 42, forward

The power of the solution is illustrated in the output below. A new source, 192.0.2.2 also initiates a telnet session to 198.51.100.1 and is translated to the same inside global address 203.0.113.12 except has a source port of 55943. Try to follow the order of operations:

IP: tableid=0, s=192.0.2.2, d=198.51.100.1, routed via RIB
NAT: i: tcp (192.0.2.2, 55943) -> (198.51.100.1, 23)
NAT: s=192.0.2.2->203.0.113.12, d=198.51.100.1
IP: s=203.0.113.12, d=198.51.100.1, g=203.0.113.253, len 42, forward
NAT: o: tcp (198.51.100.1, 23) -> (203.0.113.12, 55943)
NAT: s=198.51.100.1, d=203.0.113.12->192.0.2.2
IP: tableid=0, s=198.51.100.1, d=192.0.2.2, routed via RIB
IP: s=198.51.100.1, d=192.0.2.2, g=192.0.2.253, len 42, forward

The solution, while very widely used, has many drawbacks:

  • Many hosts use a common IP address; removes end-to-end addressing
  • Applications that require source ports to remain unchanged may not work. The NAT would have to retain source ports, which assumes inside local devices never use the same source port for inside-to-outside flows.
  • Traditionally it only works with TCP and UDP, though most implementations also support ICMP. Protocols like GRE, IPv6-in-IPv4, and L2TP do not work.

Twice NAT

This NAT technique is by far the most complex. It is generally limited to environments where there are overlapping IPs between two hosts that must communicate directly. This can occur with overlapping RFC 1918 addressing when organizations undergo mergers and acquisitions.

In this example, an inside host 192.0.2.1 needs to communicate to an outside host 198.51.100.1. Both of these problems exist:

  1. There is already a host with IP 198.51.100.1 on the inside network
  2. There is already a host with IP 192.0.2.1 on the outside network

To make this work, each host must see the other as some other address. That is, both the inside source and outside source must be concurrently translated, hence the name “twice NAT”. This should not be confused with “double NAT” which is discussed in the CGN section.

Twice NAT is where all four of the NAT address types are different:

  • inside local: 192.0.2.1
  • inside global: 203.0.113.1
  • outside global: 198.51.100.1
  • outside local: 192.0.2.99

Cisco IOS example syntax:

# ip nat inside source static (inside_local) (inside_global)
ip nat inside source static 192.0.2.1 203.0.113.1

# ip nat outside source static (outside_global) (outside_local)
ip nat outside source static 198.51.100.1 192.0.2.99 add-route
_images/nat44-twice.png

The order of operations is similar to previous inside source NAT examples, except that following every source translation is a destination translation.

  1. Initial packet arrives from 192.0.2.1 to 192.0.2.99 (not in debug)
  2. NAT device performs routing lookup towards 192.0.2.99
  3. NAT device translates source from 192.0.2.1 to 203.0.113.1
  4. NAT device translates destination from 192.0.2.99 to 198.51.100.1
  5. NAT device forwards packet to 198.51.100.1 in the outside network
  6. Return packet arrives from 198.51.100.1 to 203.0.113.1 (not in debug)
  7. NAT device translates source from 198.51.100.1 to 192.0.2.99
  8. NAT device translates destination from 203.0.113.1 to 192.0.2.1
  9. NAT device performs routing lookup towards 192.0.2.1
  10. NAT device forwards packet to 192.0.2.1 in the inside network

Device output showing the order of operations is below:

IP: tableid=0, s=192.0.2.1, d=192.0.2.99, routed via RIB
NAT: s=192.0.2.1->203.0.113.1, d=192.0.2.99 [40]
NAT: s=203.0.113.1, d=192.0.2.99->198.51.100.1 [40]
IP: s=203.0.113.1, d=198.51.100.1, g=203.0.113.253, len 100, forward
NAT*: s=198.51.100.1->192.0.2.99, d=203.0.113.1 [40]
NAT*: s=192.0.2.99, d=203.0.113.1->192.0.2.1 [40]
IP: tableid=0, s=192.0.2.99, d=192.0.2.1, routed via RIB
IP: s=192.0.2.99, d=192.0.2.1, g=192.0.2.253, len 100, forward

NAT as a crude load balancer

This use case is uncommon but does, at a basic level, represent how a server load balancer might work. It combines the logic of port forwarding (unsolicited outside-to-inside access) with a rotary NAT pool to create a dynamic solution whereby external clients can access internal hosts in round-robin fashion, typically for load balancing.

Cisco IOS syntax example:

# Define a list of virtual IPs for outside-to-inside access
ip access-list standard ACL_VIRTUAL_IP
 permit 203.0.113.99

# Define the inside local "servers" in the pool
ip nat pool POOL_192 192.0.2.1 192.0.2.4 prefix-length 29 type rotary

# Bind the virtual IP list to the server pool
ip nat inside destination list ACL_VIRTUAL_IP pool POOL_192

Each time the outside host with IP 198.51.100.1 telnets to 203.0.113.99, the virtual IP address used to represent the server pool, the NAT device selects a different inside local address as the destination of the connection. The range of inside IP addresses goes from 192.0.2.1 to 192.0.2.4, which are actual server IP addresses behind the NAT.

_images/nat44-tcp-lb.png

The debug is shown below with RIB/packet forwarding output excluded for clarity as it reveals nothing new. Each block of output is from a separate telnet session, and represents a single keystroke:

# Choose server 192.0.2.1
NAT*: s=198.51.100.1, d=203.0.113.99->192.0.2.1
NAT*: s=192.0.2.1->203.0.113.99, d=198.51.100.1

# Choose server 192.0.2.2
NAT*: s=198.51.100.1, d=203.0.113.99->192.0.2.2
NAT*: s=192.0.2.2->203.0.113.99, d=198.51.100.1

# Choose server 192.0.2.3
NAT*: s=198.51.100.1, d=203.0.113.99->192.0.2.3
NAT*: s=192.0.2.3->203.0.113.99, d=198.51.100.1

# Choose server 192.0.2.4
NAT*: s=198.51.100.1, d=203.0.113.99->192.0.2.4
NAT*: s=192.0.2.4->203.0.113.99, d=198.51.100.1

Carrier Grade NAT (CGN)

This NAT technique does not introduce any new technology as it usually refers to simply chaining PAT solutions (dynamic inside source NAT) in series to create an aggregated, more overloaded design. Consider the home network of a typical consumer that has perhaps 5 IP-enabled devices on the network concurrently. Each device may have 50 connections for a total of 250 flows. A single IP address could theoretically support more than 65,000 flows using PAT given the expanse of the layer-4 port range. Aggregating flows at every higher point in a hierarchical NAT design helps achieve greater NAT efficiency/density. Thus, given ~250 flows per home, a single CGN could perform NAT aggregation services for ~260 homes to result in ~65,000 ports utilized from a single address.

The purpose of CGN has always been to slow the exhaustion of IPv4 addresses, buying time for a proper IPv6 deployment. The vast majority of network engineers acknowledge that CGN is not a permanent solution.

CGN is also known as Large Scale NAT (LSN), LSN444, NAT444, hierarchical NAT, and double NAT. Do not confuse CGN with “twice NAT” which is typically a single NAT point where the source and destination are both translated.

Cisco IOS sample syntax on the home router (first NAT point):

# ACL defines the inside local addresses (pre NAT source)
ip access-list standard ACL_INSIDE_SOURCES
 permit 192.0.2.0 0.0.0.127

# Use the single outside IP (often a DHCP address) to translate inside hosts
ip nat inside source list ACL_INSIDE_SOURCES interface FastEthernet0/1 overload

Cisco IOS sample syntax on the CGN router (second NAT point):

# Contains the inside global addresses (CGN range) from the home routers
ip access-list standard ACL_CONSUMER_ROUTERS
 permit 100.64.0.0 0.63.255.255

# Create NAT pool, typically a small number of public IPs for the CGN domain
ip nat pool POOL_CGN 203.0.113.0 203.0.113.3 prefix-length 30

# Bind the home routers using CGN 100.64.0.0/10 space to the CGN pool
ip nat inside source list ACL_CONSUMER_ROUTERS pool POOL_CGN overload

Given that no new technology is introduced, the debug below can be summarized in 4 main steps. The debugs are broken up to help visualize the NAT actions across multiple devices along the packet’s journey upstream and downstream.

  1. Inside-to-outside NAT at home router
  2. Inside-to-outside NAT at CGN router
  3. Outside-to-inside NAT at CGN router
  4. Outside-to-inside NAT at home router
_images/nat44-cgn.png

Device output:

# Inside-to-outside NAT at home router
IP: tableid=0, s=192.0.2.1, d=198.51.100.4, routed via RIB
NAT: s=192.0.2.1->100.64.23.254, d=198.51.100.4
IP: s=100.64.23.254, d=198.51.100.4, g=100.64.23.253, len 100, forward

# Inside-to-outside NAT at CGN router
IP: tableid=0, s=100.64.23.254, d=198.51.100.4, routed via RIB
NAT: s=100.64.23.254->203.0.113.2, d=198.51.100.4
IP: s=203.0.113.2, d=198.51.100.4, g=203.0.113.254, len 100, forward

# Outside-to-inside NAT at CGN router
NAT*: s=198.51.100.4, d=203.0.113.2->100.64.23.254
IP: tableid=0, s=198.51.100.4, d=100.64.23.254, routed via RIB
IP: s=198.51.100.4, d=100.64.23.254, g=100.64.23.254, len 100, forward

# Outside-to-inside NAT at home router
NAT*: s=198.51.100.4, d=100.64.23.254->192.0.2.1
IP: tableid=0, s=198.51.100.4, d=192.0.2.1, routed via RIB
IP: s=198.51.100.4, d=192.0.2.1, g=192.0.2.253, len 100, forward

Like most “quick fix” solutions, CGN has serious drawbacks:

  • Massive capital investment for specialized hardware to perform high-density NAT with good performance and reliability.
  • Burdensome regulatory requirements on NAT logging. If ~260 households all share a single public IP, tracking criminal activity becomes more difficult. Carriers are often required to retain NAT logs, along with consumer router IP address bindings, to identify where malicious activity originates. This comes with additional capital and operating expense.
  • ALG support is already hard with NAT. With CGN, almost impossible.
  • No possibility of outside-to-inside traffic, even with port forwarding, for customers.
  • No possibility of peer-to-peer applications working.
  • Restricted ports per consumer (250 in the theoretical example above)

In closing, developers should build applications that can use IPv6, totally obviating the complex and costly workarounds needed to get them working across CGN in many cases. Applications that can be dual-stacked from the beginning provide an easy migration path to IPv6.

NAT as a security tool

This is a topic of much debate, and there are two sides to the coin.

NAT provides the following security advantages:

  • No reachability: In one-to-many NAT scenarios where unsolicited outside-to-inside flows are technically impossible, the inside hosts are insulated from direct external attacks.
  • Obfuscation: The original IP address of the inside host is never revealed to the outside world, making targeted attacks more difficult.
  • Automatic firewalling: As a stateful device, only outside-to-inside flows already in the state table are allowed, and the inside-to-outside NAT state is created based on an ACL, much like a firewall.

Many of these security advantages are easily defeated:

  • Relatively simple attacks like phishing and social engineering cause clients to initiate connections to the outside. Such attacks are easy to stage and more effective than outside-in frontal assaults.
  • Most applications don’t use their IP addresses as identification. A web application might have usernames or a digital certificate. The IP address itself is mostly irrelevant and not worth protecting. It’s entirely irrelevant when clients receive IP addresses dynamically (via DHCP).
  • A NAT device drops outside-in flows due to lack of state, but does not provide any protection against spoofed, replayed, or otherwise bogus packets that piggy-back on existing flows. Security appliances do. The only similarity between NATs and firewalls is that they maintain state. They should not be categorized in any other way.

In closing, do not rely on NAT as a real security technology. It’s roughly equivalent to closing your blinds in your home. It adds marginal security value but there are clearly better alternatives. Like blinds on windows, NAT may conceal and slow down some attacks, but should never be confused for a legitimate security component in the design.

Section author: Nick Russo <njrusmc@gmail.com>

IPv6

Routing

Ping and Traceroute

Troubleshooting

Services

This part of the site is about services such as DNS, DHCP and more.

Further Reading

Network Design

This part of the site is about network design, i.e. network design principles around high availability, resiliency, modularity, scalability and more.

Further Reading

Cloud Networking

This part of the site is about cloud networking, e.g. AWS, Azure, GCP or other companies leveraging cloud computing.

AWS

Amazon Web Services (AWS) is, at the time of this writing (2018-09-04), the largest provider of cloud services in the world. AWS offers a breadth of services across compute, storage, databases, serverless, CDN and much more. The purpose of this section is to describe the networking services in AWS and give an overview of AWS networking.

Networking in AWS Overview

Networking in public cloud is quite different from an on-premises network. In order to build a multi-tenant, scalable, fault tolerant and high speed network, AWS has designed the network quite differently than a "traditional" network. The most notable differences are:

  • No support for multicast
  • No support for broadcast

What this means is that routing protocols using multicast are not supported without some form of tunneling. More importantly, because broadcast is not supported, ARP requests are not flooded in the network. Without ARP, how can a host learn the MAC address of another host? This requires some magic, which is performed by the hypervisor. When the hypervisor sees an ARP request, it will proxy the ARP response back to the host. More about the specifics of networking in AWS later.

AWS Fundamentals

Instance

With AWS terminology, a virtual machine (VM) is called an instance.

Availability Zone

An Availability Zone (AZ) is one or several datacenters in the same geographical area sharing infrastructure such as power and internet transports.

Region

A region is a collection of AZs. The number of regions is always growing and currently there are regions in North America, Europe, Asia, and South America.

All regions consist of at least two AZs, some three and some even four. To provide high availability, applications should be deployed in more than one AZ.

Some of AWS’ services are regional and some are global.

VPC

Virtual Private Cloud (VPC) is one of the foundational building blocks of AWS. A VPC is a virtual datacenter deployed in AWS' multi-tenant cloud. By default, all instances deployed in AWS are deployed to a default VPC. The default VPC is designed to help organizations start quickly. For that reason, the instances will by default get a public IP assigned and have a route to the internet. This is not true for customer-created VPCs.

These actions are performed when the default VPC is created:

  • Creates a VPC with a /16 CIDR block (172.31.0.0/16)
  • Creates a /20 subnet in each AZ
  • Creates an internet gateway and attaches it to the VPC
  • Creates a route table with a default route (0.0.0.0/0) pointing to the IGW
  • Creates a default security group (SG) and associates it with instances in the VPC
  • Creates a Network ACL (NACL) and associates it with the subnets in the VPC
  • Associates the default DHCP options set for the AWS account with the default VPC

The default security group has the following logic:

  • Allows all inbound traffic from other instances associated with the same SG
  • Allows all outbound traffic

The default NACL allows all traffic both inbound and outbound.
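For comparison, the sketch below shows how a custom (non-default) VPC could be started with the AWS CLI. This is a minimal, hedged example: the CIDR ranges are arbitrary and the resource IDs (vpc-..., igw-...) are placeholders for the values returned by the earlier commands.

# Create a VPC and a subnet in one AZ
aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-subnet --vpc-id vpc-1234567890abcdef0 --cidr-block 10.0.1.0/24 --availability-zone eu-west-1a

# Create an internet gateway and attach it to the VPC
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --internet-gateway-id igw-1234567890abcdef0 --vpc-id vpc-1234567890abcdef0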

Subnet

When a VPC is created, it must have at least one subnet in order to be able to address the instances. For the default VPC, 172.31.0.0/16 is used. For customer-created VPCs, the user assigns subnets ranging in size from a /16 down to a /28.

Subnets are always specific to an AZ, but there can be several subnets in an AZ.

Note that some addresses in a subnet are reserved for use by Amazon. For example, in the subnet 192.168.0.0/24, the following addresses would be reserved:

  • 192.168.0.0 - Network address
  • 192.168.0.1 - Reserved by AWS for VPC router
  • 192.168.0.2 - Reserved by AWS for DNS server
  • 192.168.0.3 - Reserved by AWS for future use
  • 192.168.0.255 - Broadcast address

So what does a VPC look like? This image from Amazon’s blog shows the VPC construct:

_images/aws-vpc-overview-1.PNG

Route Table

Every subnet needs a route table; by default a subnet is assigned to a default route table. It is also possible to create custom route tables. A route table can be associated with multiple subnets, but every subnet can only belong to one route table.

It is common practice to have private and public subnets and to use different route tables for each. The public subnet would then have a public IP or Elastic IP and be reachable from the internet, while the private subnet wouldn't be, at least not without first passing a NAT gateway or NAT instance.

Note that routes always point to some form of gateway or endpoint, such as an internet gateway (IGW), NAT gateway, virtual gateway (VGW), EC2 instance, VPC peer or VPC endpoint and not towards an IP address directly.

It's important to know that forwarding can't be changed for local traffic, that is, traffic inside of the VPC. It's not possible to override this behavior with a more specific route, so routing in a VPC is not strictly longest prefix matching (LPM).
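As a hedged sketch (the resource IDs are again placeholders), a custom route table with a default route towards an internet gateway could be created and associated with a subnet like this:

# Create the route table and a default route pointing at the IGW
aws ec2 create-route-table --vpc-id vpc-1234567890abcdef0
aws ec2 create-route --route-table-id rtb-1234567890abcdef0 --destination-cidr-block 0.0.0.0/0 --gateway-id igw-1234567890abcdef0

# Associate the route table with a (public) subnet
aws ec2 associate-route-table --route-table-id rtb-1234567890abcdef0 --subnet-id subnet-1234567890abcdef0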

Security Group

A security group is similar to a stateful ACL or a basic form of firewall, albeit distributed. The SG is applied to the instance directly and keeps track of TCP, UDP and ICMP connection status. Rules are whitelist only, consisting of ACL entries, and contain an implicit deny, meaning that anything not permitted is denied. An SG can reference another SG, for example a database SG allowing any instance in the application SG to send traffic inbound on the port that the database service is exposing.

Note that an instance can be associated with several SGs, and that all rules are evaluated before permitting or denying a flow. It's possible to change the SGs applied to an instance after it has been launched.
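As a hedged sketch of the database example above (the group IDs and port 5432 are placeholders), an inbound rule referencing another SG could be added with the AWS CLI:

# Allow instances in the application SG (sg-22222222) to reach the database SG (sg-11111111) on TCP 5432
aws ec2 authorize-security-group-ingress --group-id sg-11111111 --protocol tcp --port 5432 --source-group sg-22222222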

Network ACL

The Network ACL (NACL) is a stateless ACL, applied at the subnet level. Logically, any traffic going to an instance will be evaluated by the NACL first, before being evaluated by the SG. NACLs are ordered and can have both permit and deny statements. It's only possible to reference CIDR ranges and not instances or other ACLs.

For example, to deny all SSH traffic into a subnet, a NACL could block TCP port 22 inbound instead of putting rules in each SG.
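A hedged AWS CLI sketch of that SSH example might look like the following (the NACL ID is a placeholder, and the rule number simply controls evaluation order):

# Deny inbound TCP 22 from anywhere at the subnet boundary
aws ec2 create-network-acl-entry --network-acl-id acl-11111111 --ingress --rule-number 100 --protocol tcp --port-range From=22,To=22 --cidr-block 0.0.0.0/0 --rule-action deny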

Elastic Network Interface

The Elastic Network Interface (ENI) is a virtual network interface card (NIC). An instance can have multiple ENIs, and an ENI can be moved to another instance within the same subnet. It's also possible to have multiple addresses assigned to an ENI. An ENI has a dynamically assigned private IP and optionally a public one as well. While the primary ENI assigned to an instance has dynamically allocated IP addresses, a manually created ENI offers the flexibility to auto-assign or statically assign IP addresses within the given subnet. An instance's primary ENI may also have one or more statically provided secondary addresses, configured through the console or API.

Public IP Address

Public IP addresses are addresses that are reachable from the internet. AWS uses two types of public IP addresses:

  • Dynamically assigned public IP addresses
  • Elastic IP addresses

If the instance does not need a persistent public IP, it can use the dynamically assigned public IP. This IP is, however, released if the instance terminates. For instances needing a persistent IP, use an Elastic IP instead. Note that an Elastic IP comes with a cost when not assigned to an instance.

Elastic IP

The Elastic IP is not dynamically assigned but rather statically assigned to instances that need a persistent IP address. It can be associated with another instance if the first one is terminated.

Having a persistent IP can simplify writing ACLs, maintaining DNS entries and getting the IP address whitelisted in other systems. Note that there is always a private IP assigned to the instance; this is the one that is seen when looking at the OS level. This holds true both for dynamically assigned addresses and for the Elastic IP. The private IP address remains the same unless the EIP is moved to another instance.
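As a hedged sketch (the instance and allocation IDs are placeholders), allocating an Elastic IP and associating it with an instance could be done with the AWS CLI like this:

# Allocate an Elastic IP and attach it to an existing instance
aws ec2 allocate-address --domain vpc
aws ec2 associate-address --instance-id i-1234567890abcdef0 --allocation-id eipalloc-11111111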

How can the instance have a private IP but still be publicly accessible? NAT. AWS performs 1:1 NAT so that the instance can be reached from the outside.

The VPC then, with added details, looks like below, once again the picture from Amazon’s blog.

_images/aws-vpc-routing-1.PNG

External Connectivity to the VPC

There are multiple ways to connect to a VPC but keep in mind that a VPC does not do transitive routing. This picture, from Amazon’s blog, shows the different options of connecting to the VPC.

_images/aws-vpc-external-1.PNG

Internet Gateway

The internet gateway (IGW) provides access to the internet. This device is provided by Amazon, is highly available and is used by pointing a default route at the IGW. The IGW is a regional service.

NAT Gateway

The NAT gateway is a service provided per AZ, used for translating private IP addresses into public ones. This is mainly used for instances that need to reach the internet, for example to download software updates, but that should not be reachable from the outside. The devices having their private IP addresses translated will share a single Elastic IP.

As noted above, because it is a service provided per AZ, if multiple AZs are leveraged, use one NAT gateway per AZ.

Virtual Private Gateway

The virtual private gateway (VGW) is a regional service, highly available, that terminates two AWS services:

  • Virtual Private Networks (VPNs)
  • AWS Direct Connect

The VGW terminates IPSec tunnels and can use either static routing or Border Gateway Protocol (BGP) for dynamic routing. It's also possible to propagate dynamically learned routes into the VPC.

Direct Connect circuits point to the VGW which is the entry point into the VPC.

Virtual Private Network

VPNs are provided through the VGW, as noted above, or by having an instance terminate the VPN, for example using AMIs from networking vendors such as Cisco, Palo Alto and others. A VGW can never set up a tunnel to another VGW. This is because a VGW is always only a responder, never the initiator.

Only 100 routes can be added into the VPC through the use of VPNs.

Direct Connect (DX)

AWS Direct Connect provides the ability to establish a dedicated network connection from sites such as data centers, offices, or colocation environments to AWS. It provides a more consistent network experience than internet-based connections, at bandwidths ranging from 50 Mbps to 10 Gbps on a single connection. Direct Connect (DX) has the following requirements:

  • 802.1Q
  • BGP
  • BFD (Optional)

Virtual Interfaces (VIFs)

A VIF is a configuration consisting primarily of an 802.1Q VLAN and the options for an associated BGP session. It contains all the configuration parameters required for both the AWS end of a connection and your end of the connection. AWS Direct Connect supports two types of VIFs:

  • Public VIFs
  • Private VIFs

1. Public VIFs: Public Virtual interfaces enable your network to reach all of the AWS public IP addresses for the AWS region with which your AWS Direct Connect connection is associated. Public VIFs are typically used to enable direct network access to services that are not reachable via a private IP address within your own VPC. These include Amazon S3, Amazon DynamoDB and Amazon SQS.

2. Private VIFs: Private Virtual Interfaces enable your network to reach resources that have been provisioned within your VPC via their private IP address. A Private VIF is associated with the VGW for your VPC to enable this connectivity. Private VIFs are used to enable direct network access to services that are reachable via an IP address within your own VPC. These include Amazon EC2, Amazon RDS and Amazon Redshift.

802.1Q Virtual Local Area Networks (VLANs)

802.1Q is an Ethernet standard that enables Virtual Local Area Networks (VLANs) on an Ethernet network. It uses the addition of a VLAN tag to the header of an Ethernet frame to define membership of a particular VLAN.

Direct Connect Gateway

A Direct Connect gateway enables you to combine private VIFs with multiple VGWs in the local or remote regions. You can use this feature to establish connectivity from an AWS Direct Connect location in one geographical zone to an AWS region in a different geographical zone.

Bidirectional Forwarding Detection (BFD)

Bidirectional Forwarding Detection (BFD) is a network fault detection protocol that provides fast failure detection times, which facilitates faster re-convergence for dynamic routing protocols. It is a mechanism used to support fast failover of connections in the event of a failure in the forwarding path between two routers. If a failure occurs, BFD notifies the associated routing protocols to recalculate available routes.

Border Gateway Protocol

Border Gateway Protocol (BGP) is a routing protocol used to exchange network routing and reachability information, either within the same AS (iBGP) or between different autonomous systems (eBGP).

BGP Best Path Selection (VGW)

The best path algorithm in AWS follows this logic:

  1. Local routes to the VPC
  2. Longest prefix match first
  3. Static route table entries pointing to:
     • IGW
     • VGW
     • Network interface
     • Instance ID
     • VPC peering
     • NAT gateway
     • VPC endpoint
  4. Routes learned via VPN or Direct Connect, in this order:
     1. Direct Connect BGP routes
     2. VPN static routes
     3. BGP routes from VPN (not Direct Connect)

For BGP routes coming from VPN, the following priority applies:

  1. Shortest AS-path
  2. Lowest origin (IGP over EGP)

Other Interesting Information

Virtual Router

There is a "virtual" router attached to the VPC. In reality, this is a distributed service provided by both hardware and the hypervisor, of course.

DNS in a VPC

DNS is handled by the VPC by default. Every instance is assigned a DNS name mapping to its IP address, automatically managed by AWS. It's possible to use Route 53 to set up a private zone as well, or to use an instance hosting a DNS service.

Be mindful that DNS service is only available to DNS clients within the same VPC. Although the “plus two” address (e.g. 172.31.0.2) for each VPC is reserved for DNS service by default, for simplicity Amazon also reserves the IP address 169.254.169.253 which can be used by any DNS client to reach the DNS service endpoint configured for its local VPC.
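As a small, hedged illustration (run from an instance inside the VPC, assuming the dig utility is installed; the domain is just an example), the VPC resolver can be queried via the well-known address:

daniel@demo:~$ dig @169.254.169.253 example.com +short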

Routing and Switching

  • AWS only supports unicast traffic, no multicast or broadcast. As described previously, ARP responses are proxied by the hypervisor
  • Traffic between subnets in a VPC is allowed, unless restricted by a security group, NACL or filtered at the host
  • Traffic between two instances in the same VPC is always local and can’t be overridden with more specific routes
  • Traffic that is not sourced by or destined to an instance is dropped, unless the source/destination check is disabled
  • Transitive routing is not supported, meaning that VPC A can’t communicate with VPC C by going through VPC B
  • Jumbo frames are supported inside of a VPC but traffic leaving a VPC always uses a Maximum Transmission Unit (MTU) of 1500 bytes

Section author: Daniel Dib

Azure

Microsoft Azure is a relative newcomer in cloud computing compared to AWS. Over the past couple of years Microsoft has done some incredible work with the Azure platform and open source community to become one of the leading cloud providers.

The services they offer include compute, storage, database, serverless, CDN and much more. The purpose of this section is to describe the networking services in Azure and an overview of Azure networking.

Networking in Azure Overview

Networking in public cloud is quite different from an on-premises network. In order to build a multi-tenant, scalable, fault tolerant and high speed network, Azure has designed the network quite differently than a "traditional" network. The most notable differences are:

  • No support for multicast
  • No support for broadcast

What this means is that routing protocols using multicast are not supported without some form of tunneling. More importantly, because broadcast is not supported, ARP requests are not flooded in the network. Without ARP, how can a host learn the MAC address of another host? This requires some magic, which is performed by the hypervisor. When the hypervisor sees an ARP request, it will proxy the ARP response back to the host. More about the specifics of networking in Azure later.

Azure Fundamentals

Geopolitical Region

A Geopolitical region is a collection of Azure regions which can be reached from a standard ExpressRoute.

https://docs.microsoft.com/en-us/azure/expressroute/expressroute-locations has more details around this.

Regions & Availability Zones

A Region in Azure is one or several datacenters in the same geographical area sharing infrastructure such as power and internet transports.

Within a region there are multiple Availability Zones. Each Availability Zone is a separate physical location, for example a separate data centre room, or another building with very low latency to the other availability zones within the region.

https://azurecomcdn.azureedge.net/cvt-8efd50c617ae2a512e0e084105e16e030ebe42305c0e7bc7ece951fb57e33112/images/page/global-infrastructure/regions/understand-regional-architecture.png

https://azure.microsoft.com/en-gb/global-infrastructure/regions/

Resource Group (RG)

A resource group is a logical grouping of infrastructure resources. For example a Production RG may include:

  • Production VNet
  • Storage account(s) related to VMs
  • VM Servers
  • Azure Traffic Manager DNS zones

VNet

A VNet is an IP Prefix allocated for use in Azure.

You can have multiple VNets within a subscription/resource group.

VNets are isolated from each other by default; if you have multiple VNets within a region, or globally, they will be unable to communicate with each other.

VNet Peering, VPNs, or ExpressRoute can allow this communication.

Example of VNet assignments might be:

VNet           Assignment
10.0.0.0/16    Western Europe (Prod)
10.16.0.0/16   North Europe (DR)

Subnet

A subnet is an IP prefix allocation within a VNet. Multiple subnets are generally assigned within a VNet. Depending on the application deployment model, you might want to assign a subnet for Management, Apps, Databases and so on, as in the example layout below.

Note

This is just an example and not a recommended subnet layout.

Subnet         Usage
10.0.0.0/24    GatewaySubnet
10.0.1.0/24    Management
10.0.2.0/24    Apps
10.0.3.0/24    Database
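As a hedged Azure CLI sketch of such a layout (the resource group, names and prefixes are placeholders, and flags may vary slightly between CLI versions):

# Create the VNet with an initial Management subnet
az network vnet create --resource-group Production-RG --name Prod-VNet --address-prefix 10.0.0.0/16 --subnet-name Management --subnet-prefix 10.0.1.0/24

# Add further subnets to the VNet
az network vnet subnet create --resource-group Production-RG --vnet-name Prod-VNet --name Apps --address-prefix 10.0.2.0/24
az network vnet subnet create --resource-group Production-RG --vnet-name Prod-VNet --name Database --address-prefix 10.0.3.0/24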

Network Security Group (NSG)

An NSG is a stateful firewall in Azure. NSGs can be associated either with a specific VM or with an entire subnet. See https://docs.microsoft.com/en-us/azure/virtual-network/security-overview for more information.
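A hedged sketch of creating an NSG and a single inbound rule with the Azure CLI (names, priority and port are placeholders; flags may vary between CLI versions):

# Create the NSG
az network nsg create --resource-group Production-RG --name Apps-NSG

# Deny inbound SSH to anything the NSG is associated with
az network nsg rule create --resource-group Production-RG --nsg-name Apps-NSG --name Deny-SSH --priority 100 --direction Inbound --access Deny --protocol Tcp --destination-port-ranges 22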

Network Virtual Appliance (NVA)

A Network Virtual Appliance (NVA) is the name for using a traditional network vendor VM in Azure.

For example, you can deploy VMs from vendors such as Juniper (vSRX firewalls), Palo Alto (firewalls) or Cisco (CSR routers) and integrate them into your standard networking toolset.

There is nothing special about these VMs; many people deploy Linux VMs to manage routing and firewalling.

User Defined Routes (UDR)

A UDR is a routing table within Azure. It allows static routes to be defined and system routes to be overridden. UDRs are assigned at a subnet level, and you can assign the same UDR to multiple subnets. Generally you wouldn't use a UDR unless you're using a Network Virtual Appliance (NVA).
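As a hedged sketch (names and the NVA's internal IP are placeholders; flags may vary between CLI versions), a UDR that sends all outbound traffic through an NVA could be created and attached like this:

# Create the route table (UDR)
az network route-table create --resource-group Production-RG --name Apps-UDR

# Add a default route via the NVA's internal IP
az network route-table route create --resource-group Production-RG --route-table-name Apps-UDR --name Default-via-NVA --address-prefix 0.0.0.0/0 --next-hop-type VirtualAppliance --next-hop-ip-address 10.0.1.4

# Associate the UDR with the Apps subnet
az network vnet subnet update --resource-group Production-RG --vnet-name Prod-VNet --name Apps --route-table Apps-UDR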

Section author: Daniel Rowe <n4d@danrowe.email>

Datacenter Networking

This part of the site is about datacenter networking, i.e. networking technologies used in on-premises datacenters.

Further Reading

Network Security

This part of the site is about network security such as firewalls, IDS/IPS, VPNs and more.

Further Reading