What happens when you type a URL in your browser and press Enter?

Melisa Rojas
Sep 13, 2021

It’s something you probably do every day, several times a day. Type the URL of a website in your browser, press Enter, and the website will load on your screen. All text and images are displayed exactly where they are supposed to be, and there may be some input forms where you can submit your personal information. How does all this happen? A lot happens behind the scenes, and sometimes it can all seem a bit magical.

In this article, I will do my best to explain my understanding of the question, What happens when you type “https://www.holbertonschool.com” in your browser and press Enter?

Let’s review the main concepts:

Request-response model

For this to make sense, it is important to understand the request-response model. This is the method by which computers communicate with each other over the Internet. One computer, called a client, sends a request to a second computer, called a server. Then the client waits for a response from the server. This response will generally be the content of a web page, or an error message if the server can’t process the request. When you are browsing the Internet, your web browser is the client, and the computer hosting the website you are trying to access is the server.

URL

The first thing that happens is that your URL, or Uniform Resource Locator, is parsed. This means that all parts of the text you entered are divided and decoded. A URL is a type of URI, or Uniform Resource Identifier, a generic term for all types of names and addresses that refer to objects on the World Wide Web. The URL in our case is https://www.holbertonschool.com.

The first part of the URL is the protocol. In our case, the protocol is https, which stands for Hypertext Transfer Protocol Secure. It is a more secure extension of the http protocol, which uses the request-response model to communicate between the client (your browser) and the server.

The protocol is followed by a colon and two slashes, then the name of the resource. A resource name is a domain name or an IP address. Our resource name is the domain name www.holbertonschool.com. The domain name is divided into 3 components: the subdomain, the second-level domain, and the top-level domain. Our subdomain is www, our second-level domain is holbertonschool, and our top-level domain is com.
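To make the parsing step concrete, here is a minimal sketch using Python’s standard urllib.parse module. The function name describe_url and the DEFAULT_PORTS table are my own illustrative choices, and splitting the domain on dots assumes a single-label top-level domain like com (it would not handle something like co.uk).

```python
from urllib.parse import urlsplit

# Conventional default ports; an assumption of this sketch, not read from the URL.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL into the parts discussed above (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no hostname in {url!r}")
    # Naive split: assumes at least two labels and a single-label TLD such as "com".
    labels = parts.hostname.split(".")
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
        "subdomain": ".".join(labels[:-2]),
        "second_level_domain": labels[-2],
        "top_level_domain": labels[-1],
    }


info = describe_url("https://www.holbertonschool.com")
for name, value in info.items():
    print(f"{name}: {value}")
```

The port and path do not appear in our URL, so the sketch falls back to the usual defaults for the protocol.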

DNS request

Now that we understand all the parts of the URL, let’s get to the DNS request. DNS stands for Domain Name System, and this is the system involved with translating between domain names and IP addresses. The domain name we’re dealing with is www.holbertonschool.com. But web browsers actually interact with IP addresses. IP stands for Internet Protocol, which is the set of rules that makes it possible for devices to communicate over the Internet.

If you have visited this website before, the address may be cached in your browser. The cache is like your phone’s contact list, containing names (domains) associated with numbers (IP addresses). If it is not stored there, the browser will ask the operating system (OS) if it has www.holbertonschool.com stored in its cache.

If the operating system doesn’t have the address cached either, it will go through the Domain Name System (DNS) to find it. There are four kinds of DNS servers involved in loading a webpage. The first is the DNS recursor, which receives queries from the client and makes additional requests on its behalf to satisfy the query. The second is the root nameserver, which doesn’t know the IP address itself but directs the recursor to the right top-level domain (TLD) nameserver. The third is the TLD nameserver, which holds information for all domain names that share the last part of the hostname (in our case, com) and points the recursor to the authoritative nameserver for the domain. Finally, the authoritative nameserver is the last step in the process. If it has the requested record, the IP address is returned to the DNS recursor, which passes it back to the client.
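The lookup order can be sketched as a toy simulation. Everything here is in memory and hypothetical: the server names, the lookup and resolve_recursively functions, and the address 192.0.2.10 (taken from a range reserved for documentation) are assumptions for illustration, not the real records for this domain.

```python
# Hypothetical, in-memory stand-ins for the three kinds of nameservers.
ROOT_NAMESERVER = {"com": "tld-com"}
TLD_NAMESERVERS = {"tld-com": {"holbertonschool.com": "auth-holberton"}}
AUTHORITATIVE_NAMESERVERS = {
    "auth-holberton": {"www.holbertonschool.com": "192.0.2.10"}  # made-up address
}


def resolve_recursively(hostname: str, trace: list) -> str:
    """Play the DNS recursor: ask root, then TLD, then authoritative."""
    labels = hostname.split(".")
    trace.append("root")
    tld_server = ROOT_NAMESERVER[labels[-1]]
    trace.append("tld")
    auth_server = TLD_NAMESERVERS[tld_server][".".join(labels[-2:])]
    trace.append("authoritative")
    return AUTHORITATIVE_NAMESERVERS[auth_server][hostname]


def lookup(hostname: str, browser_cache: dict, os_cache: dict):
    """Return (ip, trace), checking both caches before going to DNS."""
    trace = []
    for name, cache in (("browser cache", browser_cache), ("os cache", os_cache)):
        trace.append(name)
        if hostname in cache:
            return cache[hostname], trace
    trace.append("recursor")
    ip = resolve_recursively(hostname, trace)
    # Remember the answer so the next lookup is served from the cache.
    os_cache[hostname] = ip
    browser_cache[hostname] = ip
    return ip, trace


browser_cache, os_cache = {}, {}
print(lookup("www.holbertonschool.com", browser_cache, os_cache))
print(lookup("www.holbertonschool.com", browser_cache, os_cache))
```

The second call stops at the browser cache, which is why repeat visits skip the whole chain of nameservers. Real resolvers also respect an expiry time on each cached record, which this sketch leaves out.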

TCP/IP

TCP/IP, the Transmission Control Protocol/Internet Protocol, is used to connect network devices on the Internet. It specifies how data is exchanged, broken into packets, addressed, transmitted, routed, and received. It’s designed to make networks reliable. TCP is involved with creating channels of communication across a network. It also manages how the message is turned into smaller packets to be transmitted and then reassembled at the destination address. IP is involved with how to address and route packets and make sure they reach the correct destination. Because our request is sent via https, it goes to port 443 on the server by default.

An alternative transport layer protocol, User Datagram Protocol (UDP), is faster but less reliable: packet delivery is not acknowledged or retried. UDP is typical for streaming services, where speed takes precedence; TCP is used almost everywhere else.
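Here is a small sketch of the idea of breaking a message into numbered packets and putting it back together. It is a simplification under my own assumptions: real TCP numbers bytes rather than whole packets and also handles acknowledgements and retransmission, none of which is modeled here. The function names are illustrative.

```python
import random


def split_into_packets(message: bytes, size: int) -> list:
    """Cut the message into (sequence_number, payload) pairs of at most `size` bytes."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets: list) -> bytes:
    """Put the payloads back in order using their sequence numbers."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


message = b"GET / HTTP/1.1\r\nHost: www.holbertonschool.com\r\n\r\n"
packets = split_into_packets(message, size=16)

# Packets may take different routes and arrive out of order.
random.shuffle(packets)

print(reassemble(packets) == message)
```

The sequence numbers are what let the receiver restore the original order no matter how the packets arrive.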

Firewall

A firewall monitors network traffic, both incoming and outgoing, and permits or blocks packets. It’s designed to create a barrier between an internal network and the Internet, blocking viruses and hackers. Firewalls have a set of rules and filter traffic coming from suspicious sources. When we make our request to access the Holberton School website, the firewall will decide if it’s safe to transmit the site to us. The firewall is configured to allow requests on port 443, which is the port associated with https.
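A rule-based filter like this can be sketched in a few lines. The Rule class, the decide function, and the rule set below are hypothetical, and the addresses come from ranges reserved for documentation. Real firewalls look at much more than the source address and destination port.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str            # "allow" or "block"
    source: str            # source network in CIDR notation
    port: Optional[int]    # destination port, or None for any port


# Hypothetical rule set: block one "suspicious" range, allow https from anywhere.
RULES = [
    Rule("block", "203.0.113.0/24", None),
    Rule("allow", "0.0.0.0/0", 443),
]


def decide(rules: list, source_ip: str, dest_port: int, default: str = "block") -> str:
    """Return the action of the first matching rule, or the default if none match."""
    for rule in rules:
        from_network = ip_address(source_ip) in ip_network(rule.source)
        port_matches = rule.port is None or rule.port == dest_port
        if from_network and port_matches:
            return rule.action
    return default


print(decide(RULES, "198.51.100.7", 443))
print(decide(RULES, "203.0.113.9", 443))
print(decide(RULES, "198.51.100.7", 23))
```

Rules are checked in order and the first match wins, so the block rule for the suspicious range has to come before the general allow rule.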

HTTPS/SSL

Earlier, we observed that the first part of our URL is the https protocol. This stands for Hypertext Transfer Protocol Secure, a more secure extension of http. Any website beginning with https will be secured by an SSL certificate. SSL, Secure Sockets Layer, is a technology that ensures that any data sent between two systems is encrypted. Modern connections actually use its successor, TLS (Transport Layer Security), although the certificates are still commonly called SSL certificates. Encryption makes it safe to transmit sensitive data, such as credit card numbers and social security numbers. If a hacker were able to intercept your data, they would not be able to read it. When you browse a website that has an SSL certificate, a lock icon will appear next to the URL. You can click on this lock to view the details of the certificate, including the issuing authority and the corporate name of the website owner.

Clicking this lock on the Holberton School website brings up all kinds of information. I can see that the certificate is issued by Amazon and the server is located in California. The certificate is signed with SHA-256, a cryptographic hash algorithm, combined with RSA. RSA is one of the most widely used algorithms for public-key cryptography. All of this means that if someone were to intercept data transmitted between my browser and the Holberton School website, they would not be able to understand it.
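For the curious, here is a rough sketch of the same ideas using Python’s standard ssl and hashlib modules. The helper names sha256_hex and peer_certificate are my own. The last function needs network access, so it is only defined here, not called, and I make no claim about what it would return for any particular site.

```python
import hashlib
import socket
import ssl


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()


# The default client context verifies the server certificate and its hostname.
context = ssl.create_default_context()


def peer_certificate(host: str, port: int = 443) -> dict:
    """Connect over TLS and return the server certificate fields (needs network)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


# A one-character change in the input gives a completely different digest.
print(sha256_hex(b"holberton"))
print(sha256_hex(b"holbertom"))
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling peer_certificate("www.holbertonschool.com") should return a dictionary with fields such as the issuer and the expiry date, which is roughly the information the browser shows behind the lock icon.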

Load-balancer

Websites are located on servers. If a website were located on only one server, this would create a single point of failure. In this situation, if the server hosting the website went down, nobody would be able to access the website. To handle this problem, load-balancers are used. Several copies of the Holberton School website are stored on different servers, and a load-balancer is used to determine which server should be used for each client. A load-balancer can redirect traffic to other servers if one server goes down. It can also start sending requests to any new servers that are added to the group.

There are different load balancing algorithms that can be used to determine how traffic should be distributed. If the round robin algorithm is used, requests will alternate between the available servers. If the least connections algorithm is used, new requests are sent to whichever server has the fewest connections to clients. If the IP hash algorithm is used, the IP address of the client is used to determine which server receives the request.
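The three algorithms can be sketched side by side. The class names and server names (web-01 and so on) are hypothetical, and using SHA-256 to hash the client IP is just one possible choice for this illustration; real load-balancers may hash differently.

```python
import hashlib
from itertools import cycle

SERVERS = ["web-01", "web-02", "web-03"]  # hypothetical server names


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self, client_ip: str) -> str:
        return next(self._rotation)


class LeastConnections:
    """Send each new request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self, client_ip: str) -> str:
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        self.active[server] -= 1


class IPHash:
    """Map each client IP to the same server every time."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self, client_ip: str) -> str:
        digest = hashlib.sha256(client_ip.encode("ascii")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        return self.servers[index]


balancer = RoundRobin(SERVERS)
print([balancer.pick("198.51.100.7") for _ in range(4)])
```

Round robin ignores the client entirely, least connections needs to be told when a connection closes, and IP hash keeps a given client on the same server as long as the server list does not change.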

Web server

The load-balancer will direct your request to a web server. The web server hardware is a computer that stores the different components of a website, such as HTML documents, images, CSS stylesheets, and JavaScript files. When a browser needs a file that’s hosted on a web server, the browser requests the file. When the request reaches the correct web server hardware, the web server software accepts the request, finds the document, and sends it back to the browser. That is, the web server responds to the HTTP requests issued.
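What the web server software does with a request can be sketched as a single function. This is a toy under my own assumptions: the files live in a dictionary called DOCUMENT_ROOT instead of on disk, only GET is supported, and the file contents are made up. Real web server software handles far more of the HTTP specification.

```python
# Hypothetical in-memory "disk": path -> (content type, file content).
DOCUMENT_ROOT = {
    "/index.html": ("text/html", b"<h1>Welcome</h1>"),
    "/style.css": ("text/css", b"h1 { color: red; }"),
}


def build_response(status: int, reason: str, content_type: str, body: bytes) -> bytes:
    """Assemble a minimal HTTP/1.1 response message."""
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def handle_request(raw_request: bytes) -> bytes:
    """Accept a request, find the document, and send it back."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    method, path, _version = request_line.split(" ")
    if method != "GET":
        return build_response(405, "Method Not Allowed", "text/plain", b"Method Not Allowed")
    if path == "/":
        path = "/index.html"  # serve the index file for the root path
    if path not in DOCUMENT_ROOT:
        return build_response(404, "Not Found", "text/plain", b"Not Found")
    content_type, body = DOCUMENT_ROOT[path]
    return build_response(200, "OK", content_type, body)


request = b"GET / HTTP/1.1\r\nHost: www.holbertonschool.com\r\n\r\n"
print(handle_request(request).decode("ascii"))
```

The 404 branch is the "error message" half of the request-response model: the server still answers, just not with the file.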

Application server

If a website is static, then a web server alone would be enough. That website would look exactly the same for all users. But many websites today are dynamic: they can change for different users. Dynamic websites allow for logging in with a username and password, saving information on the site, etc. Application servers make this possible. They sit between the web server and the database server and run the code that generates each page. The Holberton School website, for example, allows you to create an account and apply to the school, saving your information.
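The difference between static and dynamic content can be shown with a tiny sketch. The render_home function and the page text are made up for illustration and have nothing to do with how the Holberton School site is actually built.

```python
from html import escape
from typing import Optional


def render_home(username: Optional[str]) -> str:
    """Build the home page for one visitor; None means nobody is logged in."""
    if username is None:
        greeting = 'Welcome! <a href="/login">Log in</a> to continue your application.'
    else:
        # Escape user-supplied text before placing it into HTML.
        greeting = f"Welcome back, {escape(username)}!"
    return f"<html><body><p>{greeting}</p></body></html>"


print(render_home(None))
print(render_home("Melisa"))
```

A static site would return the same file to both visitors. Here the page is generated on each request, so each visitor can see something different.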

Database

A database is an organized collection of data, usually controlled by a database management system. In a relational database, the data is arranged in rows and columns in a series of tables that make it efficient to process and query. The most popular language for querying such databases is SQL, Structured Query Language. MySQL is an example of a database management system based on SQL. It’s widely used for web applications and runs on all major platforms. User information on the Holberton School website is stored in a database.
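To give a feel for rows, columns, and SQL, here is a short sketch. It uses SQLite through Python’s built-in sqlite3 module as a stand-in for MySQL, simply because it needs no setup. The applicants table, its columns, and the sample data are all hypothetical.

```python
import sqlite3


def open_database() -> sqlite3.Connection:
    """Create an in-memory database with one table of applicants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE applicants ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " email TEXT NOT NULL UNIQUE)"
    )
    return conn


def save_applicant(conn: sqlite3.Connection, name: str, email: str) -> int:
    """Insert one row and return its id; placeholders keep user input out of the SQL."""
    cursor = conn.execute(
        "INSERT INTO applicants (name, email) VALUES (?, ?)", (name, email)
    )
    conn.commit()
    return cursor.lastrowid


def find_applicant(conn: sqlite3.Connection, email: str):
    """Return (name, email) for the matching row, or None if there is none."""
    return conn.execute(
        "SELECT name, email FROM applicants WHERE email = ?", (email,)
    ).fetchone()


conn = open_database()
save_applicant(conn, "Ada Lovelace", "ada@example.com")
print(find_applicant(conn, "ada@example.com"))
print(find_applicant(conn, "nobody@example.com"))
```

In a setup like the one described in this article, the application server would be the one running queries like these on behalf of the user.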

As you can see, there are many steps involved in loading a website. It’s hard to imagine all of this happening in less than a second every time you type a URL and press Enter.

Let’s look at a quick summary of the process just described:

  1. The browser receives the URL https://www.holbertonschool.com and parses it into its protocol (https), hostname (www.holbertonschool.com), port (implicitly, 443), and path (implicitly, the root /).
  2. The browser checks if the hostname has already been resolved in its own or the OS’s cache. If so, the corresponding IP is retrieved right there and then.
  3. Otherwise, the hostname is resolved through the Domain Name System.
  4. The browser opens a TCP/IP connection to the load balancer located at the resolved IP and completes a TLS handshake with it.
  5. Having established an encrypted connection, the browser sends the load balancer a GET request for the file located at the root of www.holbertonschool.com.
  6. The GET request is passed through a firewall on the load balancer.
  7. The load balancer distributes the GET request to the next available host server, as determined by its configured load balancing algorithm.
  8. The GET request is passed through a firewall on the web server.
  9. The web server retrieves the file located at its root directory and returns its content, served dynamically by the application and database servers.
  10. The browser receives the HTTP response message containing the file content and renders the HTML page to the user.