Understanding Web Requests (HTTP and HTTPS Fundamentals)
Problem Statement - Understand the underlying mechanism of web communication
Approach - Explanation of URL components, HTTP vs. HTTPS, client-server model, and DNS resolution flow.
Tools Used - Web browser, conceptual understanding of DNS servers.
Introduction
Understanding web requests is essential for anyone studying cybersecurity, because it explains how web applications work. The starting point is the HyperText Transfer Protocol (HTTP), an application-layer protocol used to access resources on the World Wide Web. Hypertext is text that contains links to other resources and that readers can easily interpret. HTTP follows a client-server model: the client requests a resource, and the server processes the request and returns the resource. The default port for HTTP is 80, and the default port for HTTPS is 443.
To reach a website, the user typically enters a fully qualified domain name (FQDN) as part of a Uniform Resource Locator (URL). A URL has several components: the scheme, which identifies the protocol being used; user information, which carries optional credentials for authenticating to the host; the host, which identifies the server that holds the resource; the port, which identifies the network port on which the server listens; the path, which points to the location of the requested resource; the query string, which consists of parameters and their values; and the fragment, which the browser processes on the client side to locate a section within the resource (See Figure 1). Not all of these components are required to access web resources. However, the scheme and host are essential.
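To make the URL components concrete, here is a minimal sketch that splits a URL with Python's standard urllib.parse module. The URL itself is a hypothetical example chosen for illustration (it is not taken from Figure 1) and is never contacted.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL used only for illustration; it is never contacted.
url = "http://admin:password@example.com:80/dashboard.php?login=true#status"

parts = urlsplit(url)

components = {
    "scheme": parts.scheme,          # protocol: http or https
    "username": parts.username,      # optional user information
    "password": parts.password,
    "host": parts.hostname,          # server that holds the resource
    "port": parts.port,              # None when the URL omits the port
    "path": parts.path,              # location of the resource on the host
    "query": parse_qs(parts.query),  # parameters and their values
    "fragment": parts.fragment,      # handled by the browser on the client side
}

for name, value in components.items():
    print(f"{name:9} {value}")
```

Removing the credentials, port, query string, or fragment from the example URL still leaves something a browser can open, which matches the point that only the scheme and host are essential.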
Activity and Lessons Learned
I explored a typical HTTP flow and learned that it
involves two main processes. The first process begins when the user enters
the domain name in the browser and the name must be resolved to an IP
address. The browser first checks its local cache to see whether it has
looked up the address recently. If not, it sends the query to its configured
recursive DNS server, which checks its own cache. If the address is cached
there, the recursive DNS server returns the IP address to the client and the
lookup ends. If not, the recursive DNS server queries one of the Internet's
root DNS servers, which refers it to the correct Top-Level Domain (TLD)
server. The TLD server in turn refers it to the domain's authoritative
server, which returns the IP address. The recursive DNS server saves the
address in its cache and relays it to the browser, which caches it as well.
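The lookup order can be sketched as a toy model. This is not a real DNS client: the server names, the record tables, and the resolve function are hypothetical, and 192.0.2.10 is an address reserved for documentation. The sketch assumes the recursive resolver is the one that contacts the root, TLD, and authoritative servers in turn.

```python
# Toy model of DNS lookup order. All records below are made up for
# illustration; 192.0.2.10 is a documentation-only address.

ROOT = {"com": "tld-com"}                         # root: which TLD server handles .com
TLD = {"tld-com": {"example.com": "ns-example"}}  # TLD: which server is authoritative
AUTHORITATIVE = {"ns-example": {"www.example.com": "192.0.2.10"}}


def resolve(name, browser_cache, resolver_cache):
    """Return (ip, source), where source names the step that answered."""
    # Step 1: the browser checks its own cache.
    if name in browser_cache:
        return browser_cache[name], "browser cache"

    # Step 2: the recursive resolver checks its cache.
    if name in resolver_cache:
        ip, source = resolver_cache[name], "resolver cache"
    else:
        # Step 3: on a miss, the resolver walks root -> TLD -> authoritative.
        labels = name.split(".")
        tld_server = ROOT[labels[-1]]
        auth_server = TLD[tld_server][".".join(labels[-2:])]
        ip, source = AUTHORITATIVE[auth_server][name], "authoritative server"
        resolver_cache[name] = ip  # the resolver remembers the answer

    browser_cache[name] = ip  # the browser remembers it too
    return ip, source


browser_cache, resolver_cache = {}, {}
first = resolve("www.example.com", browser_cache, resolver_cache)
second = resolve("www.example.com", browser_cache, resolver_cache)
print(first)
print(second)
```

The first call has to go all the way to the authoritative server, while the second is answered from the browser cache, which is why repeat visits resolve faster.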
The second process happens once the browser has
received the IP address for the domain name. The client sends an
HTTP GET request to port 80 asking for the root path (/). The server receives
the request, processes it, and returns the requested resource with a 200 OK
status code. The web browser then renders the response and displays it to the
user (See Figure 2).
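To show what this exchange looks like in raw form, the sketch below builds a minimal GET request for the root path and parses the status line of a response. No network traffic is generated: the helper names are hypothetical, and the sample response is a hand-written illustration rather than output captured from a real server.

```python
def build_get_request(host, path="/"):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # method, path, protocol version
        f"Host: {host}",         # required header in HTTP/1.1
        "Connection: close",
        "",                      # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(raw_response):
    """Return (version, status_code, reason) from a raw HTTP response."""
    status_line = raw_response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("example.com")

# Illustrative response only; a real server would send more headers.
sample_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 14\r\n"
    b"\r\n"
    b"<h1>Hello</h1>"
)

print(request.decode("ascii"))
print(parse_status_line(sample_response))
```

The browser performs the same two steps behind the scenes, then renders the HTML body instead of printing it.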
Client URL (cURL)
Client URL (cURL) is a command-line tool for sending various types of web requests. Unlike a browser, cURL does not render the response; it prints it in raw format. cURL accepts various flags for different purposes, as seen in Figure 3.
Note: In HTTP, data is transferred in clear text, which leaves it open to
Man-in-the-Middle (MITM) attacks.
HyperText Transfer Protocol Secure (HTTPS)
HTTPS is a more secure protocol that counters the risks of transferring data in clear text over HTTP. All communication is transmitted in encrypted form, which makes it difficult for a third party to extract the data. With HTTP, the data travels in plain text and anyone who can capture the traffic can read it. With HTTPS, the data is encrypted and transferred as Application Data in a single encrypted stream, making it difficult for a malicious actor to capture information such as credentials (See Figure 5). HTTPS websites are identified by the https:// scheme.
Note: While HTTPS encrypts the data being transferred, a lookup sent to a
clear-text DNS server can still reveal the domain being visited. Thus, it is
safer to use encrypted DNS (DNS over HTTPS or DNS over TLS), which public
resolvers such as 8.8.8.8 or 1.1.1.1 support, or to use a VPN service so that
the traffic is encrypted.
HTTPS Flow
When a user types http:// to visit a site that enforces HTTPS, the browser first resolves the domain and then sends the request to the web server hosting the target website on port 80, which is unencrypted HTTP. The server detects this and redirects the client to the secure port 443 with a 301 Moved Permanently response code. The client then sends a client hello message to introduce itself, and the server responds with a server hello message. Next comes the key exchange, during which the server sends its SSL certificate and the client verifies it; the client sends a certificate of its own only if the server requests one. The two sides then complete an encrypted handshake to confirm that encryption and transfer are working correctly. After the handshake completes successfully, normal HTTPS communication continues. See Figure 6 for details.
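The client side of this handshake can be sketched with Python's standard ssl module. The tls_details helper is a hypothetical name introduced here, and it needs network access and a host supplied by the caller, so it is defined but not called. The two checks at the end only inspect the default context, which is what makes the client verify the server certificate.

```python
import socket
import ssl

# The default context requires a valid server certificate and checks
# that the certificate matches the host name being visited.
context = ssl.create_default_context()


def tls_details(host, port=443, timeout=5.0):
    """Complete a TLS handshake with host and report what was negotiated.

    Requires network access; the caller supplies the host name.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # wrap_socket performs the handshake: hello messages,
        # certificate verification, and key exchange.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return {
                "protocol": tls.version(),  # negotiated TLS version
                "cipher": tls.cipher()[0],  # negotiated cipher suite name
                "subject": tls.getpeercert().get("subject"),
            }


print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

If the certificate cannot be verified, wrap_socket raises ssl.SSLCertVerificationError instead of returning, so no application data is sent over an unverified connection.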
This activity is a comprehensive exploration of web requests, fundamental for cybersecurity. It elucidates the client-server communication model of HTTP and HTTPS, detailing URL components and the crucial DNS resolution process. The security implications of clear-text HTTP versus encrypted HTTPS are highlighted, along with the functionality of cURL for command-line interactions.