This chapter discusses sessions and the inherent risks associated with stateful web applications. You will first learn the fundamentals of state, cookies, and sessions; then I will discuss several concerns—cookie theft, exposed session data, session fixation, and session hijacking—along with practices that you can employ to help prevent them.
The rumors are true: HTTP is a stateless protocol. This description recognizes the lack of association between any two HTTP requests. Because the protocol does not provide any method that the client can use to identify itself, the server cannot distinguish between clients.
While the stateless nature of HTTP has some important benefits—after all, maintaining state requires some overhead—it presents a unique challenge to developers who need to create stateful web applications. With no way to identify the client, it is impossible to determine whether the user is already logged in, has items in a shopping cart, or needs to register.
An elegant solution to this problem, originally conceived by Netscape, is a state management mechanism called cookies. Cookies are an extension of the HTTP protocol. More precisely, they consist of two HTTP headers: the Set-Cookie response header and the Cookie request header.
When a client sends a request for a particular URL, the server can opt to include a Set-Cookie header in the response. This is a request for the client to include a corresponding Cookie header in its future requests. Figure 4-1 illustrates this basic exchange.
If you use this concept to allow a unique identifier to be included in each request (in a Cookie header), you can begin to uniquely identify clients and associate their requests together. This is all that is required for state, and this is the primary use of the mechanism.
Figure 4-1. A complete cookie exchange that involves two HTTP transactions
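To make the two headers concrete, here is a minimal sketch of the exchange as plain string handling. The helper names build_set_cookie_header() and build_cookie_header(), the cookie name id, and its value are all hypothetical, chosen only for illustration. The sketch ignores cookie attributes such as the path and expiration date.

```php
<?php
// Hypothetical helper: the header a server might add to its response.
function build_set_cookie_header(string $name, string $value): string
{
    return 'Set-Cookie: ' . $name . '=' . rawurlencode($value);
}

// Hypothetical helper: the header a client would send in later requests
// after receiving the given Set-Cookie header.
function build_cookie_header(string $setCookieHeader): string
{
    // Drop the header name, then discard any attributes after the first ';'.
    $pair = trim(substr($setCookieHeader, strlen('Set-Cookie:')));
    $pair = explode(';', $pair, 2)[0];

    return 'Cookie: ' . trim($pair);
}

// First transaction: the server's response carries the identifier.
$response = build_set_cookie_header('id', 'a1b2c3');

// Second transaction: the client's request returns it.
$request = build_cookie_header($response);
```

In a real PHP script you would not build these headers by hand: setcookie() produces the Set-Cookie header, and the values a client returns are available in $_COOKIE.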
The best reference for cookies is still the specification provided by Netscape at http://wp.netscape.com/newsref/std/cookie_spec.html. Of the available specifications, it most closely matches what browsers actually support.
The concept of session management builds upon the ability to maintain state by maintaining data associated with each unique client. This data is kept in a session data store, and it is updated on each request. Because the unique identifier specifies a particular record in the session data store, it’s most often called the session identifier.
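The relationship between the session identifier and the session data store can be sketched as a simplified model. The SessionStore class below is hypothetical and keeps its records in memory. It is not how PHP stores sessions, and issuing a fresh identifier when an unknown one is supplied is a choice made for this sketch only.

```php
<?php
// Hypothetical in-memory session data store, keyed by session identifier.
final class SessionStore
{
    /** @var array<string, array> One record per session identifier. */
    private array $records = [];

    // Returns [identifier, data] for the request. A missing or unknown
    // identifier gets a new random identifier and an empty record.
    public function open(?string $id): array
    {
        if ($id === null || !isset($this->records[$id])) {
            $id = bin2hex(random_bytes(16));
            $this->records[$id] = [];
        }

        return [$id, $this->records[$id]];
    }

    // Writes the session data back at the end of the request.
    public function save(string $id, array $data): void
    {
        $this->records[$id] = $data;
    }
}

$store = new SessionStore();

// First request: the client has no identifier yet.
[$id, $data] = $store->open(null);
$data['cart'] = ['book'];
$store->save($id, $data);

// Later request: the client presents the identifier it was given.
[$sameId, $restored] = $store->open($id);
```

After the second call, $sameId equals $id and $restored contains the cart saved during the first request, which is the sense in which the identifier specifies a particular record.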
If you use PHP’s native session mechanism, all of this complexity is handled for you. When you call session_start(), PHP first determines whether a session identifier is included in the current request. If one is, the session data for that particular session is read and provided to you in the $_SESSION superglobal array. If one is not, PHP generates a session identifier and creates a new record in the session data store. It also handles propagating the session identifier and updating the session data store on each request. Figure 4-2 illustrates this process.
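As a minimal sketch of what this looks like in a script, the following counts the requests made within a session. The visits key and the $message variable are hypothetical names. The ini_set() line is an assumption added only so the sketch also runs from the command line, because a web server normally has a writable session.save_path configured already.

```php
<?php
// Assumption for running outside a web server: store session files in the
// system temporary directory.
ini_set('session.save_path', sys_get_temp_dir());

// Reads the session named by the request's identifier, or starts a new
// session if the request did not include one. Call this before any output.
session_start();

// $_SESSION now holds this client's data. Changes are written back to the
// session data store when the script ends.
if (!isset($_SESSION['visits'])) {
    $_SESSION['visits'] = 0;
}
$_SESSION['visits']++;

$message = sprintf(
    'Session %s has made %d request(s).',
    session_id(),
    $_SESSION['visits']
);
```

Note that the script never touches the Set-Cookie or Cookie headers itself. With cookies enabled for sessions, PHP propagates the identifier on your behalf.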
While this convenience is helpful, it is important to realize that it is not a complete solution. There is no inherent security in PHP’s session mechanism, aside from the fact that the session identifier it generates is sufficiently random, which makes prediction impractical. You must provide your own safeguards to protect against all other session attacks. I will show you a few problems and solutions in this chapter.