Optimal audience: PHP programmers who want to accept input from web users without risking duplicate input when users refresh their browsers or click 'Back' arrows.
Gettin' and Postin'
Web browsers send more than just the URL when they transmit requests to web servers; a verb is bundled in as well. Usually, this verb is GET. It means: Hey, web server, show me what you have at http://www.timecube.com/
Whether simple or complex, GET is about retrieving content from a web server. Best practice is for GET requests to be free of side effects. In particular, GET should not be used to update a web site's database, because programs like Googlebot try to GET everything they can; this can lead to nightmare scenarios for databases affected by GET requests.
POST is another verb (or request method), except this one is about sending content to a web server. If you use any online forums, think of GET as what you use to read posts and POST as what you use to post posts.
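In PHP, the verb of the current request shows up in $_SERVER['REQUEST_METHOD']. Here's a minimal sketch of telling the two apart; describeRequest() is a made-up helper for illustration, not something built into PHP.

```php
<?php
// Hypothetical helper (not part of PHP): describe what a verb is for.
function describeRequest(string $method): string
{
    switch (strtoupper($method)) {
        case 'GET':
            return 'GET: retrieve content, no side effects';
        case 'POST':
            return 'POST: send content to the server';
        default:
            return 'some other verb';
    }
}

// REQUEST_METHOD is set by the web server; fall back to GET when it is
// absent (for example, when running this from the command line).
$method = $_SERVER['REQUEST_METHOD'] ?? 'GET';
echo describeRequest($method), "\n";
```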
The Trouble With Posting
If a user requests the same URL two times or ten times using GET, the only downside is some extra bandwidth usage. Requesting the same URL multiple times with POST can mean sending duplicate information to the web server. This is why some websites warn against extra clicking, or refreshing, or navigating with Back and Forward buttons. It can lead to duplicate forum posts, duplicate user registrations, or duplicate purchases. Not good!
Diagram courtesy of Quilokos.
The fix is to start with a POST to send information to the web server, but end up on a GET: a safe, no-surprises GET. So instead of immediately receiving a confirmation page in response to a POST, the web client receives a redirect response which, in turn, causes the web client to issue a GET to see the confirmation page. (Yes, this is a bit convoluted.)
Diagram courtesy of Quilokos.
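The redirect step on its own might look roughly like this. It's a sketch under a few assumptions: prgRedirectTarget() and "/confirmation.php" are hypothetical names, and the 303 ("See Other") status code is one reasonable choice for telling the browser to follow up with a GET.

```php
<?php
// Hypothetical helper: under PRG, a POST is answered with a redirect
// and anything else is allowed to render a page.
// Returns the redirect target for a POST, or null otherwise.
function prgRedirectTarget(string $method, string $confirmationUrl): ?string
{
    return strtoupper($method) === 'POST' ? $confirmationUrl : null;
}

$target = prgRedirectTarget($_SERVER['REQUEST_METHOD'] ?? 'GET', '/confirmation.php');
if ($target !== null) {
    // ... process the submitted data here, exactly once ...
    header('Location: ' . $target, true, 303); // browser follows up with a GET
    exit;
}

// Reached only on a GET, so refreshing this page repeats nothing.
echo "Confirmation page\n";
```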
This general fix is called the "Post Redirect Get" pattern (or PRG pattern). What tripped me up was how to implement the PRG pattern in PHP. I found parts of a solution here and there, but not a (relatively) simple example all in one place.
A (Relatively) Simple Example All In One Place
Create a file named "echochamber.php" and paste in the following contents (minus the line numbers):
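The listing itself is not reproduced on this page. What follows is a rough reconstruction pieced together from the walkthrough below, not the original file. The 303 status code, the relative redirect target, the htmlspecialchars() escaping, and the renderEchoChamber() helper (standing in for the separate HTML section the walkthrough describes) are all assumptions. The line numbers cited in the walkthrough refer to the original listing and will not line up with this sketch.

```php
<?php
// Reconstruction for illustration only; NOT the original listing.
session_start();

// Build the page: the echoed shout (possibly empty) above a one-field form.
function renderEchoChamber(string $echoedShout): string
{
    // Escape user input before placing it in HTML.
    $safe = htmlspecialchars($echoedShout, ENT_QUOTES, 'UTF-8');
    return <<<HTML
<!DOCTYPE html>
<html>
<head><title>Echo Chamber</title></head>
<body>
<p>{$safe}</p>
<form method="post" action="echochamber.php">
<input type="text" name="shout">
</form>
</body>
</html>

HTML;
}

$echoedShout = '';

// POST: park the input in the session, then answer with a redirect.
if (!empty($_POST)) {
    $_SESSION['shout'] = $_POST['shout'] ?? '';
    header('Location: echochamber.php', true, 303); // assumption: 303 See Other
    exit;
}

// GET after the redirect: pick the input back up, then clear the session
// so that a refresh does not replay it.
if (isset($_SESSION['shout'])) {
    $echoedShout = $_SESSION['shout'];
    // Database work or other one-time processing would go here.
    session_unset();
    session_destroy();
}

echo renderEchoChamber($echoedShout);
```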
Oh, You Want An Explanation?
PRG can be done with three separate files: the form to fill out, a file that processes filled-out forms and gives a redirect response, and a final result page that is the target of redirection. But it's often convenient for the fill-in page and the final-result page to be the same or very similar. Why not stuff everything into one file? At any rate, I'm taking the all-in-one approach in this example. It's less intuitive, but not too bad.
Suppose a user navigates to http://www.prg-in-php-example.gov/echochamber.php (or whichever domain you're using). The two 'if' statements on lines 6 and 13 will fail. In fact, the big PHP section does nothing significant besides initializing the $echoedShout variable to an empty string. The HTML section is rendered as a simple text input box:
This hypothetical user is a Poe fan, so she types in "Lenore" and hits Enter. The form on lines 31 and 32 tells the browser to package this input as a POST request that includes a variable called "shout" with the contents "Lenore". This POST request is sent back to the web server's echochamber.php file (which happens to be the same file in this case). Execution starts again from the top.
On this second time around, the $_POST superglobal tested on line 6 has some content, i.e. the "shout" variable and its associated content "Lenore". Ignore line 7 for just a moment. Lines 9 and 10 respond to the POST request with redirect headers. The user's web browser will receive the redirect headers and start a new GET request for echochamber.php.
Problem! How will this GET request differ from the original GET request? After all, it's not redirecting users to "/echochamber.php?shout=Lenore" or anything that obvious.
The secret sauce is the $_SESSION superglobal. It provides a temporary holding place for this user's data. Line 7 puts the contents of "shout" that came in $_POST into "shout" in $_SESSION so that "Lenore" can survive a trip through a fresh GET. The same principle can work for ten, twenty, or more variables.
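To see the hand-off with more than one variable, here's a small sketch that works on plain arrays so it can be run without a web server. stashPost() and the "volume" field are hypothetical; in a real page the second argument would be $_SESSION (after session_start()) and the first would be $_POST.

```php
<?php
// Hypothetical helper: copy a whitelist of fields from the POST data
// into the session array so they survive the redirect.
function stashPost(array $post, array &$session, array $fields): void
{
    foreach ($fields as $field) {
        if (isset($post[$field])) {
            $session[$field] = $post[$field];
        }
    }
}

// Plain arrays stand in for $_POST and $_SESSION here.
$fakePost = ['shout' => 'Lenore', 'volume' => 'loud', 'unexpected' => 'ignored'];
$fakeSession = [];
stashPost($fakePost, $fakeSession, ['shout', 'volume']);
print_r($fakeSession);
```

Naming the fields explicitly keeps stray form inputs out of the session.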
Data Display, Finally
Third time around. Second GET. $_SESSION is loaded up.
$echoedShout is once again initialized to an empty string, but won't stay that way for long. This is a GET, so the 'if' statement on line 6 will fail. Line 13's 'if' will succeed because $_SESSION is holding a value for "shout". That value is copied to $echoedShout and then the HTML renders:
Two Ways to Go Wrong
Is all of this complexity really necessary? For instance, why bother with lines 20 and 21's functions session_unset() and session_destroy()? The difference is what happens when a user refreshes a page showing "Lenore" over the blank field.
With session-killer functions: "Lenore" vanishes, and the user sees the original page with a blank field alone and no hidden state in $_POST or $_SESSION.
Without these functions: "Lenore" remains. Any code between lines 13 and 19 will run again with the same $_SESSION values. This can cause duplicate database entries on account of $_SESSION, even if the PRG pattern is preventing duplicate entries on account of POST.
What happens if we really simplify and leave out the PRG pattern entirely? In other words, what if "echochamber.php" were only:
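The stripped-down listing is not reproduced on this page either. A plausible stand-in, assuming it simply echoes $_POST straight into the page with no session and no redirect (renderNaiveEcho() is a hypothetical helper, and the escaping is my addition):

```php
<?php
// Stand-in sketch, not the original listing: the POST is answered
// directly with a page, so there is no redirect and no session.
function renderNaiveEcho(array $post): string
{
    $echoedShout = htmlspecialchars($post['shout'] ?? '', ENT_QUOTES, 'UTF-8');
    return "<p>{$echoedShout}</p>\n"
         . "<form method=\"post\" action=\"echochamber.php\">\n"
         . "<input type=\"text\" name=\"shout\">\n"
         . "</form>\n";
}

// The page the user ends up looking at is itself the response to a POST.
echo renderNaiveEcho($_POST);
```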
At first, it might seem like everything is hunky-dory, but hit Refresh and you'll see a warning like this: