PROBLEM
In the midst of continued development on my jHistory plugin for jQuery [ blog post - google code page - plugin page ], I ran into a very persistent bug in Internet Explorer: once you reload the page, the browser's history is lost. This only affects dynamically generated pages; static pages (flat files such as .html or .htm) keep their history.
The simple breakdown is that Internet Explorer retains forward/back support in the IFRAME history only if the first level page responds to the reload with a 304 rather than a 200, regardless of any Cache-Control declarations. This means your dynamic script must support server-side HTTP validation.
Here's the simple test case to reproduce the issue:
1. Build a dynamically generated page, i.e. .php, .aspx, .cfm, .py, .pl, etc.
2. Add an IFRAME to the page to store history entries in the browser's built-in history stack.
3. Add a new entry to the browser's history by doing one of the following:
   - Changing the SRC attribute of the IFRAME
   - Changing the location.href property on the IFRAME's window object
   - Changing the location.hash property on the IFRAME's window object
4. Simply reload the page and watch the history disappear.
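As a rough illustration of the three ways of adding an entry listed above, here is a minimal sketch in plain JavaScript. The helper name addHistoryEntry, the technique labels and the query-string suffix on cache.html are assumptions made for this example; they are not part of jHistory.

```javascript
// Hypothetical helper: pushes one history entry through an IFRAME using one
// of the three techniques from the list above. `iframe` is the IFRAME element.
function addHistoryEntry(iframe, technique, token) {
  var win = iframe.contentWindow;
  switch (technique) {
    case 'src':
      // Change the SRC attribute of the IFRAME.
      iframe.src = 'cache.html?' + token;
      break;
    case 'href':
      // Change location.href on the IFRAME's window object.
      win.location.href = 'cache.html?' + token;
      break;
    case 'hash':
      // Change location.hash on the IFRAME's window object.
      win.location.hash = '#' + token;
      break;
    default:
      throw new Error('unknown technique: ' + technique);
  }
  return token;
}
```

In a page this might be called as addHistoryEntry(document.getElementById('hframe'), 'hash', String(new Date().getTime())), assuming an IFRAME with the id hframe exists.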
Here's the simple test case built to use jQuery to add entries into the browser history by hijacking the IFRAME.window.location.hash object:
Load each of these test case demos in a fresh Internet Explorer window/tab and run through the simple test scenario:
1. Click on the "add history entry" button at least once to write a history entry, and note that the Back button becomes available.
2. Immediately reload the browser page.
3. Note that neither the Forward nor the Back button is available.
The exact same test case code was used for both demonstration files. The only difference is their extension and how the web server serves and/or pre-processes them, which affects the response headers. The problem really boils down to the difference in response headers between these two files.
SOLUTION
Internet Explorer requires a 304 Not Modified response on the first level page for the browser history to be retained upon reload.
The solution is to mimic the default web server behavior for flat HTML files: respond with a 304 status code to force the browser to use its cached version. The goal is to coax the browser into not pulling another copy of the first level page from the server, so that the browser history from the IFRAME is retained. One might think this approach is overkill, but every attempt to force the browser to cache the first level page through cache-controlling headers alone failed.
This solution was not the first remedy I tried. My first inkling was to modify the headers that tell the browser how to cache the page. Whether or not you are familiar with HTTP response headers, please read or re-read the Caching Tutorial by Mark Nottingham.
All attempts at sending the Expires or Cache-Control headers to retain the browser history failed. In Mark's words, this was attempting to send the response with solely "freshness" information. It then became clear that a working validator would be required. Validators such as the oldschool Last-Modified or the middleschool ETag headers proved to be the solution to the problem.
This solution is a bit of a calculated risk: the first level page must be cached in the browser for the browser history in IE to be retained. This means it cannot be fully dynamic, but that is perfectly acceptable in certain scenarios, such as when the first level page is completely static and all dynamic content is loaded through JavaScript, Ajax, etc. It also gives the server much more granular control over the browser's caching. The examples below show how to control the browser's caching for a specific amount of time. Case in point: the browser downloads the file from the server on the initial load, then uses the cached version on every subsequent load up to a specific timeout limit. Once the timeout is reached, it downloads a new version on the next request. This follows the logic of session variables perfectly if you match the validator's forced caching timeout to the timeout of your session variables.
I have built two examples of controlling request caching validation in PHP, using either the Last-Modified or the ETag header, as follows:
Last-Modified header logic on 1200 second cache timeout
<?php
// Last-Modified logic, HTTP/1.0 oldschool style: answer 304 until the 1200 second timeout passes.
// The Last-Modified value sent below is really the expiry time (now + 1200 seconds), so the
// cached copy is still valid while the If-Modified-Since date echoed back is in the future.
$since = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) : false;
if ($since !== false && $since > time()) {
    // A bare 'Not Modified' string is not a valid header line; send a proper status line.
    header('HTTP/1.1 304 Not Modified', true, 304);
    header('Last-Modified: ' . $_SERVER['HTTP_IF_MODIFIED_SINCE']);
    exit(0);
} else {
    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', time() + 1200) . ' GMT');
}
?>
<html>
<head>
<script language="JavaScript" type="text/javascript" src="jquery.js"></script>
<script>
function addEntry () {
    // Open and close the IFRAME document, then write a timestamp into its hash.
    $('#hframe')[0].contentWindow.document.open().close();
    $('#hframe').contents()[0].location.hash = '#' + (new Date()).getTime();
}
</script>
</head>
<body>
<input type="button" onClick="addEntry();" value="add history entry"><br>
<iframe id="hframe" src="cache.html"></iframe>
</body>
</html>
Both of the above examples serve the request dynamically on first load. Subsequent requests then check the incoming validation headers against a set timeout of 1200 seconds. If the previous response was issued within the timeout, we send 304 Not Modified, instructing the browser to use its cached version. If the request arrives beyond the timeout, we generate a fresh response. Each response, regardless of status or whether the timeout was met, carries a validator header. This mimics simple web session handling logic.
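To make the ETag flavor of this decision concrete, here is a minimal sketch of the comparison as a pure JavaScript function, so it can be run outside a web server. It is an illustration under stated assumptions rather than a port of the PHP listing: the names decideResponse and parseEtagTimestamp are made up for this example, and I assume the ETag holds the Unix timestamp (in seconds) at which the page was last served in full. On a 304 it echoes the validator it received, as the Last-Modified listing does.

```javascript
// Assumed timeout, matching the 1200 seconds used in the listing above.
var CACHE_TIMEOUT_SECONDS = 1200;

// Pull the epoch timestamp out of an If-None-Match value such as "1700000000".
// Returns null when the header is missing or not in the expected shape.
function parseEtagTimestamp(ifNoneMatch) {
  var match = /^(?:W\/)?"(\d+)"$/.exec(ifNoneMatch || '');
  return match ? parseInt(match[1], 10) : null;
}

// Decide the status code and the ETag to send for one request.
function decideResponse(ifNoneMatch, nowSeconds) {
  var issuedAt = parseEtagTimestamp(ifNoneMatch);
  var age = issuedAt === null ? null : nowSeconds - issuedAt;
  if (age !== null && age >= 0 && age < CACHE_TIMEOUT_SECONDS) {
    // Still inside the timeout: tell the browser to keep its cached copy.
    return { status: 304, etag: ifNoneMatch };
  }
  // No validator, a malformed one, or the timeout has passed: serve fresh.
  return { status: 200, etag: '"' + nowSeconds + '"' };
}
```

A server handler would pass in the request's If-None-Match header and the current time, then send the returned status and ETag; on a 304 it would skip the body.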
My preference is the ETag header logic. Unlike the Last-Modified header, which requires an HTTP-date in GMT, the ETag is not strictly typed. It is more generic and lets us store different values, such as numeric Unix epoch timestamps, which make the validator timeout comparison easier. It also makes new code deployment easier to control: the administrator can hijack the comparison logic to force fresh versions of the file to the browser, regardless of whether the validator is within the timeout, to ensure the updated pages are deployed.
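One way the deployment override might look is to fold a version marker into the ETag next to the timestamp. The sketch below is only an assumption about how this could be done: the "version-timestamp" ETag layout and the names buildEtag and shouldSend304 are invented for illustration.

```javascript
// Assumed timeout in seconds, as in the examples above.
var TIMEOUT_SECONDS = 1200;

// Build an ETag of the hypothetical form "<version>-<epochSeconds>".
function buildEtag(version, nowSeconds) {
  return '"' + version + '-' + nowSeconds + '"';
}

// Return true when the browser may keep its cached copy (send 304).
function shouldSend304(ifNoneMatch, currentVersion, nowSeconds) {
  var match = /^"([^"-]+)-(\d+)"$/.exec(ifNoneMatch || '');
  if (!match) {
    return false; // no usable validator: serve a fresh copy
  }
  if (match[1] !== currentVersion) {
    return false; // new deployment: force a fresh copy even inside the timeout
  }
  var age = nowSeconds - parseInt(match[2], 10);
  return age >= 0 && age < TIMEOUT_SECONDS;
}
```

Bumping the version string on each deployment (for example from v1 to v2) invalidates every cached copy on its next request, while the timestamp half keeps the timeout behavior unchanged.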
I have not closed the book on potential solutions to this problem in Internet Explorer - nor have I found an honest answer as to why IFRAME browser history is not retained upon browser window reload when using standard cache controlling headers.