Yesterday I read a good bit of discussion from way back in 2005 on the behavior of Google’s Web Accelerator and the trauma it caused to websites. Although I am a little late to the party (3 years later), my post on Best Practices for GET and POST HTTP commands does answer some big questions which kept recurring in that discussion.
First, a little introduction to the havoc wreaked by Web Accelerator: It sits alongside your browser and “clicks” links intelligently on the page you are visiting, so that your next click opens the new page instantly. However, this “intelligent” behavior started to trouble web applications where links happened to update or delete records in Admin Consoles.
Although the bigger question raised was about privacy (Google indexes pages prefetched by Web Accelerator, which includes pages unreachable by its crawlers), let’s keep that aside for a moment and revisit the issues faced by web developers. As Web Accelerator is no longer active, you may wonder why we need to recap history. The reason is that you never know what plugins the users of your app have installed in their browsers. Yesterday, it was Google. Tomorrow, it may be something smaller, auto-installed with another package, and no one will have any idea that your pages are being prefetched.
As always, information websites with links sprinkled around do not need to bother about prefetch. It’s the websites requiring user authentication that mostly fell prey to this activity.
I’ve not tried GWA, and there are comments stating that GWA doesn’t do a lot of the things which have been alleged. However, our job here is not to discuss the merits of Web Accelerators and their conformance to standards. All we want to do is strengthen our own website. So let’s take a look at some of the problems faced, and the graceful solutions or workarounds suggested.
1. “Logout” link prefetched once the user logged in: This threw the user out before they could do anything else. Quite irritating. The “Best Practices” supporters came out in strong defense of Google here. Why would developers keep Logout as a link (GET) and not a POST, they asked. Except that Logout really is an idempotent operation! A user can log out once or ten times, and the result is the same in almost all cases. Idempotent is not the same as safe, though: the first logout still changes application state by ending the session. That is why our little tweak to the Best Practices helps in deciding that POST is better for Logout:
A safer deal is to use GET as the form method only when the application state does not change at all.
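The rule above can be sketched as a small server-side check. This is only an illustration under assumptions: the route table, the paths and the `check_method` helper are hypothetical names, not part of any real framework.

```python
# Hypothetical route table: paths and flags are illustrative only.
SAFE_METHODS = {"GET", "HEAD"}

ROUTES = {
    "/reports": {"changes_state": False},      # pure read, fine as a link
    "/logout": {"changes_state": True},        # ends the session
    "/admin/delete": {"changes_state": True},  # removes a record
}


def check_method(method: str, path: str) -> int:
    """Return the HTTP status to answer with before the real handler runs."""
    route = ROUTES.get(path)
    if route is None:
        return 404  # unknown path
    if route["changes_state"] and method.upper() in SAFE_METHODS:
        return 405  # Method Not Allowed: state changes need POST
    return 200


print(check_method("GET", "/logout"))   # a prefetcher's GET is refused
print(check_method("POST", "/logout"))  # a real form submit goes through
```

With a check like this in front of the handlers, a prefetcher that follows every link on a page can no longer log the user out by accident.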
2. “Delete” links prefetched in Admin consoles: Well, this is pretty straightforward. You cannot have “Delete” as a GET operation. But here’s where we get out of utopia. In the real world, the navigation and look and feel of the application are largely decided by the UI team, and the developer has little say in the matter. If the designers feel that links alongside 10 items look “cool” and buttons don’t, well, you need to keep a link. The workaround here is to give the link href="#" and submit a form from the link’s onclick event.
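One way to picture this workaround is a small server-side helper that emits the markup for such a link. It is a sketch under assumptions: `delete_link`, the form id scheme, the `id` field name and the example URL are all made up for illustration.

```python
from html import escape


def delete_link(action_url: str, item_id: int, label: str = "Delete") -> str:
    """Render a link that submits a hidden POST form instead of issuing a GET."""
    # int() keeps the id safe to embed inside the inline onclick script.
    form_id = f"delete-form-{int(item_id)}"
    action = escape(action_url, quote=True)
    return (
        f'<form id="{form_id}" method="post" action="{action}" style="display:none">'
        f'<input type="hidden" name="id" value="{int(item_id)}">'
        f"</form>"
        f'<a href="#" onclick="document.getElementById(\'{form_id}\').submit(); return false;">'
        f"{escape(label)}</a>"
    )


print(delete_link("/admin/items/delete", 42))
```

The link still looks like a link to the designers, but its href points nowhere, so a tool that merely follows hrefs has nothing destructive to fetch. The trade-off is that the link does nothing when JavaScript is disabled.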
3. Links which involved heavy database operations: These were prefetched as well, which increased server load. A way out here is to limit the number of “heavy” operations a user can perform per minute. This seems like a fair balance between the hack of redirecting to a 403 and the puritan approach of removing links altogether and making pages accessible only through JavaScript or POST operations.
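A per-user limit of this kind could look roughly like the sliding-window sketch below. The class name, the default of 5 calls per 60 seconds and the in-memory storage are assumptions made for illustration; a real application would pick its own limits and probably a shared store.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class HeavyOperationLimiter:
    """Allow at most max_calls heavy operations per user within a time window."""

    def __init__(self, max_calls: int = 5, window_seconds: float = 60.0):
        self.max_calls = max_calls
        self.window = window_seconds
        self._calls = defaultdict(deque)  # user id -> timestamps of recent calls

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        calls = self._calls[user_id]
        # Forget calls that have fallen out of the window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False  # over the limit: skip the heavy query
        calls.append(now)
        return True


limiter = HeavyOperationLimiter(max_calls=2)
print([limiter.allow("alice", now=t) for t in (0, 1, 2, 60)])
```

The handler for a heavy page would call `allow` first and show a "please try again shortly" message when it returns False, so a burst of prefetched requests costs the database very little.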
4. Links which retrieved data but also imposed exclusive locks on the data: The first user to come along could end up locking quite a bit of system data, thanks to the prefetch operation. However, isn’t a lock on data a change of application state? The change needn’t be a database write. Any change of state should require (scream) POST.
Well, that’s quite an interesting list of 4 points with repeated gyaan which has, no doubt, also been written before by others. But as long as reading this post helps at least one developer, I’m happy. Benefited developer, please post a comment so that I stay vindicated.