Sunday, August 28, 2016

KO'd the damn SQL injection

SQL Injection Prevention Framework in PHP 


A student of Arizona State University, I decided to take Software Security CSE 545 as one of my courses. The course covered a variety of topics one of them being SQL Injection. Those who are not familiar with it I suggest going through SQL Injection OWASP

Prior to enrolling in this course I took 'Principles of Programming Language CSE 340'. This course taught me how programming languages evolved and why the work, semantics, syntax and lexical analysis. The course was fun and so the instructor Assistant Professor Dr. Adam Doupé. He took keen interest in teaching with answering questions of 'WHY programming? HOW they work? in a clear crisp way.

He is an expert in web application security thus I decided to take his course 'Software Security'. As thecourse continued I decided to take up SQL injection as my semester project. Well I'm aware there are a lot of solutions out there and many prevention techniques, but guess I had learned something very useful 'semantics of a programming language' and thus decided to put it to use. SQL being structured query language and has its own syntax and semantics. I decided to work on it with another friend of mine expert in web security Tejas Khairnar.
"The purpose of this project is used to introduce a framework that prevents SQL injection in any PHP web application with minimum changes. We go to the root cause of the injection ‘malicious user supplier input’ that changes the original query"
Our approach is to put an end to SQL injection flaws in any PHP application with a framework that is fairly simple to integrate with minimum changes and robust. We describe our approach in three sections architecture, integration and simulation.

I. Architecture:

Each query written in a PHP file has certain semantics and is intended for a specific purpose for convenience we say ‘before Query semantics’. Now the malicious user input changes the intention or behavior of the original query as given below.


Thus we can say that the original query is modified and the semantics of the query has changed for convenience we say ‘after Query semantics’. The semantics of SQL queries is formally defined by stating a set of rules that determine a syntax-driven translation of an SQL query to a formal model [4]. Some of the highlights of the semantics of SQL query is given by table name(s), number of columns, type of query, column references etc. These attributes define the structure of the semantic model of the query representation.
" Our framework is based on the comparison the semantic model of ‘before Query’ and ‘after Query "
The detailed steps describe the detail flow of the web application request through the firewall to the database.
  1. SQLFirewall has a component ‘SQL Firewall DB Proxy Connector’ that connects to the database. Thus all queries pass through this proxy connector. Typically a wrapper over the existing connectors.
  2. The SQL Firewall DB proxy connector then passes the query to the ‘PHP SQL Parser’ component to generate the semantic model of the query that has been received for execution with user supplied input. For simpler purpose we define it as after Query.
  3. It then finds out the calling function in a PHP file recursively to identify the query that was used with supplied user input. We use the back trace functionality in PHP. All these functions are loaded in the component ‘PHP SQL Query Extractor’ which extracts the query defined in the PHP file before the supplied user input. We can say that this is a static analysis of a certain SQL query defined in a PHP file. For simpler purpose we define this as before Query.
  4. It then stores the semantic model of the before query in the ‘SQL Firewall Cache’ with key as a hash of combination of filename where the SQL query is defined, line number of that file name and function that was invoked with the SQL query parameter name. This uniquely identifies the after Query to this before Query stored in the cache.
  5. After which both the semantic models are then passed to the SQL Firewall validator which performs a deep check on the semantic model attributes of the after Query and the before Query.
  6. If any change is found the ‘SQL Injection Exception Handler’ component is called to determine the type of injection.
  7. If no change then the ‘SQL Firewall’ resumes execution of the actual flow and returns the result.

II. Integration:

These are the following two changes the developer needs to do to integrate our framework.
  1.  Use our ‘SQL Firewall DB Proxy Connector’ as the source to connect to MySQL.
  2.  include_once ‘application/application.php’ in any PHP startup loader page.

III. Simulation:


Limitations
Currently our framework is written in PHP and compatible with MySQL. The another drawback in our framework is during static analysis we get the semantic model of the ‘before Query’ we try to trace the callback to the appropriate SQL query defined in the PHP file using file name, line number of invocation and function name. Then we use the SQL query extractor to extract the SQL query related to this database call. However, there could be a situation where there are two variables with the same name are defined in the same file. We tackled this by throwing a warning to the developer to make appropriate changes.

Future Work
We look forward the incorporate the idea with other web applications technologies and databases. We could also extend our framework to include access controls on tables thereby adding an extra layer of protection. This framework could be part of bigger idea for a web application firewall.

References
  1.  OWASP SQL injection. Available: https://www.owasp.org/index.php/SQL_Injection
  2.  PHP SQL injection manual. Available: http://php.net/manual/en/security.database.sql-injection.php
  3.  PHP SQL Parser incorporated in the framework. Available at https://github.com/greenlion/PHP-SQL-Parser.
  4.  Formal semantics of SQL queries. Available at http://dl.acm.org/citation.cfm?id=111212/

I hope the article highlighted some key aspects how semantics can play a role in preventing SQL injection. Your comments and suggestion are most welcome.