Saturday, July 21, 2007

Agile Management of Database Changes

For the last couple of weeks I have been researching a better way to manage our application's database changes. We have a small team (8-9 people) working on the application, and we have a hard time managing the database changes. The way we handle changes currently is not the most elegant approach, but it is common in many team environments.

In the current system we store database changes as SQL statements in source control, categorized by the component they touch. We have a "db" folder under source control containing separate subfolders for the various database components: tables, sequences, indexes and so on. Each new database component gets its own .sql file in the corresponding folder, and changes to an existing component are appended to that component's creation script.
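Concretely, the layout looks something like this (the individual file names here are illustrative, not our actual files):

```text
db/
  tables/
    employee.sql          -- CREATE TABLE plus appended ALTER statements
    department.sql
  sequences/
    department_dept_id_seq.sql
  indexes/
    department_admin_dept_id_n_idx.sql
```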

Here are some of the issues with our current approach:

1) It is hard to compare database changes between build versions.

2) There is no elegant way to release database changes to production, staging, or other team members' dev boxes. To push changes to the production system at build time, a separate changes.sql file is created and run at release time. Sometimes, to catch up with the current database state, developers need to go through a maze of SQL files and manually determine which changes need to be synced.

3) It is hard to roll the database back to a previous release version; the changes would have to be correctly segregated by release to make that possible.

Looking around for a more elegant approach, I stumbled across the concept of agile databases. Among the core principles of agile databases are incremental change and easy change management. That search led me to a tool called MigrateDB that can be used to manage database changes.

MigrateDB is a simple XML-based solution that applies the changes you define in an XML file. For each SQL change you provide a test (pre-condition), and the SQL action is executed only when the condition is met. The tool can be run from the command line and also provides Ant integration.

Here are some of the salient features (taken from MigrateDB):

1) Allows construction of a database at a particular version
2) Allows migration from an existing database to a later version
3) Human readable format for releases
4) Release ‘action’ available on multiple environments (i.e., various operating systems) allowing development on a different platform than production
5) Provides complete history of changes for each database object
6) The source code can be branched and merged
7) Allows multiple developers to work with/on the same database source code, at the same time
8) Supports an ‘automated build’ / ‘continuous integration’ environment


Here's the setup I played around with:

Database: Postgres

Sample changes.xml file (one caveat: the DEFAULT clauses below call nextval() on sequences, including an employee_empId_seq that this file never creates, so those sequences must exist before the tables are used; putting the sequence-creation changes first is the safer layout):


<project>



<!-- test & action to create a new employee table -->
<change>
<sqltest exists="false">
select * from information_schema.tables where table_name = 'employee';
</sqltest>
<sqlaction>
CREATE TABLE employee (
emp_id int4 NOT NULL DEFAULT nextval('employee_empId_seq'::text),
first_name VARCHAR(12) NOT NULL,
middle_initial CHAR(1) NOT NULL,
last_name VARCHAR(15) NOT NULL,
dept_id int4,
phone_number CHAR(4),
hire_date DATE,
job VARCHAR(50),
education_level int2 NOT NULL,
sex CHAR(1) ,
birth_date DATE ,
salary NUMERIC ,
bonus NUMERIC ,
commission NUMERIC ,
CONSTRAINT employee_emp_id_pk PRIMARY KEY (emp_id),
CONSTRAINT employee_phone_number_ck CHECK (phone_number >= '0000' AND phone_number <= '9999')
);
</sqlaction>
<sqlaction>
ALTER TABLE employee OWNER TO pgsql;
</sqlaction>
<sqlaction>
GRANT ALL ON TABLE employee TO pgsql;
</sqlaction>
</change>



<!-- test & action to create a new department table -->
<change>
<sqltest exists="false">
select * from information_schema.tables where table_name = 'department';
</sqltest>
<sqlaction>
CREATE TABLE department (
dept_id int4 NOT NULL DEFAULT nextval('department_dept_id_seq'::text),
dept_name VARCHAR(36) NOT NULL,
manager_emp_id int4,
admin_dept_id int4,
location VARCHAR(50),
CONSTRAINT department_dept_id_pk PRIMARY KEY (dept_id)
);
</sqlaction>
<sqlaction>
ALTER TABLE department OWNER TO pgsql;
</sqlaction>
<sqlaction>
GRANT ALL ON TABLE department TO pgsql;
</sqlaction>
</change>



<!-- test & action to create a new referential integrity -->
<change>
<sqltest exists="false">
SELECT * FROM information_schema.referential_constraints where
constraint_name = 'department_admin_dept_id_fk';
</sqltest>
<sqlaction>
alter table department add CONSTRAINT department_admin_dept_id_fk FOREIGN KEY
(admin_dept_id) REFERENCES department (dept_id) ON DELETE SET NULL;
</sqlaction>
</change>




<!-- test & action to create a new sequence -->
<change>
<sqltest exists="false">
SELECT * FROM pg_catalog.pg_statio_user_sequences WHERE relname = 'department_dept_id_seq';
</sqltest>
<sqlaction>
CREATE SEQUENCE department_dept_id_seq
INCREMENT 1
MINVALUE 0
MAXVALUE 9223372036854775807
START 1
CACHE 1;
</sqlaction>
<sqlaction>
ALTER TABLE department_dept_id_seq OWNER TO pgsql;
</sqlaction>
</change>




<!-- test & action to create a new index -->
<change>
<sqltest exists="false">
SELECT * FROM pg_catalog.pg_statio_user_indexes WHERE indexrelname = 'department_admin_dept_id_n_idx';
</sqltest>
<sqlaction>
CREATE INDEX department_admin_dept_id_n_idx ON department (admin_dept_id);
</sqlaction>
<!-- note: an index is owned by its table's owner, and PostgreSQL's ALTER INDEX has no OWNER TO form, so no separate ownership change is needed here -->
</change>




</project>



The above changes.xml file is stored in source control and can be used to detect database changes between versions. A custom XML parser can be built to track the change history of a particular database object.
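As a quick sketch of such a parser, the standard JDK DOM API is enough to pull out each &lt;change&gt; element and its test condition; the class and method names here are made up for illustration, and the embedded sample is a trimmed-down version of the changes file above:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper that lists the changes defined in a MigrateDB-style changes.xml.
public class ChangeLister {

    // A trimmed-down changes file in the same shape as the sample above.
    static final String SAMPLE =
        "<project>"
        + "<change><sqltest exists=\"false\">select * from information_schema.tables"
        + " where table_name = 'employee';</sqltest>"
        + "<sqlaction>CREATE TABLE employee (emp_id int4);</sqlaction></change>"
        + "<change><sqltest exists=\"false\">select * from information_schema.tables"
        + " where table_name = 'department';</sqltest>"
        + "<sqlaction>CREATE TABLE department (dept_id int4);</sqlaction></change>"
        + "</project>";

    // Returns the test condition of every <change> element, in document order.
    static List<String> testConditions(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList changes = doc.getElementsByTagName("change");
        List<String> conditions = new ArrayList<String>();
        for (int i = 0; i < changes.getLength(); i++) {
            Element change = (Element) changes.item(i);
            conditions.add(change.getElementsByTagName("sqltest")
                    .item(0).getTextContent().trim());
        }
        return conditions;
    }

    public static void main(String[] args) throws Exception {
        for (String condition : testConditions(SAMPLE)) {
            System.out.println(condition);
        }
    }
}
```

Diffing the list produced for two tagged versions of changes.xml gives a quick view of which database objects changed between releases.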

For databases that do not expose a data dictionary (so test conditions like the ones above cannot be written), an alternative approach called Generic Migrations can be used.

Sample ant target:


<project name="sample_migrateDB">
<description>Ant build file for database synchronization</description>
<property file="${basedir}/buildMigrateDB.properties" />

<target name="init">
<!-- db.user, db.password, db.drivername and db.url are read from the properties file above -->
<echo message="DB URL: ${db.url}" />
</target>

<!-- Define the build path -->
<path id="buildMigrateDB.classpath.ref">
<!-- ensure the following libs are present in the {lib.dir}: migratedb.jar and postgres jdbc jar -->

<fileset dir="${lib.dir}">
<include name="*.jar" />
<include name="*.zip" />
</fileset>
</path>

<property name="buildMigrateDB.classpath" refid="buildMigrateDB.classpath.ref" />
<!-- End defining the build path -->

<taskdef name="dbrelease" classname="net.sf.migratedb.ant.MigrateDbTask" classpath="${buildMigrateDB.classpath}" />

<target name="dbmigrate" depends="init" description="Create Latest Database Version">

<!-- the file attribute points at the change xml file shown above -->
<dbrelease driver="${db.drivername}"
url="${db.url}"
userid="${db.user}"
password="${db.password}"
apply="true"
verbose="true"
file="${root.dir}/database/postgres/db.xml"
/>
</target>
</project>
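The build file reads its connection settings from buildMigrateDB.properties; a minimal example (all values are placeholders for your own environment) could look like:

```properties
# connection settings consumed by the Ant build above
db.user=pgsql
db.password=secret
db.drivername=org.postgresql.Driver
db.url=jdbc:postgresql://localhost:5432/mydb

# locations referenced by the build file
lib.dir=lib
root.dir=.
```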

Thursday, July 19, 2007

Managing Log4j Appenders @ Runtime

Many times when debugging production issues, we wish the log level had been at DEBUG or some other lower level to get more insight into the issue. The usual way to debug a non-obvious production issue is to reproduce it in your local development environment. In many cases that works, but in some cases it is just hard to reproduce the exact conditions that caused the error, for instance a complex Hibernate transaction error. Our application makes a lot of web service calls to various third-party hosted services. It is almost impractical to log all the SOAP interactions, and reproducing the environment that generated a faulty SOAP call is often not possible.

For this reason we manage our Log4j loggers at runtime: we can add new loggers and change the levels of existing loggers while the application is running. Once the stack trace or the required log output is gathered, the levels can be reverted to the normal, less voluminous settings.

We have a Log4j administration page in our application that can only be accessed by super ninjas, a.k.a. developers. This page can be used to change logger levels or add new loggers at runtime.

To start with, we display all the current loggers with their current levels. This can be achieved using the code below (for simplicity's sake, plain servlet code is shown):



// collect all current loggers (LogManager.getCurrentLoggers() returns Logger instances)
List appenders = new ArrayList(50);
Enumeration e = LogManager.getCurrentLoggers();
while (e.hasMoreElements())
{
    appenders.add(e.nextElement());
}
request.setAttribute("appenders", appenders);

// all possible priority levels
Priority[] prios = Priority.getAllPossiblePriorities();
request.setAttribute("possiblePriorities", prios);



In the JSP page you can use simple JSTL tags to display the collected information in a tabular format.



<%@ page import="java.util.*,org.apache.log4j.*" %>

....
....

<table>
<tr>
<td> Appender </td>
<c:forEach var="priority" items='${possiblePriorities}'>
<td><c:out value='${priority}'/></td>
</c:forEach>
</tr>

<c:forEach var="appender" items='${appenders}'>
<tr>
<td><c:out value='${appender.name}'/></td>
<c:forEach var="priority" items='${possiblePriorities}'>
<td>
<input type="radio" name="${appender.name}" value="${priority}"
<c:if test='${appender.chainedPriority == priority}'> checked </c:if>
>
</td>
</c:forEach>
</tr>
</c:forEach>

<tr>
<td colspan="6" align="center"> <button type="submit" name="submit" value="update">Update</button></td>
</tr>

</table>

<!-- section of page to add new appender-->

<table style="width:auto;">
<tr>
<td align="center">
<input type="text" name="newLogger" size="70">
</td>
<td>
<select name="newLoggerLevel">
<c:forEach var="priority" items='${possiblePriorities}'>
<option value="<c:out value='${priority}' />"><c:out value='${priority}' /></option>
</c:forEach>
</select>
</td>
<td>
<button type="submit" name="submit" value="Add">Add Logger</button>
</td>
</tr>
</table>



The idea is to get a view of this sort (screenshot omitted): a table with one row per logger, a radio button for each possible level, and the current level pre-selected.
Two events can be generated from the above page:
1) add a new logger
2) update the level of an existing logger

Both of these events can be handled with the servlet code below:



import org.apache.log4j.*;

....
....

// update levels of existing loggers based on the submitted radio buttons
Enumeration e = LogManager.getCurrentLoggers();
while (e.hasMoreElements())
{
    Logger logger = (Logger) e.nextElement();
    String prio = request.getParameter(logger.getName());
    if (prio != null && prio.length() > 0)
    {
        Level p = Level.toLevel(prio);
        if (p != null && !p.equals(logger.getEffectiveLevel()))
        {
            logger.setLevel(p);
        }
    }
}

// add a new logger if one was requested
String newLogger = request.getParameter("newLogger");
String newLoggerLevel = request.getParameter("newLoggerLevel");
if (newLogger != null && newLogger.trim().length() > 0)
{
    // Logger.getLogger creates the logger if it does not already exist
    Level p = Level.toLevel(newLoggerLevel);
    Logger logger = Logger.getLogger(newLogger.trim());
    logger.setLevel(p);
}