Improve your Claims_DB Front-End by using Checksumming

This content was originally published to Benscomputer.no-ip.org in Nov 2009

This is a tutorial on how to improve the response time of applications that use Claims_DB as a backend. The method works whether you are using Claims_DB_listener to access Claims_DB on a remote server, or accessing Claims_DB on your local system.





In the case of the former, replace the direct Claims_DB code with a Claims_DB_listener request (using METHOD 14). I've included an example Claims_DB_listener query, but commented it out.

We are going to start by building a simple BASH script, without checksumming. It's going to generate a HTML page using the information provided by Claims_DB, we will simply define the template.

Note: Choose a Database and Table that already exists within your Claims_DB installation. We are not going to write to the database in this tutorial.

Create your script in a convenient directory (I used ~/sandbox, but it's down to personal preference). I use Kate for most of my scripting, but use whichever text editor you are most comfortable with.

# kate example_script.sh

Now enter the following code;

#!/bin/bash
#
# This is a simple script to demonstrate the usefulness of checksumming
#
# Copyright Ben Tasker 2009
# Released under the GNU GPL V3 - See http://benscomputer.no-ip.org/LICENSE
#
#

# You need to define these yourself
#
CLAIMS_ROOT="/path/to/Claims_DB/installation/folder"
DBROOT="name of a DB stored in Claims_DB"
TABLE="name of a table within the Database named in DBROOT"

# If you're going to use Claims_DB_listener you need to set this next
# one to the name/ip of the database server
DATABASE_SERVER="Holly"

DATE=$( date )


# Generate the HTML
/bin/cat << EOM > /tmp/script.tmpfile.$$
<html>
<head><title>Example Script to Demonstrate Checksumming</title></head>
<body bgcolor="white">
<center><b>This is an example HTML report generated from a Claims_DB query.</b></center>

<br><br><b><u>$DATE</b></u>
<br><br>
<table>
<tr>
EOM

# Grab the column titles

# Claims_DB_Listener request, uncomment this and comment the following line to fire the request that way
# wget -O /tmp/Claims_DB-response.$$ "http://$DATABASE_SERVER/cgi-bin/Claims_DB_listener.sh?TABLE=$TABLE&DBROOT=$DBROOT&METHOD=3"

# Comment this one out if you're using the listener
TABLE="$TABLE" DBROOT="$DBROOT" "$CLAIMS_ROOT"/bin/read_records_claims.sh -headers > /tmp/Claims_DB-response.$$

# Remove the text delimeters
sed s/''"//g /tmp/Claims_DB-response.$$ > /tmp/column_headers

# Create the Column Head

COLT1=$( awk -F'' '{print $1}' /tmp/column_headers )
COLT2=$( awk -F'' '{print $2}' /tmp/column_headers )
COLT3=$( awk -F'' '{print $3}' /tmp/column_headers )
COLT4=$( awk -F'' '{print $4}' /tmp/column_headers )


/bin/cat << EOM >> /tmp/script.tmpfile.$$
<td>$COLT1</td><td>$COLT2</td><td>$COLT3</td><td>$COLT4</td></tr>
EOM

# That's the quick & easy part out of the way!

# let's take the data from the table

# Claims_DB_Listener request, uncomment this and comment the following line to fire the request that way
# wget -O /tmp/Claims_DB-response.$$ "http://$DATABASE_SERVER/cgi-bin/Claims_DB_listener.sh?TABLE=$TABLE&DBROOT=$DBROOT&METHOD=2"

# Comment this one out if you're using the listener
TABLE="$TABLE" DBROOT="$DBROOT" "$CLAIMS_ROOT"/bin/read_records_claims.sh -read > /tmp/Claims_DB-response.$$

# Now we want to process it line by line

while read -r a
do

# Remove the text delimeters
echo "$a" | sed s/''"//g > /tmp/current_line

# Grab each column and put it to a variable
COL1=$( awk -F'' '{print $1}' /tmp/current_line )
COL2=$( awk -F'' '{print $2}' /tmp/current_line )
COL3=$( awk -F'' '{print $3}' /tmp/current_line )
COL4=$( awk -F'' '{print $4}' /tmp/current_line )

# Generate the HTML
/bin/cat << EOM >> /tmp/script.tmpfile.$$
<tr><td>$COL1</td><td>$COL2</td><td>$COL3</td><td>$COL4</td></tr>
EOM

# We've generated our rows

done < /tmp/Claims_DB-response.$$

#Tidy up a bit
rm -f /tmp/current_line
rm -f /tmp/Claims_DB-response.$$
rm -f /tmp/column_headers


# Close off the HTML
/bin/cat << EOM >> /tmp/script.tmpfile.$$
<tr><td>$COL1</td><td>$COL2</td><td>$COL3</td><td>$COL4</td></tr>
</table>
<hr>
<br>
Report Ends
</body></html>
EOM



# Open the resulting file in our HTML Browser
firefox /tmp/script.tmpfile.$$

# Give firefox a few seconds to load before we remove the file
sleep 5
rm -f /tmp/script.tmpfile.$$
exit


So we've generated a HTML page containing every record in the selected Database. It should look something like the following;



This is an example HTML report generated from a Claims_DB query.


05 November 2009

Title 1 Title 2 Title 3 Title 4
Data 1 Col 1 Data 1 Col 2 Data 1 Col 3 Data 1 Col 4
Data 2 Col 1 Data 2 Col 2 Data 2 Col 3 Data 2 Col 4
Data 3 Col 1 Data 3 Col 2 Data 3 Col 3 Data 3 Col 4
Data 4 Col 1 Data 4 Col 2 Data 4 Col 3 Data 4 Col 4


Report Ends



Obviously you'd normally tidy things up (center the data etc.) but it'll do for the purposes of this tutorial. The page loads, and we can read it which is great, but if the Table is quite large then it can take absolutely ages to load!

Try running it on a table that contains more than 100 rows of data, the while loop takes a long time to complete. But by using checksumming we can generate a cachefile, if the checksum hasn't changed then we can simply display the cached file giving the user an almost instant response.

To do this, take the code from above and make the following changes (in red)

#!/bin/bash
#
# This is a simple script to demonstrate the usefulness of checksumming
#
# Copyright Ben Tasker 2009
# Released under the GNU GPL V3 - See http://benscomputer.no-ip.org/LICENSE
#
#

# You need to define these yourself
#
CLAIMS_ROOT="/path/to/Claims_DB/installation/folder"
DBROOT="name of a DB stored in Claims_DB"
TABLE="name of a table within the Database named in DBROOT"

# If you're going to use Claims_DB_listener you need to set this next
# one to the name/ip of the database server
DATABASE_SERVER="Holly"

DATE=$( date )

# We want the DATE to be generated everytime, so lets put it to a different file

# Generate the HTML
/bin/cat << EOM > /tmp/script.tmpfile.$$.head
<html>
<head><title>Example Script to Demonstrate Checksumming</title></head>
<body bgcolor="white">
<center><b>This is an example HTML report generated from a Claims_DB query.</b></center>

<br><br><b><u>$DATE</b></u>
<br><br>
<table>
<tr>
EOM

Get a checksum from Claims_DB

# Uncomment this line if you are using the listener
# wget -O /tmp/checksum.new "http://$DATABASE_SERVER/cgi-bin/Claims_DB_listener.sh?TABLE=$TABLE&DBROOT=$DBROOT&METHOD=14"

#Comment this next line if you are using the listener
TABLE="$TABLE" DBROOT="$DBROOT" "$CLAIMS_ROOT"/bin/read_records_claims.sh - --gen-tbl-sig > /tmp/checksum.new

# Check for a cache file
if [ -e /tmp/.claims_db.response.$TABLE ]
then
# Cache file exists, but is it valid?

# We need to compare the new checksum to the one stored in /tmp/.claims_db.response.$TABLE.sum
NEWSUM=$( cat /tmp/checksum.new )
OLDSUM=$( cat /tmp/.claims_db.response.$TABLE.sum )

# Do they match?
if [ "$NEWSUM" == "$OLDSUM" ]
then
# Checksums match, use the cache file
# Tidy up first
rm -f /tmp/checksum.new

# Place the header into a file, then place the cache file in under it.
cat /tmp/script.tmpfile.$$.head > /tmp/script.tmpfile.$$
cat /tmp/.claims_db.response.$TABLE.sum >> /tmp/script.tmpfile.$$

# Now open the file
firefox /tmp/script.tmpfile.$$

# Give firefox some time then tidy up and exit
sleep 5
rm -f /tmp/script.tmpfile.$$
rm -f /tmp/script.tmpfile.$$.head
exit

fi

# Checksums don't match

fi

# Either checksums don't match or the cachefile didn't exist, process as normal


# Grab the column titles

# Claims_DB_Listener request, uncomment this and comment the following line to fire the request that way
# wget -O /tmp/Claims_DB-response.$$ "http://$DATABASE_SERVER/cgi-bin/Claims_DB_listener.sh?TABLE=$TABLE&DBROOT=$DBROOT&METHOD=3"

# Comment this one out if you're using the listener
TABLE="$TABLE" DBROOT="$DBROOT" "$CLAIMS_ROOT"/bin/read_records_claims.sh -headers > /tmp/Claims_DB-response.$$

# Remove the text delimeters
sed s/''"//g /tmp/Claims_DB-response.$$ > /tmp/column_headers

# Create the Column Head

COLT1=$( awk -F'' '{print $1}' /tmp/column_headers )
COLT2=$( awk -F'' '{print $2}' /tmp/column_headers )
COLT3=$( awk -F'' '{print $3}' /tmp/column_headers )
COLT4=$( awk -F'' '{print $4}' /tmp/column_headers )


/bin/cat << EOM >> /tmp/script.tmpfile.$$
<td>$COLT1</td><td>$COLT2</td><td>$COLT3</td><td>$COLT4</td></tr>
EOM

# That's the quick & easy part out of the way!

# let's take the data from the table

# Claims_DB_Listener request, uncomment this and comment the following line to fire the request that way
# wget -O /tmp/Claims_DB-response.$$ "http://$DATABASE_SERVER/cgi-bin/Claims_DB_listener.sh?TABLE=$TABLE&DBROOT=$DBROOT&METHOD=2"

# Comment this one out if you're using the listener
TABLE="$TABLE" DBROOT="$DBROOT" "$CLAIMS_ROOT"/bin/read_records_claims.sh -read > /tmp/Claims_DB-response.$$

# Now we want to process it line by line

while read -r a
do

# Remove the text delimeters
echo "$a" | sed s/''"//g > /tmp/current_line

# Grab each column and put it to a variable
COL1=$( awk -F'' '{print $1}' /tmp/current_line )
COL2=$( awk -F'' '{print $2}' /tmp/current_line )
COL3=$( awk -F'' '{print $3}' /tmp/current_line )
COL4=$( awk -F'' '{print $4}' /tmp/current_line )

# Generate the HTML
/bin/cat << EOM >> /tmp/script.tmpfile.$$
<tr><td>$COL1</td><td>$COL2</td><td>$COL3</td><td>$COL4</td></tr>
EOM

# We've generated our rows

done < /tmp/Claims_DB-response.$$

#Tidy up a bit
rm -f /tmp/current_line
rm -f /tmp/Claims_DB-response.$$
rm -f /tmp/column_headers


# Close off the HTML
/bin/cat << EOM >> /tmp/script.tmpfile.$$
<tr><td>$COL1</td><td>$COL2</td><td>$COL3</td><td>$COL4</td></tr>
</table>
<hr>
<br>
Report Ends
</body></html>
EOM

# Combine the header and our newly created HTML cat /tmp/script.tmpfile.$$.head > /tmp/script.tmpfile.$$.1
cat /tmp/script.tmpfile.$$ >> /tmp/script.tmpfile.$$.1

# Before we display the resulting file, let's create a cachefile for next time;
cp /tmp/script.tmpfile.$$.1 /tmp/.claims_db.response.$TABLE

# Store the Checksum for use next time
mv /tmp/checksum.new /tmp/.claims_db.response.$TABLE.sum

# Move the file back to it's original location
mv /tmp/script.tmpfile.$$.1 /tmp/script.tmpfile.$$

# Open the resulting file in our HTML Browser
firefox /tmp/script.tmpfile.$$

# Give firefox some time then tidy up and exit
sleep 5
rm -f /tmp/script.tmpfile.$$ rm -f /tmp/script.tmpfile.$$.head exit


So the script will generate the same result, but will first check to see if anything has changed. If something has changed, it will generate a new page (which could still take a while) but if nothing has changed, it'll return the results previously generated. Try running the script against your big table again, first run takes a while, second run should be almost instant.

You can also use this checksumming method for queries, although you need to do some of the work yourself from within your script. Such a query would look more like the following;

TABLE="$TABLE" DBROOT="$DBROOT" "$CLAIMS_ROOT"/bin/read_records_claims.sh -query "Query Text" COLUMNNUMBER  > /tmp/Claims_DB-response.$$  md5sum /tmp/Claims_DB-response.$$ > /tmp/checksum.new


Using the result of a query to generate a checksum does not provide as much of an improvement in response time, however depending on the length of the returned data, it may still be an improvement. Results can be cached in the same way, although you need to ensure that the cache filename will be unique to that particular query. I use the format

/tmp/.Claims_DB.$TABLE.$DBNAME.$QUERY_TEXT.$COLUMNNUMBER.cache
/tmp/.Claims_DB.$TABLE.$DBNAME.$QUERY_TEXT.$COLUMNNUMBER.cache.sum


Because the cache is based on a checksum of the returned data, there is no need to impose time limits on the locally stored cache files. You may wish to do so in order to avoid cache files building up, but it will not affect the accuracy of the data returned.

This concludes this tutorial on how to improve front end response times by using checksumming, hopefully you've learned something, have fun creating your application!!