Posts tagged ‘database’

mHealth: Patient Identification Issues

Patient identification is still proving to be quite an issue with the records that health workers are submitting, making it difficult to be sure that a record for a follow-up visit is attached to the right patient.

As not everyone has an ID number we can use, we originally asked health workers to identify patients by the ID number they enter in their log book. This, we thought, had the advantage that we could then easily match up the database records with the paper records. The combination of this ID and the health post name (selected from a text list in the form, but stored as a code number in the database) should have given us a unique identifier for each patient. Only the initial registration form contains the patient's full name; on the other visit forms we just ask for the health post name and the ID number, plus the year of birth and age to use as checks on the data.

Using the year of birth and age as checks we can identify where patient IDs may have been entered incorrectly, and we are seeing quite a lot of errors. In theory, rectifying these shouldn't be very time consuming or difficult, assuming that follow-up calls to the health extension workers (HEWs) are made soon after each error appears. Unfortunately, delays in following up these errors mean that it will now be quite difficult to fix them all.

We also recently added the patient's first name to each visit form, as an aid to matching mismatched records back to their correct registration forms.
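
To give an idea of what the year of birth / age check looks like, the sketch below flags visit records whose year of birth disagrees with the registration record sharing the same health post and patient ID. The table and column names are made up for illustration; they aren't our actual schema:

import java.sql.*;

public class VisitConsistencyCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details and table/column names
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/mhealth", "user", "password");
        // Visit records whose year of birth disagrees with the registration
        // record that shares the same health post code and patient ID
        PreparedStatement pstmt = conn.prepareStatement(
                "SELECT v.visit_id, v.health_post_code, v.patient_id " +
                "FROM visit v " +
                "JOIN registration r ON r.health_post_code = v.health_post_code " +
                "AND r.patient_id = v.patient_id " +
                "WHERE v.year_of_birth <> r.year_of_birth");
        ResultSet rs = pstmt.executeQuery();
        while (rs.next()) {
            System.out.println("Check visit " + rs.getInt("visit_id")
                    + " (health post " + rs.getString("health_post_code")
                    + ", patient " + rs.getString("patient_id") + ")");
        }
        rs.close();
        pstmt.close();
        conn.close();
    }
}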

Some of the problems we have come across include:

  • Two (or more) patients being registered with the same ID number
  • Patient visit forms being entered with the wrong ID – and so getting matched to the wrong patient registration record
  • Patients being re-registered with a new ID number, especially when they attend a visit at a different health post or at a health centre. HEWs issue a registration card to each patient when they are first registered, and if the patient later visits a different facility, the health post name and ID from the card should be used – but it seems this is not always happening and patients are getting re-registered. This makes it very difficult to track whether patients are following up on referral advice.
  • Some health posts have restarted the numbering in their log books (the new year in the Ethiopian calendar started in September), so we are starting to see the same ID numbers being re-used for new patients (although this wasn't meant to happen)

Given the lack of reliable identification numbers, it was probably inevitable that we would experience some errors in matching records up correctly. I would have hoped that with quick follow-up to rectify errors, the health workers would soon have got used to taking extra care when entering patient ID information.

There are other options we could have taken for patient identification, but these would also have had their own drawbacks. For example:

  1. pre-registering all patients in a given area – though this seems like substantial work; or
  2. providing a set of pre-generated bar codes or numbers (with check digits), which the HEWs could issue when they see a new patient. A check digit mechanism would have really helped to minimise mistakes in entering ID numbers (see the sketch after this list), though it would not have prevented the same numbers being reused for different patients. In retrospect I think this is the approach we should have taken.
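
For anyone curious, a check digit scheme along the lines of the Luhn algorithm (the one used on credit card numbers) is only a few lines of code. The sketch below is purely illustrative; it isn't something we actually built:

public class PatientIdCheck {

    // Compute a Luhn check digit for a numeric ID (digits only)
    public static int checkDigit(String idWithoutCheck) {
        int sum = 0;
        boolean doubleIt = true; // start doubling from the rightmost digit
        for (int i = idWithoutCheck.length() - 1; i >= 0; i--) {
            int d = idWithoutCheck.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) d -= 9;
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return (10 - (sum % 10)) % 10;
    }

    // Validate an ID that has the check digit appended as its last digit
    public static boolean isValid(String fullId) {
        String body = fullId.substring(0, fullId.length() - 1);
        int given = fullId.charAt(fullId.length() - 1) - '0';
        return checkDigit(body) == given;
    }

    public static void main(String[] args) {
        String issued = "1000234" + checkDigit("1000234"); // the ID actually handed out
        System.out.println(issued);              // 10002343
        System.out.println(isValid(issued));     // true
        System.out.println(isValid("10002348")); // false - a mistyped last digit
    }
}

A scheme like this catches any single mistyped digit and most swaps of two adjacent digits, so a bad ID could be rejected at data-entry time rather than waiting for a follow-up call.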

Another factor which may have contributed to this problem is that we're forcing ODK to do something it probably wasn't really designed for. ODK is a general data collection tool; each form is an independent entity, not necessarily designed to link up with records entered from other forms. Some other recent mHealth tools have a front-end where the user selects a particular patient before entering a visit record. But this requires some form of synchronization of the data between the phone and the main database, to ensure that all the patients a health worker may visit already have their records stored on the phone; otherwise it may lead (again) to patients being re-registered.

In Ethiopia, there seem to be some efforts to resolve this identification issue, for example the national Health Management Information System (HMIS) or Family Folder system, but these aren't fully rolled out to all the health posts we're working in, so we can't take advantage of them yet. It seems to me that for these types of mHealth tools to work well and generate good quality, reliable data, a consistent system for patient identification is required; hopefully this will be coming soon in Ethiopia.

Importance of being indexed

We noticed the other day that the MySQL process on one of our MSG servers was using up far more resources than it should. After a little investigation it turned out that the problem was due to one very slow-running query. The query in question joined a table with around 200k records to a table with approximately 80k records. The same query against my test database (with 70k and 80k records in the tables) ran in about 0.2 secs, but with the increased number of records it could take up to 630 seconds (yes – over 10 minutes), so no wonder the MySQL process was going into overdrive.

There were indexes on each of the tables, but only via a compound primary key; the columns used for the join between the two tables didn't have their own index. As soon as I added an index on these columns the query started to run in a much more respectable time (<0.2 secs).
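
For reference, EXPLAIN shows which index (if any) MySQL plans to use for each table; a NULL key together with a huge rows estimate means a full table scan. The table and column names below are made up for illustration (this isn't the actual schema):

import java.sql.*;

public class IndexCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details and table/column names
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/mydb", "user", "password");
        Statement stmt = conn.createStatement();

        // EXPLAIN reports which index (if any) MySQL will use for each table;
        // key = NULL with a large rows estimate means a full table scan
        ResultSet rs = stmt.executeQuery(
                "EXPLAIN SELECT * FROM big_table b " +
                "JOIN other_table o ON o.ref_id = b.ref_id");
        while (rs.next()) {
            System.out.println(rs.getString("table") + ": key=" + rs.getString("key")
                    + ", rows=" + rs.getString("rows"));
        }
        rs.close();

        // Add the missing index on the join column
        stmt.executeUpdate("CREATE INDEX idx_big_table_ref_id ON big_table (ref_id)");

        stmt.close();
        conn.close();
    }
}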

So the lesson for me is to check that indexes are created for any columns used for joins on tables which are going to get big!

On a side note, this was in the OpenFire database, and I'm not keen on the fact that their schema uses the username column (a varchar(64) field) as the primary key for a user, rather than the more usual integer value. I suspect that joining two large tables on a varchar column rather than an integer column also contributed to the slow running of the query.

BTW… can anyone recommend any (free/open-source) tools for analysing the performance and use of indexes on a MySQL database? There seem to be loads of MySQL admin-type tools around, many of them paid-for products, but it's hard to tell which are any good for what I'm looking to do. Really I guess I'm looking for similar features to those MS SQL Query Analyser provides! Update 26/11/07 – I'm looking for something which gives me information about the query execution plan (i.e. how MySQL will use the indexes etc. in constructing the query) rather than general DB admin tools for simply executing queries.

Floating point hell…!

I updated the map grids so they are divided equally, and all appears to be working, in that the markers divide and combine in the expected way. However, I did find quite a few little bugs in the reclustering algorithm, all related to floating-point values and the number of decimal places.

The problems appear to be due to rounding errors when I’m storing the coordinates in the database. For example, when I think I’m storing say “51.2”, actually “51.2000000000001” gets stored in the database, so then when I look up all the points matching 51.2 the database doesn’t return anything!

I have found a way around this, though I'm not sure how much of a hack it is! The solution seemed to be to treat the coordinates as strings when adding them to the prepared statements used for updating and querying. So instead of using:

String sql = "SELECT * FROM mytable WHERE lat = ?";
Connection conn = dataSource.getConnection(); // dataSource is a javax.sql.DataSource
PreparedStatement pstmt = conn.prepareStatement(sql);
pstmt.setDouble(1, 51.2);

I’m using:

String sql = "SELECT * FROM mytable WHERE lat = ?";
Connection conn = dataSource.getConnection(); // dataSource is a javax.sql.DataSource
PreparedStatement pstmt = conn.prepareStatement(sql);
DecimalFormat f = new DecimalFormat("#.000000000000"); // 12 decimal places
pstmt.setString(1, f.format(51.2));

This all appears to be working for storing/retrieving the correct values/records – note that the values are still stored as datatype decimal(18,15). My only concern is that I've only tested it against a MySQL database and I'm not sure whether other RDBMSs would accept decimals/floats passed in string format.
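
Another option, which might be a bit more portable than passing strings, would be to use java.math.BigDecimal with PreparedStatement.setBigDecimal (a standard JDBC method for DECIMAL columns). A minimal sketch, again assuming a dataSource as above and the decimal(18,15) lat column:

String sql = "SELECT * FROM mytable WHERE lat = ?";
Connection conn = dataSource.getConnection(); // dataSource is a javax.sql.DataSource
PreparedStatement pstmt = conn.prepareStatement(sql);
// Build the BigDecimal from a String so the exact decimal value is kept;
// new BigDecimal(51.2) would reintroduce the binary rounding error
pstmt.setBigDecimal(1, new BigDecimal("51.2"));

Building the BigDecimal from a String keeps the exact decimal value, and it also sidesteps any locale quirks DecimalFormat can have with the decimal separator (though it would still need testing against the real data).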

Another possible solution that was suggested to me was to multiply up the coordinates so they are stored as integers in the database – but as this would be a fairly significant change at this stage (and may introduce more bugs!!), I’ll stick with the solution I’ve implemented for now 😉
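
For completeness, the integer version would look something like the sketch below (purely illustrative, and not something I'm planning to implement), using a scale factor of 1,000,000, which keeps roughly 11 cm of precision – plenty for positioning and clustering markers:

// Purely illustrative: scale coordinates up to integers before storing.
// A factor of 1,000,000 keeps roughly 11 cm of precision at the equator.
final long SCALE = 1000000L;
long latScaled = Math.round(51.2 * SCALE);    // an exact integer, stored in a BIGINT column
double latBack = latScaled / (double) SCALE;  // scaled back down when reading
// Equality comparisons on the scaled integers are then exact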

Floats, numerics, java, mysql, precision and rounding…

Spent the last couple of days trying to figure out why my clustering algorithm in Java wasn't working properly – it was working for some grids, but was totally messed up on others. I think I've tracked down why… the problem appears to be the precision of the datatypes I'm using in Java and MySQL. In Java I'm using the float type to store a latitude or longitude, while in the MySQL database I'm using decimal(15,12).

There appears to be a bit of a rounding problem somewhere along the line which I need to figure out. For some reason if I try and send a latitude of 52.2 from Java to be stored in the database, it gets stored as something like 52.2000007623. This means when I try and find all the points with latitude 52.2 it doesn’t pick up the record I’ve just inserted/updated.

I’ve not had much experience using this amount of decimal precision before, so the problems are probably due to my lack of understanding of the precision/accuracy of the various data types.

I think the way around this is to change my Java floats to doubles for more precision, and hopefully the problem won't then occur – not totally sure what to try if that doesn't work, so please let me know if you have any suggestions 🙂
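
For what it's worth, you can see exactly where the 52.2000007623 comes from by printing the value each type actually holds (a float only carries about 7 significant decimal digits, a double about 15 to 16):

import java.math.BigDecimal;

public class PrecisionDemo {
    public static void main(String[] args) {
        float latF = 52.2f;
        double latD = 52.2;
        // BigDecimal(double) shows the exact binary value being held;
        // neither type holds 52.2 exactly, but the float error is ~1e-6
        // while the double error is ~1e-15
        System.out.println("float:  " + new BigDecimal(latF)); // 52.200000762939453125
        System.out.println("double: " + new BigDecimal(latD)); // 52.20000000000000284...
    }
}

So with a float the error shows up well within the 12 decimal places of the decimal(15,12) column, whereas a double gets much closer – though still not exact, which is presumably where the remaining rounding issues come from.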

Tuesday 11 April 06

Now that I've got most of the webservices etc. sorted out for the JoinIn enterprise database, I started to look at creating the standalone interface for doing the grouping. I started off by creating a blank Struts application (on Tomcat 4.1, as I had been using for the JoinIn enterprise database), but soon noticed that I couldn't use servlet filters for the page layout (header & footer) in the way I'd have liked, because the Struts framework uses a forward to map the action in the URL onto the actual JSP page, and filters can only be applied to forwarded requests from servlet spec 2.4 onwards. That in turn means upgrading to Tomcat 5.5.

So, after a bit of ‘fun’, I've got both the JoinIn enterprise database/webservice and the standalone grouping interface running on Tomcat 5.5 on my tablet PC. I now need to make sure that I update Tomcat on our dev server and get both apps running correctly there too.

How a datasource is configured in Tomcat 5.5 is slightly different to Tomcat 4.1 too. Rather than creating a joinindb.xml file in the webapps folder to hold the context configuration, the file now goes inside the webapp's META-INF directory and is called context.xml.

Monday 27 March 06

Have spent some time this morning looking at linking a MySQL database up to Tomcat 4.1, using database connection pooling (DBCP). I had a fair bit of teething trouble getting it to work (using the instructions I'd found for editing the Tomcat server.xml), as I kept getting the message:

org.apache.commons.dbcp.SQLNestedException: Cannot create JDBC driver of class '' for connect URL 'null'

It seems that this is quite a common problem (judging by all the forum postings about it), and different solutions appear to work for different people (depending, I guess, on exactly which version of Tomcat, which database, etc.). So anyway, here's what worked for me in the end… (for info, I'm using Tomcat v4.1 on Windows, and the MySQL connector in mysql-connector-java-3.1.12.jar, which I put in $CATALINA_HOME/common/lib/)

I did not edit the $CATALINA_HOME/conf/server.xml file; instead I created an XML file in the webapps directory, named after the webapp that uses the database (joinindb.xml). The file contains the following (for info, this is the entire file):

<Context path="/joinindb" docBase="joinindb"
        debug="5" reloadable="true" crossContext="true">

  <Logger className="org.apache.catalina.logger.FileLogger"
             prefix="localhost_JoinInDB_log." suffix=".txt"
             timestamp="true"/>

  <Resource name="jdbc/JoinInDB"
               auth="Container"
               type="javax.sql.DataSource"/>

  <ResourceParams name="jdbc/JoinInDB">
    <parameter>
      <name>factory</name>
      <value>org.apache.commons.dbcp.BasicDataSourceFactory</value>
    </parameter>

    <!-- Maximum number of dB connections in pool. Make sure you
         configure your mysqld max_connections large enough to handle
         all of your db connections. Set to 0 for no limit.
         -->
    <parameter>
      <name>maxActive</name>
      <value>100</value>
    </parameter>

    <!-- Maximum number of idle dB connections to retain in pool.
         Set to 0 for no limit.
         -->
    <parameter>
      <name>maxIdle</name>
      <value>30</value>
    </parameter>

    <!-- Maximum time to wait for a dB connection to become available
         in ms, in this example 10 seconds. An Exception is thrown if
         this timeout is exceeded.  Set to -1 to wait indefinitely.
         -->
    <parameter>
      <name>maxWait</name>
      <value>10000</value>
    </parameter>

    <!-- MySQL dB username and password for dB connections  -->
    <parameter>
     <name>username</name>
     <value>myusername</value>
    </parameter>
    <parameter>
     <name>password</name>
     <value>xxxxxxxxxx</value>
    </parameter>

    <!-- Class name for mm.mysql JDBC driver -->
    <parameter>
       <name>driverClassName</name>
       <value>org.gjt.mm.mysql.Driver</value>
    </parameter>

    <!-- The JDBC connection url for connecting to your MySQL dB.
         The autoReconnect=true argument to the url makes sure that the
         mm.mysql JDBC Driver will automatically reconnect if mysqld closed the
         connection.  mysqld by default closes idle connections after 8 hours.
         -->
    <parameter>
      <name>url</name>
      <value>jdbc:mysql://localhost:3306/joinin?autoReconnect=true</value>
    </parameter>
  </ResourceParams>
</Context>

I then added the following to my webapp web.xml file (this was added just before the closing </web-app> tag):

 <resource-ref>
      <description>DB Connection</description>
      <res-ref-name>jdbc/JoinInDB</res-ref-name>
      <res-type>javax.sql.DataSource</res-type>
      <res-auth>Container</res-auth>
 </resource-ref>

I created a class to test the connection; the method it uses is as follows (it just returns the class name of the connection object):

// Requires javax.naming.Context, javax.naming.InitialContext and javax.sql.DataSource
public String testConnection() {
    String retvar = "nothing here";
    try {
        // Look up the DataSource that Tomcat binds under java:comp/env
        Context context = new InitialContext();
        if (context == null) {
            return "Boom - No Context";
        }
        DataSource datasource = (DataSource) context.lookup("java:comp/env/jdbc/JoinInDB");
        if (datasource == null) {
            return "Boom - No DataSource";
        }
        // Grab a connection from the pool and report its class name
        retvar = datasource.getConnection().getClass().getName();
    } catch (Exception e) {
        e.printStackTrace();
        retvar = "Database Not Available... aarrgh... why? - see the stack trace";
    }
    return retvar;
}

When I called this method from a JSP in my webapp, the code ran without error and returned org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper as the class name.

Friday 3 March 06

I've spent the last couple of days sorting out how we're going to move some of our websites and databases to our new cluster server and database server. This all seems to be going quite well: I've moved four of the sites over to the cluster already, but there are several more to go. Apart from that, I've spent the last couple of weeks working on the JoinIn project and on our internal project planning database.

Thursday 16 February 06

Spent most of this week working on our internal project planning database, which is all going well – it's moving forward much quicker than I thought it would – so that's always good!!

Had a really good meeting today with Ernie from the LAMS team about our JoinIn project, and I'll post more information about that on the JoinIn blog/webpage.