hammingNN progress report

Since my medical school days, I’ve been mulling over some ideas for artificial neural networks that adhere more closely to the physiology that I studied. Over the past couple of years, I finally implemented those ideas, using the Forth programming language on a Macintosh. The process went considerably faster than I initially thought it would, and the results have been very encouraging. What I have come up with is a paradigm for an ANN that:

  1. Learns in a single pass through the training data;
  2. Is robust, i.e., there are no parameters to tweak;
  3. Gracefully degrades in performance when inputs are missing or corrupted, or synaptic connections are “broken”;
  4. Is easily implemented in silicon;
  5. When implemented on a chip, would provide virtually instantaneous pattern recognition.

I have tested my ANN with some publicly available test datasets. The Wisconsin breast cancer dataset consists of 699 records, each record comprising 9 parameters of fine needle biopsy data. Each parameter can have one of ten different values. The records are from actual patient data, and the problem is to determine whether a particular tumor is benign or malignant.

With one pass through the training data (the first 350 records), my ANN achieved an accuracy of 98.275% in classifying the test set records (the last 174 records) as either benign or malignant; the remaining 175 records were used for validation. This compares with a published accuracy of 97.80% obtained by Pena-Reyes & Sipper (references available on request) for their neural network on the same dataset, and 94.15% obtained by Setiono & Liu.
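The record split and the accuracy arithmetic above can be sketched in a few lines. This is an illustrative Python sketch, not the Forth implementation; the names `split_records` and `accuracy_percent` are hypothetical, and the sketch assumes the records are kept in their original file order.

```python
def split_records(records, n_train=350, n_validation=175):
    """Split records in file order: first block for training,
    middle block for validation, and the rest for testing."""
    train = records[:n_train]
    validation = records[n_train:n_train + n_validation]
    test = records[n_train + n_validation:]
    return train, validation, test


def accuracy_percent(n_correct, n_total):
    """Classification accuracy as a percentage."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_correct / n_total


# Stand-in for the 699 Wisconsin records (real records have 9 parameters each).
records = list(range(699))
train, validation, test = split_records(records)
print(len(train), len(validation), len(test))
```

With 174 test records, the reported 98.275% is consistent with 171 correct classifications (171/174 is about 98.2759%), although the count itself is an inference from the percentage.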

The Sonar dataset has 208 records, each with 60 parameters representing sonar returns. Of these, 111 records represent a mine and the remaining 97 a cylindrically shaped rock. The problem is to classify a record as either a mine or a rock. I obtained a classification accuracy of 94.230%, considerably better than the results of 86.1% (Mannevaara) and 89.2% (Gorman & Sejnowski), and a modest improvement over Setiono & Liu, who obtained 93.81%. My result equaled that of Kwak & Choi, who also obtained 94.23%.

Paul Pace, of the Canadian Forces Experimentation Centre (CFEC), suggested I test my ANN with the MSTAR data set. When I trained on the entire set of SAR images taken at 15 degrees (7 different serial numbers, 3 different vehicles), and then tested on the entire set of images taken at 17 degrees, my ANN correctly classified 1299 targets out of 1392, for an overall accuracy of 93.32%.

Based on these results, Dr. Pace was confident that he would be able to arrange for up to $25,000 in funding to further develop the ANN, in order to test it with a U.S. military target dataset of 12,000 images. Unfortunately, Paul suffered a heart attack, and after his convalescence returned to work on a different project.

In addition to Automated Target Recognition, I believe my ANN paradigm could also find use in OCR, iris scanning, fingerprint recognition, medical diagnosis, anti-terrorist applications including profiling and monitoring of electronic communications, as well as adaptive filters and content-addressable memory. The analysis of micro-array patterns in the emerging science of genomics represents another possibility.

However, what excites me the most is the extension of this paradigm to neural networks capable of recognizing serial, or time-based patterns. While there are many challenges ahead in this development path, serial pattern recognition with a robust ANN that learns in a single pass could lead to significant improvements in areas such as speech recognition, weather and earthquake prediction, detection of train wheel defects, and so on. I also have some ideas about how a time-based ANN could improve ATR by eliminating the necessity for scaling and rotation algorithms. Animal physiology also suggests that such an ANN could be part of a simple but reliable collision avoidance device for aircraft and other vehicles.

I have already applied this serial, or time-based, pattern recognition paradigm to speaker recognition. After being trained on a dataset consisting of 270 vocal utterances (vowels) from 9 different speakers, my ANN correctly identified the speaker in 297 out of 370 test utterances, an accuracy of 80.27%. I used a network that operated on 29 time steps. Because most of the vocal utterances were considerably shorter (as short as 11 time steps), I padded them out to 29 steps using the average value of each parameter; this explains the relatively poor performance.
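The padding step can be sketched as follows. This is a minimal Python illustration, not the original Forth code. It assumes that "the average value of each parameter" means the mean over the utterance's own time steps, which is only one possible reading; the function name `pad_with_mean` is hypothetical.

```python
def pad_with_mean(utterance, target_steps=29):
    """Pad an utterance (a list of time steps, each a list of parameter
    values) to target_steps by appending the per-parameter mean.

    Assumption: the mean is taken over this utterance's own time steps.
    """
    if not utterance:
        raise ValueError("utterance must contain at least one time step")
    n_steps = len(utterance)
    n_params = len(utterance[0])
    means = [sum(step[i] for step in utterance) / n_steps
             for i in range(n_params)]
    # Copy the existing steps (truncating if already too long).
    padded = [list(step) for step in utterance[:target_steps]]
    while len(padded) < target_steps:
        padded.append(list(means))
    return padded


# A made-up utterance with 2 time steps and 2 parameters, padded to 4 steps.
short = [[1.0, 2.0], [3.0, 4.0]]
print(pad_with_mean(short, target_steps=4))
```

Every padded step carries the same values, so a short utterance ends up with a long constant tail that says nothing about the speaker.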

I have also had the opportunity to benefit from the feedback of Yoshua Bengio, a neural networks researcher at the University of Montreal. He points out that my basic paradigm is a variant of a nearest neighbour classifier, and believes that it would be subject to the same kind of difficulty that plagues other classifiers, namely the “dimensionality curse”: as the number of parameters increases, the performance decreases. While my network achieves results comparable to those of other classifiers without pre-processing, I have not explored its potential on high-dimensional datasets with pre-processing.
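For readers unfamiliar with the term, a generic nearest neighbour classifier using Hamming distance can be sketched as below. This is a textbook-style Python illustration of the family of classifiers mentioned above, not my ANN itself; the class name, method names, and sample records are all hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


class NearestNeighbourClassifier:
    """Generic 1-nearest-neighbour classifier (illustrative only)."""

    def __init__(self):
        self.examples = []

    def train(self, records):
        # "Learning" is a single pass that stores each labelled record.
        for features, label in records:
            self.examples.append((tuple(features), label))

    def classify(self, features):
        if not self.examples:
            raise ValueError("classifier has not been trained")
        # Return the label of the stored record closest to the query.
        nearest = min(self.examples,
                      key=lambda ex: hamming_distance(ex[0], features))
        return nearest[1]


# Made-up records with 3 parameters each (not real biopsy data).
clf = NearestNeighbourClassifier()
clf.train([((1, 1, 1), "benign"), ((9, 9, 9), "malignant")])
print(clf.classify((1, 1, 2)))
```

In such a classifier every parameter contributes equally to the distance, which gives an intuition for why performance can suffer as the number of parameters grows.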

The commercial potential of this paradigm, when applied to static (i.e., line-labelled, non-time-based) pattern classification, lies in the fact that it learns in a single pass through the training data and can easily be made into an integrated circuit, which would require only fractions of a nanosecond to perform pattern recognition. My short-term objective, therefore, is to find collaborators who can help with the hardware design and prototyping process for a programmable array implementation of the static pattern classifier.
