SOMine Lite 2.1
survey performed by Juha Ikonen, June 29th 1998
SOMine Lite is a data mining tool that applies Self-Organising Maps
(or Kohonen Networks) in fields such as marketing, finance,
industry, economics and science. Its features
support analysis of non-linear dependencies, parameter-free
clustering, association and prediction, non-linear regression, pattern
recognition and animated monitoring of system states.
The program implements a variant of the SOM, Kohonen's Batch-SOM,
with a scaling technique that is not described in the documentation we
had at hand at the time of the survey. Since the resulting maps are in
good agreement with maps created with the basic SOM, the employed
algorithm appears to be correct.
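For readers unfamiliar with the batch variant, the following is a minimal sketch of a generic Batch-SOM, not of SOMine's implementation. It assumes a plain rectangular grid (SOMine uses a hexagonal one), a bubble neighbourhood, initialisation from data samples, and a linearly shrinking radius; the growing map and the undocumented scaling technique are not reproduced. The names `bmu_index`, `batch_som_epoch` and `train` are illustrative only.

```python
import math
import random

def bmu_index(codebook, x):
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(codebook[i], x)))

def batch_som_epoch(codebook, grid, data, radius):
    """One batch epoch: each node becomes the mean of the samples whose
    best matching unit lies within `radius` of that node on the grid."""
    dim = len(data[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    for x in data:
        c = bmu_index(codebook, x)
        for i, pos in enumerate(grid):
            # Bubble neighbourhood: weight 1 inside the radius, 0 outside.
            if math.dist(pos, grid[c]) <= radius:
                counts[i] += 1
                for k in range(dim):
                    sums[i][k] += x[k]
    # Nodes that received no samples keep their previous vector.
    return [[s / counts[i] for s in sums[i]] if counts[i] else list(codebook[i])
            for i in range(len(codebook))]

def train(data, rows, cols, epochs, seed=0):
    """Train a rows x cols map; returns (grid positions, codebook vectors)."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    codebook = [list(rng.choice(data)) for _ in grid]  # initialise from data samples
    for e in range(epochs):
        # Assumed schedule: radius shrinks linearly from half the map side to 1.
        radius = max(1.0, (max(rows, cols) / 2) * (1 - e / epochs))
        codebook = batch_som_epoch(codebook, grid, data, radius)
    return grid, codebook
```

Unlike the basic (sequential) SOM, the batch update has no learning rate: the whole data set is presented once per epoch and each node is replaced by a neighbourhood average.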
SOMine Lite succeeds in hiding complex technology from the user:
knowledge of neural networks or the SOM algorithm is not
needed. Together with the easy-to-use graphical user interface, this makes
the program a good practical tool for visualising complex data.
Disclaimer: If any information on this page simply is not
true, please tell us about it and we'll correct it ASAP.
Disclaimer: The opinions and observations herein should be
considered the personal views of the person who performed the survey, at
the time of the survey. They do not reflect any official standing of
his employer, of the Laboratory of Computer and Information Science or
of the Neural Networks Research Center.
General
Program name |
Viscovery SOMine Lite 2.1 |
Availability |
Commercial; a demo version is available from the website at http://www.eudaptics.com/
Pricing: Single license $1,495, non-commercial license $695.
Company information:
Eudaptics Software GmbH
Hauptstrasse 99
A-4232 Hagenberg, Austria
Tel: +43 7236 3343 388
Fax: +43 7236 3769 |
Purpose |
A practical tool for advanced analysis and monitoring of numerical data sets. |
Operating system |
Windows 95, Windows NT 4.0 |
User interface |
Graphical user interface
Good in general
Good regarding the SOM |
Documentation |
Online help
Good regarding user interface and program usage
Mediocre in technical/scientific details |
[General comments]
SOM features
map parameters |
Teaching algorithm |
Batch-SOM with growing map.
Implementation seems correct when compared to maps created with the basic SOM. |
Map size |
Two-dimensional map grid with a minimum of 9 and a maximum of 20000 nodes. |
Map lattice and shape |
Nodes are arranged on a hexagonal grid, map shape is rectangular.
The ratio of the two axes can be set by the user, or the software can derive it automatically
from the data set. |
Neighborhood function |
Function type: bubble (probably) |
Neighborhood size (h): N/A |
Learning rate (alpha): N/A |
Initialization |
Data sample |
Distance function |
Euclidean (probably) |
Unknown components |
Allowed |
Teaching length |
Explicit, depends on selected training schedule |
efficiency |
Speed [Windows NT 4.0, 200 MHz Pentium MMX, 128 MB RAM] |
For 3000 samples of 13-dimension data, 13 epochs:
1 minute 18 seconds with default training settings. |
Results |
Normal results.
Final average quantization error 0.02232 |
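As a rough illustration of how a figure such as the average quantization error above can be obtained, the sketch below computes the mean distance between each sample and its best matching unit. It assumes the Euclidean distance (which the survey lists only as probable) and handles unknown components by skipping them, with `None` marking a missing value. Whether SOMine averages distances or squared distances is not documented, so this is one plausible definition rather than the program's own; all function names are illustrative.

```python
import math

def distance(x, w):
    """Euclidean distance over the components of x that are known (not None)."""
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w) if xi is not None))

def best_matching_unit(codebook, x):
    """Index of the node whose vector is closest to sample x."""
    return min(range(len(codebook)), key=lambda i: distance(x, codebook[i]))

def average_quantization_error(codebook, data):
    """Mean distance between each sample and its best matching unit."""
    total = sum(distance(x, codebook[best_matching_unit(codebook, x)]) for x in data)
    return total / len(data)
```

For example, with `codebook = [[0.0, 0.0], [3.0, 4.0]]`, the sample `[3.0, None]` is matched to the second node because only its first component takes part in the comparison.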
Usability
preprocessing |
Input formats |
Text files, Microsoft Excel 5.0/95 files, Windows Clipboard |
Data handling |
Program provides good features for data handling:
- Histogram of a selected component can be viewed
- With a logarithmic or sigmoid transformation the user can influence the density
characteristics of a component's distribution.
- Each component of the data set is scaled separately, but two components can be linked to
apply the same scaling factor. There are two alternatives: scaling by variance and scaling
by range.
- Components can be weighted by a priority factor.
|
Data selection |
Data can be selected by amplifying or suppressing certain ranges of component
values. A component can also be omitted from the training process by setting its
priority factor to zero. |
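The two scaling alternatives and the priority factor described under Data handling can be sketched as follows. The exact formulas SOMine uses are not given in the documentation surveyed, so this assumes the common conventions: scaling by variance divides a component by its standard deviation, scaling by range divides it by (max - min), and the priority factor is a plain multiplier applied afterwards. The function names are hypothetical.

```python
import statistics

def scale_by_variance(values):
    """Divide by the population standard deviation so the variance becomes 1."""
    sd = statistics.pstdev(values)
    return [v / sd for v in values] if sd > 0 else list(values)

def scale_by_range(values):
    """Divide by (max - min) so the component spans an interval of length 1."""
    span = max(values) - min(values)
    return [v / span for v in values] if span > 0 else list(values)

def apply_priority(values, priority):
    """Weight a scaled component; a priority of 0 removes its influence."""
    return [v * priority for v in values]
```

Under these assumptions a priority factor of zero makes every value of the component zero, so it contributes nothing to a Euclidean distance, which is consistent with the component being omitted from training.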
postprocessing |
Output formats |
- Map graphics can be saved in Windows Metafile or bitmap formats.
- Selected nodes can be saved in text format or copied to clipboard.
- Selected path among map nodes can be saved in text format or copied to clipboard.
|
Map measures |
Both views and numerical values are available for quantization error, frequency and map curvature. |
Labelling |
Advanced labelling: labels can be inserted by typing, by importing from an external
text file or pasted from clipboard. |
Clustering |
Automatic clustering. User can set cluster threshold value and minimum cluster size in
nodes. A clustering significance view helps in finding proper parameters for clustering.
Clusters are visualised by shading and/or by separating lines. |
visualization |
Inspection of neurons |
Advanced: component values, frequency, quantization error and curvature measures can
be inspected. Also statistical figures are provided for a cluster, neighbourhood of a node
and a range of nodes. K nearest neighbours can be viewed for different values of K. |
Clusters/map shape |
U-matrix and clusters. Contours of similarity between adjacent nodes can be viewed by
shading and/or separating lines. Map curvature can be viewed. |
Correlations |
By visualisation: component planes can be viewed in separate windows. |
Data projections |
- An external data set can be evaluated statistically with
respect to a map clustering. The results are stored in a text
file; there are no visual methods for inspecting the
results.
- Process monitoring feature plots a trajectory of best
matching units (BMU) of data vectors from an external data
set.
- An external data set can be presented to a map, and the
resulting set of BMU vector values is written to a text file.
|
Markers |
Labels |
The teaching interface is
user-friendly. During training, a graph shows the quantization error
and the normalised distortion of the process. |
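The process monitoring feature listed under Data projections plots a trajectory of best matching units. A minimal sketch of how such a trajectory can be derived from a trained map is given below; it assumes a squared Euclidean distance and a codebook stored as a list of vectors with a parallel list of grid positions. The function name and data layout are illustrative and are not taken from SOMine.

```python
def bmu_trajectory(codebook, grid, samples):
    """Grid positions of the best matching unit for each sample, in order."""
    path = []
    for x in samples:
        # Find the node whose vector is closest to the current sample.
        c = min(range(len(codebook)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))
        path.append(grid[c])
    return path
```

Plotting the returned positions in sequence over the map gives the kind of path a monitoring view would draw as the system state changes over time.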
http://www.cis.hut.fi/projects/somtoolbox/links/somine.shtml
somtlbx@mail.cis.hut.fi
Monday, 09-Oct-2000 12:53:09 EEST
|