SOMine Lite 2.1

survey performed by Juha Ikonen, June 29th 1998

SOMine Lite is a data mining tool for applications of Self-Organising Maps (or Kohonen networks) in fields such as marketing, finance, industry, economics and science. Its features support analysis of non-linear dependencies, parameter-free clustering, association and prediction, non-linear regression, pattern recognition and animated monitoring of system states.

The program implements a variant of the SOM: Kohonen's Batch-SOM combined with a scaling technique that is not described in the documentation we had at hand at the time of the survey. Since the resulting maps are in good agreement with maps created with the basic SOM, the employed algorithm appears to be correct.
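For readers unfamiliar with the batch formulation, the following is a minimal sketch of the plain textbook Batch-SOM, not of the SOMine implementation. It assumes a rectangular grid (SOMine uses a hexagonal one), a fixed bubble neighbourhood and initialisation from data samples, and it leaves out the undocumented scaling/growing technique entirely. The names `bmu` and `batch_som` and all parameters are hypothetical, chosen for illustration only.

```python
import random


def bmu(codebook, x):
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(codebook[i], x)),
    )


def batch_som(data, rows, cols, epochs=10, radius=1, seed=0):
    """Textbook batch SOM on a rectangular grid with a bubble neighbourhood.

    Assumption: this is NOT the SOMine algorithm, only the basic batch rule.
    """
    rng = random.Random(seed)
    # Initialise every node with a randomly drawn data sample.
    codebook = [list(rng.choice(data)) for _ in range(rows * cols)]
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    dim = len(data[0])

    for _ in range(epochs):
        # Step 1: assign each sample to its best matching unit (BMU).
        hits = [[] for _ in codebook]
        for x in data:
            hits[bmu(codebook, x)].append(x)

        # Step 2: each node becomes the mean of all samples whose BMU
        # lies within the bubble radius of that node on the grid.
        new_codebook = []
        for i, (ri, ci) in enumerate(coords):
            pool = [
                x
                for j, (rj, cj) in enumerate(coords)
                if max(abs(ri - rj), abs(ci - cj)) <= radius
                for x in hits[j]
            ]
            if pool:
                new_codebook.append(
                    [sum(x[d] for x in pool) / len(pool) for d in range(dim)]
                )
            else:
                # No samples in the neighbourhood: keep the old vector.
                new_codebook.append(codebook[i])
        codebook = new_codebook
    return codebook
```

Unlike the basic (sequential) SOM, the batch rule has no learning-rate parameter: every epoch replaces each node by a neighbourhood mean, which is consistent with the learning rate being listed as not applicable in the feature table of this survey.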

SOMine Lite succeeds in hiding complex technology from the user: knowledge of neural networks or the SOM algorithm is not needed. Together with the easy-to-use graphical user interface, this makes the program a good practical tool for visualising complex data.

Disclaimer: If any information on this page simply is not true, please tell us about it and we'll correct it ASAP.

Disclaimer: The opinions and observations herein should be considered the personal views of the person who performed the survey, at the time of the survey. They do not reflect any official position of his employer, the Laboratory of Computer and Information Science or the Neural Networks Research Center.

General

Program name	Viscovery SOMine Lite 2.1
Availability	Commercial; a demo version is available from the website at http://www.eudaptics.com/. Pricing: single license $1,495, non-commercial license $695. Company information: Eudaptics Software GmbH, Hauptstrasse 99, A-4232 Hagenberg, Austria. Tel: +43 7236 3343 388. Fax: +43 7236 3769.
Purpose	A practical tool for advanced analysis and monitoring of numerical data sets.
Operating system	Windows 95, Windows NT 4.0
User interface	Graphical user interface. Good in general; good regarding the SOM.
Documentation	Online help. Good regarding user interface and program usage; mediocre in technical/scientific details.

[General comments]

SOM features

map parameters
Teaching algorithm	Batch-SOM with growing map. Implementation seems correct when compared to maps created with the basic SOM.
Map size	Two-dimensional map grid with a minimum of 9 and a maximum of 20,000 nodes.
Map lattice and shape	Nodes are arranged on a hexagonal grid; the map shape is rectangular. The ratio of the two axes can be set by the user, or the software can derive it automatically from the data set.
Neighborhood function	Function type: bubble (probably)	Neighborhood size (h): N/A	Learning rate (alpha): N/A
Initialization	Data sample
Distance function	Euclidean (probably)
Unknown components	Allowed
Teaching length	Explicit, depends on selected training schedule
efficiency
Speed [Windows NT 4.0, 200 MHz Pentium MMX, 128 MB RAM]	For 3000 samples of 13-dimensional data, 13 epochs: 1 minute 18 seconds with default training settings.
Results	Normal results. Final average quantization error 0.02232
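As an illustration of the "Unknown components" and "Results" rows above, the sketch below computes an average quantization error while skipping unknown components in the distance. It assumes the common convention that a missing value (here `None`) is simply left out of the Euclidean distance; the survey does not say how SOMine treats unknown components or how it normalises the reported error, so the figure 0.02232 cannot be reproduced from this. The function names are hypothetical.

```python
import math


def masked_distance(w, x):
    """Euclidean distance over the components of x that are known (not None)."""
    return math.sqrt(
        sum((wi - xi) ** 2 for wi, xi in zip(w, x) if xi is not None)
    )


def average_quantization_error(codebook, data):
    """Mean distance from each sample to its best matching unit."""
    return sum(
        min(masked_distance(w, x) for w in codebook) for x in data
    ) / len(data)
```

A sample with an unknown component is matched on its known components only, so it can still be assigned to a best matching unit.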

Usability

preprocessing
Input formats	Text files, Microsoft Excel 5.0/95 files, Windows Clipboard
Data handling	The program provides good features for data handling. A histogram of a selected component can be viewed. With a logarithmic or sigmoid transformation the user can influence the density characteristics of a component's distribution. Each component of the data set is scaled separately; two components can be linked to apply the same scaling factor. There are two scaling alternatives: scaling by variance and scaling by range. Components can be weighted by a priority factor.
Data selection	Data can be selected by amplifying or suppressing certain ranges of component values. A component can also be omitted from the training process by setting its priority factor to zero.
postprocessing
Output formats	Map graphics can be saved in Windows Metafile or bitmap formats. Selected nodes can be saved in text format or copied to the clipboard. A selected path among map nodes can be saved in text format or copied to the clipboard.
Map measures	Both views and numerical values are available for quantization error, frequency and map curvature.
Labelling	Advanced labelling: labels can be inserted by typing, by importing from an external text file or pasted from clipboard.
Clustering	Automatic clustering. User can set cluster threshold value and minimum cluster size in nodes. A clustering significance view helps in finding proper parameters for clustering. Clusters are visualised by shading and/or by separating lines.
visualization
Inspection of neurons	Advanced: component values, frequency, quantization error and curvature measures can be inspected. Also statistical figures are provided for a cluster, neighbourhood of a node and a range of nodes. K nearest neighbours can be viewed for different values of K.
Clusters/map shape	U-matrix and clusters. Contours of similarity between adjacent nodes can be viewed by shading and/or separating lines. Map curvature can be viewed.
Correlations	By visualisation: component planes can be viewed in separate windows.
Data projections	An external data set can be evaluated statistically with respect to a map clustering. The results are stored in a text file; there are no visual methods for inspecting them. The process monitoring feature plots a trajectory of the best matching units (BMUs) of data vectors from an external data set. An external data set can also be presented to a map, and the resulting set of BMU vector values is written to a text file.
Markers	Labels
The teaching interface is user-friendly. During training, a graph shows the quantization error and the normalised distortion of the process.
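The two scaling alternatives and the priority factor listed under "Data handling" can be sketched as follows. This is an assumption-laden illustration, not SOMine's actual preprocessing: it assumes that values are centred on the mean, that "scaling by variance" divides by the population standard deviation, that "scaling by range" divides by max minus min, and that the priority factor is a plain multiplier applied afterwards. The function name `scale_component` and its parameters are hypothetical.

```python
import math


def scale_component(values, method="variance", priority=1.0):
    """Scale one data component, then weight it by a priority factor.

    Assumption: centring and the exact divisors are illustrative choices,
    not taken from the SOMine documentation.
    """
    n = len(values)
    mean = sum(values) / n
    if method == "variance":
        # Population standard deviation of the component.
        spread = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    elif method == "range":
        spread = max(values) - min(values)
    else:
        raise ValueError(f"unknown method: {method}")
    if spread == 0:
        spread = 1.0  # constant component: avoid division by zero
    return [priority * (v - mean) / spread for v in values]
```

Under these assumptions a priority factor of zero maps every value of the component to zero, so the component contributes nothing to a Euclidean distance. That matches the behaviour described under "Data selection", where a zero priority omits a component from training.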

http://www.cis.hut.fi/projects/somtoolbox/links/somine.shtml
somtlbx@mail.cis.hut.fi
Monday, 09-Oct-2000 12:53:09 EEST