- The Bow Toolkit - A library of C code useful for writing statistical text analysis, language modeling and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document retrieval (arrow)
- AutoClass - Takes a database of cases described by a combination of real and discrete valued attributes, and automatically finds the natural classes in that data. It can be seen as a Naive Bayes classifier where the class node is hidden. [Free]
- WinMine Toolkit - Tools for learning dependency networks or Bayesian networks from data. [Free]
- Bayes Net Toolbox for Matlab - Supports several inference algorithms and learning algorithms. Allows simulation of static and dynamic networks, including HMMs, IOHMMs, and Kalman filters.
- FastMix - Generates Gaussian mixture models for large datasets using efficient EM clustering algorithms. [Free]
- Incremental Decision Tree Induction - An algorithm that incrementally constructs decision trees from labeled examples. [Free for individual research purposes]
- Weka 3 - Open Source Machine Learning Software in Java - Suite that implements decision trees and tables, rule learners, Naive Bayes, support vector machines, voted perceptrons, multi-layer perceptron. Meta schemes include bagging, stacking, and boosting. [Free under GPL]
- The NEITHER Theory Revision System - A propositional theory refinement system that modifies an incomplete or incorrect rule base to make it consistent with a set of input training examples. [Free]
- LNKnet Pattern Classification Software - A software package developed at MIT Lincoln Laboratory which integrates more than 20 neural network, statistical, and machine learning classification, clustering, and feature selection algorithms into a modular software package. [Public domain license]
- PRODIGY System - An architecture for planning and learning. [Free]
- HMMER - From Sean Eddy's lab: profile hidden Markov models for biological sequence analysis. A tool used to build HMMs from multiple alignments and calculate e-scores.
- Machine Learning Packages from the CMU Artificial Intelligence Repository - Links to FTP repositories including ACCEL, CLASSWEB, FOCL, FOIL, GOLEM, INDEX, MILES, MOBAL, OC1, Occamn, PEBLS, RWM.
- The CHILL Empirical Parser Acquisition System - A general approach to the problem of inducing natural language parsers. It uses an annotated corpus, and produces a parser by using ILP for inducing the rules that control the actions of a shift-reduce parser. [Free]
- Meta-MEME v2.0.1 - Software toolkit for building and using motif-based hidden Markov models of DNA and proteins - from the Univ. of California-San Diego.
- SUBDUE Knowledge Discovery in Structural Databases - The program discovers interesting and repetitive subgraphs in a labeled graph representation using the minimum description length principle. Applications to molecular biology. [Free]
- HMM and other statistical programs - An implementation of Hidden Markov Models and an application to part-of-speech tagging. Also available: multivariate hypothesis testing software for Gaussian data, and TRUEVIZ, a groundtruth/metadata editing and visualizing toolkit for OCR.
- Pfam - A large collection of multiple sequence alignments and trained hidden Markov models covering many common protein domains.
- Multiple EM for Motif Elicitation and Motif Alignment and Search Tool (MEME/MAST) - MEME System is a program for discovering motifs in groups of related DNA or protein sequences. MAST is a tool for searching biological sequence databases for sequences that contain one or more of a group of known motifs.
- Sequence Alignment and Modeling System (SAM) - A collection of tools for creating and using HMMs for biological sequences. Free license for academic and nonprofit use.
- MIX - Software for learning Mixture Distributions. Commercial license.
- Machine Learning Programs by Peter Clark - QM: Guiding inductive learning with a Qualitative Model. LPE: Lazy Partial Evaluation. CN2: Rule induction from examples. [Free]
- Statistical Decision Trees - A program for inducing Bayesian decision trees. Applications to speech. [Free]
- Software Packages for Graphical Models/Bayesian Networks - Directory of software tools for modeling graphs and Bayesian networks. Some have learning capabilities.
- Observable Operator Modeling Kit - Machine learning library for Observable Operator Models (OOMs), suitable for time-series and sequence data classification and prediction. OOMs are similar to, but more powerful than, HMMs. [C++, BSD license]
- GNU Hidden Markov Model Library - Hidden Markov Models software library from the Center of Applied Informatics, Cologne. Includes algorithms such as Viterbi, Baum-Welch, and Forward-Backward. [C, GPL license]
- Bayes++ - A library of C++ classes for Bayesian filtering. From the Australian Centre for Field Robotics. [C++, MIT license]
- libbpfl - Bayesian Probability Filtering Library - A general purpose library for Bayesian filtering. [C++, LGPL license]
- XELOPES Data Mining Library - Platform- and data-source-independent library for embedded data mining based on the CWM/OMG and other data mining standards. XELOPES-Java algorithms: SVMs, market basket analysis, sequence analysis, decision trees, cluster analysis, multidimensional group
- Experience-Based Language Acquisition - Computational model of human language acquisition written in Java; currently acquires a protolanguage of nouns and verbs based on visual perception
- N-gram Statistics Package (NSP) - Suite of Perl tools for counting and analyzing word n-grams in text; provides standard tests of association for identifying word n-grams in large corpora and allows users to implement other tests with minimal Perl knowledge.
- Pattern Recognition Application Programmer's Interface (PRAPI) - A C++ library for many pattern recognition tasks; main focus is on image analysis, but a general architecture and XML-based data interchange format allows it to be used for many other tasks as well.
- EM algorithm for Mixture models - Shotaro Akaho's implementation of the EM algorithm for modeling mixtures of Gaussians (Java, free). An extended version is available from the author.
- ANNI - Artificial Neural Network Investing - Commercial Securities Modeler that uses artificial neural networks and genetic algorithms for customizable prediction
- Tilburg Memory Based Learner (TIMBL) - A program implementing several memory-based learning techniques. These learners store a representation of the training set explicitly and classify new cases by extrapolation from the most similar stored cases. Free for educational or non-commercial resea
- Bayesian Essay Test Scoring System (BETSY) - A freeware Windows-based program that classifies text based on trained material. Designed for automated essay scoring, BETSY can be applied to any text classification task.
- An AI Learning System - A description of an AI system, with a demonstration program and Delphi source code.
- C4.5 and FOIL - Home page of R. Quinlan. FTP links to FOIL (inductive logic programming) and C4.5 (learning decision trees).
- TRON - A learning computer player for the light cycles game in Tron
- Classification Toolbox for MATLAB - A site by Elad Yom-Tov, co-author of the toolbox, that contains additions and updates to the toolbox, as well as a discussion board
- SNoW - A learning architecture specifically tailored for learning in very high-dimensional feature spaces. The current release uses sparse variations of Winnow, Perceptron, and Naive Bayes. Free for personal academic and research purposes.