Artificial Immune Systems


The human immune system is an adaptive system that can learn and has a memory. Its principles are emulated in Artificial Immune Systems (AIS).

Artificial Immune Systems for English Grammar Checking

We have built an artificial immune system for English that detects language errors. We used part of the TDT2 English Text Corpus as the self, from which we obtained the "training data".

Grammar Checker
  • GrammarLearner. Class containing tools for reading and checking input arguments. Depending on the input argument, it starts either data preparation or the learning of English grammar.
  • LearningEngine. Class which controls the process of learning English grammar. It also loads the tagset and the grammar model needed for the learning process.
  • SequenceGenerator. Class which generates two-, three- and four-tag candidate sequences according to the specified probabilities.
  • SearchEngine. Class which searches the language corpus for the given candidate tag sequence.
  • Memory. Class employed as a storage device. It evaluates the candidate tag sequences according to their overlaps with already accepted detectors, stores the results and writes them to the output file. Optionally, it loads the results from a specified file.
  • Heap. Class which selects the most suitable detector for deriving new candidate sequences.
  • SequenceDerivator. Class which derives new candidate sequences from a given detector according to the rules of the grammar model.

The corpus data are sentences like this:

The mystery surrounding Pol Pot deepened Sunday after Thailand’s foreign minister claimed that the Khmer Rouge leader had fled Cambodia. Earlier , Chinese diplomats here denied allegations he had been granted asylum in China .

The corresponding tagged output of the MontyLingua parser is:

The/DT mystery/NN surrounding/VBG Pol/NNP Pot/NNP deepened/VBD Sunday/NNP after/IN Thailand/NNP ’s/POS foreign/JJ minister/NN claimed/VBD that/IN the/DT Khmer/NNP Rouge/NNP leader/NN had/VBD fled/VBN Cambodia/NNP ./.
– monty took 0.03 seconds. –
Earlier/RBR ,/, Chinese/JJ diplomats/NNS here/RB denied/VBN allegations/NNS he/PRP had/VBD been/VBN granted/VBN asylum/NN in/IN China/NNP ./.
– monty took 0.01 seconds. –

The extraction of the metalanguage results in the following sequences:

DT NN VBG NNP NNP VBD NNP IN NNP POS JJ NN VBD IN DT NNP NNP NN VBD VBN NNP . RBR , JJ NNS RB VBN NNS PRP VBD VBN VBN NN IN NNP .
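The extraction step can be sketched in a few lines of Python. This is only an illustration, not the project's own code: the function name extract_tags and the handling of the parser's timing lines are assumptions made for this example.

```python
import re

def extract_tags(tagged_text):
    """Return the POS tags from word/TAG tokens, ignoring parser status lines."""
    tags = []
    for line in tagged_text.splitlines():
        line = line.strip()
        # Skip empty lines and status lines such as "monty took 0.03 seconds."
        if not line or "monty took" in line:
            continue
        # Rejoin tokens that were split after the slash (e.g. "word/ TAG").
        line = re.sub(r"/\s+", "/", line)
        for token in line.split():
            # The tag is whatever follows the last slash in the token.
            _word, sep, tag = token.rpartition("/")
            if sep:
                tags.append(tag)
    return tags

sample = """Earlier/RBR ,/, Chinese/JJ diplomats/NNS here/RB denied/VBN
– monty took 0.01 seconds. –"""
print(" ".join(extract_tags(sample)))  # RBR , JJ NNS RB VBN
```

Splitting on the last slash keeps punctuation tokens such as ,/, and ./. intact, so they yield the tags "," and "." respectively.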

Finally, the encoding converts these sequences into byte sequences:

3 12 29 14 14 28 14 6 14 17 7 12 28 6 3 14 14 12 28 30 14 0 21 0 7 13 20 30 13 18 28 30 30 12 6 14 0
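A minimal sketch of the encoding step follows. The code table below is an assumption: it is inferred solely from the example on this page (note that both "." and "," appear to map to 0) and covers only the tags that occur in it; the full tagset of the system is not reproduced here. The names TAG_CODES and encode are hypothetical.

```python
# Partial tag-to-byte table, inferred from the example above (an assumption,
# not the system's complete tagset).
TAG_CODES = {
    ".": 0, ",": 0,
    "DT": 3, "IN": 6, "JJ": 7,
    "NN": 12, "NNS": 13, "NNP": 14,
    "POS": 17, "PRP": 18,
    "RB": 20, "RBR": 21,
    "VBD": 28, "VBG": 29, "VBN": 30,
}

def encode(tags):
    """Convert a list of POS tags into a bytes object, one byte per tag."""
    # A tag missing from the table raises KeyError rather than being guessed.
    return bytes(TAG_CODES[tag] for tag in tags)

second_sentence = "RBR , JJ NNS RB VBN NNS PRP VBD VBN VBN NN IN NNP .".split()
print(list(encode(second_sentence)))
# [21, 0, 7, 13, 20, 30, 13, 18, 28, 30, 30, 12, 6, 14, 0]
```

Storing one tag per byte keeps the corpus compact and lets a search for a candidate tag sequence be carried out as a plain byte-substring search.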

Further, we used the negative selection algorithm. The term "negative selection" indicates that this procedure does not primarily describe the language itself but what lies outside it: given a limited set of structural elements of the language, we search for those configurations of these elements that do not constitute any part of the language. If we consider the space of all configurations, the language can be represented as a bounded subspace of this space. In the terms of artificial immune systems, grammar errors are treated as pathogens and detectors as antibodies.
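The idea of negative selection can be illustrated with a toy sketch. It is a simplification under stated assumptions: candidates are enumerated exhaustively over a tiny tagset instead of being generated by probability and derived from a grammar model, and a candidate counts as matching the self only if it occurs verbatim in the corpus. All names (ngrams, build_self, negative_selection, find_errors) are hypothetical.

```python
from itertools import product

def ngrams(seq, n):
    """All contiguous tag sequences of length n in seq."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def build_self(corpus, n):
    """The self: every length-n tag sequence observed in the corpus."""
    self_set = set()
    for sentence in corpus:
        self_set.update(ngrams(sentence, n))
    return self_set

def negative_selection(tagset, self_set, n):
    """Keep each length-n candidate that does NOT occur in the self."""
    return {cand for cand in product(tagset, repeat=n) if cand not in self_set}

def find_errors(sentence, detectors, n):
    """Positions and tag sequences in a sentence that a detector matches."""
    return [(i, g) for i, g in enumerate(ngrams(sentence, n)) if g in detectors]

# Toy corpus of tag sequences standing in for the self.
corpus = [
    ["DT", "NN", "VBD", "."],
    ["DT", "JJ", "NN", "VBD", "."],
]
tagset = ["DT", "JJ", "NN", "VBD", "."]

self_set = build_self(corpus, 2)
detectors = negative_selection(tagset, self_set, 2)

print(find_errors(["DT", "NN", "VBD", "."], detectors, 2))  # []
print(find_errors(["DT", "VBD", "NN", "."], detectors, 2))
```

With a corpus this small, many perfectly grammatical sequences would also be flagged, which is why the self has to be drawn from a large corpus before the detectors become useful.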

