AUTOMATIC RECOGNITION OF THE ESTONIAN STEM CHANGES

Evelin Kuusik

Introduction

This paper presents part of a project whose goal is to create a rule-based morphology model that takes language regularities into account as far as possible and does not depend on technical restrictions imposed by the computer. The project is based on the open language model, according to which all regular and productive phenomena of the natural language are represented by different types of rules, while exceptions are listed in small dictionaries called exception lists. This approach provides an opportunity to process regular words that are not listed in dictionaries, i.e. new derivatives, compound words, loanwords, etc.

From the point of view of its units, the morphological system can be regarded as consisting of the following levels: stem changes, formation of word forms (inflection), derivation, and compounding.

A rule-based model of morphology requires formal rules and corresponding exception lists for each level. Today formalised rules exist for three of these levels. The formation of word forms is described in the grammar part of the Concise Morphological Dictionary (= CMD) [6,7] and has also been implemented on computer [1,4]. The most productive derivation rules are listed in CMD. The Estonian spelling checker EEDI [2] contains the rules for compounding.

But there is as yet no comprehensive, statistically founded formal description of the Estonian stem changes that could be implemented on a computer. Stem changes are handled in dictionaries (ÕS, CMD) by listing all possible stem variants of each headword. CMD contains over 36,000 headwords, each of which has two stem variants on the average (the number of possible stem variants can vary strongly in Estonian: in some inflection types there are no stem variants at all, whereas in some others there can be as many as five different ones).

The present work attempts to fill this gap by presenting tools to create a formal description of the Estonian stem changes.

Types of the Estonian stem changes

Estonian morphology is characterised by the variation of all morphological units. In stem changes several kinds of variation can occur simultaneously. In some cases the stem remains unchanged, in some cases the change involves only phonetic quantity, but in most cases stem-end changes, stem-grade changes, or both take place. Figure 1 depicts the classification of the CMD vocabulary according to stem changeability.

Words with a similar phonological structure can undergo different stem changes (cf. asi \thing\ - asja - asja, susi \wolf\ - soe - sutt and musi \kiss\ - musi - musi).

Figure 1. Partition of the CMD vocabulary

There is no direct correspondence between a stem change and a particular inflection. The rules are similar for noun and verb morphology, as well as for derivation (cf. pank \bank\ - panga; hankima \to obtain\ - hangib; vanker \castle\ - vangerdama \to castle\).

The principal types of changes are the following:

  1. Stem-grade changes. A stem can occur either in a strong or a weak grade; the grades are differentiated first of all by phonetic quantity (2nd or 3rd degree of quantity), which may be accompanied by various sound changes involving the medial sounds. (Medial sounds begin from the first vowel of the main-stress syllable and end with the sound preceding the first vowel of the next syllable.) For instance, the members of the stem pair hõive-h'õive are distinguished only by their different phonetic quantity; in the case of the aabe-'aape pair the rewriting rule b → p is concurrent with the change of phonetic quantity.
  2. Stem-end changes. A stem can appear either as a lemmatic stem or an inflection stem; stem variants are differentiated by changes concerning the final sounds (e.g. aadel-aadli, jalg-jalga, sipelgas-sipelga).
  3. Secondary changes. Here belong certain substitutions triggered by the context resulting from the application of a preceding rule (rules of distribution). Phonologically conditioned changes are connected with certain stem-end or stem-grade changes. E.g. in the pair kimmel-kimli, mm is shortened because the stem-end change el → li produces the consonant cluster mml (mml → ml), which is not allowed by Estonian orthography. Morphonologically conditioned changes may appear as formatives are added to the stem. E.g. when a stem ending in -i takes a formative beginning with i, the replacement i → e occurs in the stem (voodi_id → voode_id). In the present approach to modelling stem changes, morphonologically conditioned changes need not be taken into account.

Our classification of the stem changes is summarized in Figure 2.


Figure 2. Classification of the stem changes

Automatic recognition of the stem changes

Learning string data

To attempt a systematic review of the stem changes we first need a detailed (incl. statistical) analysis of all registered stem changes. Such an analysis, in turn, requires a database containing the stem changes for each pair of a lemmatic stem and an inflection stem. The database can be created from CMD by adding a recognition system which, for each pair of stem variants, is able to decide what kind of stem change holds between them. The features considered by the recognition system are in some respects different from those used by human experts. For instance, the second and the third degree of phonetic quantity cannot always be differentiated by mere orthography unless the source text is specially tagged. Certain historical and traditional criteria that may be quite important for linguists can hardly be formalized at all.

Formally the recognition of stem changes can be reduced to a classification task with string pairs as the objects to classify and the possible rules of stem changes as the classes. The system has to create class descriptions from the available data: characters and their membership in certain sound classes. An important requirement for the classification system is linguistic correctness. This is why the technique of inductive supervised learning is best suited to our task.

Inductive supervised learning (learning from examples) is one of the main techniques in machine learning. Given a set of examples and counterexamples of a class the learning system induces a general class description that covers all of the positive examples and none of the counterexamples.

Most learning algorithms assume attribute-value pairs as input, with a fixed number of attributes and a known set of values for every attribute. In the case of data presented as strings, the class is often defined by a substring of varying length. The determinant role belongs to the main attribute (in our case, the character corresponding to the changing sound) and its direct environs, i.e. its context, which consists of complementary attributes. In most cases the width of the determining context is unknown at first, so the learning system has to deal with an undefined number of attributes.

In other words, inductive learning consists in the formulation of a description hypothesis, followed by an example-by-example adjustment process. The description is not accepted until it covers all of the positive examples and none of the negative ones.

The main specifying operation in the case of string data is the addition of an attribute, i.e. extending the context. As the length of the strings can vary a lot and in most cases the strings are relatively long, the learning direction towards expanding the context is preferable. The opposite approach, i.e. taking the whole string as context at first and specifying the class description by dropping redundant attributes, is much more complex (the complexity depends directly on the string length).

Depending on the domain the string alphabet can be divided into hierarchical classes. In this case, class descriptions can be generalized in terms of domain specific classes (sound classes in the present case).

The following algorithm is designed to find a class description for each of the seven formal stem changes:

  1. deletion of a single character;
  2. insertion of a single character;
  3. replacement of a single character by another single character;
  4. replacement of two characters by a single character;
  5. replacement of a single character by two characters;
  6. replacement of two characters by two characters;
  7. replacement of (final) suffixes.

The first six rules correspond to the stem-grade changes, while the last applies to the stem-end changes.
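To make these seven formal changes concrete, the following minimal Python sketch (not part of the original system; the function name and the way the changed segment is located are assumptions) classifies a stem variant pair by stripping the longest common prefix and suffix and inspecting what remains. It reproduces the formal classes only for pairs in which a single change is involved; pairs combining several changes, as discussed later, require the full recognition procedure.

    def formal_change(a: str, b: str) -> int:
        """Return the number (1-7) of the formal stem change relating a and b.
        Simplified illustration: the changed segment is located by stripping
        the longest common prefix and suffix, not by an expert-tagged position."""
        i = 0
        while i < len(a) and i < len(b) and a[i] == b[i]:
            i += 1                                     # skip the common prefix
        j = 0
        while (j < len(a) - i and j < len(b) - i
               and a[len(a) - 1 - j] == b[len(b) - 1 - j]):
            j += 1                                     # skip the common suffix
        da, db = a[i:len(a) - j], b[i:len(b) - j]      # the differing segments
        classes = {(1, 0): 1,   # deletion of a single character
                   (0, 1): 2,   # insertion of a single character
                   (1, 1): 3,   # one character replaced by one character
                   (2, 1): 4,   # two characters replaced by one
                   (1, 2): 5,   # one character replaced by two
                   (2, 2): 6}   # two characters replaced by two
        return classes.get((len(da), len(db)), 7)      # otherwise: suffix replacement

    # e.g. formal_change("viske", "vise") == 1 and formal_change("aabe", "aape") == 3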

Notation

The main alphabet:

Σ = {a, b, d, e, f, g, h, i, j, k, l, m, n, o, p, r, s, š, z, ž, t, u, v, õ, ä, ö, ü}

Alphabet of the sound classes:

Δ = {Z, W, C, Q, M, L, H}

The main alphabet is divided into the following subsets:

vowels={a,e,i,o,u,õ,ä,ö,ü}

consonants = {b, d, f, g, h, j, k, l, m, n, p, r, s, š, z, ž, t, v}

voiced_consonants = {j, l, m, n, r, z, ž, v}

voiceless_consonants ={b,d,f,g,h,k,p,s,š,t}

stops={b,d,g,k,p,t}

non_stops={f,h,s,š}

Z stands for a character from the main alphabet

W stands for a character from the vowels subset

C stands for a character from the consonants subset

L stands for a character from the voiced consonants subset

H stands for a character from the voiceless consonants subset

Q stands for a character from the stops subset

M stands for a character from the non-stops subset

In addition, two special characters are used:

# for the end of a word

ε for an empty string

The sound classes are organized hierarchically (Figure 3).

Subscripts denote the equality of characters: Q1Q1 marks two equal stops (pp,kk); W1W2 marks a diphthong (ae,ai, ...). The lack of subscripts denotes that equality is not essential (WW may stand for a long vowel aa as well as for a diphthong ai ).




Figure 3. Hierarchy of the sound classes
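As an illustration, the Get_Class mapping used below could be encoded as in the following Python sketch. The numbering of the hierarchy levels (1 = only the vowel/consonant distinction, 3 = the finest classes W, L, Q, M) is inferred from how the procedures below call Get_Class, and the treatment of unclassified characters is an assumption.

    VOWELS = set("aeiouõäöü")
    VOICED = set("jlmnrzžv")          # voiced consonants (L)
    STOPS = set("bdgkpt")             # stops (Q)
    NON_STOPS = set("fhsš")           # non-stops (M)

    def get_class(chars: str, level: int) -> str:
        """Map a string in the main alphabet into the sound class alphabet.
        Level 1 distinguishes only vowels/consonants (W/C);
        level 3 uses the finest classes (W, L, Q, M). '#' passes through."""
        out = []
        for ch in chars:
            if ch == "#":             # the word-end marker is kept as is
                out.append("#")
            elif ch in VOWELS:
                out.append("W")
            elif level == 1:
                out.append("C")
            elif ch in VOICED:
                out.append("L")
            elif ch in STOPS:
                out.append("Q")
            elif ch in NON_STOPS:
                out.append("M")
            else:
                out.append("C")       # fallback for unclassified characters
        return "".join(out)

    # e.g. get_class("kg", 3) == "QQ", get_class("kg", 1) == "CC", get_class("lae", 3) == "LWW"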

Phonological patterns

The seven formal stem changes described above make up the set of classes. As only one of them can be recognized at a time, it seems appropriate to join the individual class descriptions into a decision list covering the whole set of examples. The decision list is a sequence of if... then... else... clauses arranged according to the generality level of the conditions, while the last condition (in the present case the class description for the stem-end changes) is the constant true. In other words, if no stem-grade change is observed between two stem variants, a stem-end change holds between them.
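As a rough sketch of how such a decision list can be applied (Python; the names and the representation of class descriptions as predicates are assumptions made for illustration), the list is simply an ordered sequence of conditions with class 7 as the default:

    from typing import Callable, List, Tuple

    # A class description is modelled here simply as a predicate over a stem
    # variant pair; the decision list is an ordered sequence of
    # (description, class number) entries.
    Description = Callable[[str, str], bool]

    def classify(a: str, b: str, decision_list: List[Tuple[Description, int]]) -> int:
        for description, cls in decision_list:
            if description(a, b):      # the first matching description wins
                return cls
        return 7                       # default: stem-end change (class 7)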

For each stem-grade changing rule the system has to create a description differentiating the stem variant pairs it applies to from the rest of such pairs. The main attributes are the characters corresponding to the changing sound in the first string and the characters in the same positions of the second string. Consequently, the description can be represented as a conjunction of the conditions:

ai ∈ W & bi ∈ Q & ...

To make the class description more lucid the conjunction can be presented as a phonological pattern.

A phonological pattern is an aligned pair of strings in the sound class alphabet, corresponding to the phonological structure of the variable part of the stem variants. To establish the corresponding phonological pattern, the string characters standing for the sounds subject to change are separated and transformed from the main alphabet into the sound class alphabet. Phonological patterns are characterised by their cover extent, i.e. the number of string pairs to which the given pattern applies.
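For illustration, the derivation of the initial (non-extended) phonological pattern for a stem variant pair, given the position of the changing sound in the first variant, can be sketched as follows. The sketch reuses the get_class function given above and mirrors the Init procedure presented later; the 0-based indexing and the omission of the subscript marking of equal characters (Mark_Equal_Char) are simplifications.

    def initial_pattern(a: str, b: str, z: int):
        """Build the non-extended phonological pattern for stem variants a, b,
        where z is the 0-based position of the changing sound in a (positions
        are 1-based in the paper). Returns (pattern_a, pattern_b, s)."""
        a, b = a + "#", b + "#"                     # make the word end visible
        if z > 0 and (a[z] == a[z - 1] or b[z] == b[z - 1]):
            pa = get_class(a[z - 1] + a[z], 3)      # geminate: include the previous character
            pb = get_class(b[z - 1] + b[z], 3)
            s = 2                                   # changing sound is the 2nd pattern element
        else:
            pa = get_class(a[z], 3)
            pb = get_class(b[z], 3)
            s = 1
        return pa, pb, s

    # e.g. initial_pattern("lagi", "lae", 2) == ("Q", "W", 1)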

Unfortunately a pattern derived from the variable sounds alone need not suffice to establish the actual stem change correctly. E.g. such pairs as

  varre    modernne
  vars     modernse

yield the same pattern

  L1L1
  L1H

but are actually subject to different rules (rr → rs, ne → se). Therefore the pattern has to be extended. How wide the context must be in order to separate stem pairs belonging to different classes becomes evident only while compiling the description.

An extended phonological pattern (hereafter: pattern) is a phonological pattern extended to cover also the immediate environment (context) of the characters standing for the changing sounds; it is represented by a triple

⟨a, b, s⟩

where a and b are strings in the sound class alphabet and s denotes the position of the character corresponding to a changing sound in the pattern (in string a).

A discriminative pattern is a phonological pattern (either extended or not) which does not cover any string pair belonging to the counterexamples.

Complement patterns are phonological patterns that differ in a single element, where the differing elements are two immediate offspring of one and the same node of the sound class hierarchy, e.g.

WW WW

WQ WM

Training set

An inductive learning system needs a domain expert to predefine the possible classes and to provide each class with examples, i.e. objects belonging to this class. Usually counterexamples, i.e. objects which do not belong to the class, are supplied too.

In the present work the positive examples are the stem variant pairs subject to the current stem-changing rule. The counterexamples are the positive examples of all the remaining classes.

For the selection of examples it is important to guarantee that all the different sounds that can participate in a certain type of change are represented. Otherwise the resulting description may turn out inadequate. If, for instance, the positive examples for the rule loss of one character include the pairs uba-oa, lagi-lae, pida-pea, but miss out käsi-käe, the system will conclude that the main class-defining attribute is just a stop.

To avoid such accidental gaps, a stem variant pair was included in the example set for each different pseudosuffix (the final part of the stem). The example set was then submitted to the domain expert, who provided each example with a class tag (Figure 4).


3 GES# - KSA# TÕRGES - TÕRKSA


6 NIS# - DSA# SÜNNIS - SÜNDSA

6 NER# - DRE# MANNER - MANDRE

6 NER# - DRI# MANNER - MANDRI

3 BUS# - PSA# HÕLBUS - HÕLPSA

Figure 4. Tagged examples (tag number corresponds to the serial number in the list of the formal stem changes described above)

Learning algorithm

The initial description hypothesis is a disjunction of phonological patterns (not extended). This divides the sets of positive and negative examples into subsets conforming to different patterns. Discriminative patterns (if any) are added to the description and the corresponding subsets are removed from the sets of examples.

A phonological pattern can be extended in two directions: left, towards the beginning of the word, and right, towards the end of the word. The direction in which the discriminative patterns have the higher cover extent is preferred. If the cover extents are equal, the left extension is chosen, as according to domain-specific heuristics the left context (the one enfolding the medial sounds) is more informative.
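The choice between the two directions can be sketched in Python as follows; the representation of patterns as hashable values, the examples_by_pattern mapping and the function name are assumptions made for illustration only.

    def select_direction(left_pos: set, left_neg: set,
                         right_pos: set, right_neg: set,
                         examples_by_pattern: dict):
        """Pick the preferred extension direction. *_pos / *_neg are the sets of
        extended patterns obtained from positive / negative examples;
        examples_by_pattern maps a pattern to the examples it covers."""
        disc_left = left_pos - left_neg            # discriminative patterns, left extension
        disc_right = right_pos - right_neg         # discriminative patterns, right extension
        cover = lambda ps: sum(len(examples_by_pattern.get(p, ())) for p in ps)
        if cover(disc_left) >= cover(disc_right):  # ties favour the left (medial) context
            return "left", left_pos, left_neg, disc_left
        return "right", right_pos, right_neg, disc_right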

Given:

- A set of examples N = {n1 ... nn}, in which each example is a triple nj = (a, b, z), where a and b are strings in the main alphabet (the stem variants) and z is the position of the main attribute in string a (the position of the character corresponding to the changing sound in the first stem variant). For each rule the set of examples falls into two disjoint subsets of positive and negative examples (N+ and N-, respectively):

  N = N+ ∪ N-  &  N+ ∩ N- = ∅

- A function Get_Class, which maps strings over the alphabet Σ into strings over the alphabet Δ according to the given hierarchy level t:

  Get_Class: (a ∈ Σ*, t) → Δ*

- Sets of patterns P+ = {p1 ... pn} and P- = {p1 ... pm}, acquired from the sets of positive and negative examples.

- The set C of discriminative patterns (the class description), which is searched for under the condition P+ ∩ P- = ∅.

The following notation is used:

N(p)   the subset of examples conforming to pattern p
p(n)   the pattern holding for example n
a(n), b(n), z(n)   the components of example n
a(p), b(p), s(p)   the components a, b and s of pattern p
x[i]   the i-th character of string x
·      concatenation of strings

STLearn
  Init(P+, P-)
  C = P+ \ P-
  Update(N+, N-, P+, P-)
  While P+ ∩ P- ≠ ∅
    Get_left_context(Pl+, Pl-)
    Get_right_context(Pr+, Pr-)
    Select_successor(P+, P-, C')
    C = C ∪ C'
    Update(N+, N-, P+, P-)
  EndWhile
  Optimize(C)

The procedure Init sets up the initial class descriptions, taking into account only the characters corresponding to the changing sound. The initial patterns divide the sets of examples N+ and N- into subsets N(p) corresponding to concrete patterns.

Init(P+, P-)
  foreach n ∈ N+
    if a(n)[z(n)] = a(n)[z(n)-1] ∨ b(n)[z(n)] = b(n)[z(n)-1]
      a(p(n)) = Get_Class(a(n)[z(n)-1] · a(n)[z(n)], 3)
      b(p(n)) = Get_Class(b(n)[z(n)-1] · b(n)[z(n)], 3)
      s(p) = 2
    else
      a(p(n)) = Get_Class(a(n)[z(n)], 3)
      b(p(n)) = Get_Class(b(n)[z(n)], 3)
      s(p) = 1
    Mark_Equal_Char(p)
    if p ∉ P+
      P+ = P+ ∪ {p}
    N(p) = N(p) ∪ {n}
  foreach n ∈ N-
    if a(n)[z(n)] = a(n)[z(n)-1] ∨ b(n)[z(n)] = b(n)[z(n)-1]
      a(p(n)) = Get_Class(a(n)[z(n)-1] · a(n)[z(n)], 3)
      b(p(n)) = Get_Class(b(n)[z(n)-1] · b(n)[z(n)], 3)
      s(p) = 2
    else
      a(p(n)) = Get_Class(a(n)[z(n)], 3)
      b(p(n)) = Get_Class(b(n)[z(n)], 3)
      s(p) = 1
    Mark_Equal_Char(p)
    if p ∉ P-
      P- = P- ∪ {p}
    N(p) = N(p) ∪ {n}

The procedure Update refreshes the sets of examples. Discriminative patterns are added to the class description and examples corresponding to them are removed from the training set.

Update(N+, N-, P+, P-)
  foreach p ∈ P+
    if p ∉ (P+ ∩ P-)
      P+ = P+ \ {p}
  foreach p ∈ P-
    if p ∉ (P+ ∩ P-)
      P- = P- \ {p}

Procedures Get_Right_Context and Get_Left_Context extend the patterns by adding the right and the left context to them, respectively.

Get_right_context(Pr+, Pr-)
  foreach n ∈ N+
    a(p(n)) = a(p(n)) · Get_Class(a(n)[z(n) + |a(p(n))| + 1 - s(p)], 1)
    b(p(n)) = b(p(n)) · Get_Class(b(n)[z(n) + |b(p(n))| + 1 - s(p)], 1)
    Mark_Equal_Char(p)
    if p ∉ Pr+
      Pr+ = Pr+ ∪ {p}
    N(p) = N(p) ∪ {n}
  foreach n ∈ N-
    a(p(n)) = a(p(n)) · Get_Class(a(n)[z(n) + |a(p(n))| + 1 - s(p)], 1)
    b(p(n)) = b(p(n)) · Get_Class(b(n)[z(n) + |b(p(n))| + 1 - s(p)], 1)
    Mark_Equal_Char(p)
    if p ∉ Pr-
      Pr- = Pr- ∪ {p}
    N(p) = N(p) ∪ {n}

Get_left_context(Pl+, Pl-)
  foreach n ∈ N+
    a(p(n)) = Get_Class(a(n)[z(n) - s(p)], 1) · a(p(n))
    b(p(n)) = Get_Class(b(n)[z(n) - s(p)], 1) · b(p(n))
    s(p) = s(p) + 1
    Mark_Equal_Char(p)
    if p ∉ Pl+
      Pl+ = Pl+ ∪ {p}
    N(p) = N(p) ∪ {n}
  foreach n ∈ N-
    a(p(n)) = Get_Class(a(n)[z(n) - s(p)], 1) · a(p(n))
    b(p(n)) = Get_Class(b(n)[z(n) - s(p)], 1) · b(p(n))
    s(p) = s(p) + 1
    Mark_Equal_Char(p)
    if p ∉ Pl-
      Pl- = Pl- ∪ {p}
    N(p) = N(p) ∪ {n}

The next procedure selects the better of the two extensions.

Select_successor(P+, P-, C')
  C'l = Pl+ \ Pl-
  C'r = Pr+ \ Pr-
  if |C'l| = |C'r|
    P+ = Pl+
    P- = Pl-
    C' = C'l
  else
    sl = Σ |N(p)| over all p ∈ C'l   (total cover extent of the left extension)
    sr = Σ |N(p)| over all p ∈ C'r   (total cover extent of the right extension)
    if sl > sr
      P+ = Pl+
      P- = Pl-
      C' = C'l
    else
      P+ = Pr+
      P- = Pr-
      C' = C'r

The last procedure generalises the final class description by replacing the complement patterns with a more general common one. The predicate Compl_Pat (pi , pk ) is true, if its arguments are complement patterns.


Optimize(C)
  for i = 1 to |C| - 1
    for k = i + 1 to |C|
      if Compl_Pat(pi, pk) then
        pi = Gen_Pat(pi, pk)
        C = C \ {pk}

In addition, two auxiliary procedures are used:

Mark_Equal_Char supplies an index to those pattern elements that correspond to one and the same character.

Gen_Pat replaces complement patterns with an appropriate general one.
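A possible Python encoding of these two auxiliary operations is sketched below; the dictionary encoding of the hierarchy of Figure 3 and the pair representation of patterns are assumptions, and the subscript marking of equal characters is ignored.

    # parent of each sound class in the hierarchy of Figure 3 (assumed encoding)
    PARENT = {"W": "Z", "C": "Z", "L": "C", "H": "C", "Q": "H", "M": "H"}

    def compl_pat(p, q) -> bool:
        """True if patterns p and q (each a pair of equally long strings in the
        sound class alphabet) differ in exactly one position and the differing
        symbols are immediate offspring of the same node of the hierarchy."""
        (pa, pb), (qa, qb) = p, q
        if len(pa) != len(qa) or len(pb) != len(qb):
            return False
        diffs = [(x, y) for x, y in zip(pa + pb, qa + qb) if x != y]
        return len(diffs) == 1 and PARENT.get(diffs[0][0]) == PARENT.get(diffs[0][1])

    def gen_pat(p, q):
        """Replace the single differing element by its parent class."""
        (pa, pb), (qa, qb) = p, q
        merge = lambda s, t: "".join(x if x == y else PARENT[x] for x, y in zip(s, t))
        return merge(pa, qa), merge(pb, qb)

    # e.g. gen_pat(("WWQ", "WW#"), ("WWM", "WW#")) == ("WWH", "WW#")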

The STLearn algorithm has been implemented on an IBM 486 computer using the Delphi software development environment. The worst-case complexity of the algorithm is O(max|ni| · |N| + |C|²).

Compiling the sample class description

Let us consider compiling the class description for the first class, deletion of a single character (Figure 5). For the sake of clarity this example uses only small subsets of positive and negative examples.

After the first step there is a single discriminative pattern (boldfaced in Figure 5):

Q

W

The pattern is added to the description under compilation and all examples covered by this pattern are left out of further consideration. The set of negative examples is cleansed of all examples covered by pattern

Q1

Q2

and work is continued only with those examples that are covered by patterns common to the positive and negative examples. Subsequently the description is modified by extending the context to the left and to the right by one character (columns 3 and 4). The cover extents of the extended patterns are equal. According to domain specific heuristics the left-hand extension is chosen and the pattern

WM

WW

is added to the class description. The next step again results in two different partitions (columns 5 and 6) in which discriminative patterns arise only in the left-hand extension. Patterns

WWQ    WWM

WW#    WW#

are added to the class description. As no more patterns common to the positive and negative examples are discovered, the algorithm stops. As the last two patterns are complements, they are replaced by the corresponding generalization:

WWH

WW#

The resulting class description for the first class is as follows:

if
  Q   or  WM   or  WWH
  W       WW       WW#
then class = 1

1 example              2 initial    3 extension   4 extension   5 extension   6 extension
                         pattern      to right      to left       to right      to left

positive examples
viske# - vise#         Q / W
lagi# - lae#           Q / W
käsi# - käe#           M / W        MW / W#       WM / WW
raag# - rao#           Q / #        Q# / #        WQ / W#       WQ# / W#      WWQ / WW#
paas# - pae#           M / #        M# / #        WM / W#       WM# / W#      WWM / WW#

negative examples
aabe# - aape#          Q1 / Q2
armsa# - armas#        M / W        MW / WC       CM / CW
aetud# - aetu#         Q / #        Q# / #        WQ / W#       WQ# / W#      CWQ / CW#
sipelgas# - sipelga#   M / #        M# / #        WM / W#       WM# / W#      CWM / CW#

(Each cell shows the pattern as upper string / lower string.)

Figure 5. Compiling a sample class description

The examples of the remaining classes are processed in the same way. The class descriptions so obtained are ordered according to their level of generality (descriptions consisting of narrower patterns are more general) to form a decision list.

The decision list:

if
  L1L1  or  L1Q   or  L1L1  or  L1M#
  L1Q       L1L1      L1M#      L1L1
then class = 6

else if
  H1#   or  C1W   or  Q1W
  H1H1      C1C1      Q2Q2
then class = 5

else if
  H1H1  or  C1C1  or  Q1Q1
  H1#       C1W       Q2W
then class = 4

else if
  Q1Z  or  QZ  or  LZ  or  H#  or  LW  or  CQ1W  or  CQ2#  or  WWQ1#  or  WWQ2W
  Q2Z      LZ      QZ      LW      H#      CQ2#      CQ1W      WWQ2W      WWQ1#
then class = 3

else if
  Q   or  WM  or  WWH
  W       WW      WW#
then class = 1

else if
  W   or  WW  or  WW#
  Q       WM      WWH
then class = 2

else
  class = 7

Identification of the stem changes

Morphological and phonological phenomena are usually described by rewriting rules [3]. To make the present work adhere to this tradition the stem changing rules are represented in the following way:

Rj : a → b / Cl _ Cr

This means that string a is to be replaced by (rewritten as) string b whenever it is preceded by Cl ( the left context) and followed by Cr (the right context). If the context consists of an empty string, it means that the context is not relevant to trigger the rule in question. If a is equal to an empty string, then the rewriting operation is reduced to one of insertion, while the same for string b means deletion.
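A rule of this form can be applied with a sketch like the following (Python); the tuple representation of a rule and the function name are assumptions, the empty string plays the role of ε, and the rule is applied at the first matching position only.

    def apply_rule(word: str, rule) -> str:
        """Apply a rewriting rule alpha -> beta / left _ right to the first
        position of word where it matches. '#' marks the word end; an empty
        alpha means insertion, an empty beta deletion, an empty context is ignored."""
        alpha, beta, left, right = rule
        w = word + "#"                              # make the end of the word visible
        for i in range(len(w) + 1):
            if w[i:i + len(alpha)] != alpha:
                continue
            if left and w[max(0, i - len(left)):i] != left:
                continue
            if right and w[i + len(alpha):i + len(alpha) + len(right)] != right:
                continue
            w = w[:i] + beta + w[i + len(alpha):]   # rewrite the matched segment
            break
        return w.rstrip("#")

    # the rules of the ütle-öel example discussed below:
    # apply_rule("ütle", ("t", "", "ü", "l")) == "üle"    (t -> ε / ü_l)
    # apply_rule("üle", ("le", "el", "", "#")) == "üel"   (le -> el / _#)
    # apply_rule("üel", ("ü", "ö", "", "e")) == "öel"     (ü -> ö / _e)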

As has been mentioned above, stem variants can be connected by more than one stem change. E.g. the stem variants sepp \smith\ - sepa are subject to two rules: a stem-grade change (pp → p) and a stem-end change.

Therefore a decision list should be cyclically referred to until all stem changes are identified. From the viewpoint of the modelling of the stem changing system of a natural language more information can be derived from rule sets holding for stem variant pairs than from single rules. In our algorithm a decision list compiled by STLearn is used as a shell for the identification of such rule sets.

Each established rule is immediately applied to the first string and the algorithm continues with an intermediate word form (that may not actually exist in the real language) until the first string becomes equal to the second. Domain theory says that in Estonian only one stem-end and/or stem-grade change can appear between a pair of stem variants.

As the string pair is parsed from left to right during rule recognition, a stem-grade change is usually detected before a stem-end change. If the rule set contains both types of rules, the contexts usually need some correction, because certain secondary changes in the medial sounds can take place only after stem-end changes have been applied. For this purpose the rules are applied once more in the right order (stem-end change first) and the contexts are corrected.

Recognise-rules-algorithm

Let R be the rule set sought, a and b the stem variants, and r the current rule.

  1. Search for a stem-grade or stem-end change;
  2. Form the corresponding rule r:

2.1. add rule r to the rule set R;

2.2. apply rule r to string a;

  3. If a = b, then stop; otherwise search for secondary changes;
  4. If a secondary change is observed:

4.1. add the corresponding rule to the set R;

4.2. apply it to string a;

  5. If a = b, then stop; otherwise go to step 1;
  6. If |R| > 1, then update the contexts.
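This loop can be sketched in Python as follows, reusing the apply_rule sketch given earlier; find_primary_change and find_secondary_change stand for the decision-list based searches of steps 1 and 3 and are assumptions for illustration, not the original implementation.

    def recognise_rules(a: str, b: str, find_primary_change, find_secondary_change):
        """Collect the rules relating stem variants a and b. Both helpers take
        (a, b) and return a rule (alpha, beta, left, right) or None."""
        rules = []
        while a != b:
            rule = find_primary_change(a, b)         # steps 1-2: primary change
            if rule is None:
                break                                # no rule found: give up on this pair
            rules.append(rule)
            a = apply_rule(a, rule)                  # continue with the intermediate form
            if a == b:
                break                                # steps 3/5: the variants now coincide
            secondary = find_secondary_change(a, b)  # steps 3-4: secondary change
            if secondary is not None:
                rules.append(secondary)
                a = apply_rule(a, secondary)
        # step 6: with more than one rule the contexts would still need updating
        return rules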

For the sample pair ütle-öel \to say\ the algorithm works as follows:

At the first step the deletion of a single character (t → ε) is observed. The corresponding stem change rule t → ε / ü_l is formed and added to the rule set R. The rule is applied to the string ütle and work is continued with the intermediate stem pair *üle-öel. As the two strings are not equal, a secondary change is searched for. As no corresponding phonological pattern is observed, the algorithm returns to the first step and searches again for a primary change. Now the stem-end rule le → el / _# is observed, after the application of which work is continued with the pair *üel-öel. At the next step a secondary change, the lowering of the vowel (ü → ö / _e), is detected. After the application of this secondary rule the strings become equal and the algorithm stops. As the acquired set of rules contains more than one rule, the contexts are updated. The resulting rule set for the pair of stem variants ütle-öel is the following:

R = {
  R1 : le → el / _#
  R2 : t → ε / ü_e
  R3 : ü → ö / _e ; R2
}

This way the rule sets for each stem variant pair are compiled. In addition, each rule is provided with its frequency of occurrence.

Conclusion

The aim of the current work is to create tools for the automatic recognition of the Estonian stem changes. The main problem consists in bringing together the formal classification features available to the computer (letters and their membership in certain sound classes) and a classification based partly on unformalizable traditional and language-historical criteria. E.g. in the string pair haudu-hau (d → ε) an intervocalic d is deleted, whereas the string pair kareda-kare is subject to the transformation da → ε.

The test results of the algorithms described in the present work show that it is possible to classify the stem changes by orthographic features and at the same time remain correct in the linguistic sense.

The class descriptions were formed using 540 training examples. The recognition algorithm was tested on 56,120 stem variant pairs, and not a single linguistically incorrectly classified pair was observed. (From the initial test material 22 exceptional words had been removed by a domain expert.)

Analysis of the rule sets shows that a major part of the Estonian stem variants are related only by a stem-end change, or by a concurrent stem-grade and stem-end change (Figure 6).


Stem-grade change   Stem-end change   Secondary change   Frequency of occurrence (in CMD, %)
      +                   -                  -                  2.66
      -                   +                  -                 83.82
      +                   +                  -                 13.10
      +                   -                  +                  0.14
      -                   +                  +                  0.25
      +                   +                  +                  0.03

Figure 6. Combination of the stem changes

Further work should generalize the acquired rule sets into a formal grammar in terms of sound classes and elicit the corresponding exception lists. The resulting description should conform to the open language model.

References

  1. I. Hein. Practical realisation of the morphological analysis. In: Automatic Morphology of Estonian 1. Research Report. Tallinn 1994: 29-36.
  2. H. Kaalep. Õigekirjakorrektorist EEDI. Eesti keel ja tekstitöötlus. Arvutimaailm 1993, 5: 26-27.
  3. R. M. Kaplan, M. Kay. Regular models of phonological rule systems. Computational Linguistics, Vol. 20 (1994): 331-379.
  4. E. Kuusik. Morphological Synthesis of Estonian Based on the Agglutination Strategy. In: Automatic Morphology of Estonian 1. Research Report. Tallinn 1994: 36-49.
  5. R. S. Michalski. A Theory and Methodology of Learning from Examples. In: R. S. Michalski, J. G. Carbonell, T. M. Mitchell (eds.), Machine Learning: An Artificial Intelligence Approach, pp. 83-134. Springer-Verlag 1984.
  6. Ü. Viks. Väike vormisõnastik I: Sissejuhatus & grammatika. Tallinn 1992.
  7. Ü. Viks. Väike vormisõnastik II: Sõnastik & lisad. Tallinn 1992.
  8. Ü. Viks. Eesti keele klassifikatoorne morfoloogia. Dissertationes Philologiae Estonicae Universitatis Tartuensis 1. Tartu 1994.
  9. Ü. Viks. A morphological analyzer for the Estonian language: the possibilities and impossibilities of automatic analysis. In: Automatic Morphology of Estonian 1. Tallinn 1994: 7-28.