-------------------------------------------------------

  Korean Tagging System : kts Tools

                         S.H.Lee

-------------------------------------------------------

---Corpus To Information---

  crps2info* : Shell Program 
	   Uses a Tagged Corpus to build the dictionary (total.dict) and the Connectability Matrix (TRANTBL).

	   Usage : crps2info [tagged CORPUS] [dictionary name] [TRANTBL]

     MKDctFrmCrps* : Shell Program that calls crps2info
              Make Dictionary From Corpus : builds the dictionary from a Tagged Corpus.
 			   
   	      Usage : MKDctFrmCrps CORPUS DICT

     MKCnFrmCrps* : Shell Program that calls crps2info
	      Make Connectability Matrix From Corpus 

	      Usage : MKCnFrmCrps CORPUS TRANTBL
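
A connectability matrix of this kind can be sketched as follows. This is only an illustration, not the code in mktrntbl.c: the function name `build_connectability`, the in-memory input (one list of tags per sentence), the alphabetical tag order, and the row = previous tag / column = next tag layout are all assumptions, and the example sequences are toy data.

```python
def build_connectability(tag_sequences):
    """Return (tags, rows); rows[i][j] == '1' iff tags[j] was seen right after tags[i].

    Hypothetical helper: the real TRANTBL row/column layout is not documented here.
    """
    # Collect the tag set and fix an (assumed) alphabetical ordering.
    tags = sorted({tag for seq in tag_sequences for tag in seq})
    index = {tag: i for i, tag in enumerate(tags)}

    # Start with an all-zero square matrix.
    matrix = [[0] * len(tags) for _ in tags]

    # Mark every adjacent tag pair observed in the corpus.
    for seq in tag_sequences:
        for prev, cur in zip(seq, seq[1:]):
            matrix[index[prev]][index[cur]] = 1

    # Render each row as a string of 0/1 characters, one row per tag.
    rows = ["".join(str(cell) for cell in row) for row in matrix]
    return tags, rows


# Toy input: two short tag sequences.
tags, rows = build_connectability([["a", "xn", "jx"], ["a", "jx"]])
for tag, row in zip(tags, rows):
    print(tag, row)
```

A pair that never occurs in the corpus stays 0, so the matrix can only reject transitions the corpus did not contain.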
  
---Learning(Training) Corpus To Information---

  lrncrps2info* : Shell Program
	  From the Training Corpus, obtains the information for
	  P(Tag_(i)|Tag_(i-1)).

          Usage : lrncrps2info Training-Corpus TRANFREQ

  MKMrgnFrmLCrps* : Shell Program that calls lrncrps2info
          (This is not used in KTS ver 0.9.)
	  Make Marginal Probability From Training-Corpus
				 
	  Usage : MKMrgnFrmLCrps Training-Corpus TAGFREQ

  MKBiFrmLCrps* : Shell Program that calls lrncrps2info
	  Make Bigram Probability From Training-Corpus

	  Usage : MKBiFrmLCrps Training-Corpus TRANFREQ
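
The bigram information P(Tag_(i)|Tag_(i-1)) can be sketched as below. This is a minimal illustration under assumptions, not the scripts themselves: `bigram_freq` and `bigram_prob` are hypothetical names, the input is one list of tags per sentence, and `INI` is used as a sentence-initial marker. The denominator is taken from the bigram counts themselves, whereas mkbiprob.c is described as using TAGFREQ.

```python
from collections import Counter


def bigram_freq(tag_sequences, start="INI"):
    """Count (Tag_(i-1), Tag_(i)) pairs; `start` stands in for the sentence-initial state."""
    counts = Counter()
    for seq in tag_sequences:
        prev = start
        for tag in seq:
            counts[(prev, tag)] += 1
            prev = tag
    return counts


def bigram_prob(counts):
    """Estimate P(Tag_(i) | Tag_(i-1)) as count(prev, cur) / sum of counts for prev."""
    totals = Counter()
    for (prev, _cur), n in counts.items():
        totals[prev] += n
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}


# Toy tag sequences, not taken from a real corpus.
freq = bigram_freq([["a", "xpv", "ef"], ["a", "xpv", "efp"], ["md", "xpv", "ef"]])
prob = bigram_prob(freq)

# Print the counts as "prev cur count" lines, sorted by tag pair.
for (prev, cur), n in sorted(freq.items()):
    print(prev, cur, n)
```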


---Irregular Conjugation---

  irr.d : `ㄷ' irregular
  irr.l : `르' irregular
  irr.h : `ㅎ' irregular
  irr.s : `ㅅ' irregular
  irr.L : `러' irregular
  irr.b : `ㅂ' irregular
  
  PutIrr.c : Program that attaches irregular-conjugation tags
  put_irr* : Shell Program called by crps2info
			   Attaches the irregular-conjugation tags.

---ETC---

  mkbiprob.c : Uses TAGFREQ to build TRANPROB.
			   Make-Bigram-Probability

  tag2in1.c : Converts a Tag Sequence into a Bigram Sequence.

  mktrntbl.c : Make Transition Table : builds TRANTBL.

  freq.c : Computes frequencies.

  gettagseq.c : Program that extracts the tags from CORPUS

  wordambi.c : Program that computes Prob(Tag|Word)

  int_sequence : 
  appenddict   : 
  splitdict    : 
  gethan       : AWK Program

  LEARNGCORPUS
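
The quantity Prob(Tag|Word) listed for wordambi.c can be sketched as a relative frequency over word-tag pairs. This is an illustration under assumptions, not the C program: `tag_given_word` is a hypothetical name, the input is an in-memory list of (word, tag) pairs, and `w1`, `w2` are placeholder words.

```python
from collections import Counter, defaultdict


def tag_given_word(pairs):
    """Estimate Prob(Tag | Word) = count(word, tag) / count(word) from (word, tag) pairs."""
    pair_counts = Counter(pairs)
    word_totals = Counter(word for word, _tag in pairs)

    probs = defaultdict(dict)
    for (word, tag), n in pair_counts.items():
        probs[word][tag] = n / word_totals[word]
    return dict(probs)


# Placeholder words with tags in the style of the dictionary listing.
probs = tag_given_word([("w1", "a"), ("w1", "a"), ("w1", "xn"), ("w2", "xpv")])
for word, dist in probs.items():
    print(word, dist)
```

A word seen with only one tag gets probability 1.0 for that tag; ambiguous words spread their mass over the tags they were seen with.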


=========================   Example Run   ============================

[csone]Tool> crps2info
This program getting informations from corpus
 1. Dictionary with probabilities , irr-codes
 2. Connetability Matrix based on corpus
Usage : corpus2info CORPUS DICT TRANTBL 
[csone]Tool> crps2info ../Corpus/Tagged/newTrn.Tag total.dict TRANTBL
=================================
 1. Making Dictionary with prob  
=================================
Getting all the words from CORPUS
DONE....
Sorting all the words
DONE....
Getting all the hanguel-words from Word-Tag Pairs
DONE....
Getting Frequencies of Word-Tag Pairs
DONE....
write (Tag , Word , Frequency)
DONE....
Splitting the Dictionary into Not-YeongEon and YeongEon
DONE....
Putting the irr-code into YeongEon-Dictionary
DONE....
Merging the Not-YeongEon and YeongEon with irr-code
DONE....
Now, The Dictionary with Probability is  total.dict
=====================
........End..........
=====================
=================================
 2. Making Connectability Matrix 
=================================
Making Tag-Sequence from CORPUS
DONE....
Getting Uniq Tag-Sequence
DONE....
Appending int-sequence to Uniq Tag-Sequence
DONE....
Making Transition Table from Uniq Tag-Sequence
DONE....
=====================
........End..........
=====================
[csone]Tool> more total.dict
a  3
a  4
a  9
a  3
a  1
a  36
a  1
a  2
a Ȥ 1
a  8
a ڱ 64
a  2
a  12

. .
. .
. .

. .

xn  89
xn ¥ 11
xn ° 9
xn  33
xpa  38 b
xpa  452
xpv  258
xpv Ű 27
xpv  1830
int  1
[csone]Tool> more TRANTBL
0111111011111111111111111111111111110111011010000100000
0000000010100010000011000000000010000000000000000000010
0110110010000000000000000000000000000000000000000000010
0000000011010011001001110100000111000000000000000000010
0010000010101010100101000100100011110111100010000000010

.        .          .                  . 
.        .          .                  .

.        .          .                  . 

0000000000000000000000000000000000000000011111011000100
0000000000000000000000000000000000000001001000000000010
0000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000011111011000000
[csone]Tool> trncrps2info ../Corpus/Tagged/newTrn.Tag TRANFREQ
=================================
 4. Making Bigram Probability    
=================================
Getting Tag-Sequences from LEARNING-CORPUS
DONE....
Making Bigrams from Tag-Sequences
DONE....
Sorting Bigrams
DONE....
Getting the Frequencies (Pair) Tag_(i-1) , Tag_(i)
DONE....
=====================
........End..........
=====================
[csone]Tool> more TRANFREQ 
INI a 535
INI ad 30
INI ajs 315
INI at 149
INI f 377
INI i 861
INI jcp 3
INI jx 1
INI m 41
INI md 350
INI mn 88

.   .  .
.   .  .

.   .  .


xpa exn 8
xpa xa 59
xpv ecq 91
xpv ecs 221
xpv ecx 340
xpv ef 186
xpv efp 605
xpv exm 609
xpv exn 58
xpv xa 5
[csone]Tool> 
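
The TRANFREQ listing above has three whitespace-separated columns, which appear to be Tag_(i-1), Tag_(i) and the pair frequency. A minimal sketch of reading such lines follows. `parse_tranfreq` is a hypothetical helper, not part of kts, and skipping lines that do not fit the three-column shape is an assumption.

```python
def parse_tranfreq(lines):
    """Map (previous tag, current tag) -> frequency from 'prev cur count' lines."""
    table = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines and anything that is not 'tag tag integer'.
        if len(fields) != 3 or not fields[2].isdigit():
            continue
        prev, cur, count = fields
        table[(prev, cur)] = int(count)
    return table


# A few lines copied from the listing, plus a blank and an elision line.
sample = ["INI a 535", "INI ad 30", "", ".   .  .", "xpv ef 186"]
table = parse_tranfreq(sample)
print(table)
```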



