forked from GiellaLT-Archive/clean_lang_history
sms.diff
623,624c623,625
< [Template merge - langs/und] Corrected copy-paste bug in the build steps for areal grammar checker analysers. The bug caused SMJ to fail. 2020-04-17T06:36:43+00:00
< [Template merge - langs/und] Fixed bug with multiple declarations of EXTRA_DIST and noinst_DATA in the previous template merge. 2020-04-17T06:16:14+00:00
---
> This experiment is now over, and the dir removed. Confirmed by Jack, who started it. 2022-10-30T12:34:17+01:00
> [Template merge - langs/und] Corrected copy-paste bug in the build steps for areal grammar checker analysers. The bug caused SMJ to fail. 2020-04-17T06:33:14+00:00
> [Template merge - langs/und] Fixed bug with multiple declarations of EXTRA_DIST and noinst_DATA in the previous template merge. 2020-04-17T06:11:19+00:00
625a627
> [Template merge - langs/und] Preparations for moving the phonology files inside morphology/ (later to be renamed fst/). 2020-04-17T05:53:53+00:00
629,632c631,634
< [Template merge - langs/und] Reorganised mt/apertium make files so that fixed content is in Makefile.am, and userj-editable content is in Makefile.modifications.am. 2020-04-07T13:54:30+00:00
< [Template merge - langs/und] Started splitting the local Makefile.am in two, by moving it to a new filename, and then create a new Makefile.am that just includes the moved one. In later commmits, some of the content can be moved from one file to the other. 2020-04-06T11:57:59+00:00
< [Template merge - langs/und] Fixed the remaining cases of improved upper-lower case configurable processing. Removed a variable from configure.ac with comments, turned out it wasn't needed. 2020-04-05T11:19:59+00:00
< [Template merge - langs/und] First step in fixing default case handling: downcasing of derived proper nouns can now be turned off for the standard fst's by changing a test in configure.ac. 2020-04-03T13:08:27+00:00
---
> [Template merge - langs/und] Reorganised mt/apertium make files so that fixed content is in Makefile.am, and userj-editable content is in Makefile.modifications.am. 2020-04-07T12:58:36+00:00
> [Template merge - langs/und] Started splitting the local Makefile.am in two, by moving it to a new filename, and then create a new Makefile.am that just includes the moved one. In later commmits, some of the content can be moved from one file to the other. 2020-04-06T11:56:36+00:00
> [Template merge - langs/und] Fixed the remaining cases of improved upper-lower case configurable processing. Removed a variable from configure.ac with comments, turned out it wasn't needed. 2020-04-05T11:19:22+00:00
> [Template merge - langs/und] First step in fixing default case handling: downcasing of derived proper nouns can now be turned off for the standard fst's by changing a test in configure.ac. 2020-04-03T12:47:37+00:00
634,635c636,637
< [Template merge - langs/und] Fixed bug in phonology compilation when there are multiple phonology files: temporary files were deleted before being used due to name overlap. 2020-03-31T07:26:54+00:00
< [Template merge - langs/und] Added Automake variables to handle demanding or non-default uppercasing, or writing systems with no case distinction at all. 2020-03-30T13:47:05+00:00
---
> [Template merge - langs/und] Fixed bug in phonology compilation when there are multiple phonology files: temporary files were deleted before being used due to name overlap. 2020-03-31T07:24:28+00:00
> [Template merge - langs/und] Added Automake variables to handle demanding or non-default uppercasing, or writing systems with no case distinction at all. 2020-03-30T13:43:49+00:00
645a648
> [Template merge - langs/und] Adding |{➤}|{•} to pmscript. 2019-12-16T08:08:15+00:00
648a652
> [Template merge - langs/und] Added ‹ and › to the list of possible punctuation marks in the tokenisers. 2019-11-15T12:26:48+00:00
655,656c659,660
< [Template merge - langs/und] Replace UNDEFINED with __UNDEFINED__, so that text replacement can take place. 2019-10-24T14:20:07+00:00
< Updated ignore patterns. 2019-10-23T18:40:46+00:00
---
> [Template merge - langs/und] Added Makefile setting for enabling swaps in error models (ie ab -> ba). Default is no (as this used not to work, and the existing error models are based on this fact). 2019-11-06T16:53:03+00:00
> [Template merge - langs/und] Replace UNDEFINED with __UNDEFINED__, so that text replacement can take place. 2019-10-24T09:35:27+00:00
659,663c663,667
< [Template merge - langs/und] tools/mt/Makefile.am needs am-shared/lookup-include.am as well. 2019-10-22T09:18:16+00:00
< [Template merge - langs/und] Forgot to add cgbased to the SUBDIRS variable in tools/mt/Makefile.am. 2019-10-22T08:34:12+00:00
< [Template merge - langs/und] Added basic support for CG-based machine translation. Ongoing work. 2019-10-22T07:30:33+00:00
< [Template merge - langs/und] Make sure some jspwiki header files for generated documentation are included in the distro. 2019-10-16T06:13:47+00:00
< [Template merge - langs/und] Made it possible to disable Forrest validation when Forrest is installed. This reduces build time and annoying warnings for people not working on the documentation. Default is still to do Forrest validation. 2019-10-14T10:57:18+00:00
---
> [Template merge - langs/und] tools/mt/Makefile.am needs am-shared/lookup-include.am as well. 2019-10-22T09:17:05+00:00
> [Template merge - langs/und] Forgot to add cgbased to the SUBDIRS variable in tools/mt/Makefile.am. 2019-10-22T08:30:19+00:00
> [Template merge - langs/und] Added basic support for CG-based machine translation. Ongoing work. 2019-10-22T07:20:24+00:00
> [Template merge - langs/und] Make sure some jspwiki header files for generated documentation are included in the distro. 2019-10-16T06:08:49+00:00
> [Template merge - langs/und] Made it possible to disable Forrest validation when Forrest is installed. This reduces build time and annoying warnings for people not working on the documentation. Default is still to do Forrest validation. 2019-10-14T10:47:28+00:00
665d668
< [Template merge - langs/und] Wrapped command line tools in double quotes, to protect against spaces in pathnames. Spaces will occur when building on Windows using Windows Subsystem for Linux, as locations such as 'Program Files' are included in the default search path. 2019-10-10T09:44:31+00:00
666a670
> [Template merge - langs/und] Wrapped command line tools in double quotes, to protect against spaces in pathnames. Spaces will occur when building on Windows using Windows Subsystem for Linux, as locations such as 'Program Files' are included in the default search path. 2019-10-10T07:24:00+00:00
669,674c673
< foma-ignore 2019-10-08T09:57:00+00:00
< ign 2019-10-07T21:32:11+00:00
< ign 2019-10-07T21:15:15+00:00
< ign 2019-10-07T21:13:09+00:00
< Force unix line endings, to make sure it works ok also on the Windows subsystem for Linux. 2019-10-07T17:16:53+00:00
< [Template merge - langs/und] Improved build process for pattern hyphenators - now patgen config is done programmatically instead of interactively. The values are configured in the Makefile.am. 2019-10-02T22:19:52+00:00
---
> [Template merge - langs/und] Improved build process for pattern hyphenators - now patgen config is done programmatically instead of interactively. The values are configured in the Makefile.am. 2019-10-02T21:59:56+00:00
683c682
< [Template merge - langs/und] Added script for testing tag coverage, made by Kevin, and originally for sme. 2019-09-17T08:42:15+00:00
---
> [Template merge - langs/und] Added script for testing tag coverage, made by Kevin, and originally for sme. 2019-09-17T08:35:00+00:00
688c687
< [Template merge - langs/und] Added support for comments in error model text files. Added support for zipped but uncompressed files (required by divvunspell for now). 2019-09-05T04:07:21+00:00
---
> [Template merge - langs/und] Added support for comments in error model text files. Added support for zipped but uncompressed files (required by divvunspell for now). 2019-09-04T20:28:04+00:00
695c694
< [Template merge - langs/und] Added simple shell script to easily run the grammar checker test tool, and considering build directories etc. 2019-08-09T12:10:05+00:00
---
> [Template merge - langs/und] Added simple shell script to easily run the grammar checker test tool, and considering build directories etc. 2019-08-09T11:49:01+00:00
697c696
< [Template merge - langs/und] Generate and compile the new filter for removing semantic tags in front of derivations. Require new version of the giella-core because of dependencies. 2019-06-14T11:08:42+00:00
---
> [Template merge - langs/und] Generate and compile the new filter for removing semantic tags in front of derivations. Require new version of the giella-core because of dependencies. 2019-06-14T11:03:00+00:00
698a698
> [Template merge - langs/und] Make sure all generated files have a suffix that will make them be ignored. Added comments to clarify. 2019-06-14T07:45:34+00:00
700d699
< Updating svn ignores for tools/analysers/. 2019-06-14T06:38:51+00:00
703,704c702,703
< [Template merge - langs/und] Fixed stupid copy-paste error in the previous commit. Reorganised the code a bit to make a variable definition clearer and more logical. 2019-05-27T11:15:02+00:00
< [Template merge - langs/und] Make sure that the input to all variants of the mobile speller is weighted. 2019-05-27T07:18:59+00:00
---
> [Template merge - langs/und] Fixed stupid copy-paste error in the previous commit. Reorganised the code a bit to make a variable definition clearer and more logical. 2019-05-27T11:01:53+00:00
> [Template merge - langs/und] Make sure that the input to all variants of the mobile speller is weighted. 2019-05-27T07:13:11+00:00
706,708c705
< Updating svn ignores. 2019-05-24T09:55:04+00:00
< Updating svn ignores. 2019-05-24T09:44:55+00:00
< [Template merge - langs/und] Fixed fsttype mismatch error for filters when building mobile spellers, by building filters locally of the correct fst type, as we do for desktop spellers. 2019-05-24T09:23:42+00:00
---
> [Template merge - langs/und] Fixed fsttype mismatch error for filters when building mobile spellers, by building filters locally of the correct fst type, as we do for desktop spellers. 2019-05-24T09:11:52+00:00
735c732,733
< [Template merge - langs/und] Ensure that the correct grammar checker pipeline is the default one, so that it will be executed when no pipeline is specified. 2019-03-13T08:46:19+00:00
---
> [Template merge - langs/und] Added UpCase function to the tokenisers, to handle all-upper variants of the input side. It does almost double the size of the fst, but at least it is just one additional line of code. Also, it does only work in Linux/using glib (for other platforms it is restricted to Latin1 - still, that covers a major portion of the Sámi fst's and running text, so much better than nothing). 2019-03-22T14:30:22+00:00
> [Template merge - langs/und] Ensure that the correct grammar checker pipeline is the default one, so that it will be executed when no pipeline is specified. 2019-03-13T08:45:12+00:00
740c738,739
< [Template merge - langs/und] Changed sub-post tag for symbols from +ABBR to +Symbol. Needs to be declared as multichar in each language. 2019-02-27T13:33:17+00:00
---
> [Template merge - langs/und] Added the new multichar +Symbol to the multichar definitions. 2019-02-28T07:21:29+00:00
> [Template merge - langs/und] Changed sub-post tag for symbols from +ABBR to +Symbol. Needs to be declared as multichar in each language. 2019-02-27T13:28:21+00:00
742d740
< Updated svn ignores. 2019-02-27T10:18:02+00:00
744a743
> [Template merge - langs/und] Added support for shared Symbol file: build rules, affix file, modifications to root.lexc. Also increased required version of giella-common, to make sure that the shared stem file is actually there. 2019-02-26T08:52:43+00:00
746c745,746
< [Template merge - langs/und] Added support for building an analyser tool. This is in practice an xml-specified pipeline identical to what is used in the grammar checker, but where the pipeline does text analysis instead of grammar checking. Also made grammar checkers and mobile spellers part of the --enable-all-tools configuration. 2019-02-25T17:07:57+00:00
---
> [Template merge - langs/und] Added support for building an analyser tool. This is in practice an xml-specified pipeline identical to what is used in the grammar checker, but where the pipeline does text analysis instead of grammar checking. Also made grammar checkers and mobile spellers part of the --enable-all-tools configuration. 2019-02-25T15:37:11+00:00
> Update to today's status. 2019-02-25T13:00:48+00:00
748c748
< [Template merge - langs/und] Added filter to remove the +MWE tag from the grammar checker generator. It blocked generation of some word forms (and should not be visible in any case). 2019-02-13T07:47:37+00:00
---
> [Template merge - langs/und] Added filter to remove the +MWE tag from the grammar checker generator. It blocked generation of some word forms (and should not be visible in any case). 2019-02-13T07:44:29+00:00
754,756c754,756
< [Template merge - langs/und] Fixed another case of transducer format mismatch for hyphenators, this time regarding pattern-based hyph building. 2019-01-25T08:54:07+00:00
< [Template merge - langs/und] Corrected an instance of transducer format mismatch when building hyphenators. 2019-01-25T08:08:55+00:00
< [Template merge - langs/und] Make the mobile keyboard layout error model work properly (ie on input longer than one char) by circumfixing it with any-stars. 2019-01-17T19:30:10+00:00
---
> [Template merge - langs/und] Fixed another case of transducer format mismatch for hyphenators, this time regarding pattern-based hyph building. 2019-01-25T08:45:29+00:00
> [Template merge - langs/und] Corrected an instance of transducer format mismatch when building hyphenators. 2019-01-25T08:05:02+00:00
> [Template merge - langs/und] Make the mobile keyboard layout error model work properly (ie on input longer than one char) by circumfixing it with any-stars. 2019-01-17T19:08:20+00:00
758,763c758,761
< [Template merge - langs/und] First round of improved handling of compilation errors in shell pipes: instruct make to delete targets when some of the intermediate steps fail. 2019-01-11T13:53:26+00:00
< [Template merge - langs/und] Added configure.ac conditional to control whether spellers for alternative orthographies are built. The default is 'true'. Set this to 'false' for historical or other orthographies for which a speller is not relevant. 2019-01-09T10:41:17+00:00
< [Template merge - langs/und] Fix broken hfst builds of xfscript files when there is no final newline in the source file (caused the save command to be shaddowed by the final line of text, usually a comment, so no file was saved, and thus there was nothing to work on for the next build step). 2019-01-09T08:59:21+00:00
< [Template merge - langs/und] Apply alternate orthography conversion after hyphenation marks have been removed, but before the morphology marks are deleted. Especially word boundaries are useful for certain types of conversion, but other borders will likely be useful as well. The conversion scripts need to take the border marks into consideration. 2019-01-08T08:59:35+00:00
< Ignore compiled cg3 files in tools/tokenisers/. 2019-01-08T07:08:34+00:00
< Ignore more files, including files that are automatically added to svn when populating a new language. This is done to avoid them showing up as noise for external languages, in which case these files might not be in our svn (but in the external svn repo instead). 2019-01-08T06:55:51+00:00
---
> [Template merge - langs/und] First round of improved handling of compilation errors in shell pipes: instruct make to delete targets when some of the intermediate steps fail. 2019-01-11T12:34:50+00:00
> [Template merge - langs/und] Added configure.ac conditional to control whether spellers for alternative orthographies are built. The default is 'true'. Set this to 'false' for historical or other orthographies for which a speller is not relevant. 2019-01-09T10:34:50+00:00
> [Template merge - langs/und] Fix broken hfst builds of xfscript files when there is no final newline in the source file (caused the save command to be shaddowed by the final line of text, usually a comment, so no file was saved, and thus there was nothing to work on for the next build step). 2019-01-09T08:56:21+00:00
> [Template merge - langs/und] Apply alternate orthography conversion after hyphenation marks have been removed, but before the morphology marks are deleted. Especially word boundaries are useful for certain types of conversion, but other borders will likely be useful as well. The conversion scripts need to take the border marks into consideration. 2019-01-08T08:55:05+00:00
766,770c764,769
< [Template merge - langs/und] Improved Easter egg generation, using the improved script in giella-core. Increased the required giella-core version correspondingly. 2018-12-14T09:21:24+00:00
< [Template merge - langs/und] Cleaned the HFST_MINIMIZE_SPELLER macro, and also its use. No need to include push weights anymore, it is done always, for all speller fst's. 2018-12-13T10:22:14+00:00
< [Template merge - langs/und] Push weights for all final fst's, + optimise error model. 2018-12-13T09:57:44+00:00
< [Template merge - langs/und] Changed how the att file is produced. From now on it should be built once, and then added to svn. The att file will usually not change, and storing it in svn will avoid rebuilding it every time. Also changed the compression. 2018-12-12T14:55:54+00:00
< [Template merge - langs/und] Added support for adapting the error model to the mobile keyboard layout for the language in question. 2018-12-11T14:27:30+00:00
---
> [Template merge - langs/und] Replicate the desktop error model for the mobile speller, and generalise the corpus weighting compilation. Now the build code is ready for mobile speller release. 2018-12-17T17:31:13+00:00
> [Template merge - langs/und] Improved Easter egg generation, using the improved script in giella-core. Increased the required giella-core version correspondingly. 2018-12-14T09:11:01+00:00
> [Template merge - langs/und] Cleaned the HFST_MINIMIZE_SPELLER macro, and also its use. No need to include push weights anymore, it is done always, for all speller fst's. 2018-12-13T10:12:43+00:00
> [Template merge - langs/und] Push weights for all final fst's, + optimise error model. 2018-12-13T09:56:41+00:00
> [Template merge - langs/und] Changed how the att file is produced. From now on it should be built once, and then added to svn. The att file will usually not change, and storing it in svn will avoid rebuilding it every time. Also changed the compression. 2018-12-12T14:51:02+00:00
> [Template merge - langs/und] Added support for adapting the error model to the mobile keyboard layout for the language in question. 2018-12-11T14:17:07+00:00
782c781
< [Template merge - langs/und] Two more places to remove the Use/-GC and the MWE tags: mt and speller fst's. Now done. 2018-11-06T07:54:39+00:00
---
> [Template merge - langs/und] Two more places to remove the Use/-GC and the MWE tags: mt and speller fst's. Now done. 2018-11-06T07:48:42+00:00
786,788c785,787
< [Template merge - langs/und] Had forgotten to remove the Use/-GC tag in the core fst's, only from all the others. Now fixed. 2018-11-05T15:57:42+00:00
< [Template merge - langs/und] Step 2 in blocking dynamic compounds of MWE tagged entries: moved all MWE tag processing away from the *-raw-* targets to the specific *.tmp targets. This way the MWE tags will survive long enough to be available for the blocking done in the tokeniser fst's. Tested in SME, and seems to work as intended. 2018-11-05T09:10:48+00:00
< [Template merge - langs/und] Added step 1 in blocking dynamic comounds between an MWE and another noun: added new filter that will turn the MWE tag into a flag diacritic. Increased required giella-common version number due to the new filter. 2018-11-02T11:16:52+00:00
---
> [Template merge - langs/und] Had forgotten to remove the Use/-GC tag in the core fst's, only from all the others. Now fixed. 2018-11-05T15:51:27+00:00
> [Template merge - langs/und] Step 2 in blocking dynamic compounds of MWE tagged entries: moved all MWE tag processing away from the *-raw-* targets to the specific *.tmp targets. This way the MWE tags will survive long enough to be available for the blocking done in the tokeniser fst's. Tested in SME, and seems to work as intended. 2018-11-05T08:39:46+00:00
> [Template merge - langs/und] Added step 1 in blocking dynamic comounds between an MWE and another noun: added new filter that will turn the MWE tag into a flag diacritic. Increased required giella-common version number due to the new filter. 2018-11-02T11:16:04+00:00
802c801
< [Template merge - langs/und] Fixed bug when building the punctuation file - the required subdir was not made. 2018-10-24T08:39:39+00:00
---
> [Template merge - langs/und] Fixed bug when building the punctuation file - the required subdir was not made. 2018-10-24T08:32:02+00:00
821,822d819
< ignore for bin 2018-10-14T13:31:01+00:00
< added korp.cg3 to svn ignore. 2018-10-14T12:56:20+00:00
827,828c824,825
< [Template merge - langs/und] Moved the whitespace analyser almost to the beginning of the pipeline, directly after the tokeniser+analyser. This is to be able to support sentence boundary detection, as the whitespace analyser will give some valuable tags for that. 2018-10-12T14:07:22+00:00
< [Template merge - langs/und] Corrected typo in a configuration option - dekstop instead of desktop. Thanks to our friends in Nuuk for noticing. 2018-10-11T15:55:10+00:00
---
> [Template merge - langs/und] Moved the whitespace analyser almost to the beginning of the pipeline, directly after the tokeniser+analyser. This is to be able to support sentence boundary detection, as the whitespace analyser will give some valuable tags for that. 2018-10-12T14:05:29+00:00
> [Template merge - langs/und] Corrected typo in a configuration option - dekstop instead of desktop. Thanks to our friends in Nuuk for noticing. 2018-10-11T15:54:45+00:00
833,834c830,831
< [Template merge - langs/und] Moved whitespace tagging after the speller, to avoid that it creates trouble for the speller. That happens when whitespace error tags are applied to the word form that should be spell-checked. 2018-10-09T14:08:58+00:00
< [Template merge - langs/und] Made it possible to tag something as _only_ for the grammar checker, or _not_ for the grammar checker. Updated required giella-share version, due to new required filters. 2018-10-09T11:50:21+00:00
---
> [Template merge - langs/und] Moved whitespace tagging after the speller, to avoid that it creates trouble for the speller. That happens when whitespace error tags are applied to the word form that should be spell-checked. 2018-10-09T14:07:38+00:00
> [Template merge - langs/und] Made it possible to tag something as _only_ for the grammar checker, or _not_ for the grammar checker. Updated required giella-share version, due to new required filters. 2018-10-09T11:42:04+00:00
835a833
> [Template merge - langs/und] Moved whitespace chars to the blank regex, thereby reinstating the old compilation speed. Thanks to Kevin and Tino for noticing and suggesting the improvement. Also added comment to document what incondform is supposed to contain, again thanks to Kevin. 2018-10-09T10:01:28+00:00
836a835
> [Template merge - langs/und] Removed hyphen from the regular unknown alphabet, thereby reverting analysis of -foo as one (unknown) token, and instead back to two tokens. Added hyphen to alphamiddle, so that foo-bar will still be analysed as one big unknown token. 2018-10-09T08:51:54+00:00
841a841
> [Template merge - langs/und] Better handling of unknowns: defined more whitespace characters, defined a lot more vowels in the alphabet, added recent improvements to flag diacritic like symbols at token boundaries. 2018-10-08T17:21:50+00:00
846c846
< [Template merge - langs/und] Fixed two build bugs: abbr.txt was only autogenerated when building with hfst, and the url.?fst file was not properly generated from url.tmp.?fst. 2018-10-04T11:04:14+00:00
---
> [Template merge - langs/und] Fixed two build bugs: abbr.txt was only autogenerated when building with hfst, and the url.?fst file was not properly generated from url.tmp.?fst. 2018-10-04T10:58:49+00:00
848,849c848,849
< [Template merge - langs/und] Fixed bug in MT compilation - pattern rules are not used, but new filenames still had them due to copy-paste error. 2018-10-04T08:43:53+00:00
< [Template merge - langs/und] Added pmatch filtering also to MT and spellcheckers. Now all tools and fst's should be covered. 2018-10-04T07:59:17+00:00
---
> [Template merge - langs/und] Fixed bug in MT compilation - pattern rules are not used, but new filenames still had them due to copy-paste error. 2018-10-04T08:40:45+00:00
> [Template merge - langs/und] Added pmatch filtering also to MT and spellcheckers. Now all tools and fst's should be covered. 2018-10-04T07:53:29+00:00
851c851
< [Template merge - langs/und] Forgot to add pmatch filtering to the default targets in src/ - duh. Now done. 2018-10-04T07:32:33+00:00
---
> [Template merge - langs/und] Forgot to add pmatch filtering to the default targets in src/ - duh. Now done. 2018-10-04T07:31:29+00:00
859,860c859,860
< [Template merge - langs/und] Added pmatch filtering to the rest of the build targets in src/. Also added grammar checker filtering. 2018-10-03T10:42:04+00:00
< [Template merge - langs/und] Major reorganisation to properly handle pmatch preparations, by splitting the disamb-analyser compilation in two: one going to the regular disamb analyser, and the other going to the pmatch variant. We use the two tags +Use/PMatch and +Use/-Pmatch in complementary distribution to specify paths for each, one path containing pmatch backtracking poings (used with the --giella format of hfst-tokenise), and one without. The backtracking machinery is used to handle ambiguous tokenisation. Increased required version of giella-shared due to new, required filters. 2018-10-03T07:47:18+00:00
---
> [Template merge - langs/und] Added pmatch filtering to the rest of the build targets in src/. Also added grammar checker filtering. 2018-10-03T09:49:12+00:00
> [Template merge - langs/und] Major reorganisation to properly handle pmatch preparations, by splitting the disamb-analyser compilation in two: one going to the regular disamb analyser, and the other going to the pmatch variant. We use the two tags +Use/PMatch and +Use/-Pmatch in complementary distribution to specify paths for each, one path containing pmatch backtracking poings (used with the --giella format of hfst-tokenise), and one without. The backtracking machinery is used to handle ambiguous tokenisation. Increased required version of giella-shared due to new, required filters. 2018-10-03T07:33:30+00:00
862c862
< [Template merge - langs/und] More improvements to the analysis regression check: undo space->underscore from lookup2cg (to avoid meaningless diffs when comparing to the new hfst-tokenise), and removed weight info. Also changed the dir ref for abbr.txt to ref the build dir, not the source dir, as that is where the file is generated. 2018-10-01T09:57:18+00:00
---
> [Template merge - langs/und] More improvements to the analysis regression check: undo space->underscore from lookup2cg (to avoid meaningless diffs when comparing to the new hfst-tokenise), and removed weight info. Also changed the dir ref for abbr.txt to ref the build dir, not the source dir, as that is where the file is generated. 2018-10-01T09:52:30+00:00
866c866
< [Template merge - langs/und] Improved regression check script: check that the abbr file is built, for improved traditional tokenisation; and make the patch command silent, for less noise during testing. 2018-09-29T12:13:33+00:00
---
> [Template merge - langs/und] Improved regression check script: check that the abbr file is built, for improved traditional tokenisation; and make the patch command silent, for less noise during testing. 2018-09-29T12:02:57+00:00
868,870c868,869
< [Template merge - langs/und] Thanks to Børre, the analysis regression script will now remove diffs due to different handling of dynamic compounds when comparing old and new tokenisation. This makes it much easier to spot real differences between the two. 2018-09-25T10:10:13+00:00
< [Template merge - langs/und] Improved shell script for analysis regression testing, so that in cases of no diffs it will only print a short message and continue. The test for no diff is also much faster than a real diff. Improves processing time a lot for large test corpora. 2018-09-25T06:57:58+00:00
< svn ignore update 2018-09-20T08:44:05+00:00
---
> [Template merge - langs/und] Thanks to Børre, the analysis regression script will now remove diffs due to different handling of dynamic compounds when comparing old and new tokenisation. This makes it much easier to spot real differences between the two. 2018-09-25T10:10:09+00:00
> [Template merge - langs/und] Improved shell script for analysis regression testing, so that in cases of no diffs it will only print a short message and continue. The test for no diff is also much faster than a real diff. Improves processing time a lot for large test corpora. 2018-09-25T06:22:01+00:00
872d870
< updated svn ignore. 2018-09-20T08:28:11+00:00
877a876,877
> Restored use of old punctuation file, due to '7 and consonant gradation. We need a way to handle that, or replace '7 with something not containing a punctuation symbol. 2018-09-13T08:39:39+00:00
> [Template merge - langs/und] Moved punctuation definitions from each language to giella-shared/all_langs/. Makes much more sense, and will help in resolving random tokenisation bugs due to « and ». 2018-09-13T08:33:05+00:00
879,885c879,882
< [Template merge - langs/und] Fixed hyphenation build when there is no phonology file. 2018-09-10T11:52:22+00:00
< Added default ignore patterns. Earlier they had none, and showed up as deviations during general updates across all languages. Now they will just be left as is. 2018-09-10T11:25:41+00:00
< More general ignore pattern for tools/mt/apertium/tagsets/. 2018-09-10T11:16:40+00:00
< [Template merge - langs/und] Corrected an error after the Hunspell config section was commented out. 2018-09-10T10:56:33+00:00
< [Template merge - langs/und] Added --enable-all-tools option to configure.ac, to allow for easier configuration and testing of all common tools. Unstable or experimental tools must still be explicitly enabled. Commented out the Hunspell speller config completely, it is not supported. Corrected a comment. 2018-09-10T10:35:59+00:00
< Updated svn ignore patterns. 2018-09-08T05:26:27+00:00
< [Template merge - langs/und] Improved and completed the code to skip building phonology fst's. Clearer logic and comments. 2018-09-08T04:50:27+00:00
---
> [Template merge - langs/und] Fixed hyphenation build when there is no phonology file. 2018-09-10T11:49:24+00:00
> [Template merge - langs/und] Corrected an error after the Hunspell config section was commented out. 2018-09-10T10:52:48+00:00
> [Template merge - langs/und] Added --enable-all-tools option to configure.ac, to allow for easier configuration and testing of all common tools. Unstable or experimental tools must still be explicitly enabled. Commented out the Hunspell speller config completely, it is not supported. Corrected a comment. 2018-09-10T09:49:35+00:00
> [Template merge - langs/und] Improved and completed the code to skip building phonology fst's. Clearer logic and comments. 2018-09-08T04:39:57+00:00
886a884
> [Template merge - langs/und] Added a configure.ac setting to skip phonology compilation, typically used when compiling external sources that provide a full analyser in src/morphology. Also added a configuration option to compile xfscript files with lexicon references in them, to allow for faster and more optimised rule composition. This variable has no effect yet, the rest of the machinery is missing. 2018-09-07T22:09:17+00:00
887a886
> [Template merge - langs/und] Remove all tmp files when cleaning. 2018-09-06T11:43:30+00:00
889,890c888,890
< [Template merge - langs/und] Fixed bug: the url analyser is located elsewhere, and should not be processed here in any case. 2018-09-06T10:09:11+00:00
< [Template merge - langs/und] Made url analyser compilation open for local adaptations, by going via a tmp file. 2018-09-06T07:32:50+00:00
---
> [Template merge - langs/und] Remove also url.tmp.lexc when cleaning. 2018-09-06T11:36:02+00:00
> [Template merge - langs/und] Fixed bug: the url analyser is located elsewhere, and should not be processed here in any case. 2018-09-06T10:08:30+00:00
> [Template merge - langs/und] Made url analyser compilation open for local adaptations, by going via a tmp file. 2018-09-06T07:31:53+00:00
891a892
> [Template merge - langs/und] Remove also url.lexc when cleaning, it is copied from giella-shared. 2018-09-05T13:52:15+00:00
893,896c894,896
< [Template merge - langs/und] Corrected double installation of url analyser bug. It should not be installed at all. 2018-08-31T17:48:19+00:00
< [Template merge - langs/und] Add missing ‘|’ in analyser-gt-whitespace.hfst goal. 2018-08-31T11:04:24+00:00
< Updated svn ignores. 2018-08-30T16:00:09+00:00
< [Template merge - langs/und] Fixed a bug in the previous commit that surfaced when enabling tokenisers but not grammar checkers. 2018-08-30T14:09:22+00:00
---
> [Template merge - langs/und] Corrected double installation of url analyser bug. It should not be installed at all. 2018-08-31T17:39:32+00:00
> [Template merge - langs/und] Add missing ‘|’ in analyser-gt-whitespace.hfst goal. 2018-08-31T10:49:54+00:00
> [Template merge - langs/und] Fixed a bug in the previous commit that surfaced when enabling tokenisers but not grammar checkers. 2018-08-30T13:36:49+00:00
898,904c898,903
< Updated svn ignores. 2018-08-29T05:25:34+00:00
< [Template merge - langs/und] Added filter dir and filter compilation to the fst-based hyphenators. Moved filter compilation from src/filters/ to the local filter dir (by copying the regex files and then compile them), to make the build process mostly fst format independent. 2018-08-28T11:48:12+00:00
< Updating svn ignores. 2018-08-28T10:47:06+00:00
< [Template merge - langs/und] Added support for local modifications of the hyphenator build via a tmp file. Simplified tmp file handling in the src/ dir. 2018-08-27T12:21:01+00:00
< [Template merge - langs/und] Added dir structure and Autotools data to prepare for adding hyphenation testing. 2018-08-27T10:57:05+00:00
< [Template merge - langs/und] Downcasing of derived proper nouns was only applied on the input side, not the hyphenated side. This caused such words to be case-shifted: arabialaččat -> A^ra^bi^a^lač^čat. This is now fixed. 2018-08-27T07:54:04+00:00
< [Template merge - langs/und] Fixed hyphenation bug where the lexicon-based hyphenator missed hyphenation points, mainly in propernouns, due to flag diacritics. Fixed by telling the fst compiler to treat flags as epsilons. Now the lexicon-based hyphenator is beating the plain rule-based one in most (all?) cases where there are differences. Must be tested better, though. 2018-08-27T06:21:20+00:00
---
> [Template merge - langs/und] Massive rewrite of filter codes and automatically generated tag conversions, all done to handle bug #2474 (URL tag not correctly formatted in the tokeniser output). The bug should be fixed now. 2018-08-30T11:50:04+00:00
> [Template merge - langs/und] Added filter dir and filter compilation to the fst-based hyphenators. Moved filter compilation from src/filters/ to the local filter dir (by copying the regex files and then compile them), to make the build process mostly fst format independent. 2018-08-28T11:17:10+00:00
> [Template merge - langs/und] Added support for local modifications of the hyphenator build via a tmp file. Simplified tmp file handling in the src/ dir. 2018-08-27T11:59:27+00:00
> [Template merge - langs/und] Added dir structure and Autotools data to prepare for adding hyphenation testing. 2018-08-27T10:55:19+00:00
> [Template merge - langs/und] Downcasing of derived proper nouns was only applied on the input side, not the hyphenated side. This caused such words to be case-shifted: arabialaččat -> A^ra^bi^a^lač^čat. This is now fixed. 2018-08-27T07:48:39+00:00
> [Template merge - langs/und] Fixed hyphenation bug where the lexicon-based hyphenator missed hyphenation points, mainly in propernouns, due to flag diacritics. Fixed by telling the fst compiler to treat flags as epsilons. Now the lexicon-based hyphenator is beating the plain rule-based one in most (all?) cases where there are differences. Must be tested better, though. 2018-08-26T16:58:10+00:00
911c910
< [Template merge - langs/und] Added comment to guide placement of local build targets (to avoid future merge conflicts), and a comment reminder about other places to change filenames. 2018-08-22T06:50:55+00:00
---
> [Template merge - langs/und] Added comment to guide placement of local build targets (to avoid future merge conflicts), and a comment reminder about other places to change filenames. 2018-08-22T06:46:22+00:00
914c913,914
< [Template merge - langs/und] Refactored repeating patterns of code with variables, fixes upload link after XServe crash last winter. 2018-08-20T10:01:02+00:00
---
> [Template merge - langs/und] Reorganised the source filenames to make it easy to override when needed. Should make it possible to solve the bug where src/syntax/disambiguator.cg3 overrides the same file in tools/grammarcheckers/. 2018-08-20T16:39:29+00:00
> [Template merge - langs/und] Refactored repeating patterns of code with variables, fixes upload link after XServe crash last winter. 2018-08-20T09:55:48+00:00
941,942c941,942
< [Template merge - langs/und] Corrected and improved the compilation of the analysers including the URL analysis. This should fix the problem with compiling SMA and other languages, and should in general reduce both compilation time and analyser size. The basic change was to union in the URL analysis as the last step in building the analysers, instead of early - the early injection led to fst blowup during minimisation. Now no blowup appears to take place. 2018-06-05T12:25:12+00:00
< [Template merge - langs/und] Added the special target .NOTPARALLEL to the hfst speller make file, to work around a make bug that caused a prerequisite to not be built when invoking make with the -j option. Also added some comments. 2018-05-18T13:00:28+00:00
---
> [Template merge - langs/und] Corrected and improved the compilation of the analysers including the URL analysis. This should fix the problem with compiling SMA and other languages, and should in general reduce both compilation time and analyser size. The basic change was to union in the URL analysis as the last step in building the analysers, instead of early - the early injection led to fst blowup during minimisation. Now no blowup appears to take place. 2018-06-05T12:07:51+00:00
> [Template merge - langs/und] Added the special target .NOTPARALLEL to the hfst speller make file, to work around a make bug that caused a prerequisite to not be built when invoking make with the -j option. Also added some comments. 2018-05-18T13:00:26+00:00
943a944
> [Template merge - langs/und] Updated command in comments to use the correct option. 2018-05-18T06:32:57+00:00
946c947
< [Template merge - langs/und] Reverted the more robust semantic tag reordering, it was just too slow. Now we are back to a less robust and more fragile system (including bugs), but with faster compilation. Ultimately we will abandon _semantic_ tag reordering altogether, and instead rewrite the lexc code to always place the semantic tags where they should be. 2018-05-16T09:08:46+00:00
---
> [Template merge - langs/und] Reverted the more robust semantic tag reordering, it was just too slow. Now we are back to a less robust and more fragile system (including bugs), but with faster compilation. Ultimately we will abandon _semantic_ tag reordering altogether, and instead rewrite the lexc code to always place the semantic tags where they should be. 2018-05-16T08:54:41+00:00
949,951c950,951
< [Template merge - langs/und] Corrected automake (and make?) syntax error that broke compilation. 2018-05-15T11:09:28+00:00
< [Template merge - langs/und] Simplified semantic tag filtering regex construction. 2018-05-15T07:32:58+00:00
< More things to ignore. 2018-05-14T10:33:30+00:00
---
> [Template merge - langs/und] Corrected automake (and make?) syntax error that broke compilation. 2018-05-15T11:08:55+00:00
> [Template merge - langs/und] Simplified semantic tag filtering regex construction. 2018-05-15T07:27:42+00:00
953,954c953,954
< [Template merge - langs/und] Too eager in the previous commit to get rid of semantic tag processing: removed the filter to zero out semantic tags completely, which broke compilation of a number of fst's where semantic tags are not wanted. 2018-05-09T08:15:02+00:00
< [Template merge - langs/und] Corrected bugs in reordering semantic tags by doing the reordering in two steps: 1) insert the tag in the new and correct position, and 2) remove the tag in the wrong position. There will probably be things to iron out, but initial tests are fine. This should also make the whole semantic tag reordering a bit faster to compile and apply, as the generated regexes are smaller and simpler. 2018-05-08T18:26:25+00:00
---
> [Template merge - langs/und] Too eager in the previous commit to get rid of semantic tag processing: removed the filter to zero out semantic tags completely, which broke compilation of a number of fst's where semantic tags are not wanted. 2018-05-09T08:09:24+00:00
> [Template merge - langs/und] Corrected bugs in reordering semantic tags by doing the reordering in two steps: 1) insert the tag in the new and correct position, and 2) remove the tag in the wrong position. There will probably be things to iron out, but initial tests are fine. This should also make the whole semantic tag reordering a bit faster to compile and apply, as the generated regexes are smaller and simpler. 2018-05-08T17:53:29+00:00
956,961c956,961
< [Template merge - langs/und] Now that the downcasing script works in all cases, remove all the special processing, and get rid of spurious rebuilds of the dependent fst's. Another time-saver:-) 2018-05-02T10:13:28+00:00
< [Template merge - langs/und] Changed the downcasing script to work also with hyperminimised hfst-fst's. Now the downcasing script works both with Xerox, Hfst and Foma, and both with standard and hyperminimised hfst-fst's. Finally! 2018-05-02T09:13:57+00:00
< [Template merge - langs/und] Added support for filters for grammatical and derivation tags, sorted the generated filter list. 2018-04-23T14:46:22+00:00
< [Template merge - langs/und] Bugfix: OLang/xxx tags were removed, not made optional, in generators. 2018-04-20T08:32:55+00:00
< [Template merge - langs/und] Do not delete disambiguator.cg3 and grammarchecker.cg3 when cleaning. 2018-04-19T08:49:44+00:00
< [Template merge - langs/und] Whether to let the orig-lang tags be visible in the disambiguating analyser or not is dependent on the language and the needs of each language community. Moving the removal of those tags from the general processing to the language specific processing. Step 2: removing it from the general processing. 2018-04-18T13:16:04+00:00
---
> [Template merge - langs/und] Now that the downcasing script works in all cases, remove all the special processing, and get rid of spurious rebuilds of the dependent fst's. Another time-saver:-) 2018-05-02T09:44:57+00:00
> [Template merge - langs/und] Changed the downcasing script to work also with hyperminimised hfst-fst's. Now the downcasing script works both with Xerox, Hfst and Foma, and both with standard and hyperminimised hfst-fst's. Finally! 2018-05-02T09:10:41+00:00
> [Template merge - langs/und] Added support for filters for grammatical and derivation tags, sorted the generated filter list. 2018-04-23T14:45:28+00:00
> [Template merge - langs/und] Bugfix: OLang/xxx tags were removed, not made optional, in generators. 2018-04-20T08:20:52+00:00
> [Template merge - langs/und] Do not delete disambiguator.cg3 and grammarchecker.cg3 when cleaning. 2018-04-19T07:27:50+00:00
> [Template merge - langs/und] Whether to let the orig-lang tags be visible in the disambiguating analyser or not is dependent on the language and the needs of each language community. Moving the removal of those tags from the general processing to the language specific processing. Step 2: removing it from the general processing. 2018-04-18T13:09:08+00:00
974,978c974,978
< [Template merge - langs/und] Added the -p option to the yaml testing command, to remove all passing tests. This should make it easier to spot the actual FAILs. 2018-03-08T12:52:16+00:00
< [Template merge - langs/und] Corrected path to zhfst file. Also changed the return code when the zhfst file is not found, so that it will be reported as a FAIL. Since this test is only run when configured for building spellers, a missing zhfst file should be fatal. Also changed variable name to avoid confusion with the shell variable. 2018-03-08T11:02:54+00:00
< [Template merge - langs/und] Added phony target forwarding 'make test' to 'make check'. Required to make 'make check' work on some build systems. 2018-03-08T10:41:42+00:00
< [Template merge - langs/und] Added a separate disambiguation file for the spell checker output, and a spell-checker-only pipeline (well, still tokenisation and disambiguation, but no proper grammar checking). 2018-03-05T15:40:34+00:00
< [Template merge - langs/und] Corrected Foma compilation for phonology rules. 2018-03-05T10:23:30+00:00
---
> [Template merge - langs/und] Added the -p option to the yaml testing command, to remove all passing tests. This should make it easier to spot the actual FAILs. 2018-03-08T12:46:47+00:00
> [Template merge - langs/und] Corrected path to zhfst file. Also changed the return code when the zhfst file is not found, so that it will be reported as a FAIL. Since this test is only run when configured for building spellers, a missing zhfst file should be fatal. Also changed variable name to avoid confusion with the shell variable. 2018-03-08T10:55:45+00:00
> [Template merge - langs/und] Added phony target forwarding 'make test' to 'make check'. Required to make 'make check' work on some build systems. 2018-03-08T10:36:21+00:00
> [Template merge - langs/und] Added a separate disambiguation file for the spell checker output, and a spell-checker-only pipeline (well, still tokenisation and disambiguation, but no proper grammar checking). 2018-03-05T15:20:58+00:00
> [Template merge - langs/und] Corrected Foma compilation for phonology rules. 2018-03-05T10:15:59+00:00
980,982d979
< Added ignore pattern for in.txt 2018-03-01T07:09:50+00:00
< More ignores 2018-03-01T06:52:33+00:00
< More svn ignores. 2018-03-01T06:25:59+00:00
985,989c982,984
< Added svnignore pattern for sigma.txt. 2018-02-21T09:49:57+00:00
< [Template merge - langs/und] Made symbol alignment default - I can see no cases where we don't want it, but it is still possible to disable it if such a need pops up. Also improved the error message when trying to build a twolc language using Foma. 2018-02-09T08:08:15+00:00
< [Template merge - langs/und] Added INFO text about switching to Hfst as a fallback when Xerox tools are not found. Also added test and error message when using Foma on a language with a twolc file. 2018-02-09T07:36:31+00:00
< Two more files to ignore. 2018-02-06T09:44:18+00:00
< [Template merge - langs/und] Fixed URL analysis in MT. All URL's and email addresses are now tagged +URL. Although the url analyser itself is small, the resulting analyser quadrupled in size (in sme). 2018-02-05T19:49:56+00:00
---
> [Template merge - langs/und] Made symbol alignment default - I can see no cases where we don't want it, but it is still possible to disable it if such a need pops up. Also improved the error message when trying to build a twolc language using Foma. 2018-02-09T08:00:22+00:00
> [Template merge - langs/und] Added INFO text about switching to Hfst as a fallback when Xerox tools are not found. Also added test and error message when using Foma on a language with a twolc file. 2018-02-09T07:11:09+00:00
> [Template merge - langs/und] Fixed URL analysis in MT. All URL's and email addresses are now tagged +URL. Although the url analyser itself is small, the resulting analyser quadrupled in size (in sme). 2018-02-05T19:44:27+00:00
992,1021c987,1015
< [Template merge - langs/und] Removed filters for removing morphological borders - they destroy the asymmetry of the fst's, and make yaml testing more complicated. 2018-02-02T08:12:06+00:00
< [Template merge - langs/und] Added support for Area variants of the grammar checker generator. Should fix nightly build error for SMJ. 2018-02-01T19:32:30+00:00
< [Template merge - langs/und] Added missing Foma support for dictionary fst's. 2018-02-01T18:40:23+00:00
< [Template merge - langs/und] Fixed the last bunch of path errors. Now all yaml tests are back to normal. 2018-02-01T17:50:32+00:00
< [Template merge - langs/und] Cleanup: commented in outcommented test loop, removed exit statement used during development, fixed path for two test scripts. 2018-02-01T15:59:06+00:00
< [Template merge - langs/und] The last set of test runners for yaml tests changed to the new system. 2018-02-01T15:15:22+00:00
< [Template merge - langs/und] Three more yaml test runners done, still a few more to go before yaml testing is back in shape. 2018-02-01T13:58:57+00:00
< [Template merge - langs/und] Changed the last yaml testing scripts in the template to follow the new and improved system. No need for autoconf processing anymore. 2018-02-01T12:11:53+00:00
< [Template merge - langs/und] Major rework of the yaml testing framework, to be able to properly support fst type specific yaml testing (ie test only xfst or hfst transducers, or everything but xfst transducers (=foma & hfst)). This change triggered a number of other changes. The user-facing shell scripts are greatly simplified by this change. 2018-02-01T09:56:53+00:00
< Updated svn ignores. 2018-01-31T12:13:59+00:00
< [Template merge - langs/und] Corrected AM errors in the previous merge. Now the build is working again. 2018-01-31T11:42:51+00:00
< [Template merge - langs/und] Added support for grammar checker generators for alternative orthographies and writing systems. Should fix nightly build issue in CRK. 2018-01-31T11:14:39+00:00
< [Template merge - langs/und] Added support for a grammar checker specific generator. Should fix various issues re generation of suggestions. 2018-01-25T09:40:03+00:00
< [Template merge - langs/und] Added test for the presence of divvun-validate-suggest, which is now required to build grammar checkers. Now configure will error out instead of make. 2018-01-23T07:34:53+00:00
< [Template merge - langs/und] Add note to the errors.xml file that it is generated, and from which file it is generated, to avoid people editing the wrong file. 2018-01-22T12:42:48+00:00
< [Template merge - langs/und] Error messages are now copied from a source file to a build file, after being validated. This allows support for VPATH builds and retains the integrity of the zcheck file. At the same time also replaced hard coded language names with automake variable expansion in the pipespec.xml.in file. 2018-01-22T10:59:59+00:00
< [Template merge - langs/und] Fixed bug in building dictionary analysers for alternative orthographies, introduced in the changes yesterday. 2018-01-18T07:10:31+00:00
< [Template merge - langs/und] Added option to specify language variant, to allow testing spellers for alternative writing systems, alternative orthographies, different countries etc. 2018-01-18T06:35:48+00:00
< [Template merge - langs/und] Added support for area / country specific fst's for the specialised dict and oahpa build files. At the same time reorganised the build code so that targets with two variables now consistently use the fst type / suffix as the pattern, and the writing system/alt orth/area/etc as the function parameter. This should make the build system more robust by reducing the risk for accidental pattern similarity. 2018-01-17T11:37:42+00:00
< [Template merge - langs/und] Added support for building area/country specific spellers. The target language for now is SMJ, but the feature is of course language independent and useful in a number of other circumstances. 2018-01-16T19:48:02+00:00
< [Template merge - langs/und] Changed dialect fst filenames to follow existing patterns used for Oahpa fst's. 2018-01-16T14:42:57+00:00
< [Template merge - langs/und] Added support for building dialect fst's. It is disabled by default, but can be enabled with a configure option. Also changed the disamb analyser to keep the dialect tags. Only normative fst's are filtered against dialect tags. 2018-01-16T12:39:01+00:00
< [Template merge - langs/und] Added initial support for building Area-specific analysers and generators (norm only). Also restored Area tags in the disamb and grammar checker analysers. Fixed missing support for Foma transducers in the alternative writing system support. 2018-01-16T07:44:07+00:00
< [Template merge - langs/und] Grammar checker .zcheck file should go into datadir, not libdir. 2018-01-15T11:55:49+00:00
< [Template merge - langs/und] Now using speller version info from configure.ac, not version.txt, which is removed. New giella-core required. 2018-01-15T10:40:45+00:00
< [Template merge - langs/und] Fixed a bug in fst format handling for the grammar checker - conflicting formats caused a segfault. Now using openfst-tropical for all fst's being processed in the grammarcheckers/ dir (presently only the speller acceptor analyser). 2018-01-15T08:51:33+00:00
< [Template merge - langs/und] Fixed OLang tag extraction and filter generation. 2018-01-12T13:19:58+00:00
< [Template merge - langs/und] Added weights to compounds in the language-independent build steps (languages without compounds will go through the same step, but will not be changed). Applied only to analysers. Also added spellrelax to the language-independent build of the analysers = it is always applied. 2018-01-12T11:58:01+00:00
< [Template merge - langs/und] Improved the previous fix: make sure it does not crash when the target file does not exist, and use the same test on all autogenerated tag lists. This should save a few more seconds of build time. 2018-01-12T08:33:08+00:00
< [Template merge - langs/und] Fixed bug #2355 so that the filters for semantic tags will only be rebuilt when there are real changes to the semantic tags. 2018-01-11T17:28:56+00:00
---
> [Template merge - langs/und] Removed filters for removing morphological borders - they destroy the asymmetry of the fst's, and make yaml testing more complicated. 2018-02-02T08:10:41+00:00
> [Template merge - langs/und] Added support for Area variants of the grammar checker generator. Should fix nightly build error for SMJ. 2018-02-01T19:26:57+00:00
> [Template merge - langs/und] Added missing Foma support for dictionary fst's. 2018-02-01T18:13:16+00:00
> [Template merge - langs/und] Fixed the last bunch of path errors. Now all yaml tests are back to normal. 2018-02-01T17:48:24+00:00
> [Template merge - langs/und] Cleanup: commented in outcommented test loop, removed exit statement used during development, fixed path for two test scripts. 2018-02-01T15:57:03+00:00
> [Template merge - langs/und] The last set of test runners for yaml tests changed to the new system. 2018-02-01T15:11:08+00:00
> [Template merge - langs/und] Three more yaml test runners done, still a few more to go before yaml testing is back in shape. 2018-02-01T13:56:58+00:00
> [Template merge - langs/und] Changed the last yaml testing scripts in the template to follow the new and improved system. No need for autoconf processing anymore. 2018-02-01T10:53:02+00:00
> [Template merge - langs/und] Major rework of the yaml testing framework, to be able to properly support fst type specific yaml testing (ie test only xfst or hfst transducers, or everything but xfst transducers (=foma & hfst)). This change triggered a number of other changes. The user-facing shell scripts are greatly simplified by this change. 2018-02-01T09:56:08+00:00
> [Template merge - langs/und] Corrected AM errors in the previous merge. Now the build is working again. 2018-01-31T11:40:41+00:00
> [Template merge - langs/und] Added support for grammar checker generators for alternative orthographies and writing systems. Should fix nightly build issue in CRK. 2018-01-31T11:11:48+00:00
> [Template merge - langs/und] Added support for a grammar checker specific generator. Should fix various issues re generation of suggestions. 2018-01-25T09:24:02+00:00
> [Template merge - langs/und] Added test for the presence of divvun-validate-suggest, which is now required to build grammar checkers. Now configure will error out instead of make. 2018-01-23T07:19:58+00:00
> [Template merge - langs/und] Add note to the errors.xml file that it is generated, and from which file it is generated, to avoid people editing the wrong file. 2018-01-22T12:33:52+00:00
> [Template merge - langs/und] Error messages are now copied from a source file to a build file, after being validated. This allows support for VPATH builds and retains the integrity of the zcheck file. At the same time also replaced hard coded language names with automake variable expansion in the pipespec.xml.in file. 2018-01-22T10:24:51+00:00
> [Template merge - langs/und] Fixed bug in building dictionary analysers for alternative orthographies, introduced in the changes yesterday. 2018-01-18T07:04:50+00:00
> [Template merge - langs/und] Added option to specify language variant, to allow testing spellers for alternative writing systems, alternative orthographies, different countries etc. 2018-01-18T06:26:43+00:00
> [Template merge - langs/und] Added support for area / country specific fst's for the specialised dict and oahpa build files. At the same time reorganised the build code so that targets with two variables now consistently use the fst type / suffix as the pattern, and the writing system/alt orth/area/etc as the function parameter. This should make the build system more robust by reducing the risk for accidental pattern similarity. 2018-01-17T11:30:00+00:00
> [Template merge - langs/und] Added support for building area/country specific spellers. The target language for now is SMJ, but the feature is of course language independent and useful in a number of other circumstances. 2018-01-16T19:31:57+00:00
> [Template merge - langs/und] Changed dialect fst filenames to follow existing patterns used for Oahpa fst's. 2018-01-16T14:38:00+00:00
> [Template merge - langs/und] Added support for building dialect fst's. It is disabled by default, but can be enabled with a configure option. Also changed the disamb analyser to keep the dialect tags. Only normative fst's are filtered against dialect tags. 2018-01-16T12:09:04+00:00
> [Template merge - langs/und] Added initial support for building Area-specific analysers and generators (norm only). Also restored Area tags in the disamb and grammar checker analysers. Fixed missing support for Foma transducers in the alternative writing system support. 2018-01-16T07:33:21+00:00
> [Template merge - langs/und] Grammar checker .zcheck file should go into datadir, not libdir. 2018-01-15T11:33:44+00:00
> [Template merge - langs/und] Now using speller version info from configure.ac, not version.txt, which is removed. New giella-core required. 2018-01-15T09:56:53+00:00
> [Template merge - langs/und] Fixed a bug in fst format handling for the grammar checker - conflicting formats caused a segfault. Now using openfst-tropical for all fst's being processed in the grammarcheckers/ dir (presently only the speller acceptor analyser). 2018-01-15T08:44:53+00:00
> [Template merge - langs/und] Fixed OLang tag extraction and filter generation. 2018-01-12T13:13:04+00:00
> [Template merge - langs/und] Added weights to compounds in the language-independent build steps (languages without compounds will go through the same step, but will not be changed). Applied only to analysers. Also added spellrelax to the language-independent build of the analysers = it is always applied. 2018-01-12T11:53:15+00:00
> [Template merge - langs/und] Improved the previous fix: make sure it does not crash when the target file does not exist, and use the same test on all autogenerated tag lists. This should save a few more seconds of build time. 2018-01-12T08:28:10+00:00
> [Template merge - langs/und] Fixed bug #2355 so that the filters for semantic tags will only be rebuilt when there are real changes to the semantic tags. 2018-01-11T17:12:17+00:00
1024c1018
< [Template merge - langs/und] Corrected a € vs cut incompatibility on Linux, cf bug report #2457. 2018-01-11T08:49:04+00:00
---
> [Template merge - langs/und] Corrected a € vs cut incompatibility on Linux, cf bug report #2457. 2018-01-11T08:40:19+00:00
1028d1021
< [Template merge - langs/und] Updated the pipespec.xml file to comply with the newest version of the grammar checker code, where each argument type is explicitly specified. Makes for a more robust pipeline. 2018-01-10T12:05:36+00:00
1029a1023
> [Template merge - langs/und] Updated the pipespec.xml file to comply with the newest version of the grammar checker code, where each argument type is explicitly specified. Makes for a more robust pipeline. 2018-01-10T11:59:03+00:00
1032,1033c1026,1027
< [Template merge - langs/und] Corrected fileref in m4, added correct autoconf path to errors.xml. 2018-01-08T14:48:18+00:00
< [Template merge - langs/und] Renamed pipespec.xml to *.in, to allow autoconf processing. This makes it possible to use modes when building using VPATHS/out-of-source builds. 2018-01-08T14:23:41+00:00
---
> [Template merge - langs/und] Corrected fileref in m4, added correct autoconf path to errors.xml. 2018-01-08T14:42:23+00:00
> [Template merge - langs/und] Renamed pipespec.xml to *.in, to allow autoconf processing. This makes it possible to use modes when building using VPATHS/out-of-source builds. 2018-01-08T14:20:23+00:00
1035c1029,1030
< [Template merge - langs/und] Hard-coded filename in fallback target - that was the only way to work around a loop in make on some systems. 2018-01-08T09:46:56+00:00
---
> [Template merge - langs/und] Hard-coded filename in fallback target - that was the only way to work around a loop in make on some systems. 2018-01-08T09:42:06+00:00
> [Template merge - langs/und] Renamed src/syntax/disambiguation.cg3 to src/syntax/disambiguator.cg3, to keep the file naming consistent (actor noun if possible), and remove discrepancy between the regular disambiguator and the grammar checker disambiguator that caused makefile troubles. 2018-01-08T05:50:37+00:00
1042,1049c1037,1043
< [Template merge - langs/und] Heavy rewrite of the analysis regression check tool, to support testing the grammar checker pipeline. 2017-12-12T12:20:30+00:00
< [Template merge - langs/und] Do not remove semantic tags, dialect tags and other tags useful for disambiguation or suggestion generation. The grammar checker speller needs these, and they will anyway disappear when we project the final fst. 2017-12-11T13:07:19+00:00
< Updated svn ignores. 2017-12-11T12:55:46+00:00
< [Template merge - langs/und] Proper verbosity specification in a few more instances, and added weight pushing for the grammar checker speller now (how could I have missed that?). 2017-12-01T12:31:44+00:00
< [Template merge - langs/und] Fixed a bug in piped hfst-xfst commands: in three cases the -p option was missing, causing strange misbehavior in hfst-xfst on some systems. 2017-12-01T12:09:04+00:00
< [Template merge - langs/und] Further configure.ac cleanup: moved some variable definitions to other m4 files, moved the language definition on top, deprecated GTLANG* variables for GLANG* variants (ie Giella instead of GiellaTechno). Updated copyright year. 2017-12-01T10:27:06+00:00
< [Template merge - langs/und] Moved all default AC_CONFIG_FILES into a separate function in a separate m4 file, to clean up configure.ac. Some other cleanup of configure.ac. 2017-12-01T09:32:03+00:00
< [Template merge - langs/und] Defined variable for separate speller release version string. 2017-12-01T08:23:56+00:00
---
> [Template merge - langs/und] Heavy rewrite of the analysis regression check tool, to support testing the grammar checker pipeline. 2017-12-12T11:44:25+00:00
> [Template merge - langs/und] Do not remove semantic tags, dialect tags and other tags useful for disambiguation or suggestion generation. The grammar checker speller needs these, and they will anyway disappear when we project the final fst. 2017-12-11T13:03:01+00:00
> [Template merge - langs/und] Proper verbosity specification in a few more instances, and added weight pushing for the grammar checker speller now (how could I have missed that?). 2017-12-01T12:23:37+00:00
> [Template merge - langs/und] Fixed a bug in piped hfst-xfst commands: in three cases the -p option was missing, causing strange misbehavior in hfst-xfst on some systems. 2017-12-01T11:58:58+00:00
> [Template merge - langs/und] Further configure.ac cleanup: moved some variable definitions to other m4 files, moved the language definition on top, deprecated GTLANG* variables for GLANG* variants (ie Giella instead of GiellaTechno). Updated copyright year. 2017-12-01T10:10:31+00:00
> [Template merge - langs/und] Moved all default AC_CONFIG_FILES into a separate function in a separate m4 file, to clean up configure.ac. Some other cleanup of configure.ac. 2017-12-01T09:18:28+00:00
> [Template merge - langs/und] Defined variable for separate speller release version string. 2017-12-01T08:18:29+00:00
1051,1057c1045,1051
< [Template merge - langs/und] Updated comment in preparation for other changes. 2017-12-01T07:53:01+00:00
< [Template merge - langs/und] Added support for analysing whitespace and thus make it possible to tag whitespace errors (double spaces, extra spaces, etc), and also to more reliably detect sentence and paragraph borders by using whitespace as a delimiter. 2017-11-30T14:23:26+00:00
< [Template merge - langs/und] Using absolute dir refs to make it possible to call the shell scripts from everywhere. 2017-11-30T12:36:00+00:00
< [Template merge - langs/und] Fixed a bug: forgot to remove a line. 2017-11-29T13:37:02+00:00
< [Template merge - langs/und] Rewrote the speller test scripts in devtools/ to be VPATH safe and rely on autotools for paths etc, so that the scripts will work also when only checking out single languages. 2017-11-29T13:00:15+00:00
< [Template merge - langs/und] Added support for specifying language-specific files to be included in the grammar checker archive file. 2017-11-15T13:19:51+00:00
< [Template merge - langs/und] Updated grammar checker files and build rules. 2017-11-13T09:47:19+00:00
---
> [Template merge - langs/und] Updated comment in preparation for other changes. 2017-12-01T07:47:21+00:00
> [Template merge - langs/und] Added support for analysing whitespace and thus make it possible to tag whitespace errors (double spaces, extra spaces, etc), and also to more reliably detect sentence and paragraph borders by using whitespace as a delimiter. 2017-11-30T14:07:36+00:00
> [Template merge - langs/und] Using absolute dir refs to make it possible to call the shell scripts from everywhere. 2017-11-30T12:31:01+00:00
> [Template merge - langs/und] Fixed a bug: forgot to remove a line. 2017-11-29T13:32:44+00:00
> [Template merge - langs/und] Rewrote the speller test scripts in devtools/ to be VPATH safe and rely on autotools for paths etc, so that the scripts will work also when only checking out single languages. 2017-11-29T12:09:07+00:00
> [Template merge - langs/und] Added support for specifying language-specific files to be included in the grammar checker archive file. 2017-11-15T13:17:02+00:00
> [Template merge - langs/und] Updated grammar checker files and build rules. 2017-11-13T09:38:05+00:00
1060c1054
< [Template merge - langs/und] Added hfst-push-weights to move transducer weights to the beginning of the strings, to enable proper optimisations of speller lookup in hfst-ospell. Stripped out most lang-specific stuff from grammar checker cg file, and added simple example rules + some explanations. Use gramcheck tokeniser in pre-pipe. 2017-11-07T15:46:35+00:00
---
> [Template merge - langs/und] Added hfst-push-weights to move transducer weights to the beginning of the strings, to enable proper optimisations of speller lookup in hfst-ospell. Stripped out most lang-specific stuff from grammar checker cg file, and added simple example rules + some explanations. Use gramcheck tokeniser in pre-pipe. 2017-11-07T15:44:23+00:00
1062,1063c1056,1057
< [Template merge - langs/und] Added default rule for speller suggestions, to make the suggestions survive cg treatment. 2017-10-25T09:54:16+00:00
< [Template merge - langs/und] Added spell checking component to the grammar checker pipeline. Now every planned component is working as it should. The spell checking requires first that one builds the latest hfst-ospell code, and then the newest grammar checker code for this to work. 2017-10-24T12:53:13+00:00
---
> [Template merge - langs/und] Added default rule for speller suggestions, to make the suggestions survive cg treatment. 2017-10-24T17:26:25+00:00
> [Template merge - langs/und] Added spell checking component to the grammar checker pipeline. Now every planned component is working as it should. The spell checking requires first that one builds the latest hfst-ospell code, and then the newest grammar checker code for this to work. 2017-10-24T12:48:15+00:00
1068c1062
< [Template merge - langs/und] Increased weights for fall-back rule-based hyphenation. Added .hfst suffix to rule fst for consistency. 2017-10-13T07:41:24+00:00
---
> [Template merge - langs/und] Increased weights for fall-back rule-based hyphenation. Added .hfst suffix to rule fst for consistency. 2017-10-13T07:35:48+00:00
1070,1075c1064,1067
< [Template merge - langs/und] Replaced the huge sme grammar checker with the more moderate smn grammar checker cg file, as the template file for future grammar checkers. 2017-10-12T08:39:54+00:00
< [Template merge - langs/und] Added note (readme file) about NOT touching the local am-shared dir, to avoid future unintended changes. 2017-10-12T06:36:44+00:00
< [Template merge - langs/und] Added the missing files for a working grammar checker. Fixed grammar checker build rules to not be dependent upon enabling tokenisers. 2017-10-11T19:19:18+00:00
< Updated svn ignores for tokenisers and grammar checkers + subdirs. 2017-10-11T11:47:18+00:00
< Updated svn ignores for tokenisers and grammar checkers + subdirs. 2017-10-11T11:22:45+00:00
< [Template merge - langs/und] Added conversion of the analysis tags from the grammar checker speller into CG format. 2017-10-11T05:53:04+00:00
---
> [Template merge - langs/und] Replaced the huge sme grammar checker with the more moderate smn grammar checker cg file, as the template file for future grammar checkers. 2017-10-12T08:30:44+00:00
> [Template merge - langs/und] Added note (readme file) about NOT touching the local am-shared dir, to avoid future unintended changes. 2017-10-12T06:32:03+00:00
> [Template merge - langs/und] Added the missing files for a working grammar checker. Fixed grammar checker build rules to not be dependent upon enabling tokenisers. 2017-10-11T15:15:25+00:00
> [Template merge - langs/und] Added conversion of the analysis tags from the grammar checker speller into CG format. 2017-10-11T05:16:05+00:00
1076a1069
> [Template merge - langs/und] One misplaced variable caused the grammar checker speller to be built independent of the configuration. This caused a build fail for everyone. Solves bug #2437. Also added $(srcdir) in front of root.lexc, to ensure that the file reference resolves correctly in local build targets. 2017-10-10T09:23:22+00:00
1077a1071
> [Template merge - langs/und] Moved the target clean-local to the local Makefile, to make it possible to enhance the clean target with locally generated files. 2017-10-10T08:53:25+00:00
1080,1081c1074,1075
< [Template merge - langs/und] Corrections to the grammar checker speller build: we now build a working zhfst file that can be used as part of the development cycle. Also additions to silent builds. 2017-10-04T07:00:03+00:00
< [Template merge - langs/und] Major update to the grammar checker template. It still does not work completely as it should, so hold your horses. Update content: ensured that all files needed are copied to the grammar checker build dir, removed option to name files (=irrelevant bloat), now builds an almost proper zip file, and ensured that tokenisers are built before grammarcheckers. Also made it so that when grammar checkers are enabled, spellers are automatically enabled too, as they will be included as part of the grammar checker pipeline. 2017-10-03T07:01:12+00:00
---
> [Template merge - langs/und] Corrections to the grammar checker speller build: we now build a working zhfst file that can be used as part of the development cycle. Also additions to silent builds. 2017-10-04T04:06:57+00:00
> [Template merge - langs/und] Major update to the grammar checker template. It still does not work completely as it should, so hold your horses. Update content: ensured that all files needed are copied to the grammar checker build dir, removed option to name files (=irrelevant bloat), now builds an almost proper zip file, and ensured that tokenisers are built before grammarcheckers. Also made it so that when grammar checkers are enabled, spellers are automatically enabled too, as they will be included as part of the grammar checker pipeline. 2017-10-03T06:50:59+00:00
1090c1084,1085
< [Template merge - langs/und] Made cg3 file compilation more general. 2017-09-19T14:19:51+00:00
---
> [Template merge - langs/und] Changed the file exists test for the lemma generation testing so that it will work even in cases where multiple source files are used as input. 2017-09-20T11:48:42+00:00
> [Template merge - langs/und] Made cg3 file compilation more general. 2017-09-19T14:13:29+00:00
1094,1095c1089,1090
< [Template merge - langs/und] Moved the code to build the apertium relabel script in the apertium directory, so that we can use the actual giella-tagged fst for MT as the tag source. This should fix all issues of missing tags in the relabel script. 2017-09-15T14:15:22+00:00
< [Template merge - langs/und] GLE requires regex compilation possibilities in src/, no reason why it can't be. 2017-09-14T11:27:39+00:00
---
> [Template merge - langs/und] Moved the code to build the apertium relabel script in the apertium directory, so that we can use the actual giella-tagged fst for MT as the tag source. This should fix all issues of missing tags in the relabel script. 2017-09-15T14:03:38+00:00
> [Template merge - langs/und] GLE requires regex compilation possibilities in src/, no reason why it can't be. 2017-09-14T11:22:55+00:00
1096a1092
> [Template merge - langs/und] Fixed a shortcoming in the build infra uncovered by gle: no explicit support for language-specific build rules that will not end up in lexicon.?fst. 2017-09-14T05:44:15+00:00
1108c1104
< [Template merge - langs/und] Moved tag extraction to a separate am-include file, so that it can be shared between different dirs. Moved generation of regex for turning tags into CG friendly format from src/filters/ to tools/tokenisers/filters/. 2017-08-28T14:39:46+00:00
---
> [Template merge - langs/und] Moved tag extraction to a separate am-include file, so that it can be shared between different dirs. Moved generation of regex for turning tags into CG friendly format from src/filters/ to tools/tokenisers/filters/. 2017-08-28T13:12:39+00:00
1110,1114c1106,1109
< Updating svn ignores. 2017-08-25T10:22:58+00:00
< [Template merge - langs/und] After a couple of bug fixes in giella-core, require the new version. 2017-08-25T10:11:28+00:00
< [Template merge - langs/und] Initial support for building tokenisers where the morphological analysis tags are given in CG format directly instead of having to be postprocessed by hfst-tokenise before being printed. The idea is to make the hfst-tokenise code more general, and move everything that is particular to one language or setup into the fst instead of being hardcoded in the C++ code. There are some issues that must be resolved, but fst-wise the code works. 2017-08-24T11:54:29+00:00
< [Template merge - langs/und] Added support for building a regex that transforms all tags from the format "+Adv" to " Adv" (including space). The idea is to make the tags readily consumable by CG. Both prefix and suffix tags are converted. Newest giella-core required. 2017-08-24T10:09:48+00:00
< [Template merge - langs/und] Part two of renaming the preprocess dir to tokenisers. Now all refs to it are updated. 2017-08-24T07:29:47+00:00
---
> [Template merge - langs/und] After a couple of bug fixes in giella-core, require the new version. 2017-08-25T09:58:41+00:00
> [Template merge - langs/und] Initial support for building tokenisers where the morphological analysis tags are given in CG format directly instead of having to be postprocessed by hfst-tokenise before being printed. The idea is to make the hfst-tokenise code more general, and move everything that is particular to one language or setup into the fst instead of being hardcoded in the C++ code. There are some issues that must be resolved, but fst-wise the code works. 2017-08-24T11:33:54+00:00
> [Template merge - langs/und] Added support for building a regex that transforms all tags from the format "+Adv" to " Adv" (including space). The idea is to make the tags readily consumable by CG. Both prefix and suffix tags are converted. Newest giella-core required. 2017-08-24T10:02:20+00:00
> [Template merge - langs/und] Part two of renaming the preprocess dir to tokenisers. Now all refs to it are updated. 2017-08-24T06:44:58+00:00
1115a1111
> [Template merge - langs/und] Renamed the preprocess dir to tokenisers, to better describe the content of it. 2017-08-24T06:10:51+00:00
1122,1126c1118,1122
< [Template merge - langs/und] Added support for diffing and merging on Linux. As part of that added checking for diff tools in m4/giella-macros.m4, and added more tests against failures. Also added test for cg-mwesplit, and increased the required vislcg3 version to the 1.0 release. 2017-08-16T10:52:11+00:00
< [Template merge - langs/und] More robust test for the existence of the various vislcg3 files. 2017-08-15T12:22:08+00:00
< [Template merge - langs/und] Added more robust option checking, and a test for the existence of the specified corpus file. Also added some comments. 2017-08-15T07:17:16+00:00
< [Template merge - langs/und] Actually open the other diff views. And force-add to svn - we don't want error messages in this context. 2017-08-14T14:47:01+00:00
< [Template merge - langs/und] Corrected glaring variable copy&paste bug. Thanks to Trond for spotting it! 2017-08-14T12:56:13+00:00
---
> [Template merge - langs/und] Added support for diffing and merging on Linux. As part of that added checking for diff tools in m4/giella-macros.m4, and added more tests against failures. Also added test for cg-mwesplit, and increased the required vislcg3 version to the 1.0 release. 2017-08-16T10:34:09+00:00
> [Template merge - langs/und] More robust test for the existence of the various vislcg3 files. 2017-08-15T12:06:28+00:00
> [Template merge - langs/und] Added more robust option checking, and a test for the existence of the specified corpus file. Also added some comments. 2017-08-15T07:08:43+00:00
> [Template merge - langs/und] Actually open the other diff views. And force-add to svn - we don't want error messages in this context. 2017-08-14T14:39:00+00:00
> [Template merge - langs/und] Corrected glaring variable copy&paste bug. Thanks to Trond for spotting it! 2017-08-14T12:34:53+00:00
1139c1135
< [Template merge - langs/und] Removed from the default build rules the automatic removal of +Comp tags in adverbs. That is definitely not a behavior we want universally. 2017-07-02T01:40:00+00:00
---
> [Template merge - langs/und] Removed from the default build rules the automatic removal of +Comp tags in adverbs. That is definitely not a behavior we want universally. 2017-07-02T00:25:29+00:00
1141c1137
< [Template merge - langs/und] Fixed a bug that caused the check_analysis_regressions.sh script to fail if you hadn't put giella-core/scripts/ in your path - which is not automatically done when you just check out giella-core and your language of interest. 2017-06-30T00:57:44+00:00
---
> [Template merge - langs/und] Fixed a bug that caused the check_analysis_regressions.sh script to fail if you hadn't put giella-core/scripts/ in your path - which is not automatically done when you just check out giella-core and your language of interest. 2017-06-30T00:31:02+00:00
1144,1146c1140
< [Template merge - langs/und] Changed command to extract the specified fst name, the old version was not reliable. 2017-06-29T01:18:11+00:00
< Updated svn ignores. 2017-06-28T23:37:25+00:00
< Updated svn ignores. 2017-06-28T23:08:42+00:00
---
> [Template merge - langs/und] Changed command to extract the specified fst name, the old version was not reliable. 2017-06-29T01:06:15+00:00
1195,1197c1189,1191
< [Template merge - langs/und] Due to wrong AM conditional, it still built a few mobile speller fst's. Now it should be quiet. 2017-05-23T09:32:25+00:00
< [Template merge - langs/und] Really do disable mobile spellers by default... 2017-05-23T08:57:05+00:00
< [Template merge - langs/und] Made mobile spellers not build by default, even when enabling spellers. The mobile spellers must now be explicitly enabled. 2017-05-23T08:39:53+00:00
---
> [Template merge - langs/und] Due to wrong AM conditional, it still built a few mobile speller fst's. Now it should be quiet. 2017-05-23T09:19:48+00:00
> [Template merge - langs/und] Really do disable mobile spellers by default... 2017-05-23T08:51:24+00:00
> [Template merge - langs/und] Made mobile spellers not build by default, even when enabling spellers. The mobile spellers must now be explicitly enabled. 2017-05-23T08:32:45+00:00
1205c1199
< [Template merge - langs/und] Removed Ins() around Unknown. This triggered a bug(?) in hfst-tokenise, that caused wordforms not to be output. Speed and memory consumption should not be noticeably affected. 2017-05-16T17:01:39+00:00
---
> [Template merge - langs/und] Removed Ins() around Unknown. This triggered a bug(?) in hfst-tokenise, that caused wordforms not to be output. Speed and memory consumption should not be noticeably affected. 2017-05-16T16:59:19+00:00
1268c1262
< [Template merge - langs/und] Improved pmatch scripts - unification by reference instead of full fst unification. Reduces file size by ≈2/3, and runtime memory consumption by 50%. 2017-05-04T10:22:09+00:00
---
> [Template merge - langs/und] Improved pmatch scripts - unification by reference instead of full fst unification. Reduces file size by ≈2/3, and runtime memory consumption by 50%. 2017-05-04T10:18:16+00:00
1275c1269
< [Template merge - langs/und] Now that there is a new version of Hfst out, require it. Should resolve issues with compiling the url.lexc file. 2017-04-18T16:22:44+00:00
---
> [Template merge - langs/und] Now that there is a new version of Hfst out, require it. Should resolve issues with compiling the url.lexc file. 2017-04-18T15:19:10+00:00
1282,1286c1276,1279
< ign 2017-03-21T19:49:19+00:00
< [Template merge - langs/und] Further development of the analysis regression check: added support for diff views of all diff types, and now you can specify which diff view you want to see (and you must specify at least one). You can also override the default corpus, and specify a corpus of your own with the -c/--corpus option. Also corrected the initial description of the script in the help text, and added a diff view comparing the old pipeline using Xerox with the new pipeline using hfst-tokenise. This will help in finding unwanted differences between the two. 2017-03-17T12:48:36+00:00
< [Template merge - langs/und] Further improvements to the analysis regression check: only do function and dependency analysis if the required cg3 files exist. Also clarified the -d option and silenced the Xerox lookup tool. 2017-03-16T14:34:03+00:00
< [Template merge - langs/und] Improved analysis regression check script: added a short help text, and added an option to ask for a diff between old-style (preprocess+lookup+lookup2cg) and new-style (hfst-tokenise+mwe-disamb+cg-mwesplit) morphological analysis. Intended to be used to find weak (and strong!) spots in the new-style morphological analysis. 2017-03-16T12:21:56+00:00
< [Template merge - langs/und] Added the first version of a $LANG/devtools/ script that will process a corpus with the available tools, and compare the result against the previous version in the svn repository. The idea is to be able to easily spot regressions in analyses due to changes in the lexicons or CG rules. There are a number of rough edges, but it works. 2017-03-16T10:12:06+00:00
---
> [Template merge - langs/und] Further development of the analysis regression check: added support for diff views of all diff types, and now you can specify which diff view you want to see (and you must specify at least one). You can also override the default corpus, and specify a corpus of your own with the -c/--corpus option. Also corrected the initial description of the script in the help text, and added a diff view comparing the old pipeline using Xerox with the new pipeline using hfst-tokenise. This will help in finding unwanted differences between the two. 2017-03-17T12:44:23+00:00
> [Template merge - langs/und] Further improvements to the analysis regression check: only do function and dependency analysis if the required cg3 files exist. Also clarified the -d option and silenced the Xerox lookup tool. 2017-03-16T14:33:20+00:00
> [Template merge - langs/und] Improved analysis regression check script: added a short help text, and added an option to ask for a diff between old-style (preprocess+lookup+lookup2cg) and new-style (hfst-tokenise+mwe-disamb+cg-mwesplit) morphological analysis. Intended to be used to find weak (and strong!) spots in the new-style morphological analysis. 2017-03-16T12:17:12+00:00
> [Template merge - langs/und] Added the first version of a $LANG/devtools/ script that will process a corpus with the available tools, and compare the result against the previous version in the svn repository. The idea is to be able to easily spot regressions in analyses due to changes in the lexicons or CG rules. There are a number of rough edges, but it works. 2017-03-16T10:03:42+00:00
1287a1281
> [Template merge - langs/und] Only remove generated lemma files if the lemma generation tests succeed. 2017-03-14T14:05:45+00:00
1289,1293c1283,1287
< [Template merge - langs/und] Only delete generated dic and tex files if one really wants to start anew. Do not delete the version.txt file, only the generated wordlist file. 2017-03-07T18:46:22+00:00
< [Template merge - langs/und] Add the url parser also to the grammar checker tokeniser. 2017-03-07T15:01:20+00:00
< [Template merge - langs/und] Make the url.hfst a dependent of the hfst tokenising analyser. Improved the tokeniser based on recent changes in sme. 2017-03-06T17:08:41+00:00
< [Template merge - langs/und] Removed automatic inclusion of the url parsing fst. The union with the regular fst blew up the total, in some cases more than 10x! The preferred way of adding it is to add it in the last steps of the *.tmp.fst > *.fst processing by loading it onto the stack (and inverse it for hfst) before saving the fst stack, and thus creating a transducer file with two fst's. Applying the input to them both will in effect union them, giving the output we want without blowing up the size of the fst file. 2017-03-03T14:19:52+00:00
< [Template merge - langs/und] Added support for compiling a lexc file for parsing URL's as such, giving them a separate tag. Only added to the descriptive analysers for now. Requires an updated version of giella-shared, due to the new file needed for the new functionality. 2017-03-02T14:17:12+00:00
---
> [Template merge - langs/und] Only delete generated dic and tex files if one really wants to start anew. Do not delete the version.txt file, only the generated wordlist file. 2017-03-07T18:45:30+00:00
> [Template merge - langs/und] Add the url parser also to the grammar checker tokeniser. 2017-03-07T15:00:06+00:00
> [Template merge - langs/und] Make the url.hfst a dependent of the hfst tokenising analyser. Improved the tokeniser based on recent changes in sme. 2017-03-06T17:06:10+00:00
> [Template merge - langs/und] Removed automatic inclusion of the url parsing fst. The union with the regular fst blew up the total, in some cases more than 10x! The preferred way of adding it is to add it in the last steps of the *.tmp.fst > *.fst processing by loading it onto the stack (and inverse it for hfst) before saving the fst stack, and thus creating a transducer file with two fst's. Applying the input to them both will in effect union them, giving the output we want without blowing up the size of the fst file. 2017-03-03T14:13:08+00:00
> [Template merge - langs/und] Added support for compiling a lexc file for parsing URL's as such, giving them a separate tag. Only added to the descriptive analysers for now. Requires an updated version of giella-shared, due to the new file needed for the new functionality. 2017-03-02T14:04:15+00:00
1295,1298c1289,1291
< [Template merge - langs/und] Corrects an inconsistency in the order of tag changing processing, where generators and analysers got their tags changed in different order, which caused different tags in some cases. Fixes bug #2264. Thanks to Heiki-Jaan Kaalep for the new and corrected code. 2017-03-02T06:40:00+00:00
< Updated svn ignores. 2017-03-01T12:02:48+00:00
< [Template merge - langs/und] Updated Python feedback to correctly state that Python 3.5 is required. 2017-02-27T09:33:35+00:00
< [Template merge - langs/und] Fixed issue with link generation thanks to Heiki-Jaan Kaalep. 2017-02-22T09:03:27+00:00
---
> [Template merge - langs/und] Corrects an inconsistency in the order of tag changing processing, where generators and analysers got their tags changed in different order, which caused different tags in some cases. Fixes bug #2264. Thanks to Heiki-Jaan Kaalep for the new and corrected code. 2017-03-02T06:35:58+00:00
> [Template merge - langs/und] Updated Python feedback to correctly state that Python 3.5 is required. 2017-02-27T09:21:02+00:00
> [Template merge - langs/und] Fixed issue with link generation thanks to Heiki-Jaan Kaalep. 2017-02-22T08:58:22+00:00
1304,1305c1297,1298
< [Template merge - langs/und] Increased required version of Python3, due to the updated speller test bench. 2017-02-15T08:02:20+00:00
< [Template merge - langs/und] New version of the speller test bench, now with sortable table columns, and optional timing of the suggestions for every input word (hfst-ospell-office only). Not finished, but working quite well. It is also possible now to specify the number of suggestions returned by hfst-ospell-office. 2017-02-14T09:38:50+00:00
---
> [Template merge - langs/und] Increased required version of Python3, due to the updated speller test bench. 2017-02-15T07:57:46+00:00
> [Template merge - langs/und] New version of the speller test bench, now with sortable table columns, and optional timing of the suggestions for every input word (hfst-ospell-office only). Not finished, but working quite well. It is also possible now to specify the number of suggestions returned by hfst-ospell-office. 2017-02-14T09:37:59+00:00
1308c1301
< [Template merge - langs/und] Increased required version of giella-core due to bug fix in the core. 2017-02-03T11:51:18+00:00
---
> [Template merge - langs/und] Increased required version of giella-core due to bug fix in the core. 2017-02-03T11:46:31+00:00
1310c1303
< [Template merge - langs/und] Increased required version of giella-core due to changes in speller building. 2017-02-03T09:50:59+00:00
---
> [Template merge - langs/und] Increased required version of giella-core due to changes in speller building. 2017-02-03T09:45:31+00:00
1313c1306
< [Template merge - langs/und] One more attempt at fixing the giella-common package bug. 2017-02-02T08:57:48+00:00
---
> [Template merge - langs/und] One more attempt at fixing the giella-common package bug. 2017-02-02T08:46:50+00:00
1317,1326c1310,1318
< [Template merge - langs/und] Added final step in building pattern-based hyphenators: now also prepared for Hunspell-like OOo hyphenation. Requires new version of the giella-core. Also corrected bug in checking the version number of giella-common. 2017-02-01T11:11:30+00:00
< [Template merge - langs/und] Tex pattern based hyphenation generation works. The output must be checked and tested, and the process may have to be rerun several times to get the desired hyphenation behavior. Removed outcommented build code from the old infra - the new build code is essentially just a reformulation of the old one. 2017-01-31T14:44:34+00:00
< [Template merge - langs/und] Added support for checking the version of the giella-common package (aka giella-shared/). Added two new regexes to the source file list for shared regexes. Updated the required version of Hfst - it has not been updated in ages. 2017-01-31T13:56:33+00:00
< [Template merge - langs/und] Further work on the pattern based hyphenators: added tra file template, which is used to 'translate' non-ASCII chars to ascii only for the pattern creation process. Initial build steps for the pattern build. 2017-01-31T12:26:09+00:00
< [Template merge - langs/und] Improved the fst-based hyphenator by removing irrelevant paths from the fst. Started work on the pattern-based hyphenator, based on code from the old infra. 2017-01-31T11:12:44+00:00
< [Template merge - langs/und] Finished first version of fst-based hyphenator: now includes plain rules as a fall-back solution (including for misspelled words), and Err-tagged forms get a high weight penalty. In general, this seems to give good hyphenation patterns if one picks the first (lowest-weight) one. 2017-01-30T13:51:38+00:00
< [Template merge - langs/und] First version of lexicon-based and fst-based hyphenation done. Works, but misses capitalised words, and does not give extra weights to Err-tagged word forms. Also no hyphenation of misspelled words yet. Hyphenation builds are off by default. 2017-01-30T12:14:37+00:00
< [Template merge - langs/und] Added template file for weighting tags when the fst is used as a hyphenator. 2017-01-30T10:42:47+00:00
< Updated svn ignores. 2017-01-30T10:04:48+00:00
< [Template merge - langs/und] Added check for cg-relabel when enabling apertium. Thanks to Flammie for identifying the issue. 2017-01-30T09:31:50+00:00
---
> [Template merge - langs/und] Added final step in building pattern-based hyphenators: now also prepared for Hunspell-like OOo hyphenation. Requires new version of the giella-core. Also corrected bug in checking the version number of giella-common. 2017-02-01T11:03:20+00:00
> [Template merge - langs/und] Tex pattern based hyphenation generation works. The output must be checked and tested, and the process may have to be rerun several times to get the desired hyphenation behavior. Removed outcommented build code from the old infra - the new build code is essentially just a reformulation of the old one. 2017-01-31T14:37:09+00:00
> [Template merge - langs/und] Added support for checking the version of the giella-common package (aka giella-shared/). Added two new regexes to the source file list for shared regexes. Updated the required version of Hfst - it has not been updated in ages. 2017-01-31T13:46:02+00:00
> [Template merge - langs/und] Further work on the pattern based hyphenators: added tra file template, which is used to 'translate' non-ASCII chars to ascii only for the pattern creation process. Initial build steps for the pattern build. 2017-01-31T12:15:05+00:00
> [Template merge - langs/und] Improved the fst-based hyphenator by removing irrelevant paths from the fst. Started work on the pattern-based hyphenator, based on code from the old infra. 2017-01-31T10:17:49+00:00
> [Template merge - langs/und] Finished first version of fst-based hyphenator: now includes plain rules as a fall-back solution (including for misspelled words), and Err-tagged forms get a high weight penalty. In general, this seems to give good hyphenation patterns if one picks the first (lowest-weight) one. 2017-01-30T13:47:32+00:00
> [Template merge - langs/und] First version of lexicon-based and fst-based hyphenation done. Works, but misses capitalised words, and does not give extra weights to Err-tagged word forms. Also no hyphenation of misspelled words yet. Hyphenation builds are off by default. 2017-01-30T12:08:14+00:00
> [Template merge - langs/und] Added template file for weighting tags when the fst is used as a hyphenator. 2017-01-30T10:28:43+00:00
> [Template merge - langs/und] Added check for cg-relabel when enabling apertium. Thanks to Flammie for identifying the issue. 2017-01-30T09:19:29+00:00
1331c1323
< [Template merge - langs/und] Added basic dir structure for building hyphenators. 2017-01-27T07:35:00+00:00
---
> [Template merge - langs/und] Added basic dir structure for building hyphenators. 2017-01-27T07:14:28+00:00
1337c1329
< Replaced gtcore with giella-core, or just removed it where not needed. 2017-01-25T12:13:25+00:00
---
> Replaced gtcore with giella-core. 2017-01-25T11:50:25+00:00
1342c1334,1335
< [Template merge - langs/und] Replaced gtcore with giella-core. 2017-01-25T09:59:45+00:00
---
> Replaced gtcore with giella-core. 2017-01-25T10:39:33+00:00
> [Template merge - langs/und] Replaced gtcore with giella-core. 2017-01-25T09:38:25+00:00
1346d1338
< Moved file from old to new infra. 2017-01-23T11:44:48+00:00
1348,1349c1340,1341
< [Template merge - langs/und] Added test dir for hyphenators, to store data from the old infra. 2017-01-23T10:54:58+00:00
< [Template merge - langs/und] Added test dirs for listbased spellcheckers, if we ever get to that. 2017-01-23T09:11:43+00:00
---
> [Template merge - langs/und] Added test dir for hyphenators, to store data from the old infra. 2017-01-23T10:48:15+00:00
> [Template merge - langs/und] Added test dirs for listbased spellcheckers, if we ever get to that. 2017-01-23T09:03:14+00:00
1351,1355c1343,1347
< [Template merge - langs/und] Fixed logical error in the handling of negated specified fst handling in yaml tests (e.g. ~xfst) - the test didn't work, and the yaml file was run when not intended. 2017-01-18T00:34:52+00:00
< [Template merge - langs/und] Fixed regression introduced in the previous commit: one-sided tests were included when looking for test data, causing a subsequent python fail when no actual test data was found. Fixed by using a stricter file name pattern. 2017-01-17T15:52:04+00:00
< [Template merge - langs/und] Added option to specify in a yaml filename that it should only be tested against a specific technology or not, by specifying one of .foma, .hfst or .xfst before the suffix part (before [.gen].yaml), and prefixed with '~' if negated (i.e. .~xfst for NOT running it against Xerox). 2017-01-17T08:48:41+00:00
< [Template merge - langs/und] Slightly more robust yaml testing code. 2017-01-16T15:14:39+00:00
< [Template merge - langs/und] Common starting point for both weighted and unweighted parts. 2017-01-16T15:07:32+00:00
---
> [Template merge - langs/und] Fixed logical error in the handling of negated specified fst handling in yaml tests (e.g. ~xfst) - the test didn't work, and the yaml file was run when not intended. 2017-01-18T00:26:16+00:00
> [Template merge - langs/und] Fixed regression introduced in the previous commit: one-sided tests were included when looking for test data, causing a subsequent python fail when no actual test data was found. Fixed by using a stricter file name pattern. 2017-01-17T15:51:18+00:00
> [Template merge - langs/und] Added option to specify in a yaml filename that it should only be tested against a specific technology or not, by specifying one of .foma, .hfst or .xfst before the suffix part (before [.gen].yaml), and prefixed with '~' if negated (i.e. .~xfst for NOT running it against Xerox). 2017-01-17T08:25:53+00:00
> [Template merge - langs/und] Slightly more robust yaml testing code. 2017-01-16T15:14:00+00:00
> [Template merge - langs/und] Common starting point for both weighted and unweighted parts. 2017-01-16T15:06:42+00:00
1362c1354
< [Template merge - langs/und] Added removal of Area tags also for specialised fst's. Fixes Korp issue reported by Ciprian. 2017-01-10T13:56:03+00:00
---
> [Template merge - langs/und] Added removal of Area tags also for specialised fst's. Fixes Korp issue reported by Ciprian. 2017-01-10T13:39:38+00:00
1372a1365
> New work with twolc. 2016-12-11T07:42:25+00:00
1375c1368
< [Template merge - langs/und] Ensure the fastest lookup method is used during hfst yaml generation tests. 2016-12-09T09:42:34+00:00
---
> [Template merge - langs/und] Ensure the fastest lookup method is used during hfst yaml generation tests. 2016-12-09T09:40:16+00:00
1414c1407
< [Template merge - langs/und] Removed the bash hack to add a css processing instruction - it is done by the perl script writing the xml file. 2016-11-28T20:13:53+00:00
---
> [Template merge - langs/und] Removed the bash hack to add a css processing instruction - it is done by the perl script writing the xml file. 2016-11-28T19:45:47+00:00
1418c1411
< [Template merge - langs/und] Removed the removal for dialect and variant tags from the grammar checker analyser, the information can be useful when generating suggestions for corrections. 2016-11-23T14:49:21+00:00
---
> [Template merge - langs/und] Removed the removal for dialect and variant tags from the grammar checker analyser, the information can be useful when generating suggestions for corrections. 2016-11-23T14:45:29+00:00
1421c1414
< [Template merge - langs/und] Removed repetition of the frequency weighted fst. The goal was to promote compounds where each part was already seen in the corpus, but it made the speller bigger and slower, and actually decreased suggestion quality slightly. — Also added code to do manual priority union, but it is buggy and outcommented for now. 2016-11-21T11:49:44+00:00
---
> [Template merge - langs/und] Removed repetition of the frequency weighted fst. The goal was to promote compounds where each part was already seen in the corpus, but it made the speller bigger and slower, and actually decreased suggestion quality slightly. — Also added code to do manual priority union, but it is buggy and outcommented for now. 2016-11-21T08:14:21+00:00
1428a1422
> [Template merge - langs/und] Added info about which file to look in to find a suitable frequency corpus cut-off location (=line number). 2016-11-18T09:26:11+00:00
1430c1424
< [Template merge - langs/und] Renamed the option --enable-hfst-dekstop-spellers (added plural 's'), and changed the behavior of it so that when disabled, zhfst files are still built (and only those). 2016-11-16T10:40:33+00:00
---
> [Template merge - langs/und] Renamed the option --enable-hfst-dekstop-spellers (added plural 's'), and changed the behavior of it so that when disabled, zhfst files are still built (and only those). 2016-11-16T09:14:37+00:00
1483c1477
< [Template merge - langs/und] Cleaner build steps for local speller filters - the regex is now copied in and compiled according to the fst-format of the speller as opposed to earlier, where the binary fst was compiled and then transformed. 2016-11-02T23:08:18+00:00
---
> [Template merge - langs/und] Cleaner build steps for local speller filters - the regex is now copied in and compiled according to the fst-format of the speller as opposed to earlier, where the binary fst was compiled and then transformed. 2016-11-02T22:47:28+00:00
1488c1482
< [Template merge - langs/und] Also moved the CmpNP filtering to the relevant languages. 2016-11-02T06:59:57+00:00
---
> [Template merge - langs/und] Also moved the CmpNP filtering to the relevant languages. 2016-11-02T04:32:23+00:00
1491,1492c1485,1486
< [Template merge - langs/und] Forgot one file in the previous commit - now that filter is completely removed from the core and template, and all language-independent processing. 2016-11-01T10:36:26+00:00
< [Template merge - langs/und] Moved the remove-norm-comp-tags.regex file from the giella-shared directory to the languages actually using it, and consequently removed it from the language-independent build files. 2016-11-01T10:25:23+00:00
---
> [Template merge - langs/und] Forgot one file in the previous commit - now that filter is completely removed from the core and template, and all language-independent processing. 2016-11-01T10:35:43+00:00
> [Template merge - langs/und] Moved the remove-norm-comp-tags.regex file from the giella-shared directory to the languages actually using it, and consequently removed it from the language-independent build files. 2016-11-01T10:17:09+00:00
1498,1499c1492,1493
< [Template merge - langs/und] Updated the speller devtools scripts to obey the new name and location of the giella-core directory. 2016-10-26T13:37:35+00:00
< [Template merge - langs/und] Added test for available GNU Make, and at least at version 3.82. Error if not found, except on OSX/macOS, where the builtin make is GNU Make 3.81 + patches, which corresponds to the required version or newer. 2016-10-26T12:27:02+00:00
---
> [Template merge - langs/und] Updated the speller devtools scripts to obey the new name and location of the giella-core directory. 2016-10-26T13:34:35+00:00
> [Template merge - langs/und] Added test for available GNU Make, and at least at version 3.82. Error if not found, except on OSX/macOS, where the builtin make is GNU Make 3.81 + patches, which corresponds to the required version or newer. 2016-10-26T12:25:37+00:00
1506,1507c1500,1501
< [Template merge - langs/und] Better support for speller filters using source files from other locations. 2016-10-20T14:33:53+00:00
< [Template merge - langs/und] Added mwe-dis.cg3, to allow disambiguation of multiword expressions and other tokenisation ambiguity. 2016-10-18T08:37:21+00:00
---
> [Template merge - langs/und] Better support for speller filters using source files from other locations. 2016-10-20T14:25:41+00:00
> [Template merge - langs/und] Added mwe-dis.cg3, to allow disambiguation of multiword expressions and other tokenisation ambiguity. 2016-10-18T08:36:24+00:00
1509,1510c1503,1504
< [Template merge - langs/und] We build the tokenising analysers directly off the disamb and grammar checker analysers in src/, assuming that they are identical. This is a reasonable assumption now that the hfst tool kit contains all necessary machinery, and we don't need to pay special attention to the requirements of the tokenisation. 2016-10-17T07:29:45+00:00
< [Template merge - langs/und] Make --with-backend-format work also for the tokenising analysers. 2016-10-17T06:43:07+00:00
---
> [Template merge - langs/und] We build the tokenising analysers directly off the disamb and grammar checker analysers in src/, assuming that they are identical. This is a reasonable assumption now that the hfst tool kit contains all necessary machinery, and we don't need to pay special attention to the requirements of the tokenisation. 2016-10-17T07:25:22+00:00
> [Template merge - langs/und] Make --with-backend-format work also for the tokenising analysers. 2016-10-17T06:40:32+00:00
1520c1514
< [Template merge - langs/und] Corrected makefile dependency for the und.timestamp file. 2016-10-10T14:50:11+00:00
---
> [Template merge - langs/und] Corrected makefile dependency for the und.timestamp file. 2016-10-10T14:49:22+00:00
1527a1522
> [Template merge - langs/und] More robustness added to the test scripts: checking several variables, testing whether the found variables are pointing to existing directories, and giving an error message if no directory is found. 2016-10-06T15:29:04+00:00
1530a1526
> [Template merge - langs/und] Changed variable name and definition to allow overriding the path to the called script, to make it easy to use a locally modified script instead. 2016-10-04T09:34:48+00:00
1531a1528
> [Template merge - langs/und] Changed variable name in devtool scripts, to reflect similar changes elsewhere. Part of fixing bug #2219. 2016-10-04T08:44:42+00:00
1556,1557c1553,1554
< [Template merge - langs/und] Corrected path for the test for availability of the giella-common resources. 2016-09-09T11:33:47+00:00
< [Template merge - langs/und] Added support for getting precompiled proofing tools libraries across the net if not found locally. Makes it actually possible to build spellers without checking out the whole of $GIELLA_HOME. Now it is also possible to just check out $GIELLA_LIBS if one still wants to build everything locally. 2016-09-09T10:39:58+00:00
---
> [Template merge - langs/und] Corrected path for the test for availability of the giella-common resources. 2016-09-09T11:31:19+00:00
> [Template merge - langs/und] Added support for getting precompiled proofing tools libraries across the net if not found locally. Makes it actually possible to build spellers without checking out the whole of $GIELLA_HOME. Now it is also possible to just check out $GIELLA_LIBS if one still wants to build everything locally. 2016-09-09T10:27:02+00:00
1562c1559
< [Template merge - langs/und] Applied backend format rules to the tools/mt/ap/filters dir. This is not future proof, but does not create problems for sme, and solves a bug in smj. The future problem is that we mix both a specified backend format (for compilation efficiency) with the default/unspecified format fst (for weighting) in the same dir, and we can't automatically say which filters need to be in the specified backend format and which should be in the default format. This needs further consideration. 2016-09-02T08:23:48+00:00
---
> [Template merge - langs/und] Applied backend format rules to the tools/mt/ap/filters dir. This is not future proof, but does not create problems for sme, and solves a bug in smj. The future problem is that we mix both a specified backend format (for compilation efficiency) with the default/unspecified format fst (for weighting) in the same dir, and we can't automatically say which filters need to be in the specified backend format and which should be in the default format. This needs further consideration. 2016-09-02T08:20:21+00:00
1564c1561
< [Template merge - langs/und] Completely clean src/transcriptions/, and also clean tools/mt/apertium/filters/. 2016-09-01T13:31:52+00:00
---
> [Template merge - langs/und] Completely clean src/transcriptions/, and also clean tools/mt/apertium/filters/. 2016-09-01T13:12:29+00:00
1570c1567
< [Template merge - langs/und] Do not use PKG_CHECK_MODULES if you don't really have to - it clutters your code and creates unneeded variables = noise. 2016-08-31T11:21:08+00:00
---
> [Template merge - langs/und] Do not use PKG_CHECK_MODULES if you don't really have to - it clutters your code and creates unneeded variables = noise. 2016-08-31T11:17:39+00:00
1572,1573c1569,1570
< [Template merge - langs/und] Corrected placeholder string for two-letter ISO language code. 2016-08-25T21:08:56+00:00
< [Template merge - langs/und] Changed the path to the css for the xml speller test results in devtools. 2016-08-25T18:59:30+00:00
---
> [Template merge - langs/und] Corrected placeholder string for two-letter ISO language code. 2016-08-25T20:22:15+00:00
> [Template merge - langs/und] Changed the path to the css for the xml speller test results in devtools. 2016-08-25T18:48:37+00:00
1576c1573
< [Template merge - langs/und] Added support for building alternate orthography fst's for dictionary and oahpa, and also morphers for alternative orthographies. Slight simplification of defs. 2016-08-24T13:18:23+00:00
---
> [Template merge - langs/und] Added support for building alternate orthography fst's for dictionary and oahpa, and also morphers for alternative orthographies. Slight simplification of defs. 2016-08-24T13:15:31+00:00
1578,1579c1575,1576
< [Template merge - langs/und] One small change to support spellers for alternative orthographies built off of the raw fst instead of the standard fst. 2016-08-23T22:11:26+00:00
< [Template merge - langs/und] Added a possibility to build fst's for alternate orthographies based on the raw fst surface forms, instead of from the default/standard orthography. 2016-08-23T20:40:58+00:00
---
> [Template merge - langs/und] One small change to support spellers for alternative orthographies built off of the raw fst instead of the standard fst. 2016-08-23T22:05:53+00:00
> [Template merge - langs/und] Added a possibility to build fst's for alternate orthographies based on the raw fst surface forms, instead of from the default/standard orthography. 2016-08-23T20:30:51+00:00
1581,1583c1578,1580
< [Template merge - langs/und] Changed all references to $(GIELLA_SHARED)/common into $(GIELLA_SHARED)/all_langs. 2016-08-23T06:28:03+00:00
< [Template merge - langs/und] Rewrote the code for identifying the location of GIELLA_CORE (former GTCORE). The code should be more robust, and is prepared to check against a pkg-config pc file as well. GTCORE is still used throughout the code, but in parallel to GIELLA_CORE, so that one can easily replace the former with the latter without causing bugs or other problems. 2016-08-22T20:22:11+00:00
< [Template merge - langs/und] Added checking for and setting of GIELLA_TEMPLATES, but only if you have defined GIELLA_MAINTAINER (renamed from GTMAINTAINER). Otherwise it is ignored. 2016-08-22T14:59:04+00:00
---
> [Template merge - langs/und] Changed all references to $(GIELLA_SHARED)/common into $(GIELLA_SHARED)/all_langs. 2016-08-23T05:19:01+00:00
> [Template merge - langs/und] Rewrote the code for identifying the location of GIELLA_CORE (former GTCORE). The code should be more robust, and is prepared to check against a pkg-config pc file as well. GTCORE is still used throughout the code, but in parallel to GIELLA_CORE, so that one can easily replace the former with the latter without causing bugs or other problems. 2016-08-22T20:14:43+00:00
> [Template merge - langs/und] Added checking for and setting of GIELLA_TEMPLATES, but only if you have defined GIELLA_MAINTAINER (renamed from GTMAINTAINER). Otherwise it is ignored. 2016-08-22T14:58:53+00:00
1585d1581
< docu 2016-08-21T15:01:48+00:00
1589,1591c1585,1588
< [Template merge - langs/und] Revert experiment with priority union - it doesn't work as expected when weights are involved. Corrected filenames in the .SECONDARY target. 2016-08-19T12:29:06+00:00
< [Template merge - langs/und] Added download links to the build feedback for 'make upload' in tools/spellcheckers/fstbased/desktop/hfst/. 2016-08-19T10:31:43+00:00
< [Template merge - langs/und] Final step to make the GIELLA_SHARED dir be found in all cases: assign the path from pkg-config to the variable. 2016-08-18T10:36:39+00:00
---
> [Template merge - langs/und] Revert experiment with priority union - it doesn't work as expected when weights are involved. Corrected filenames in the .SECONDARY target. 2016-08-19T12:21:39+00:00
> [Template merge - langs/und] Added download links to the build feedback for 'make upload' in tools/spellcheckers/fstbased/desktop/hfst/. 2016-08-19T10:24:36+00:00
> Corrections along the lines of ad hoc lexica; thanks, Trond! 2016-08-19T05:24:24+00:00
> [Template merge - langs/und] Final step to make the GIELLA_SHARED dir be found in all cases: assign the path from pkg-config to the variable. 2016-08-18T10:33:29+00:00
1594,1597c1591,1594
< [Template merge - langs/und] Added a configure test to check that there is actually data in GIELLA_SHARED. 2016-08-18T08:04:27+00:00
< [Template merge - langs/und] The giella-shared data dir is now found using several techniques in the following order: * env. variable GIELLA_SHARED * env. variable GIELLA_HOME * env. variable GTHOME * env. variable GTCORE * using --with-giella-shared=/dir/to/giella-shared * using pkg-config If all these fail, configure errors out. Since it a.o. uses GTHOME, the change should be of no concern to existing users having checked out everything. And since the svn location is still within GTCORE, it will also work for those checking out only the core and a single or a couple of languages without any action on their part. 2016-08-17T13:00:38+00:00
< [Template merge - langs/und] Second steps in renaming and splitting the gtcore into giella-core, giella-shared and giella-templates: replaced $(GTCORE)/giella-shared with the Automake variable @GIELLA_SHARED@. 2016-08-15T12:40:33+00:00
< [Template merge - langs/und] First steps in renaming and splitting the gtcore into giella-core, giella-shared and giella-templates: renamed variables. 2016-08-15T11:30:36+00:00
---
> [Template merge - langs/und] Added a configure test to check that there is actually data in GIELLA_SHARED. 2016-08-18T08:03:28+00:00
> [Template merge - langs/und] The giella-shared data dir is now found using several techniques in the following order: * evn. variable GIELLA_SHARED * evn. variable GIELLA_HOME * evn. variable GTHOME * evn. variable GTCORE * using --with-giella-shared=/dir/to/giella-shared * using pkg-config If all these fail, configure errors out. Since it a.o. uses GTHOME, the change should be of no concern to existing users having checked out everything. And since the svn location is still within GTCORE, it will also work for those checking out only the core and a single or a couple of languages without any action on their part. 2016-08-17T12:39:35+00:00
> [Template merge - langs/und] Second steps in renaming and splitting the gtcore into giella-core, giella-shared and giella-templates: replaced $(GTCORE)/giella-shared with the Automake variable @GIELLA_SHARED@. 2016-08-15T12:14:16+00:00
> [Template merge - langs/und] First steps in renaming and splitting the gtcore into giella-core, giella-shared and giella-templates: renamed variables. 2016-08-15T11:28:11+00:00
1607,1798c1604
< Change e-mail address 2015-05-21T14:03:58+00:00
< moved 2012-12-03T15:39:52+00:00
< moved 2012-12-03T15:39:48+00:00
< moved 2012-12-03T15:39:46+00:00
< moved 2012-12-03T15:39:44+00:00
< moved 2012-12-03T15:39:42+00:00
< moved 2012-12-03T15:39:39+00:00
< info 2012-12-03T09:00:50+00:00
< moved 2012-12-03T08:59:50+00:00
< moved 2012-12-03T08:59:37+00:00
< moved 2012-11-15T08:14:26+00:00
< mvd 2012-11-15T08:08:40+00:00
< mvd 2012-11-15T07:58:35+00:00
< mvd 2012-11-15T07:56:46+00:00
< moved 2012-11-15T07:56:19+00:00
< moved 2012-11-15T07:54:54+00:00
< mvd 2012-11-15T07:54:07+00:00
< mvd 2012-11-15T07:53:43+00:00
< mvd 2012-11-15T07:51:07+00:00
< mvd 2012-11-15T07:50:52+00:00
< mvd 2012-11-15T07:50:32+00:00
< mvd 2012-11-15T07:50:25+00:00
< mvd 2012-11-15T07:50:14+00:00
< mvd 2012-11-15T07:49:12+00:00
< mvd 2012-11-15T07:48:39+00:00
< mvd 2012-11-15T07:48:06+00:00
< Negation verb. 2012-07-12T12:07:44+00:00
< apostrophe 2012-07-12T11:51:26+00:00
< From smn 2012-07-12T07:25:14+00:00
< resfolder 2012-07-12T07:23:56+00:00
< Added some adverbs. 2012-07-11T07:34:06+00:00
< Corrected multichar symbols +Sǧ, +Sǥ. 2012-07-11T07:33:50+00:00
< Working with SEDGGJED, KAMRDED and VOLLJED 2012-06-21T09:56:36+00:00
< Work with Auli Oksanen in SEDGGJED and VOLLJED verb types 2012-06-21T09:55:22+00:00
< Working with mainsted, vuejted, siltteed etc. 2012-06-15T12:19:53+00:00
< Auli Oksanen has prepared the test documents and they have been tested by Jack Rueter 2012-06-15T12:19:12+00:00
< Adding LEED verb type 2012-06-15T09:15:58+00:00
< one line change 2012-06-15T09:15:13+00:00
< Auli Oksanen produced this table. Jack Rueter commented out the suppletive forms and tested it to correlate with verb type LEED 2012-06-15T09:14:40+00:00
< Addressing verb type KAMRDED 2012-06-15T08:30:15+00:00
< Auli Oksanen has provided the forms and Jack Rueter has adapted the twol and verb-sms-morph.txt to this verb type KAMRDED 2012-06-15T08:29:19+00:00
< Most _âd_, _ad_ and _ed_ verbs should work 2012-06-12T12:09:21+00:00
< This verb, _kuullâd_ works fine 2012-06-12T12:06:22+00:00
< U´vdded is now valid 2012-06-12T11:34:04+00:00
< Copied the corrections from the single-paradigm files to the collected file. 2012-06-07T21:22:46+00:00
< Corrected 2012-06-07T14:23:22+00:00
< Replace ACUTE ACCENT with MODIFIER LETTER ACUTE ACCENT 2012-06-07T10:59:45+00:00
< tags 2012-06-07T10:52:32+00:00
< Enhancing and correcting 2012-06-06T15:46:15+00:00
< Working with _u_ in jurdded 2012-06-06T15:45:23+00:00
< Verb type TIETTED is being split into KALMMED as well. Reassessment is being made of vowel patterns with regard to length and raising or lowering 2012-06-06T12:25:18+00:00
< Working with simultaneous vowel lengthening and raising. Tags added 2012-06-06T12:23:12+00:00
< Working with simultaneous vowel lengthening and raising. Perhaps ordered rules would be better. 2012-06-06T12:22:40+00:00
< corrections to tags 2012-06-05T10:53:59+00:00
< Adding SILTTEED and SOLLEED verbs 2012-06-05T10:53:15+00:00
< Adding participles to VIQQAD type 2012-06-05T10:00:23+00:00
< Working with ää´veed verbs and adjusting contlex names to correlate with verb type TEEVVAD-Prt, for example. 2012-06-05T09:52:36+00:00
< Adjusting multicharacter tags 2012-06-05T09:51:19+00:00
< working with gerunds and 2012-06-05T09:06:57+00:00
< Added all tests to this file. Note that there must be 5 spaces in the beginning of each test pair line. Note also that __all tags must be declared in Multichar_Symbols__. 2012-06-05T09:01:50+00:00
< minor corrections to tags 2012-06-05T08:54:50+00:00
< minor adjustments to contlex 2012-06-05T08:53:48+00:00
< adjusting conjugation tags and ending 2012-06-05T08:52:43+00:00
< adding more multicharacters 2012-06-05T08:52:04+00:00
< adding verb-sms-morph.txt and noun-sms-morph.txt 2012-06-05T08:05:37+00:00
< Adding multicharacter symbols 2012-06-05T07:57:52+00:00
< Adding multicharacter symbols 2012-06-05T07:50:27+00:00
< Retaining a copy of previous twol work in personal gt/sms/src/loc-twol-sms.txt and adding new extended work with examples 2012-06-05T07:47:19+00:00
< Commenting out POORRÂD lexica and conjoining them with VIQQAD to correlate with Rueter twol-sms.txt work 2012-06-05T07:45:22+00:00
< Adding translations and some contlexs 2012-06-05T07:43:43+00:00
< Adding separate morphologies for nouns and verbs. This means that the ones in sms-lex.txt will be commented out 2012-06-05T07:41:24+00:00
< Adding separate morphologies for nouns and verbs. This means that the ones in sms-lex.txt will be commented out 2012-06-05T07:41:12+00:00
< Renamed file to avoid svn-OSX decomposed UTF8 trouble. 2012-06-05T06:57:44+00:00
< multichars 2012-06-05T05:53:35+00:00
< Verb strings appear to require 5 blank spaces before they start 2012-06-04T19:20:58+00:00
< Verb strings appear to require 5 blank spaces before they start 2012-06-04T19:13:54+00:00
< Verb strings appear to require 5 blank spaces before they start 2012-06-04T19:11:24+00:00
< simple initial info 2012-06-04T18:57:27+00:00
< additions by Auli Oksanen 2012-06-04T18:55:19+00:00
< additions by Auli Oksanen 2012-06-04T18:46:44+00:00
< adding -âd verb test 2012-06-04T18:24:14+00:00
< adding -ed verb test 2012-06-04T18:12:14+00:00
< correction made to Ind+Prs+Sg1 -am 2012-06-04T18:11:42+00:00
< Added ´ insertion for vowel shortening. 2012-06-04T16:30:35+00:00
< Added PrfPrc form, and added ´ insertion in past tense. 2012-06-04T16:30:01+00:00
< Keeping the distinct lexicon for -âd for now. 2012-06-04T16:29:21+00:00
< no dual here 2012-06-04T16:28:52+00:00
< Corrected space before code, removed comments. 2012-06-04T16:28:34+00:00
< Adding verbs and glosses from Koltansaamen koulukielioppi 2009 2012-06-04T12:25:05+00:00
< turning the acute treatment, refining several rules 2012-06-03T22:39:52+00:00
< tetong mõõnnâd without ´ in the stem, looked at multichar. and at verb affixes 2012-06-03T22:37:16+00:00
< tetong mõõnnâd without ´ in the stem 2012-06-03T22:36:44+00:00
< Adding missing letters to the alphabet _ʒ š_ and upper-case _Č Ǩ Ǯ Ǧ Ž Đ Ǥ Ʒ Š Õ Â_ 2012-06-01T07:12:55+00:00
< Whitespace changes, rule simplification, and a new consonant gradation rule. Work by Trond, Sjur, Tomi, Jack. 2012-05-30T11:10:03+00:00
< Added missing multichar symbol +4 - but is it really needed? Done by Trond. 2012-05-30T11:07:53+00:00
< Corrections by Jack. 2012-05-30T11:07:03+00:00
< Commented in all the multichar symbols. Why were they commented out? Look into this. 2012-05-30T06:36:49+00:00
< Fixed path. Now the command __ make fsttest GTLANG=sms __ (in gt) works. 2012-05-30T06:30:10+00:00
< Needed for smstest target, for some reason. 2012-05-30T06:29:13+00:00
< Auli Oksanen has provided this paradigm. 2012-05-11T10:41:57+00:00
< M4 work on SMS. No M4 code found, but the same escaped symbols and morphological boundary markers have been added, to ensure consistency across languages. Also the punctuation file has been updated to follow SME. 2012-03-07T08:18:04+00:00
< Adding final missing pieces for OOo support for sms. 2011-09-28T18:15:04+00:00
< Short but correct metadata file with correct language code and name. 2011-09-28T14:23:52+00:00
< Added very basic support files for hfst spellers for Skolt Sámi/SMS. Svn copy from SMN. 2011-09-28T14:20:02+00:00
< Commented in some lexica, added all nouns from the oahpa program, without any classification. 2011-08-30T09:30:34+00:00
< looking into the morphological for sms 2011-08-30T08:36:50+00:00
< looking into the morphological for sms 2011-08-30T08:35:48+00:00
< looking into the morphological for sms 2011-08-30T08:35:12+00:00
< Moving big files from $GTHOME to $GTBIG. 2011-04-18T14:56:32+00:00
< ignore 2011-04-12T20:22:00+00:00
< intermediate folder 2011-04-12T20:21:39+00:00
< Added name tags to all Sami languages, in order not to forget it in the future. 2011-03-25T17:34:25+00:00
< Replaced the variable $(TARGET) with $(GTLANG) everywhere. TARGET as a variable name is very unfortunate (extremely ambiguous, especially in a software build context), GTLANG is to the point (and does not conflict with the env. variable LANG). 2011-01-03T08:00:42+00:00
< Updated file according to Tina Sanila's comments. 2010-06-26T16:03:57+00:00
< Pointing new users two levels up. 2010-06-11T13:01:29+00:00
< 1-1000 looks ok. 2010-06-08T06:03:57+00:00
< 1-100 correct. 2010-06-08T05:55:48+00:00
< git-svn-id: https://gtsvn.uit.no/langtech/trunk/gt/sms@32338 c7155fb1-f0a7-4240-a2fc-2600b6f42f90 2010-06-06T15:21:48+00:00
< Corrections. 2010-06-02T13:17:16+00:00
< A first approximation of an sms numeral generator. It has not been proofread against a numeral list, and contains at least some consonant gradation errors here and there. 2010-06-02T07:12:13+00:00
< Already catered for. 2010-03-13T10:12:40+00:00
< preliminary 2010-03-13T10:11:16+00:00
< Reverting, making separate. 2010-03-13T10:10:48+00:00
< numerals-only forthcoming. 2010-03-13T10:09:02+00:00
< missing contlex stopping compilation. 2010-02-11T13:38:54+00:00
< Added GPL license note. 2010-01-18T22:28:26+00:00
< sound file for the dictionary 2009-11-28T18:19:25+00:00
< sound file for the dictionary 2009-11-28T18:18:36+00:00
< sound file for the dictionary 2009-11-28T18:17:40+00:00
< sound file for the dictionary 2009-11-28T18:16:03+00:00
< sound file for the dictionary 2009-11-28T18:15:21+00:00
< sound file for the dictionary 2009-11-28T18:14:34+00:00
< sound file for the dictionary 2009-11-28T18:13:52+00:00
< sound file for the dictionary 2009-11-28T18:12:44+00:00
< sound file for the dictionary 2009-11-28T18:11:26+00:00
< sound file for the dictionary 2009-11-28T18:09:13+00:00
< sound file for the dictionary 2009-11-28T18:07:48+00:00
< sound file for the dictionary 2009-11-28T18:07:01+00:00
< sound file for the dictionary 2009-11-28T18:06:04+00:00
< sound file for the dictionary 2009-11-28T18:05:19+00:00
< sound file for the dictionary 2009-11-28T18:03:28+00:00
< • The $(wildcard ) function is meaningless if there is no file there, and when used together with the automatic generation of the transducer, it effectively causes everything to be made, because it returns null -> the all target in the main makefile is called • added printout before going to the top to make the transducer • synchronised sma, smj and sme 2009-11-27T17:14:05+00:00
< Changed the SAVEFILE make command to always generate the correct transducer, irrespective of which transducer is specified. 2009-11-27T14:17:45+00:00
< svn checkin test 2009-11-24T12:15:11+00:00
< svn checkin test 2009-11-24T12:12:14+00:00
< svn checkin test 2009-11-24T12:09:29+00:00
< added new status report on SJE-SJD-SMS cooperation 2009-11-24T10:25:51+00:00
< automatically generated Skolt file from FileMaker format 2009-11-24T07:04:25+00:00
< minutes of the sjd sms sje meeting with Micha, Lena and Josh 2009-11-23T07:41:35+00:00
< new raw dictionary files in sms/inc and sjd/inc 2009-11-21T13:21:11+00:00
< new folders /sms/inc and /sjd/inc (for incoming working files) 2009-11-21T12:32:46+00:00
< Commented out the adjectives. 2009-10-22T14:46:10+00:00
< Ignore generated files. 2009-07-01T12:43:34+00:00
< * Set properties for all known file-types 2009-04-21T09:48:21+00:00
< Remove files that don't belong in svn 2009-04-09T14:14:25+00:00
< ign 2008-12-08T18:43:51+00:00
< ign 2008-12-08T18:42:53+00:00
< propedit 2008-12-08T18:37:50+00:00
< propedit 2008-12-08T18:35:44+00:00
< polderlandcatalogue 2008-07-05T14:07:02+00:00
< Note. This is just a placeholder, so that the Makefile gets its sms-num.txt file. It contains South Sámi numerals, they should be exchanged with sms numerals. 2008-05-02T13:55:20+00:00
< Added verb class I vowel alternations. 2008-03-04T17:52:14+00:00
< Added « character to -stem lower side. 2008-03-04T17:51:47+00:00
< Added verb class I lexicons. 2008-03-04T17:51:02+00:00
< Added verbclass I LEXICONS and verbs. 2008-03-04T10:09:54+00:00
< Added twol file, but it doesn't compile :-( 2008-03-03T21:48:39+00:00
< Added verb class TEEVVAT. 2008-03-03T21:47:37+00:00
< Makefile 2008-03-02T23:11:31+00:00
< verbcodes 2008-03-02T23:11:07+00:00
< v-para 2008-03-02T23:10:48+00:00
< nouncodes 2008-03-02T23:10:34+00:00
< ref to proper 2007-09-10T14:14:19+00:00
< changes 2007-09-10T14:04:18+00:00
< propernoun files. 2007-09-10T13:20:26+00:00
< added ref to propernoun. 2007-09-10T13:20:06+00:00
< Fixed Sg1 and Sg2. 2007-03-29T07:03:15+00:00
< UTF-8. 2007-03-28T22:31:38+00:00
< basic utf8 encoding fix, still no reconversion. 2007-03-28T22:27:41+00:00
< Split closed in Makefile. 2007-03-28T22:26:55+00:00
< And forgot this one. 2007-01-11T20:40:39+00:00
< UTF-8, but these are not actually operative. 2007-01-11T20:40:15+00:00
< Split. 2007-01-11T20:39:32+00:00
< Removed save fst command. 2006-12-03T21:38:10+00:00
< First version of Skolt Sami parser 2003-03-04T11:11:11+00:00
< added * cvsignore, all files invisible 2002-09-20T13:03:47+00:00
< Added cvsignore 2002-09-20T12:50:30+00:00