# Blank out non-zero weights.
# Helper script for manual review of UCA DUCET and CLDR root collation data files.
# Most of the collation element weights change with every new version.
# "Blanking out" the weights makes files comparable,
# for finding changes in sort order and changes in lengths of weights.
#
# sed -r -f blankweights.sed FractionalUCA.txt > frac-7.0.txt

# protect allkeys 0000 weights
s/0000/@@4ZEROES@@/g

# fractional primary weights
s/\[[0-9A-F]{2},/[pp,/g
s/\[[0-9A-F]{2} [0-9A-F]{2},/[pp pp,/g
s/\[[0-9A-F]{2} [0-9A-F]{2} [0-9A-F]{2},/[pp pp pp,/g
# fractional secondary weights
s/, [0-9A-F]{2},/, ss,/g
s/, [0-9A-F]{2} [0-9A-F]{2},/, ss ss,/g
# fractional tertiary weights
s/, [0-9A-F]{2}\]/, tt]/g

# allkeys primary weights
s/\[[0-9A-F]{4}/[pppp/g
s/\[([.*])[0-9A-F]{4}/[\1pppp/g
# allkeys secondary weights
s/\.[0-9A-F]{4}\./.ssss./g
# leave fixed allkeys tertiary weights
# s/\.[0-9A-F]{4}\]/.tttt]/g

# restore zero weights
s/@@4ZEROES@@/0000/g
