24h購物| | PChome| 登入
2013-05-05 08:56:47| 人氣438| 回應0 | 上一篇 | 下一篇

[UVA][BWT轉換] 632 - Compression (II)

推薦 0 收藏 0 轉貼0 訂閱站台


  Compression (II) 

The fast growth of program sizes and amount of stored data raises the need for efficient compression algorithms. They can significantly reduce the amount of storage space. Also the ``Internet boom", still suppressed by slow links has big influence on new developments in this field.


One of the most modern methods is block sorting. Its first step performed on the data is Burrows-Wheeler transform (BWT). The idea, used by the authors, comes from well-established in image compression Fourier transform and it says: transform the data to the form which is easier to compress and then compress it using one of the well-known algorithms.


Let's assume that as input we have a character string S of length N. The first step of the transform is to generate from string S, N strings S0, S1, ..., SN-1 in the following way:

  • string S0 is identical to S
  • string Sk is generated from string Sk-1 by moving the first character to the end.


The example that illustrates this algorithm is shown below:

S = PASCAL


S0 P A S C A L
S1 A S C A L P
S2 S C A L P A
S3 C A L P A S
S4 A L P A S C
S5 L P A S C A

In the second step all the strings are sorted to give:

S4 A L P A S C
S1 A S C A L P
S3 C A L P A S
S5 L P A S C A
S0 P A S C A L
S2 S C A L P A

The results of the transform, which are used in further work are:

  • last column of just created array (CPSALA in this case)
  • number of the row (starting from 0) in which S1 is placed

It appears that such data are easier to perform an inverse transform and easier to compress by other methods than original input data.


Your task is to write a program, which performs above-described BWT transform on input string.

Input 

The first line of the input is an integer M, then a blank line followed by M datasets. There is a blank line between datasets.

In the first line of each dataset comes an integer N < 1997 which stands for string length. Subsequent lines contain a string to transform. Each line (besides the last one) contains exactly 50 characters.

Output 

For each dataset, there should be position of S1 string in the first line. Subsequent lines should contain a string to transform. Each line (besides the last one) should contain exactly 50 characters. Print a blank line between datasets.

Sample Input 

1

6
pascal

Sample Output 

1
cpsala



Miguel A. Revilla
2000-01-10

特別注意一下,輸入是很多行代表一個字串,一行最多 50 個。
輸出益然,一行最多打印 50 個,超過就換行。

#include <stdio.h>
#include <string.h>
#include <algorithm>
using namespace std;
char s[2048];
int a[2048], len;
bool cmp(int a, int b) {
    int i;
    for(i = 0; i < len; i++)
        if(s[(a+i)%len] != s[(b+i)%len])
            return s[(a+i)%len] < s[(b+i)%len];
    return false;
}
int main() {
    int t, i, j;
    scanf("%d", &t);
    while(t--) {
        scanf("%d", &len);
        int idx = 0;
        while(getchar() != '\n');
        while(gets(s+idx)) {
            idx += strlen(s+idx);
            if(idx == len)
                break;
        }
        for(i = 0; i < len; i++)
            a[i] = i;
        sort(a, a+len, cmp);
        int S1 = 0;
        for(i = 0; i < len; i++) {
            if(a[i] == 1)
                S1 = i;
        }
        printf("%d\n", S1);
        for(i = 0; i < len; i++) {
            if(i%50 == 0 && i)   puts("");
            printf("%c", s[(a[i]+len-1)%len]);

        }
        puts("");
        if(t)
            puts("");
    }
    return 0;
}
/*
5

1
a
5
aaaaa
12
abcabcabcabc
55
12345678901234567890123456789012345678901234567890
12345
50
12345678901234567890123456789012345678901234567890
*/

台長: Morris
人氣(438) | 回應(0)| 推薦 (0)| 收藏 (0)| 轉寄
全站分類: 不分類 | 個人分類: UVA |
此分類下一篇:[UVA][字串處理] 726 - Decode
此分類上一篇:[UVA][字串處理] 554 - Caesar Cypher

是 (若未登入"個人新聞台帳號"則看不到回覆唷!)
* 請輸入識別碼:
請輸入圖片中算式的結果(可能為0) 
(有*為必填)
TOP
詳全文