24h購物| | PChome| 登入
2013-07-18 13:28:23| 人氣856| 回應0 | 上一篇 | 下一篇

[UVA][bitmask] 12515 - Movie Police

推薦 0 收藏 0 轉貼0 訂閱站台


  Movie Police 

Movie Police (MP) is an international top secret law enforcement agency, controlling illegal movie downloads on internet. With their elite team of programmers, MP has developed a very smart algorithm to produce movie signatures. A movie signature is a binary string, one bit for every frame in the movie, so that the i-th bit in the signature corresponds to the i-th frame in the movie. The algorithm is so amazing that it outputs consistently the same signature for versions of the same film with different quality resolutions. One application of this revolutionary technology uses it to detect if a small clip is part of a movie, looking for a high similarity between the clip signature and the movie signature.


Now MP has started to apply this technology and, as a first step, a massive online database of movie signatures was already built. As a new member of the MP crew, you must write a program that, given the signature of a clip, finds the index in the MP database of a movie whose signature matches the clip signature at most. That is, a movie whose signature has a substring, of the same length of the clip signature, that is most similar to the clip signature.


Similarity between strings of the same length is defined by means of their Hamming distance (number of bits that do not match), so that ``more similar'' means ``less Hamming distance''.

Input 

The first line of the input contains two positive integer numbers M and Q, separated by a blank, where M indicates the number of movie signatures in the database and Q indicates the number of clip signatures to process ( 1 $ leq$ M $ leq$ 1000, 1 $ leq$ Q $ leq$ 500). Each one of the following M lines contains a binary string si describing the i-th movie signature in the database. You may suppose that si has length li, where 1 $ leq$ li $ leq$ 100. Finally, there are Q lines, each one with a binary string that corresponds to a clip signature to search for maximal similarity in the database. You may assume that, for every clip signature to be searched, there is at least one movie signature in the database whose length is greater or equal than the clip's length.

Output 

For each clip signature given in the input, output a single line with the lowest index i of a movie si ( 1 $ leq$ i $ leq$ M) that matches the clip at most, as above explained. If there are two movie signatures that match the clip signature maximally, answer the one with lower index in the database.

Sample Input 

3 1
000011
1101111111
1111100000
1000111

Sample Output 

2



題目描述:

現在有很多電影的片段,以及驗證版權碼,對於每個電影片段,找到其中一段 substring 去對應驗證碼,去計算漢明距離,即有多少 bit 是不同的,找一個漢明距離最小的電影片段。

題目解法:

對於一組詢問,會消耗 O(M*len*len),每個電影抓出來進行比較,又去切割 substring,要跟驗證碼一樣長,然後還要去比較。這樣很容易 TLE,但至少也要跑 O(M) 次,只能將比較次數往下壓。

於是將 32 bit 做一次 mask,藉由 xor 得到一個整數,在 32 bit 數字中得到 1 個個數
使用
__builtin_popcount 涵式去計算 1 的個數,因次一次可以跳 32 的次數。

基本上比較的次數將會變成 O(len/32) => 總共是 O(M*len*len/32)。
不可能的解就跳出。

#include <stdio.h>
#include <iostream>
#include <algorithm>
#include <string.h>
using namespace std;
char s[1005][105], ss[105];
unsigned int v[1005][5];
int dlen[1005];
int main() {
    int n, q;
    int i, j, k;
    while(scanf("%d %d", &n, &q) == 2) {
        int i, j, k;
        for(i = 0; i < n; i++) {
            scanf("%s", s[i]);
            dlen[i] = strlen(s[i]);
        }
        memset(v, 0, sizeof(v));
        for(i = 0; i < n; i++) {
            for(j = 0, k = 0; j < dlen[i]; j++) {
                v[i][k] = (v[i][k]<<1)|(s[i][j]-'0');
                if(j%32 == 31)  k++;
            }
        }
        while(q--) {
            scanf("%s", ss);
            int len = strlen(ss);
            int vv[105] = {};
            for(i = 0; i < len; i++) {
                for(j = 0; j < 32 && i+j < len; j++) {
                    vv[i] = (vv[i]<<1)|(ss[i+j]-'0');
                }
            }
            int mn = 0xfffffff, ret = 0;
            int p, q, diff, val;
            for(i = 0; i < n; i++) {
                int submn = 0xffffff;
                for(j = 0; j+len <= dlen[i]; j++) {
                    p = j, q = 0;
                    diff = 0;
                    while(q < len) {
                        if(q+32 <= len && p%32 == 0 && p+32 <= dlen[i]) {
                            val = vv[q]^v[i][p/32];
                            diff += __builtin_popcount(val);
                            p += 32, q += 32;
                        } else {
                            diff += (ss[q] != s[i][p]);
                            p++, q++;
                        }
                        if(diff >= mn)  break;
                    }
                    submn = min(submn, diff);
                }
                if(submn < mn) {
                    mn = submn, ret = i;
                }
            }
            //printf("%d\n", mn);
            printf("%d\n", ret+1);
        }
    }
    return 0;
}
/*
2 1
0000000000000000000000000000000000000000000000000000000000000000
1111111111111111111111111111111111111111111111111111111111111111
1000000000000000000000000000000000000000000000000000000000000000
*/
// __builtin_popcount

台長: Morris
人氣(856) | 回應(0)| 推薦 (0)| 收藏 (0)| 轉寄
全站分類: 不分類 | 個人分類: UVA |
此分類下一篇:[UVA] 234 - Switching Channels
此分類上一篇:[UVA][greedy] 12498 - Ant's Shopping Mall

是 (若未登入"個人新聞台帳號"則看不到回覆唷!)
* 請輸入識別碼:
請輸入圖片中算式的結果(可能為0) 
(有*為必填)
TOP
詳全文