Finding words

A common problem when processing incoming text is to isolate the words in the text. This is made more difficult by the punctuation; words have commas, "quote marks", (even brackets) next to them, or hy-phens in the middle of the word. This punctuation doesn't count as letters when the words have to be looked up in a dictionary by the program.

For this problem, you must separate out "clean" words from text, that is, words with no attached or embedded non-letters. A "word" is any continuous string of non-whitespace characters, with whitespace characters on each side of it. For this problem, a "whitespace" character is a space character or an end-of-line character, or the start or end of the file (so that, for example, if the input file consists of 'Anne Bob', where there is a space character between the A and B but no other, then there are two words, 'Anne' and 'Bob').
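The definition of a word above can be sketched as a small splitter. This is only an illustration, not part of the problem statement: `splitWords` is a hypothetical helper name, and it assumes the line has already been read without its end-of-line character, so the only whitespace left inside it is the space character.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split one line into "words" as defined above,
// i.e. maximal runs of non-space characters. The start and end of the
// line act as whitespace, so no leading or trailing space is required.
std::vector<std::string> splitWords(const std::string& line) {
    std::vector<std::string> words;
    std::string current;
    for (char c : line) {
        if (c == ' ') {
            // A space closes the word in progress, if there is one.
            if (!current.empty()) {
                words.push_back(current);
                current.clear();
            }
        } else {
            current += c;
        }
    }
    // The end of the line also closes the last word.
    if (!current.empty()) {
        words.push_back(current);
    }
    return words;
}
```

Called on the line `Anne Bob`, this returns the two words `Anne` and `Bob`. Note that a splitter like this throws away the number of spaces between words, so it shows the definition only; the required output must keep every space.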

Input

Input will consist of lines with no more than 60 characters in each line. Every line will be terminated by a character which isn't whitespace (which will be followed immediately by an end-of-line character). The input will be terminated by a line consisting of a single `#'.

Output

Output must be the lines of the incoming text, with the non-letters stripped away from each word. A non-letter is any character which is not a letter (a-z and A-Z) and not a whitespace character. Your program must not change the letters and space characters. When a non-letter occurs in the middle of a word (i.e., there is no whitespace character next to it), it must simply be removed (see what happens to the word "doesn't" in the example). A word which consists entirely of non-letters will therefore be removed entirely.
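For a single line, the stripping rule above amounts to a character filter. The following is a minimal sketch under that reading; `stripNonLetters` is a hypothetical name, and the special rule for a hyphen at the end of a line is deliberately not handled here.

```cpp
#include <string>

// Hypothetical helper: keep letters (a-z, A-Z) and spaces, drop everything else.
// Letters and spaces are copied unchanged, so spacing in the line is preserved.
std::string stripNonLetters(const std::string& line) {
    std::string out;
    for (char c : line) {
        bool isLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        if (isLetter || c == ' ') {
            out += c;
        }
    }
    return out;
}
```

With this filter, `doesn't` becomes `doesnt`, and a word made only of non-letters such as `12345` disappears while the spaces around it stay in place.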

There is a special rule for a hyphen (`-') when it is the very last character in a line:

the word part before the hyphen and the first word part on the next line form a single word;
this complete word must be written on a line by itself;
you can assume that there will always be a space before the word part on the first line, and a space after the word part on the second line. These 2 spaces must appear in the output.
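The hyphen rule can be illustrated on a pair of lines. The sketch below is an assumption-laden example rather than a full solution: `joinHyphenated` and `keepLettersAndSpaces` are hypothetical names, the first line is assumed to end in a hyphen, and the word part before the hyphen is assumed to contain at least one letter. It returns the three output lines that the two input lines produce.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: keep only letters (a-z, A-Z) and spaces.
std::string keepLettersAndSpaces(const std::string& text) {
    std::string out;
    for (char c : text) {
        bool isLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        if (isLetter || c == ' ') {
            out += c;
        }
    }
    return out;
}

// Hypothetical helper: `first` is assumed to end with '-'.
// Returns three output lines: the rest of the first line (including the
// space before the word part), the joined word on its own line, and the
// rest of the second line (starting with the space after the word part).
std::vector<std::string> joinHyphenated(const std::string& first,
                                        const std::string& second) {
    // Split the first line after its last space.
    std::string::size_type lastSpace = first.rfind(' ');
    std::string::size_type headLen =
        (lastSpace == std::string::npos) ? 0 : lastSpace + 1;
    std::string head = first.substr(0, headLen);
    std::string partA = first.substr(headLen);   // e.g. "hy-"

    // Split the second line before its first space.
    std::string::size_type firstSpace = second.find(' ');
    std::string::size_type partLen =
        (firstSpace == std::string::npos) ? second.size() : firstSpace;
    std::string partB = second.substr(0, partLen);  // e.g. "phens"
    std::string tail = second.substr(partLen);

    return { keepLettersAndSpaces(head),
             keepLettersAndSpaces(partA + partB),
             keepLettersAndSpaces(tail) };
}
```

Applied to the two sample lines ending in `or hy-` and starting with `phens in`, the middle element is `hyphens`, the first element ends with the space after `or`, and the last element starts with the space before `in`.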

Sample Input

A common problem when processing incoming text is to isolate
the words in the text.  This is made more difficult by the
punctuation; words have commas, "quote marks",
(even brackets)      next to them, or hy-
phens in the middle of the word.  This punctuation doesn't
count as letters when the words have to be looked up in a
# dictionary by the 12345 "**&! program.
#

Sample Output

A common problem when processing incoming text is to isolate
the words in the text  This is made more difficult by the
punctuation words have commas quote marks
even brackets      next to them or 
hyphens
 in the middle of the word  This punctuation doesnt
count as letters when the words have to be looked up in a
 dictionary by the   program

I have been trying out some new syntax lately, and it seems to be going well.

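#include <cctype>
#include <iostream>
#include <string>
using namespace std;

int main() {
    string line, tmp = "";  // tmp holds the letters of the word being built
    bool next = false;      // true while a word is carried over a line-ending hyphen
    while (getline(cin, line)) {
        if (line == "#")    // a line consisting of a single '#' ends the input
            break;
        int len = line.length();
        for (int i = 0; i < len; i++) {
            // Cast to unsigned char: isspace/isalpha are undefined for negative values.
            unsigned char c = line[i];
            if (isspace(c)) {
                // Whitespace ends the current word: flush it, then echo the space.
                cout << tmp;
                if (next)   // a joined word must be on a line by itself
                    cout << endl;
                cout << line[i];
                next = false;
                tmp = "";
            } else if (isalpha(c)) {
                tmp += line[i];
            } else if (c == '-') {
                // A hyphen is always dropped. As the very last character of a
                // line, after some letters, it also marks a word that continues
                // on the next line, so tmp is kept.
                if (tmp.length() > 0 && i == len - 1)
                    next = true;
            } else {
                // Any other non-letter is dropped; the letters collected so far
                // are flushed, which keeps the output in order.
                cout << tmp;
                tmp = "";
                if (next)
                    cout << endl;
                next = false;
            }
        }
        // At the end of a line, flush the pending word unless it continues
        // on the next line.
        if (!next) {
            cout << tmp;
            tmp = "";
        }
        cout << endl;
    }
    return 0;
}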


Tags: 892, Finding words

Blog owner: Morris
