[Help] How to count the number of times each word appears in an English article
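Newbie here, asking the experts for advice: how do I count the number of times each word appears in an English article? Words are separated by spaces or punctuation, and the article may be very long, so ideally the solution should also take performance into account.
Thanks!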
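#include <cstring>   // strtok
#include <fstream>
#include <iostream>
#include <map>
#include <string>
using namespace std;

// Count every word read from ins into dir.
// Returns false if the stream could not be opened.
bool Init(ifstream &ins, map<string,int> &dir)
{
    if (ins.fail())
        return false;
    // Separators: whitespace plus common punctuation.
    const char seps[] = " ,;.!'\t\n\"<>=-+*/%|&^~()[]{}?:@#$";
    string line;
    // std::getline grows the string as needed, so a line longer than
    // a fixed-size buffer does not end the loop early.
    while (getline(ins, line))
    {
        // strtok writes into its input, so tokenize the string's own buffer.
        char *token = strtok(&line[0], seps);
        while (token != NULL)
        {
            dir[token]++;
            token = strtok(NULL, seps);
        }
    }
    ins.close();
    return true;
}

// Look up a word without inserting it into the map when it is absent.
int RepeatTimes(const string &word, const map<string,int> &dir)
{
    map<string,int>::const_iterator it = dir.find(word);
    return it == dir.end() ? 0 : it->second;
}

int main()
{
    map<string,int> mp;
    ifstream ins("file.dat");   // stack object, so nothing to delete
    if (Init(ins, mp))
        cout << RepeatTimes("love", mp) << endl;
    return 0;
}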
[This post was edited by the author on 2007-7-2 13:39:09]