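GitHub project address | GitHub project address |
---|---|
Where the assignment requirements are | Link to the assignment requirements |
Pair partner's link | Partner's link |
Personal blog address | Personal address |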
This is us implementing the key code during pair programming. Working closely together, the two of us finally got the key code working.
PSP2.1 | Personal Software Process Stages | Estimated time (minutes) | Actual time (minutes) |
---|---|---|---|
Planning | Planning | 40 | 55 |
· Estimate | · Estimate how long the task will take | 90 | 90 |
Development | Development | 60 | 75 |
· Analysis | · Requirements analysis (including learning new technologies) | 30 | 35 |
· Design Spec | · Write the design document | 15 | 18 |
· Design Review | · Design review (review the design document with colleagues) | 20 | 20 |
· Coding Standard | · Coding standard (define suitable conventions for the current development) | 5 | 5 |
· Design | · Detailed design | 30 | 40 |
· Coding | · Coding | 120 | 150 |
· Code Review | · Code review | 40 | 60 |
· Test | · Testing (self-testing, fixing code, committing changes) | 50 | 70 |
Reporting | Reporting | 100 | 120 |
· Test Report | · Test report | 50 | 55 |
· Size Measurement | · Measure the workload | 20 | 20 |
· Postmortem & Process Improvement Plan | · Postmortem and process improvement plan | 15 | 10 |
Total | | 685 | 823 |
b. Design and implementation process
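Broadly, apart from the class that holds the main function, there is one large class, getFile, which has seven methods. The public method getDic builds the dictionary: it stores every word that is at least four characters long and does not start with a digit, together with the number of times it occurs, and returns a Hashtable. The method getWordFre() sorts the dictionary by occurrence count and returns a dynamic array (an ArrayList).
The remaining methods can be called directly, and the array returned by getWordFre is used to implement the corresponding features. The unit tests exercise the functionality of each of these methods. The flowchart of the whole program is shown below.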
```csharp
// Requires: using System.Collections; using System.IO; using System.Text.RegularExpressions;
// getDic: count word frequencies in a text file and store them in a Hashtable.
public Hashtable getDic(string pathName, ref Hashtable wordList)
{
    Regex rg = new Regex("[0-9A-Za-z-]+");   // matches candidate words
    Regex regNum = new Regex("^[0-9]");      // matches a leading digit
    using (StreamReader sr = new StreamReader(pathName))   // closed automatically
    {
        string line;
        while ((line = sr.ReadLine()) != null)   // read line by line
        {
            MatchCollection mc = rg.Matches(line);
            for (int i = 0; i < mc.Count; i++)
            {
                string mcTmp = mc[i].Value.ToLower();   // case-insensitive
                // Keep words of at least 4 characters that do not start with a digit.
                if (mcTmp.Length >= 4 && !regNum.IsMatch(mcTmp))
                {
                    if (!wordList.ContainsKey(mcTmp))
                        wordList.Add(mcTmp, 1);   // first occurrence: add as a key
                    else
                        wordList[mcTmp] = (int)wordList[mcTmp] + 1;   // otherwise increment
                }
            }
        }
    }
    return wordList;
}
```
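getDic(string pathName, ref Hashtable wordList) extracts every word from the text and stores each word's frequency in a Hashtable. It opens the file with a StreamReader and reads it line by line in a while loop. Inside the loop body, a regular expression matches the words on each line, and the inner for loop filters the matches: words that meet the conditions are added to the dictionary and the rest are discarded. Finally the method returns the Hashtable.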
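The counting step described above can also be sketched with a generic `Dictionary<string, int>` instead of a `Hashtable`, which avoids the casts from `object`. This is only a minimal sketch, not the project's code: the class `WordCountSketch` and its `Count` method are hypothetical names, and it reads from a `TextReader` so that it can be tried on an in-memory string.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original getFile class.
public static class WordCountSketch
{
    private static readonly Regex Token = new Regex("[0-9A-Za-z-]+");

    // Counts words of at least 4 characters that do not start with a digit.
    public static Dictionary<string, int> Count(TextReader reader)
    {
        var counts = new Dictionary<string, int>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (Match m in Token.Matches(line))
            {
                string word = m.Value.ToLowerInvariant();
                if (word.Length < 4 || char.IsDigit(word[0]))
                    continue;   // too short, or starts with a digit
                int current;
                counts.TryGetValue(word, out current);   // current stays 0 if the word is new
                counts[word] = current + 1;
            }
        }
        return counts;
    }
}
```

For example, `WordCountSketch.Count(new StringReader("File file 123abc tool"))` gives a count of 2 for `file` and 1 for `tool`, and skips `123abc` because it starts with a digit.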
```csharp
// getWordFre: return the dictionary's words ordered by descending frequency.
public ArrayList getWordFre(string pathName, ref Hashtable wordList)
{
    // getDic fills wordList and returns the same Hashtable.
    Hashtable Wordlist_fre = getDic(pathName, ref wordList);
    ArrayList keysList = new ArrayList(Wordlist_fre.Keys);
    keysList.Sort();   // alphabetical first, so words with equal counts stay alphabetical

    // Stable insertion sort by descending occurrence count.
    for (int i = 1; i < keysList.Count; i++)
    {
        string tmp = keysList[i].ToString();
        int valueTmp = (int)wordList[tmp];   // occurrence count
        int j = i;
        while (j > 0 && valueTmp > (int)wordList[keysList[j - 1]])
        {
            keysList[j] = keysList[j - 1];
            j--;
        }
        keysList[j] = tmp;
    }
    return keysList;
}
```
getWordFre(string pathName, ref Hashtable wordList) sorts the wordList passed to it by frequency, converts the Hashtable into a dynamic array (an ArrayList), and returns it.
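The same ordering (highest frequency first, alphabetical among ties) can be expressed with LINQ rather than a hand-written insertion sort. The sketch below is an illustration under assumptions: `FrequencySortSketch` and `ByFrequency` are hypothetical names, the input is a generic dictionary rather than the `Hashtable` used in the project, and ties are broken with ordinal string comparison, which may differ from the default comparer used by `ArrayList.Sort()`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original getFile class.
public static class FrequencySortSketch
{
    // Orders words by descending count; equal counts fall back to ordinal alphabetical order.
    public static List<string> ByFrequency(IDictionary<string, int> counts)
    {
        return counts
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)
            .Select(pair => pair.Key)
            .ToList();
    }
}
```

Given the counts `tool: 1`, `file: 3`, `able: 1`, the result is `file`, `able`, `tool`.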
```csharp
// Requires: using System; using System.Collections; using System.IO;
public void write(string outputPath, ref Hashtable wordList, int lines, int words, int characters,
                  int wordsOutNumFla, int wordsOutNum, int m, string inputPath)
{
    getFile Wordlist = new getFile();
    ArrayList keysList1 = Wordlist.getPhrase(inputPath, outputPath, ref wordList, m);
    // Kept from the original: getWordFre is given outputPath here, not inputPath.
    ArrayList keysList = Wordlist.getWordFre(outputPath, ref wordList);

    // Without -n (wordsOutNumFla != 1), default to the ten most frequent words.
    if (wordsOutNumFla != 1)
        wordsOutNum = 10;
    // Never index past the end of the word list.
    wordsOutNum = Math.Min(wordsOutNum, keysList.Count);

    using (StreamWriter sw = new StreamWriter(outputPath))   // flushed and closed automatically
    {
        sw.WriteLine("characters:{0}", characters);
        sw.WriteLine("words:{0}", words);
        sw.WriteLine("lines:{0}", lines);
        for (int i = 0; i < wordsOutNum; i++)
        {
            sw.WriteLine("<{0}>:{1}", keysList[i], wordList[keysList[i]]);
        }
        sw.WriteLine("The following are phrases of length {0}:\n", m);
        foreach (string j in keysList1)
        {
            sw.WriteLine("<{0}>:{1}", j, 1);
        }
    }
}
```
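Writing the file is fairly simple, but one detail matters: after opening a file you must close it, otherwise appending to the file a second time throws an error.
I originally wrote to the file in two passes and forgot to close it after the first open, which caused exactly this error. It is worth remembering.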
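One way to avoid the forgotten-close problem is to wrap each writer in a `using` block, so the file is closed even if an exception is thrown. The sketch below writes in two passes, the second in append mode; `TwoPassWriteSketch` and its parameters are hypothetical names chosen for illustration.

```csharp
using System.IO;

// Hypothetical helper, not part of the original getFile class.
public static class TwoPassWriteSketch
{
    public static void Write(string path, string header, string body)
    {
        // First pass: create or overwrite the file. Leaving the block closes the writer.
        using (var sw = new StreamWriter(path, false))
        {
            sw.WriteLine(header);
        }

        // Second pass: reopen in append mode. This works because the first writer is closed.
        using (var sw = new StreamWriter(path, true))
        {
            sw.WriteLine(body);
        }
    }
}
```

The second constructor argument of `StreamWriter` selects append mode (`true`) or overwrite mode (`false`).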
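This method receives the values to be written to the file (the total character count, the word count, and the word frequencies) as well as wordsOutNumFla, a flag for the number of highest-frequency words to output.
The flag wordsOutNumFla decides whether to output the default ten most frequent words or the number given after the -n option.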
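The flag-plus-value pair can be replaced by a single field that simply starts at the default of ten and is overwritten when -n is present. The sketch below shows that idea for the -i, -o and -n options mentioned in this post; the `CliOptions` and `ArgsSketch` names are hypothetical, and the real program's argument handling is not shown in the post, so this is an assumption about how it could look.

```csharp
using System;

// Hypothetical options holder; the names are not taken from the original program.
public class CliOptions
{
    public string InputPath;    // value after -i
    public string OutputPath;   // value after -o
    public int TopN = 10;       // value after -n; ten words by default
}

public static class ArgsSketch
{
    // Parses "-flag value" pairs and rejects unknown flags or a missing value.
    public static CliOptions Parse(string[] args)
    {
        var options = new CliOptions();
        for (int i = 0; i < args.Length; i += 2)
        {
            if (i + 1 >= args.Length)
                throw new ArgumentException("Missing value for " + args[i]);
            string value = args[i + 1];
            switch (args[i])
            {
                case "-i": options.InputPath = value; break;
                case "-o": options.OutputPath = value; break;
                case "-n": options.TopN = int.Parse(value); break;
                default: throw new ArgumentException("Unknown option: " + args[i]);
            }
        }
        return options;
    }
}
```

With this shape, `TopN` is 10 when -n is absent and takes the parsed number otherwise, so no separate flag variable is needed.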
```csharp
[TestMethod]
public void getHangNum()
{
    int lines;
    int m = 3;
    string input_path = "C:/Users/罗伟诚/Desktop/input.txt",
           out_put = "C:/Users/罗伟诚/Desktop/out.txt";
    Hashtable wordList = new Hashtable();
    ArrayList keysList = new ArrayList();
    getFile c = new getFile();
    keysList = c.getWordFre(input_path, ref wordList);
    lines = c.getHangNum(input_path);
}
```
The test result is as shown in the figure above; there were no problems.
```csharp
[TestMethod]
public void getWordNum1()
{
    int words;
    int m = 3;
    string input_path = "C:/Users/罗伟诚/Desktop/input.txt",
           out_put = "C:/Users/罗伟诚/Desktop/out.txt";
    Hashtable wordList = new Hashtable();
    Hashtable wordList1 = new Hashtable();
    ArrayList keysList = new ArrayList();
    getFile c = new getFile();
    keysList = c.getWordFre(input_path, ref wordList);
    words = c.getWordNum(input_path);
}
```
```csharp
[TestMethod]
public void getCharactersNum1()
{
    int characters = 0;
    int m = 3;
    string input_path = "C:/Users/罗伟诚/Desktop/input.txt",
           out_put = "C:/Users/罗伟诚/Desktop/out.txt";
    Hashtable wordList = new Hashtable();
    Hashtable wordList1 = new Hashtable();
    ArrayList keysList = new ArrayList();
    getFile c = new getFile();
    keysList = c.getWordFre(input_path, ref wordList);
    // The original called getWordNum here; a character-count test should call getCharactersNum.
    characters = c.getCharactersNum(input_path);
}
```
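The tests above read a fixed file on one machine's desktop and contain no assertions, so they only show that the methods do not throw. One possible improvement is to generate the input inside the test and compare against a known value. The helper below is a sketch of that idea; `TempFileSketch` and `WithTempFile` are hypothetical names, not part of the project.

```csharp
using System;
using System.IO;

// Hypothetical test helper, not part of the original project.
public static class TempFileSketch
{
    // Writes content to a temporary file, passes its path to body, then deletes the file.
    public static T WithTempFile<T>(string content, Func<string, T> body)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, content);
            return body(path);
        }
        finally
        {
            File.Delete(path);   // clean up even if body throws
        }
    }
}
```

Inside a `[TestMethod]`, this could be used as `int lines = TempFileSketch.WithTempFile("a\nb\nc", p => new getFile().getHangNum(p));` followed by `Assert.AreEqual(3, lines);`. The expected value 3 assumes that getHangNum counts every line of the file, which the post does not show.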
```csharp
try
{
    if (inputPathFla == 1 || outputPathFla == 1)
    {
        Hashtable wordList = new Hashtable();
        Hashtable wordList1 = new Hashtable();
        ArrayList keysList = new ArrayList();
        getFile c = new getFile();
        keysList = c.getWordFre(input_path, ref wordList);
        lines = c.getHangNum(input_path);
        words = c.getWordNum(input_path);
        characters = c.getCharactersNum(input_path);
        c.write(out_put, ref wordList, lines, words, characters, wordsOutNumFla, wordsOutNum, m, input_path);
        Console.WriteLine("Finished writing the file; see {0}\n", out_put);
    }
    else
    {
        Console.WriteLine("Please use the -i and -o options to specify the input and output paths\n");
    }
}
catch (Exception)
{
    Console.WriteLine("Please check that the input path is correct");
}
```
```csharp
MatchCollection mc;
Regex rg = new Regex("[A-Za-z]+");   // match words with a regular expression
mc = rg.Matches(line);
// Slide a window of m consecutive words across the line.
for (int i = 0; i < mc.Count - m + 1; i++)
{
    string mcTmp = "";
    int t = i;
    for (int q = 0; q < m; q++)
    {
        mcTmp += mc[t].Value.ToLower() + " ";   // each phrase ends with a trailing space
        t++;
    }
    k.Add(mcTmp);
}
```
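The phrase extraction above is a sliding window over the words of one line. A compact sketch of the same idea follows; `PhraseSketch` and `Phrases` are hypothetical names, and unlike the snippet above it joins the words with single spaces and leaves no trailing space.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original getFile class.
public static class PhraseSketch
{
    private static readonly Regex Word = new Regex("[A-Za-z]+");

    // Returns every run of m consecutive words in line, lower-cased and space-separated.
    public static List<string> Phrases(string line, int m)
    {
        if (m < 1)
            throw new ArgumentOutOfRangeException("m");

        var words = new List<string>();
        foreach (Match match in Word.Matches(line))
            words.Add(match.Value.ToLowerInvariant());

        var phrases = new List<string>();
        for (int i = 0; i + m <= words.Count; i++)
            phrases.Add(string.Join(" ", words.GetRange(i, m)));
        return phrases;
    }
}
```

For the line `The quick brown fox` and `m = 3`, this returns `the quick brown` and `quick brown fox`. A line with fewer than `m` words yields an empty list.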
Through this pair-programming exercise, we summarized the benefits of pair programming.