博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
CS 246: Objective-oriented programming
阅读量:6479 次
发布时间:2019-06-23

本文共 7778 字,大约阅读时间需要 25 分钟。

Objective oriented programming

Knuth-Morris-Pratt algorithm

In computer science, the Knuth-Morris-Pratt string searching algorithm (or KMP algorithm) searches for occurrences of a "word" W within a main "text string" S by employing the observation that when a mismatch occurs, the word itself embodies sufficient information to determine where the next match could begin, thus bypassing re0examination of previously matched characters.

The algorithm was conceived in 1970 by Donald Knuth and Vaughan Pratt, and independelty by James H.Morris.

The most straightforward algorithm is to look for a character match at successive values of the index m, the posisiotn in the string being searched, ie. S[m]. If the index m reaches the end of the string then there is no match, in which case the search is said to fial. At each position m the algorithm first checks for equality of the first character in the word being searched, I.e. S[m] = ? W[0]. If a match is found, the algorithm tests the other characters in the word being searched by checking successive values of the word position index, I. The algorithm retrieves the character W[I] in the word being searched and checks for equality of the expression S[m+i] = ? W[I]. If all successive characters match in W at position m, then a match is found at that position in the search string.

Usually, the trail check will quickly reject the trail match. If the strings are uniformly distributed random letters, then the chance that charaters match is 1 in 26. In most cases, the tral check will reject the match at the initial letter. The chance that the first two letters will match is 1 in 26^2 ( 1 in 676). So if the characters are random, then the expected complexity of searching string S[] of length k is on the order of k comparisons or O(k). The expected performance is very good. If S[] is 1 billion characters and W[] is 100 characters, then the string search should complete after about one billion character comparisons.
The expected peroframnce is not guaranteed. If the strings are not random, then checking a trial m may take many character comparisons. The worst case is if the two strings match in all but the last letter. Imagine that the string S[] consists of 1 billion characters that are all A, and that the word W[] is 999 A characters terminating in a final B character. The simple string matching algorithm will now examine 1000 characters at each trail position before rejecting the match and advancing the trail position. The simple string search example would now take about 1000 character comparisons times 1 billion positions for 1 trillion character comparisons. If the length of W[] is n, then the worst-case performance is O(kn)

The KMP algorithm has a better worst-case performance than the straightforward algorithm. KMP spends a little time precomputing a table, and then it uses that table to do an efficient search of the string in O(k)

The difference is that KMP makes use of previous match information that the straightforward algorithm does not.

Description of pseudocode for the search algorithm

We assume the existence of a "partial match" table T, described below, which indicates where we need to look for the start of a new match in the event that the current one ends in a mismatch. The entries of T are constructed so that if we have a match starting at S[m] that fails when comparing S[m + I] to W[I], then the next possible match will start at index m + I - T[i] in S. This has two implications: first, T[0] = -1, which indicates that if W[0] is a mismatch, we cannot backtrack and must simply check the next character. and second, although the next possible match will begin at index m + I - T[i], as in the example above, we need not actually check any of the T[I] characters after that, so that we continue searching from W[T[I]]. The following is a sample pseudocode implementation of the KMP search algorithm.

algorithm kmp_search:

input:    an array of characters, S (the text to be searched)    an array of characters, W (the word sought)output:    an integer (the zero-based position in S at which W is found)define variables:    an integer, m <- 0 (the beginning of the current match in S)

Lecture 3: egrep | permissions| Scripts

egrep cs246 index.html

[]:= pattern
|:= or, h|c matches h or c. cat|dog matches cat or dog
usage: regexpattern file
* := 0 or more repetitions of the preceding subexpression
(cs)*246 := ex. 246, cs246, cscs246
cs*246 := ex. c246, cs246, css246
for * to not have a special meaning, escape it by using
cs\*246
?:= 0 or 1 of proceeding subexpression
ex. cs ?246:= cs246, cs 246 (cs )?246:= 246, cs 246
+:= 1 or more of the proceeding subexpression
.:= any one characters
.*: any number of any characters
ex. cs.*246:= matches lines that contain a string starting with cs & ending with 246

use ^ to indicate pattern should match start of line

use $ to indicate pattern must end with end of line
"^cs.*246$"
^(cs.*)246$:= 246, csb246, csbcsa246

e.g. Fetch lines of even length

(..)* is not enough as you need to say the pattern start and end with (..)*
correct: ^(..)*$
e.g. files in the current directory that contain exactly one a
ls| egrep "^[^a]a[^a]*$" if "" is not added then will be interpreted as a glopping item
words in the dictionary containing exactly 5 characters
egrep ".{5}" file

Permissions

ls -l:= long listing of pwd

--rwxr--xr-- 1 naneem staff 28 Jan10 abcd.txt
type|user bits| group bits | others bits shortcuts owners group size last modified filename
w: write
x: execute as a program
r: read
group bits: permissions all others members of group have
user bits: permissions that the owner has
others bits: permissions that all others(people not in group staff in this case) has

r for directory:= read contents

w for directory:= write contents
x for directory:= enters a directory

Changing Permissions

only the owner can change permissions

chmod mode file(s)
mode:= ownership class:= U_users, g_group, o_others, a_all
mode:= operator:= +, -, =
mode:= r, w, x

425

100, 010, 101
r--, -w-, r-x

Programming using shell

vairables

$> x=1
variable=value

access the value

usage: $x (beter ${x})

In comparison: $(...) is the embedded command line
e.x. x=1
echo $x
y = 1, not working because of space
values in shell are stored as strings
dir=~/1171/a1
cd $dir
echo '$dir': single quote won't let embedded code to excecute

Script

Script: text file with Linux commands that are executed as a program

#!/bin/bash:= shebang line change this file into command line file
e.x. File basic
chmod a+x basic
Usage: basic, doesn't work, tell the shell where to find it for example ./basic
tells the shell that basic is in the current directory
Arguments to scripts: e.g. is a word in the dictionary
usage: ./isItAWord($0) hello($1)

cat isItAWord:= egrep "^$1$" /usr/share/dict/words

!/bin/bash

Answers whetehr a candidate word might be a good password

usage () {

echo "Usage: $0 password" >&2 (find in piazza| redirecting)exit 1

}

Using $0 so that if someone change the file name, it still work
${#} := number of args to script
if [${#} -ne 1]; then

usage

fi

syntax for if: if [condition]; then statement elif [condition]; then statement else statement /n fi

egrep "^$1$" /usr/share/dict/words > /dev/null
Loops
e.g. Print numbers 1 to $1
x = 1
while [$x -le $1]; do

echo ${x}x = $((x+1))

done

e.x. for x in a b c d; do echo $x; done

File: RenameC
change all .C files to .cc

  1. mv Hello.C Hello.cc

  2. filename = Hello.C mv ${filename} ${filename%C}cc

for name in *.C; do

mv ${name} ${name%C}cc

done

eg. count the number of occurrence of $1 in file $2
count = 0
for word in $(cat $2); do

if [${word}] == $1]; then    count = $((count + 1))fi

done

echo ${count}
usage: ./countWords "To be" ../sample.txt "" prevent the shell from treat to be as two arguments

转载地址:http://zpwuo.baihongyu.com/

你可能感兴趣的文章
兼容几乎所有浏览器的透明背景效果
查看>>
Linux VNC server的安装及简单配置使用
查看>>
阿里宣布开源Weex ,亿级应用匠心打造跨平台移动开发工具
查看>>
Android项目——实现时间线程源码
查看>>
招商银行信用卡重要通知:消费提醒服务调整,300元以下消费不再逐笔发送短信...
查看>>
C#_delegate - 调用列表
查看>>
[转]Windows的批处理脚本
查看>>
多维数组元素的地址
查看>>
数据库运维体系_SZMSD
查看>>
js的AJAX请求有关知识总结
查看>>
三分 POJ 2420 A Star not a Tree?
查看>>
修改OBS为仅直播音频
查看>>
OCA读书笔记(3) - 使用DBCA创建Oracle数据库
查看>>
Python基础进阶之路(一)之运算符和输入输出
查看>>
阻塞非阻塞异步同步 io的关系
查看>>
ClickStat业务
查看>>
spring3.0.7中各个jar包的作用总结
查看>>
Windows 10 /win10 上使用GIT慢的问题,或者命令行反应慢的问题
查看>>
Windows平台分布式架构实践 - 负载均衡
查看>>
iOS自定制tabbar与系统的tabbar冲突,造成第一次点击各个item图片更换选中,第二次选中部分item图片不改变...
查看>>