Saturday, September 30, 2006
The DNS service has stopped working again
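Saturday, September 23, 2006
PPVOD P2P video-on-demand system: new version released
An advanced P2P video-on-demand player for the network. You must install this player before you can use the video-on-demand service this site provides.
Technical highlights
Video-on-demand client
Implemented in C++ with heavy use of design patterns, so the system runs more efficiently and stably
Built on the HTTP protocol, traverses firewalls intelligently, and supports connections between hosts on internal networks
Lower resource consumption through efficient memory management: in steady operation it uses no more than 10 MB of memory, 2 to 6 worker threads, and no more than 200 handles. The exact figures vary slightly between Windows platforms.
A more flexible and responsive scheduling algorithm and faster startup; all network operations are performed asynchronously
Supports all media formats; the data transfer method is independent of the media format
Extensible to support several mainstream media players: WMP, RP, QT
Video-on-demand server
Mature data management: MySQL + PHP + Apache
Flexible web presentation: a video-on-demand module based on xoops; the server-side module could also be implemented with ASP, JSP, and so on
Focused on user participation: users can publish their own video media, and can comment on and rate media
The player component is modular: media information can be published to other websites by embedding it with JavaScript or an IFrame
Flexible storage location for video files: they can be on the local server, on a remote server, or a video URL from the network.
An advertising interface is reserved, so text, image, and video ads can be implemented in the system
I hope this software is useful to everyone. Enthusiasts can report problems they run into while using it by email or on the forum. If you hit a bug, please describe it in detail. Thanks.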
Friday, September 22, 2006
boost::xml
http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?BoostXMLDiscussion
Monday, September 18, 2006
Sunday, September 17, 2006
FireFox Install Plugin
By default, extensions are installed for the current user only, but they can also be installed across multiple profiles and even globally.
Once installed, extensions can be configured by opening the Extensions Manager ("Tools -> Extensions"), selecting the extension from the list, and then clicking the "Options" button. (If this button is disabled then the extension is not user-configurable.)
If you experience problems installing or updating an extension you should consult the list of known causes before visiting the author’s home page for that extension to check for relevant known issues.
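A general-purpose ActiveX plugin for Firefox and Mozilla
First, thanks to sunwan for building an installer for the plugin.
I always found the official ActiveX plugin awkward to use, because every Firefox upgrade required a plugin upgrade too, and for a while I was waiting for those updates rather anxiously... So I made up my mind to write an ActiveX plugin of my own!
This plugin does not depend on Firefox upgrades and aims, as far as possible, to be installed once and for all.
The archive contains only two files; extract them into Firefox's plugins directory. npActiveX.ini is the configuration file: just write the CLSIDs of the ActiveX controls you want to allow into it. Windows Media Player and Real media player are already set up by default, and you can follow that format to add your own entries.
To open up support for all ActiveX controls without restriction, type the following at the very beginning of the file:
*
This is strongly discouraged, though: opening up all ActiveX controls is definitely not a good idea.
Also: this plugin is not intended to support ActiveX controls for online payment and the like, has not been tested with them, and will not support them in the future.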
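Installing the Mozilla ActiveX Plugin in Firefox
Installing the Mozilla ActiveX Plugin
Using ActiveX can cause security problems on your computer. Unless your work requires ActiveX on custom-designed websites or on an intranet, that is, in a controlled environment, you should not install the ActiveX plugin.
Mozilla browsers do not support ActiveX controls natively, but if your work specifically requires them, you can install a plug-in to use ActiveX in Mozilla and Firefox.
This "Installing the Mozilla ActiveX Plugin" tutorial is mainly a simplified version of http://forums.mozillazine.org/viewtopic.php?t=206213 .
Before installing, please note:
* For safety, the ActiveX plugin by default can only run Windows Media Player (you can specify yourself which ActiveX controls are allowed).
* If you have the AdBlock extension installed, turn off its "OBJ-TABS" setting; otherwise (almost) all ActiveX controls can run directly.
* This guide applies only to the official Windows builds of Firefox 1.0 and 1.0.3. Other versions may well work, but install at your own risk.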
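= Introduction =
== The Mozilla ActiveX project ==
Mozilla ActiveX is a Mozilla browser plugin written by Adam Lock, the same as the ActiveX plugin included in Netscape 6 and Netscape 7. The plugin implements APIs (application programming interfaces) identical to those of Microsoft's ActiveX, so most ActiveX controls should work.
== Implementation ==
Mozilla ActiveX implements most of the commonly used functionality. It supports some Document Object Model (DOM) manipulation, such as getting and setting HTML elements by class and id. However, because the layout engines of Mozilla and Internet Explorer differ internally, the results may differ slightly. In addition, some of the less useful methods are replaced by dummy code.
== Security settings ==
The whitelist and blacklist that determine which ActiveX controls may be downloaded, used, and scripted are controlled by the file defaults/pref/activex.js. Edit that file to control Mozilla ActiveX.
Note: Mozilla ActiveX in Firefox 1.0 has the whitelist enabled, so you must configure the security settings.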
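= Installing the ActiveX plugin =
The following steps walk you through installing the ActiveX plugin in Firefox.
== Preparation ==
You must first install Windows Media Player 9 or 10. If it is not installed yet, you can download it from the Windows Update site (which is IE-only and can be browsed only with IE).
1. Open Firefox and type "about:plugins" in the address bar.
2. Turn on "find as you type" and look on the page for the following three file names:
npdsplay.dll
npwmsdrm.dll
npdrmv2.dll
All three files are required. If any of them cannot be found, look in the Windows Media Player installation folder (by default C:\Program Files\Windows Media Player), copy them to the Firefox plugin folder (by default C:\Program Files\Mozilla Firefox\plugins), then reload the page and look again.
* If some files still cannot be found, close Firefox, install the Windows Media Player Plug-in for Netscape Navigator ( http://www.microsoft.com/windows/windowsme...oad/plugin.aspx ), then repeat the steps above and look again.
Still missing? Then uninstall Windows Media Player and download the appropriate offline installer to reinstall it. These installers may be the English versions; if you know where the Chinese versions are, please help me add them:
Windows Media Player 10 Offline Installer (Windows XP)
Windows Media Player 9 Offline Installer (Windows XP)
Windows Media Player 9 Offline Installer (Windows 98SE, ME, 2000)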
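== Installation ==
Install Adam Lock's ActiveX Plugin for Firefox 1.0.3
1. Right-click the link above and choose "Save Link As".
2. Drag the downloaded mozactivex-ff-10.xpi into the Firefox window to install it. (Once the installation is finished, this "plugin" does not appear in the Extension Manager.)
3. Restart Firefox after the installation completes.
== Verification ==
1. Open Firefox and type "about:plugins" in the address bar.
2. Check whether "Mozilla ActiveX control and plugin support" is listed. If it is, the installation succeeded; otherwise you may need to install again.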
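== Editing the registry ==
I did not perform this step myself, and so far everything still runs fine... In any case, since the original document mentions it, here it is for your reference:
1. The registry file is here; use "Save Link As".
2. Run the wmp9.reg file you downloaded; you can open the Firefox download manager and double-click the file.
3. A dialog asks whether you want to add this information; choose "Yes".
Firefox may mistakenly use an old version of the WMP plugin; editing the registry resolves this.
== Testing ==
Visit "ActiveX Test - Windows Media Player". If you can hear the music and watch the video, the installation worked.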
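= Uninstalling =
Uninstalling is very simple: you must manually delete four files from the Firefox installation directory (by default C:\Program Files\Mozilla Firefox\):
1. First close every Firefox window; none may be left open.
2. Delete "npmozax.dll" under "{Firefox installation folder}\plugins".
3. Delete "nsIMozAxPlugin.xpt" and "nsAxSecurityPolicy.js" under "{Firefox installation folder}\components".
4. Delete "activex.js" under "{Firefox installation folder}\defaults\pref".
All done. You can follow the "Testing" link above to check; if nothing shows up, the uninstall succeeded.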
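= Notes =
1. After installation the User Agent string gains an "(ax)" suffix, for example "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-cn; rv:1.7.6) Gecko/20050226 Firefox/1.0.1 (ax)". You can see it under "Help > About Firefox".
2. Installing this ActiveX plugin does not mean you can play the audio and video on every website: some sites use IE-only scripts to control playback, and in that case nothing will help.
3. If you have the AdBlock extension installed, turn off its "OBJ-TABS" setting; otherwise (almost) all ActiveX controls can run directly.
4. If, after installation, you run into Flash loading problems that never happened before, type "about:plugins" in the address bar each time you start Firefox to make sure Firefox loads all plugins.
5. If you still cannot play the multimedia files on some sites, and you are sure those files are in WMP format, contact the site's webmaster directly to report the problem.
== Related links ==
Source code download and build
Plugin component download
(Reposted from the Mozilla Taiwan Wiki; some vocabulary was changed to follow Simplified Chinese usage and related content was updated.)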
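Next priorities for the video-on-demand control development
2. Improve the installer
3. Self-service video publishing for users
4. The player can be embedded in forums and blogs through IFrame and JavaScript, but forums and blogs forbid JavaScript and IFrame. Is there any other good approach?
Current task: developing the Firefox video-on-demand plugin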
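Proxy Auto-Config File: the proxy server auto-configuration script
Often your Internet access goes through a proxy only, not NAT. Configuring the proxy server on every client machine is then tedious, especially because whenever the proxy server changes (its IP address, service port, and so on) you have to tell every user to reconfigure. A Proxy Auto-Config (PAC) script solves these problems.
You may have noticed the "Use automatic configuration script" option in Internet Explorer's proxy settings; the automatic configuration script it refers to is a PAC script. It is a JavaScript file with the .pac extension. We can put it on a web server on the internal network and point the "automatic configuration script" option of each client's IE browser at it (for example http://192.168.100.1/proxy.pac), so that the proxy configuration is managed centrally.
A PAC script can also apply different proxy policies to different requests. For example, when a user accesses a server on the internal network, the PAC script can tell the browser to bypass the proxy; when the user accesses an address outside the internal network, the PAC script tells the browser to use the proxy.
Another important use of PAC scripts is when several proxy servers coexist. Through the PAC script:
* users can be assigned at random to any one of the proxy servers to balance the traffic load;
* the administrator can control whether users use a particular proxy server, which frees up time to maintain that server;
* the servers can work in primary/backup mode: when the primary goes down, clients switch automatically to a backup server without interrupting service;
* the best proxy server can be chosen automatically according to the destination.
A PAC script must define a function named FindProxyForURL, which the browser calls automatically. Its form is: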
function FindProxyForURL(url, host)
{
    // ...
}
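Here is a simple example that has been used in practice. There is only one squid proxy server, 134.40.22.48, serving on port 3128. The script decides, by the client's IP address, how the user reaches the Internet or other resources: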
function FindProxyForURL(url, host)
{
    if (isInNet(myIpAddress(), "10.21.193.0", "255.255.255.0")) {
        return "DIRECT";
    } else {
        return "PROXY 134.40.22.48:3128";
    }
}
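With the script above, every machine on the LAN except those in 10.21.193.0/255.255.255.0 goes through the proxy server 134.40.22.48 to access the Internet.
Used flexibly, PAC scripts can make your network maintenance much easier. The format of Proxy Auto-Config files and the related functions are not covered further here; see the documentation on the Netscape site to learn more: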
http://wp.netscape.com/eng/mozilla/2.0/relnotes/demo/proxy-live.html
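Saving and displaying images with PHP and MySQL
What distinguishes the text or integer columns we normally use in a database from a column used to store images is the amount of data each must hold. MySQL uses a dedicated data type, BLOB, to store large amounts of data.
MySQL defines BLOB as follows: the BLOB data type is a binary large object that can hold a variable amount of data. There are four BLOB types, TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB, which differ only in the maximum length of data they can hold (255 bytes, 64 KB, 16 MB, and 4 GB respectively).
With the data type introduced, we can create the table that stores the images with the following statement.
CREATE TABLE Images ( PicNum int NOT NULL AUTO_INCREMENT PRIMARY KEY, Image BLOB );
Writing the upload script
We will not cover how to implement the file upload itself here; interested readers can see the related articles on "网页陶吧". Now let's look at how to receive the uploaded file and store it in the MySQL database. The script is as follows, assuming the file upload field is named Picture.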
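<?php
// Legacy PHP 3/4 style: with register_globals on, $Picture holds the temporary
// path of the uploaded file, or "none" if nothing was uploaded.
if ($Picture != "none") {
    $PSize = filesize($Picture);
    // Read the file in binary mode and escape it for use inside an SQL string.
    $mysqlPicture = addslashes(fread(fopen($Picture, "rb"), $PSize));
    mysql_connect($host, $username, $password) or die("Unable to connect to SQL server");
    @mysql_select_db($db) or die("Unable to select database");
    // The escaped data must be quoted as a string literal.
    mysql_query("INSERT INTO Images (Image) VALUES ('$mysqlPicture')") or die("Can't perform query");
} else {
    echo "You did not upload any picture";
}
?>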
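This way the image is stored in the database. If a problem occurs while inserting the image into MySQL, check the maximum packet size the MySQL server allows. If the value is too small, a corresponding entry appears in the database's error log.
Here is a brief explanation of the script above. First, the test $Picture != "none" checks whether a file was uploaded. Next, the addslashes() function escapes the data so that it does not break the SQL statement. Finally, we connect to MySQL, select the database, and insert the image.
Displaying images
Now that we know how to store an image in the database, we need to work out how to fetch it back and display it in an HTML page. This is slightly more involved, so let's walk through it.
Because PHP has to send the appropriate header to display an image, we face a problem: a script can output only one image at a time, since no further header can be sent once one has gone out.
To solve this, we write two files. The first serves as the template of the HTML page and positions the images. The second actually outputs the file stream from the database, and is used as the SRC attribute of the <IMG> tag.
A simple form of the first file: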
<?php
mysql_connect($host, $username, $password) or die("Unable to connect to SQL server");
@mysql_select_db($db) or die("Unable to select database");
$result = mysql_query("SELECT * FROM Images") or die("Can't perform query");
while ($row = mysql_fetch_object($result)) {
    // Each <IMG> tag asks Second.php3 for one image, identified by its PicNum.
    echo "<IMG SRC=\"Second.php3?PicNum=$row->PicNum\">";
}
?>
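When the HTML page is viewed, Second.php3 is called once for every image displayed. Each call passes in the corresponding picture ID, which we use to fetch the matching image from the database and display it.
Second.php3 is as follows: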
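<?php
// Connect as in the first file, then cast $PicNum to an integer to block SQL injection.
mysql_connect($host, $username, $password) or die("Unable to connect to SQL server");
@mysql_select_db($db) or die("Unable to select database");
$PicNum = intval($PicNum);
$result = mysql_query("SELECT * FROM Images WHERE PicNum=$PicNum") or die("Can't perform query");
$row = mysql_fetch_object($result);
header("Content-type: image/gif");
echo $row->Image;
?>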
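That completes the whole process of saving and displaying images with PHP and MySQL. The examples here are the simplest possible; readers can add other features to suit their own needs and make the program more complete.
Work report
Sunday, September 03, 2006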
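lighttpd, thttpd, shttpd: an introduction to lightweight web servers
In China the vast majority of web servers are either IIS or Apache, and in terms of market share I believe Apache is the big winner, holding at least half the market.
But besides IIS and Apache we actually have many options. For high-load, high-concurrency sites, a high-performance, lightweight web server is good medicine. Recently the load on one of my servers got so high that it ate all the swap and the machine became very slow. It turned out that Apache had consumed almost all the resources: there were already 9XX apache processes at the time.
So replacing apache with a lightweight web server went on the schedule. Here is a brief introduction to the candidates:
lighttpd | thttpd | shttpd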
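lighttpd
Official home page: www.lighttpd.net
Lighttpd is open-source software led by a German developer. Its fundamental goal is to provide a secure, fast, compatible, and flexible web server environment aimed at high-performance sites. It features very low memory overhead, low CPU usage, good performance, and a rich set of modules.
lighttpd is one of the better open-source lightweight web servers. It supports important features such as FastCGI, CGI, Auth, output compression, URL rewriting, and Alias. Apache is popular largely because of its rich feature set, and lighttpd already has counterparts for many of those features. This matters a great deal to Apache users, because these are exactly the issues they face when migrating to lighttpd.
A Google search turned up almost no Simplified Chinese articles about lighttpd; most are Big5 content from Taiwan. So in the coming days I would like to write a proper article introducing lighttpd, with a simple benchmark.
In use, lighttpd really is very good. The apache overload problem mentioned above was solved completely by lighttpd. Apache's main problem is that under dense concurrency, the constant fork() calls and context switches, together with its higher memory usage (relative to lighttpd), nearly exhaust the system's resources. lighttpd uses I/O multiplexing, its code is optimized, it is very small, it uses few resources, and it responds quickly.
Using apache's rewrite facility to hand the heavy cgi/fastcgi work to lighttpd makes the most of both. The load on that server has now dropped by an order of magnitude, and response speed has improved by one or even two orders of magnitude!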
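thttpd
Official site: http://www.acme.com/software/thttpd/
thttpd is a very small, lightweight web server. It is extremely simple, providing only HTTP/1.1 and simple CGI support. Its official site has a comparison chart and benchmark against other web servers (such as Apache and Zeus) for reference. Like lighttpd, thttpd does not fork() a child process for each concurrent request but uses multiplexing instead, so it performs well.
Thttpd supports many platforms, such as FreeBSD, SunOS, Solaris, BSD, Linux, and OSF. For small web servers, speed seems to come with the territory. Judging by the benchmark on the official site, thttpd is at least as fast as mainstream web servers, and faster under high load because of its small resource footprint.
Thttpd has another notable feature: URL-based bandwidth throttling, which is very convenient for controlling download traffic. Apache, for example, needs an add-on module for this, and is less efficient at it than thttpd.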
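shttpd
Official site: http://shttpd.sourceforge.net/
Shttpd is another lightweight web server, with a richer feature set than thttpd: it supports CGI, SSL, cookies, and MD5 authentication, and can be embedded into existing software. Most interesting of all, it needs no configuration file!
Because shttpd can be embedded in other software, it makes it very easy to build a web server for an embedded system. The official site says that when shttpd is used with uClibc or dietlibc (simplified subsets of libc) the overhead is very, very low. Its features are: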
Stand-alone server, or embeddable into existing C/C++ program
GET, POST, PUT, DELETE methods
CGI
SSL
Digest (MD5) authorization
Multiple (and user-definable) index files
Directory listing
Standard logging
Cookies
inetd mode
User-definable mime types
No configuration files
No external dependencies
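Because shttpd can easily be embedded in other programs, it is a fairly ideal prototype for web server development; developers can build their own web server on top of shttpd.
Saturday, September 02, 2006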
sed, a stream editor
version 3.02, 28 June 1998
by Ken Pizzini
Table of Contents
Introduction
Invocation
SED Programs
Selecting lines with SED
Overview of regular expression syntax
Where SED buffers data
Often used commands
Less frequently used commands
Commands for die-hard SED programmers
Some sample scripts
About the (non-)limitations on line length
Other resources for learning about SED
Reporting bugs
Concept Index
Command and Option Index
Introduction
SED is a stream editor. A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an editor which permits scripted edits (such as ED), SED works by making only one pass over the input(s), and is consequently more efficient. But it is SED's ability to filter text in a pipeline which particularly distinguishes it from other types of editors.
Invocation
SED may be invoked with the following command-line options:
`-V'
`--version'
Print out the version of SED that is being run and a copyright notice, then exit.
`-h'
`--help'
Print a usage message briefly summarizing these command-line options and the bug-reporting address, then exit.
`-n'
`--quiet'
`--silent'
By default, SED will print out the pattern space at the end of each cycle through the script. These options disable this automatic printing, and SED will only produce output when explicitly told to via the p command.
`-e script'
`--expression=script'
Add the commands in script to the set of commands to be run while processing the input.
`-f script-file'
`--file=script-file'
Add the commands contained in the file script-file to the set of commands to be run while processing the input.
If no -e, -f, --expression, or --file options are given on the command-line, then the first non-option argument on the command line is taken to be the script to be executed.
If any command-line parameters remain after processing the above, these parameters are interpreted as the names of input files to be processed. A file name of - refers to the standard input stream. The standard input will be processed if no file names are specified.
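As a rough illustration of the options above, the following shell sketch runs a few invocations on a three-line sample. The sample text, the variable names, and the temporary script file are assumptions made for this example; they do not come from the manual.

```shell
#!/bin/sh
# Sample input: three lines fed to sed through a pipeline.
input='alpha
beta
gamma'

# Default: the pattern space is printed after every cycle, so an empty
# script simply copies the input.
copy=$(printf '%s\n' "$input" | sed '')

# -n suppresses auto-print; only lines explicitly printed with p appear.
only_beta=$(printf '%s\n' "$input" | sed -n '/beta/p')

# Two -e options are combined, in order, into one script.
two_edits=$(printf '%s\n' "$input" | sed -e 's/alpha/ALPHA/' -e 's/gamma/GAMMA/')

# -f reads the script from a file; the file name - means standard input.
script_file=$(mktemp)
printf '%s\n' 's/a/A/' > "$script_file"
from_file=$(printf '%s\n' "$input" | sed -f "$script_file" -)
rm -f "$script_file"

printf '%s\n' "$only_beta" "$two_edits" "$from_file"
```

Note that s/a/A/ without the g flag changes only the first "a" on each line, which is why the last command yields Alpha, betA, and gAmma.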
SED Programs
A SED program consists of one or more SED commands, passed in by one or more of the -e, -f, --expression, and --file options, or the first non-option argument if zero of these options are used. This document will refer to "the" SED script; this will be understood to mean the in-order catenation of all of the scripts and script-files passed in.
Each SED command consists of an optional address or address range, followed by a one-character command name and any additional command-specific code.
Selecting lines with SED
Addresses in a SED script can be in any of the following forms:
`number'
Specifying a line number will match only that line in the input. (Note that SED counts lines continuously across all input files.)
`first~step'
This GNU extension matches every stepth line starting with line first. In particular, lines will be selected when there exists a non-negative n such that the current line-number equals first + (n * step). Thus, to select the odd-numbered lines, one would use 1~2; to pick every third line starting with the second, 2~3 would be used; to pick every fifth line starting with the tenth, use 10~5; and 50~0 is just an obscure way of saying 50.
`$'
This address matches the last line of the last file of input.
`/regexp/'
This will select any line which matches the regular expression regexp. If regexp itself includes any / characters, each must be escaped by a backslash (\).
`\%regexp%'
(The % may be replaced by any other single character.) This also matches the regular expression regexp, but allows one to use a different delimiter than /. This is particularly useful if the regexp itself contains a lot of /s, since it avoids the tedious escaping of every /. If regexp itself includes any delimiter characters, each must be escaped by a backslash (\).
`/regexp/I'
`\%regexp%I'
The I modifier to regular-expression matching is a GNU extension which causes the regexp to be matched in a case-insensitive manner.
If no addresses are given, then all lines are matched; if one address is given, then only lines matching that address are matched.
An address range can be specified by specifying two addresses separated by a comma (,). An address range matches lines starting from where the first address matches, and continues until the second address matches (inclusively). If the second address is a regexp, then checking for the ending match will start with the line following the line which matched the first address. If the second address is a number less than (or equal to) the line matching the first address, then only the one line is matched.
Appending the ! character to the end of an address specification will negate the sense of the match. That is, if the ! character follows an address range, then only lines which do not match the address range will be selected. This also works for singleton addresses, and, perhaps perversely, for the null address.
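The address forms described above might be exercised as in the following sketch. The helper function numbers and the variable names are made up for this illustration, and the last command relies on the GNU first~step extension, so it assumes GNU sed.

```shell
#!/bin/sh
# Hypothetical helper: print the numbers 1..10, one per line.
numbers() { i=1; while [ "$i" -le 10 ]; do echo "$i"; i=$((i + 1)); done; }

third=$(numbers | sed -n '3p')               # numeric address: line 3 only
last=$(numbers | sed -n '$p')                # $ matches the last line of input
range=$(numbers | sed -n '4,6p')             # range: lines 4 through 6, inclusive
regex_range=$(numbers | sed -n '/^8$/,$p')   # from the line matching ^8$ to the end
negated=$(numbers | sed -n '2,9!p')          # ! selects lines NOT in 2..9
odd=$(numbers | sed -n '1~2p')               # first~step (GNU): lines 1, 3, 5, ...

printf '%s\n' "$third" "$last"
```

Each command uses -n together with p so that only the selected lines are printed.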
Overview of regular expression syntax
[[I may add a brief overview of regular expressions at a later date; for now see any of the various other documentations for regular expressions, such as the AWK info page.]]
Where SED buffers data
SED maintains two data buffers: the active pattern space, and the auxiliary hold space. In "normal" operation, SED reads in one line from the input stream and places it in the pattern space. This pattern space is where text manipulations occur. The hold space is initially empty, but there are commands for moving data between the pattern and hold spaces.
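To make the two buffers more concrete, here is a small sketch that uses the h, g, and G commands (documented later in this manual) to move data between the pattern space and the hold space. The sample input and variable names are assumptions for illustration only.

```shell
#!/bin/sh
# G appends a newline plus the hold space to the pattern space. The hold
# space starts out empty, so this double-spaces the input.
double=$(printf 'a\nb\n' | sed 'G')

# Reverse the order of the lines: on every line but the first, append the
# lines saved so far (G), then save the result (h); print on the last line.
reversed=$(printf '1\n2\n3\n' | sed -n '1!G;h;$p')

# Print the line before a match: h remembers each line, and g brings the
# remembered line back when /gamma/ matches.
before=$(printf 'alpha\nbeta\ngamma\n' | sed -n '/gamma/{g;p;};h')

printf '%s\n' "$reversed" "$before"
```

In the second command the hold space accumulates the whole input in reverse, so this approach keeps every line in memory at once.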
Often used commands
If you use SED at all, you will quite likely want to know these commands.
`#'
[No addresses allowed.] The # "command" begins a comment; the comment continues until the next newline. If you are concerned about portability, be aware that some implementations of SED (which are not POSIX.2 conformant) may only support a single one-line comment, and then only when the very first character of the script is a #. Warning: if the first two characters of the SED script are #n, then the -n (no-autoprint) option is forced. If you want to put a comment in the first line of your script and that comment begins with the letter `n' and you do not want this behavior, then be sure to either use a capital `N', or place at least one space before the `n'.
`s/regexp/replacement/flags'
(The / characters may be uniformly replaced by any other single character within any given s command.) The / character (or whatever other character is used in its stead) can appear in the regexp or replacement only if it is preceded by a \ character. Also newlines may appear in the regexp using the two character sequence \n. The s command attempts to match the pattern space against the supplied regexp. If the match is successful, then that portion of the pattern space which was matched is replaced with replacement. The replacement can contain \n (n being a number from 1 to 9, inclusive) references, which refer to the portion of the match which is contained between the nth \( and its matching \). Also, the replacement can contain unescaped & characters which will reference the whole matched portion of the pattern space. To include a literal \, &, or newline in the final replacement, be sure to precede the desired \, &, or newline in the replacement with a \. The s command can be followed with zero or more of the following flags:
`g'
Apply the replacement to all matches to the regexp, not just the first.
`p'
If the substitution was made, then print the new pattern space.
`number'
Only replace the numberth match of the regexp.
`w file-name'
If the substitution was made, then write out the result to the named file.
`I'
(This is a GNU extension.) Match regexp in a case-insensitive manner.
`q'
[At most one address allowed.] Exit SED without processing any more commands or input. Note that the current pattern space is printed if auto-print is not disabled.
`d'
Delete the pattern space; immediately start next cycle.
`p'
Print out the pattern space (to the standard output). This command is usually only used in conjunction with the -n command-line option. Note: some implementations of SED, such as this one, will double-print lines when auto-print is not disabled and the p command is given. Other implementations will only print the line once. Both ways conform with the POSIX.2 standard, and so neither way can be considered to be in error. Portable SED scripts should thus avoid relying on either behavior; either use the -n option and explicitly print what you want, or avoid use of the p command (and also the p flag to the s command).
`n'
If auto-print is not disabled, print the pattern space, then, regardless, replace the pattern space with the next line of input. If there is no more input then SED exits without processing any more commands.
`{ commands }'
A group of commands may be enclosed between { and } characters. (The } must appear in a zero-address command context.) This is particularly useful when you want a group of commands to be triggered by a single address (or address-range) match.
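The commands in this section could be tried out as in the sketch below. All sample strings and variable names are invented for the example.

```shell
#!/bin/sh
# Back-references: \1 and \2 refer to the first and second \( ... \) groups.
swapped=$(echo 'john smith' | sed 's/\([a-z]*\) \([a-z]*\)/\2, \1/')

# An unescaped & stands for the whole matched text.
marked=$(echo 'error 42' | sed 's/[0-9][0-9]*/<&>/')

# g replaces every match; a number replaces only that occurrence.
all=$(echo 'a-b-c' | sed 's/-/+/g')
second=$(echo 'a-b-c' | sed 's/-/+/2')

# Another delimiter avoids escaping the slashes in path names.
path=$(echo '/usr/local/bin' | sed 's|/usr/local|/opt|')

# With -n, the p flag prints only lines where a substitution was made.
value=$(printf 'x=1\ny=2\n' | sed -n 's/^y=//p')

# d deletes matching lines; q quits after printing the addressed line.
no_comments=$(printf '# note\nkeep\nmore\n' | sed '/^#/d')
first_two=$(printf 'one\ntwo\nthree\n' | sed '2q')

# { } applies several commands to a single address range.
quoted=$(printf 'a\nb\nc\n' | sed '2,3{
s/^/> /
s/$/ </
}')

printf '%s\n' "$swapped" "$marked" "$all" "$second" "$path" "$value"
```

The grouped commands are written one per line, which avoids relying on how a particular implementation parses a closing brace after a semicolon.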
Less frequently used commands
Though perhaps less frequently used than those in the previous section, some very small yet useful SED scripts can be built with these commands.
`y/source-chars/dest-chars/'
(The / characters may be uniformly replaced by any other single character within any given y command.) Transliterate any characters in the pattern space which match any of the source-chars with the corresponding character in dest-chars. Instances of the / (or whatever other character is used in its stead), \, or newlines can appear in the source-chars or dest-chars lists, provided that each instance is escaped by a \. The source-chars and dest-chars lists must contain the same number of characters (after de-escaping).
`a\'
`text'
[At most one address allowed.] Queue the lines of text which follow this command (each but the last ending with a \, which will be removed from the output) to be output at the end of the current cycle, or when the next input line is read.
`i\'
`text'
[At most one address allowed.] Immediately output the lines of text which follow this command (each but the last ending with a \, which will be removed from the output).
`c\'
`text'
Delete the lines matching the address or address-range, and output the lines of text which follow this command (each but the last ending with a \, which will be removed from the output) in place of the last line (or in place of each line, if no addresses were specified). A new cycle is started after this command is done, since the pattern space will have been deleted.
`='
[At most one address allowed.] Print out the current input line number (with a trailing newline).
`l'
Print the pattern space in an unambiguous form: non-printable characters (and the \ character) are printed in C-style escaped form; long lines are split, with a trailing \ character to indicate the split; the end of each line is marked with a $.
`r filename'
[At most one address allowed.] Queue the contents of filename to be read and inserted into the output stream at the end of the current cycle, or when the next input line is read. Note that if filename cannot be read, it is treated as if it were an empty file, without any error indication.
`w filename'
Write the pattern space to filename. The filename will be created (or truncated) before the first input line is read; all w commands (including instances of w flag on successful s commands) which refer to the same filename are output through the same FILE stream.
`D'
Delete text in the pattern space up to the first newline. If any text is left, restart cycle with the resultant pattern space (without reading a new line of input), otherwise start a normal new cycle.
`N'
Add a newline to the pattern space, then append the next line of input to the pattern space. If there is no more input then SED exits without processing any more commands.
`P'
Print out the portion of the pattern space up to the first newline.
`h'
Replace the contents of the hold space with the contents of the pattern space.
`H'
Append a newline to the contents of the hold space, and then append the contents of the pattern space to that of the hold space.
`g'
Replace the contents of the pattern space with the contents of the hold space.
`G'
Append a newline to the contents of the pattern space, and then append the contents of the hold space to that of the pattern space.
`x'
Exchange the contents of the hold and pattern spaces.
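A few of the commands in this section are sketched below on tiny inputs. The inputs and variable names are assumptions for illustration; the last command is a commonly seen idiom for dropping adjacent duplicate lines, not something taken from this manual.

```shell
#!/bin/sh
# y maps each character in the first list to the character at the same
# position in the second list.
upper=$(echo 'hello' | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')

# N appends the next input line, so each cycle sees a pair of lines joined
# by a newline, which s can then replace.
pairs=$(printf '1\n2\n3\n4\n' | sed 'N;s/\n/ /')

# a\ queues text for output after the addressed line; i\ outputs it before.
appended=$(printf 'alpha\nbeta\n' | sed '/alpha/a\
after alpha')
inserted=$(printf 'alpha\nbeta\n' | sed '/beta/i\
before beta')

# c\ replaces the addressed line with the given text.
changed=$(printf 'alpha\nbeta\n' | sed '2c\
replaced')

# = prints the current line number; on the last line that is the line count.
count=$(printf 'a\nb\nc\n' | sed -n '$=')

# N, P and D keep a two-line window: a line is printed only when the line
# after it differs, which drops adjacent duplicates.
unique=$(printf 'a\na\nb\n' | sed '$!N;/^\(.*\)\n\1$/!P;D')

printf '%s\n' "$upper" "$pairs" "$count" "$unique"
```

The pairing example uses an even number of input lines on purpose, because implementations differ in what N does when no next line exists.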
Commands for die-hard SED programmers
In most cases, use of these commands indicates that you are probably better off programming in something like PERL. But occasionally one is committed to sticking with SED, and these commands can enable one to write quite convoluted scripts.
`: label'
[No addresses allowed.] Specify the location of label for the b and t commands. In all other respects, a no-op.
`b label'
Unconditionally branch to label. The label may be omitted, in which case the next cycle is started.
`t label'
Branch to label only if there has been a successful substitution since the last input line was read or t branch was taken. The label may be omitted, in which case the next cycle is started.
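The following sketch shows one way labels and branches can form loops. The inputs and variable names are assumptions for this example, and each command is passed in its own -e option so that a label name is never followed by other text.

```shell
#!/bin/sh
# Loop with an unconditional branch: keep appending lines with N until the
# last line is reached, then replace every embedded newline.
joined=$(printf '1\n2\n3\n' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g')

# Loop with a conditional branch: t jumps back to :a for as long as the
# substitution keeps succeeding, inserting one comma per pass.
grouped=$(echo '1234567' | sed -e ':a' -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e 'ta')

# b without a label skips the rest of the script, so comment lines are
# printed unchanged while other lines are edited.
skipped=$(printf '# foo\nfoo\n' | sed -e '/^#/b' -e 's/foo/bar/')

printf '%s\n' "$joined" "$grouped" "$skipped"
```

In the second command the first pass produces 1234,567 and the second produces 1,234,567; the third attempt finds no match, so t falls through and the line is printed.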
Some sample scripts
[[Not this release, sorry. But check out the scripts in the testsuite directory, and the amazing dc.sed script in the top-level directory of this distribution.]]
About the (non-)limitations on line length
For those who want to write portable SED scripts, be aware that some implementations have been known to limit line lengths (for the pattern and hold spaces) to be no more than 4000 bytes. The POSIX.2 standard specifies that conforming SED implementations shall support at least 8192 byte line lengths. GNU SED has no built-in limit on line length; as long as SED can malloc() more (virtual) memory, it will allow lines as long as you care to feed it (or construct within it).
Other resources for learning about SED
In addition to several books that have been written about SED (either specifically or as chapters in books which discuss shell programming), one can find out more about SED (including suggestions of a few books) from the FAQ for the seders mailing list, available from any of: http://www.dbnet.ece.ntua.gr/~george/sed/sedfaq.html
http://www.ptug.org/sed/sedfaq.htm
http://www.wollery.demon.co.uk/sedtut10.txt
There is an informal "seders" mailing list manually maintained by Al Aab. To subscribe, send e-mail to af137@torfree.net with a brief description of your interest.
Reporting bugs
Email bug reports to bug-gnu-utils@gnu.org. Be sure to include the word "sed" somewhere in the "Subject:" field.
Concept Index
This is a general index of all issues discussed in this manual, with the exception of the SED commands and command-line options.
a
Adding a block of text after a line
Address, as a regular expression
Address, last line
Address, numeric
Addresses, in SED scripts
Additional reading about SED
Append hold space to pattern space
Append next input line to pattern space
Append pattern space to hold space
b
Backreferences, in regular expressions
Branch to a label, if s/// succeeded
Branch to a label, unconditionally
Buffer spaces, pattern and hold
Bugs, reporting
c
Case-insensitive matching
Caveat -- #n on first line
Caveat -- p command and -n flag
Command groups
Comments, in scripts
Conditional branch
Copy hold space into pattern space
Copy pattern space into hold space
d
Delete first line from pattern space
Deleting lines
e
Exchange hold space with pattern space
Excluding lines
f
Files to be processed as input
Flow of control in scripts
g
Global substitution
GNU extensions, I modifier
GNU extensions, n~m addresses
GNU extensions, unlimited line length
Goto, in scripts
Grouping commands
h
Hold space, appending from pattern space
Hold space, appending to pattern space
Hold space, copy into pattern space
Hold space, copying pattern space into
Hold space, definition
Hold space, exchange with pattern space
i
Insert text from a file
Inserting a block of text before a line
l
Labels, in scripts
Last line, selecting
Line number, print
Line selection
Line, selecting by number
Line, selecting by regular expression match
Line, selecting last
List pattern space
n
Next input line, append to pattern space
Next input line, replace pattern space with
p
Parenthesized substrings
Pattern space, definition
Portability, comments
Portability, line length limitations
Portability, p command and -n flag
Print first line from pattern space
Print line number
Print selected lines
Print unambiguous representation of pattern space
Printing text after substitution
q
Quitting
r
Range of lines
Read next input line
Read text from a file
Replace hold space with copy of pattern space
Replace pattern space with copy of hold space
Replace specific input lines
Replacing all text matching regexp in a line
Replacing only nth match of regexp in a line
Replacing text matching regexp
Replacing text matching regexp, options
s
Script structure
Script, from a file
Script, from command line
SED program structure
Selected lines, replacing
Selecting lines to process
Selecting non-matching lines
Several lines, selecting
Slash character, in regular expressions
Spaces, pattern and hold
Standard input, processing as input
Stream editor
Substitution of text
Substitution of text, options
t
Text, appending
Text, insertion
Transliteration
u
Usage summary, printing
v
Version, printing
w
Write result of a substitution to file
Write to a file
Command and Option Index
This is an alphabetical list of all SED commands and command-line options.
#
# (comment) command
-
-n, forcing from within a script
:
: (label) command
=
= (print line number) command
a
a (append text lines) command
b
b (branch) command
c
c (change to text lines) command
d
D (delete first line) command
d (delete) command
g
G (appending Get) command
g (get) command
h
H (append Hold) command
h (hold) command
i
i (insert text lines) command
l
l (list unambiguously) command
n
N (append Next line) command
n (next-line) command
p
P (print first line) command
p (print) command
q
q (quit) command
r
r (read file) command
s
s (substitute) command
s command, option flags
t
t (conditional branch) command
w
w (write file) command
x
x (eXchange) command
y
y (transliterate) command
{
{} command grouping
This document was generated on 28 October 1999 using the texi2html translator version 1.54.
The Tao of Regular Expressions
http://matrix.foresee.cn/blogs/neo/books/tao_regexps_zh.html