Importing an RSS feed from MSN Spaces to Movable Type

Yueyue is migrating her blog from MSN Spaces to Movable Type, and it is proving really difficult. MSN Spaces limits the number of posts in its RSS feed, and Yueyue didn't find any option to lift this limit, so it is hard to get the entire feed for importing. Here's a discussion (in Chinese) of this issue that gives a workaround: save the RSS feed, remove the posts listed in that feed from the space, fetch the feed again and append it to the previous one, then remove those posts as well, and so on until you have the entire feed. It works, but it really takes time if you have a long list of posts. Anyhow, you will have the feed in the end! Now on to Movable Type.
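
The fetch, delete, and fetch-again procedure leaves you with several partial feed files that still have to be joined. As a rough sketch of that joining step (the original was done by hand, so this is only an illustration), the Python below merges saved feed snapshots and drops duplicate posts. The name `merge_feed_snapshots` and the choice of `<guid>` as the duplicate key are assumptions, not something the discussion prescribes.

```python
import xml.etree.ElementTree as ET


def merge_feed_snapshots(snapshots):
    """Merge several RSS 2.0 documents (strings) into one list of <item> elements.

    Assumption: a post is identified by its <guid>; if that is missing,
    the pair (title, pubDate) is used instead.
    """
    seen = set()
    merged = []
    for xml_text in snapshots:
        root = ET.fromstring(xml_text)
        for item in root.iter("item"):
            key = item.findtext("guid") or (
                item.findtext("title", ""),
                item.findtext("pubDate", ""),
            )
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged
```

Pass the snapshots in the order you saved them, so the newest posts come first, as they do in a single feed.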

MT has a built-in import/export feature, but it only works with its own format; RSS x.0 is not supported yet, so we cannot import the MSN RSS feed directly. My solution is a macro file for UltraEdit that reads an MSN RSS feed file and converts it to the MT import/export format. After the conversion, you can save the output as a new file to use for importing.

AUTHOR, ALLOW COMMENTS, CONVERT BREAKS and ALLOW PINGS are set to default values; you can change them before importing. DATE might not match your needs if MSN Spaces and Movable Type have different timezone settings. Another point: in this case both Movable Type and MSN Spaces use the same charset setting (UTF-8 here). If the settings differ, you must do an additional charset conversion on the feed file. This can be done within UltraEdit. Unfortunately, some conversions are only available from the UltraEdit menu and not in macros, so we have to do them manually.
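
If the timezone settings do differ, the DATE values have to be shifted by the difference. Here is a minimal Python sketch of that adjustment, assuming the feed's pubDate uses the RFC 822 form common in RSS 2.0 (for example `Thu, 23 Nov 2006 03:42:07 GMT`) and that a fixed offset in hours is good enough. `pubdate_to_mt` and `offset_hours` are hypothetical names.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def pubdate_to_mt(pub_date, offset_hours=0):
    """Convert an RFC 822 pubDate to MM/DD/YYYY hh:mm:ss.

    offset_hours is the (assumed fixed) difference between the feed's
    timezone and the one the target blog uses.
    """
    moment = parsedate_to_datetime(pub_date) + timedelta(hours=offset_hours)
    return moment.strftime("%m/%d/%Y %H:%M:%S")
```

For a blog kept in UTC+8, calling `pubdate_to_mt(value, 8)` moves every timestamp eight hours forward, rolling over to the next day where needed.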

Output example: 

AUTHOR: #authorname          
TITLE: importing rss feed from msn spaces to movable type
ALLOW COMMENTS: 1            
CONVERT BREAKS: 1            
ALLOW PINGS: 1               
PRIMARY CATEGORY: essay      
DATE: 11/23/2006 03:42:07    
-----                        
BODY:                        
<p>Yueyue is migrating her blog from ...
-----                        
--------                     
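
For readers who want to produce the same record layout outside UltraEdit, here is a minimal Python sketch that builds one entry in the layout of the example. The function `format_mt_entry` and its parameters are hypothetical; the field names, default values, and separator lines are copied from the example output.

```python
def format_mt_entry(title, body, category, date, author="#authorname"):
    """Return one entry as text in the layout of the output example."""
    lines = [
        f"AUTHOR: {author}",
        f"TITLE: {title}",
        "ALLOW COMMENTS: 1",
        "CONVERT BREAKS: 1",
        "ALLOW PINGS: 1",
        f"PRIMARY CATEGORY: {category}",
        f"DATE: {date}",
        "-----",       # closes the metadata section
        "BODY:",
        body,
        "-----",       # closes the body section
        "--------",    # closes the entry
    ]
    return "\n".join(lines) + "\n"
```

Concatenating the result for every post gives a file with one record after another, as the macro output does.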

UltraEdit is required to run this macro file. With small changes, the macro could be used to migrate other RSS 2.0 feeds to Movable Type.

msn_feed2mt.mac

InsertMode
ColumnModeOff
HexOff
UnixReOff
Top
Find RegExp "<generator>Microsoft Spaces v[0-9.]+</generator>"
IfNotFound
ExitMacro
EndIf
Top
Find RegExp "^n"
Replace All ""
Find RegExp "<^?xml*</cf:listinfo>"
Replace All ""
Find RegExp "<item><title>Photo Album:*</item>"
Replace All ""
Find RegExp "<item><title>Custom List:*</item>"
Replace All ""
Find RegExp "<item><title>Music List:*</item>"
Replace All ""
Find RegExp "<item><title>Blog List:*</item>"
Replace All ""
Find RegExp "</channel></rss>"
Replace All ""
Find RegExp "<link>*</link>"
Replace All ""
Find RegExp "<guid*</guid>"
Replace All ""
Find RegExp "<comments>*</comments>"
Replace All ""
Find RegExp "<slash:comments>*</slash:comments>"
Replace All ""
Find RegExp "<msn:type>*</msn:type>"
Replace All ""
Find RegExp "<live:type>*</live:type>"
Replace All ""
Find RegExp "<live:typelabel>*</live:typelabel>"
Replace All ""
Find RegExp "<dcterms:modified>*</dcterms:modified>"
Replace All ""
Find "<item>"
Replace All ""
Find "</item>"
Replace All ""
Find "&lt;"
Replace All "<"
Find "&gt;"
Replace All ">"
Find RegExp "<title>^(*^)</title><description>^(*^)</description><category>^(*^)</category><pubDate>^(*^)</pubDate>"
Replace All "^nAUTHOR: #authorname^nTITLE: ^1^nALLOW COMMENTS: 1^nCONVERT BREAKS: 1^nALLOW PINGS: 1^nPRIMARY CATEGORY: ^3^nDATE: ^4^n-----^nBODY: ^n^2^n-----^n--------"
Find RegExp "%DATE: *, ^([0-9]+^) ^([a-zA-Z]+^) ^([0-9]+^) ^(*^) GMT$"
Replace All "DATE: ^2/^1/^3 ^4"
Find RegExp "%DATE: Jan*/"
Replace All "DATE: 01/"
Find RegExp "%DATE: Feb*/"
Replace All "DATE: 02/"
Find RegExp "%DATE: Mar*/"
Replace All "DATE: 03/"
Find RegExp "%DATE: Apr*/"
Replace All "DATE: 04/"
Find RegExp "%DATE: May/"
Replace All "DATE: 05/"
Find RegExp "%DATE: Jun*/"
Replace All "DATE: 06/"
Find RegExp "%DATE: Jul*/"
Replace All "DATE: 07/"
Find RegExp "%DATE: Aug*/"
Replace All "DATE: 08/"
Find RegExp "%DATE: Sep*/"
Replace All "DATE: 09/"
Find RegExp "%DATE: Oct*/"
Replace All "DATE: 10/"
Find RegExp "%DATE: Nov*/"
Replace All "DATE: 11/"
Find RegExp "%DATE: Dec*/"
Replace All "DATE: 12/" 
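
The macro treats the feed as raw text. As an alternative sketch (an assumption for illustration, not the method used here), the same four fields can be pulled out with an XML parser in Python, which also undoes the `&lt;` and `&gt;` escaping on its own. `extract_posts` and `SKIPPED_PREFIXES` are hypothetical names; the prefixes are the ones the macro deletes.

```python
import xml.etree.ElementTree as ET

# Title prefixes of the non-blog items that the macro removes from the feed.
SKIPPED_PREFIXES = ("Photo Album:", "Custom List:", "Music List:", "Blog List:")


def extract_posts(xml_text):
    """Return a list of dicts holding the fields the macro keeps for each post."""
    posts = []
    for item in ET.fromstring(xml_text).iter("item"):
        title = item.findtext("title", "")
        if title.startswith(SKIPPED_PREFIXES):
            continue  # photo albums and lists are not blog posts
        posts.append({
            "title": title,
            "body": item.findtext("description", ""),
            "category": item.findtext("category", ""),
            "pub_date": item.findtext("pubDate", ""),
        })
    return posts
```

Elements the macro strips out (link, guid, comments and so on) are simply never read here, so no separate deletion step is needed.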

To get the RSS feed on MSN Spaces, you need to turn the option Syndicate this space On in the MSN Spaces settings.

Yueyue has another blog, on blogcn, that needs to be migrated. The RSS feed from blogcn is cleaner than the one from MSN Spaces, but it is in GB2312. I changed the macro file as shown below so that it works for blogcn's RSS feed. The following step must be taken after running the macro: File -> Conversions -> ASCII to UTF-8 (Unicode Editing), then save. Or you can do the charset conversion some other way.

InsertMode
ColumnModeOff
HexOff
DosToUnix
UnixReOff
Top
Find RegExp "<rss version="
IfNotFound
ExitMacro
EndIf
Top
Find RegExp "^p"
Replace All ""
Find RegExp "<^?xml*</dc:language>"
Replace All ""
Find RegExp "</channel></rss>"
Replace All ""
Find RegExp "<link>*</link>"
Replace All ""
Find RegExp "<guid*</guid>"
Replace All ""
Find RegExp "<comments>*</comments>"
Replace All ""
Find "<item>"
Replace All ""
Find "</item>"
Replace All ""
Find RegExp "<author>^(*^)</author>*<title><!^[CDATA^[^(*^)]]></title>*<pubDate>^(*^)</pubDate>*<description>*<!^[CDATA^[^(*^)]]>*</description>"
Replace All "^nAUTHOR: ^1^nTITLE: ^2^nALLOW COMMENTS: 1^nCONVERT BREAKS: 1^nALLOW PINGS: 1^nPRIMARY CATEGORY: #blogcn^nDATE: ^3^n-----^nBODY: ^n^4^n-----^n--------"
Find RegExp "%DATE: ^([0-9]+^)-^([0-9]+^)-^([0-9]+^) ^(*^)$"
Replace All "DATE: ^2/^3/^1 ^4"
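
As one possible "other way" to do the charset conversion, here is a small Python sketch that re-encodes GB2312 text as UTF-8. The function names and the file paths passed to them are hypothetical, and the sketch assumes the file really is valid GB2312; decoding raises an error otherwise.

```python
def gb2312_to_utf8(raw_bytes):
    """Decode GB2312 bytes and return the same text encoded as UTF-8."""
    return raw_bytes.decode("gb2312").encode("utf-8")


def convert_file(src_path, dst_path):
    """Read a GB2312 file and write its content to dst_path as UTF-8."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(gb2312_to_utf8(data))
```

Run `convert_file` on the macro's saved output before importing, in place of the menu step.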

Introduction to RSS

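The previous email introduced blogs; this time I'll introduce a new tool: the online RSS reader.

Why use RSS? Websites publish their content through RSS, and with an RSS reader a subscriber can follow the RSS feeds of many sites of interest at once. With a subscription tool, readers get a single entry point where they can read the day's newest articles in one place, instead of visiting site after site to look for what has been updated since a given date.

Where do you find RSS? Sites that publish RSS usually show an image or text link like the following:
                      Syndicate this site (XML)

and contain links like these:
http://www.domain.com/index.xml
http://www.domain.com/index.rdf
http://www.domain.com/atom.xml
http://feeds.feedburner.com/aiview

You can subscribe to RSS through links like the ones above.

How do you subscribe with bloglines? bloglines is an online RSS subscription tool. There is no software to install; you log in at www.bloglines.com to subscribe to RSS feeds and read them online.
First register a user account. After logging in, click the add button, paste the RSS link you want to subscribe to into the "Blog or feed URL" text box, and click "subscribe".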
