平衡点


2010/11/04

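An SKK dictionary from the Debian JP translation table

The Debian JP Project publishes a translation table (an English-Japanese glossary) in its corner for writers and translators. You can pick skk, tsv, or dict as the format, but hitting the URL for the skk format, for example, returns the dictionary wrapped in HTML, with the site header and footer attached. So here is a quick hack to turn it into something like SKK-JISYO.debianjp.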

#!/usr/bin/env ruby
# -*- mode: ruby; coding: utf-8-unix; indent-tabs-mode: nil -*-
#
# Copyright(C) 2010 Youhei SASAKI All rights reserved.
# $Id: $
# 
# Author: Youhei SASAKI <uwabami@gfd-dennou.org>
# Keywords:
# 
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or (at
# your option) any later version.
# 
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# General Public License for more details.
# 
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
# 
#  Code:
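require 'hpricot'
require 'nkf'
require 'open-uri'
# Fetch the skk-format table; the response is an HTML page, not a bare dictionary.
# (On Ruby 3.0 and later, open-uri requires URI.open instead of Kernel#open.)
doc = Hpricot(open('http://www.debian.or.jp/community/translate/trans_table/table.skk'))
# Keep only the dictionary body inside div#MAIN, dropping the header and footer.
content = doc.search('/html/body/div[@id=MAIN]/div').inner_html
# Emit the result as EUC-JP (nkf -e).
puts NKF.nkf('-e', content)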

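Run the output through skkdic-expr2 to tidy it up and the dictionary is ready. That is the whole story.

More to the point, tasks have suddenly piled up and things are getting rough. I wish there were two of me...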
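To give a rough idea of what "tidying up" means here, the following is a minimal pure-Ruby sketch that merges entries sharing the same reading and sorts the result. It assumes the usual SKK entry layout of `reading /candidate1/candidate2/` with `;`-prefixed comment lines. The helper name `merge_skk_entries` is hypothetical, and this is only an illustration, not what skkdic-expr2 does internally.

```ruby
# Hypothetical helper (not part of skktools): merge SKK-style entries
# of the form "reading /cand1/cand2/" by reading, then sort by reading.
def merge_skk_entries(lines)
  # Each new reading starts with an empty candidate list.
  merged = Hash.new { |hash, key| hash[key] = [] }
  lines.each do |line|
    line = line.chomp
    # Skip blank lines and ";"-prefixed comment lines.
    next if line.empty? || line.start_with?(';')
    # Split into the reading and the "/cand1/cand2/" part.
    reading, rest = line.split(' ', 2)
    next if rest.nil?
    candidates = rest.split('/').reject(&:empty?)
    # Array union keeps first-seen order and drops duplicates.
    merged[reading] |= candidates
  end
  merged.sort.map { |reading, cands| "#{reading} /#{cands.join('/')}/" }
end
```

For instance, `puts merge_skk_entries(File.readlines('table.txt'))` would print a merged list, where `table.txt` is an assumed file holding the script's output. For a real dictionary, skkdic-expr2 remains the right tool.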

