博士論文 - 学術データベース

髙畠 嘉将 - 九州工業大学 博士(情報工学) - オンライン文法圧縮に基づく自己索引とそのアプリケーション

  1. 九州工業大学 (1204)
  2. 博士(情報工学) (397)
  3. 髙畠 嘉将

この論文にアクセスする

書誌事項

タイトル

Online Grammar-Based Self-Index and Its Applications

タイトル別名

オンライン文法圧縮に基づく自己索引とそのアプリケーション

著者名

髙畠 嘉将

学位授与大学

九州工業大学

取得学位

博士(情報工学)

学位授与番号

甲情工第322号

学位授与年月日

2017-03-24

注記・抄録

Text collections including many repetitions of substrings, called highly repetitive text collections, have been increasing in various fields like the version control system and the genome database. For the efficient use of such texts collections, the importance of the compression algorithm and compressed indexes is rapidly increasing more and more. The grammar-based self-indexes are suitable for the problem of information retrieval on a compressed text because they support the random access on the compressed text. However, the constructing algorithms of existing grammar-based self-indexes are offline, that is, these algorithms require the whole input beforehand. Therefore, the memory consumption depends on the size of input text explicitly. In order to overcome this difficulty, we propose the first online algorithm for grammar-based self-index, called Online Edit-Sensitive Parsing index (OESP-index, for short). The proposed algorithm directly transforms the input text into the corresponding variable-length encoded string reading the input symbol one-by-one. Compared to the existing self-indexes, the memory consumption of our online algorithm depends on the size of output, that is, the size of compressed text. Additionally, we also present three applications based on the grammar-based compression: (i) the online pattern matching problem for string edit-distance with moves called online ESP (OESP); (ii) the string index for edit-distance with moves called siEDM; (iii) the online grammar compression for frequent pattern discovery in smaller space. For these applications, we demonstrate the performance of our algorithm for large-scale data.

九州工業大学博士学位論文 学位記番号:情工博甲第322号 学位授与年月日:平成29年3月24日

1 Introduction|2 Preliminaries|3 Structure of the OESP-index|4 Online pattern matching for string edit distance with moves|5 siEDM: an efficient string index and search algorithm for edit distance with moves|6 Online grammar compression for frequent pattern discovery|7 Conclusions and future work

平成28年度

九州工業大学博士学位論文(要旨)学位記番号:情工博甲第322号 学位授与年月日:平成29年3月24日

目次

  1. 2017-10-02 再収集 / (index.pdf)

各種コード

NII論文ID(NAID)
500001036571
NII著者ID(NRID)
  • 8000001129903
本文言語コード
  • eng
大学ID

0071

CAT機関ID

KI000844

データ提供元
  • 機関リポジトリ
  • NDLデジタルコレクション

キーワード

九州工業大学 博士(情報工学) - 博士論文

博士論文をもっと見る

大学

大学をもっと見る

学位

学位をもっと見る