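Project Report Details
Control number: 20150000000414
Title: FY2014 (Heisei 26) Interim Annual Report, "Scientific and Industrial Technology Landscape System Development Project"
Publication date: 2015/7/17
Report fiscal year: 2014 - 2014
Contractor: The University of Tokyo (National University Corporation)
Project number: P13014
Department: Innovation Promotion Department
Japanese abstract:
English abstract: Developing Scientific and Technological Landscape System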

In this project, we aim to develop a system, called the "Scientific and Technological Landscape System", that automatically detects (1) technologies expected to grow rapidly in the near future (emerging research), (2) technologies complementary to the emerging research (related research), and (3) leading researchers and their groups in the emerging research.

To this end, we analyze a large amount of science and technology information, including academic papers and industrial patents, using advanced information technologies such as machine learning, natural language processing, and network analysis. Our system contributes to the planning and execution of research and development strategies in both the public and private sectors.

1. Detecting emerging research
We first collected data on a large number of academic papers from several research areas. We constructed a citation network of the papers and applied clustering to the network to identify the research fields within a research area as clusters. To find the optimal granularity of a cluster, we introduced two clustering criteria: modularity and resolution limit. Based on the clustering results, we visualized the time-series variation of a research area as the dynamics of its clusters. We then conducted a preliminary experiment on detecting emerging research. We assumed that emerging research grows from highly cited emerging papers and tried to predict such papers using topological and textual features extracted from the citation network. Our results show that we can predict the emerging papers with an F-value of 0.7-0.8.
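
As an illustration of the modularity criterion mentioned above, the following is a minimal sketch and not the project's implementation. It assumes the citation network is treated as an undirected, unweighted graph and uses the standard Newman modularity; the function name, the toy papers "a" to "f", and the cluster labels are hypothetical. It does not address the resolution limit, which the report treats as a separate criterion.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs; citation direction is ignored here.
    community_of: dict mapping each node to a cluster label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # edges with both ends in the same cluster
    degree_sum = defaultdict(int)  # total degree of the nodes in each cluster
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over clusters of (fraction of internal edges)
    #     minus (expected fraction under random rewiring).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Toy citation network (hypothetical): two triangles joined by one citation.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
two_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_cluster = {node: 0 for node in two_clusters}

q_split = modularity(toy_edges, two_clusters)   # partition into two fields
q_merged = modularity(toy_edges, one_cluster)   # everything in one field
```

In this toy case the two-cluster partition scores higher than the single cluster, which is the sense in which modularity can guide the choice of cluster granularity.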

2. Detecting related research
We conducted a preliminary analysis of three data expansion approaches for identifying related research: query expansion, network-based data expansion, and text-based data expansion. For query expansion and text-based expansion, we introduced a neural network language model, which makes it possible to compute text similarity over a large volume of texts in a latent semantic space and thereby to expand the set of papers. We also tried network-based data expansion, which basically uses the citation relations of a paper. By examining the expanded data, we classified and annotated the types of research potentially related to the emerging research.
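
The text-based expansion described above can be sketched as follows, under stated assumptions: each paper already has a vector in some latent semantic space (here hand-written toy vectors rather than the output of a neural network language model), similarity is measured by cosine similarity, and a paper is added when it is close enough to any seed paper. The function names, paper ids, and the threshold of 0.8 are hypothetical and not taken from the report.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def expand_by_text(seed_ids, vectors, threshold=0.8):
    """Return ids of non-seed papers similar to at least one seed paper."""
    added = []
    for pid, vec in vectors.items():
        if pid in seed_ids:
            continue
        best = max(cosine(vec, vectors[s]) for s in seed_ids)
        if best >= threshold:
            added.append(pid)
    return sorted(added)

# Toy latent vectors (hypothetical); real ones would come from a trained model.
vectors = {
    "p1": [1.0, 0.0, 0.0],
    "p2": [0.9, 0.1, 0.0],
    "p3": [0.0, 1.0, 0.0],
    "p4": [0.7, 0.7, 0.0],
}
expanded = expand_by_text({"p1"}, vectors, threshold=0.8)
expanded_loose = expand_by_text({"p1"}, vectors, threshold=0.7)
```

Lowering the threshold widens the expansion, so in practice the cutoff trades recall of related research against noise in the expanded data.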

3. Detecting leading researchers and groups
We obtained metadata such as author, affiliation, and country from the papers in an emerging research field and constructed a co-authorship network. We calculated several network centralities for each researcher on the co-authorship network to find features that can be used to identify leading researchers and their groups. In the course of the analysis, we found that handling different people who share the same name is one of the crucial issues in processing researcher information. We conducted a preliminary experiment on disambiguating such same names using external text and ID information about a researcher.
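
A minimal sketch of the co-authorship analysis above, assuming degree centrality as one of the "several network centralities" (the report does not say which ones were used). The author names and papers are invented for illustration, and authors are identified by their name strings, which is exactly the simplification that the same-name problem makes unsafe on real data.

```python
from collections import defaultdict
from itertools import combinations

def coauthor_graph(papers):
    """Build an undirected co-authorship graph from per-paper author lists."""
    neighbors = defaultdict(set)
    for authors in papers:
        unique = set(authors)
        for a in unique:
            neighbors[a]  # touch the key so single-author papers add a node
        for a, b in combinations(sorted(unique), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors

def degree_centrality(neighbors):
    """Fraction of the other researchers each researcher has co-authored with."""
    n = len(neighbors)
    if n < 2:
        return {a: 0.0 for a in neighbors}
    return {a: len(nbrs) / (n - 1) for a, nbrs in neighbors.items()}

# Hypothetical author lists, one list per paper.
papers = [
    ["Sato", "Suzuki", "Tanaka"],
    ["Sato", "Watanabe"],
    ["Sato", "Ito"],
    ["Tanaka", "Suzuki"],
]
centrality = degree_centrality(coauthor_graph(papers))
leader = max(centrality, key=centrality.get)  # most connected researcher
```

Other centralities (for example betweenness) could be substituted for `degree_centrality` without changing how the graph is built.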

4. Developing a graphical user interface
We are developing and providing a graphical user interface for the Academic Landscape System. Companies collaborating on the research use it and give us feedback.
Download: Please download the report from the Project Report Database (user registration required).
