February 2020
Japan Electronic Publishing Association
About Japan Electronic Publishing Association
Since its establishment in 1986, the Japan Electronic Publishing Association (JEPA) has devoted itself for 35 years to facilitating the digitalization of Japanese publications. These efforts have contributed to the standardization and promotion of the CD-ROM, Unicode, character fonts, multimedia, reading devices, the internet, and Japanese typesetting, to name a few. In 2010 the association joined EPUB Enhanced Global Language Support (egls), organized by the Ministry of Internal Affairs and Communications, which proposed specifications for Japanese typesetting to the W3C, including vertical writing mode and ruby. The proposed specifications were implemented in major browsers in 2011 and have helped spread reflowable EPUB.
Publishing industry status
We are pleased to see that a considerable number of Japanese publications have been published digitally, and the ambition we have held since JEPA's establishment, that no work should go “out of print”, is coming close to realization. Nonetheless, Japan differs from the Western world in the following ways:
- 1. Japanese manga is expanding internationally through digital publishing, whereas digital publication of text-oriented works, such as literary works, has been limited.
- 2. The recognition rate of OCR for Japanese documents is still unsatisfactory.
- 3. While the transition to DTP took place in the 1980s in the U.S. and Europe, it occurred in Japan only from the 2000s. This gap of over 20 years has significantly affected the digitalization of the editorial process.
- 4. Unlike the U.S., where the Big 5 dominate the market, Japan has a large number of small and mid-sized publishers.
Digitalization of publications from the past
The currently prevailing reflowable EPUB format does not suit Japanese publications from the past. Their text data has not been kept in an organized manner, and recreating them digitally would cost far more than demand could justify. As a result, a vast number of publications from the heyday of the Japanese publishing industry, between the 1960s and the 2000s, have never been published digitally.
In addition, Japanese publications include many books with complex layouts and a wide range of reference works, which have been difficult to digitalize.
We would like to propose that the digitalization of Japanese books make more use of the image-dominated fixed-layout (FXL) format in addition to the reflowable format. JEPA would like to help spread the method below so that the industry adopts it.
First, scan publications that exist only in print to create digital versions, then make these image-dominated FXL digital publications available at eBook stores. Because image-based content alone does not meet accessibility requirements, we will collaborate with the AI Data Consortium (aidata.or.jp) on OCR research and study to extract text data.
JEPA Proposal
We believe that the primary objective of digital publishing is to make publications widely available for use by a variety of users; formats and media should be of secondary importance. Needless to say, efforts to extend the searchability and usability of digital publications are also essential. Therefore, while we continue to make it easier for everyone to find information and to make digital publications available to all kinds of readers, we will strive to promote the following four items, in addition to the reflowable EPUB format that JEPA has promoted to date:
A. Promoting the use of image-based PDFs and FXL EPUBs of digital publications
B. Improving the accuracy of OCR for Japanese with AI
C. Disseminating text-based PDFs of digital publications
D. Improving the accessibility of text-based PDFs with read-along functionality, and studying formats for both accessible (e.g., textual or aural) renditions and visual renditions
Translated by Media Do International.