Grants:IdeaLab/Djvu text layer editor
status: idea
project:
please add a title
idea creator:
project contact:
alex.brollogmail.com
participants:
summary:
Use some VisualEditor (VE) features to edit the DjVu text layer
created on: 12:10, 13 March 2014
Project idea
What is the problem you're trying to solve?
Wikisource makes heavy use of the OCR text layer, but effectively uses only a small part of it (the bare text). The DjVu text layer contains much more information (words, lines, paragraphs, regions, columns, and page text coordinates). Unfortunately, this information is more readily exported in a Lisp-like format or as XML than as hOCR.
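To make the hierarchy described above concrete, here is a minimal Python sketch that parses a Lisp-like text layer into nested lists and collects each word with its coordinates. It assumes the layout that DjVuLibre's `djvused` prints with `print-txt` (a zone type, four integer coordinates, then child zones or a quoted string). The sample data is hand-written for illustration, and the names `parse_sexpr` and `iter_words` are hypothetical.

```python
import re

# Tokens: parentheses, double-quoted strings (with backslash escapes), bare atoms.
TOKEN = re.compile(r'\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+')


def parse_sexpr(text):
    """Parse one S-expression into nested Python lists."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        if tok.startswith('"'):
            # Simplification: drop the backslash of each escape (\" and \\);
            # octal escapes for non-ASCII bytes are NOT decoded here.
            return re.sub(r"\\(.)", r"\1", tok[1:-1])
        return int(tok) if tok.lstrip("-").isdigit() else tok

    return read()


def iter_words(node):
    """Yield (text, (c1, c2, c3, c4)) for every word zone under node."""
    kind, c1, c2, c3, c4, *children = node
    if kind == "word":
        yield children[0], (c1, c2, c3, c4)
    else:
        for child in children:
            if isinstance(child, list):
                yield from iter_words(child)


# Hand-written sample, not taken from a real file.
SAMPLE = """
(page 0 0 2550 3300
 (line 300 3000 900 3060
  (word 300 3000 520 3060 "Djvu")
  (word 540 3000 900 3060 "layer")))
"""

for text, box in iter_words(parse_sexpr(SAMPLE)):
    print(text, box)
```

The same traversal would also pass through column, region, and paragraph zones if they were present, since only `word` zones are treated specially.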
What is your solution?
- To test VE, or other simpler WYSIWYG HTML/XML editors, for editing the text only while saving the information wrapped in XML tags;
- To test the extraction, conversion, and upload of the text layer into DjVu files using a simple web interface.
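The idea of editing only the text while keeping the surrounding XML tags could look roughly like the following Python sketch. It assumes a fragment in the style of DjVuXML, with `WORD` elements carrying a `coords` attribute; the sample fragment is hand-written, the coordinate values are treated as opaque strings, and the function names `visible_text` and `apply_corrections` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment with a deliberate OCR-style typo ("Djuv").
SAMPLE = """<LINE>
<WORD coords="300,3060,520,3000">Djuv</WORD>
<WORD coords="540,3060,900,3000">layer</WORD>
</LINE>"""


def visible_text(xml_fragment):
    """Return only the editable word texts, hiding tags and coordinates."""
    root = ET.fromstring(xml_fragment)
    return [word.text or "" for word in root.iter("WORD")]


def apply_corrections(xml_fragment, new_words):
    """Write corrected words back, leaving every coords attribute untouched."""
    root = ET.fromstring(xml_fragment)
    words = list(root.iter("WORD"))
    if len(words) != len(new_words):
        # This sketch only handles one-to-one corrections.
        raise ValueError("word count changed; coordinates cannot be matched")
    for elem, text in zip(words, new_words):
        elem.text = text
    return ET.tostring(root, encoding="unicode")


print(visible_text(SAMPLE))
print(apply_corrections(SAMPLE, ["Djvu", "layer"]))
```

A real editor would also have to decide what to do when words are merged, split, or deleted, which this sketch deliberately rejects.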
Ideas for a test tool
A test could be done with existing tools:
- DjVuLibre (running on Tool Labs), and in particular:
  - djvutoxml, which extracts the internal mapped text of DjVu pages as an XML file;
  - djvuxmlparser, which loads the modified mapped text back into the DjVu file;
- tinyEditor, to edit the XML text through a comfortable WYSIWYG interface (XML tags are hidden; only the editable text is shown in an HTML textarea);
- a small CGI script on Tool Labs to manage such a web editing interface.
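The extract, edit, and load-back cycle with the tools listed above might be driven from a script along these lines. This is only a sketch: the command-line forms (`djvutoxml input.djvu output.xml` and `djvuxmlparser -o file.djvu input.xml`) are assumed from the DjVuLibre man pages and should be checked against the installed version, and the function names and file names are hypothetical.

```python
import subprocess


def extract_command(djvu_path, xml_path):
    """Command that dumps the DjVu file's mapped text as XML (assumed form)."""
    return ["djvutoxml", djvu_path, xml_path]


def upload_command(djvu_path, xml_path):
    """Command that loads the edited XML back into the DjVu file (assumed form)."""
    return ["djvuxmlparser", "-o", djvu_path, xml_path]


def round_trip(djvu_path, xml_path, edit, run=subprocess.run):
    """Extract the text layer, let `edit` modify the XML file, then load it back.

    `edit` is any callable taking the XML path; `run` can be replaced in tests
    so that no external program is actually executed.
    """
    run(extract_command(djvu_path, xml_path), check=True)
    edit(xml_path)
    run(upload_command(djvu_path, xml_path), check=True)


if __name__ == "__main__":
    # Show the commands without executing them.
    print(" ".join(extract_command("book.djvu", "book.xml")))
    print(" ".join(upload_command("book.djvu", "book.xml")))
```

In a web tool, the `edit` step would be where the browser-based editor hands back the corrected text.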
Project goals
- To split proofreading into two steps:
  - DjVu text editing (saving the result into the DjVu text layer);
  - text formatting.
Get involved
Welcome, brainstormers! Your feedback on this idea is welcome. Please click the "discussion" link at the top of the page to start the conversation and share your thoughts.