The Royal Society of Chemistry
Practical Experiences With the Adoption of XML in Commercial Publishing
Richard Kidd Neil Hunter [email protected]
http://www.rsc.org
© The Royal Society of Chemistry 2000
Royal Society of Chemistry
Learned society & publisher
Journals publishing:
• 20+ journals
• 35,000 pages per year
• Print, PDF, HTML
Conventional Publishing
Paper-based editing
PDF available later
PDF used in conjunction with SGML header data to publish on the web
But:
• Lack of flexibility in electronic delivery
• Data for electronic products of variable quality
• No archive
Conventional Publishing
[Workflow diagram. Labels: Authors; Production Office; Paper; Word files; Studio; On-paper Editing; Typesetters; proofs; corrections; author proofs; On-paper Proof Correction; artwork; Print; PDF & SGML; HTML; WWW]
New Web Requirements
To publish on web before print version
Other factors:
• Overall reduction in publication times
• Reduce costs
• Introduce on-screen editing
• Need one source for all outputs:
  • HTML
  • Print/PDF
  • Headers, contents lists
‘Ideal’ Process
Author data captured as SGML
RSC edit SGML
‘Autoproof’ created
RSC correct SGML
RSC publish HTML and PDF
SGML returned to typesetter for final page make-up
‘Ideal’ process
[Workflow diagram. Labels: Authors; Production Office; Paper; Word files; Data Conversion; Studio; Typesetters; SGML; SGML editing; Proof creation; SGML store; proofs; corrections; artwork; Print; HTML; WWW]
Where to Start?
Big bang?
Capture?
Editing?
Our SGML experience
SGML DTD development, software evaluations and trials were unproductive
Expensive and complex tools with high support, training and consultation costs
Full SGML implementation costly and demanding for our various typesetters
Unless the data was “live” it was unreliable
Start at the End
After final correction
• Develop DTD against real data
• Practical experience of SGML/XML
• Didn’t affect other production processes
• Investigate repository, editing etc. later
XML Arrives
XML - the part of SGML that we need
Developed a DTD that could be used in either XML or SGML environment
Set time-scales for DTD revisions
Pragmatic approach to tables, maths and bibliographic references
Set the repository issues aside and use file system
Microsoft MSXML
IE5 and MSXML allowed us to test our DTD, XML data and prove concepts
Now used to generate static HTML pages
Includes XSLT, DOM and parser
ASP and JScript used to preprocess documents via the DOM
Offered an inexpensive, well documented and reliable tool set
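The two steps named on this slide, preprocessing a document through the DOM and then generating a static HTML page, can be sketched roughly as follows. This is only an illustration in Python using the standard library DOM, not the ASP/JScript and MSXML code the slide refers to. The element names (`article`, `section`, `heading`, `para`) and the section-numbering step are assumptions made for the example, not part of the RSC DTD.

```python
from xml.dom import minidom
from html import escape

# Hypothetical article markup; the element names are illustrative only
# and are not taken from the RSC DTD.
SAMPLE = """<article>
  <title>Synthesis of a model compound</title>
  <section><heading>Introduction</heading><para>Background text.</para></section>
  <section><heading>Results</heading><para>Yields &amp; spectra.</para></section>
</article>"""


def text_of(node):
    """Concatenate all text nodes beneath a DOM node."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(text_of(child))
    return "".join(parts)


def preprocess(doc):
    """DOM preprocessing step (assumed): number each section in document order."""
    sections = doc.getElementsByTagName("section")
    for number, section in enumerate(sections, start=1):
        section.setAttribute("number", str(number))
    return doc


def to_html(doc):
    """Render the preprocessed DOM as a static HTML page."""
    title = escape(text_of(doc.getElementsByTagName("title")[0]))
    lines = ["<html><head><title>%s</title></head><body>" % title,
             "<h1>%s</h1>" % title]
    for section in doc.getElementsByTagName("section"):
        heading = section.getElementsByTagName("heading")[0]
        lines.append("<h2>%s. %s</h2>" % (section.getAttribute("number"),
                                          escape(text_of(heading))))
        for para in section.getElementsByTagName("para"):
            lines.append("<p>%s</p>" % escape(text_of(para)))
    lines.append("</body></html>")
    return "\n".join(lines)


page = to_html(preprocess(minidom.parseString(SAMPLE)))
print(page)
```

Keeping the preprocessing separate from the rendering mirrors the split on the slide: the DOM pass enriches the document, and the page generation only reads it.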
Graphics and Glyphs
Maths, chemistry and other non-ASCII characters mapped to glyphs in HTML
Combining character entities
XSLT used to output context sensitive character mappings
Unicode ready
Common Publisher problems?
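A context-sensitive character mapping of the kind mentioned on this slide might look something like the sketch below. The slide says XSLT was used; this Python version is a stand-in to show the idea only. The mapping table, the image file names and the two context names (`body` and anything else) are all hypothetical.

```python
from html import escape

# Hypothetical mapping table: each special character gets an image glyph
# for body text and a plain-text fallback for contexts where an image
# cannot appear (for example a page title or a contents list).
GLYPHS = {
    "\u21cc": ("rlharr.gif", "<=>"),  # equilibrium arrow (harpoons)
    "\u00c5": ("angst.gif", "A"),     # A with ring above, used for angstrom
}


def map_chars(text, context):
    """Map characters to HTML according to the context they appear in."""
    out = []
    for ch in text:
        if ch in GLYPHS:
            image, fallback = GLYPHS[ch]
            if context == "body":
                # Body text can carry an inline image glyph.
                out.append('<img src="%s" alt="%s">' % (image, escape(fallback)))
            else:
                # Other contexts get the plain-text fallback.
                out.append(escape(fallback))
        elif ord(ch) > 127:
            # Any other non-ASCII character becomes a numeric reference.
            out.append("&#%d;" % ord(ch))
        else:
            out.append(escape(ch))
    return "".join(out)


print(map_chars("A \u21cc B", "body"))
print(map_chars("A \u21cc B", "title"))
```

Because the table is keyed on Unicode code points, an image glyph can be retired simply by removing its entry once browsers display the character directly.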
Bringing XML Forward
One supplier already had SGML workflow so we could start a pilot:
• Data captured as SGML (now XML)
• RSC edits SGML (now XML)
• Typesetter creates auto-proof
• RSC corrects SGML (now XML)
• RSC creates HTML
• Final SGML (now XML) returned to typesetter for page make-up
On-screen Editing
Currently in Arbortext Adept/Epic
SoftQuad XMetaL a possibility for future
Next Steps
Roll-out to remaining suppliers, see how they can implement an XML workflow
Continue to train our editors and improve the process
Aim to have a full XML workflow by mid 2001
Next Developments
Gaining control of our data should allow:
• Creation of proofs: HTML or PDF using XSL-FO
• All outputs from one source using XSLT
• Integration with our manuscript tracking system to enable exchange of control data and updating of XML
• Cross publisher article linking
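The "all outputs from one source" idea can be illustrated with a small sketch in which one XML record yields header data, a contents-list entry and an HTML heading. The slide envisages XSLT for this; the Python below is only a stand-in, and the element names, the sample values and the output formats are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical single-source article record; the element names and values
# are invented for illustration and do not come from the RSC DTD.
SOURCE = """<article id="a1">
  <title>Synthesis of a model compound</title>
  <author>A. Chemist</author>
  <author>B. Analyst</author>
  <firstpage>101</firstpage>
</article>"""


def header_record(article):
    """Header data, e.g. for a search index or an abstracting service."""
    return {
        "id": article.get("id"),
        "title": article.findtext("title"),
        "authors": [a.text for a in article.findall("author")],
        "firstpage": article.findtext("firstpage"),
    }


def contents_entry(article):
    """One line of an issue contents list."""
    record = header_record(article)
    return "%s, %s, p. %s" % (record["title"],
                              " and ".join(record["authors"]),
                              record["firstpage"])


def html_heading(article):
    """Heading block for the HTML version of the article."""
    record = header_record(article)
    return '<h1>%s</h1>\n<p class="authors">%s</p>' % (
        escape(record["title"]), escape(", ".join(record["authors"])))


article = ET.fromstring(SOURCE)
print(header_record(article))
print(contents_entry(article))
print(html_heading(article))
```

Each output is derived from the same parsed record, so a correction to the source reaches the header, the contents list and the HTML without being re-keyed.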
Future Developments
SVG, MathML, CML
Templates to simplify capture
XSL-FO - one file for all outputs
Improved Unicode support
Improved search functionality
XML direct to browser or HTML on the fly - customised views
When the Roulette Wheel Stops...
Continuous in-house development?
Relationship with suppliers?
But we’ll be in a good position to act quickly whatever we have to do
Now:
Reap the benefits of open standards
An archive, which can service our publishing needs as they develop
Continuous publication
Reduced costs
Concentrate on adding value
Our Conclusions:
Data’s not trustworthy unless you do something with it
You will know your data inside-out
Using XML gives us the control over the information that is our business
What helped:
• Suppliers’ co-operation and staff commitment
• Expertise in DTD development
• Industry support for XML standards