The Royal Society of Chemistry


Practical Experiences With the Adoption of XML in Commercial Publishing

Richard Kidd and Neil Hunter

http://www.rsc.org

© The Royal Society of Chemistry 2000

Royal Society of Chemistry

• Learned society & publisher
• Journals publishing:
  • 20+ journals
  • 35,000 pages per year
  • Print, PDF, HTML

Conventional Publishing

• Paper-based editing
• PDF available later
• PDF used in conjunction with SGML header data to publish on the web

But:
• Lack of flexibility in electronic delivery
• Data for electronic products of variable quality
• No archive

Conventional Publishing (workflow diagram)

[Diagram labels: Authors; Production Office; Studio; Typesetters; paper and Word files; on-paper editing; proofs; corrections; author proofs; on-paper proof correction; artwork; outputs: print, PDF & SGML, HTML (WWW).]

New Web Requirements

• To publish on web before print version
• Other factors:
  • Overall reduction in publication times
  • Reduce costs
  • Introduce on-screen editing
  • Need one source for all outputs:
    • HTML
    • Print/PDF
    • Headers, contents lists

‘Ideal’ Process

• Author data captured as SGML
• RSC edits SGML
• ‘Autoproof’ created
• RSC corrects SGML
• RSC publishes HTML and PDF
• SGML returned to typesetter for final page make-up

‘Ideal’ Process (workflow diagram)

[Diagram labels: Authors; Production Office; Studio; Typesetters; paper and Word files; data conversion; SGML; SGML editing; proof creation; SGML store; proofs; corrections; outputs: print artwork, HTML (WWW).]

Where to Start?

• Big bang?
• Capture?
• Editing?

Our SGML experience

• SGML DTD development, software evaluations and trials were unproductive
• Expensive and complex tools with high support, training and consultation costs
• Full SGML implementation costly and demanding for our various typesetters
• Unless the data was “live” it was unreliable

Start at the End

• After final correction
  • Develop DTD against real data
  • Practical experience of SGML/XML
  • Didn’t affect other production processes
  • Investigate repository, editing etc. later
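Developing a DTD against real data can start with a simple inventory of what the data actually contains. The following is a minimal sketch, not the RSC's tooling: it uses Python's standard library, and the sample documents and element names (`article`, `title`, `para`, `formula`, `table`) are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample articles; the element names are illustrative only.
SAMPLES = [
    "<article><title>A</title><para>Text <formula>H2O</formula></para></article>",
    "<article><title>B</title><para>One</para><para>Two</para><table/></article>",
]


def element_inventory(documents):
    """Count how many times each element name occurs across the documents."""
    counts = Counter()
    for source in documents:
        root = ET.fromstring(source)
        for element in root.iter():  # iter() includes the root element itself
            counts[element.tag] += 1
    return counts


def parent_child_pairs(documents):
    """Collect the set of (parent, child) element pairs seen in the data."""
    pairs = set()
    for source in documents:
        root = ET.fromstring(source)
        for parent in root.iter():
            for child in parent:  # direct children only
                pairs.add((parent.tag, child.tag))
    return pairs


inventory = element_inventory(SAMPLES)
pairs = parent_child_pairs(SAMPLES)
```

The element counts show which structures matter in practice, and the parent/child pairs are a rough starting point for the content models a DTD would have to allow.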

XML Arrives

• XML - the part of SGML that we need
• Developed a DTD that could be used in either an XML or an SGML environment
• Set time-scales for DTD revisions
• Pragmatic approach to tables, maths and bibliographic references
• Set the repository issues aside and use the file system

Microsoft MSXML

• IE5 and MSXML allowed us to test our DTD and XML data, and to prove concepts
• Now used to generate static HTML pages
• Includes XSLT, DOM and parser
• ASP and JScript used to preprocess documents via the DOM
• Offered an inexpensive, well-documented and reliable tool set
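The idea of walking a parsed document through the DOM and writing out a static HTML page can be sketched in a few lines. This is an illustration only: Python's `xml.dom.minidom` stands in for MSXML and JScript, and the `article`, `title` and `para` element names are assumptions rather than the RSC DTD.

```python
from html import escape
from xml.dom.minidom import parseString

# Hypothetical article; the element names are assumptions for this sketch.
ARTICLE_XML = """<article>
  <title>Ring closure in &lt;5 steps</title>
  <para>First paragraph.</para>
  <para>Second paragraph.</para>
</article>"""


def text_of(node):
    """Concatenate all text found below a DOM node."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(text_of(child))
    return "".join(parts)


def article_to_html(xml_source):
    """Build a static HTML page from the article's title and paragraphs."""
    root = parseString(xml_source).documentElement
    title = text_of(root.getElementsByTagName("title")[0])
    body = ["<h1>" + escape(title) + "</h1>"]
    for para in root.getElementsByTagName("para"):
        body.append("<p>" + escape(text_of(para)) + "</p>")
    return (
        "<html><head><title>" + escape(title) + "</title></head><body>"
        + "".join(body)
        + "</body></html>"
    )


page = article_to_html(ARTICLE_XML)
```

A stylesheet-driven XSLT transform would express the same mapping declaratively; the DOM walk is shown here because it keeps the example within the standard library.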

Graphics and Glyphs

• Maths, chemistry and other non-ASCII characters mapped to glyphs in HTML
• Combining character entities
• XSLT used to output context-sensitive character mappings
• Unicode-ready
• Common publisher problems?
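A context-sensitive character mapping can be pictured as a lookup that depends on where the character appears. The sketch below is hypothetical throughout: the glyph table, the image file names and the two context names (`body`, `plain`) are invented for illustration, and Python is used in place of XSLT.

```python
# Hypothetical glyph table: code point -> image file name (not a real mapping).
GLYPH_IMAGES = {
    0x2192: "rarr.gif",   # rightwards arrow
    0x21CC: "equil.gif",  # equilibrium harpoons
}


def map_character(char, context):
    """Return the HTML output for one character in a given context.

    context is "body" (glyph images allowed) or "plain" (text only, for
    example a page title). Both context names are assumptions of this sketch.
    """
    code = ord(char)
    if code < 128:
        return char  # ASCII passes through; markup escaping is not handled here
    if context == "body" and code in GLYPH_IMAGES:
        return '<img src="{0}" alt="&#{1};">'.format(GLYPH_IMAGES[code], code)
    return "&#{0};".format(code)  # numeric character reference as the fallback


def map_text(text, context):
    """Apply the character mapping to every character of a string."""
    return "".join(map_character(c, context) for c in text)


in_body = map_text("A \u2192 B", "body")
in_title = map_text("A \u2192 B", "plain")
```

Keeping the source in Unicode and deciding on images or character references only at output time is what lets the image table shrink as browser font support improves.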

Bringing XML Forward

One supplier already had an SGML workflow, so we could start a pilot:
• Data captured as SGML (now XML)
• RSC edits SGML (now XML)
• Typesetter creates auto-proof
• RSC corrects SGML (now XML)
• RSC creates HTML
• Final SGML (now XML) returned to typesetter for page make-up

On-screen Editing

• Currently in Arbortext Adept/Epic
• SoftQuad XMetaL a possibility for the future

Next Steps

• Roll out to remaining suppliers and see how they can implement an XML workflow
• Continue to train our editors and improve the process
• Aim to have a full XML workflow by mid-2001

Next Developments

Gaining control of our data should allow:
• Creation of proofs: HTML or PDF using XSL-FO
• All outputs from one source using XSLT
• Integration with our manuscript tracking system to enable exchange of control data and updating of XML
• Cross-publisher article linking
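Producing all outputs from one source means that the article page, the header data and the contents list are all derived from the same XML file. A minimal sketch under stated assumptions: the `article`, `title`, `author` and `para` elements, the `id` attribute and the `<id>.html` file naming are hypothetical, and Python's standard library is used instead of XSLT.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical article sources; element and attribute names are assumptions.
ARTICLES = [
    '<article id="a1"><title>Alpha</title><author>Author One</author>'
    "<para>Body one.</para></article>",
    '<article id="a2"><title>Beta</title><author>Author Two</author>'
    "<para>Body two.</para></article>",
]


def header_record(root):
    """Extract header data of the kind a contents list or index would use."""
    return {
        "id": root.get("id", ""),
        "title": root.findtext("title", default=""),
        "author": root.findtext("author", default=""),
    }


def html_page(root):
    """Render the article body as an HTML fragment."""
    title = escape(root.findtext("title", default=""))
    paras = "".join(
        "<p>" + escape(p.text or "") + "</p>" for p in root.iter("para")
    )
    return "<h1>" + title + "</h1>" + paras


def contents_list(roots):
    """Render a contents list that links to each article page."""
    items = []
    for root in roots:
        head = header_record(root)
        items.append(
            '<li><a href="{id}.html">{title}</a> ({author})</li>'.format(
                id=escape(head["id"]),
                title=escape(head["title"]),
                author=escape(head["author"]),
            )
        )
    return "<ul>" + "".join(items) + "</ul>"


roots = [ET.fromstring(source) for source in ARTICLES]
toc = contents_list(roots)
```

Because every output reads the same parsed source, a correction made once in the XML appears in the page, the header record and the contents list the next time they are generated.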

Future Developments

• SVG, MathML, CML
• Templates to simplify capture
• XSL-FO: one file for all outputs
• Improved Unicode support
• Improved search functionality
• XML direct to browser or HTML on the fly: customised views

When the Roulette Wheel Stops...

• Continuous in-house development?
• Relationship with suppliers?

But we’ll be in a good position to act quickly whatever we have to do

Now:

• Reap the benefits of open standards
• An archive, which can service our publishing needs as they develop
• Continuous publication
• Reduced costs
• Concentrate on adding value

Our Conclusions:

• Data’s not trustworthy unless you do something with it
• You will know your data inside-out
• Using XML gives us the control over the information that is our business
• What helped:
  • Suppliers’ co-operation and staff commitment
  • Expertise in DTD development
  • Industry support for XML standards