Energy Company

5,000+ Word/CAD files to DITA, with PDF and HTML5 output.

This client is a large utilities company that delivers natural gas and electric power to millions of homes. A team of 10-15 subject matter experts creates and maintains all the documentation for designers as well as for installation and maintenance engineers in the field. Most documentation is required to comply with federal and state regulations.

The client had more than 5,000 documents: 45 percent were MS Word files with no styles applied, and 55 percent were created in CAD software and converted to AutoCAD files. The only structure in the files was implied by formatting, which was applied individually to paragraphs or even to individual words and characters.

We wrote scripts to allow efficient, author-guided mapping of this ad hoc formatting to paragraph styles. After all Word content was mapped in this way, the individual page documents were merged into sections, with the AutoCAD pages inserted as SVGs at the right locations.
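The idea of author-guided mapping can be illustrated with a minimal Python sketch. It is not the tooling used in the project. It assumes the formatting of each paragraph has already been read into a hypothetical Formatting record, and the style names ("Heading1", "Body") are placeholders. Paragraphs are grouped by formatting signature so that an author decides once per signature instead of once per paragraph.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Formatting:
    """Hypothetical formatting signature of one paragraph."""
    font: str
    size_pt: float
    bold: bool


def signature_counts(paragraphs):
    """Count how many paragraphs share each formatting signature."""
    return Counter(fmt for _text, fmt in paragraphs)


def apply_mapping(paragraphs, mapping, default="Body"):
    """Return (style_name, text) pairs; unmapped signatures get the default style."""
    return [(mapping.get(fmt, default), text) for text, fmt in paragraphs]


# Illustrative input: (text, formatting) pairs for unstyled paragraphs.
paragraphs = [
    ("Gas Meter Installation", Formatting("Arial", 16.0, True)),
    ("Shut off the supply valve.", Formatting("Arial", 10.0, False)),
    ("Check for leaks.", Formatting("Arial", 10.0, False)),
]

# The author reviews the distinct signatures, most frequent first...
counts = signature_counts(paragraphs)

# ...and assigns a style name to each signature that needs one.
mapping = {Formatting("Arial", 16.0, True): "Heading1"}
styled = apply_mapping(paragraphs, mapping)
```

Sorting the signatures by frequency lets the author handle the most common formatting first, which covers most of the content with a few decisions.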

The properly styled Word content was converted to structured FrameMaker, using a subset of the DITA standard. A text extraction tool was built to pull textual content from the SVGs into the DITA structure.
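As a rough illustration of such text extraction, the following Python sketch collects the content of SVG text elements and wraps it in a bare DITA topic. It is an assumption-laden example, not the tool built for the client: the function names, the topic id, and the sample drawing are invented, and a real drawing would also need ordering and grouping of labels.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def extract_svg_text(svg_source):
    """Return the non-empty text labels found in an SVG document string."""
    root = ET.fromstring(svg_source)
    lines = []
    for node in root.iter(SVG_NS + "text"):
        # itertext() also picks up nested <tspan> content; collapse whitespace.
        text = " ".join("".join(node.itertext()).split())
        if text:
            lines.append(text)
    return lines


def to_dita_topic(topic_id, title, lines):
    """Wrap extracted labels in a minimal DITA topic, one <p> per label."""
    topic = ET.Element("topic", id=topic_id)
    ET.SubElement(topic, "title").text = title
    body = ET.SubElement(topic, "body")
    for line in lines:
        ET.SubElement(body, "p").text = line
    return ET.tostring(topic, encoding="unicode")


# Invented sample drawing with one nested tspan and one empty label.
sample_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<text x="10" y="20">Valve <tspan>V-101</tspan></text>'
    '<text x="10" y="40">   </text>'
    '<text x="10" y="60">Regulator</text>'
    "</svg>"
)

labels = extract_svg_text(sample_svg)
dita = to_dita_topic("drawing-text", "Drawing text", labels)
```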

Further scripts created DITA-compliant tables of contents for each section and allowed separate sections to be published to PDF and HTML5, while maintaining cross-references from one section to another.
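In DITA, a table of contents is normally expressed as a map that lists topics in reading order. The Python sketch below shows what generating such a map for one section might look like. The function name, section title, and file names are hypothetical, and the project's own scripts ran against FrameMaker rather than plain XML files.

```python
import xml.etree.ElementTree as ET


def build_section_map(title, topic_hrefs):
    """Build a minimal DITA map that lists a section's topics in reading order."""
    root = ET.Element("map")
    ET.SubElement(root, "title").text = title
    for href in topic_hrefs:
        ET.SubElement(root, "topicref", href=href)
    return ET.tostring(root, encoding="unicode")


# Hypothetical section with three topic files.
section_map = build_section_map(
    "Gas Meter Installation",
    ["overview.dita", "tools.dita", "procedure.dita"],
)
```

Keeping one map per section is what makes it possible to publish a section on its own: the map defines exactly which topics belong to that output.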

Today, the client maintains all content in structured FrameMaker. They can publish to a set of interlinked PDFs as well as to HTML5. Publishing a single section does not break links to other sections.