Using Open Xml and Helper Libraries to generate

Download Report

Transcript Using Open Xml and Helper Libraries to generate

Using Open Xml and Helper Libraries
to generate Excel and Word
documents for websites
By Peter Messenger
Senior Developer
Triple Point Technologies (previously QMASTOR)
Website: www.petermessenger.com
Email : [email protected]
Sole IT Developer : http://www.physiotherapyexercises.com
Why Generate Excel/Word
documents?
•
•
•
•
Portable
Commonly Used
Users are quite familiar with using it
Can be read by many other programs
• Used at work and for my home project to
output ResX files for translation by
professional or other language speakers
How to generate excel files?
Idea 1 – used by student who developed our
translation program
COM Automation
–
–
–
–
–
–
“Ok” for single user desktop app
Not Thread Safe
Excel.exe not written to support concurrent users
Microsoft excel needs to be installed on the server
Very, very slow for large datasets
Not recommended by Microsoft
How to generate excel files?
Idea 2 – used in my output page on my website (2005)
Generate CSV or tab files and import
•
No compression, files can be quite large
Import/Export using Open OLE and Sql
•
•
•
Can be quite temperamental due to the way it determines the column types –
based on looking at the first “X” rows
Mixed column types (integers and text) can result in data being nulled
Varying length columns can be truncated due to it determining column types text
(255 characters max) versus long text columns types
using (OleDbConnection conn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\\temp\\test.xls;Extended
Properties='Excel 8.0;HDR=Yes'"))
{
conn.Open();
OleDbCommand cmd = new OleDbCommand("CREATE TABLE [Sheet1] ([Column1] string, [Column2] string)", conn);
cmd.ExecuteNonQuery();
}
Problems?
• Both methods are problematic
• The users can “break” the files by adding
columns, changing names or using sort badly
• This results in data corruption and lots of
heart ache when uploading them
So it there a better way?
A Better Way?
• Office 2007 introduced the Open Xml Format
– Zipped xml format
– Developer website http://openxmldeveloper.org/
– Can download Open XML Productivity Tool
• This can read any office file, display the xml and
generate C# script to regenerate the file
• Based on this you can use files for templates and add
data or generate files from scratch
It all sounds good…….
Problems Still?
• Documentation on how to do things isn’t very
good
• Can be quite difficult, a single change can
result in the file refusing to open with very
cryptic error messages
• Reading in large recordsets can be slow
So what then?
A solution?
• Open source codeplex library
– http://closedxml.codeplex.com/
– http://epplus.codeplex.com/
I used Closed Xml as the documentation was better and it was a more
active project.
It provides a more structured interface and generates the xml document
for you in the background.
A good example
http://closedxml.codeplex.com/wikipage?title=Showcase&referringTitle=
Documentation
Benefits
• Much easier to develop, takes out the
complexity of having to know the xml
• Not brittle, doesn’t break
• Loading in large datasets is lightning fast
• Can still use existing files as templates
• Can generate protected workbooks so users
cannot change anything you do not want
them to change
Demo
Translation page download for translators
http://www.physiotherapyexercises.com/ExerciseData_Download.aspx
– Download file
– Can only edit their language column, cannot edit
english or key column
– Can still filter and sort to aid them in translation
– Cannot sort in way that would break data
relationships
– We then upload file, it determines changes made and
generates sql scripts to update what has changed
Demo 2
Main data for english files
http://www.physiotherapyexercises.com/ExerciseData_English.aspx
– Download file
– Can add, delete or modify records
– Data can have validation built in, with comments
indicating what is acceptable
• We then upload file, it determines changes made and
generates sql scripts to update what has changed
What about Word?
• Have functionality for users to generate web,
PDF, or word documents that they can
customise. This uses the iTextSharp library.
• Some users want absolute flexibility, being
able to provide their own word templates
Want another cheap (free) solution, something
on codeplex again?
Word Document Generator
• Another open source codeplex project
– http://worddocgenerator.codeplex.com/
– Uses document placeholders
– Had to amend the source code a bit so it could
handle custom images in the template file
Example for a client booklet
– http://www.physiotherapyexercises.com/
Example Code
DocumentGenerationInfo generationInfo = GetDocumentGenerationInfo("ExerciseDocumentGenerator", "1.0", datasource, templatefile, false);
var sampleDocumentGenerator = new ExerciseDocumentGenerator(generationInfo); -- ExerciseDocumentGenerator inherits off DocumentGenerator
byte[] result = sampleDocumentGenerator.GenerateDocument();
Response.ClearContent();
Response.Clear();
Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
Response.AddHeader("Content-Disposition", "attachment; filename=Exercises.docx;");
Response.BinaryWrite(result);
Response.Flush();
Response.End();
private static DocumentGenerationInfo GetDocumentGenerationInfo(string docType, string docVersion, object dataContext, string fileName, bool
useDataBoundControls)
{
return new DocumentGenerationInfo
{
Metadata =
new DocumentMetadata { DocumentType = docType, DocumentVersion = docVersion },
DataContext = dataContext,
TemplateData =
File.ReadAllBytes(
HostingEnvironment.MapPath("~/" + fileName)),
IsDataBoundControls = useDataBoundControls
};
}
Summary
• Easy to integrate Word and Excel into your
websites with open xml and these libraries
• Other libraries are also worth investigating
Comments and questions?