Cocoon – Web-Publishing Framework

Cocoon is free, open-source software from the Apache Software Foundation, intended for use as a web-publishing framework. Since the web emerged as a publishing medium, countless pages have become available, containing documentation, information, tutorials, opinions and whatever else an individual or organization wants to present on the web.

About a decade back (1995), the only client was the standard web browser. Nowadays, clients can be of many types, such as wireless and WAP browsers and interactive text-to-speech devices. Making the content suitable for each type of client is a difficult, laborious and time-consuming job. XML has become the universal form of data, but we still have to convert XML content into a form suitable for each client. This is achieved by XSLT (readers can refer to an earlier tutorial on XSLT using JAXP and in DotNet, in the October 2004 issue of DeveloperIQ).

It is customary to use three related terms in this context:

  • Web Publishing Framework;
  • Content-Syndication; and
  • Content-Management Systems (CMS).

CMS is a very hot topic nowadays, and the Apache Software Foundation (ASF) has its own free, open-source CMS in Java under development. Getting familiar with Cocoon will help in venturing into CMS. In brief, the main requirement is to convert XML into:

  • HTML;
  • PDF;
  • RTF;
  • WML (WAP);
  • SVG (Scalable Vector Graphics);
  • VoxML (voice markup); and
  • VRML.

  
This is achieved using appropriate style sheets. Though such conversions can be done through the command line, Cocoon offers an easier solution in the form of a ready-made framework.
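To make the idea of an "appropriate style sheet" concrete, here is a minimal sketch of an XSLT 1.0 style sheet that turns a small XML page into HTML. The source element names (page, title, paragraph) are assumptions chosen for illustration and do not come from any particular Cocoon sample.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes a source document shaped like
     <page><title>...</title><paragraph>...</paragraph></page>.
     These element names are illustrative assumptions. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Map the root element to an HTML skeleton -->
  <xsl:template match="/page">
    <html>
      <head><title><xsl:value-of select="title"/></title></head>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="paragraph"/>
      </body>
    </html>
  </xsl:template>

  <!-- Each paragraph element becomes an HTML paragraph -->
  <xsl:template match="paragraph">
    <p><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>
```

The same XML source can be paired with a different style sheet for each target format, which is exactly the separation of content and presentation that a publishing framework automates.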

The current release line is Cocoon 2.x, but we are using Cocoon 1.8. For beginners, a slightly earlier version is often the better choice: a brand-new release may not yet have been tested thoroughly, and less information is available about it (although a version 2 usually denotes a stable release). Our aim is just to get familiar with the environment rather than to explore every feature exhaustively. The field is changing rapidly, and technologies such as WML and VRML, which were much spoken about a few years back, are now passé. However, conversion of XML into HTML, PDF (for Acrobat Reader) and RTF is still very much relevant. PDF conversion in particular appears to be in demand among both readers and publishing companies.

Content syndication (RSS, Rich Site Summary) is another related area, and there have been some tutorials on this earlier.

We can download Cocoon 1.8.2 (zip) from the Apache website (www.apache.org). It is only about 6 MB. We can place the zip file on any drive of our choice and then unzip it into any folder we prefer. In our case, it has been unzipped to C:\cocoon-1.8.2.

Let us now see the various folders available in this distribution:

  • bin
  • conf
  • docs
  • lib
  • samples
  • skins
  • src
  • xdocs

Though books and articles on Cocoon have recently started appearing, all of them deal with version 2. The only book that deals with Cocoon 1.8 is ‘Java and XML’ by Brett McLaughlin (chapter 10). Even a Google search for cocoon1.8 does not yield much information!

We will use Tomcat 3.2. It is much simpler for beginners and is adequate for our purpose. We first copy the jar files available in the lib folder of cocoon-1.8.2 to the lib folder of tomcat3.2.

Following are the relevant jar files available in the lib folder of cocoon-1.8.2:

  • bsfengines.jar
  • bsf.jar
  • fop.jar
  • sax-bugfix.jar
  • turbine-pool.jar
  • w3c.jar
  • xalan.jar (1.2)
  • xerces.jar (1.2)

(Actually, instead of being very selective, we can just copy all the jar files.)

After this, we should copy cocoon.jar available in cocoon’s bin folder to the lib folder of tomcat3.2.

The next step is to edit the server.xml file of tomcat3.2, available in tomcat’s conf folder (NOTE: we should be extremely careful about upper and lower case, because the entries are case-sensitive). Check out code 1.

Code 1

<Server>
  <ContextManager>
    <!-- ... existing settings and contexts ... -->

    <!-- Context added for Cocoon -->
    <Context path="/cocoon"
             docBase="webapps/cocoon"
             debug="0"
             reloadable="true">
    </Context>
  </ContextManager>
</Server>

Now, we have to create the directory Tomcat3.2\webapps\cocoon and create a WEB-INF folder under it (WEB-INF and not web-inf!).

We should copy cocoon\conf\cocoon.properties into Tomcat3.2\webapps\cocoon\WEB-INF.

Next, we should copy Cocoon1.8.2\src\WEB-INF\web.xml to Tomcat3.2\webapps\cocoon\WEB-INF. The web.xml file should look like code 2.

Code 2

<web-app>
  <!-- Register the Cocoon servlet and point it at its properties file -->
  <servlet>
    <servlet-name>org.apache.cocoon.Cocoon</servlet-name>
    <servlet-class>org.apache.cocoon.Cocoon</servlet-class>
    <init-param>
      <param-name>properties</param-name>
      <param-value>WEB-INF/cocoon.properties</param-value>
    </init-param>
  </servlet>

  <!-- Route every request ending in .xml to Cocoon -->
  <servlet-mapping>
    <servlet-name>org.apache.cocoon.Cocoon</servlet-name>
    <url-pattern>*.xml</url-pattern>
  </servlet-mapping>
</web-app>

Now comes the most important step for successful running. We must remove parser.jar from the lib folder of tomcat3.2!

Cocoon works correctly only with xerces.jar (1.2), and the presence of parser.jar causes failure. Tomcat 3.2 loads the jar files in alphabetical order, so parser.jar gets loaded ahead of xerces.jar, and whichever parser is loaded first is the one that takes effect. Thus, if we leave parser.jar in the lib folder, xerces.jar is not used, and because Cocoon relies on the DOM Level 2 support in xerces.jar, the application fails to start.

At least for learning such nuances of class path, it is instructive to work with earlier versions of the software!

We are now ready to test our Cocoon installation. As usual, we go to tomcat3.2\bin, set JAVA_HOME and run the startup script:

>set JAVA_HOME=D:\JDK1.3
>startup

This starts the Tomcat server. In the browser, we type the URL as: http://localhost:8080/cocoon/Cocoon.xml

The last line on the screen is worthy of attention. It says that the page has been dynamically created (so where is Cocoon.xml?). The following quote from the Cocoon FAQ (available in the docs folder of Cocoon) is instructive.

“Firstly, Cocoon.xml is not an actual file on the disk - it is a special "virtual" test page. Note that it is case-sensitive, so cocoon.xml won't work.”

Anyway, if we get this display correctly, it means that we are ready to use Cocoon.

Our aim is to see the efficacy of cocoon in effecting the transformation of XML files into other formats, aided by appropriate xsl style sheets.

Apache has provided a number of sample demos, all of which are available in cocoon-1.8.2\samples folder.

We can just copy all the folders in this samples folder and place them in Tomcat3.2\webapps\cocoon.

We are now ready to test all the samples. We will find them to be very interesting.

What are the various samples in the samples folder and what do they demonstrate? The sample folder names are given below.

  • hello
  • fo
  • fp
  • slides
  • svg
  • wap
  • xsp
  • rss
  • docbook
  • sites, etc.

Details provided in table 1 have been taken from the index page of cocoon.

Table 1

Hello World - This is a very simple demonstration of how to use Cocoon. A simple XML page is transformed into an HTML page.

Hello World (with external message) - Same as above page, but some of its content is included using XML external entities.

Hello World (with imported style sheet) - Same as above page, but its style sheet is an extension of the previous one and changes some of its properties.

java.apache.org - This page shows a much more complex example that shows how powerful the style separation is and how powerful XML+XSL can be even for static web publishing.

jakarta.apache.org - This page shows the same thing for the Apache Jakarta web site.

RSS example - This page shows the use of Netscape RSS format for site description. This creates a simple way for one site to have headlines for news or items on other sites. Check out the JetSpeed project for more information on this technology.

Structure Formatting with XSLT - This page shows the use of a general XSLT style sheet that creates a tree view of the input file. The slides XML source is used as input.

Article Outline (Simplified DocBook DTD) - This page shows the structure of an article written using the simplified DocBook DTD.

Simple Form Handling - This example shows how the FP XSP form handling taglib can be used to maintain a dynamic page.

DCP using Java - This page shows how you can use Cocoon to generate dynamic content using XML processing instructions to trigger Java logic execution.

Simple FO Example - This page shows some potentials of the XSL Formatting Object specifications transformed into PDF (we suggest you install Adobe Acrobat Reader as your browser plugin for smoother integration).

More complex FO + SVG Example - This page shows higher potentials of the XSL Formatting Object specifications when mixed with other graphic outline specification such as SVG (Scalable Vector Graphics). Both are interpreted by FOP and rendered into PDF as area and vector graphics.


Novel FO formatting - In this example, part of Joseph Conrad's "The Heart of Darkness" novel is taken from its original style-free XML format and rendered into PDF using an XML-to-FO style sheet.

SVG - In this example, we show how Cocoon is able to generate an SVG image out of a dynamically generated page. Database graphs and vector counters are just a few tags away.

VRML - In this example, we show how Cocoon is able to generate virtual reality models by applying the appropriate style sheet to an XML page.

Web and WAP - In this example, we show how Cocoon is able to understand which browser is requesting the page and applies a different style sheet to the same XML page to render on different clients. This page is formatted in WML (Wireless Markup Language) if the Nokia Wap Toolkit 1.2 browser (which you could get for free from Nokia) requests this page. Look into the example source to change this for your favorite WAP browser/cell-phone/PDA.

VoxML - This page has the same exact source file as the hello world example, but the style sheet formats it using the VoxML language. This page has been tested with the Motorola VoxML SDK, which you can get for free from Motorola.

It is now time to test these samples in tomcat3.2.
Start the browser and type the URL as: http://localhost:8080/cocoon.

We get the directory listing as shown in table 2.

Table 2

Directory Listing for: /cocoon

Up to: /

Subdirectories:

  xsp/         Thu, 17 Apr 2016 00:20 GMT+05:30
  import/      Thu, 17 Apr 2016 00:23 GMT+05:30
  docbook/     Thu, 17 Apr 2016 00:23 GMT+05:30
  entities/    Thu, 17 Apr 2016 00:23 GMT+05:30
  fo/          Thu, 17 Apr 2016 00:23 GMT+05:30
  fp/          Thu, 17 Apr 2016 00:23 GMT+05:30
  hello/       Thu, 17 Apr 2016 00:23 GMT+05:30
  dcp/         Thu, 17 Apr 2016 00:23 GMT+05:30
  ldap/        Thu, 17 Apr 2016 00:23 GMT+05:30
  mail/        Thu, 17 Apr 2016 00:23 GMT+05:30
  profiler/    Thu, 17 Apr 2016 00:23 GMT+05:30
  rss/         Thu, 17 Apr 2016 00:23 GMT+05:30
  sites/       Thu, 17 Apr 2016 00:23 GMT+05:30
  slides/      Thu, 17 Apr 2016 00:23 GMT+05:30
  sql/         Thu, 17 Apr 2016 00:23 GMT+05:30
  structure/   Thu, 17 Apr 2016 00:23 GMT+05:30
  svg/         Thu, 17 Apr 2016 00:23 GMT+05:30
  vml/         Thu, 17 Apr 2016 00:23 GMT+05:30
  vrml/        Thu, 17 Apr 2016 00:23 GMT+05:30
  wap/         Thu, 17 Apr 2016 00:23 GMT+05:30
  xinclude/    Thu, 17 Apr 2016 00:23 GMT+05:30

Files:

  README       0.4 KB   Fri, 26 Jan 2016 18:17 GMT+05:30
  index.xml    7.4 KB   Fri, 26 Jan 2016 18:17 GMT+05:30
  index.xsl    1.7 KB   Fri, 26 Jan 2016 18:17 GMT+05:30

Tomcat Web Server v3.2.1

 

We can now try the examples one by one and see for ourselves that Cocoon delivers what it promises. Our interest is mainly in XML to PDF conversion. There are three examples for that!

a) The first example is: http://localhost:8080/cocoon/test-fo.xml

Alas! We get only a blank screen! Back to the FAQ, then. It says that the problem lies with Internet Explorer, and suggests that we modify the URL as:

http://localhost:8080/cocoon/test-fo.xml?dummy=test.pdf

Fine! It starts the Acrobat reader and the PDF document is rendered on the screen! (Note: It is assumed that Acrobat Reader has been installed in our system).

b) The second example is: http://localhost:8080/cocoon/test2-fo.xml?dummy=test.pdf (the result is shown in figure 1). It includes SVG also!

c) Let us now see the sample for SVG alone: http://localhost:8080/cocoon/svg/hello.xml?dummy=test.svg
(It may be necessary to have Adobe Illustrator installed in your system).

Carefully note that we had to append ‘?dummy=test.svg’ to the URL (the result is shown in figure 2).
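The PDF examples above are produced from XSL formatting objects (FO), which FOP renders into PDF. As a rough illustration of what a style sheet has to emit, here is a minimal FO document written against the XSL-FO 1.0 Recommendation. This is a sketch under an assumption: the FOP bundled with Cocoon 1.8.2 predates that Recommendation and may expect an earlier draft syntax, so the test-fo.xml sample shipped with Cocoon remains the reference for this installation. The master name "page" is a hypothetical choice.

```xml
<?xml version="1.0"?>
<!-- Minimal XSL-FO 1.0 document: one A4 page containing one block of text.
     Assumption: syntax follows the final XSL-FO 1.0 Recommendation, which
     may differ from what the older FOP shipped with Cocoon 1.8.2 accepts. -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Page geometry: a single page master named "page" (hypothetical name) -->
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
                           page-height="29.7cm" page-width="21cm"
                           margin="2cm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <!-- Content flowing into the body region of that page master -->
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="18pt">Hello, PDF</fo:block>
    </fo:flow>
  </fo:page-sequence>

</fo:root>
```

In practice, we rarely write FO by hand; an XSLT style sheet generates this tree from the style-free XML source, just as another style sheet generates HTML.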

For testing VRML conversion, we need the VRML plug-in to be installed correctly in our system. We can download the VRML plug-in for Internet Explorer from: http://www.parallelgraphics.com/products/cortona/download/iexplore/ (file name: cortvrml.exe, file size: 1.58 MB).

After installation, we can type the URL as: http://localhost:8080/cocoon/vrml/hello.xml

We will not get any display. However, as soon as we append ?dummy=test.wrl, the vrml-viewer is activated and we get the display.

Similarly, VoxML also requires the appropriate software from Motorola to be installed in the system.

Let us see if WAP browser conversion works. The corresponding URL is:
http://localhost:8080/cocoon/wap/example-portfolio.xml

We have installed a very popular and standard WAP browser known as ‘OpenWave’ (UP-Wap browser).

We type the same URL in the WAP browser.

We get the correct result! Check out figure 3.
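How does the same XML page serve both desktop and WAP browsers? A minimal sketch of the source page is given below, assuming the Cocoon 1.x convention of several xml-stylesheet processing instructions distinguished by a media attribute. The style sheet file names and the page content are hypothetical, and the media value must match a browser name defined in cocoon.properties; the files in samples\wap are the authoritative reference.

```xml
<?xml version="1.0"?>
<!-- Default style sheet, used when no media-specific one matches.
     File names here are hypothetical. -->
<?xml-stylesheet href="portfolio-html.xsl" type="text/xsl"?>
<!-- Style sheet chosen when the requesting browser is identified as "wap"
     (assumption: "wap" is a browser name configured in cocoon.properties). -->
<?xml-stylesheet href="portfolio-wap.xsl" type="text/xsl" media="wap"?>
<!-- Ask Cocoon to run its XSLT processor on this page -->
<?cocoon-process type="xslt"?>
<portfolio>
  <!-- Illustrative content only -->
  <stock symbol="ABC" value="12.5"/>
</portfolio>
```

If the WAP browser we use is not recognized, the page falls back to the default style sheet, which is why the sample text advises looking into the example source to adapt it for a particular device.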

If we look in each sample folder, we find style sheets and XML files. These will serve as models for our work. With very slight modifications, as required, we can effect these transformations very easily. It is worth experimenting with each of these files.
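As a starting point for such experiments, here is a minimal sketch of an XML source page in the style of the Cocoon 1.x samples. It assumes the Cocoon 1.x processing-instruction conventions; the style sheet name my-page-html.xsl and the element names are hypothetical, so it is worth comparing against the files in samples\hello before relying on it.

```xml
<?xml version="1.0"?>
<!-- Standard W3C processing instruction linking the page to a style sheet.
     The file name my-page-html.xsl is hypothetical. -->
<?xml-stylesheet href="my-page-html.xsl" type="text/xsl"?>
<!-- Assumed Cocoon 1.x convention: ask Cocoon to apply its XSLT processor -->
<?cocoon-process type="xslt"?>
<page>
  <title>My first Cocoon page</title>
  <paragraph>This content carries no styling of its own.</paragraph>
</page>
```

Saving such a file together with its style sheet under Tomcat3.2\webapps\cocoon and requesting it with a URL ending in .xml should route it through the Cocoon servlet, since web.xml maps *.xml to Cocoon.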

The author can be contacted at: rs.ramaswamy@gmail.com.







