This document is divided into two sections. The first describes the installation process for XML and HTML reports on a standard DSpace installation; the second contains instructions for taking advantage of the Cocoon graphical reports.
The following are requirements for installing the ANU Statistics feature:
As this DSpace extension is only at an early release, the installation process is a manual series of steps and is therefore recommended only for technical users. To install:
Unpack the distribution (tar xvfz dspace-stats.tar.gz) and cd to the distribution directory.
Run ant. This will create a dspace-stats.jar file in the lib directory; copy it into your DSpace source lib directory. Ensure your CLASSPATH environment variable includes the DSpace jars.
Note: Included in the source tree are some modified versions of servlets, namely RetrieveServlet, HTMLServlet, BitstreamServlet and HandleServlet. The modifications add the client IP address to the log events for item and bitstream views (e.g. log.info(LogManager.getHeader(context, "view_bitstream", "bitstream_id=" + bitstream.getID())); was changed to log.info(LogManager.getHeader(context, "view_bitstream", "bitstream_id=" + bitstream.getID() + ":ip_addr=" + request.getRemoteAddr()));). The Java classes provided with this distribution are based on DSpace 1.3.1 code, so they need to be either removed completely (this will remove the ability to generate reports by IP) or replaced by the modules from the version you are running with the changes applied; alternatively, apply the changes to your own existing versions of these servlets.
Update dspace-web.xml in your DSpace source distribution with the new servlets and servlet mappings. These are provided in the web-xml-mods.xml file. You can ignore the ReportDispatcher servlet if you do not intend to install the Cocoon reports. Also update the entries for the servlets listed above if you wish to use the IP filter capability with the versions of these servlets provided with this distribution.
Run psql < database/stats-dspace.sql. Ignore any errors of the form ERROR: table <tablename> does not exist
Run psql dspace < database/stats-postgres.sql
The first command creates the tables templog, view_item_log, view_bitstream_log, and ip_filter. The latter command will create a trigger to update the log tables whenever a "view item" or "view bitstream" event occurs.
Copy jsp/mydspace into the local JSP area of your DSpace source: cp -R jsp/mydspace $DSPACE_SOURCE/jsp/local. Note that main.jsp is the MyDSpace page; if you have already customised it, you need to apply the changes in this distribution's main.jsp to your version. Note also that if you are running a different DSpace version you will need to adjust the JSPs to fit your version.
Copy the config/querylist.xml file into your DSpace config directory.
Copy the xsl directory into your DSpace directory (i.e. at the same level as the config directory).
Add the following message properties:
jsp.stats = Stats Page
jsp.mydspace.main.stats.heading1 = Stats & Report Generators (Admin Only)
jsp.mydspace.main.stats.heading2 = Cocoon Report Generator Application
jsp.mydspace.main.stats.heading3 = Subscriptions
jsp.mydspace.main.stats.button.xml = xml stats
jsp.mydspace.main.stats.button.html = html stats
jsp.mydspace.main.stats.button.graph = graph stats
jsp.mydspace.main.stats.button.cocoon-report = Cocoon report
Add the following settings to your DSpace configuration:
##### ANU Statistics Settings #####
stylesheet.dir = /dspace/xsl
report.enabled = true
report.querylist.file = /dspace/config/querylist.xml
report.stylesheet.dir = /dspace/xsl
Add the following to your log4j.properties file, substituting appropriate values for $DSPACE_DATABASE_USER and $DSPACE_DATABASE_USER_PASSWORD for your installation:
# A2 appender JDBCAppender
log4j.appender.A2=org.apache.log4j.jdbc.JDBCAppender
log4j.appender.A2.URL=jdbc:postgresql://localhost:5432/dspace
log4j.appender.A2.user=$DSPACE_DATABASE_USER
log4j.appender.A2.password=$DSPACE_DATABASE_USER_PASSWORD
log4j.appender.A2.sql=INSERT INTO templog (date, logger, priority, message) VALUES ('%d', '%c', '%-5p', '%m%n')
log4j.appender.A2.layout=org.apache.log4j.PatternLayout
Also add the A2 appender by modifying the existing rootCategory entry as follows:
log4j.rootCategory=INFO, A1, A2
This section covers the installation of the graphical Cocoon reports. To install:
Copy the charts and dspace-logs webapps into your Cocoon installation:
cp -R cocoon/charts $COCOON_DIR/build/webapp
cp -R cocoon/dspace-logs $COCOON_DIR/build/webapp
Configure the dspace-logs Cocoon webapp:
Edit the sitemap.xmap file and set the global variables to match your DSpace installation.
Edit the gen-stats-page.xsl file and set the hardcoded DSpace URL.
cd cocoon
Run ant -Dcocoon-dir=$COCOON_DIR to compile a custom Cocoon generator, substituting $COCOON_DIR with your Cocoon installation directory.
cp -R src/au $COCOON_DIR/build/webapp/WEB-INF/classes
Add the following settings to your DSpace configuration:
##### ANU Cocoon Statistics Settings #####
report.cocoon.enabled = true
report.cocoon.chart.url = http://mydspace.myorg:8888/charts/
report.cocoon.stats.url = http://mydspace.myorg:8888/dspace-logs/main
dspace.url.withport = http://mydspace.myorg:8080
Copy dspace.jar and postgresql.jar into $COCOON_DIR/build/webapp/WEB-INF/lib
The entry host dspace dspace 127.0.0.1 255.255.255.255 md5 is required in the pg_hba.conf file.
On logging into DSpace, the MyDSpace page will now have a number of buttons allowing access to reports. The reports are defined in querylist.xml (some samples are included in the distribution), and any number can be added or deleted dynamically (no restart of Tomcat is required). If you installed the Cocoon extensions and have Cocoon running, you will be able to access the graphical reports.
A simple overview of the components involved in the statistics add-on is shown below.

A simple use case: a user explores a DSpace collection, during which time log4j appends all log events to the templog table. For example, the user accesses a bitstream, which results in the following record being written to the database:
date = 2005-06-22 11:23:54,461
logger = org.dspace.app.webui.servlet.RetrieveServlet
priority = INFO
message = anonymous:session_id=600F15A4E8056EBD23C575F056A0474C:view_bitstream:bitstream_id=6110:ip_addr=150.203.59.132
The trigger function analyses this record and stores the following data in the view_bitstream_log table:
bitstream_id = 6110
item_id = 6109
session_id = 600F15A4E8056EBD23C575F056A0474C
user_id = anonymous
date = 2005-06-22
time = 11:23:54.4610
remote_ip = 150.203.59.132
This data can be retrieved by queries specified in the querylist.xml file. For example, using the form generator servlet, the data can be retrieved as XML, HTML, or a graph (Cocoon only; JPEG or SVG).
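The way the message field breaks down into columns can be illustrated with a short Python sketch. This is not the trigger function shipped with the distribution (which runs inside PostgreSQL); the function name parse_view_event and the returned dictionary layout are hypothetical. It assumes fields are colon-separated with the user first, so it would not cope with values that themselves contain colons (for example IPv6 addresses). The item_id column does not appear in the message, so it is presumably looked up from the database and is not derived here.

```python
def parse_view_event(message):
    """Split a templog message into its named fields (illustrative only)."""
    fields = message.strip().split(":")
    # The first field is the user; the unnamed field is the event type.
    record = {"user_id": fields[0], "event": None}
    for field in fields[1:]:
        if "=" in field:
            key, value = field.split("=", 1)
            record[key] = value
        else:
            record["event"] = field
    return record


sample = ("anonymous:session_id=600F15A4E8056EBD23C575F056A0474C:"
          "view_bitstream:bitstream_id=6110:ip_addr=150.203.59.132")
event = parse_view_event(sample)
print(event["event"], event["bitstream_id"], event["ip_addr"])
```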
By selecting the report format, a page of queries is shown. The example below shows the representation of the 'items-viewed-in-collection-inTime' query which is defined in the querylist.xml file.

A dissection of the querylist.xml entry for this query is as follows:
<query name="items-viewed-in-collection-inTime" title="Items Viewed in a Period of Time">
Optional transformation specification. Each option should define a type and a stylesheet. The result processor will transform the result set using the stylesheet for the report format nominated by the user.
<option name="use-xsl-transform" type="xml" stylesheet="identity.xsl" render-to="xml document"/>
<option name="use-xsl-transform" type="html" stylesheet="resultset2table.xsl" render-to="table"/>
<option name="use-xsl-transform" type="graph" stylesheet="resultset2dataSet.xsl" render-to="linechart.jpg"/>
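As a rough illustration of how an option might be matched to the requested report format, the Python sketch below looks up the stylesheet for a given type. The function find_transform is hypothetical and is not part of the distribution; only the element and attribute names are taken from the entries above.

```python
import xml.etree.ElementTree as ET

# The option elements from the example, wrapped in their query element.
QUERY_XML = """
<query name="items-viewed-in-collection-inTime" title="Items Viewed in a Period of Time">
  <option name="use-xsl-transform" type="xml" stylesheet="identity.xsl" render-to="xml document"/>
  <option name="use-xsl-transform" type="html" stylesheet="resultset2table.xsl" render-to="table"/>
  <option name="use-xsl-transform" type="graph" stylesheet="resultset2dataSet.xsl" render-to="linechart.jpg"/>
</query>
"""


def find_transform(query_element, report_type):
    """Return (stylesheet, render-to) for the requested type, or None."""
    for option in query_element.findall("option"):
        if (option.get("name") == "use-xsl-transform"
                and option.get("type") == report_type):
            return option.get("stylesheet"), option.get("render-to")
    return None


query = ET.fromstring(QUERY_XML)
print(find_transform(query, "html"))
```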
Required parameters. The form generator will produce input fields for each element to collect parameter values.
<param src="collectionlist4Eperson" name="collection IDs" id="p3"/>
<param name="date From (DD-MM-YYYY)" id="p1"/>
<param name="date To (DD-MM-YYYY)" id="p2"/>
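A minimal sketch of the idea behind the form generator, in Python, is given below. It is not the servlet's actual implementation: build_form_fields and the HTML it emits are hypothetical, and the src attribute (which appears to name a source of selectable values) is ignored here and every parameter is rendered as a plain text input.

```python
import html
import xml.etree.ElementTree as ET

# The param elements from the example, wrapped in a query element.
PARAMS_XML = """
<query name="items-viewed-in-collection-inTime">
  <param src="collectionlist4Eperson" name="collection IDs" id="p3"/>
  <param name="date From (DD-MM-YYYY)" id="p1"/>
  <param name="date To (DD-MM-YYYY)" id="p2"/>
</query>
"""


def build_form_fields(query_element):
    """Produce one labelled text input per param element."""
    fields = []
    for param in query_element.findall("param"):
        label = html.escape(param.get("name"))
        field_id = html.escape(param.get("id"))
        fields.append(
            f'<label for="{field_id}">{label}</label> '
            f'<input type="text" name="{field_id}" id="{field_id}"/>'
        )
    return fields


form_fields = build_form_fields(ET.fromstring(PARAMS_XML))
for line in form_fields:
    print(line)
```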
The SQL query definition element. The report generator will replace the parameters with the input values and execute the query.
<sql>
SELECT collection_id, date, name, sum(volume) FROM (
SELECT c2i.collection_id, vil.date, cl.name, count(vil.item_id) AS volume
FROM view_item_log vil, collection2item c2i, collection cl
WHERE vil.item_id = c2i.item_id
AND c2i.collection_id IN (p3)
AND cl.collection_id = c2i.collection_id
AND date < to_date('p2', 'DD-MM-YYYY')
AND date > to_date('p1', 'DD-MM-YYYY')
AND vil.remote_ip not IN (select remote_ip FROM ip_filter)
GROUP BY 2,3,1
UNION ALL
SELECT c2i.collection_id, vbl.date, cl.name,count(vbl.item_id)
FROM view_bitstream_log vbl, collection2item c2i, collection cl
WHERE vbl.item_id = c2i.item_id
AND c2i.collection_id IN (p3)
AND cl.collection_id = c2i.collection_id
AND date < to_date('p2', 'DD-MM-YYYY')
AND date > to_date('p1', 'DD-MM-YYYY')
AND vbl.remote_ip not IN (select remote_ip FROM ip_filter)
GROUP BY 2,3,1
) AS Foo
GROUP BY 2,3,1
ORDER BY 2
</sql>
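The parameter replacement step can be sketched in Python as follows. This is an illustration under stated assumptions, not the report generator's own code: substitute_params and the SAFE_VALUE check are hypothetical. Because the values are pasted directly into the SQL text, the sketch only accepts digits, commas, hyphens and spaces, which covers the date and ID-list parameters in this example; a real implementation may validate differently.

```python
import re

# Assumption: only digits, commas, hyphens and spaces are allowed in values.
SAFE_VALUE = re.compile(r"[0-9,\- ]+")


def substitute_params(sql, values):
    """Replace whole-word parameter ids (p1, p2, ...) with input values."""
    if not values:
        return sql
    for param_id, value in values.items():
        if not SAFE_VALUE.fullmatch(value):
            raise ValueError(f"unsafe value for {param_id}: {value!r}")
    # Word boundaries stop "p1" from matching inside a longer name.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, values)) + r")\b")
    return pattern.sub(lambda match: values[match.group(1)], sql)


fragment = "AND c2i.collection_id IN (p3) AND date < to_date('p2', 'DD-MM-YYYY')"
inputs = {"p1": "10-05-2005", "p2": "10-07-2005", "p3": "4,5,9"}
print(substitute_params(fragment, inputs))
```

Binding values through the database driver's parameter mechanism is generally safer than text substitution, but an ID list such as the one used for p3 cannot be bound as a single scalar, which is one reason a textual approach may have been chosen here.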
The output generated by the transformers for the appropriate collection and date range will produce the following resultsets:
XML
<resultset EPersonID="1" EpersonName="Leo Monus" QueryTitle="Items Viewed in a Period of Time" date="Wed Jun 22 16:11:54 EST 2005" p1="10-05-2005" p2="10-07-2005" p3="4,5,9,10,11,12,13,14,16" rName="items-viewed-in-collection-inTime" render-to="xml document" scaleBy="day" xsltype="xml">
<result><collection_id>10</collection_id><date day="131" month="05" week="20" year="2005">2005-05-11</date><name>CM:1</name><sum>1</sum></result>
<result><collection_id>9</collection_id><date day="131" month="05" week="20" year="2005">2005-05-11</date><name>CM:40</name><sum>14</sum></result>
<result><collection_id>11</collection_id><date day="131" month="05" week="20" year="2005">2005-05-11</date><name>cm-40-b</name><sum>21</sum></result>
<result><collection_id>5</collection_id><date day="131" month="05" week="20" year="2005">2005-05-11</date><name>test..1..2..3</name><sum>7</sum></result>
<result><collection_id>12</collection_id><date day="132" month="05" week="20" year="2005">2005-05-12</date><name>cm-1-b</name><sum>13</sum></result>
<result><collection_id>11</collection_id><date day="132" month="05" week="20" year="2005">2005-05-12</date><name>cm-40-b</name><sum>18</sum></result>
<result><collection_id>4</collection_id><date day="132" month="05" week="20" year="2005">2005-05-12</date><name>potr loadtest</name><sum>28</sum></result>
<result><collection_id>5</collection_id><date day="132" month="05" week="20" year="2005">2005-05-12</date><name>test..1..etc.
HTML

Cocoon Graph
