The need:
We wanted to be able to play videos on DSPACE item pages without having to download the video file. On our DSPACE 4.2 installation, visitors were shown icons for video material and, if they wanted to watch a video, they had to download the whole file and play it with their own desktop video player software. This was a very inconvenient way of presenting video repositories and wasted both time and bandwidth.
We wanted something like the screen below (and now we have it):
You can see a working demo video item here. The demo item contains two video bitstreams; one in mp4, the other in webm.
Challenges:
- Inserting some type of web video player into DSPACE item view pages,
- making these video players pseudo stream; i.e. allow seeking into various spots of the video time line without waiting for the whole thing to be downloaded.
Solution:
The solution we have developed works only on Linux DSPACE servers and the Mirage theme. Adapting the solution to other platforms and/or themes, however, should be possible.
A quick summary of what we have done:
- We have installed a second web server software (namely lighttpd) onto the DSPACE server listening to port 9090.
- We modified the item-view.xsl file of the Mirage theme so that it inserts some Javascript and HTML5 video tags into the item-view pages.
- We used Ajax to map DSPACE’s non-trivial bitstream file naming convention to “real looking” files located in the web document root so that lighttpd’s pseudo streaming would work.
Now the gory details… I shall try to mention every single detail that will help you understand what’s going on and how things are handled…
Our environment
UBUNTU 14.10 Linux
DSPACE 4.2
Tomcat 7.0.53
Java (Open Java) 1.7.0_79
PostgreSQL 9.4
Prerequisites
- You should read through this document before attempting installation/configuration changes.
- You should have root shell access to the server.
- You should consider the risk of having to restore/restart/reboot your DSPACE server. Working on a clone server would be best. (What? You are not running DSPACE on a virtual machine? Maybe you should!)
- We have tested and implemented the procedures described in this document on an Ubuntu 14.10 Linux server running DSPACE 4.2. These steps should work on other contemporary Linux distros gracefully; but you never know! Other operating systems might require serious modifications in the procedures/scripts described in this document.
Install and configure lighttpd as a second web server on the DSPACE server
Why?
Because the primary web server (Tomcat, together with DSPACE) does not serve “bitstream” files as a normal web server would. DSPACE uses some Java code to “push” binary file contents with an appropriate MIME type so that the remote client can save them to its local disk. This is not what we want. We want a web server which supports pseudo streaming (PS), and a good candidate is lighttpd: it supports PS natively, has a small footprint on the server and is easy to configure. Pseudo streaming allows the client to send a time-line index to the server while it is downloading a binary video file; the server then calculates a new seek position and jumps to the corresponding frame within the video file. This is good, and lighttpd supports it by default.
We recommend that you start by installing lighttpd and php5 (use your own distro’s package manager command; for us it was “apt-get”):
# apt-get install lighttpd php5 php5-fpm
# apt-get install mediainfo
You do not have to install mediainfo if you do not have plans to automatically resize video players but it is a nice tool to have around; especially when you are dealing with videos.
Edit /etc/lighttpd/lighttpd.conf file so that it has the lines
server.document-root = "/var/www"
server.port = 9090
in it. Now, you have to make sure that lighttpd works with PHP5. You can test your installation by using the standard info.php procedure. If you cannot see the typical info.php screen by typing
http://yourserver.somedomain.edu:9090/info.php
into your browser (note the port specification “:9090“) then you have to stop here and resolve this issue first. Asking Google with the keywords “install lighttpd php5 your-distro” will normally help a lot.
Once you get lighttpd up and serving on port 9090, you shall need to remove the “info.php” file and install postgresql modules for PHP5:
# apt-get install php5-pgsql
Then you have to tell lighttpd that cross-domain Ajax calls should be allowed. Append the lines
# DSPACE: Allow cross-sites Ajax calls
setenv.add-response-header = ( "Access-Control-Allow-Origin" => "*" )
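to the end of the /etc/lighttpd/lighttpd.conf file (the setenv directive is provided by lighttpd’s mod_setenv module, so that module must be enabled) and restart lighttpd.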
# /etc/init.d/lighttpd restart
The reason for this “allow cross-site Ajax calls” setting is that, after our modifications, Tomcat web pages (on port 8080) will issue Ajax requests to scripts running on port 9090. Although both reside on the same server, the visitor’s browser treats these as “cross-site” requests just because of their different port numbers, and it will not hand the responses to our Javascript unless lighttpd sends the Access-Control-Allow-Origin header.
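The port number matters because an origin is the combination of scheme, host and port. As a minimal sketch (the host name below is a placeholder, and sameOrigin is just an illustrative helper), the standard URL class can show that two URLs on the same machine still count as different origins:

```javascript
// Sketch: two URLs share an origin only if scheme, host AND port all match.
// "dspace.example.edu" is a placeholder host, not a real server.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

const page = "http://dspace.example.edu:8080/xmlui/handle/123456789/1637";
const ajax = "http://dspace.example.edu:9090/bilkent/ajaxserver.php";

console.log(sameOrigin(page, ajax)); // false: ports 8080 and 9090 differ
```

This is why the response header is needed even though Tomcat and lighttpd run on the same server.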
Install /var/www/yourDIR
Now you have to put some files in a directory under the lighttpd document root. In our case, this directory is “/var/www/bilkent“; i.e. “yourDIR” is “bilkent“. For the sake of integrity and consistency I shall use this /var/www/bilkent directory throughout this document.
Create the file /var/www/bilkent/bilkent.js so that it contains
document.onreadystatechange = function() {
    if (document.readyState == "complete") {
        // Hand the DSPACE URL of every <video> tag on the page to the Ajax server
        var allVideos = document.getElementsByTagName('video');
        for (var i = 0; i < allVideos.length; i++) {
            var id  = allVideos[i].id;
            var src = allVideos[i].getAttribute("src");
            loadXMLDoc(src, id);
        }
    }
};

function loadXMLDoc(src, elemId)
{
    // XMLHttpRequest is available in IE7+, Firefox, Chrome, Opera, Safari
    if (!window.XMLHttpRequest) {
        return;
    }
    var xmlhttp = new XMLHttpRequest();
    xmlhttp.onreadystatechange = function() {
        if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
            var response = xmlhttp.responseText;
            // something like "1|http://dspace.bilkent.edu.tr:9090/bilkent/tmp/xyz.mp4|640|480" is expected
            // a "|" separated string; first item is the id of the
            // video tag in question and the second item is the SRC content
            // that is supposed to go into this tag
            // The last two items are the width and height of the video
            // If any error was encountered in the ajaxserver.php,
            // the response would start with the characters "Error" followed by
            // reason for the error.
            if (response.substring(0, 5) == "Error") {
                alert(response);
                return;
            }
            var parts = response.split("|");
            // replace video's SRC attribute with the URL received from ajaxserver.php
            document.getElementById(parts[0]).setAttribute("src", parts[1]);
            // adjust element's w and h according to the video's w & h
            //document.getElementById(parts[0]).setAttribute("width", parts[2]);
            //document.getElementById(parts[0]).setAttribute("height", parts[3]);
        }
    };
    // Parse src to get
    //   i)   handle
    //   ii)  title
    //   iii) seq number
    // src example string: /xmlui/bitstream/handle/123456789/1637/title.webm?sequence=3&isAllowed=y
    var matches = src.match(/handle\/\d*\/(\d*)\/(.*)\?sequence=(\d*)&/);
    if (!matches) {
        return;   // not a DSPACE bitstream URL; leave the tag alone
    }
    var handle = matches[1];
    var title  = matches[2];
    var seq    = matches[3];
    var ajax_msg = 'h=' + handle + "&t=" + title + "&s=" + seq + "&VIDEOID=" + elemId;
    xmlhttp.open("GET", "http://repository.bilkent.edu.tr:9090/bilkent/ajaxserver.php?" + ajax_msg, true);
    xmlhttp.send();
}
The above JS code will be injected into every item-view page by item-view.xsl (more about this later). It visits every <video> element in the DOM, reads the contents of its SRC attribute and makes an Ajax call to ajaxserver.php (more on this later), which returns the text that replaces the <video> tag’s SRC attribute.
When the item-view.xsl is transforming item data, all we know about the video item is its DSPACE URL which is something like:
/xmlui/bitstream/handle/123456789/1637/title.mp4?sequence=1&isAllowed=y
The important pieces of info in this URL are the handle (“1637” in the above example) and the sequence ID (“1“). Using these two pieces of information, it is possible to resolve the bitstream file’s DSPACE generated name and its location in the “assetstore” directory. This resolution is done by the PHP code which is activated by an Ajax call made by the Javascript code above. The Javascript function loadXMLDoc parses the SRC field of each “<video>” element and makes an Ajax call to “ajaxserver.php” for each of them, with GET parameters of the form “?h=1637&t=title.mp4&s=1&VIDEOID=1“.
The Ajax server (ajaxserver.php) then dips into the DSPACE database (assumed to be a PostgreSQL db), finds the bitstream’s file name, makes a UNIX soft link to it under lighttpd’s document root, extracts the video’s size (w & h) and returns a “|” separated string of the form:
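The parsing step can be tried on its own, outside the browser. The following is only a sketch: parseBitstreamUrl and buildQuery are illustrative names that do not appear in bilkent.js, and the regular expression is the one used there with slightly stricter digit matching.

```javascript
// Sketch of the URL parsing done before the Ajax call.
// parseBitstreamUrl and buildQuery are hypothetical helper names.
function parseBitstreamUrl(src) {
  const m = src.match(/handle\/\d+\/(\d+)\/(.*)\?sequence=(\d+)&/);
  if (!m) return null; // not a DSPACE bitstream URL
  return { handle: m[1], title: m[2], seq: m[3] };
}

// Build the GET parameters expected by ajaxserver.php.
function buildQuery(parts, videoId) {
  const params = new URLSearchParams({
    h: parts.handle,
    t: parts.title,
    s: parts.seq,
    VIDEOID: videoId,
  });
  return params.toString();
}

const parts = parseBitstreamUrl(
  "/xmlui/bitstream/handle/123456789/1637/title.mp4?sequence=1&isAllowed=y");
console.log(buildQuery(parts, "1")); // h=1637&t=title.mp4&s=1&VIDEOID=1
```

Returning null for URLs that do not match makes it easy to skip <video> tags that do not point at a DSPACE bitstream.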
“1|http://dspace.bilkent.edu.tr:9090/bilkent/tmp/xyz.mp4|640|480”
where the 1st item is the HTML ID of the relevant <video> tag, the second item is the URL of the soft link which points to the video file, and the last two items are the dimensions (width and height, in that order) of the video. Video dimensions are not used in the current version of this implementation.
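To make the response format concrete, here is a small sketch of how such a string could be taken apart. parseAjaxResponse is an illustrative name, not a function from bilkent.js; the "Error" prefix convention is the one used by ajaxserver.php.

```javascript
// Sketch: split the "|" separated reply into named fields.
// parseAjaxResponse is a hypothetical helper used only for illustration.
function parseAjaxResponse(text) {
  if (text.startsWith("Error")) {
    return { ok: false, error: text }; // e.g. "Error|Cannot create link"
  }
  const [videoId, url, w, h] = text.split("|");
  return { ok: true, videoId, url, width: Number(w), height: Number(h) };
}

const reply = parseAjaxResponse(
  "1|http://dspace.bilkent.edu.tr:9090/bilkent/tmp/xyz.mp4|640|480");
console.log(reply.videoId, reply.url, reply.width, reply.height);
```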
The Ajax Server
Create a file called “ajaxserver.php” in /var/www/yourDIR so that it contains (you’ll need to change $host, $user, $pass, $db variables so that they reflect your installation):
<?php
// What this piece of code does:
// This code acts as an Ajax server for some modifications
// to DSPACE ( http://dspace.org ) so that video items stored in DSPACE
// can be played reasonably on a web browser while viewing the item.
// In short:
//   - this PHP code must reside and run on the same server with DSPACE,
//   - It expects Ajax requests which contain the pieces of a DSPACE
//     "item show" URL (handle, title, sequence),
//   - finds the "real" file stored as a DSPACE asset for this handle,
//   - makes a soft link from the asset file to a file in a temporary dir
//     under the lighttpd web document root (not DSPACE's Tomcat
//     document root),
//   - determines the video's height and width,
//   - returns the URL for the linked video file and its height & width
//     (uses external Linux command "mediainfo" to fetch video dims).
// IMPORTANT: THIS SCRIPT WILL ONLY WORK ON LINUX/UNIX servers.
// (What? do you use a WinX box as your server? Are you crazy?)
// Joking aside; it should be possible to adapt all these stuff
// to WinX env; but it is totally out of my expertise. Sorry!

// Author: Can Ugur Ayfer ( cayfer@gmail.com, cayfer@bilkent.edu.tr )
// June 2015, Ankara
// License: GPL of course... Do whatever you want to with this code.
// Just keep my name somewhere in it.

// DSPACE database details (Assuming the dB is PostgreSQL)
// If ORACLE or another dB is used in your DSPACE installation,
// this script will need serious modification.
$host = "localhost"; // or your DB host
$user = "yourDBuser";
$pass = "yourDBpwd";
$db   = "yourDB";

// See your DSPACE installation for the correct value of $assetstore
// It should point to the directory on your server where all
// "bitstream" (DSPACE parlance) files are stored
$assetstore = "/dspace/assetstore";

// This temp dir should be manually created and writable by the lighttpd user
// (typically www-data:www-data). Softlinks to DSPACE bitstream
// files will be created here; links older than $max_age days will
// be deleted automatically.
// This dir must be somewhere under the lighttpd document root.
$tmp_dir = "/var/www/bilkent/tmp"; // this is the tmp dir we use at Bilkent Uni.

// prefix to be added to URLs pointing to soft linked video files
// (Note the non-standard port number 9090)
// This is the web URL prefix to the temp directory declared above
$url_prefix = "http://repository.bilkent.edu.tr:9090/bilkent/tmp";

// softlinks created under $tmp_dir will be deleted after $max_age days
// who needs them, after all, after the video is watched?
$max_age = 1;

// Linux mediainfo command is used to determine a video's height & width
// if this command is not available, or H & W cannot be extracted
// meaningfully, W: $default_w and H: $default_h will be used;
$mediainfo_path = "/usr/bin/mediainfo";
$default_w = 320;
$default_h = 240;

// Max-W and Max-H: These are the max height or width we shall
// have for our video windows on the page. Whichever limits the
// window size first... That is; if a video has a width > Max-W;
// we shall limit the width to Max-W and adjust height accordingly.
// Similarly; if a video has a height > Max-H, we shall set the
// height to Max-H and adjust the width accordingly.
$maxW = 640; // pixels
$maxH = 480;

// Nothing to set beyond this point unless you know what you are doing
//////////////////////////////////////////////////////////////////////

$uri = $_SERVER['REQUEST_URI'];
//
// URI is something like the following:
//
//   /bilkent/ajaxserver.php?h=1637&t=title.mp4&s=1&VIDEOID=1
//
// (built by bilkent.js from a DSPACE URL such as
//  /xmlui/bitstream/handle/123456789/1637/title.mp4?sequence=1&isAllowed=y)
//
// We shall find the real file's name and location using this information
// and build an SQL statement which would look like:
//
// select internal_id from bitstream where bitstream_id in (
//   select bitstream_id from bundle2bitstream where bundle_id in (
//     select bundle_id from item2bundle where item_id in (
//       select resource_id from handle where handle_id=1637))) and sequence_id=1;
//
if ( ! ( isset($_GET['h']) && isset($_GET['t']) && isset($_GET['s']) && isset($_GET['VIDEOID']) ) ) {
    respond("Error|Invalid URI $uri");
}
$handle  = $_GET['h'];
$title   = $_GET['t'];
$seq     = $_GET['s'];
$videoid = $_GET['VIDEOID'];

// $handle and $seq go straight into the SQL statement below;
// accept nothing but digits so that the query cannot be tampered with
if ( ! ( preg_match('/^\d+$/', $handle) && preg_match('/^\d+$/', $seq) ) ) {
    respond("Error|Invalid handle or sequence number");
}

$conn = pg_connect("host=$host dbname=$db user=$user password=$pass")
    or respond("Error|Could not connect to dspace dB server");
$query = "select internal_id from bitstream where bitstream_id in (
            select bitstream_id from bundle2bitstream where bundle_id in (
              select bundle_id from item2bundle where item_id in (
                select resource_id from handle where handle_id=" . $handle . ")))
          and sequence_id=" . $seq . ";";
$rs = pg_query($conn, $query) or respond("Error|Invalid query: $query");
$row = pg_fetch_row($rs); // Only one row is expected to return
if ( ! $row ) {
    respond("Error|Bitstream not found");
}
$source = $row[0];
pg_close($conn);

// now, $source contains the dspace bitstream source name
// which is the dspace internal file name for the bitstream.
// This file should be somewhere under $assetstore (e.g. /dspace/assetstore)
// The file name itself includes the 3 levels of directories under which
// the file is actually stored.
// If, for instance, the filename is "51045803935617297212273559354695283194",
// the file is located in /dspace/assetstore/51/04/58/
// The dir names are the 1st-2nd, 3rd-4th and 5th-6th digits of the
// filename.
// i.e. 51 04 58 03935617297212273559354695283194
//      d1 d2 d3
//
// This assetstore is not normally under the web tree therefore we cannot
// directly serve it as a file thru the web server. Therefore, we shall
// establish a temp soft link to the $tmp_dir directory with a proper file
// extension and return this as the value to replace the SRC attribute
// in the calling Javascript/HTML page.
// Old links in $tmp_dir directory can be removed safely.

// What is the extension of the bitstream?
// (text after the last dot; letters and digits only)
if ( ! preg_match('/\.([A-Za-z0-9]+)$/', $title, $matches) ) {
    respond("Error|Title does not have an extension");
}
$ext = $matches[1];

// Check if $tmp_dir exists and is writable
if ( ! is_writable($tmp_dir) )
    respond("Error| ".$tmp_dir." is not writable");

// Clean old links in $tmp_dir (soft links only; nothing else is touched)
$cmd = "/usr/bin/find ".escapeshellarg($tmp_dir)." -type l -mtime +".intval($max_age).' -exec /bin/rm {} \;';
system($cmd);

// Extract dir names for the $assetstore dir
$d1 = substr($source, 0, 2);
$d2 = substr($source, 2, 2);
$d3 = substr($source, 4, 2);
$source_file = $assetstore."/".$d1."/".$d2."/".$d3."/".$source;

// use PHP microtime to generate unique link name
$microt = strtr(microtime(), " ", ".");
$linkname = $microt . "." . $ext;
$link_target = $tmp_dir . "/" . $linkname;
if ( ! symlink($source_file, $link_target) )
    respond("Error|Cannot create link");
$url = $url_prefix."/".$linkname;

// Try to get video's height and width
$w = $default_w;
$h = $default_h;
if (is_executable($mediainfo_path)) {
    if (file_exists($link_target)) {
        $cmd = $mediainfo_path." --Output=XML ".escapeshellarg($link_target);
        exec($cmd, $lines);
        foreach ($lines as $line) {
            // mediainfo reports size info with a space in numbers > 999
            // That is; "1024" is reported as "1 024"
            if (preg_match('/(\d*\s*\d*)\spixels<\/Width>/', $line, $matches)) {
                $w = str_replace(" ", "", $matches[1]);
            }
            if (preg_match('/(\d*\s*\d*)\spixels<\/Height>/', $line, $matches)) {
                $h = str_replace(" ", "", $matches[1]);
            }
        }
        // Lets see whether the video fits into our $maxH and $maxW limits
        if ($w > $maxW) {
            // too wide: clamp the width, scale the height by the same ratio
            $scale = $maxW/$w;
            $w = $maxW;
            $h = intval($h * $scale);
        }
        if ($h > $maxH) {
            // too tall: clamp the height, scale the width by the same ratio
            $scale = $maxH/$h;
            $h = $maxH;
            $w = intval($w * $scale);
        }
    }
}
respond($videoid."|".$url."|".$w."|".$h);
exit;

function respond($msg) {
    echo $msg;
    exit;
}
?>
Please do not forget to edit $host, $user, $pass, $db variables so that they reflect the settings of your DSPACE installation. You can consult “/dspace/config/dspace.cfg” file (your installation might have the config file elsewhere) for the correct values.
This Ajax server digs into the DSPACE database to find the “internal_id” which corresponds to the “handle” (see DSPACE 4 schema). “handle” and “sequence_id” together identify a bitstream resource uniquely.
“internal_id” in DSPACE is a long string of digits (e.g. 51045803935617297212273559354695283194) which actually is the name of a file under the DSPACE “assetstore” directory (in our case it is “/dspace/assetstore“). The interesting and useful info embedded into this file name (internal_id) is that the first 6 digits indicate the directory path to the actual file itself. For example, if the “internal_id” is 51045803935617297212273559354695283194, the file is located in “/dspace/assetstore/51/04/58” and the file’s name is “51045803935617297212273559354695283194” (note the first 6 digits, taken as three two-digit groups: “51“, “04” and “58“).
The ajaxserver.php script, once it finds the DSPACE filename of the bitstream, creates a soft link (Linux/UNIX only) to it in the lighttpd document root hierarchy (“/var/www/bilkent/tmp” in our case; set by the variable $tmp_dir) and names the link using the PHP microtime function.
The softlink’s name and path are generated by concatenating the $tmp_dir variable, the microtime string and the title’s file extension; e.g.:
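The mapping from an internal_id to a file path is simple enough to sketch in a few lines. assetPath below is an illustrative name only (the real work is done in ajaxserver.php), and "/dspace/assetstore" is the location used in our installation:

```javascript
// Sketch: derive the asset file path from a DSPACE internal_id.
// The first three two-digit groups of the id name the sub-directories.
function assetPath(assetstore, internalId) {
  const d1 = internalId.slice(0, 2);
  const d2 = internalId.slice(2, 4);
  const d3 = internalId.slice(4, 6);
  return [assetstore, d1, d2, d3, internalId].join("/");
}

console.log(assetPath("/dspace/assetstore",
                      "51045803935617297212273559354695283194"));
// /dspace/assetstore/51/04/58/51045803935617297212273559354695283194
```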
/var/www/bilkent/tmp/0.45029000.1435702502.mp4
Assuming that lighttpd follows symbolic links, which it does by default, the soft link can now be accessed through HTTP using the URL:
http://dspace.bilkent.edu.tr:9090/bilkent/tmp/0.45029000.1435702502.mp4
The microtime function makes the link name unique, for all practical purposes, for each request. Every time it is invoked, the ajaxserver.php script also deletes soft links which are older than $max_age days (typically 1, which means 24 hours).
Then the ajaxserver.php script determines the dimensions of the video file using the mediainfo Linux command.
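Two small details of this step are easy to get wrong: according to the comments in ajaxserver.php, mediainfo prints numbers above 999 with a space (“1 024 pixels”), and the video has to be scaled so that it fits the maximum width and height while keeping its aspect ratio. The sketch below illustrates both in Javascript; parsePixels and fitWithin are hypothetical names, not part of the actual script.

```javascript
// Sketch: read a pixel count as printed by mediainfo ("1 024 pixels").
function parsePixels(text) {
  const m = text.match(/(\d[\d\s]*)\spixels/);
  return m ? Number(m[1].replace(/\s/g, "")) : null;
}

// Sketch: shrink (w, h) so that it fits into maxW x maxH,
// keeping the aspect ratio. Width is checked first, then height.
function fitWithin(w, h, maxW, maxH) {
  if (w > maxW) {
    h = Math.floor(h * maxW / w);
    w = maxW;
  }
  if (h > maxH) {
    w = Math.floor(w * maxH / h);
    h = maxH;
  }
  return { w, h };
}

console.log(parsePixels("<Width>1 024 pixels</Width>")); // 1024
console.log(fitWithin(1280, 720, 640, 480));             // { w: 640, h: 360 }
```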
Finally the ajaxserver.php sends back the Ajax response which is a simple “|” separated string in the format:
videoID|URL|W|H
“videoID” was an input to this script ($_GET[‘VIDEOID’]). The Javascript code (bilkent.js) then parses this response and replaces the SRC attribute of the VIDEO element which has the returned ID with the returned URL. We do not use the “W” and “H” (video width and height in pixels) parameters right now because common browsers these days cannot resize video players’ control bars properly.
Modifying the XSL file
We have been using only DSPACE’s Mirage theme therefore all the modifications we present here were tested with this theme. The file to be modified is :
/dspace/webapps/xmlui/themes/Mirage/lib/xsl/aspect/artifactbrowser/item-view.xsl
Before modifying this file, first make a backup copy:
# cd /dspace/webapps/xmlui/themes/Mirage/lib/xsl/aspect/artifactbrowser
# cp item-view.xsl item-view.original
Then using your favorite text editor (it’s “vi”, isn’t it? 🙂 ) locate the closing </a> tag around line 487.
Locate this block of lines:
.......
<xsl:if test="contains(mets:FLocat[@LOCTYPE='URL']/@xlink:href,'isAllowed=n')">
<img>
<xsl:attribute name="src">
<xsl:value-of select="$context-path"/>
<xsl:text>/static/icons/lock24.png</xsl:text>
</xsl:attribute>
<xsl:attribute name="alt">xmlui.dri2xhtml.METS-1.0.blocked</xsl:attribute>
<xsl:attribute name="attr" namespace="http://apache.org/cocoon/i18n/2.1">alt</xsl:attribute>
</img>
</xsl:if>
</a>
</div>
<div class="file-metadata" style="height: {$thumbnail.maxheight}px;">
<div>
<span class="bold">
<i18n:text>xmlui.dri2xhtml.METS-1.0.item-files-name</i18n:text>
<xsl:text>:</xsl:text>
</span>
.......
Copy the lines in the following box and paste them so that the lines are inserted just after the closing “</a>” tag.
<!-- **** HTML Video Player modifications START here **** -->
<!-- Insert the Javascript -->
<xsl:text disable-output-escaping="yes">
<![CDATA[
<script src="http://repository.bilkent.edu.tr:9090/bilkent/bilkent.js" language="javascript" DEFER="yes">
</script>
]]>
</xsl:text>
<!-- end of Javascript stuff -->
<xsl:choose>
<xsl:when test="current()/@MIMETYPE = 'video/webm'">
<video width="320" height="240" type="video/webm" controls="controls" preload="auto" poster="http://repository.bilkent.edu.tr:9090/bilkent/poster.png">
<xsl:attribute name="id">
<xsl:number/> <!-- used to increment element ID automatically -->
</xsl:attribute>
<xsl:attribute name="src">
<xsl:value-of select="mets:FLocat[@LOCTYPE='URL']/@xlink:href" disable-output-escaping="yes"/>
</xsl:attribute>
</video>
</xsl:when>
</xsl:choose>
<xsl:choose>
<xsl:when test="current()/@MIMETYPE = 'video/mp4'">
<video width="320" height="240" type="video/mp4" controls="controls" preload="auto" poster="http://repository.bilkent.edu.tr:9090/bilkent/poster.png">
<xsl:attribute name="id">
<xsl:variable name="i" select="position()" />
<xsl:copy>
<xsl:value-of select="$i" />
</xsl:copy>
</xsl:attribute>
<xsl:attribute name="src">
<xsl:value-of select="mets:FLocat[@LOCTYPE='URL']/@xlink:href" disable-output-escaping="yes"/>
</xsl:attribute>
</video>
</xsl:when>
</xsl:choose>
<xsl:choose>
<xsl:when test="current()/@MIMETYPE = 'video/mp4v-es'">
<video width="320" height="240" type="video/mp4" controls="controls" preload="auto" poster="http://repository.bilkent.edu.tr:9090/bilkent/poster.png">
<xsl:attribute name="id">
<xsl:variable name="i" select="position()" />
<xsl:copy>
<xsl:value-of select="$i" />
</xsl:copy>
</xsl:attribute>
<xsl:attribute name="src">
<xsl:value-of select="mets:FLocat[@LOCTYPE='URL']/@xlink:href" disable-output-escaping="yes"/>
</xsl:attribute>
</video>
</xsl:when>
</xsl:choose>
<xsl:choose>
<xsl:when test="current()/@MIMETYPE = 'video/ogg'">
<video width="320" height="240" type="video/ogg" controls="controls" preload="auto" poster="http://repository.bilkent.edu.tr:9090/bilkent/poster.png">
<xsl:attribute name="id">
<xsl:variable name="i" select="position()" />
<xsl:copy>
<xsl:value-of select="$i" />
</xsl:copy>
</xsl:attribute>
<xsl:attribute name="src">
<xsl:value-of select="mets:FLocat[@LOCTYPE='URL']/@xlink:href" disable-output-escaping="yes"/>
</xsl:attribute>
</video>
</xsl:when>
</xsl:choose>
<!-- **** HTML Video Player modifications END here **** -->
Our item-view.xsl can be seen here.
What this additional XSL code does is:
- Inserts Javascript inclusion lines into the page,
- For each video bitstream item, inserts an HTML “<video>” tag with incremental IDs, an SRC parameter containing the DSPACE item view URL and an appropriate “type” parameter depending on the MIME type of the bitstream. In our case, we are only interested in the “video/webm“, “video/mp4“, “video/mp4v-es“ and “video/ogg” MIME types; therefore we have <xsl:when test="current()/@MIMETYPE = 'video/xxx'"> blocks only for these MIME types.
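The MIME type handling of the XSL above boils down to a small lookup table. The sketch below restates it in Javascript purely for illustration (videoTypeFor is a hypothetical name; the theme itself does this in XSL). Note that “video/mp4v-es” bitstreams are given the “video/mp4” type, as in the XSL.

```javascript
// Sketch: bitstream MIME type -> "type" value used on the <video> tag.
const TYPE_FOR_MIME = new Map([
  ["video/webm",    "video/webm"],
  ["video/mp4",     "video/mp4"],
  ["video/mp4v-es", "video/mp4"],
  ["video/ogg",     "video/ogg"],
]);

// Returns null for bitstreams that should not get a video player.
function videoTypeFor(mime) {
  return TYPE_FOR_MIME.has(mime) ? TYPE_FOR_MIME.get(mime) : null;
}

console.log(videoTypeFor("video/mp4v-es")); // video/mp4
console.log(videoTypeFor("application/pdf")); // null
```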
Done…
You should be ready to view mp4, webm video files inline now. Go ahead and upload some mp4 or webm video files to some items and try.
Video encoding issues
Well.. This is a complicated topic! If you choose the WEBM encoding standard because it is the newest, open source, widely supported (so you think) modern HTML5 standard, iOS users can’t watch your videos. If you revert to MP4, Firefox users can’t view your videos. If you choose to use a cross-platform video player with Flash etc., then you have to pay a subscription fee; plus Flash has a different set of problems in the Apple world.
As of this write-up, to our knowledge, only Chromebook users cannot watch our MP4 videos with the standard HTML5 player we use. They still have the option of downloading the video though…
We decided to stick to MP4 for the time being but wrote our XSL so that it supports the WEBM encoding as well. We are aware that more work on this is waiting for us in the near future. We’ll see..