How to Set Up DHconvalidator with Docker

This blog is usually written in Japanese, but I am writing this post in English because I think people who set up DHconvalidator can usually at least read English tutorials.

DHconvalidator is a very cool tool that, in conjunction with Conftool (https://www.conftool.net/en/index.html), automatically retrieves the list of authors and affiliations of the papers presented and then lets the presenters create their abstracts in TEI/XML format.

A large number of papers in TEI/XML format have already accumulated in various locations (mainly on GitHub) and can be used in various ways. Statistical analysis is also anticipated, but for now an easy way to use the data is The Index of Digital Humanities Conferences, a very straightforward site where you can search for papers (mainly abstracts) presented at past DH conferences in various regions. For example, a search for my name returns 26 results.

The DHconvalidator, which makes all these wonderful things possible, is very easy to set up if you keep a few points in mind. With the following tutorial, I could set it up in 20 minutes.

However, it actually took me about three days because I had overlooked one of the following points. I decided to write this tutorial so that you do not have to go through the same unnecessary suffering.

0. Check the REST Interface Password of Conftool for your conference

A Google search on this matter may turn up old documents. In the current Conftool, go to Overview > Data Import and Export > Integrations With Other Systems and look for "Enable General REST Interface" at the bottom. Select Yes there and note the REST Interface Password. Also note the URL of the REST API, which is probably something like https://www.conftool.net/[your conference name]/rest.php.
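Both the login URL and the REST URL are needed again when editing the Dockerfile, so it may help to derive them once from the conference short name. The following is only a minimal sketch, assuming your conference follows the URL pattern mentioned above. The function names are made up for this post, and jadh-2024 is just the conference name used in this tutorial.

```shell
#!/bin/sh
# Hypothetical helpers: build the Conftool URLs from the conference short name.
# Assumption: the conference lives under https://www.conftool.net/<name>/.

conftool_login_url() {
  printf 'https://www.conftool.net/%s/\n' "$1"
}

conftool_rest_url() {
  printf 'https://www.conftool.net/%s/rest.php\n' "$1"
}

# Example with the conference name used in this post.
conftool_login_url jadh-2024
conftool_rest_url jadh-2024
```

If the URL shown on your Conftool settings page differs from what this prints, trust the settings page.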

Here, let's assume the server name running DHconvalidator is dhconv.dhii.jp.

1. Configure the virtual server

(This is the case for Red Hat family Linux distributions such as AlmaLinux and Rocky Linux. For Debian, Ubuntu, etc., please adapt the settings to your OS.)

To set up a virtual server, assign a new hostname with a CNAME record on the DNS server and then configure it on the HTTP server. In this case, I set up the hostname dhconv.dhii.jp with Bind and Apache, so it looks like this:

Configuration for Bind under a chroot environment:

Edit "[chroot]/etc/DNS/dhii.jp":

dhconv  IN      CNAME   parentserver.dhii.jp.

(Don't forget to increase the serial value in the zone file.)
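If you want to compute the new serial rather than edit it by hand, something like the following might do. This is a sketch under an assumption: that your zone uses the common YYYYMMDDNN convention (date plus a two-digit counter). If your zone uses a plain counter, simply add 1 instead. The function name next_serial is made up for this post.

```shell
#!/bin/sh
# Hypothetical helper: compute the next zone serial.
# Assumption: the zone uses the YYYYMMDDNN convention (date + 2-digit counter).
# Usage: next_serial CURRENT_SERIAL [TODAY_AS_YYYYMMDD]

next_serial() {
  current="$1"
  today="${2:-$(date +%Y%m%d)}"
  if [ "$current" -ge "${today}00" ]; then
    # Already edited today (or the serial is ahead of the date): just add 1.
    echo $((current + 1))
  else
    # First edit today: start a new counter for this date.
    echo "${today}01"
  fi
}

next_serial 2024050102 20240501   # prints 2024050103
next_serial 2024043005 20240501   # prints 2024050101
```

The only point that matters to Bind is that the new serial is larger than the old one.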

Restart Bind

# systemctl restart named-chroot

As for Apache, I configured virtual hosts and then Certbot to support https.

Edit "/etc/httpd/conf.d/vhost.conf" or "[Apache home]/conf.d/vhost.conf"

<VirtualHost *:80>
ServerName dhconv.dhii.jp
DocumentRoot /var/www/html/
DirectoryIndex index.html index.php
ErrorLog logs/dhconv-error_log
CustomLog logs/dhconv-access_log combined env=!no_log
</VirtualHost>

Restart Apache

# systemctl restart httpd

Run Certbot

# certbot -d dhconv.dhii.jp

2. Set up a reverse proxy

Edit "/etc/httpd/conf.d/vhost-le-ssl.conf" or "[Apache home]/conf.d/vhost-le-ssl.conf":

<VirtualHost *:443>
ServerName dhconv.dhii.jp
(snip)
<Proxy *>
        Require all granted
</Proxy>

        ProxyRequests Off
        ProxyPreserveHost On
        ProxyPass / http://localhost:8080/ keepalive=On
        ProxyPassReverse / http://localhost:8080/
        RequestHeader set X-Forwarded-Proto "https"
(snip)
</VirtualHost>
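The directives above depend on Apache modules: ProxyPass and ProxyPassReverse need mod_proxy and mod_proxy_http, and RequestHeader needs mod_headers. They are usually loaded already, but if the restart fails it may be worth checking. The following is a minimal sketch; the function name missing_proxy_modules is made up for this post, and it only inspects the text that `httpd -M` prints.

```shell
#!/bin/sh
# Hypothetical helper: read `httpd -M` output on stdin and print the modules
# that the reverse proxy settings need but that are not loaded.
# proxy_module / proxy_http_module serve ProxyPass; headers_module serves RequestHeader.

missing_proxy_modules() {
  loaded="$(cat)"
  for mod in proxy_module proxy_http_module headers_module; do
    if ! printf '%s\n' "$loaded" | grep -qw "$mod"; then
      echo "$mod"
    fi
  done
}

# On the server you would run:
#   httpd -M 2>/dev/null | missing_proxy_modules
# Here a shortened sample listing stands in for the real output.
printf ' core_module (static)\n proxy_module (shared)\n headers_module (shared)\n' \
  | missing_proxy_modules   # prints proxy_http_module
```

No output means all three modules are loaded.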

Restart Apache

# systemctl restart httpd

3. Clone the GitHub repository

$ git clone https://github.com/ADHO/dhconvalidator

4. Edit template files

Edit the docx and ott template files in the following directory as appropriate for your conference. Be careful not to change too much, or the conversion to TEI/XML may not work.

$ ls dhconvalidator/src/main/resources/template/
DH_template_DH2016_en.docx  DH_template_DH2018_es.docx   DH_template_DHd2016_en.docx  old_DH_template_DH2018_en.docx
DH_template_DH2016_en.ott   DH_template_DH2018_es.ott    DH_template_DHd2016_en.ott
DH_template_DH2018_en.docx  DH_template_DHd2016_de.docx 
DH_template_DH2018_en.ott   DH_template_DHd2016_de.ott   
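One way to start is to copy an existing template pair under a new base name and then edit the copies. The sketch below assumes that approach; the function name copy_template_pair is made up for this post, and which existing template you start from is your choice (DH_template_DH2018_en in the example is only an illustration). The value it prints is the form used later in the Dockerfile for dhconvalidator_templateFileEN, that is, the path without the extension.

```shell
#!/bin/sh
# Hypothetical helper: copy an existing template pair (.docx and .ott) under a
# new base name, so that both files share the same name apart from the extension.
# Usage: copy_template_pair TEMPLATE_DIR SOURCE_BASE NEW_BASE

copy_template_pair() {
  dir="$1"; src="$2"; dst="$3"
  for ext in docx ott; do
    if [ ! -f "$dir/$src.$ext" ]; then
      echo "missing: $dir/$src.$ext" >&2
      return 1
    fi
    cp "$dir/$src.$ext" "$dir/$dst.$ext" || return 1
  done
  # Print the value for the template setting: path without extension.
  echo "template/$dst"
}

# Example (run from the directory that contains the cloned repository):
#   copy_template_pair dhconvalidator/src/main/resources/template \
#       DH_template_DH2018_en DH_template_JADH_en
```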

In this case, the following two files were created. Note how the files are named.

DH_template_JADH_en.docx  DH_template_JADH_en.ott

5. Edit the Dockerfile

Next, here is the most important part. Write the required information in the following file.

dhconvalidator/Dockerfile

The contents of this file will look like this. Replace "jadh-2024" and the other values as appropriate for your conference. In particular, do not forget the [REST Interface Password]. (This Dockerfile is a slight modification of the one on GitHub, so if you base your edits on this one, you can avoid errors during the Docker build.)

# step 1: build the war file
FROM gradle:5.4 AS builder

WORKDIR /home/gradle/dhconvalidator
USER root:root
COPY . .
RUN gradle war  --no-daemon

# step 2: run the application server
FROM jetty:alpine
USER root
RUN apk add --no-cache curl

ENV dhconvalidator_base_url=http://dhconv.dhii.jp/dhconv \
    dhconvalidator_conftool_login_url=https://www.conftool.net/jadh-2024/ \
    dhconvalidator_conftool_rest_url=https://www.conftool.net/jadh-2024/rest.php \
    dhconvalidator_conftool_shared_pass=[REST Interface Password on the Conftool] \
    dhconvalidator_defaultSubmissionLanguage=ENGLISH \
    dhconvalidator_encodingDesc='<encodingDesc xmlns="http://www.tei-c.org/ns/1.0"><appInfo><application ident="DHCONVALIDATOR" version="{VERSION}"><label>DHConvalidator</label></application></appInfo></encodingDesc>' \
    dhconvalidator_html_address_generation=true \
    dhconvalidator_html_to_xml_link=true \
    dhconvalidator_image_min_resolution_height=50 \
    dhconvalidator_image_min_resolution_width=100 \
    dhconvalidator_logConversionStepOutput=true \
    dhconvalidator_oxgarage_url=https://teigarage.tei-c.org/ege-webservice/ \
    dhconvalidator_performSchemaValidation=true \
    dhconvalidator_publicationStmt='<publicationStmt xmlns="http://www.tei-c.org/ns/1.0"><publisher>Japanese Association for Digital Humanities</publisher><address><addrLine>5-26-4-11F, Hongo, </addrLine><addrLine>Bunkyo-ku, Tokyo</addrLine><addrLine>Japan</addrLine><addrLine>Japanese Association for Digital Humanities</addrLine></address></publicationStmt>' \
    dhconvalidator_showOnlyAcceptedPapers=false \
    dhconvalidator_showOnlyAcceptedUsers=true \
    dhconvalidator_tei_image_location=/Pictures \
    dhconvalidator_templateFileEN=template/DH_template_JADH_en \
    dhconvalidator_paperProviderClass=org.adho.dhconvalidator.conftool.ConfToolClient \
    dhconvalidator_userProviderClass=org.adho.dhconvalidator.conftool.ConfToolClient

COPY --from=builder /home/gradle/dhconvalidator/build/libs/*.war /tmp/
COPY entrypoint.sh /entrypoint.sh

USER root:root
RUN mkdir -p ${JETTY_BASE}/webapps/ROOT \
    && unzip /tmp/*.war -d ${JETTY_BASE}/webapps/ROOT \
    && chown -R jetty:jetty ${JETTY_BASE}/webapps/ROOT
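Since a forgotten value only shows up as a failure much later, it may be worth checking the edited file before building. The following is a minimal sketch; check_dockerfile is a name made up for this post, and it only looks for the password placeholder shown above and for the two provider class settings.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check an edited Dockerfile before building.
# It reports a leftover password placeholder and missing provider-class settings.
# Usage: check_dockerfile PATH_TO_DOCKERFILE

check_dockerfile() {
  file="$1"
  status=0
  # The [placeholder] from this post must be replaced by the real password.
  if grep -q 'conftool_shared_pass=\[' "$file"; then
    echo "shared_pass still contains a [placeholder]"
    status=1
  fi
  # Both provider classes should point at the Conftool client.
  for key in dhconvalidator_paperProviderClass dhconvalidator_userProviderClass; do
    if ! grep -q "$key=org.adho.dhconvalidator.conftool.ConfToolClient" "$file"; then
      echo "$key is missing or not set to ConfToolClient"
      status=1
    fi
  done
  return "$status"
}

# Example: check_dockerfile dhconvalidator/Dockerfile && echo "looks fine"
```

This does not prove that the values are correct, only that nothing obvious was left out.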

The following two lines are especially important, so I will show them again for confirmation.

dhconvalidator_paperProviderClass=org.adho.dhconvalidator.conftool.ConfToolClient
dhconvalidator_userProviderClass=org.adho.dhconvalidator.conftool.ConfToolClient

6. Build the Docker image

Now that you've made it this far, you're ready to build the Docker image, as described in the GitHub tutorial. Run the following in the dhconvalidator directory:

# docker build -t dhconvalidator .

7. Run the Docker container

Then, run the Docker container. Port 8080 here is the one the reverse proxy forwards to.

# docker run -d --rm -p 8080:8080 --name dhconvalidator dhconvalidator

8. Access your DHconvalidator with your Web browser

Open your DHconvalidator in your Web browser. If your paper has been accepted, your name will be displayed after you log in, and the title of your paper will be displayed when you proceed further. If your paper is not accepted, you may want to create a temporary accepted user on Conftool, for example. In any case, I hope you can log in successfully.
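If the browser shows an error right after you start the container, the application inside may simply still be starting up (this is a guess on my side, not something the GitHub tutorial states). As a minimal sketch, a small retry loop can wait until the container answers; the function name wait_until is made up for this post.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or the tries run out.
# Usage: wait_until TRIES DELAY_SECONDS COMMAND [ARGS...]

wait_until() {
  tries="$1"; delay="$2"; shift 2
  while [ "$tries" -gt 0 ]; do
    if "$@"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# Example: wait up to about a minute for the container to answer on port 8080.
#   wait_until 30 2 curl -fsS -o /dev/null http://localhost:8080/
```

If this never succeeds, `docker logs dhconvalidator` is probably the next place to look.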