As described in our previous article on SEO for JavaScript applications, prerendering services still play an important role in enabling search engine crawlers to index dynamic content. This time we will learn how to install Prerender.io on your own server infrastructure.

Prerender.io is a great service that allows you to optimize your JavaScript applications for search engines with little effort: you purchase an account with them and install the middleware on your server. The source code for their service is available on GitHub, so you can alternatively run it on your own servers, which is handy in high-volume scenarios. This is what we use for our applications, and since the installation instructions are a bit terse, we’ve decided to document the whole process. We will be using a fresh, dedicated installation of Ubuntu 14.04 to run our own instance of Prerender.io.

Basics

  • Go to the folder under which you want to install Prerender.io - in our case it is /home/mono - and install git, node.js and the actual prerender.io package.
    sudo apt-get update
    sudo apt-get install git
    sudo apt-get install nodejs
    sudo apt-get install nodejs-legacy
    sudo apt-get install npm
    git clone https://github.com/prerender/prerender.git
    cd prerender
    npm install
    npm install weak
  • You should be able to run the Prerender.io server at this point by typing
    node server.js
  • We don’t want to leave our prerendering service open to just about anyone. Therefore, we’ll activate the authentication plugin by uncommenting the line below in the server.js file:
    server.use(prerender.basicAuth());

Caching

  • Generating HTML snapshots is a resource-intensive process, so some sort of caching strategy should be used to improve the performance. Prerender.io comes with several different caching plugins, but we will use the one based on Redis, as it offers scalability and a simple cache expiration mechanism. Redis can be built from source, but the pre-built version is sufficient for the task at hand.
    sudo apt-get update
    sudo apt-get upgrade
    sudo apt-get -y install redis-server
  • By default, the Redis instance listens only on the localhost interface, but you should always check that the line containing bind 127.0.0.1 in /etc/redis/redis.conf is uncommented and active. For added security, follow the instructions for securing the Redis installation on Ubuntu.

  • You can install Redis Desktop Manager from http://redisdesktop.com/ for managing the contents of your Redis instance.

  • Download the Redis cache plugin and install it. By default, it caches pages for one day and then expires them. This can be overridden by setting the PAGE_TTL environment variable (read by the plugin as process.env.PAGE_TTL) to a value in seconds. To never expire cached pages, set PAGE_TTL to 0.

    npm install prerender-redis-cache --save
  • Change server.js to use the Redis cache plugin:
    server.use(require('prerender-redis-cache'));
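
As a rough sketch of the two settings discussed in this section, the shell helpers below check a Redis config file for an active bind 127.0.0.1 line and convert a cache lifetime in days into the seconds that PAGE_TTL expects. The function names redis_bound_to_localhost and days_to_seconds are our own illustrations (they are not part of Redis or the plugin), and the one-week lifetime is only an example value.

```shell
# Hypothetical helper: succeeds only if the given redis.conf contains an
# active (uncommented) "bind" line that starts with 127.0.0.1.
# $1 = path to the config file (defaults to the Ubuntu package location)
redis_bound_to_localhost() {
    grep -Eq '^[[:space:]]*bind[[:space:]]+127\.0\.0\.1' "${1:-/etc/redis/redis.conf}"
}

# Hypothetical helper: converts whole days to seconds for PAGE_TTL.
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

# Example value: keep cached pages for one week instead of the default one day.
export PAGE_TTL="$(days_to_seconds 7)"

# Usage on the server (uncomment to run):
# redis_bound_to_localhost && echo "Redis is bound to localhost"
```

PAGE_TTL has to be exported in the environment that launches node server.js, otherwise the plugin falls back to its default.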

Finishing the installation

  • Install the Access log plugin to keep the access logs - it can be useful for debugging and other maintenance tasks.
    npm install prerender-access-log --save
  • Initialize the plugin in the server.js. You also need to configure the access log settings. Here is the finished version of the server.js:
    #!/usr/bin/env node
    var prerender = require('./lib');

    var server = prerender({
        workers: process.env.PHANTOM_CLUSTER_NUM_WORKERS,
        iterations: process.env.PHANTOM_WORKER_ITERATIONS || 10,
        phantomBasePort: process.env.PHANTOM_CLUSTER_BASE_PORT || 12300,
        messageTimeout: process.env.PHANTOM_CLUSTER_MESSAGE_TIMEOUT,
        accessLog: {
            // Check out the file-stream-rotator docs for parameters
            fileStreamRotator: {
                filename: '/var/log/prerender/access-%DATE%.log',
                frequency: 'daily',
                date_format: 'YYYY-MM-DD',
                verbose: false
            },

            // Check out the morgan docs for the available formats
            morgan: {
                format: 'combined'
            }
        }
    });

    server.use(prerender.basicAuth());
    // server.use(prerender.whitelist());
    server.use(prerender.blacklist());
    //server.use(prerender.logger());
    server.use(prerender.removeScriptTags());
    server.use(prerender.httpHeaders());
    server.use(require('prerender-access-log'));
    server.use(require('prerender-redis-cache'));
    // server.use(prerender.inMemoryHtmlCache());
    // server.use(prerender.s3HtmlCache());
    server.start();    
  • Create a startup script called startup.sh in /home/mono/prerender and use chmod to make it executable (‘x’).
    #!/usr/bin/env bash 
    
    export BASIC_AUTH_USERNAME=myusername
    export BASIC_AUTH_PASSWORD=mypassword
    export PORT=80
    node /home/mono/prerender/server.js
  • To test how it works, use Postman or an alternative tool: set Authorization for the request to Basic Auth, enter the username and password as set in startup.sh, and issue a GET request, for example to http://myprerender.mydomain.com/http://domaintorender.com/somepage/.

  • Create a file /etc/init/prerender.conf with the following contents to start Prerender as a service and respawn it if it fails. For more details on the Upstart event system that it uses, see the Upstart documentation.

    #!upstart
    description "A job that runs the prerender service"
    author "Denis"

    start on filesystem or runlevel [2345]
    script
        export HOME="/home/mono/prerender"
        cd /home/mono/prerender
        exec su -c '/home/mono/prerender/startup.sh'
    end script

    pre-start script
        echo "[`date`] Prerender Service Starting" >> /var/log/prerender/prerender.log
    end script

    pre-stop script
        echo "[`date`] Prerender Service Stopping" >> /var/log/prerender/prerender.log
    end script

    respawn
    respawn limit 10 90
  • You can optionally install the New Relic analytics tool to track the server performance while it renders and serves pages to the search engine bots.
    echo 'deb http://apt.newrelic.com/debian/ newrelic non-free' | sudo tee /etc/apt/sources.list.d/newrelic.list
    wget -O- https://download.newrelic.com/548C16BF.gpg | sudo apt-key add -
    sudo apt-get update
    sudo apt-get install newrelic-sysmond
    sudo nrsysmond-config --set license_key=MyLicenseKey
    sudo /etc/init.d/newrelic-sysmond start
  • To test the service, type sudo service prerender start (you can also use restart or stop). The service will also start automatically when the server reboots. That’s all for the server side!
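
The manual check with Postman described above can also be done from the command line. The sketch below assumes curl is installed and reuses the placeholder host names and credentials from this article; the helper names prerender_request_url and prerender_status are hypothetical.

```shell
# Hypothetical helper: joins the prerender base URL and the page URL the way
# the service expects them, e.g. http://prerender-host/http://site/page.
# A single trailing slash on the base URL is dropped to avoid "//".
prerender_request_url() {
    printf '%s/%s' "${1%/}" "$2"
}

# Hypothetical helper: prints only the HTTP status code of a prerender request.
# $1 = prerender base URL, $2 = page to render, $3 = user:password
prerender_status() {
    curl -s -o /dev/null -w '%{http_code}' -u "$3" "$(prerender_request_url "$1" "$2")"
}

# Usage with the placeholder values from startup.sh (uncomment to run):
# prerender_status http://myprerender.mydomain.com http://domaintorender.com/somepage/ myusername:mypassword
```

A 200 status suggests that the page was rendered. A request with wrong credentials should be rejected, although the exact status code depends on the version of the basicAuth plugin.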

Client side

  • On the client application side, there are two ways to utilize Prerender.io functionality: via URL rewriting in the server config file, or via a middleware (one is available for ASP.NET, for example). We will describe the URL rewrite approach with Internet Information Services (IIS) for Windows, as the middleware technique is trivial.
  • To allow forwarding requests to other servers in the URL rewriting rule, Application Request Routing has to be installed on the server level in the IIS.
  • Double click on the Application Request Routing Cache feature in the IIS manager, and choose Server Proxy Settings in the Action pane.
  • Check the Enable proxy checkbox and click Apply.
  • On the site level, double click URL Rewrite and choose View Server Variables in the Action pane.
  • Click on Add and enter HTTP_Authorization to allow this header to be set in the URL rewrite rule.
  • Enter the rewrite rule in the web.config at the site level.
    <?xml version="1.0" encoding="utf-8"?>
    <!--
    For more information on how to configure your ASP.NET application, please visit
    http://go.microsoft.com/fwlink/?LinkId=169433
    -->
    <configuration>
        <system.web>
            <compilation targetFramework="4.5.1" />
            <httpRuntime targetFramework="4.5.1" />
            <customErrors mode="Off" />
        </system.web>
        <system.webServer>
            <httpErrors errorMode="Detailed" />
            <rewrite>
                <rules>
                    <!--# Only proxy the request to Prerender if it's a request for HTML-->
                    <rule name="Prerender" stopProcessing="true">
                        <match url="^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent))(.*)" ignoreCase="false" />
                        <conditions logicalGrouping="MatchAny">
                            <add input="{HTTP_USER_AGENT}" pattern="baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator" />
                            <add input="{QUERY_STRING}" pattern="(.*)_escaped_fragment_(.*)" ignoreCase="false" />
                        </conditions>
                        <action type="Rewrite" url="http://prerender1.mono.software/http://{HTTP_HOST}{REQUEST_URI}" appendQueryString="false" />
                        <serverVariables>
                            <set name="HTTP_Authorization" value="Basic bXl1c2VybmFtZTpteXBhc3N3b3Jk" />
                        </serverVariables>
                    </rule>
                </rules>
            </rewrite>
        </system.webServer>  
    </configuration>
  • The value in the HTTP_Authorization variable is the base64-encoded username:password pair as set in the Prerender.io authentication settings.
  • The pages that need to use this functionality have to include the following tag in their head section:
     <meta name="fragment" content="!">
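
The Authorization value used in the rewrite rule can be generated on any machine that has the base64 utility. The sketch below uses the placeholder credentials from startup.sh; the helper name basic_auth_value is our own.

```shell
# Hypothetical helper: builds the value for the HTTP_Authorization variable
# from a username and password ("Basic " + base64 of "username:password").
basic_auth_value() {
    # printf avoids the trailing newline that echo would add to the input;
    # tr strips line breaks that some base64 implementations insert.
    printf 'Basic %s' "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
}

basic_auth_value myusername mypassword
# prints: Basic bXl1c2VybmFtZTpteXBhc3N3b3Jk
```

Once the rule is in place, a request that imitates one of the listed crawlers (for example curl -A twitterbot against a page of the site) or a request with ?_escaped_fragment_= appended to the URL should return the prerendered HTML instead of the empty JavaScript shell.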

I hope that you will find this post useful for your JavaScript SEO scenarios. As always, please do not hesitate to send us your comments and questions; we will be happy to help.
