Multilingual Search Indexes with Lunr.js in Jekyll

Why Per-Language Search is Important

When managing a multilingual Jekyll site, offering translations is not enough: your search function must also respect language boundaries. Without separation, users searching in Japanese might get results in Portuguese, which degrades the experience. With Lunr.js, you can build client-side search indexes tailored to each language, so results stay relevant and each index stays smaller and faster to load than a single site-wide one.

Step 1: Install Lunr.js and Multilingual Plugins

First, include the Lunr.js core library in your site’s assets:


/assets/js/lunr.min.js

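Then include the language support plugins from the lunr-languages package for each non-English language you plan to index (e.g., lunr.stemmer.support.js, lunr.ja.js, lunr.pt.js). English support is built into Lunr.js itself. Note that Japanese uses the language code ja (jp is the country code) and also needs the tinyseg.js tokenizer shipped with lunr-languages.

Example Include:


<script src="/assets/js/lunr.min.js"></script>
<script src="/assets/js/lunr.stemmer.support.js"></script>
<script src="/assets/js/tinyseg.js"></script>
<script src="/assets/js/lunr.ja.js"></script>
<script src="/assets/js/lunr.pt.js"></script>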
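If each page only needs the scripts for its own language, the list of includes can be derived from the language code. The following is a minimal sketch, not part of Lunr.js: pluginScriptsFor and the /assets/js/ base path are hypothetical, and the file names assume the layout of the lunr-languages package.

```javascript
// Hypothetical helper: returns the script URLs one language needs, in load order.
// BASE is an assumed asset path; adjust it to your site.
const BASE = '/assets/js/';

function pluginScriptsFor(lang) {
  const scripts = [BASE + 'lunr.min.js'];
  if (lang === 'en') return scripts; // English is built into Lunr core

  scripts.push(BASE + 'lunr.stemmer.support.js');
  if (lang === 'ja') scripts.push(BASE + 'tinyseg.js'); // tokenizer used by the Japanese plugin
  scripts.push(BASE + 'lunr.' + lang + '.js');
  return scripts;
}

console.log(pluginScriptsFor('ja'));
```

Loading only what a page needs keeps English pages free of plugin downloads they would never use.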

Step 2: Generate Per-Language Index JSON

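In your Jekyll project, create a new page (a regular page, not a layout), for example search_index.html. Its front matter disables the layout and sets a permalink ending in .json, so Jekyll writes the rendered output as a JSON file: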


---
layout: null
permalink: /search_index/en.json
---
{% comment %} "doc" is used as the loop variable so Jekyll's global "page" is not shadowed. {% endcomment %}
[
{% assign docs = site.pages | where: "lang", "en" %}
{% for doc in docs %}
  {
    "title": {{ doc.title | jsonify }},
    "url": {{ doc.url | jsonify }},
    "content": {{ doc.content | strip_html | strip_newlines | jsonify }}
  }{% unless forloop.last %},{% endunless %}
{% endfor %}
]

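Repeat this for each language, e.g., /search_index/ja.json, /search_index/pt.json, etc., changing both the permalink and the lang value in the where filter. Note that site.pages does not include posts; to index posts as well, filter site.posts in the same way.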
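A template mistake can produce entries that parse as JSON but break indexing later, such as a page without a title or two pages sharing a URL. The sketch below is one way to check a generated index before relying on it; validateIndex is a hypothetical helper, and the field names match the template in this step.

```javascript
// Hypothetical check for a parsed search index (an array of entries).
// Returns a list of human-readable problems; an empty list means the index looks usable.
function validateIndex(entries) {
  const problems = [];
  const seenUrls = new Set();

  entries.forEach((entry, i) => {
    for (const key of ['title', 'url', 'content']) {
      if (typeof entry[key] !== 'string') {
        problems.push(`entry ${i}: "${key}" is not a string`);
      }
    }
    // Lunr uses the url as the document ref, so it must be unique
    if (seenUrls.has(entry.url)) {
      problems.push(`entry ${i}: duplicate url ${entry.url}`);
    }
    seenUrls.add(entry.url);
  });

  return problems;
}

const sample = [
  { title: 'Home', url: '/', content: 'Welcome' },
  { title: null, url: '/about/', content: 'About us' },
];
console.log(validateIndex(sample));
```

It could be run in the browser console against the fetched data, or in a Node script against the files in _site.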

Step 3: Load the Correct Index Based on Language

On your search page, use JavaScript to detect the current language and load the matching index file:


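const lang = document.documentElement.lang || 'en';
fetch(`/search_index/${lang}.json`)
  .then(res => {
    if (!res.ok) throw new Error(`Index request failed: ${res.status}`);
    return res.json();
  })
  .then(data => {
    const idx = lunr(function () {
      // Apply the language plugin if it is loaded; English is built into Lunr
      if (lang !== 'en' && typeof lunr[lang] === 'function') this.use(lunr[lang]);
      this.ref('url');
      this.field('title');
      this.field('content');

      data.forEach(doc => this.add(doc));
    });

    // Re-run the search on every keystroke
    document.getElementById('search-input').addEventListener('input', e => {
      const query = e.target.value;
      let results = [];
      try {
        results = idx.search(query);
      } catch (err) {
        // Lunr throws on malformed queries (e.g. a half-typed "title:")
      }
      displayResults(results, data);
    });
  })
  .catch(err => console.error(err));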
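The lang attribute may carry a region subtag such as pt-BR, or name a language that has no index file. One way to handle both cases is to normalize the value and fall back to a default before building the URL. This is a sketch under assumptions: resolveIndexUrl and the SUPPORTED list are hypothetical and should mirror the index files your site actually generates.

```javascript
// Assumed list of languages that have a generated index file
const SUPPORTED = ['en', 'ja', 'pt'];

// Hypothetical helper: maps a raw lang attribute to an index URL.
function resolveIndexUrl(rawLang, supported = SUPPORTED, fallback = 'en') {
  // "pt-BR" -> "pt", "JA" -> "ja", undefined -> ""
  const primary = String(rawLang || '').toLowerCase().split('-')[0];
  const lang = supported.includes(primary) ? primary : fallback;
  return `/search_index/${lang}.json`;
}

console.log(resolveIndexUrl('pt-BR'));
console.log(resolveIndexUrl('de'));
```

The returned URL can be passed to fetch in place of the template string used above.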

Step 4: Display Search Results

Create a simple result renderer using the ref returned by Lunr:


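function displayResults(results, data) {
  const container = document.getElementById('search-results');
  container.innerHTML = '';

  if (results.length === 0) {
    container.innerHTML = '<p>No results found.</p>';
    return;
  }

  results.forEach(result => {
    const item = data.find(d => d.url === result.ref);
    if (!item) return; // ref not present in data; skip it

    // Build nodes with textContent so indexed text is never parsed as HTML
    const entry = document.createElement('div');
    const link = document.createElement('a');
    link.href = item.url;
    const heading = document.createElement('h3');
    heading.textContent = item.title;
    link.appendChild(heading);
    const snippet = document.createElement('p');
    snippet.textContent = item.content.substring(0, 150) + '...';
    entry.append(link, snippet);
    container.appendChild(entry);
  });
}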
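The lookup and truncation can also be separated from the DOM code, which makes them easier to test. The sketch below assumes the same entry shape as the index from Step 2; resolveResults and makeSnippet are hypothetical names. It builds a Map once instead of scanning the array for every result, and it truncates by code point so that characters outside the Basic Multilingual Plane are not cut in half.

```javascript
// Hypothetical helper: truncates text to maxChars code points.
function makeSnippet(content, maxChars = 150) {
  const chars = Array.from(content); // iterates by code point, not UTF-16 unit
  if (chars.length <= maxChars) return content;
  return chars.slice(0, maxChars).join('') + '...';
}

// Hypothetical helper: turns Lunr results ({ ref, ... }) into plain display objects.
function resolveResults(results, data) {
  const byUrl = new Map(data.map(doc => [doc.url, doc]));
  return results
    .map(result => byUrl.get(result.ref))
    .filter(doc => doc !== undefined) // drop refs that are missing from data
    .map(doc => ({ url: doc.url, title: doc.title, snippet: makeSnippet(doc.content) }));
}

const docs = [{ url: '/a/', title: 'A', content: 'Short text' }];
console.log(resolveResults([{ ref: '/a/' }, { ref: '/missing/' }], docs));
```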

Step 5: Make the Search Page Multilingual

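Create a version of search.html for each language. At minimum, set lang in the front matter. The layout must then emit that value as the lang attribute of the <html> element (for example <html lang="{{ page.lang }}">), because the script in Step 3 reads document.documentElement.lang to choose the index file.

Front Matter:


---
lang: fr
permalink: /fr/search/
title: "Recherche"
---

With this in place, each search page loads the index that matches its own language rather than a site-wide default.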
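The search page's own interface text, such as the "No results found." message, should follow the page language too. One possible approach is a small string table keyed by language code. The table contents and the t helper below are illustrative assumptions, not part of Lunr.js or Jekyll.

```javascript
// Illustrative UI strings; extend with one entry per supported language
const UI_STRINGS = {
  en: { placeholder: 'Search', noResults: 'No results found.' },
  fr: { placeholder: 'Recherche', noResults: 'Aucun résultat trouvé.' },
};

// Hypothetical lookup: falls back to English for unknown languages or keys
function t(lang, key) {
  const table = UI_STRINGS[lang] || UI_STRINGS.en;
  return table[key] !== undefined ? table[key] : UI_STRINGS.en[key];
}

console.log(t('fr', 'noResults'));
```

In a Jekyll site the same strings could live in a data file instead; the JavaScript table simply keeps the example self-contained.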

Optional: Compress the JSON Index

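To reduce transfer size, serve the JSON index with gzip or Brotli compression. Many static hosts, GitHub Pages among them, compress text responses automatically, so check the Content-Encoding response header before adding a build step of your own. Search indexes are repetitive text and usually compress well, which matters most on slow mobile connections.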
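To judge whether compression is worth the effort for a given index, the sizes can be compared locally. This is a minimal Node.js sketch using the built-in zlib module; compressedSizes is a hypothetical helper and the sample data is generated for illustration, so real ratios will differ.

```javascript
const zlib = require('zlib');

// Hypothetical helper: byte sizes of a JSON string raw, gzipped, and Brotli-compressed
function compressedSizes(jsonText) {
  const raw = Buffer.from(jsonText, 'utf8');
  return {
    raw: raw.length,
    gzip: zlib.gzipSync(raw).length,
    brotli: zlib.brotliCompressSync(raw).length,
  };
}

// Generated sample index standing in for a real /search_index/en.json
const sampleIndex = Array.from({ length: 200 }, (_, i) => ({
  title: `Page ${i}`,
  url: `/docs/page-${i}/`,
  content: 'Lunr.js builds a client-side search index for a Jekyll site.',
}));

console.log(compressedSizes(JSON.stringify(sampleIndex)));
```

To measure a real index, read the generated file from _site with fs.readFileSync and pass its contents to the same function.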

Real-World Scenario: Multinational Blog

Consider a software company whose blog supports English, French, and Spanish readers, with localized documentation and blog posts. With per-language Lunr.js indexes, the search interface surfaces only content in the reader's language, which should help reduce bounce rate and increase time-on-site.

Advanced: Split Index by Content Type

If your site includes different content types like blog posts, docs, and FAQs, you can also split results by section. Add a category field to each entry in the index JSON:


"category": "docs"

Then filter results by section using JavaScript tabs or dropdowns, improving usability further.
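The filtering step can be sketched as a small function applied to the results before rendering. It assumes each index entry carries the category field shown above; filterByCategory and the 'all' sentinel value are hypothetical choices for illustration.

```javascript
// Hypothetical filter: keeps only results whose document matches the chosen category.
// results: Lunr results ({ ref, ... }); docsByUrl: Map from url to index entry.
function filterByCategory(results, docsByUrl, category) {
  if (!category || category === 'all') return results; // no filter selected
  return results.filter(result => {
    const doc = docsByUrl.get(result.ref);
    return doc !== undefined && doc.category === category;
  });
}

const entries = [
  { url: '/docs/install/', title: 'Install', content: '...', category: 'docs' },
  { url: '/blog/release/', title: 'Release', content: '...', category: 'blog' },
];
const docsByUrl = new Map(entries.map(doc => [doc.url, doc]));
const hits = [{ ref: '/docs/install/' }, { ref: '/blog/release/' }];

console.log(filterByCategory(hits, docsByUrl, 'docs'));
```

A tab or dropdown handler would call this with the selected category and pass the filtered list to the result renderer.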

Conclusion

Multilingual search indexes provide a powerful upgrade to your Jekyll site’s user experience. They prevent irrelevant results, improve performance, and honor the language preferences of each visitor. Lunr.js offers a flexible and lightweight solution you can customize extensively.

In the next article, we will look at how to create custom 404 pages that automatically support the user's language using Liquid and a redirect fallback.