Programmable Search Engine on Squarespace
Squarespace has a built-in search function that works fairly well, but it’s limited in functionality. While I can’t find any documentation on how it works, a few test queries can show where we’re at. When we talk about building a search component, what we’re really doing is:
Accepting Input
Interpreting Input
Ranking Results
What we’ve grown accustomed to is a search function that is incredibly adept at interpreting input. We expect spell corrections, synonym matching, and results ranked by how closely they match the intent of the query. These are all somewhat complex, and the underlying models and taxonomies that power them have been refined over years. While anyone could take the time to build one from scratch, generally speaking you would spend more time and money for a lesser result. However, in some cases it does make sense to supplement search with organization-specific data where general models would not have the necessary context. Product feeds, for example, benefit heavily from content labelling.
Squarespace search expects an exact match, so although it works, it doesn’t really keep up with user expectations. In an ideal scenario, we get the best of both worlds by integrating with a search provider that allows for supplementary feeds. When researching options, the offerings are all over the place. You can go with something like Swiftype or Algolia and take advantage of some really advanced features, but the cost is prohibitive for something that isn’t revenue-generating. There are other options like Elfsight, but I wouldn’t use most of what they offer, so despite the lower cost I don’t see the benefit.
Google’s Programmable Search Engine is free while also offering some customization that I think will outperform Squarespace search and, at least for the time being, scale as needed.
New Search Engine Set-Up
With Google being the cheapest option that appears to offer some flexibility, we’ll start here and maybe pivot in the future if it becomes a hindrance. It’s important to have triggers for decisions like that, so what I mean by “hindrance” is:
Limited customization - if high-value, prioritized features that drive utility are not possible
Limited results - if relying on third party indexing results in more than 10% failed intent matching
I don’t feel it’s valuable for me to walk through the steps of implementation; there are plenty of great guides already available. But I’ll call out a few relevant aspects to think about.
Blog Search - When setting up your search engine, you can take advantage of website structure to limit search results to a specific area of the website. For me, this will be handy to have a separate search for the blog.
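The restriction itself is configured in the Programmable Search Engine control panel with URL patterns (for example, a pattern covering only the blog path), but if you later query the Custom Search JSON API directly, the same scoping can be expressed per request with the `siteSearch` parameter. A minimal sketch, where the API key, engine ID, and domain are placeholders:

```python
import urllib.parse

# Documented endpoint for the Custom Search JSON API.
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def blog_search_url(api_key: str, engine_id: str, query: str) -> str:
    """Build a Custom Search JSON API request scoped to a blog section."""
    params = {
        "key": api_key,    # placeholder API key
        "cx": engine_id,   # placeholder search engine ID
        "q": query,
        "siteSearch": "example.com/blog",  # placeholder site section
        "siteSearchFilter": "i",           # "i" = include only matching pages
    }
    return CSE_ENDPOINT + "?" + urllib.parse.urlencode(params)
```

This only builds the request URL; issuing the GET and rendering results would sit on top of it, and the same `siteSearchFilter` parameter set to `"e"` would exclude a section instead.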
Category Search - While the set-up would be manual, in Advanced Settings of the search engine you can set Categories to let users refine Search results. When I look into how to implement other pieces of taxonomy like Subject Area, I think this will be valuable.
Query Enhancement - Also manual, but your synonyms and taxonomy can be tuned over time. If I start seeing search data come through, I could tweak the content itself to ensure users find what they’re looking for, but I could also do this at scale by adding synonyms to the search engine itself. Both have merits, so it’s handy to have this customization.
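As a rough illustration, Programmable Search Engine accepts a bulk synonyms file in an XML format along these lines; the terms here are hypothetical placeholders, not entries from my actual configuration:

```xml
<Synonyms>
  <!-- When a user searches "search", also match these variants -->
  <Synonym term="search">
    <Variant>find</Variant>
    <Variant>lookup</Variant>
  </Synonym>
</Synonyms>
```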
Improvements
While Search is working, there are some definite improvements that can be made.
Automating Sitemap Submissions
In its current state, the search results are at the mercy of Google indexing. On a positive note, this means I don’t need to manage search indexing in multiple places - there are pages on the website that aren’t really relevant for search, like the Search page itself. While testing, I found that quite a few older pages I had published as tests were showing up. I quickly resubmitted my website for indexing, but if I have to remember to do this after every material change to the website, I’m going to get annoyed. Tasks like that drive me crazy, so regardless of how small it is, I’m going to find a way to not have to think about it.
Category Search
I believe that as the website expands, it’ll be valuable to a user to be able to filter content. Right now the content is so minimal and specific that it’s not really valuable, but it will be in the future. For now, I’ll just keep in mind that this is a worthwhile enhancement at scale.
What’s Next?
I really hate that search doesn’t index properly, and I don’t like that I’ll need to manually submit to Google for my search to function properly. But I also don’t want to pay for something that I know can be done for free through an API. Of course, time is money, but automating the sitemap submissions is fun to me from a learning perspective, and it will likely exercise many of the same resources as automating data collection for other projects. With that in mind, I’m going to write a script that can run as a Cloud Service job to submit my sitemap to Google Search Console.
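A minimal sketch of that script, using only the standard library: the Search Console API’s `sitemaps.submit` method is an authenticated HTTP PUT with an empty body. This assumes an OAuth 2.0 access token with Search Console permission is already available (obtaining one, e.g. from a service account, is omitted here), and the site and sitemap URLs are placeholders:

```python
import urllib.parse
import urllib.request

# Documented root of the Search Console (Webmasters) API.
API_ROOT = "https://www.googleapis.com/webmasters/v3"

def submit_endpoint(site_url: str, sitemap_url: str) -> str:
    """Build the sitemaps.submit URL; both arguments become
    percent-encoded path segments."""
    site = urllib.parse.quote(site_url, safe="")
    feed = urllib.parse.quote(sitemap_url, safe="")
    return f"{API_ROOT}/sites/{site}/sitemaps/{feed}"

def submit_sitemap(site_url: str, sitemap_url: str, token: str) -> int:
    """Submit (or resubmit) a sitemap; returns the HTTP status code."""
    req = urllib.request.Request(
        submit_endpoint(site_url, sitemap_url),
        method="PUT",  # sitemaps.submit is a PUT with no request body
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Packaged in a container, a function like `submit_sitemap` could run on a schedule as the cloud job described above, so resubmission stops being something I have to remember.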