Upgrading legacy Solr servers
A client of ours has millions of items on their Drupal website that they index into Solr using the fantastic Search API Solr module.
However, they'd been stuck on a set of very old Solr servers until earlier this year, when moving to some shiny new Solr servers became possible.
ComputerMinds helped them make the leap from that legacy version to the latest Solr 9 release. The challenge was to do so with minimal or no downtime on a busy site that serves more than 10 search-related page views every second.
Our first job was to get the Solr 9 server configured to use config and schema written for the legacy Solr server. Thankfully, most of this task had already been done for us: because we build on open source software, other members of the community had already updated the configuration, and we simply got to benefit from that work for free. Thank you, open source maintainers!
We stood up a Solr 9 server in our DDEV development environment, added the updated configuration, and pointed Drupal's Search API Solr module at the new development server. We indexed our small development database into Solr and tested all the site functionality: everything worked perfectly. We just needed to repeat those steps on production, right?
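One small check that saves head-scratching at this stage is confirming the new core actually responds before pointing Drupal at it. A minimal sketch, assuming a core named collection1 reachable on a host called solr inside DDEV (both names will vary with your setup):

import urllib.request
import json

# Hypothetical DDEV host and core names - adjust to match your project.
PING_URL = "http://solr:8983/solr/collection1/admin/ping?wt=json"

with urllib.request.urlopen(PING_URL) as response:
    status = json.load(response)["status"]

# Solr's ping handler reports "OK" when the core is up and answering queries.
print(f"Solr ping status: {status}")

If that prints OK, any problems you hit afterwards are on the Drupal side rather than the Solr side.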
We stood up a new Solr 9 server in production and configured the Search API Solr module to connect to it alongside the old one. Then we cloned the index configuration so that we had two indices: one pointing at the legacy server, and one at the new server. Our plan was to let the indexing of the production data run over the weekend, after which we'd be able to cut traffic over from one server to the other without interruption. However, by Monday morning the indexing hadn't got very far through the data at all. It turns out that millions of items is really quite a lot for Drupal to index into Solr.
We needed to change our approach, and decided that a better way would be to copy the items directly from the old Solr server into the new one, effectively creating a replica.
We asked ChatGPT to generate us a script to do this, and after a bit of back and forth it produced something like this Python code:
import urllib.request
import json

# Solr legacy source
SRC_CORE = "collection1"
SRC_URL = f"http://old-solr:8983/solr/{SRC_CORE}/select"

# Solr 9 target
DST_CORE = "collection1"
DST_URL = f"http://new-solr:8983/solr/{DST_CORE}/update?commitWithin=5000"

BATCH_SIZE = 1000
start = 0

while True:
    # Build Solr legacy request URL
    params = f"?q=*:*&fl=*&sort=id+asc&rows={BATCH_SIZE}&start={start}&wt=json"
    url = SRC_URL + params
    print(f"Fetching batch starting at {start}...")

    with urllib.request.urlopen(url) as response:
        data = json.load(response)

    docs = data['response']['docs']
    if not docs:
        break  # finished all docs

    # Prepare JSON for Solr 9
    payload = json.dumps(docs).encode("utf-8")
    req = urllib.request.Request(
        DST_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST"
    )
    with urllib.request.urlopen(req) as r:
        print(f"Pushed batch of {len(docs)} docs to Solr 9.")

    start += BATCH_SIZE
print("All documents migrated from Solr legacy to Solr 9!")Now, this isn't the exact script I ended up using for lots of reasons, but the bare bones of it are there.
We are simply grabbing the documents out of the old server, and sending them over to the new one for indexing.
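One refinement worth flagging: paging with start= gets slower the deeper you go, because Solr has to re-walk all the earlier rows on every request. For millions of documents, Solr's cursorMark deep paging avoids that cost. Here's a hedged sketch of the fetch loop using cursors instead, with the same hypothetical URLs as above (cursorMark requires sorting on the collection's unique key, id here):

import urllib.parse
import urllib.request
import json

SRC_URL = "http://old-solr:8983/solr/collection1/select"  # hypothetical, as above
BATCH_SIZE = 1000

cursor = "*"  # a cursorMark of "*" means "start from the beginning"
while True:
    params = urllib.parse.urlencode({
        "q": "*:*",
        "fl": "*",
        "sort": "id asc",  # cursors require a sort on the unique key
        "rows": BATCH_SIZE,
        "cursorMark": cursor,
        "wt": "json",
    })
    with urllib.request.urlopen(f"{SRC_URL}?{params}") as response:
        data = json.load(response)

    docs = data["response"]["docs"]
    # ... POST docs to the new server, exactly as in the script above ...

    next_cursor = data["nextCursorMark"]
    if next_cursor == cursor:
        break  # the cursor stopped advancing, so we've seen every document
    cursor = next_cursor

Either way the principle is identical: read a batch from one server, write it to the other.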
This was so fast... like 20 minutes to transfer all that lovely data.
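Before trusting a transfer like that, it's worth comparing document counts on both ends. Something along these lines, again using the hypothetical URLs from the script, would do:

import urllib.request
import json

def num_found(select_url):
    """Ask a Solr core how many documents match *:*, without fetching any."""
    with urllib.request.urlopen(f"{select_url}?q=*:*&rows=0&wt=json") as response:
        return json.load(response)["response"]["numFound"]

old_count = num_found("http://old-solr:8983/solr/collection1/select")
new_count = num_found("http://new-solr:8983/solr/collection1/select")
print(f"Legacy: {old_count} docs, Solr 9: {new_count} docs")
assert old_count == new_count, "Counts differ - don't cut over yet!"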
So now we had a solid plan: pause indexing on the Drupal side for the old legacy Solr server, run the script to copy the data over to the new Solr server, and then, once it was all there, make a quick change to the Search API index to say that it now lives on the other server. And then it would all just work!
And it did! Writes to the Solr index were paused for the duration, but because our client's site uses queues all over the place, it was simple to pause the indexing queue and restart it once the change to the new Solr server was made.
No one noticed: the search index remained available for read-only queries the entire time, and there was no downtime for the site at all.
That's the perfect sort of upgrade, right? When no one notices what you've expertly done.