Alfresco Solr Trackers Showcase

In 2014, while working at Alfresco, I helped upgrade the product from Solr 1.4 to Solr 4.9, and in doing so I changed much of the Solr tracking code. We were on Alfresco version 4.2, and our next big release would be 5.0. Here’s an overview of my work.

We started with a single solr project. In ACE-916 and other issues we split the code out into two additional projects: solr-client and solr4.

The solr-client Project

The solr-client project contains our Java API for connecting to Alfresco for tracking. Despite its name, the project really provides a proxy to Alfresco rather than to Solr. I moved the code that was relevant to all Solr versions from the original solr project into this one. I created the SOLRAPIClientFactory and its associated SOLRAPIClientFactoryTest. I also created a set of adapter interfaces so that dependent code in this project stays free of dependencies on specific versions of the classes they wrap. The solr and solr4 projects each depend on solr-client but are independent of each other.
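To make the adapter idea concrete, here is a minimal sketch. It is not the Alfresco code: DocSetAdapter, BitSetDocSet, HashSetDocSet, and SharedTrackingCode are names made up for illustration, and the two implementations merely stand in for classes that would live in version-specific projects.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical adapter interface: shared code depends only on this type.
interface DocSetAdapter {
    void set(long docId);
    boolean get(long docId);
    long cardinality();
}

// Stand-in for an implementation that would live in one version-specific project.
final class BitSetDocSet implements DocSetAdapter {
    private final BitSet bits = new BitSet();

    public void set(long docId) { bits.set(Math.toIntExact(docId)); }
    public boolean get(long docId) { return bits.get(Math.toIntExact(docId)); }
    public long cardinality() { return bits.cardinality(); }
}

// Stand-in for an implementation that would live in the other version-specific project.
final class HashSetDocSet implements DocSetAdapter {
    private final Set<Long> ids = new HashSet<>();

    public void set(long docId) { ids.add(docId); }
    public boolean get(long docId) { return ids.contains(docId); }
    public long cardinality() { return ids.size(); }
}

// Shared code: compiles against the interface only, never against an implementation.
final class SharedTrackingCode {
    static long markAll(DocSetAdapter target, long... docIds) {
        for (long id : docIds) {
            target.set(id);
        }
        return target.cardinality();
    }
}
```

The point of the design is that SharedTrackingCode never needs to change when a new implementation is added; each version-specific project supplies its own DocSetAdapter.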

The solr Project

The solr project contains the code relevant to Solr 1.4. I created the Solr 1.4 implementations of the adapters. The original CoreTracker had more than 2,000 lines of code and did all kinds of tracking sequentially; it was refactored in the solr4 project. I also created the InformationServer interface and the LegacySolrInformationServer.

The solr4 Project

The solr4 project is specific to Solr 4.9. I created this project, and the majority of my work was done here. I created the ModelTracker, ContentTracker, and MetadataTracker, and I contributed to the AbstractTracker and AclTracker. With tracking now split across several kinds of trackers, they used the ThreadHandler, QueueHandler, and AbstractWorkableRunner to take advantage of multi-threading. The ContentTracker took advantage of our new SolrContentStore, which we used as a cache to avoid having to hit Alfresco for a reindex. The associated tracker tests are part of this project as well.
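The general idea of handing batches of work to a pool of worker threads can be sketched as follows. This is only an illustration built on java.util.concurrent, not the ThreadHandler, QueueHandler, or AbstractWorkableRunner themselves; BatchIndexingSketch and its methods are hypothetical names, and the "indexing" step just records the node id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: none of these names come from the Alfresco code base.
final class BatchIndexingSketch {

    /** Splits the ids into consecutive batches of at most batchSize elements. */
    static List<List<Long>> partition(List<Long> ids, int batchSize) {
        List<List<Long>> batches = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            int end = Math.min(i + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(i, end)));
        }
        return batches;
    }

    /** Processes every batch on a worker thread and returns the ids that were handled. */
    static List<Long> indexInParallel(List<Long> nodeIds, int batchSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        ConcurrentLinkedQueue<Long> indexed = new ConcurrentLinkedQueue<>();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<Long> batch : partition(nodeIds, batchSize)) {
                pending.add(pool.submit(() -> {
                    for (Long id : batch) {
                        indexed.add(id); // placeholder for the real indexing work
                    }
                }));
            }
            // Wait for every batch so the caller sees a complete result.
            for (Future<?> future : pending) {
                future.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while indexing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return new ArrayList<>(indexed);
    }
}
```

Because batches finish in no particular order, the returned list is not guaranteed to match the input order; callers that care should sort it.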

I also changed how we triggered tracking. Alfresco has a separate Solr core for each Alfresco store, e.g. workspace://SpacesStore for “live” content, archive://SpacesStore for “deleted” content, etc. The AlfrescoCoreAdminHandler, which is a custom CoreAdminHandler, instantiates a SolrTrackerScheduler, which schedules a CoreWatcherJob. The CoreWatcherJob goes through the Solr cores and registers the information server and the trackers for each core with the admin handler. To do this I created a TrackerRegistry to register trackers per core. This code is covered by the SolrTrackerSchedulerTest and the TrackerRegistryTest.
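A per-core registry of this kind can be sketched in a few lines. What follows is an assumption about the general shape, not the actual TrackerRegistry: PerCoreRegistrySketch, TrackerSketch, and the two stand-in tracker classes are hypothetical, and keeping one tracker of each type per core is a design choice made for the example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker contract used only by this sketch.
interface TrackerSketch {
    void track();
}

// Trivial stand-ins so the registry has something to hold.
final class MetadataTrackerSketch implements TrackerSketch {
    public void track() { /* would pull metadata changes */ }
}

final class AclTrackerSketch implements TrackerSketch {
    public void track() { /* would pull ACL changes */ }
}

// Hypothetical registry: core name -> (tracker type -> tracker instance).
final class PerCoreRegistrySketch {
    private final Map<String, Map<Class<? extends TrackerSketch>, TrackerSketch>> byCore =
            new ConcurrentHashMap<>();

    /** Registers a tracker for a core, replacing any tracker of the same type. */
    void register(String coreName, TrackerSketch tracker) {
        byCore.computeIfAbsent(coreName, name -> new ConcurrentHashMap<>())
              .put(tracker.getClass(), tracker);
    }

    /** Returns a snapshot of the trackers registered for the core (empty if none). */
    Collection<TrackerSketch> trackersFor(String coreName) {
        Map<Class<? extends TrackerSketch>, TrackerSketch> trackers = byCore.get(coreName);
        if (trackers == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(trackers.values());
    }

    /** Returns a sorted snapshot of the core names that have trackers. */
    Set<String> coreNames() {
        return new TreeSet<>(byCore.keySet());
    }

    /** Drops everything registered for the core; returns true if the core was known. */
    boolean removeCore(String coreName) {
        return byCore.remove(coreName) != null;
    }
}
```

A watcher job could then loop over the cores, call register for each tracker it creates, and call removeCore when a core goes away.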

As the new SolrCore required, I created an AlfrescoSolrCloseHook along with its AlfrescoSolrCloseHookTest. I created the InformationServer interface and the SolrInformationServer implementation for solr4. I wanted to have a single InformationServer interface in the solr-client project, but the implementations were so different that it didn’t fit. Now I almost feel there is no point in having the interface in both projects, but I left them in anyway.

I also implemented the adapters mentioned above in this project. Since we were trying to make our new Solr implementation cloud-friendly, I implemented Cloud to facilitate running Solr queries in the cloud. Along with that came the SolrCoreTestBase and the CloudTest.

For ACE-3126 I ensured that module models are fetched before queries go through during installation. I created the EnsureModelsComponent, which makes queries block until the first model sync with the repository is done.
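One common way to make callers wait for a one-time event is a CountDownLatch. The sketch below shows that technique under my own assumptions; it is not the EnsureModelsComponent, and FirstModelSyncGate and its method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical gate: queries wait here until the first model sync has completed.
final class FirstModelSyncGate {
    // Count of 1: a single countDown() opens the gate for good.
    private final CountDownLatch firstSync = new CountDownLatch(1);

    /** Called by the model-syncing side once the first sync has finished. */
    void markFirstSyncDone() {
        firstSync.countDown();
    }

    /**
     * Called on the query path. Blocks until the first sync is done or the
     * timeout elapses; returns true only if the sync has completed.
     */
    boolean awaitFirstSync(long timeout, TimeUnit unit) {
        try {
            return firstSync.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    /** Non-blocking check, handy for status pages or tests. */
    boolean isReady() {
        return firstSync.getCount() == 0;
    }
}
```

Once the latch has been counted down, later calls to awaitFirstSync return immediately, so the cost is paid only by queries that arrive before the first sync.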

Testing

When writing unit tests, I prefer to use a mocking framework so that the tests have no external dependencies such as a database or an app server. That’s why I invested time in blazing a trail for using Mockito at Alfresco. Of course, we also ran integration tests like those in the AlfrescoCoreAdminTester, as well as manual tests documented in various Jira issues. I didn’t participate in the performance tests, but I know they were done.

Mavenization

The code was originally built using Ant, and I mavenized the build. This included a solr4-ssl profile in the solr4 pom to enable secure communications and a solr-http profile in the solr pom to disable SSL.

Conclusion

I learned about Solr, multi-threading, scheduling jobs in Alfresco, Ant, Maven, and, more generally, how to populate the Solr index with all the relevant information in an enterprise content management system like Alfresco. I have since learned that others, such as Lucidworks, have solved this problem. I have developed a passion for this area and would like to do more work in it in the future.

 

Security Breaches Leave Millions Vulnerable

Sounds pretty bad, eh? Well, that’s a fact of life in today’s world, where every so often you hear about another big company’s security being breached and its data leaking out to hackers. Usernames, passwords, social security numbers, birthdates… all kinds of identifying information have been leaked from multiple companies’ databases. How close does that leave you to having your identity stolen? It’s almost a matter of time… or is it? Isn’t there anything you can do about it?

Yes, there is! There are tools that can help you protect yourself from… well, yourself. What do I mean? See, the problem is that most people use the same passwords for multiple accounts, don’t change their passwords frequently enough, and don’t use secure passwords. Not you? Well, I bet you’re guilty of at least one of these faults. We all are, and that’s because managing all those passwords is hard work! Luckily the password manager LastPass is here to help. Yes, LastPass, as in the Last Password you will have to remember.

I’ve been using LastPass for over a year now, and it has saved me tons of headaches. I never worry about having to create another password that I’ll have to remember, because I don’t. LastPass creates them for me and remembers them for me too! In fact, it even performs a security analysis and tells me if any of the usernames in my vault have been involved in any known security breaches. It tells me if any of my passwords are insecure and when they were last updated. It helps me update passwords automatically without my having to do anything except tell it which sites to update. That comes in really handy when it advises me to change compromised, weak, reused, and old passwords.

The best thing about LastPass is that it is free to use with your PC/laptop browser.  If you want to use it for your mobile devices, then it is just one dollar a month for the LastPass Premium.  Create your account here and get a month of LastPass Premium for FREE.