Everything that Google should have asked me about Cloud Search (Springboard) but did not
January 7, 2017
Based on Google’s announcement, Springboard should be entering general availability (although most likely still in “beta”). Not having been asked what should be in the product, I’ve put together my own short wish list. Obviously, this is a much larger topic than can be covered in a short blog post. Nonetheless, here are a few thoughts:
1 - What should we use for our connector framework?
You should migrate from the existing Plexi framework to a microservices-based architecture built on Spring or on pure Google Cloud Platform. Spring runs everywhere, both on-prem and in the cloud. Microservices allow for a more pluggable, dependency-injection model for traversing data and processing it into the cached store, and the native Cloud Platform is excellent at processing large amounts of data.
Some reasons not to use the current Plexi framework are that it does not use dependency injection and therefore requires complicated re-architecture and recoding for changes. Further, crawling is slow and requires many HTTP connections. Indexing in batches and streams is much more efficient. The connectors should sync data to a temporary storage system and use something like Cloud Dataflow for stream and event-based processing.
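To make the dependency injection and batching argument concrete, here is a minimal sketch in plain Python. It is not Plexi or Spring code; every name in it (Document, Traverser, Sink, Connector, and so on) is hypothetical and exists only for illustration. The connector receives its traversal and storage pieces from the outside, so either can be swapped without recoding the connector, and it writes in batches rather than one document per call.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Protocol


@dataclass
class Document:
    doc_id: str
    body: str


class Traverser(Protocol):
    """Anything that can enumerate documents from a source system."""

    def traverse(self) -> Iterable[Document]: ...


class Sink(Protocol):
    """Anything that can accept a batch of documents (e.g. a storage bucket)."""

    def write_batch(self, batch: List[Document]) -> None: ...


@dataclass
class InMemoryTraverser:
    """Hypothetical traverser used here in place of a real repository crawl."""

    docs: List[Document]

    def traverse(self) -> Iterable[Document]:
        return iter(self.docs)


@dataclass
class RecordingSink:
    """Hypothetical sink that just remembers the batches it was given."""

    batches: List[List[Document]] = field(default_factory=list)

    def write_batch(self, batch: List[Document]) -> None:
        self.batches.append(list(batch))


def batched(items: Iterable[Document], size: int) -> Iterator[List[Document]]:
    """Group a stream of documents into lists of at most `size` items."""
    batch: List[Document] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


class Connector:
    """Depends only on the Traverser and Sink interfaces, which are injected."""

    def __init__(self, traverser: Traverser, sink: Sink, batch_size: int = 2):
        self.traverser = traverser
        self.sink = sink
        self.batch_size = batch_size

    def sync(self) -> int:
        """Push every traversed document to the sink in batches; return the count."""
        count = 0
        for batch in batched(self.traverser.traverse(), self.batch_size):
            self.sink.write_batch(batch)
            count += len(batch)
        return count
```

Swapping RecordingSink for a sink that writes to a real bucket, or InMemoryTraverser for a real repository crawler, would not require touching Connector at all, which is the point of the design.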
2 - Where should that temporary storage system be then?
Google Cloud Storage should be the primary data source for external content ingestion. The API is documented and performs very well under large data loads. Connectors sync data to a variety of buckets. As the system is updated or potentially requires reindexing, change event notifications can be wired up to update the index automatically (link ). The storage is encrypted at rest and provides low-latency access from other GCP regions.
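As a rough sketch of how a change notification could drive an index update, the function below maps the attributes of a storage notification to an index action. It assumes the attribute names used by Cloud Storage Pub/Sub notifications (eventType, bucketId, objectId); the action names "reindex" and "remove" and the function itself are hypothetical, and anything unrecognized is simply ignored.

```python
from typing import Dict, Optional, Tuple

# Event types that mean "the object now has new content or metadata".
UPSERT_EVENTS = {"OBJECT_FINALIZE", "OBJECT_METADATA_UPDATE"}
# Event type that means "the object is gone".
DELETE_EVENTS = {"OBJECT_DELETE"}


def plan_index_action(attributes: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Translate notification attributes into an (action, object URI) pair.

    Returns None when the notification is incomplete or not relevant
    to the index.
    """
    event_type = attributes.get("eventType")
    bucket = attributes.get("bucketId")
    name = attributes.get("objectId")
    if not (event_type and bucket and name):
        return None

    uri = f"gs://{bucket}/{name}"
    if event_type in UPSERT_EVENTS:
        return ("reindex", uri)  # hypothetical action name
    if event_type in DELETE_EVENTS:
        return ("remove", uri)  # hypothetical action name
    return None
```

A full reindex then becomes a matter of replaying the bucket contents through the same function rather than crawling the source systems again.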
3 - How should we process the data?
Data stored in Google Cloud Storage can be synced into the Springboard index via Cloud Pub/Sub and Cloud Dataflow. Standard processing templates can be provided, and users should be allowed to upload their own pipelines that leverage other machine learning APIs, both in GCP and elsewhere.
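The idea of standard templates plus user-supplied pipelines can be sketched as a chain of small stages. This is plain Python standing in for a Dataflow pipeline, not Dataflow code; the stage names are hypothetical, and add_word_count is only a placeholder for a user stage that might call a machine learning API.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]
Stage = Callable[[Record], Record]


def normalize_text(record: Record) -> Record:
    """A 'standard template' stage: collapse runs of whitespace in the body."""
    text = str(record.get("body", ""))
    return {**record, "body": " ".join(text.split())}


def add_word_count(record: Record) -> Record:
    """A 'user-supplied' stage: placeholder for a call to an external API."""
    text = str(record.get("body", ""))
    return {**record, "word_count": len(text.split())}


def run_pipeline(records: Iterable[Record], stages: List[Stage]) -> Iterator[Record]:
    """Apply each stage in order to every record, yielding the enriched result."""
    for record in records:
        for stage in stages:
            record = stage(record)
        yield record
```

Because each stage takes a record and returns a new one, a user pipeline is just a different list of stages passed to run_pipeline, and the input records are never modified in place.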
4 - How should we display the result set?
I have a core belief that search should not be an “opt-in” experience. What I mean by this is that you should not have to go to “springboard.google.com”; rather, you should be able to take search into the applications where you want to expose it contextually. Search is simply a service that can be extended through whatever framework you like.
5 - What other things should we think about?
Don’t make search opt in
Forcing everyone to navigate to a search experience is dated. The Google search algorithm and machine learning need to be extended into everyone’s applications. By participating in the indexing and the event stream, you have the ability to provide true Google Now-style events and cards that are useful.
Springboard needs to provide its basic interface for an out-of-the-box experience, but its value is truly amplified if it can be extended into all enterprise applications. This simply requires that Springboard listen to an event stream, process that data, and provide APIs that allow notifications to be brought throughout G Suite, Chrome alerts, etc.
Integration with Other Google APIs
We are in an API economy, and Google has the brand name in big data processing. Integrating with the cognitive API suite is both a competitive advantage and an end-user demand. The Google machine learning, natural language, speech-to-text, and image recognition APIs are just a few examples of where integration with the larger services suite could be leveraged.
Integration with Google Cloud
Currently, Springboard seems to be integrated deeply within G Suite. This potentially limits its use cases, in the same way that it does for Hangouts and Google Docs. Everything is tied to a G Suite user account and the underlying G Suite infrastructure. Google Cloud, on the other hand, has already exposed services and infrastructure that widen the use cases for which Springboard can be leveraged.