A common requirement in realtime messaging applications is the ability to insert business logic into the message processing pipeline. Typical use cases are filtering or payload transformation on a message-by-message basis, either when a message is first ingested by the messaging service, or as part of a rule that captures messages from one channel, applies the business logic, and then forwards the message to another channel.
The Ably platform supports these use-cases by allowing rules to be created that invoke cloud functions (for example, AWS Lambdas or Google Cloud Functions). These functions can be invoked each time Ably processes a message on a matching channel, and those functions have access to all of the functionality of those environments. After applying the required business logic, publishing back into Ably channels is possible by making a normal Ably REST publish request using the libraries that are available for the language in use.
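As a rough illustration of that flow, the following is a minimal sketch of a cloud function that receives a batch of messages, applies some business logic, and publishes the results back with a plain REST request. Several details here are assumptions rather than anything stated above: the shape of the incoming event (a `messages` list of objects with `name` and `data`), the `transform` logic, the `ABLY_API_KEY` and `TARGET_CHANNEL` environment variables, and the REST endpoint and basic-auth scheme, which should be checked against the Ably REST API reference. In practice you would more likely use the Ably client library for your language than hand-built HTTP requests.

```python
import base64
import json
import os
import urllib.parse
import urllib.request

# Assumed REST endpoint; verify against the Ably REST API reference.
ABLY_REST_URL = "https://rest.ably.io"


def transform(message):
    """Hypothetical business logic: drop empty payloads, upper-case the rest."""
    data = message.get("data")
    if not data:
        return None  # filtered out
    return {"name": message.get("name"), "data": str(data).upper()}


def build_publish_request(channel, messages, api_key):
    """Build (but do not send) a REST publish request for the target channel."""
    channel_path = urllib.parse.quote(channel, safe="")
    url = f"{ABLY_REST_URL}/channels/{channel_path}/messages"
    token = base64.b64encode(api_key.encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(messages).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def handler(event, context=None, send=urllib.request.urlopen):
    """Entry point: filter and transform incoming messages, then forward them.

    `send` is injectable so the function can be exercised without network access.
    """
    transformed = (transform(m) for m in event.get("messages", []))
    outgoing = [m for m in transformed if m is not None]
    if outgoing:
        request = build_publish_request(
            os.environ.get("TARGET_CHANNEL", "processed"),
            outgoing,
            os.environ["ABLY_API_KEY"],
        )
        send(request)
    return {"forwarded": len(outgoing)}
```

Keeping the request construction separate from the sending step makes the business logic easy to test locally before the function is deployed to a cloud provider.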
By providing integrations with externally-managed cloud functions instead of offering functions as an intrinsic part of Ably's message processing pipeline, we believe that we provide the best available mechanism to support these use-cases. We also provide the flexibility for developers to use the services they’re already working with while ensuring minimum latency. In the end, they can focus on their application rather than on infrastructure engineering. This rationale is based on multiple factors.
The capabilities of the execution environment
Effective provision of cloud functions depends on an execution environment for the function code that is highly reliable, performant, and scalable. Because cloud service providers specialize in these services, they are able to deliver an extremely effective technical solution.
As a result, the resources available from those providers surpass any home-grown solution, both in technical capability and in the breadth and scale at which they are deployed. Execution environments are supported for a wide range of languages, with best-in-class implementations, and cloud providers are resourced to support all of these, along with the ongoing development required to keep pace with new features and standards.
The nature of the developer and operator experience
Creating, deploying, operating and maintaining functions in production effectively is about much more than just being able to execute code. The code that runs in cloud functions needs to be supportable in the same way as any other production code: this goes for both developer productivity and operational effectiveness.
Developer productivity depends on the level of support for developers (including IDE integration and code debugging), for tools, and for ecosystems (including wider and newly emerging ones such as Zapier, IFTTT, Terraform, Pulumi, and Serverless Framework). Operational effectiveness depends on the ability to monitor function execution at scale, including having visibility of latency, throughput, and errors, and the tooling to manage them.
And finally, developers should get to choose the languages they want to work in, and cloud vendors typically support all popular languages.
Performance
One issue that is often raised with this approach is the apparent performance impact of having messages "leave" the Ably infrastructure to be handled elsewhere, before possibly being returned to the Ably message pipeline for onward delivery. The potential concern is that latency is added by having to divert a message into separate infrastructure in order to invoke the cloud function. Furthermore, there is a possible downside from the fact that the cloud function resource is managed independently of the message processing resource. These factors could introduce transit delays, and delays arising from the need to "warm" the cloud function capacity.
In practice, our experience of operating these services is that the reverse is true: the single most important factor in the overall performance of the pipeline is having the most effective possible function execution environment, and that is achieved by using the cloud service provider's primitives. Transit latencies between Ably and the function can be effectively eliminated by provisioning the function in the same region in which the messages are processed. Any remaining transit latency is easily dwarfed by the delays that a less effective execution environment would introduce.
Integrity
The Ably service provides a system in which our integrity assurances are clearly defined and communicated: if for any reason there are errors in any processing operation, they are made visible to the user of the API. For example, we provide the assurance that once a message has been accepted for processing, with an acknowledgement provided to the caller, then we guarantee the onward processing of the message.
Invoking a function at the time a message is ingested, but asynchronously relative to the message acceptance and acknowledgement, violates that principle, because any failure to invoke the function means that the message in question is left unhandled. The result is that onward processing is not possible because the function invocation failed, yet the caller is not told, because the acknowledgement has already been sent.
In the Ably solution we address this issue by allowing function invocation after a message has been accepted on a channel. Even if function invocation fails, messages are still durably part of the history of the source channel, and therefore it is always possible to ensure that onward processing can be retried, or business logic can be invoked to handle those messages in another way.
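The retry idea above can be sketched as follows. This is an illustrative outline only, not a description of how Ably implements retries: the `retry_unhandled` helper, the `handled_ids` bookkeeping, and the `invoke` callable are all hypothetical, and the sketch assumes that each message retrieved from the source channel's history carries a unique `id` field. Fetching the history itself is left out.

```python
def retry_unhandled(history, handled_ids, invoke, max_attempts=3):
    """Re-run business logic for messages whose function invocation failed.

    history      -- iterable of message dicts read from the source channel's history
    handled_ids  -- set of ids already processed successfully (updated in place)
    invoke       -- callable that applies the business logic to one message
    Returns the ids of messages that still fail after max_attempts tries.
    """
    still_failing = []
    for message in history:
        if message["id"] in handled_ids:
            continue  # already processed, nothing to retry
        for _ in range(max_attempts):
            try:
                invoke(message)
            except Exception:
                continue  # try again, up to max_attempts
            handled_ids.add(message["id"])
            break
        else:
            # Every attempt failed; report it so it can be handled another way.
            still_failing.append(message["id"])
    return still_failing
```

Because the messages remain in the channel history regardless of the outcome, anything returned in `still_failing` can be routed to alternative business logic rather than being lost.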
Business logic ownership
Aside from these technical factors, at Ably we believe that business logic is best managed in services under your ownership. This way you have the right level of authority in order to manage operations for that code; without that command it becomes very hard to address concerns such as the introduction of changes to functionality, the roles and security rights that the code runs with, and the implications of load and scale for the function itself and its dependencies.
Summary
The underlying principle of cloud-based APIs is that cloud services form ecosystems that work well together and allow developers to use best-in-class services. Ably performs its function, and other services exist to perform their specific functionality as well as possible. Dedicated function providers will always execute functions better than we can. Conversely, we’ll always provide better pub/sub messaging at the edge than dedicated function providers.
The key to building effective systems is to create the means of joining these services into a bigger whole. Our job is not to guess what developers need, or to constrain them by concentrating ever more functionality within a single service. Instead, it is to deliver a service layer that developers can depend on for their complete realtime needs, while providing a performant and reliable gateway to the best available cloud services for their requirements.