What is WebSub?
In a sentence: WebSub is an open protocol for distributed pub/sub communication. WebSub started life as PubSubHubbub, refined in the W3C Social Web Working Group, and subsequently published as a W3C Recommendation. It was initially an extension of the Atom and RSS protocols, serving the purpose of realtime event notifications from any data type. WebSub’s ‘USP’ is that it saves the client from having to fetch content over HTTP. Instead, WebSub will push the changes to the client, based on their subscriptions.
WebSub is being widely adopted to replace server polling (a relatively bandwidth heavy way of implementing realtime functionality), working within community-driven standards. A key advantage is that WebSub brings together publishers of and subscribers to realtime data, allowing content to be published and updated in a secure and trustable way. This makes it possible to avoid periodic requests to networks. Websub’s secure communication environment is provided through a Publisher-Hub-Subscriber data transfer chain. WebSub is a W3C proposed recommendation, in existence since April 2017. It is based on Publish/Subscribe pattern and on Topics.
According to W3C WebSub: “WebSub provides a common mechanism for communication between publishers of any kind of web content and their subscribers, based on HTTP webhooks. Subscription requests are relayed through hubs, which validate and verify requests. Hubs then distribute new and updated content to subscribers as and when it becomes available.”
How does WebSub work?
WebSub uses a publish-subscribe pattern. It can work with any data type over HTTP without any server polling.
The main three entities in this pattern are Publishers, Hubs, and Subscribers.
Publisher: The publisher is the producer party that generates the content and publishes it to the Internet.
Hub: Hub acts as an intermediary party between publisher and subscriber. It accepts the content from the publishers and pushes them to subscribers, based on what they are subscribed to.
Subscriber: Subscriber is the consumer party in WebSub pattern, They find the Hubs advertised by Publisher’s topic and subscribe to them to get notifications whenever the Publisher publishes new content to the Topic they are subscribed to.
Publishers publish new content or update changes to the Hubs, referencing HTTP headers that contain the Topic’s information. Now, the Subscribers who are subscribing to these topics via Hubs will get served with the relevant content securely through the HTTP Post call from Hubs. Subscribers will have securely shared the HTTP end-point to Hubs at the time of subscription. The process for this is as below.
WebSub flow diagram with the main three entities: Publishers, Hubs, and Subscribers.
Subscribers discover the Hub of a topic URL and make a POST to one or more of the advertised hubs in order to receive updates when the topic changes.
Publishers notify their Hub(s) URLs when their topic(s) change.
When the Hub identifies a change in the topic, it sends a content distribution notification to all registered subscribers.
Thus the providers, hubs and the subscribers jointly generate the WebSub nexus. The above diagram shows the visual information flow of the WebSub pattern. The steps below show how Subscribers and Hubs actually share the information securely with each other.
1. The subscriber sends a subscription request to Hub
To subscribe to the topic, the subscriber sends a form-encoded POST request to the Hub with the following parameters:
hub.mode – The string “subscribe”
hub.topic – The URL with the content subscribers are subscribing to,
hub.callback – The URL that subscribers want the hub to send notifications to, so it must be a publicly-accessible URL.
Here is the code snippet of subscribe request in Nodejs.
var request = require("request");
var options = {
method: 'POST',
url: 'https://pubsubhubbub.superfeedr.com',
headers:
{
'Content-Type': 'application/x-www-form-urlencoded'
},
form:
{
'hub.mode': 'subscribe',
'hub.topic': 'http//topic.feeder.com/subscribe/to',
'hub.callback': 'http://subscriber.server.com/your/callback'
} };
request(options, function (error, response, body) {
if (error) throw new Error(error);
console.log(body);
});
It is a good idea to use a unique URL for each subscription so you know what feed is updated when you receive a notification. The Hub will reply with either
`202 Accepted` or
`400 Bad Request`.
If there were any problems with the request, if the subscriber attempts to subscribe to a topic that does not exist, for example, the Hub will return 400 Bad Request and a plain text description of the error. If the Hub accepted the subscription request, it will return 202 Accepted and will then attempt to verify the subscription request.
2. The Hub verifies the subscription request (with built-in security)
After sending the subscription request, the Hub will make a separate request back to the subscriber’s callback URL for verification purposes. It is necessary to avoid arbitrary subscriptions by the attackers. It’ll be a simple GET request with the below query params.
hub.mode – The string “subscribe”
hub.topic – The topic URL from the subscription request
hub.challenge – A hub-generated random string that must be echoed by the subscriber to verify the subscription
hub.lease_seconds – The hub-determined number of seconds that the subscription will stay active. After which the subscriber will need to subscribe again.
To confirm the subscription, Subscribers will need to reply with 200 OK and a request body of exactly the string provided in the “hub.challenge” parameter in the request. The response body should not contain anything else and is not form-encoded, just the plain string.
To reject the request, reply with 404 Not Found and an empty body. Any response other than 200 OK will indicate to the Hub that the subscription is rejected and you will not receive notification of new content.
To unsubscribe the subscription just follow the step 1 & 2 with hub.mode = “unsubscribe”
Now the subscription is created and the subscriber will get the contents through the callback URL provided at the time of subscription via the POST method.
WebSub – use case
WebSub is useful in applications or implementations where content changes frequently and frequent updates are required. As such, it forms a basic part of the internet infrastructure, especially the ever-increasing portion that plays out in realtime. Areas where WebSub is particularly useful include news aggregator platforms (CNN), stock exchanges, air traffic networks, analytics platforms, RSS like feeds, weather information providers, etc. In all of these examples fetching content via polling would not be good practice. WebSub is useful as Hubs distribute the latest status to all the subscribed nodes as soon as hub receives the information from the publishers. Using WebSub, subscribers can rest assured that they are always in sync with the most up-to-date data.
Straight out – an area where WebSub is not useful is where you have private or sensitive information. Hubs push the contents to all the subscribers who have subscribed to the topics and publishers share their content with Hubs, so here publishers and subscribers are completely unknown to each other. Private information shared by publishers may reach hundreds of unknown subscribers.
Big names using WebSub
Following are the companies list who are using WebSub for their services.
Publishers
Big name publishers relying on WebSub to make sure published content is received by people who want to read it include: Google (which uses WebSub for most of its content services); CNN; LiveJournal; MySpace; which uses WebSub to control its news feed function), and Wordpress (the world’s largest open source blogging platform, which publishes all their blogs and articles on WebSub).
Subscribers
Examples of companies relying on WebSub subscription systems include: InFeeds.com (an online directory for important stories and ideas shared by people, run on the basis of Google Reader shared items); Netvibes (a personal dashboard to manage and review media from a variety of different channels, using WebSub to get the most relevant media content from the maximum channels).
Hub
Hubs are the center party in WebSub networks, both the Publishers and Subscribers communicate with the Hubs to publish and subscribe the topics’ content.
Here is a list of some popular Hubs.
Superfeedr provides feeds in realtime, It performs as an open Hub in WebSub network, pubsubhubbub.superfeedr.com
WebSub – challenges
The solution can be more technologically challenging on the hub side than on the data providers’ side. With the Hub the central atom of the whole WebSub system, Hubs need to be better prepared to address more complex technical implementation. If the hub goes down the WebSub network stops functioning.
Generally, the development of in-house WebSub system is complex to architect and demands high maintenance. Development considerations for added data distribution functionality include connection state recovery, stability, high performance, and cost-effectiveness. Scalability is another factor when traffic is unpredictable. The network must be scalable to handle the sudden high traffic.
These considerations are helpful to bear in mind when taking the decision to develop in-house solution or buy/use existing solutions.
WebSub and Ably
Ably is a realtime messaging platform. We deliver billions of realtime messages every day to more than 50 million end-users across web, mobile, and IoT platforms. We support a wide array of communication protocols, including open protocols, such as WebSub.
Developers use Ably’s 40+ SDKs to build realtime capabilities in their apps with our multi-protocol pub/sub messaging, presence, push notifications, free streaming data sources from across industries like transport and finance, and integrations that extend Ably into third-party clouds and systems like AWS Kinesis and RabbitMQ.
Businesses use Ably to distribute their streaming data to other businesses. This allows them to offload the engineering and data delivery challenges of providing data streams that are performant, reliable, and available to consume with various protocols.
Both businesses and developers choose to build on Ably because we provide the only realtime platform architected around four pillars of dependability: performance, high availability, reliability, and integrity of data. This allows our customers to focus on their code and data streams while we provide unrivaled quality of service, fault-tolerance, and scalability.
References and further reading
Ably Engineering blog, which covers a wide spectrum of engineering challenges related to building a globally-distributed serverless realtime infrastructure at scale
Recommended Articles
Socket.IO vs SockJS
Compare realtime libraries Socket.IO and SockJS on performance, scalability, developer experience, and features.
Scaling realtime apps with Golang and WebSockets
Learn how to build dependable realtime apps that scale with Golang and WebSockets, and the associated engineering challenges.
Scaling Socket.IO - practical considerations
A review of Socket.IO’s advantages, limitations & performance. Learn about the challenges of using Socket.IO to deliver realtime apps at scale.