Distributed Checking FAQ

Updated March 22nd 2005

What is Distributed Checking?

Distributed Checking centrally coordinates the checking activity of multiple copies of Shrook subscribed to the same channel. This can keep each copy of Shrook updated within minutes of the channel itself updating, amongst other benefits.

Why not just check more often?

To keep each channel equally up to date without coordination, each individual copy of Shrook would have to load the channel at least every ten minutes. For users, this would obviously have an effect on the performance of their internet connection and computer. For popular channel providers, the hundreds or thousands of RSS readers connecting regularly, around the clock, already represent a huge financial burden (amongst other problems), as most are billed on the volume of data transferred to and from their computers. Checking more often would only make things worse.

So how does it work?

To oversimplify: A central server maintains a database of when each channel was last updated. To keep it up to date, every so often, the server chooses a computer to check for new items and report back. The frequency of this varies from every 5 minutes for popular channels, to every half hour for channels with only one online subscriber, and it tries to use a different computer each time. At the other end, each copy of Shrook checks in with the server every 5 minutes, and if any of its channels are out of date, it reloads them.

The exact behaviour is determined largely by the software on the server, so the system can be improved and refined without updating Shrook.
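
As a rough illustration of the description above, here is a minimal Python sketch of how such a coordinating server might answer check-ins. It is not Shrook's actual server code. The names Coordinator, ChannelState and probe_interval are hypothetical, and so is the interval formula: the FAQ only gives the two end points (5 minutes for popular channels, half an hour for a single online subscriber).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChannelState:
    last_updated: float = 0.0          # newest update time reported so far
    last_probe: float = float("-inf")  # when a client was last asked to check
    last_prober: Optional[str] = None  # who was asked, so the next pick can differ


def probe_interval(online_subscribers: int) -> float:
    """Assumed schedule: 30 minutes for one subscriber, shrinking to a 5 minute floor."""
    return max(5 * 60.0, 30 * 60.0 / max(online_subscribers, 1))


class Coordinator:
    """Hypothetical stand-in for the central server's timing database."""

    def __init__(self):
        self.channels = {}  # channel code -> ChannelState

    def check_in(self, client, code, client_has, now, online_subscribers):
        """Answer yes or no: should this client load the channel now?"""
        state = self.channels.setdefault(code, ChannelState())
        if client_has < state.last_updated:
            return True  # the client's copy is out of date
        due = now - state.last_probe >= probe_interval(online_subscribers)
        # Try to use a different computer each time, unless there is only one.
        if due and (client != state.last_prober or online_subscribers == 1):
            state.last_probe, state.last_prober = now, client
            return True  # chosen to check on behalf of the others
        return False

    def report(self, code, updated_at):
        """A chosen client reports when new items were last found."""
        state = self.channels.setdefault(code, ChannelState())
        state.last_updated = max(state.last_updated, updated_at)
```

In this sketch, a client that receives a yes simply loads the channel for itself. Only times pass through the coordinator, never content.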

Does the central server distribute content in any way?

No. Distributed Checking only controls the timing of when copies of Shrook look for new content. Traffic to and from the server consists solely of Shrook asking if it needs to check a particular channel, and the server supplying a yes or no answer. Shrook will then download content for itself, as any news reader would without Distributed Checking.

For Shrook Users

What kinds of channels does Distributed Checking work with?

Distributed Checking is designed for typical channels associated with news sites and blogs: any channel where items remain in the channel for a while and items are periodically added, removed or replaced, or even where all or most of the items are changed. Most channels work like this, so you rarely need to worry about switching it off. It even works fine when you're the only subscriber to a particular channel, as your own computer will check it every half hour.

However, results will be unpredictable with channels that (for example) represent real time statistical data, and you also shouldn't use it with channels hosted on localhost or an intranet. For channels that update very often, Distributed Checking may cause the channel to be reloaded more often than the user finds necessary. You can use the Show Info window in Shrook to disable the feature on a per channel basis.

Will it work behind a firewall or NAT router?

Yes. Distributed Checking uses standard outgoing HTTP requests on standard ports. If you can use the web, you can use Distributed Checking.

How do I activate it?

Set the check period to Automatic, either in Preferences, or for individual channels in the Show Info window.

Will my computer be used to check channels I don't personally use?


What about my privacy?

Instead of transmitting full URLs, Shrook calculates a short code number from the URL and sends that instead. It is for all practical purposes impossible to deduce a private URL from just its code. Additionally, the central server does not store a permanent list of who is subscribed to which channels, and any data obtained from the system will never be sold to a third party. Finally, no actual content is ever passed to the central server, only information about when the channel was last updated.
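
To give a feel for how a code can identify a channel without revealing its URL, here is a small Python sketch. The FAQ does not say which function Shrook uses, so the truncated SHA-256 digest below is purely an assumption for illustration, and channel_code is a hypothetical name.

```python
import hashlib


def channel_code(url: str, digits: int = 16) -> str:
    """Hypothetical stand-in for Shrook's code: a truncated SHA-256 hex digest.

    The same URL always gives the same code, so the server can match
    subscribers, but the code does not contain the URL itself.
    """
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:digits]


# Two copies of Shrook subscribed to the same URL compute the same code.
code = channel_code("http://example.com/private/feed.xml")
same = channel_code("http://example.com/private/feed.xml")
other = channel_code("http://example.com/other/feed.xml")
```

With a one-way function like this, someone holding only the code would have to guess candidate URLs and hash each one to find a match.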

shrook.com privacy policy

I'm not convinced. Can I disable it?

Yes. In Preferences, set Check For New Items to something other than Automatic. If you've manually set any of your channels to Automatic in the Show Info window, you'll need to change those too. If the feature is disabled on one computer, Shrook will no longer attempt to connect to the central server (unless you are using synchronization). If you enable it for some channels, absolutely no data about the other channels will be passed.

I sometimes see "Submitting results" in the Activity Viewer. What does it mean?

When your computer is chosen to check for new items on behalf of the others, it needs to pass the results back as soon as possible (this is the "and report back" phase described under "So how does it work?" above). It's only a few bytes of data (essentially just the dates of when new items were last found), and it's what makes the system work.

For RSS Publishers

I've found Shrook/x (Distributed; x users; http://www.fondantfancies.com/shrook/distfaq.php) in my server log. What does it mean?

Shrook is a desktop RSS aggregator for Mac OS X. It can be set up to use the system described above to check for new items, instead of checking at regular intervals like most aggregators do. The users count represents the number of Shrook users online in the last half hour subscribed to a particular URL. The Distributed section is added to the user agent string when one client is checking on behalf of the others, because the pattern of regular activity from disparate IP addresses may seem abnormal to some administrators.
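
If you want to pick these requests out of your logs, a short Python sketch along the following lines may help. It assumes the two x placeholders stand for a version string and a whole number, and it accepts "user" as well as "users" because the singular form is not documented here. The function name and the sample version "2.1" are made up for the example.

```python
import re

# Pattern for the user agent quoted in the question above. Both "x"
# placeholders are assumed to be a version token and a decimal count.
UA_PATTERN = re.compile(
    r"^Shrook/(?P<version>\S+) "
    r"\(Distributed; (?P<users>\d+) users?; (?P<faq>[^)]+)\)$"
)


def parse_shrook_agent(user_agent):
    """Return version, online user count and FAQ URL, or None if no match."""
    match = UA_PATTERN.match(user_agent)
    if match is None:
        return None
    return {
        "version": match.group("version"),
        "users": int(match.group("users")),
        "faq": match.group("faq"),
    }


# Sample string with an invented version number and user count.
sample = ("Shrook/2.1 (Distributed; 12 users; "
          "http://www.fondantfancies.com/shrook/distfaq.php)")
info = parse_shrook_agent(sample)
```

The users value gives a rough idea of how many online Shrook subscribers a single distributed request stands in for.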

The system causes a flurry of activity over a short period just after I update my site. Isn't it essentially a DDoS attack?

The number of people using Shrook online at the same time and subscribed to the same channel doesn't reach outside double figures. In addition, the activity is spread over a period of five minutes as different computers check in and find out about the update. The effect should not be noticeable.

What happens if the Shrook userbase grows?

Shrook is a shareware application for Mac OS X only, so its userbase will always be small. That said, if Shrook does grow to the point where a large number of people are subscribing to the same feed (and are also online at the same time), the system would need to be changed. For example, the central server can already perform many of the duties of a BitTorrent tracker. It not only knows when a particular channel was last updated, but which clients have up to date copies (although the current server software does not store this information). It would be very possible to build a peer-to-peer content-distribution network on top of a large Shrook userbase.

Can Distributed Checking save me bandwidth?

It can reduce overall bandwidth if you update rarely but have lots of subscribers, but reducing bandwidth is not its primary goal. However, the system is designed not to make things worse. If necessary, you can ask to have the system check on your channel less often - see below for contact details. Again, the Shrook userbase is small, so don't expect any major effects.

I still have questions

Email me. You can also get in touch if you want Shrook to behave differently with your feed, for example to save bandwidth.
