Alex Dean edited this page Nov 2, 2012 · 23 revisions

Frequently Asked Questions

  1. Is SnowPlow real-time?
  2. Does SnowPlow have a graphical user interface to enable me to analyse and visualise web analytics data?
  3. What's next on the roadmap?
  4. I want to use SnowPlow but not Amazon CloudFront - how?
  5. How can I contribute to SnowPlow?
  6. Does implementing SnowPlow impact the performance of my site e.g. page load times?
  7. Does SnowPlow use first- or third-party cookies?
  8. Is SnowPlow IPv6 compliant?
## Is SnowPlow real-time?

No, SnowPlow is not currently a real-time analytics solution. This is because SnowPlow depends on CloudFront's access logging capability, and it can take 20-60 minutes (and sometimes even longer) for CloudFront access logs to appear in your Amazon S3 logging bucket. This makes the current version of SnowPlow better suited to "after the fact", batch-based analysis.
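
To give a feel for what batch analysis over these logs involves, here is a minimal TypeScript sketch that parses the (already decompressed) text of a CloudFront access log file once it has landed in S3. It reads the field names from the log's own `#Fields:` header rather than hard-coding them. The sample rows, the `/pixel.gif` path and the `page` and `uid` query parameters are invented for illustration and are not SnowPlow's actual tracking format; the idea that event data travels in the request's query string is also an assumption made for this sketch.

```typescript
// One parsed log row, keyed by the field names found in the "#Fields:" header.
type LogRecord = Record<string, string>;

// Parse the text of a CloudFront access log: header fields are space-separated,
// data rows are tab-separated, and other lines starting with "#" are skipped.
function parseAccessLog(contents: string): LogRecord[] {
  let fields: string[] = [];
  const records: LogRecord[] = [];
  for (const rawLine of contents.split("\n")) {
    const line = rawLine.replace(/\r$/, "");
    if (line.indexOf("#Fields:") === 0) {
      fields = line.slice("#Fields:".length).trim().split(/\s+/);
    } else if (line !== "" && line.charAt(0) !== "#") {
      const values = line.split("\t");
      const record: LogRecord = {};
      fields.forEach((name, i) => {
        const value = values[i];
        record[name] = value === undefined ? "" : value;
      });
      records.push(record);
    }
  }
  return records;
}

// Decode a query string such as "page=Home&uid=abc" into key/value pairs.
// CloudFront writes "-" for an empty field, so that is treated as "no parameters".
function parseQuery(query: string | undefined): Record<string, string> {
  const params: Record<string, string> = {};
  if (query === undefined || query === "" || query === "-") {
    return params;
  }
  for (const pair of query.split("&")) {
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const value = eq === -1 ? "" : pair.slice(eq + 1);
    params[decodeURIComponent(key)] = decodeURIComponent(value);
  }
  return params;
}

// Made-up sample data: two requests for a hypothetical tracking pixel.
const sampleLog = [
  "#Version: 1.0",
  "#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query",
  ["2012-11-02", "10:15:00", "LHR5", "3712", "192.0.2.10", "GET", "d111111abcdef8.cloudfront.net",
   "/pixel.gif", "200", "http://example.com/", "Mozilla/5.0", "page=Home&uid=abc"].join("\t"),
  ["2012-11-02", "10:15:09", "LHR5", "3712", "192.0.2.10", "GET", "d111111abcdef8.cloudfront.net",
   "/pixel.gif", "200", "http://example.com/pricing", "Mozilla/5.0", "page=Pricing&uid=abc"].join("\t"),
].join("\n");

for (const record of parseAccessLog(sampleLog)) {
  const event = parseQuery(record["cs-uri-query"]);
  console.log(record["date"], record["time"], event["page"]);
}
```

Because the logs only arrive in S3 some time after the requests were made, a job like this can only ever run over data that is already 20-60 minutes old or more, which is why the current platform suits batch analysis rather than real-time reporting.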

The SnowPlow team are exploring other (non-CloudFront) architectures to support a real-time analytics capability alongside (not replacing) the current SnowPlow platform.

## Does SnowPlow have a graphical user interface to enable me to analyse and visualise web analytics data?

No, currently SnowPlow does not have a GUI. Analysts who want to query data collected by SnowPlow can use Hive, Pig or write MapReduce tasks in Java / Hadoop.

A number of companies are building GUIs that work on top of Hadoop. We are watching these developments closely, and hope to make it easy to integrate these front-ends with SnowPlow in the future, so that analysts less comfortable with tools such as Hive can use SnowPlow.

We are also looking at building GUIs to perform repeatable analyses that prove popular amongst the SnowPlow community. However, we do not believe in general-purpose GUIs for web analytics: the whole point of SnowPlow is to free the experienced analyst from the constraints of GUIs (with all their assumptions about how the analyst does and does not want to slice the data), so that analysts have maximum flexibility to slice, dice and model data to their heart's content.

## What's next on the roadmap?

Lots! We will shortly be open sourcing our SnowPlow-specific Hive deserializers; in the meantime you can get started with this general-purpose CloudFront Log Deserializer.

Also on the roadmap is releasing the first of the "recipes" for Hive analyses on SnowPlow's clickstream data.

On the ad serving analytics front, we are working on a micro-webserver to support redirection-based click-tracking, called SnowHusky.

## I want to use SnowPlow but not Amazon CloudFront - how?

SnowPlowing without CloudFront is on the roadmap: we are currently building an ultra-fast micro-webserver called SnowHusky, which you can use for impression tracking as well as redirection-based click tracking. SnowHusky is being actively developed and is not yet ready for production deployment; contact the SnowPlow team if you want to find out more about SnowHusky.

## How can I contribute to SnowPlow?

The SnowPlow team welcome contributions! The core team (funded by Keplar) is small, so we would love more people to join in and help realise our objective of building the world's most powerful analytics platform. Stay tuned for a more detailed update on how best you can contribute to SnowPlow.

## Does implementing SnowPlow impact the performance of my site e.g. page load times?

SnowPlow will have an impact on site performance, just as implementing any JavaScript-based tracking (e.g. another web analytics package) will. However, we have done everything we can to minimise the effect on site performance.

Pages tracked using SnowPlow have to load the SnowPlow.js file. Because this file is hosted on Amazon CloudFront, the time taken to load the JavaScript is minimised. In addition, users can choose between synchronous and asynchronous tracking tags: users who want to minimise the impact on page load times, for example, should use the asynchronous tags.
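
The difference between the two tag styles comes down to whether the browser pauses HTML parsing while the script downloads. As a rough illustration only, the TypeScript sketch below generates a plain `<script>` tag with or without the standard HTML `async` attribute. The `buildTrackerTag` helper and the example URL are hypothetical, and the output is not SnowPlow's actual tracking tag.

```typescript
type LoadMode = "sync" | "async";

// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build a <script> tag for a tracker file (hypothetical helper).
// "sync":  the browser stops parsing the page until the script has loaded and run.
// "async": the script downloads in parallel and runs once it is available.
function buildTrackerTag(src: string, mode: LoadMode): string {
  const asyncAttr = mode === "async" ? " async" : "";
  return `<script src="${escapeAttribute(src)}"${asyncAttr}></script>`;
}

// Hypothetical CloudFront URL, for illustration only.
const trackerUrl = "https://example.cloudfront.net/snowplow.js";

console.log(buildTrackerTag(trackerUrl, "sync"));
console.log(buildTrackerTag(trackerUrl, "async"));
```

With the asynchronous form, a slow download of the tracker delays only the tracking itself rather than the rendering of the page.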

## Does SnowPlow use first- or third-party cookies?

SnowPlow uses first-party cookies, which are generated by the SnowPlow tracking JavaScript running on your domain. Because our tracking pixel is served from CloudFront, we don't have the option to set an additional "third-party" cookie to join up user behaviours across multiple domains. We are exploring some workarounds for this - contact the SnowPlow team if you want to know more.

## Is SnowPlow IPv6 compliant?

IPv6 (Internet Protocol version 6) is a revision of the Internet Protocol (IP) which allows for far more addresses to be assigned than with the current IPv4. At the moment, SnowPlow tracking is not IPv6 compliant - for the simple reason that Amazon CloudFront is not yet IPv6 compliant. The AWS team have yet to announce any specific plans or timeline to support IPv6 - but you can request this support in the AWS usage survey.
