Tag: Analytics

Proxying Third-Party JavaScript as First-Party JavaScript (and the Potential Effect on Analytics)

First, check out how incredibly easy it is to write a Cloudflare Worker to proxy another URL:

addEventListener("fetch", (event) => {   event.respondWith(      fetch("https://css-tricks.com")   ); });

It doesn’t have any error handling or anything, but hey, it works:
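If you did want a touch of error handling, a minimal variant (my own sketch, not from the original Worker) could fall back to an error response when the upstream fetch fails:

addEventListener("fetch", (event) => {
  event.respondWith(
    fetch("https://css-tricks.com").catch(
      // Assumption: a 502 is a reasonable fallback if the origin is unreachable
      () => new Response("Upstream fetch failed.", { status: 502 })
    )
  );
});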

Now imagine how some websites give you a URL to JavaScript in order to do stuff. CodePen does this for our Embedded Pens feature.

That URL is:

https://cpwebassets.codepen.io/assets/embed/ei.js

I can proxy that URL just as easily:
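It's the same Worker pattern, just pointed at that URL:

addEventListener("fetch", (event) => {
  event.respondWith(
    fetch("https://cpwebassets.codepen.io/assets/embed/ei.js")
  );
});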

Doing nothing special, it even serves up the right content-type header and everything:

Cloudflare Workers gives you a URL for them, which is decently nice, but you can also very easily “Add a Route” to a worker on your own website. So, here I’ll make a URL on CSS-Tricks to serve up that Worker. Lookie lookie, it does just what it says it’s going to do:

CSS-Tricks.com serving up a JavaScript file that is actually just proxied from CodePen. I’m probably not going to leave this URL live, it’s just a demo.

So now, I could do…

<script src="/super-real-url/codepen-embeds.js"></script>

Right from css-tricks.com and it’ll load that JavaScript. It will look to the browser like first-party JavaScript, but it will really be proxied third-party JavaScript.

Why? Well nobody is going to block your first-party JavaScript. If you were a bit slimy, you could run all your scripts for ads this way to avoid ad blockers. I have mixed feelings there. I feel like if you wanna block ads you should be able to block ads without having to track down specific scripts on specific sites to do that. On the other hand, proxying some third-party resources sometimes seems kinda fine? Like if it’s your own site and you’re just trying to get around some CORS issue… that would be fine.

More in the middle is something like analytics. I recently blogged “Comparing Google Analytics and Plausible Numbers” where I discussed Plausible, a third-party analytics service that “is built for privacy-conscious site owners.” So, ya know, theoretically trustable and not third-party JavaScript that is terribly worrisome. But still, it doesn’t do anything to really help site visitors and is in the broad category of analytics, so I could see it making its way onto blocklists, thus giving you less accurate information over time as more and more people block it.

The default usage for Plausible is third-party JavaScript

But as we talked about, very few people are going to block first-party JavaScript, so proxying would theoretically deliver more accurate information. In fact, they have docs for proxying. It’s slightly more involved, and it’s over my head as to exactly why, but hey, it works.
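For the curious, the general shape of such a proxy Worker might look something like this. This is my own sketch, not Plausible's official snippet; the route paths and script URL are assumptions based on their docs:

addEventListener("fetch", (event) => {
  event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
  const url = new URL(request.url);

  // Serve the analytics script itself from our own domain
  if (url.pathname.endsWith("/js/script.js")) {
    return fetch("https://plausible.io/js/plausible.js");
  }

  // Forward the event endpoint, passing the original request (a POST) through
  if (url.pathname.endsWith("/api/event")) {
    return fetch("https://plausible.io/api/event", request);
  }

  return new Response("Not found", { status: 404 });
}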

I’ve done this proxying as a test. So now I have data from just using the third-party JavaScript directly (from the last article):

Metric          | Plausible (No Proxy) | Google Analytics
Unique Visitors | 973k                 | 841k
Pageviews       | 1.4m                 | 1.5m
Bounce Rate     | 82%                  | 82%
Visit Duration  | 1m 31s               | 1m 24s
Data from one week of non-proxied third-party JavaScript integration

And can compare it to an identical-in-length time period using the proxy:

Metric          | Plausible (Proxy) | Google Analytics
Unique Visitors | 1.32m             | 895k
Pageviews       | 2.03m             | 1.7m
Bounce Rate     | 81%               | 82%
Visit Duration  | 1m 35s            | 1m 24s
Data from one week of proxied third-party JavaScript integration

So the data strongly suggests that the proxied setup is far less “blocked” than even out-of-the-box Plausible. The week tested was 6%¹ busier according to the unchanged Google Analytics. Based on what happened with the non-proxied setup, I would have expected to see 15.7% more unique visitors than Google Analytics reported that week (meaning 1.16m), but instead I saw 1.32m, so the proxy demonstrates a solid 13.8% increase in seeing unique visitors versus a non-proxy setup. And comparing the proxied Plausible setup to Google Analytics directly shows a pretty staggering 32% more unique visitors.

With the non-proxied setup, I actually saw a decrease in pageviews (-6.6%) on Plausible compared to Google Analytics, but with the proxied setup I’m seeing 19.4% more pageviews. So the numbers are pretty wishy-washy but, for this website, suggest something in the ballpark of 20-30% of users blocking Google Analytics.

  1. I always find it so confusing to figure out the percentage increase between two numbers. The formula that ultimately works for my brain is (final − initial) / initial × 100.


Embedded Analytics Made Simple With Cumul.io Integrations

Browse through SaaS communities on Twitter, LinkedIn, Reddit, Discord, you name it, and you'll see a common theme appear in many of them. That theme can go by many names: BI, analytics, insights and so on. It's natural: we do business, collect data, we succeed or fail. We want to look into all of that, make some sense of the data we have, and take action. This need has produced many projects and tools that make the lives of anyone who wants to look into the data just a bit easier. But, when humans have, humans want more. And in the world of BI and analytics, “more” often comes in the form of embedding, branding, customized styling, access rights and so on, which ends up meaning more work for developers and more time to account for. So, naturally, there has been a need for BI tools that will let you have it all.

Let’s make a list of challenges you may face as the builder and maintainer of these dashboards:

  1. You want to make the dashboards available to end users or viewers from within your own application or platform
  2. You want to be able to manage different dashboard collections (i.e. “integrations”)
  3. You want to be able to grant specific user rights to a collection of dashboards and datasets
  4. You want to make sure users have access to data only relevant to them

Cumul.io provides a tool we call Integrations which helps solve these challenges. In this article I'll walk you through what integrations are, and how to set one up. The cool thing is that, for most of the points above, minimal code is required; for the most part, everything can be set within the Cumul.io UI.

Some Background — Integrations

An Integration in Cumul.io is a structure that defines a collection of dashboards intended to be used together (e.g. in the same application). It is also what we use to embed dashboards into an application. In other words, to embed dashboards into an application, we give the application access to the integration that they belong to. You can associate dashboards to an integration and administrate what type of access rights the end users of the integration will have on these dashboards and the datasets they use. A dashboard may be a part of multiple integrations, but it may have different access rights on different integrations. When it comes to embedding, there are a number of SDKs available to make life simple regardless of what your stack looks like. 😊

Once you have a Cumul.io account and if you are an “owner” of an organization in Cumul.io, you will be able to manage and maintain all of your integrations via the Integrations tab. Let’s have a look at an example Cumul.io account. Below you can see the Dashboards that one Cumul.io user might have created:

Although these are all the dashboards this user may have created, it’s likely that not all dashboards are intended for the same end-users, or application for that matter. So, the owner of this Cumul.io account would create and maintain an Integration (or more!) 💪 Let’s have a look at what that might look like for them:

So, it looks like the owner of this Cumul.io account maintains two separate applications.

Now let’s see what the process of creating an integration and embedding its dashboards into an application would look like. The good news is, as mentioned before, a lot of the steps you will have to take can be done within the Cumul.io UI.

Disclaimer: For the purposes of this article, I’ll solely focus on the Integration part. So, I’ll be skipping everything to do with dashboard creation and design and we will be starting with a pre-made set of imaginary dashboards.

What we will be doing:

  1. Creating an integration
  2. Embedding its dashboards into an application

Creating an Integration

For simplicity, let’s only create one integration for now. Let’s imagine we have an analytics platform that we maintain for our company. There are three dashboards that we want to provide to our end-users: the Marketing Dashboard, the Sales Dashboard and the Leads Dashboard.

Let’s say that out of all the dashboards this account has created or has access to, for this particular project they want to use only the following:

New Integration

To create the integration, we go to the Integrations tab and select New Integration. The dialogue that pops up will already give you some idea of what your next steps will be:

Selecting Dashboards

Next up, you will be able to select which of your dashboards will be included in this integration. You will also be able to give the Integration a name, which here I’ve decided will appropriately be “Very Important Integration”:

Once you confirm your selection, you will have the option of defining a slug for each dashboard (highly recommended). These can later be used while embedding the dashboards into your application. You will later see that slugs make it easy to reference dashboards in your front-end code, and make it easier to replace dashboards if needed too (as you won’t need to worry about dashboard IDs in the front-end code).

Access Rights

You will then get to set the integration's access rights for the datasets its dashboards use. Here we set this to “Can view.” For more info on access rights and what they entail, check out our documentation on associating datasets to integrations:

Filters and Parameters (and Multi-Tenant Access)

Side Note: To help with multi-tenant access — which would make sense in this imaginary set up — Cumul.io makes it possible to set parameters and filters on datasets that a dashboard uses. This means that each user that logs into your analytics platform would only see the data they personally have access to in the dashboards. You can imagine that in this scenario access would be based on which department the end user works for in the company. For more on how to set up multi-tenancy with Cumul.io, check out our article, “Multi-Tenancy on Cumul.io Dashboards with Auth0”. This can be done within the dashboard design process (which we are skipping), which makes it easier to visualize what the filters are doing. But here, we will be setting these filters in the Integration creation process.

Here, we set the filters the datasets might need to have. In this scenario, as we filter based on the users’ departments, we define a department parameter and filter based on that:

And voilà! Once you’re done with setting those, you have successfully created an integration. The next dialogue will give you instructions for what will be your next steps for embedding your integration:

Now you’ll be able to see this brand new Integration in your Integration tab. This is also where you will have quick access to the Integration ID, which will later be used for embedding the dashboards.

Good news! After your Integration is created, you can always edit it. You can remove or add dashboards, change the slugs of dashboards or access rights too. So you don’t have to worry about creating new integrations as your application changes and evolves. And as editing an integration is all within the UI, you won’t need to worry about having a developer set it all up again. Non-technical users can adapt these integrations on the go.

Embedding Dashboards

Let's see where we want to get to. We want to provide the dashboards within a custom app. Simple: a user logs into an app, the app has dashboards, and they see the dashboards with the data they're allowed to see. It could look like the following, for example:

Someone had a very specific vision on how they wanted to provide the dashboards to the end user. They wanted a sidebar where they could flip through each of the dashboards. It could have been something completely different too. What we will focus on is how we can embed these dashboards into our application regardless of what the host application looks like.

Cumul.io comes with a set of publicly available SDKs. Here I’ll show you what you would do if you were to use the Node SDK. Check out our developer docs to see what other SDKs are available and instructions on how to use them.

Step 1: Generate SSO Tokens For Your End Users

Before you can generate SSO tokens for your end users, you will have to make sure that you create an API key and token in Cumul.io. You can do this from your Cumul.io Profile. It should be the organization owner with access to the integration that creates and uses this API key and token to make the SSO authorization request. Once you’ve done this, let’s first create a Cumul.io client which would be done in the server side of the application:

const Cumulio = require("cumulio");

const client = new Cumulio({
  api_key: '<YOUR API KEY>',
  api_token: '<YOUR API TOKEN>',
});

Now we can create the SSO token for the end user. For more information on this API call and the required fields check out our developer documentation on generating SSO tokens.

let promise = client.create('authorization', {
  integration_id: '<THE INTEGRATION ID>',
  type: 'sso',
  expiry: '24 hours',
  inactivity_interval: '10 minutes',
  username: '< A unique identifier for your end user >',
  name: '< end-user name >',
  email: '< end-user email >',
  suborganization: '< end-user suborganization >',
  role: 'viewer',
  metadata: {}
});

Here, notice how we have added the optional metadata field. This is where you can provide the parameters and values you want to filter the dashboards' datasets on. In the example we've been going through, we've been filtering based on department, so we would add this to the metadata. Ideally you would get this information from the authentication provider you use. See a detailed explanation of how we've done this with Auth0.

This request will return a JSON object that contains an authorization id and token which is later used as the key/token combination to embed dashboards in the client-side.

Something else you can optionally add here, which is pretty cool, is a CSS property. This would allow you to define a custom look and feel for each user (or user group). For the same application, this is what the Marketing Dashboard could look like for Angelina vs. Brad:

Step 2: Embed

We jumped ahead a bit there. We created SSO tokens for end users but we haven’t yet actually embedded the dashboards into the application. Let’s have a look at that. First up, you should install and import the Web component.

import '@cumul.io/cumulio-dashboard';

After importing the component you can use it as if it were an HTML tag. This is where you will embed your dashboards:

<cumulio-dashboard
  dashboardId="< a dashboard id >"
  dashboardSlug="< a dashboard slug >"
  authKey="< SSO key from step 1 >"
  authToken="< SSO token from step 1 >">
</cumulio-dashboard>

Here you have a few options. You can either provide the dashboard ID for any dashboard you want to embed, or you can provide the dashboard slug we defined in the integration setup (which is why I highly recommend defining slugs; it's much more readable doing it this way). For more detailed information on how to embed dashboards, you can also check out our developer documentation.

A nice way to do this step is of course just defining the skeleton of the dashboard component in your HTML file and filling in the rest of it from the client side of your application. I’ve done the following, although it’s of course not the only way:

I’ve added the dashboard component with the ID dashboard:

<cumulio-dashboard id="dashboard"></cumulio-dashboard>

Then, I’ve retrieved this component in the client code as follows:

const dashboardElement = document.getElementById("dashboard");

Then I request the SSO token from the server side of my application which returns the required key and token to add to the dashboard component. Let’s assume we have a wrapper function getDashboardAuthorizationToken() that does this for us and returns the response from the server-side SSO token request. Next, we simply fill in the dashboard component accordingly:

const authorizationToken = await getDashboardAuthorizationToken();
if (authorizationToken.id && authorizationToken.token) {
  dashboardElement.authToken = authorizationToken.token;
  dashboardElement.authKey = authorizationToken.id;
  dashboardElement.dashboardSlug = "marketing|sales|leads";
}

Notice how in the previous steps I chose to define slugs for the dashboards that are part of this integration. This means I can avoid looking up dashboard IDs and adding dashboardId as one of the parameters of the dashboardElement. Instead I can just provide one of the slugs marketing, sales or leads and I'm done! Of course, you would have to set up some sort of selection process in your application to decide where and when you embed which dashboard.

That's it, folks! We've successfully created an integration in Cumul.io and, in a few lines of code, we've been able to embed its dashboards into our application 🎉 Now imagine a scenario where you have to maintain multiple applications at once, either for the same company or for separate ones. Whatever your scenario, imagine having a number of dashboards, each of which has to go to a different place and carry different access rights depending on where it is, and on and on we go. It can quickly get out of hand. Integrations allow you to manage this in a simple and neat way, all in one place, and as you can see, mostly from within the Cumul.io UI.

There's a lot more you can do here which we haven't gone through in detail, such as adding user-specific custom themes and CSS. We also didn't go through how you would set parameters and filters in dashboards, or how you would use them from within your host application so that you have a multi-tenant setup. Below you can find some links to useful tutorials and documentation for these steps if you are interested.



WooCommerce + Google Analytics

Google Analytics is powerful analytics software. A common way to use it is to just slap the JavaScript snippet on every page template you have and let it collect basic data about unique visitors and pageviews and such. That’s useful, but it’s also the bare minimum. Say there is an important button on your site. Leveling up, you could send custom events to track users clicking on that button. Those are the analytics that matter the most.
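For example, here's a hedged sketch of what such a custom event could look like with gtag.js (the button selector and event names are made up for illustration):

// Assumes the standard gtag.js snippet is already on the page.
// Report clicks on an important button as a custom event.
document.querySelector("#buy-button").addEventListener("click", () => {
  gtag("event", "buy_button_click", {
    event_category: "engagement",
    event_label: "homepage buy button",
  });
});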

Further down that road is tracking eCommerce analytics. This is extra-tricky, as it requires sending events to Google Analytics for sales, instances of adding/removing things from the cart, views on products… all sorts of stuff. If you don't do all that (and do it right), you don't get good analytics information.

Yet another reason I like WooCommerce! Instead of this analytics integration being a monumental effort and a substantial bit of technical debt to maintain, you just install the WooCommerce Google Analytics plugin and… that’s it. Also: it’s free.

I’ve had this integrated for months right here on CSS-Tricks, and I can confirm:

  1. It was close to zero effort.
  2. It just works.
The plugin installed and activated!
The one bit of config is adding this ID, which is easy to find in Google Analytics, your own code, or another Google Analytics plugin.

Now, to be clear, WooCommerce has its own analytics built right in. If what you are interested in is sales reports and top sellers and stuff like that, those are the dashboards I’d be looking at. But there are some things that the built-in WooCommerce analytics just don’t do. For example, I can check out the sales funnel on Google Analytics now:

30 days of traffic starting from all unique visitors to sessions where they actually bought something.


CSS-Tricks isn’t exactly an eCommerce-focused website, so the funnel there starts super wide and gets super (super) tiny — but hey, at least I can confirm that and see it with my own eyes. Plus, I can glean some insights here, like the fact that 66/70 people completed checkout once they got there (pretty good), but only 70/525 even proceeded to checkout after adding to cart, so I’m losing a lot of people at that stage.

Here is some more interesting data that only Google Analytics knows:

Of 66 sales, 56 of them came from returning visitors, not new visitors. So people tend to not buy on first look, but do come back later. I’m not sure if that means I should be making things more enticing for those new visitors or if I should lean into reminding people about it after they’ve looked at it. Either way, now I know because I have the data.

There is data in the WooCommerce analytics that I’d normally have to go to Google Analytics to see. I can see individual orders. I can see what the top sellers are and compare product sales over different time periods. All useful stuff, and you might appreciate having all this in one place.

Again, my favorite part about this is having all this data. It feels like it should have been hard-won to get, but all it took was clicking a few buttons. That’s why I never regret just doing things the standard WordPress and WooCommerce way! Things tend to just work!



Comparing Google Analytics and Plausible Numbers

I saw this blog post the other day: 58% of Hacker News, Reddit and tech-savvy audiences block Google Analytics. That’s an enticing title to me. I’ve had Google Analytics on this site literally from the day I launched it. While we tend to see some small growth each year, I’m also aware that the ad-blocking usage (and general third-party script blocking) goes up over time. So maybe we have more growth than we can easily visualize in Google Analytics?

The level of Google Analytics blockage varies by industry, audience, the device used and the individual website. In a previous study, I’ve found that less than 10% of visitors block Google Analytics on foodie and lifestyle sites but more than 25% block it on tech-focused sites.

Marko Saric

Marko looked at three days of data on his own website when he had a post trending on both Hacker News and Reddit, and that’s where the 58% came from. Plausible analytics reported 50.9k unique visitors and Google Analytics reported 21.1k. (Google Analytics calls them “Users.”)

I had to try this myself. If we’re literally getting double the traffic Google Analytics says we are, I’d really like to know that. So for the last week, I’ve had Plausible installed on this site (they have a generous unlimited 30-day free trial).
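For reference, installing Plausible amounts to a single script tag along these lines (a sketch; check their docs for the current snippet, as the script filename has changed over time):

<script defer data-domain="css-tricks.com" src="https://plausible.io/js/plausible.js"></script>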

Here’s the Plausible data:

And here’s the Google Analytics data for the same exact period:

It’ll be easier to digest in a table:

Metric          | Plausible | Google Analytics
Unique Visitors | 973k      | 841k
Pageviews       | 1.4m      | 1.5m
Bounce Rate     | 82%       | 82%
Visit Duration  | 1m 31s    | 1m 24s

So… the picture isn’t exactly clear. On one hand, Plausible reports 15% more unique visitors than Google Analytics. Definitely not double, but more! But then, as far as raw traffic is concerned (which actually matters more to me as a site that has ads), Plausible reports 5% less traffic. So it’s still fairly unclear to me what the real story is.

I do think Plausible is a pretty nice bit of software. Easy to install. Simple and nice charts. Very affordable.

Plausible is also third-party JavaScript

I'm sure fewer people block Plausible's JavaScript since, basically, it's less known and not Google. But it's still third-party JavaScript and just as easily blockable as anything else. It's not a fundamentally different way to measure traffic.

What is fundamentally different is something like Netlify Analytics, where the data comes from server logs that are not blockable (we’ve mentioned this before). That has its own issues, like the fact that Netlify counts bots and Google Analytics doesn’t. Maybe the best bet is something like server-side Google Analytics? That’s just way more technical debt than I’m ready for. Certainly a lot harder than slapping a script tag on the page.

Multiple sources

It felt weird to have multiple analytics tracking on the site at the same time. If I was doing things like tracking custom events, that would get even weirder. If I was going to continue to do it all client-side, I’d probably reach for something like Analytics which is designed to abstract that. But I prefer the idea of sending the data from the client once, and then if it has to go multiple places, do that server-side. That’s basically the entire premise of Segment, which is expensive, but you can self-host or probably write your own without terribly too much trouble. Just seems smart to me to keep less data going over the wire, and having it be first-party JavaScript.



How to Build a GraphQL API for Text Analytics with Python, Flask and Fauna

GraphQL is a query language and server-side runtime environment for building APIs. It can also be considered the syntax you write in order to describe the kind of data you want from APIs. What this means for you as a backend developer is that with GraphQL, you are able to expose a single endpoint on your server to handle GraphQL queries from client applications, as opposed to the many endpoints you'd need to create to handle specific kinds of requests with REST and in turn serve data from those endpoints.

With REST, if a client needs new data that is not already provisioned, you'd need to create new endpoints for it and update the API docs. GraphQL makes it possible to send queries to a single endpoint; these queries are then passed on to the server to be handled by the server's predefined resolver functions, and the requested information is provided over the network.

Running Flask server

Flask is a minimalist framework for building Python servers. I always use Flask to expose my GraphQL API to serve my machine learning models. Requests from client applications are then forwarded by the GraphQL gateway. Overall, microservice architecture allows us to use the best technology for the right job and it allows us to use advanced patterns like schema federation. 

In this article, we will start small with an implementation of the so-called Levenshtein distance. We will use the well-known NLTK library and expose the Levenshtein distance functionality with the GraphQL API. In this article, I assume that you are familiar with basic GraphQL concepts like building GraphQL mutations.

Note: We will be working with a free and open source example repository.

In the project, Pipenv was used for managing the Python dependencies. If you are located in the project folder, we can create our virtual environment with this:
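# assumed commands; the repository may use slightly different ones
pipenv shell
pipenv install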

…and install dependencies from Pipfile.

We usually define a couple of script aliases in our Pipfile to ease our development workflow.
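Something along these lines (the alias names and entry point are my own assumptions, not the repository's actual Pipfile):

# Pipfile
[scripts]
dev = "python app.py"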

This allows us to run our dev environment easily with a command alias as follows:
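(using the hypothetical dev alias from above)

pipenv run dev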

The Flask server should then be exposed by default at port 5000. You can immediately move on to the GraphQL Playground, which serves as an IDE for live documentation and query execution for GraphQL servers. GraphQL Playground uses so-called GraphQL introspection to fetch information about our GraphQL types. The following code initializes our Flask server:
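A minimal sketch, assuming the flask-graphql package (the actual repository may wire this up differently):

from flask import Flask
from flask_graphql import GraphQLView

from schema import schema  # the Graphene schema defined later in this article

app = Flask(__name__)

# Expose a single /graphql endpoint; graphiql=True serves an in-browser IDE
app.add_url_rule(
    "/graphql",
    view_func=GraphQLView.as_view("graphql", schema=schema, graphiql=True),
)

if __name__ == "__main__":
    app.run(port=5000, debug=True)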

It is good practice to use a WSGI server when running a production environment. Therefore, we have to set up a script alias for gunicorn with:
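# Pipfile (hypothetical module:app names)
[scripts]
prod = "gunicorn app:app"

…which then runs with pipenv run prod.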

Levenshtein distance (edit distance)

The Levenshtein distance, also known as edit distance, is a string metric. It is defined as the minimum number of single-character edits needed to change one character sequence $a$ into another one $b$. If we denote the lengths of such sequences $|a|$ and $|b|$ respectively, we get the following:

$$
\operatorname{lev}_{a,b}(i,j) =
\begin{cases}
\max(i,j) & \text{if } \min(i,j) = 0,\\[4pt]
\min
\begin{cases}
\operatorname{lev}_{a,b}(i-1,j) + 1\\
\operatorname{lev}_{a,b}(i,j-1) + 1\\
\operatorname{lev}_{a,b}(i-1,j-1) + 1_{(a_i \neq b_j)}
\end{cases} & \text{otherwise,}
\end{cases}
$$

where $1_{(a_i \neq b_j)}$ is the indicator function equal to 0 when $a_i = b_j$ and 1 otherwise, and $\operatorname{lev}_{a,b}(i,j)$ is the distance between the first $i$ characters of $a$ and the first $j$ characters of $b$. For more on the theoretical background, feel free to check out the Wiki.

In practice, let's say that someone misspelt ‘machine learning’ and wrote ‘machinlt lerning’. We would need to make the following edits:

Edit | Edit type    | Word state
0    | –            | Machinlt lerning
1    | Substitution | Machinet lerning
2    | Deletion     | Machine lerning
3    | Insertion    | Machine learning

For these two strings, we get a Levenshtein distance equal to 3. The Levenshtein distance has many applications, such as spell checkers, correction systems for optical character recognition, or similarity calculations.

Building a GraphQL server with Graphene in Python

We will build the following schema in our article:
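Plausibly something like this in SDL, reconstructed from the mutation and variables used below (field names beyond s1 and s2 are assumptions):

type Query {
  healthcheck: String
}

input LevenshteinInput {
  s1: String!
  s2: String!
}

type Mutation {
  levenshtein(input: LevenshteinInput!): Int
}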

Each GraphQL schema is required to have at least one query. We usually define our first query in order to health check our microservice. The query can be called like this:

query {
  healthcheck
}

However, the main function of our schema is to enable us to calculate the Levenshtein distance. We will use variables to pass dynamic parameters in the following GraphQL document:
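A sketch consistent with the schema above (the operation name is my own):

mutation calculateLevenshtein($input: LevenshteinInput!) {
  levenshtein(input: $input)
}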

We have defined our schema so far in SDL format. In the Python ecosystem, however, we do not have libraries like graphql-tools, so we need to define our schema with a code-first approach. The schema is defined as follows using the Graphene library:
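A minimal sketch with Graphene and NLTK; the class and field names are assumptions, but they match the SDL sketch above:

import graphene
from nltk.metrics.distance import edit_distance


class LevenshteinInput(graphene.InputObjectType):
    # Input type holding the two strings to compare
    s1 = graphene.String(required=True)
    s2 = graphene.String(required=True)


class CalculateLevenshtein(graphene.Mutation):
    class Arguments:
        input = LevenshteinInput(required=True)

    # The mutation returns the computed distance as an Int
    Output = graphene.Int

    def mutate(root, info, input):
        return edit_distance(input.s1, input.s2)


class Query(graphene.ObjectType):
    healthcheck = graphene.String()

    def resolve_healthcheck(root, info):
        return "OK"


class Mutation(graphene.ObjectType):
    levenshtein = CalculateLevenshtein.Field()


schema = graphene.Schema(query=Query, mutation=Mutation)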

We have followed the best practices for the overall schema and mutations. Our input object type is the LevenshteinInput class in the sketch above, written in Graphene as a graphene.InputObjectType with the two string fields s1 and s2.

Each time we execute our mutation in GraphQL Playground with the following variables:

{   "input": {     "s1": "test1",     "s2": "test2"   } }

We obtain the Levenshtein distance between our two input strings. For our simple example of strings test1 and test2, we get 1. We can leverage the well-known NLTK library for natural language processing (NLP). The following code is executed from the resolver:
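The relevant call is NLTK's edit_distance (a stand-in for the original snippet):

from nltk.metrics.distance import edit_distance

# edit_distance implements the Levenshtein metric described earlier
edit_distance("test1", "test2")  # -> 1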

It is also straightforward to implement the Levenshtein distance ourselves using, for example, an iterative matrix, but I would suggest not reinventing the wheel and using the default NLTK functions.

Serverless GraphQL APIs with Fauna

First off some introductions, before we jump right in. It’s only fair that I give Fauna a proper introduction as it is about to make our lives a whole lot easier. 

Fauna is a serverless database service that handles all optimization and maintenance tasks so that developers don't have to bother with them and can focus on developing their apps and shipping to market faster.

Again, serverless doesn't actually mean “no servers.” To put it simply, serverless means you can get things working without necessarily having to set things up from scratch. Some apps that use serverless concepts don't have a backend service written from scratch; they employ cloud functions, which are technically scripts written on cloud platforms to handle necessary tasks like login, registration, serving data, etc.

Where does Fauna fit into all of this? When we build servers, we need to provision them with a database, and that usually means a running instance of the database. With serverless technology like Fauna, we can shift that workload to the cloud and focus on actually writing our auth systems and implementing the business logic of our app. Fauna also manages things like maintenance and scaling, which usually call for concern with systems that use conventional databases.

If you are interested in getting more info about Fauna and its features, check the Fauna docs. Let's get started with building our GraphQL API the serverless way with Fauna.

Requirements

  1. Fauna account: that's all you need for this session, so head to the sign-up page and create one.

Creating a Fauna Database

Log in to your Fauna account once you have created it. Once on the dashboard, you should see a button to create a new database. Click on that and you should see a little form to fill in the name of the database, like the one below:

I call mine “graphqlbyexample” but you can call yours anything you wish. Please ignore the “pre-populate with demo data” option; we don't need that for this demo. Click “Save” and you should be brought to a new screen, as shown below:

Adding a GraphQL Schema to Fauna

  1. In order to get our GraphQL server up and running, Fauna allows us to upload our own GraphQL schema. On the page we are currently on, you should see a GraphQL option; select that and it will prompt you to upload a schema file. This file contains raw GraphQL schema and is saved with either the .gql or .graphql file extension. Let's create our schema and upload it to Fauna to spin up our server.
  2. Create a new file anywhere you like. I'm creating it in the same directory as our previous app, because it has no impact on it. I'm calling it schema.gql, and we will add the following to it:
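A plausible sketch of the schema (the User fields come up later in the article; the Note fields are assumptions):

type User {
  name: String!
  email: String!
  password: String!
}

type Note {
  title: String!
  content: String!
  author: User!
}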

Here, we simply define our data types corresponding to our two tables (Note and User). Save this and go back to that page to upload the schema.gql we just created. Once that is done, Fauna processes it and takes us to a new page — our GraphQL API playground.

We have literally created a GraphQL server by simply uploading that really simple schema to Fauna. To highlight some of the really cool feats Fauna pulls off here, observe:

  1. Fauna automatically generates collections for us. If you notice, we did not create any collection (a collection translates to a table, if you are only familiar with relational databases, and documents are the equivalent of rows). Fauna is a NoSQL database, and if we go to the Collections option and click on it, we'll see the collections that were auto-generated on our behalf, courtesy of the schema file we uploaded.
  2. Fauna automatically creates indexes on our behalf: head over to the Indexes option and see what indexes have been created for the API. Fauna is a document-oriented database and does not have primary or foreign keys as you have in relational databases for search and index purposes; instead, we create indexes in Fauna to help with data retrieval.
  3. Fauna automatically generates GraphQL queries and mutations, as well as API docs, on our behalf: this is one of my personal favorites and I can't seem to get over just how efficiently Fauna does this. Fauna is able to intelligently generate some queries that it thinks you might want in your newly created API. Head back over to the GraphQL option and click on the “Docs” tab to open up the docs on the playground.

As you can see, two queries and a handful of mutations have already been auto-generated (even though we did not add them to our schema file); you can click on each one in the docs to see the details.

Testing our server

Let's test out some of these queries and mutations from the playground. We'll also use our server outside of the playground (by the way, it is a fully functional GraphQL server).

Testing from the Playground

  1. First off, we will test creating a new user with the predefined createUser mutation, as follows:
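A hedged example; the input fields follow the schema sketch above, and the values are made up:

mutation {
  createUser(data: {
    name: "John Doe"
    email: "john@example.com"
    password: "supersecret"
  }) {
    _id
    name
  }
}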

If we go to the Collections option and choose User, we should have our newly created entry (document, aka row) in our User collection.

  2. Let's create a new note and associate it with a user as the author via its document Ref ID, which is a special ID generated by Fauna for every document for the sake of references like this, much like a key in relational tables. To find the ID for the user we just created, simply navigate to the collection, and from the list of documents you should see the option (a copy icon) to copy the Ref ID:

Once you have this, you can create a new note and associate it as follows:
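Another hedged sketch; the Ref ID here is a placeholder, and the relation uses Fauna's generated connect input:

mutation {
  createNote(data: {
    title: "My first note"
    content: "Hello from Fauna!"
    author: { connect: "292943605919728132" }
  }) {
    _id
    title
  }
}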

  3. Let's make a query this time, to get data from the database. Currently, we can fetch a user by ID or fetch a note by its ID. Let's see that in action:
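For example, with one of Fauna's auto-generated queries (placeholder ID again):

query {
  findUserByID(id: "292943605919728132") {
    name
    email
  }
}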

You must have been thinking it: what if we wanted to fetch info for all users? Currently, we can't do that, because Fauna did not generate that automatically for us, but we can update our schema. So let's add a custom query to our schema.gql file, as follows. Note that this is an update to the file, so don't clear everything out; just add this to it:
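The addition would look something like this (Fauna generates the resolver and index for a query declared this way):

type Query {
  allUsers: [User!]
}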

Once you have added this, save the file and click on the “update schema” option on the playground to upload the file again. It should take a few seconds to update; once it's done, we will be able to use our newly created query, as follows:
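Fauna paginates collection queries, so the results come wrapped in a data field:

query {
  allUsers {
    data {
      name
      email
    }
  }
}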

Don't forget that, as opposed to having all the info about users served (namely: name, email, password), we can choose which fields we want, because it's GraphQL. And not just that: it's Fauna's GraphQL, so feel free to specify more fields if you want.

Testing from outside the playground – using Python (requests library)

Now that we've seen that our API works from the playground, let's see how we can actually use it from an application outside the playground environment, using Python's requests library. If you don't have it installed, install it using pip as follows:

pip install requests
  • Before we write any code, we need to get our API key from Fauna, which is what will let us communicate with our API from outside the playground. Head over to Security on your dashboard and, on the Keys tab, select the option to create a new key. It should bring up a form like this:

Leave the database option as the current one, change the role of the key from Admin to Server, and then save. It'll generate a new key secret that you must copy and save somewhere safe, most probably as an environment variable.

  • For this, I'm going to create a simple script to demonstrate. Add a new file, called whatever you wish (I'm calling mine test.py), to your current working directory or anywhere else you like. In this file we'll add the following:
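A sketch of the first half of the script; the environment variable name and the user ID are my own placeholders:

import os

import requests

# Fauna's GraphQL endpoint, as shown in the GraphQL Playground
URL = "https://graphql.fauna.com/graphql"

# One of Fauna's auto-generated queries; the ID here is a placeholder
query = """
query {
  findUserByID(id: "292943605919728132") {
    name
    email
  }
}
"""

# The secret key we created in the Fauna dashboard, stored as an
# environment variable (the variable name is our own choice)
token = os.environ.get("FAUNA_SECRET")

# Fauna looks for the secret in the Authorization header
headers = {"Authorization": f"Bearer {token}"}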

Here we add a couple of imports, including the requests library, which we use to send the requests, as well as the os module, used here to load the environment variable where I stored the Fauna secret key we got from the previous step.

Note the URL where the request is to be sent, this is gotten from the Fauna GraphQL playground here:

Next, we create a query to be sent. This example shows a simple fetch query to find a user by ID (one of the automatically generated queries from Fauna). We then retrieve the key from the environment variable and store it in a variable called token, and create a dictionary to represent our headers. This is, after all, an HTTP request, so we can set headers here; in fact, we have to, because Fauna will look for our secret key in the headers of our request.

The concluding part of the code shows how we use the requests library to create the request:
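Continuing the sketch:

# Create the request and check whether it went through via its status code
response = requests.post(URL, json={"query": query}, headers=headers)

if response.status_code == 200:
    print(response.json())
else:
    print(f"Something went wrong: {response.status_code}")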

We create a request, check whether it went through via its status_code, and print the response from the server if it went well; otherwise, we print an error message. Let's run test.py and see what it returns.

Conclusion

In this article, we covered creating GraphQL servers from scratch and looked at creating servers right from Fauna without having to do much work. We also saw some of the cool perks that come with using the serverless system that Fauna provides, and went on to see how we could test our servers and validate that they work.

Hopefully, this was worth your time and taught you a thing or two about GraphQL, serverless, Fauna, Flask, and text analytics. To learn more about Fauna, you can also sign up for a free account and try it out yourself!


By: Adesina Abdrulrahman



Embedding an Interactive Analytics Component with Cumul.io and Any Web Framework

In this article, we explain how to build an integrated and interactive data visualization layer into an application with Cumul.io. To do so, we’ve built a demo application that visualizes Spotify Playlist analytics! We use Cumul.io as our interactive dashboard as it makes integration super easy and provides functionality that allow interaction between the dashboard and applications (i.e. custom events). The app is a simple JavaScript web app with a Node.js server, although you can, if you want, achieve the same with Angular, React and React Native while using Cumul.io dashboards too.

Here, we build dashboards that display data from the Kaggle Spotify Dataset 1921–2020 (160k+ tracks), as well as data from the Spotify Web API when a user logs in. We've built the dashboards as an insight into playlist and song characteristics. We've added some Cumul.io custom events that allow any end user visiting these dashboards to select songs from a chart and add them to one of their own Spotify playlists. They can also select a song to display more info on it, and play it from within the application. The code for the full application is publicly available in an open repository.

Here's a sneak peek at what the end result for the full version looks like:

What are Cumul.io custom events and their capabilities?

Simply put, Cumul.io custom events are a way to trigger events from a dashboard, to be used in the application that the dashboard is integrated in. You can add custom events into selected charts in a dashboard, and have the application listen for these events.

Why? The cool thing about this tool is how it allows you to reuse data from an analytics dashboard, a BI tool, within the application it's built into. It gives you the freedom to define actions based on data that can be triggered straight from within an integrated dashboard, while keeping the dashboard and analytics layer a completely separate entity from the application, one that can be managed separately.

What they contain: Cumul.io custom events are attached to charts rather than dashboards as a whole. So the information an event has is limited to the information a chart has.

An event is, simply put, a JSON object. This object will contain fields such as the ID of the dashboard that triggered it, the name of the event, and a number of other fields depending on the type of chart the event was triggered from. For example, if the event was triggered from a scatter plot, you will receive the x-axis and y-axis values of the point it was triggered from. If it was triggered from a table, you would receive column values instead. See examples of what these events look like from different charts:

// 'Add to Playlist' custom event from a row in a table
{
  "type": "customEvent",
  "dashboard": "xxxx",
  "name": "xxxx",
  "object": "xxxx",
  "data": {
    "language": "en",
    "columns": [
      {"id": "Ensueno", "value": "Ensueno", "label": "Name"},
      {"id": "Vibrasphere", "value": "Vibrasphere", "label": "Artist"},
      {"value": 0.406, "formattedValue": "0.41", "label": "Danceability"},
      {"value": 0.495, "formattedValue": "0.49", "label": "Energy"},
      {"value": 180.05, "formattedValue": "180.05", "label": "Tempo (bpm)"},
      {"value": 0.568, "formattedValue": "0.5680", "label": "Accousticness"},
      {"id": "2007-01-01T00:00:00.000Z", "value": "2007", "label": "Release Date (Yr)"}
    ],
    "event": "add_to_playlist"
  }
}
// 'Song Info' custom event from a point in a scatter plot
{
  "type": "customEvent",
  "dashboard": "xxxx",
  "name": "xxxx",
  "object": "xxxx",
  "data": {
    "language": "en",
    "x-axis": {"id": 0.601, "value": "0.601", "label": "Danceability"},
    "y-axis": {"id": 0.532, "value": "0.532", "label": "Energy"},
    "name": {"id": "xxxx", "value": "xxx", "label": "Name"},
    "event": "song_info"
  }
}

The possibilities with this functionality are virtually limitless. Granted, depending on what you want to do, you may have to write a couple more lines of code, but it is unarguably quite a powerful tool!

The dashboard

We won't actually go through the dashboard creation process here; we'll focus on the interactivity bit once the dashboard is integrated into the application. The dashboards integrated in this walkthrough have already been created and have custom events enabled. You can, of course, create your own and integrate those instead of the ones we've pre-built (you can create an account with a free trial). But first, some background info on Cumul.io dashboards:

Cumul.io offers you a way to create dashboards from within the platform, or via its API. In either case, dashboards will be available within the platform, decoupled from the application you want to integrate it into, so can be maintained completely separately.

On your landing page you’ll see your dashboards and can create a new one:

You can open one and drag and drop any chart you want:

You can connect data which you can then drag and drop into those charts:

And, that data can be one of a number of things. Like a pre-existing database which you can connect to Cumul.io, a dataset from a data warehouse you use, a custom built plugin etc.

Enabling custom events

We have already enabled these custom events to the scatter plot and table in the dashboard used in this demo, which we will be integrating in the next section. If you want to go through this step, feel free to create your own dashboards too!

First thing you need to do will be to add custom events to a chart. To do this, first select a chart in your dashboard you’d like to add an event to. In the chart settings, select Interactivity and turn Custom Events on:

To add an event, click edit and define its Event Name and Label. Event Name is what your application will receive and Label is the one that will show up on your dashboard. In our case, we’ve added 2 events; ‘Add to Playlist’ and ‘Song Info’:

This is all the setup you need for your dashboard to trigger an event on a chart level. Before you leave the editor, you will need your dashboard ID to integrate the dashboard later. You can find this in the Settings tab of your dashboard. The rest of the work remains on application level. This will be where we define what we actually want to do once we receive any of these events.

Takeaway points

  1. Events work on a chart level and will include information within the limits of the information on the chart
  2. To add an event, go to the chart settings on the chart you want to add them to
  3. Define name and label of event. And you’re done!
  4. (Don’t forget to take note of the dashboard ID for integration)

Using custom events in your own platform

Now that you’ve added some events to the dashboard, the next step is to use them. The key point here is that, once you click an event in your dashboard, your application that integrates the dashboard receives an event. The Integration API provides a function to listen to these events, and then it’s up to you to define what you do with them. For more information on the API and code examples for your SDK, you can also check out the relevant developer docs.

For this section, we’re also providing an open GitHub repository (separate to the repository for the main application) that you can use as a starting project to add custom events to.

The cumulio-spotify-datatalks repository is structured so that you can check out the commit called skeleton to start from the beginning. All the following commits represent the steps we go through here. It's a boiled-down version of the full application, focusing on the parts that demonstrate custom events. I'll be skipping some steps, such as the Spotify API calls (which are in src/spotify.js), so as to limit this tutorial to the theme of ‘adding and using custom events’.

Useful info for following steps

Let’s have a look at what happens in our case. We had created two events; add_to_playlist and song_info. We want visitors of our dashboard to be able to add a song to their own playlist of choice in their own Spotify account. In order to do so, we take the following steps:

  1. Integrate the dashboard with your app
  2. Listen to incoming events

Integrate the dashboard with your app

First, we need to add a dashboard to our application. Here we use the Cumul.io Spotify Playlist dashboard as the main dashboard and the Song Info dashboard as the drill-through dashboard (meaning a dashboard within the main one that pops up when we trigger an event). If you have checked out the skeleton commit and run npm run start, the application should currently just open up an empty ‘Cumul.io Favorites’ tab with a Login button at the top right. For instructions on how to run the project locally, go to the bottom of the article:

To integrate a dashboard, we will need to use the Cumulio.addDashboard() function. This function expects an object with dashboard options. Here’s what we do to add the dashboard:

In src/app.js, we create an object that stores the dashboard IDs for the main dashboard and the drill through dashboard that displays song info alongside a dashboardOptions object:

// create dashboards object with the dashboard ids and dashboardOptions object
// !!!change these IDs if you want to use your own dashboards!!!
const dashboards = {
  playlist: 'f3555bce-a874-4924-8d08-136169855807',
  songInfo: 'e92c869c-2a94-406f-b18f-d691fd627d34',
};

const dashboardOptions = {
  dashboardId: dashboards.playlist,
  container: '#dashboard-container',
  loader: {
    background: '#111b31',
    spinnerColor: '#f44069',
    spinnerBackground: '#0d1425',
    fontColor: '#ffffff'
  }
};

We create a loadDashboard() function that calls Cumulio.addDashboard(). This function optionally receives a container and modifies the dashboardOptions object before adding dashboard to the application.

// create a loadDashboard() function that expects a dashboard ID and container
const loadDashboard = (id, container) => {
  dashboardOptions.dashboardId = id;
  dashboardOptions.container = container || '#dashboard-container';
  Cumulio.addDashboard(dashboardOptions);
};

Finally, we use this function to add our playlist dashboard when we load the Cumul.io Favorites tab:

export const openPageCumulioFavorites = async () => {
  ui.openPage('Cumul.io playlist visualized', 'cumulio-playlist-viz');
  /**************** INTEGRATE DASHBOARD ****************/
  loadDashboard(dashboards.playlist);
};

At this point, we’ve integrated the playlist dashboard and when we click on a point in the Energy/Danceability by Song scatter plot, we get two options with the custom events we added earlier. However, we’re not doing anything with them yet.

Listen to incoming events

Now that we’ve integrated the dashboard, we can tell our app to do stuff when it receives an event. The two charts that have ‘Add to Playlist’ and ‘Song Info’ events here are:

First, we need to set up our code to listen to incoming events. To do so, we need to use the Cumulio.onCustomEvent() function. Here, we chose to wrap this function in a listenToEvents() function that can be called when we load the Cumul.io Favorites tab. We then use if statements to check what event we’ve received:

const listenToEvents = () => {
  Cumulio.onCustomEvent((event) => {
    if (event.data.event === 'add_to_playlist'){
      //DO SOMETHING
    }
    else if (event.data.event === 'song_info'){
      //DO SOMETHING
    }
  });
};

This is the point after which things are up to your needs and creativity. For example, you could simply print a line out to your console, or design your own behaviour around the data you receive from the event. Or, you could also use some of the helper functions we’ve created that will display a playlist selector to add a song to a playlist, and integrate the Song Info dashboard. This is how we did it;

Add song to playlist

Here, we will make use of the addToPlaylistSelector() function in src/ui.js. This function expects a Song Name and ID, and will display a window with all the available playlists of the logged in user. It will then post a Spotify API request to add the song to the selected playlist. As the Spotify Web API requires the ID of a song to be able to add it, we’ve created a derived Name & ID field to be used in the scatter plot.

An example event we receive on add_to_playlist will include the following for the scatter plot:

"name":{"id":"So Far To Go&id=3R8CATui5dGU42Ddbc2ixE","value":"So Far To Go&id=3R8CATui5dGU42Ddbc2ixE","label":"Name & ID"}

And these columns for the table:

"columns":[  {"id":"Weapon Of Choice (feat. Bootsy Collins) - Remastered Version","value":"Weapon Of Choice (feat. Bootsy Collins) - Remastered Version","label":"Name"},  {"id":"Fatboy Slim","value":"Fatboy Slim","label":"Artist"},    // ...  {"id":"3qs3aHNUcqFGv7jMYJJCYa","value":"3qs3aHNUcqFGv7jMYJJCYa","label":"ID"} ]

We extract the Name and ID of the song from the event via the getSong() function, then call the ui.addToPlaylistSelector() function:

/*********** LISTEN TO CUSTOM EVENTS AND ADD EXTRAS ************/
const getSong = (event) => {
  let songName;
  let songArtist;
  let songId;
  if (event.data.columns === undefined) {
    songName = event.data.name.id.split('&id=')[0];
    songId = event.data.name.id.split('&id=')[1];
  }
  else {
    songName = event.data.columns[0].value;
    songArtist = event.data.columns[1].value;
    songId = event.data.columns[event.data.columns.length - 1].value;
  }
  return {id: songId, name: songName, artist: songArtist};
};

const listenToEvents = () => {
  Cumulio.onCustomEvent(async (event) => {
    const song = getSong(event);
    console.log(JSON.stringify(event));
    if (event.data.event === 'add_to_playlist'){
      await ui.addToPlaylistSelector(song.name, song.id);
    }
    else if (event.data.event === 'song_info'){
      //DO SOMETHING
    }
  });
};

Now, the ‘Add to Playlist’ event will display a window with the available playlists that a logged in user can add the song to:

Display more song info

The final thing we want to do is make the ‘Song Info’ event display another dashboard when clicked. It will display further information on the selected song and include an option to play the song. It's also the step where we get into some more complicated use cases of the API, which may need some background knowledge. Specifically, we make use of Parameterizable Filters. The idea is to create a parameter on your dashboard, the value of which can be defined while creating an authorization token. We include the parameter as metadata while creating the authorization token.

For this step, we have created a songId parameter that is used in a filter on the Song Info dashboard:

Then, we create a getDashboardAuthorizationToken() function. This expects metadata which it then posts to the /authorization endpoint of our server in server/server.js:

const getDashboardAuthorizationToken = async (metadata) => {
  try {
    const body = {};
    if (metadata && typeof metadata === 'object') {
      Object.keys(metadata).forEach(key => {
        body[key] = metadata[key];
      });
    }

    /*
      Make the call to the backend API, using the platform user access credentials in the header
      to retrieve a dashboard authorization token for this user
    */
    const response = await fetch('/authorization', {
      method: 'post',
      body: JSON.stringify(body),
      headers: { 'Content-Type': 'application/json' }
    });

    // Fetch the JSON result with the Cumul.io Authorization key & token
    const responseData = await response.json();
    return responseData;
  }
  catch (e) {
    return { error: 'Could not retrieve dashboard authorization token.' };
  }
};

Finally, we use the load the songInfo dashboard when the song_info event is triggered. In order to do this, we create a new authorization token using the song ID:

const loadDashboard = (id, container, key, token) => {   dashboardOptions.dashboardId = id;   dashboardOptions.container = container || '#dashboard-container';      if (key && token) {     dashboardOptions.key = key;     dashboardOptions.token = token;   }    Cumulio.addDashboard(dashboardOptions); };

Then we call ui.displaySongInfo(). The final result looks as follows:

const listenToEvents = () => {
  Cumulio.onCustomEvent(async (event) => {
    const song = getSong(event);
    if (event.data.event === 'add_to_playlist'){
      await ui.addToPlaylistSelector(song.name, song.id);
    }
    else if (event.data.event === 'song_info'){
      const token = await getDashboardAuthorizationToken({ songId: [song.id] });
      loadDashboard(dashboards.songInfo, '#song-info-dashboard', token.id, token.token);
      await ui.displaySongInfo(song);
    }
  });
};

And voilà! We are done! In this demo we used a lot of helper functions I haven’t gone through in detail, but you are free to clone the demo repository and play around with them. You can even disregard them and build your own functionality around the custom events.

Conclusion

For anyone intending to have a layer of data visualisation and analytics integrated into their application, Cumul.io provides a pretty easy way of achieving it, as I’ve tried to demonstrate throughout this demo. The dashboards remain entities decoupled from the application, which can then go on to be managed separately. This becomes quite an advantage if, say, you’re looking at integrated analytics within a business setting and you’d rather not have developers going back and fiddling with dashboards all the time.

The events you can trigger from dashboards and listen to in their host applications, on the other hand, allow you to define implementations based on the information in those decoupled dashboards. This can be anything from playing a song, in our case, to triggering a specific email to be sent. The world is your oyster in this sense: you decide what to do with the data you have from your analytics layer. In other words, you get to reuse the data from your dashboards; it doesn’t have to just stay in its dashboard and analytics world 🙂

Steps to run this project

Before you start:

  1. Clone the cumulio-spotify-datatalks repository and run npm install
  2. Create a .env file in the root directory and add the following from your Cumul.io and Spotify Developer accounts (see the sketch below):
  3. From Cumul.io: CUMULIO_API_KEY=xxx and CUMULIO_API_TOKEN=xxx
  4. From Spotify: SPOTIFY_CLIENT_ID=xxx, SPOTIFY_CLIENT_SECRET=xxx, ACCESS_TOKEN=xxx and REFRESH_TOKEN=xxx
  5. Run npm run start
  6. On your browser, go to http://localhost:3000/ and log into your Spotify account 🥳
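Put together, the .env file would look something like this (all values are placeholders for your own credentials):

CUMULIO_API_KEY=xxx
CUMULIO_API_TOKEN=xxx
SPOTIFY_CLIENT_ID=xxx
SPOTIFY_CLIENT_SECRET=xxx
ACCESS_TOKEN=xxx
REFRESH_TOKEN=xxx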


The post Embedding an Interactive Analytics Component with Cumul.io and Any Web Framework appeared first on CSS-Tricks.


Comparing Data in Google and Netlify Analytics

Jim Nielsen:

the datasets weren’t even close for me.

Google Analytics works by putting a client-side bit of JavaScript on your site. Netlify Analytics works by parsing server logs server-side. They are not exactly apples to apples, feature-wise. Google Analytics is, I think it’s fair to say, far more robust. You can do things like track custom events which might be very important analytics data to a site. But they both have the basics. They both want to tell you how many pageviews your homepage got, for instance.

There are two huge things that affect these numbers:

  • Client-side JavaScript is blockable and tons of people use content blockers, particularly for third-party scripts from Google. Server-side logs are not blockable.
  • Netlify doesn’t filter things out of that log, meaning bots are counted in addition to regular people visiting.

So I’d say: Netlify probably has more accurate numbers, but a bit inflated from the bots.

Also worth noting, you can do server-side Google Analytics. I’ve never seen anyone actually do it but it seems like a good idea.
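For the curious: server-side Google Analytics generally means the Measurement Protocol, where your server constructs hits and POSTs them to Google’s collection endpoint itself, so no client-side JavaScript is involved at all. A minimal sketch in Node, where the tracking ID and the client ID handling are placeholders:

// Sketch: record a pageview server-side via the Google Analytics
// Measurement Protocol, so nothing needs to run in the browser.
const https = require('https');
const querystring = require('querystring');

function trackPageview(path, clientId) {
  const hit = querystring.stringify({
    v: 1,              // Measurement Protocol version
    tid: 'UA-XXXXX-X', // your tracking ID (placeholder)
    cid: clientId,     // an anonymous client ID, e.g. from a cookie
    t: 'pageview',     // hit type
    dp: path           // document path
  });

  const req = https.request({
    method: 'POST',
    hostname: 'www.google-analytics.com',
    path: '/collect',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' }
  });
  req.on('error', () => { /* analytics failures shouldn't break the app */ });
  req.write(hit);
  req.end();
}

// e.g. called from request-handling middleware:
// trackPageview(req.path, req.cookies.cid || 'anonymous');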

One bit of advice from Jim:

Never assume too much from a single set of data. In other words, don’t draw all your data-driven insights from one basket.

The post Comparing Data in Google and Netlify Analytics appeared first on CSS-Tricks.


The Analytics That Matter

I’ve long been skeptical of people quoting global browser usage percentages to justify their usage of browser features. It doesn’t matter what global usage of a browser is, other than as nerdy cocktail party fodder. The usage that matters is what the users on your site are using, and that can be wildly different from site to site.

That idea of tracking real usage of your actual site has bounced around my head the last few days. And it’s not just “I can’t use CSS grid because IE 11 still has 1.42% of global usage” stuff; it’s about measuring the metrics that matter to your site, no matter what they are.

Performance metrics are a big one. When you’re doing performance testing, much of it is what you would call synthetic testing. An automated browser loads your site and tracks what it finds as it loads, like the timing of a thing, the size of assets, the number of assets, etc. Synthetic information like this enters my mind when spending tons of time on performance. “I bet I can get rid of this one extra request,” I think. “I bet I can optimize this asset a little further.” And the performance tools we use report this kind of data to us readily. How big is our JavaScript bundle? What is our “Largest Contentful Paint”? What is our Lighthouse performance score? All things that are related to performance, but aren’t measuring actual users’ experience.

Let that sit for a second.

There are other analytics we can gather on a site, like usage analytics. For example, we might slap Google Analytics on a site, doing nothing but installing the generic snippet. This is going to tell us stuff like what pages are the most popular, how long people spend on the site, and what countries deliver the most traffic. Those are real user analytics, but it’s very generic analytic information.

If you’re hoping for more useful analytics data on your site, you have to think about it a little harder up front. What do you want to know? Maybe you want to know how often people use Feature X. Or you want to know how many files they have uploaded this week. Or how many messages they have sent. Or how many times they have clicked the star button. This is stuff that tells you how your site is doing. Generic analytics tracking won’t do that; you’ll have to write a little JavaScript to capture and report on those things. It takes a little effort to get the analytics you really care about.
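For example, with the standard gtag.js snippet already on the page, reporting something like star button clicks is only a few lines. A sketch, where the selector, event name, and parameters are all hypothetical:

// Report a custom "star_click" event to Google Analytics whenever
// the (hypothetical) star button is clicked.
document.querySelector('.star-button').addEventListener('click', () => {
  gtag('event', 'star_click', {
    event_category: 'engagement',
    event_label: window.location.pathname
  });
});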

Now apply that to performance tooling.

Rather than generic synthetic tests, why not measure things that are actually important to your specific site? One aspect of this is RUM, that is, “Real User Monitoring.” So rather than a single synthetic test being the source of all performance testing on your site, you’re tracking real users actually using the site on their actual devices. That makes a lot of sense to me, but aside from the logic of it, it unlocks some important data.

For example, one of Google’s Core Web Vitals, which are soon to affect the SEO of our pages, is a metric called First Input Delay (FID), and you have to collect data via JavaScript¹ on your page to measure it.
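Under the hood, that JavaScript is typically a PerformanceObserver watching for the first-input entry (Google’s little web-vitals library wraps this same mechanism). A minimal sketch, with a placeholder reporting endpoint:

// First Input Delay: the gap between the user's first interaction and
// the moment the browser was able to start handling it.
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    const fid = entry.processingStart - entry.startTime;
    // Ship the measurement to your analytics endpoint (placeholder URL)
    navigator.sendBeacon('/analytics', JSON.stringify({ metric: 'FID', value: fid }));
  }
}).observe({ type: 'first-input', buffered: true });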

Another Core Web Vital is “Largest Contentful Paint,” which is a fascinating attempt at a more meaningful performance metric. Imagine a metric like “start render” or the first page paint. Is that interesting? Sorta. At least it is signaling to the user that something is happening (probably). Yet that first render might not be actually useful content, like the headline and body copy of a news article. So this metric makes a guess at what that useful content probably is and measures that. Very clever.

But, why guess? I get why Google has to guess. They have to measure LCP on a bazillion sites and provide generically useful measurements. But on your own site (again, where the focused analytics actually matter) we can tell performance tools which elements matter to us and record when they render. Personally, I’d care about when the article itself renders on this site. With SpeedCurve’s hero rendering time, I could do something like:

<main elementtiming="article"></main>  <!-- or focus on the top of the page, like the "hero" timing suggests --> <header elementtiming="hero"></header>

Now I’m measuring what matters to my site and not just generic numbers.

Similarly, FID is cool and all, but why not fire off a JavaScript event telling performance tooling when things happen that are important to your site. For example, on CodePen, we’d do that when the editor is ready to use. That’s called User Timing and it’s a dang W3C spec!

Editors.init();
performance.mark("Editors are initialized.");
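Marks are points in time, and they pair naturally with performance.measure() if you want a duration out of them too. A quick sketch with made-up mark names:

// Wrap the interesting work in a pair of marks, then measure between them.
performance.mark('editors-init-start');
Editors.init();
performance.mark('editors-init-end');

// Creates a named duration entry that RUM tooling and dev tools can pick up
performance.measure('editors-init', 'editors-init-start', 'editors-init-end');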

These kinds of some-effort-required analytics are definitely better than the standard fare. Sure, a performance budget that warns you when you go over 200KB of JavaScript is great, but a performance budget that warns you when a core feature of your app isn’t ready until 1.4 seconds when your budget is 1.1 seconds is way more important.

  1. I say this because I was trying to make a chart in SpeedCurve of the three Core Web Vitals, and you can’t add FID unless you have LUX running, which is their RUM thing. Phew, that was a lot of acronyms, sorry.

The post The Analytics That Matter appeared first on CSS-Tricks.


Options for Hosting Your Own Non-JavaScript-Based Analytics

There are loads of analytics platforms to help you track visitor and usage data on your sites. Perhaps most notably Google Analytics, which is widely used (including on this site), probably due to its ease of integration, feature-richness, and the fact that it’s free (until you need to jump up to the enterprise tier, which is a crazy six-figure jump).

I don’t take any particular issue with Google Analytics. In fact I quite like it, especially as I’ve learned more about customizing it, like we’ve done here on CSS-Tricks as well as on CodePen.

But there are other options. In particular, I wanted to look at some other options where:

  • You can self-host the analytics. Always something to be said for owning your own data.
  • Data collection doesn’t require JavaScript. That’s so often blocked these days, as wariness of third-party JavaScript grows. It’s interesting to consider the entirely unobtrusive server-log based analytics.

I didn’t find a sea of options to look at. The classic one I always think of in this category is Shaun Inman’s Mint, but Mint isn’t taking new customers anymore. Maybe I’m not looking in all the right places, and perhaps you can help with that. Please chime in with a comment if you know of more options — especially ones you have experience with.

Fathom Analytics

This is one Dave Rupert uses on his personal site and has written about. They have a paid hosted version, which is still focused on privacy in the sense that it does not track or store user data. But they also have a free self-hosted version you can run on your own. Actual data collection is done via a JavaScript snippet you put into your site.

Ackee

This is based on Node.js and can only be self-hosted. Actual data collection is done with a JavaScript snippet you put into the site.

Matomo On-Premise

Matomo Cloud is their hosted version, and On-Premise is the self-hosted version. The actual data collection is done via a JavaScript snippet you put into the site.

GoAccess

GoAccess is notable because it’s the first in the list that is a “web log analyzer” which means it looks at access logs that your web server creates rather than relying on JavaScript reporting from the client side. Theoretically, this should be more accurate since client-side JavaScript can be blocked. GoAccess generates reporting that can be viewed in the terminal, as well as browser-based charts and graphs.

Netlify Analytics

Netlify Analytics isn’t self-hosted in the sense that you install it yourself on servers you rent; a big point of using Netlify is that it saves you from dealing with your own servers. The analytics are server-log based rather than JavaScript-based, which can be desirable: they are likely more accurate and don’t impact performance.

Web hosts are uniquely qualified to offer analytics to their users as they configure their own logging and such. For example, I also have analytics on this site through Flywheel, without installing anything, because they can analyze the traffic going through their servers. We wrote up an overview of the service when it was released.

AWStats

AWStats is the oldest analytics tool on the block. When I started out on the web, all the web hosting providers touted AWStats dashboards as part of their offerings. It runs on Perl, and like the last two services above, it gets data from server logs.

It ain’t pretty but it’s free, open-source, and has the stability of being a software project nearly 20 years old.

The post Options for Hosting Your Own Non-JavaScript-Based Analytics appeared first on CSS-Tricks.


Introducing Netlify Analytics

You work a while on a side project. You think it’s pretty cool! You decide to release it into the world. And then… it goes well. Or it doesn’t go well. Wait, is that right? You forgot to add analytics — it just didn’t cross your mind at the time. Now you’re pretty curious how many people have been visiting the site, but… you’re not sure. Enter Netlify Analytics.

There are so many times where I:

  • Forget to add analytics
  • Don’t want to incur the extra page weight, or
  • I’m concerned with privacy issues

I released a CSS Grid Generator last month and I forgot to add analytics. The release went well, but now it’s a bit of a black box for me as far as what happened there or whether I need to adjust a release in the future. Now, however, I can enable Netlify Analytics and see into the past without having lost any information. Sweet.

Netlify Analytics doesn’t have a ton of bells and whistles — it’s not meant to be a replacement for super comprehensive marketing tools. But if you want to get some data about your site without adding a lot of scripts, it can be a handy tool.

One really nice thing about it is the accuracy. Since the data is coming from the server, you can have a clear picture of what the server actually served, rather than relying on a third party which might have varied reporting due to things like ad blockers that can skew client-side reporting (15% of users are estimated to use tools like Ghostery, for instance), caching, and other factors.

The Analytics Dashboard

The dashboard for each site shows some “at a glance” information:

Chart showing some of the “at a glance” information

Then you can dive into more detailed information by specific date:

chart showing by date

There’s a bit of information from top sources and top pages:

chart showing top sources

There’s an area for “Top Resources Not Found”, which shows any pages, images, or anything else that your visitors are trying and failing to retrieve from your site. When I enabled it on mine, I was able to fix a broken resource that I had long forgotten about.

It’s going to be awesome being able to check how some of my dev projects are doing. But I’m also really excited to take that extra implementation step out of my work. The caveats to keep in mind are that your site needs to be hosted by Netlify in order to use the Analytics tools, and that it’s a paid feature. Any site you enable will show up to 90 days (3 billing cycles) in the “Bandwidth used” chart, and up to 30 days in all other charts, if it’s old enough. However, it could take up to 2 days between when you enable analytics and when your dashboard is calculated and populated.

Under the hood

The analytics dashboard itself is built with React and Highcharts. Highcharts is a JavaScript charting library that includes responsive options and an accessibility module. All of the components consume data from our internal analytics API.

Before development began, we conducted an internal comparison survey of data visualization libraries in order to choose the best one for our needs. We landed on Highcharts over other popular options like d3.js, primarily because it means any engineer at Netlify with JavaScript experience can jump in and contribute, whether they have deep SVG and D3-specific knowledge or not.

While the charts themselves are rendered as SVG elements, Highcharts allows you to render any text inside the graph using HTML, simplifying and speeding our development time and allowing us to use CSS for advanced styling. The Highcharts docs are also top notch and offer a ton of customization options through their declarative API.

We used the Highcharts wrapper for React in order to create reusable React components for each type of graph. The “Top sources,” “Top pages,” and “Top resources not found” cards use a different component that displays a <table> using the data passed in as props.

One of the trickier challenges we encountered on the UI side while building these graphs was displaying dates along the X axis of the area charts in a way that wouldn’t look overwhelming.

Highcharts offers an option to customize the format of an axis label using a JavaScript callback function, so we hooked into that to display every other date as a label. From there, we wrote an algorithm to capture the first date of each month that was being displayed and add the month name into the markup for the label, making the UI a bit cleaner and easier to digest.
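The exact production code isn’t shown, but the shape of that kind of formatter callback looks something like this; the every-other-tick and month-detection logic here is a simplified guess:

// Sketch of an x-axis label formatter that renders every other date,
// prefixing the first displayed day of each month with the month name.
const options = {
  xAxis: {
    type: 'datetime',
    labels: {
      useHTML: true,
      formatter: function () {
        // Only render every other tick label to avoid crowding
        const index = this.axis.tickPositions.indexOf(this.value);
        if (index % 2 !== 0) return '';

        const date = new Date(this.value);
        const prefix = date.getUTCDate() === 1
          ? '<strong>' + date.toLocaleString('en', { month: 'short' }) + '</strong> '
          : '';
        return prefix + date.getUTCDate();
      }
    }
  }
};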

Other Analytics Alternatives, with Snippets

If you’d still like to run third-party scripts and other kinds of analytics, Netlify has capabilities to add something globally to <head> or <body> tags. This is useful because, depending on how your site is set up, it can be a bit of a pain to add third-party scripts to every page. Plus, sometimes you want to give the ability to change these scripts to someone who doesn’t have access to the repo. Go to the particular site in the dashboard, then Settings → Build & Deploy → Post processing.

That’s where you will find Snippet Injection:

Click “Add snippet” and you’ll be able to select whether you want to add the third-party snippet to the <body> or the <head> tag, and you’ll have a chance to paste your code in HTML. For example, if you need to add Google Analytics, you’d wrap it in a script tag like this:

<script async src="https://www.googletagmanager.com/gtag/js?id=UA-68528-29"></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());

  gtag('config', 'UA-XXXXX-XX');
</script>

You’ll also name it so that you can keep track of it. If you need to add more later, this is helpful.

That’s it!

You’re off and running with either the new Netlify Analytics offering that’s built-in or a more robust tool.

The post Introducing Netlify Analytics appeared first on CSS-Tricks.
