Tag: URLs

Does WWW still belong in URLs?

For years, a small pedantry war has been raging in our address bars. In one corner are brands like Google, Instagram, and Facebook. This group has chosen to redirect example.com to www.example.com. In the opposite corner: GitHub, DuckDuckGo, and Discord. This group has chosen to do the reverse and redirect www.example.com to example.com.

Does “WWW” belong in a URL? Some developers hold strong opinions on the subject. We’ll explore arguments for and against it after a bit of history.

What’s with the Ws?

The three Ws stand for “World Wide Web”, a late-1980s invention that introduced the world to browsers and websites. The practice of using “WWW” stems from a tradition of naming subdomains after the type of service they provide:

  • a web server at www.example.com
  • an FTP server at ftp.example.com
  • an IRC server at irc.example.com

WWW-less domain concern 1: Leaking cookies to subdomains

Critics of “WWW-less” domains have pointed out that in certain situations, subdomain.example.com would be able to read cookies set by example.com. This may be undesirable if, for example, you are a web hosting provider that lets clients operate subdomains on your domain. While the concern is valid, the behavior was specific to Internet Explorer.

RFC 6265 standardizes how browsers treat cookies and explicitly calls out this behavior as incorrect.

Another potential source of leaks is the Domain value of any cookies set by example.com. If the Domain value is explicitly set to example.com, the cookies will also be exposed to its subdomains.

Cookie value                       Exposed to example.com   Exposed to subdomain.example.com
secret=data                        Yes                      No
secret=data; Domain=example.com    Yes                      Yes

In conclusion, as long as you don’t explicitly set a Domain value and your users don’t use Internet Explorer, no cookie leaks should occur.
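The scoping rule above can be sketched in a few lines of JavaScript. This is a simplified model of RFC 6265 domain matching, not a full implementation: it ignores Path, Secure, and public-suffix checks, and the function and parameter names are made up for illustration.

```javascript
// Simplified model of RFC 6265 cookie scoping (assumption: Path, Secure,
// and public-suffix rules are ignored). Names are illustrative only.
function cookieIsSentTo(requestHost, setByHost, domainAttr) {
  const host = requestHost.toLowerCase();
  if (domainAttr === undefined) {
    // No Domain attribute: "host-only" cookie, exact host match required.
    return host === setByHost.toLowerCase();
  }
  // Domain attribute present: a leading dot is ignored, and the domain
  // itself plus every subdomain of it will receive the cookie.
  const domain = domainAttr.replace(/^\./, '').toLowerCase();
  return host === domain || host.endsWith('.' + domain);
}

console.log(cookieIsSentTo('example.com', 'example.com'));                          // true
console.log(cookieIsSentTo('subdomain.example.com', 'example.com'));                // false
console.log(cookieIsSentTo('subdomain.example.com', 'example.com', 'example.com')); // true
```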

WWW-less domain concern 2: DNS headaches

Sometimes, a “WWW-less” domain may complicate your Domain Name System (DNS) setup.

When a user types example.com into their browser’s address bar, the browser needs to know the Internet Protocol (IP) address of the web server they’re trying to visit. The browser requests this IP address from your domain’s nameservers – usually indirectly through the DNS servers of the user’s Internet Service Provider (ISP). If your nameservers are configured to respond with an A record containing the IP address, a “WWW-less” domain will work fine.

In some cases, you may want to instead use a Canonical Name (CNAME) record for your website. Such a record can declare that www.example.com is an alias of example123.somecdnprovider.com, which tells the user’s browser to instead look up the IP address of example123.somecdnprovider.com and send the HTTP request there.

Notice that the example above used a WWW subdomain. It’s not possible to define a CNAME record for example.com. As per RFC 1912, CNAME records cannot coexist with other records. If you tried to define a CNAME record for example.com, the Nameserver (NS) records for example.com, which name the domain’s authoritative name servers, would not be allowed to exist. As a result, DNS resolvers would not be able to figure out where your name servers are.
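To make the coexistence rule concrete, here is a rough sketch of a check that a DNS tool might run over a zone. The record shape ({ name, type, value }), the function name, and the ns1.example.net hostname are assumptions for illustration, and the check ignores the DNSSEC record types that are allowed next to a CNAME.

```javascript
// Hypothetical record shape: { name, type, value }.
// Returns the names where a CNAME shares the name with any other record.
function findCnameConflicts(records) {
  const typesByName = new Map();
  for (const record of records) {
    const name = record.name.toLowerCase();
    if (!typesByName.has(name)) typesByName.set(name, []);
    typesByName.get(name).push(record.type);
  }
  const conflicts = [];
  for (const [name, types] of typesByName) {
    if (types.includes('CNAME') && types.length > 1) conflicts.push(name);
  }
  return conflicts;
}

const zone = [
  { name: 'example.com', type: 'NS', value: 'ns1.example.net' }, // placeholder name server
  { name: 'example.com', type: 'CNAME', value: 'example123.somecdnprovider.com' },
  { name: 'www.example.com', type: 'CNAME', value: 'example123.somecdnprovider.com' },
];

console.log(findCnameConflicts(zone)); // [ 'example.com' ]
```

The www.example.com entry passes because nothing else lives at that name, which is exactly why the subdomain is the easy place to put a CNAME.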

Some DNS providers will allow you to work around this limitation. Cloudflare calls their solution CNAME flattening. With this technique, domain administrators configure a CNAME record, but their nameservers will expose an A record.

For instance, if the administrator configures a CNAME record for example.com pointing to example123.somecdnprovider.com, and an A record for example123.somecdnprovider.com exists pointing to 1.2.3.4, then Cloudflare would expose an A record for example.com pointing to 1.2.3.4.
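Conceptually, flattening means following the alias chain on the server side and answering with the final address. The sketch below uses a toy in-memory table instead of real DNS queries; the table layout and the flatten function are assumptions for illustration and say nothing about how Cloudflare actually implements the feature.

```javascript
// Toy in-memory records; a real provider would query upstream DNS instead.
const records = {
  'example.com': { type: 'CNAME', value: 'example123.somecdnprovider.com' },
  'example123.somecdnprovider.com': { type: 'A', value: '1.2.3.4' },
};

// Follow CNAME aliases until an A record is found, then answer with an
// A record under the originally requested name.
function flatten(name, table, maxHops = 8) {
  let current = name;
  for (let hop = 0; hop < maxHops; hop++) {
    const record = table[current];
    if (!record) return null;                       // nothing to resolve
    if (record.type === 'A') {
      return { name, type: 'A', value: record.value };
    }
    if (record.type !== 'CNAME') return null;       // unsupported record type
    current = record.value;                         // follow the alias
  }
  return null;                                      // chain too long or looping
}

console.log(flatten('example.com', records));
// { name: 'example.com', type: 'A', value: '1.2.3.4' }
```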

In conclusion, while the concern is valid for domain owners who wish to use CNAME records, certain DNS providers now offer a suitable workaround.

WWW-less benefits

Most of the arguments against WWW are practical or cosmetic. “No-WWW” advocates have argued that it’s easier to say and type example.com than www.example.com (which may be less confusing for less tech-savvy users).

Opponents of the WWW subdomain have also pointed out that dropping it comes with a humble performance advantage. Website owners could shave 4 bytes off each HTTP request by doing so. While these savings could add up for high-traffic websites like Facebook, bandwidth generally isn’t a scarce resource.
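The 4-byte figure is simply the length of the "www." prefix. As a quick check, assuming an uncompressed HTTP/1.1 Host header:

```javascript
// Compare the encoded size of the Host header with and without "www.".
const encoder = new TextEncoder();
const withWww = encoder.encode('Host: www.example.com\r\n').length;
const withoutWww = encoder.encode('Host: example.com\r\n').length;

console.log(withWww - withoutWww); // 4
```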

WWW benefits

One practical argument in favor of WWW is in situations with newer top-level domains. For example, www.example.miami is immediately recognizable as a web address, whereas example.miami isn’t. This is less of a concern for sites that have recognizable top-level domains like .com.

Impact on your search engine ranking

The current consensus is that your choice does not influence your search engine performance. If you wish to migrate from one to the other, you’ll want to configure permanent redirects (HTTP 301) instead of temporary ones (HTTP 302). Permanent redirects ensure that the SEO value of your old URLs transfers to the new ones.
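As a minimal sketch of that redirect logic, the function below decides whether a request should get a 301 and where it should point. The function name and the host arguments are placeholders; in practice this usually lives in your web server or CDN configuration rather than in application code.

```javascript
// Returns a 301 redirect description when the request uses the retired
// hostname, or null when no redirect is needed.
function redirectFor(requestUrl, retiredHost, canonicalHost) {
  const url = new URL(requestUrl);
  if (url.hostname !== retiredHost) return null;
  url.hostname = canonicalHost;             // path and query string are preserved
  return { status: 301, location: url.href };
}

console.log(redirectFor('https://example.com/my-article?ref=home', 'example.com', 'www.example.com'));
// { status: 301, location: 'https://www.example.com/my-article?ref=home' }
console.log(redirectFor('https://www.example.com/my-article', 'example.com', 'www.example.com'));
// null
```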

Tips for supporting both

Sites typically pick either example.com or www.example.com as their official website and configure HTTP 301 redirects for the other. In theory, it is possible to support both www.example.com and example.com. In practice, the costs may outweigh the benefits.

From a technical perspective, you’ll want to verify that your tech stack can handle it. Your content management system (CMS) or statically generated site would have to output internal links as relative URLs to preserve the visitor’s preferred hostname. Your analytics tools may log traffic to both hostnames separately unless you can configure the hostnames as aliases.

Lastly, you’ll need to take an extra step to safeguard your search engine performance. Google will consider the “WWW” and “non-WWW” versions of a URL to be duplicate content. To deduplicate content in its search index, Google will display whichever of the two it thinks the user will prefer – for better or worse.

To preserve control over how you appear in Google, Google recommends inserting canonical link tags. First, decide which hostname will be the official (canonical) one.

For example, if you pick www.example.com, you will have to insert the following snippet in the <head> tag on https://example.com/my-article:

<link href="https://www.example.com/my-article" rel="canonical">

This snippet indicates to Google that the “WWW-less” variant represents the same content. In general, Google will prefer the version you’ve marked as canonical in search results, which would be the “WWW” variant in this example.
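If your pages are generated by a script, the tag can be derived from the page URL. The helper below is a hypothetical sketch (the name canonicalLinkTag is made up) that swaps in the canonical hostname and drops any fragment:

```javascript
// Build a canonical <link> tag for a page, pointing at the canonical host.
function canonicalLinkTag(pageUrl, canonicalHost) {
  const url = new URL(pageUrl);
  url.hostname = canonicalHost;
  url.hash = '';                                   // fragments are not part of the canonical URL
  const href = url.href.replace(/&/g, '&amp;');    // escape ampersands for the HTML attribute
  return `<link href="${href}" rel="canonical">`;
}

console.log(canonicalLinkTag('https://example.com/my-article', 'www.example.com'));
// <link href="https://www.example.com/my-article" rel="canonical">
```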

Conclusion

Despite intense campaigning on either side, both approaches remain valid as long as you are aware of the benefits and limitations. To cover all your bases, be sure to set up permanent redirects from one to the other and you’re all set.


Does WWW still belong in URLs? originally published on CSS-Tricks, which is part of the DigitalOcean family. You should get the newsletter.

CSS-Tricks


Trailing Slashes on URLs: Contentious or Settled?

A fun deep dive from Zach. Do you have an opinion on which you should use?

1) https://website.com/foo/
2) https://website.com/foo

The first option has a “trailing slash.” The second does not.

I’ve always preferred this thinking: you use a trailing slash if that page has child pages (as in, it is something of a directory page, even if it has unique content of its own). If it’s the end-of-the-line (of content), no trailing slash.

I say that, but this very site doesn’t practice it. Blog posts on this site are like css-tricks.com/blog-post/ with a trailing slash and if you leave off the trailing slash, WordPress will redirect to include it. That’s part of the reason Zach is interested here. Redirects come with a performance penalty, so it’s ideal to have it happen as infrequently as possible.

Performance is one thing, but SEO is another one. If you render the same content, both with and without a trailing slash, that’s theoretically a duplicate content penalty and a no-no. (Although that seems weird to me, I would think Google would be smart enough not to be terribly concerned by this.)

Where resources resolve to seems like the biggest deal to me. Here’s Zach:

If you’re using relative resource URLs, the assets may be missing on Vercel, Render, and Azure Static Web Apps (depending on which duplicated endpoint you’ve visited).

<img src="image.avif"> on /resource/ resolves to /resource/image.avif

<img src="image.avif"> on /resource resolves to /image.avif

That’s a non-trivial difference and, to me, a reason the redirect is worth it. Can’t be having a page with broken resources for something this silly.
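You can reproduce the difference with the standard URL constructor, which resolves a relative reference against a base the same way a browser does. The example.com base is just a stand-in:

```javascript
// Resolve the same relative image path against both variants of the page URL.
const withSlash = new URL('image.avif', 'https://example.com/resource/');
const withoutSlash = new URL('image.avif', 'https://example.com/resource');

console.log(withSlash.pathname);    // /resource/image.avif
console.log(withoutSlash.pathname); // /image.avif
```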

What complicates this is that the site-building framework might have opinions about this and a hosting provider might have opinions about this. As Zach notes, there are some disagreements among hosts, so it’s something to watch for.

Me, I’d go with the grain as much as I possibly could. As long as redirects are in place and I don’t have to override any config, I’m cool.



Trailing Slashes on URLs: Contentious or Settled? originally published on CSS-Tricks.


A Chrome Extension for Cloudinary That Helps You Pluck Out Useful Media URLs From Your Library Quickly

(This is a sponsored post.)

Cloudinary is a host for your digital assets like images and video. If you don’t already know them, you should, because you can build it into the asset management you almost certainly need to do if you run any size of website. Cloudinary helps you serve the assets as efficiently as technologically possible, meaning optimization, resizing, CDN hosting, and goes further in allowing interesting transforms on those assets.

If you already use it, unless you use it entirely through the APIs, you’ll know Cloudinary has a Media Library that gives you a UI dashboard for everything you’ve ever uploaded to Cloudinary. This is where you find your assets and open them up to play with the settings and transformations and such (e.g. blur it — then serve in the best possible format with automatic quality adjustments). You can always pop over to cloudinary.com to use this. But wouldn’t it be nice if this process was made a bit easier?

That clutch moment where you get the URL of the image you need.

There are all sorts of moments while bopping around the web doing our jobs as developers where you might need to get your fingers on an asset URL.

Gimme that URL!

Here’s a personal example: we have a little custom CMS thing for building our weekly email The CodePen Spark. It expects a URL to an image.

This is the exact kind of moment that the brand new Chrome Media Library Extension could help. Essentially it gives you a context menu you can use right in the browser to snag a URL to an asset. Right click, Insert an Asset URL.

It pops up a UI right inline (where you are on the web) of your Media Library, and you pick an image from there. Find the one you want, open it up, and you can either “edit” it to customize it to your liking, or just Insert it straight away.

Then it plops the URL right onto the site (probably an input) where you need it.

You can set up defaults to your liking, but I really like how the defaults are f_auto and q_auto which are Cloudinary classics that you’ll almost surely want. They mean “serve in the best possible format” and “optimize it intelligently”.
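For a sense of what ends up in the input, here is a rough sketch of how those two flags sit in a Cloudinary delivery URL. The helper name, the cloud name "demo-cloud", and the asset "hero.jpg" are placeholders, not real values, and the extension builds the URL for you, so treat this as illustration only.

```javascript
// Assemble an image delivery URL with comma-separated transformation flags.
// 'demo-cloud' and 'hero.jpg' below are placeholder values.
function cloudinaryImageUrl(cloudName, publicId, transformations = ['f_auto', 'q_auto']) {
  return `https://res.cloudinary.com/${cloudName}/image/upload/${transformations.join(',')}/${publicId}`;
}

console.log(cloudinaryImageUrl('demo-cloud', 'hero.jpg'));
// https://res.cloudinary.com/demo-cloud/image/upload/f_auto,q_auto/hero.jpg
```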

Sharon Yelenik introduced it on the Cloudinary blog:

Say your team creates social posts on a browser tab on an automated marketing application. To locate a media asset, you must open another tab to search for the asset within the Media Library, copy the related URL, and paste it into the app. In some cases, you even have to download an asset and then upload it into the app.

Talk about a classic example of menial, mundane, and repetitive chores!

Exactly. I like the idea of having tools to optimize workflows that should be easy. I’d also call Cloudinary a bit of a technical/developer tool, and there is an aspect to this that could be set up on anyone’s machine that would allow them to pick assets from your Media Library easily, without any access control worries.

If all this appeals to you:

Or see more at Cloudinary Labs, documentation, and blog post.


A Chrome Extension for Cloudinary That Helps You Pluck Out Useful Media URLs From Your Library Quickly originally published on CSS-Tricks.


Better Line Breaks for Long URLs

CSS-Tricks has covered how to break text that overflows its container before, but not as much as you might think. Back in 2012, Chris penned “Handling Long Words and URLs (Forcing Breaks, Hyphenation, Ellipsis, etc)” and it is still one of only a few posts on the topic, including his 2018 follow-up Where Lines Break is Complicated. Here’s all the Related CSS and HTML.

Chris’s tried-and-true technique works well when you want to leverage automated word breaks and hyphenation rules that are baked into the browser:

.dont-break-out {
  /* These are technically the same, but use both */
  overflow-wrap: break-word;
  word-wrap: break-word;
  word-break: break-word;
  /* Adds a hyphen where the word breaks, if supported (No Blink) */
  hyphens: auto;
}

But what if you can’t? What if your style guide requires you to break URLs in certain places? These classic sledgehammers are too imprecise for that level of control. We need a different way to tell the browser exactly where to make a break.

Why we need to care about line breaks in URLs

One reason is design. A URL that overflows its container is just plain gross to look at.

Then there are copywriting standards. The Chicago Manual of Style, for example, specifies when to break URLs in print. Then again, Chicago gives us a pass for electronic documents… sorta:

It is generally unnecessary to specify breaks for URLs in electronic publications formats with reflowable text, and authors should avoid forcing them to break in their manuscripts.

Chicago 17th ed., 14.18

But what if, like Rachel Andrew (2015) encourages us, you’re designing for print, not just screens? Suddenly, “generally unnecessary” becomes “absolutely imperative.” Whether you’re publishing a book, or you want to create a PDF version of a research paper you wrote in HTML, or you’re designing an online CV, or you have a reference list at the end of your blog post, or you simply care how URLs look in your project—you’d want a way to manage line breaks with a greater degree of control.

OK, so we’ve established why considering line breaks in URLs is a thing, and that there are use cases where they’re actually super important. But that leads us to another key question…

Where are line breaks supposed to go, then?

We want URLs to be readable. We also don’t want them to be ugly, at least no uglier than necessary. Continuing with Chicago’s advice, we should break long URLs based on punctuation, to help signal to the reader that the URL continues on the next line. That would include any of the following places:

  • After a colon or a double slash (//)
  • Before a single slash (/), a tilde (~), a period, a comma, a hyphen, an underline (aka an underscore, _), a question mark, a number sign, or a percent symbol
  • Before or after an equals sign or an ampersand (&)

At the same time, we don’t want to inject new punctuation, like when we might reach for hyphens: auto; rules in CSS to break up long words. Soft or “shy” hyphens are great for breaking words, but bad news for URLs. It’s not as big a deal on screens, since soft hyphens don’t interfere with copy-and-paste, for example. But a user could still mistake a soft hyphen as part of the URL—hyphens are often in URLs, after all. So we definitely don’t want hyphens in print that aren’t actually part of the URL. Reading long URLs is already hard enough without breaking words inside them.

We still can break particularly long words and strings within URLs. Just not with hyphens. For the most part, Chicago leaves word breaks inside URLs to discretion. Our primary goal is to break URLs before and after the appropriate punctuation marks.

How do you control line breaks?

Fortunately, there’s an (under-appreciated) HTML element for this express purpose: the <wbr> element, which represents a line break opportunity. It’s a way to tell the browser, Please break the line here if you need to, not just any-old place.

We can take a gnarly URL, like the one Chris first shared in his 2012 post:

http://www.amazon.com/s/ref=sr_nr_i_o?rh=k%3Ashark+vacuum%2Ci%3Agarden&keywords=shark+vacuum&ie=UTF8&qid=1327784979

And sprinkle in some <wbr> tags, “Chicago style”:

http:<wbr>//<wbr>www<wbr>.<wbr>amazon<wbr>.com<wbr>/<wbr>s/<wbr>ref<wbr>=<wbr>sr<wbr>_<wbr>nr<wbr>_<wbr>i<wbr>_o<wbr>?rh<wbr>=<wbr>k<wbr>%3Ashark<wbr>+vacuum<wbr>%2Ci<wbr>%3Agarden<wbr>&<wbr>keywords<wbr>=<wbr>shark+vacuum<wbr>&ie<wbr>=<wbr>UTF8<wbr>&<wbr>qid<wbr>=<wbr>1327784979

Even if you’re the most masochistic typesetter ever born, you’d probably mark up a URL like that exactly zero times before you’d start wondering if there’s a way to automate those line break opportunities.

Yes, yes there is. Cue JavaScript and some aptly placed regular expressions:

/**
 * Insert line break opportunities into a URL
 */
function formatUrl(url) {
  // Split the URL into an array to distinguish double slashes from single slashes
  var doubleSlash = url.split('//')
  // Format the strings on either side of double slashes separately
  var formatted = doubleSlash.map(str =>
    // Insert a word break opportunity after a colon
    str.replace(/(?<after>:)/giu, '$1<wbr>')
      // Before a single slash, tilde, period, comma, hyphen, underline, question mark, number sign, or percent symbol
      // (the hyphen is escaped so that ",-_" is not read as a character range)
      .replace(/(?<before>[/~.,\-_?#%])/giu, '<wbr>$1')
      // Before and after an equals sign or ampersand
      .replace(/(?<beforeAndAfter>[=&])/giu, '<wbr>$1<wbr>')
    // Reconnect the strings with word break opportunities after double slashes
    ).join('//<wbr>')
  return formatted
}

Try it out

Go ahead and open the following demo in a new window, then try resizing the browser to see how the long URLs break.

This does exactly what we want:

  • The URLs break at appropriate spots.
  • There is no additional punctuation that could be confused as part of the URL.
  • The <wbr> tags are auto-generated to relieve us from inserting them manually in the markup.
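To see the kind of markup this produces, here is a condensed, self-contained variant of the function run against a short made-up URL. It is a sketch rather than the exact code from the demo: it uses `$&` (the whole match) instead of named groups, and it escapes the hyphen inside the character class so that `,-_` is treated as three literal characters and not as a range.

```javascript
// Condensed variant of formatUrl; the hyphen is escaped (\-) so it is a
// literal character, not part of a range.
function formatUrl(url) {
  return url
    .split('//')
    .map(str =>
      str
        .replace(/:/gu, ':<wbr>')               // after a colon
        .replace(/[/~.,\-_?#%]/gu, '<wbr>$&')   // before these punctuation marks
        .replace(/[=&]/gu, '<wbr>$&<wbr>')      // before and after = or &
    )
    .join('//<wbr>');                           // after a double slash
}

console.log(formatUrl('https://example.com/a-b?x=1'));
// https:<wbr>//<wbr>example<wbr>.com<wbr>/a<wbr>-b<wbr>?x<wbr>=<wbr>1
```

The result is an HTML string, so it needs to be inserted as markup (and any raw ampersands escaped as `&amp;`) rather than as plain text.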

This JavaScript solution works even better if you’re leveraging a static site generator. That way, you don’t have to run a script on the client just to format URLs. I’ve got a working example on my personal site built with Eleventy.

If you really want to break long words inside URLs too, then I’d recommend inserting those few <wbr> tags by hand. The Chicago Manual of Style has a whole section on word division (7.36–47, login required).

Browser support

The <wbr> element has been seen in the wild since 2001. It was finally standardized with HTML5, so it works in nearly every browser at this point. Strangely enough, <wbr> worked in Internet Explorer (IE) 6 and 7, but was dropped in IE 8 onward. Support has always existed in Edge, so it’s just a matter of dealing with IE or other legacy browsers. Some popular HTML-to-PDF programs, like Prince, also need a boost to handle <wbr>.

One more possible solution

There’s one more trick to optimize line break opportunities. We can use a pseudo-element to insert a zero width space, which is how the <wbr> element is meant to behave in UTF-8 encoded pages anyhow. That’ll at least push support back to IE 9, and perhaps more importantly, work with Prince.

/**
 * IE 8–11 and Prince don’t recognize the `wbr` element,
 * but a pseudo-element can achieve the same effect with IE 9+ and Prince.
 */
wbr:before {
  /* Unicode zero width space */
  content: "\200B";
  white-space: normal;
}

Striving for print-quality HTML, CSS, and JavaScript is hardly new, but it is undergoing a bit of a renaissance. Even if you don’t design for print or follow Chicago style, it’s still a worthwhile goal to write your HTML and CSS with URLs and line breaks in mind.



The post Better Line Breaks for Long URLs appeared first on CSS-Tricks.
