WebSphere Commerce SEO: URL Flattening Lite

by Chris Sinchok on July 13, 2010

Out of the box, WebSphere Commerce has some poor SEO. This is a shame, since pretty URLs aren’t all that difficult. WebSphere Commerce does have its own way to do “SEO URLs”, but it’s pretty lacking. Basically, it will transform a URL like:
http://www.mysite.com/webapp/wcs/stores/servlet/TopCategoriesDisplay?storeId=10551&catalogId=10001&langId=-1
To:
http://www.mysite.com/webapp/wcs/stores/servlet/home_10551_10001_-1
So, it’s almost worthless. This kind of “flattening” has a number of issues:

  1. It doesn’t add any additional keywords to the URL. In the above example, the word “home” is the only thing of SEO value, and it’s pretty deep in the URL.
  2. It makes URL generation extremely difficult, as you can no longer use <c:url>, and have to literally build URLs by hand.
  3. You can no longer just omit certain parameters, even if a default is set in the SEOUrlMapper.xml: “/webapp/wcs/stores/servlet/home” won’t load, but “/webapp/wcs/stores/servlet/home___” might.

So what’s the answer? There are a number of things you can do, but there are two that are extremely easy to implement, yet will make a huge difference in your URLs.

Remove “/webapp/wcs/stores/servlet” at the webserver level

This is pretty easy if you know how to use mod_rewrite, but even if you don’t, stay tuned; I’ll drop some wisdom on you.

“mod_rewrite” is an Apache module available by default in IHS, and it allows some pretty advanced URL rewriting. Basically, it lets you transform URLs on the fly, either redirecting them to other URLs or, as in this case, rewriting them on the backend without the user ever seeing the change.
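
“Available” doesn’t necessarily mean “loaded”, so it’s worth checking that the module is actually switched on. A minimal sketch, assuming the usual module path (yours may differ, and the line may already be in your httpd.conf, just commented out):

```apache
# Assumed stock location of the module; check your own install before copying.
LoadModule rewrite_module modules/mod_rewrite.so
```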

Just go ahead and drop this into the “80” and “443” <VirtualHost> elements of your httpd.conf:

RewriteCond %{REQUEST_URI} !^/webapp.*$
RewriteCond %{REQUEST_URI} !^/wcsstore.*$
RewriteCond %{REQUEST_URI} !^/js/.*$
RewriteCond %{REQUEST_URI} !^/robots.txt$
RewriteRule /(.*) /webapp/wcs/stores/servlet/$1
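
For context, here’s roughly where that block sits. This is a sketch, not a drop-in config: the `*:80` address and the ServerName are placeholders, and it assumes mod_rewrite is already loaded. The one line that isn’t in the snippet above is `RewriteEngine On`; virtual hosts don’t inherit rewrite settings from the main server config, so each one needs its own.

```apache
# Sketch only: the address and ServerName are placeholders for your own values.
<VirtualHost *:80>
    ServerName www.mysite.com

    # Rewrite settings are not inherited by virtual hosts,
    # so the engine has to be switched on inside each one.
    RewriteEngine On

    # The four RewriteCond lines and the RewriteRule from the post go here.
</VirtualHost>
```

The “443” virtual host gets the same treatment.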

So what does this do? Simply, it just means that you can cut off that ugly “webapp/wcs/stores/servlet” anytime you want.

Let me demonstrate this through a sample request:

User: “Hey, I want to take a look at ‘http://www.mysite.com/TopCategoriesDisplay?storeId=10551&catalogId=10001&langId=-1’, can you get that for me, Mr. Webserver?”

Web server: “Sure. Well, it looks like the URI doesn’t start with ‘webapp’, ‘wcsstore’ or ‘js/’, and isn’t ‘robots.txt’, so I think the user really wants ‘http://www.mysite.com/webapp/wcs/stores/servlet/TopCategoriesDisplay?storeId=10551&catalogId=10001&langId=-1’. Hey Mr. App Server, get a copy of that, will ya?”

App server: “Here’s a copy of ‘http://www.mysite.com/webapp/wcs/stores/servlet/TopCategoriesDisplay?storeId=10551&catalogId=10001&langId=-1’, just like you wanted!”

Web server: “Hey user, here’s ‘http://www.mysite.com/TopCategoriesDisplay?storeId=10551&catalogId=10001&langId=-1’, just like you wanted!”

So basically, the web server lies, making the user think he’s getting one URL, and making the app server think that something else entirely is being requested. “So,” you’re probably thinking, “I get the last line, but why the first four lines?”

RewriteCond %{REQUEST_URI} !^/webapp.*$
RewriteCond %{REQUEST_URI} !^/wcsstore.*$
RewriteCond %{REQUEST_URI} !^/js/.*$
RewriteCond %{REQUEST_URI} !^/robots.txt$

Pretty simple: we want to add “/webapp/wcs/stores/servlet/” automatically to the front of a whole bunch of URLs, but not all of them. Specifically, we don’t want to add it to:

Anything that already starts with “webapp”: “RewriteCond %{REQUEST_URI} !^/webapp.*$”

Anything that starts with “wcsstore”: “RewriteCond %{REQUEST_URI} !^/wcsstore.*$”

Anything in the “/js/” folder: “RewriteCond %{REQUEST_URI} !^/js/.*$”

The “robots.txt” file: “RewriteCond %{REQUEST_URI} !^/robots.txt$”

You’ll probably have to add some more conditions for whatever else your web server serves directly, but you can figure them out.
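
Extra exclusions follow the same pattern as the existing four. As a purely illustrative sketch, the paths below (“/images/”, “/css/” and “/favicon.ico”) are made up; swap in whatever static content your own site has. They go above the RewriteRule, next to the other RewriteCond lines, and since consecutive conditions are ANDed by default, a URL has to pass all of them before it gets the prefix.

```apache
# Hypothetical extra exclusions; these paths are examples, not part of a stock install.
RewriteCond %{REQUEST_URI} !^/images/.*$
RewriteCond %{REQUEST_URI} !^/css/.*$
RewriteCond %{REQUEST_URI} !^/favicon\.ico$
```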

“Hold up a minute!” you’re probably thinking. “Now I have two URLs that load up the same page! Isn’t that terrible SEO?” Why yes, it is. Let’s fix it by adding something like this:

RewriteRule /webapp/wcs/stores/servlet/(.*) /$1 [R=301,L]

Pretty simple. This just redirects anything that starts with “/webapp/wcs/stores/servlet/” to the new, flattened version, using a 301 redirect.
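
Where this line goes matters. Here’s a sketch of one ordering that should avoid a redirect loop, assuming both rules live in the same virtual host. The redirect comes first and carries the L flag, so a request that arrives with the long prefix gets bounced to the short URL before the internal rewrite ever runs. Put the other way around, the freshly rewritten URL would fall straight into the redirect rule. Also note that the flag list has no space after the comma; Apache reads the flags as a single whitespace-delimited argument.

```apache
RewriteEngine On

# 1. External redirect: long URLs are sent to the short form.
#    L stops rule processing for this request once the redirect is chosen.
RewriteRule ^/webapp/wcs/stores/servlet/(.*)$ /$1 [R=301,L]

# 2. Internal rewrite: short URLs are quietly mapped back to the servlet path.
RewriteCond %{REQUEST_URI} !^/webapp.*$
RewriteCond %{REQUEST_URI} !^/wcsstore.*$
RewriteCond %{REQUEST_URI} !^/js/.*$
RewriteCond %{REQUEST_URI} !^/robots.txt$
RewriteRule ^/(.*)$ /webapp/wcs/stores/servlet/$1
```

If your query strings carry percent-encoded values, the NE flag discussed in the comments ([R=301,NE,L]) keeps mod_rewrite from escaping them a second time on the redirect.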

So, there you go: we got rid of the worst part of the WebSphere Commerce URLs, and it only took about 10 minutes. Next time, we’ll fix up that home page, since it really shouldn’t be at “TopCategoriesDisplay”.

{ 5 comments… read them below or add one }

Koushik August 31, 2010 at 7:28 am

Hi Chris,

I am having trouble implementing your solution. Everything works fine for any link using the http protocol, but with https it does not work every time. WCS encrypts some URL parameters for https. For example, if my URL is

/UserRegistrationForm?langId=-1&storeId=10951&catalogId=10051&new=Y

with http protocol. The first part of the rule converts it to

/webapp/wcs/stores/servlet/UserRegistrationForm?langId=-1&storeId=10951&catalogId=10051&new=Y

Now the application encrypts the URL and redirects to

/webapp/wcs/stores/servlet/UserRegistrationForm?langId=-1&storeId=10951&catalogId=10051&krypto=r%2Bct9ONFoig%2FRfaUrs5Jda0n5Lvmf4m8&ddkey=http:UserRegistrationForm

This encryption is the default behaviour of WCS. WCS then redirects this request with https protocol.

Now the rewrite rule again turns it into

/UserRegistrationForm?langId=-1&storeId=10951&catalogId=10051&krypto=r%2Bct9ONFoig%2FRfaUrs5Jda0n5Lvmf4m8&ddkey=http:UserRegistrationForm

and redirects. During this redirection, the browser encodes part of the ‘krypto’ parameter, turning it into

/UserRegistrationForm?langId=-1&storeId=10951&catalogId=10051&krypto=r%252Bct9ONFoig%252FRfaUrs5Jda0n5Lvmf4m8&ddkey=http:UserRegistrationForm

Here the browser has changed ‘%’ in the krypto value to ‘%25’. When the app server finally gets the request, it tries to decrypt the modified krypto value and throws an error. Ultimately, my page does not render.

It works only if I remove the redirect from the rule

RewriteRule /webapp/wcs/stores/servlet/(.*) /$1 [R=301,L]

Can you give a solution to this problem?

Koushik August 31, 2010 at 7:58 am

Hi Chris,

Found the solution to my problem. You need to add another flag to your rule to prevent URL encoding while redirecting. This rule will work for all URLs and protocols

RewriteRule /webapp/wcs/stores/servlet/(.*) /$1 [R=301,NE,L]

Jake Brasee March 24, 2011 at 6:45 am

Koushik, you wonderful person, thanks for pointing out that encoding error! If only I had stumbled across this four hours earlier! Your suggestion for changing the rewrite rule didn’t work for me, curiously, but when I manually changed the %25’s to %’s, my page worked. So now I know what I’m looking for :)

Chris, I would call you a wonderful person too, especially since you left me most of the code, but you never wrote the second blog post about TopCategoriesDisplay, and I had to go find that out all by my lonesome. So instead of “a wonderful person”, I will call you something slightly more reserved, like…hm…“not unpleasant”.

Jake Brasee March 24, 2011 at 8:16 am

Actually, Koushik, it works perfectly. =)

Tanuj September 6, 2011 at 7:24 am

I’m new to IBM WebSphere and don’t know much, but when I put this code in my <VirtualHost> element of the httpd.conf file and restart my HTTP Server, nothing happens. I don’t know what I am missing; it must be something basic. Please help at the earliest.
My file is in the IBM\HTTPServer\conf folder, and there is another website hosted on the same server, so I guess changing this would affect both, but currently, and I am saying it again, nothing happens.
I have researched it and believe the code is almost perfect, but I am unable to implement it.
