robot restrictions


Datagg
08-06-06, 05:51 AM
Since I started using the optimizer, Bruce told me that I should add rules to robots.txt so that Google does not index any Merchant pages.

I'm not disputing this, yet Google is giving me major problems in the search results, even after the update they had last week.

The Google Sitemaps program shows this:

Indexing summary:

Pages from your site are included in Google's index. See Index stats (https://www.google.com/webmasters/sitemaps/indexstats?siteUrl=http%3A%2F%2Fwww.girlfriendslingerie.com%2F&hl=en). [?] (http://www.google.com/support/webmasters/bin/answer.py?answer=35663&hl=en)
Googlebot has successfully accessed your home page. Last crawl date: Aug 2, 2006
You have submitted 1 Sitemaps (https://www.google.com/webmasters/sitemaps/sitemaps?siteUrl=http%3A%2F%2Fwww.girlfriendslingerie.com%2F&hl=en). You have no Sitemap errors.
You may want to take a closer look at:
Status Distribution
HTTP errors (0) (https://www.google.com/webmasters/sitemaps/webcrawlerrors?hl=en&siteUrl=http%3A%2F%2Fwww.girlfriendslingerie.com%2F&sort=0&hl=en)
Not found (38) (https://www.google.com/webmasters/sitemaps/webcrawlerrors?hl=en&siteUrl=http%3A%2F%2Fwww.girlfriendslingerie.com%2F&sort=1&hl=en)
URLs not followed (0) (https://www.google.com/webmasters/sitemaps/webcrawlerrors?hl=en&siteUrl=http%3A%2F%2Fwww.girlfriendslingerie.com%2F&sort=5&hl=en)
URLs restricted by robots.txt (98) (https://www.google.com/webmasters/sitemaps/webcrawlerrors?hl=en&siteUrl=http%3A%2F%2Fwww.girlfriendslingerie.com%2F&sort=2&hl=en)
URLs timed out (0) (https://www.google.com/webmasters/sitemaps/webcrawlerrors?hl=en&siteUrl=http%3A%2F%2Fwww.girlfriendslingerie.com%2F&sort=4&hl=en)
Unreachable URLs (0) (https://www.google.com/webmasters/sitemaps/webcrawlerrors?hl=en&siteUrl=http%3A%2F%2Fwww.girlfriendslingerie.com%2F&sort=3&hl=en)

Sorry, you can't see the links as I do. All I know is Google hates my site to no end.

And I have no idea why. I'm going to have to write to them soon if no one can see what's up here.

Does this seem right?

ocpxc02
08-07-06, 01:05 AM
The results you show aren't necessarily indicating a problem. Since you are blocking access to Merchant pages, Google is going to report that pages were blocked. This is expected. You should check the 38 Not Found to see what's up with them. Maybe they existed recently, but didn't at the time the bot was crawling (items or categories removed, etc.).


Paul

Datagg
08-07-06, 01:13 AM
Thanks, Paul.

aarcmedia
08-08-06, 02:04 PM
Datagg, you need to watch this within Google. Just because you use a robots.txt file to tell Google not to index your mm5 directory in the future doesn't mean that you're safe. If you already had mm5 pages indexed, you should have 301 redirected the mm5 URLs that were indexed to the new SE-friendly URLs you're setting up (in the event Bruce's software doesn't do that for you automatically).

Furthermore, having duplicate content (meaning your mm5 folder and your new SE-friendly folder) is a surefire way to get banned by Yahoo and Google. If not banned, they may at least impose a serious ranking penalty on you, which may be why you are having SERP problems.

I just asked Bruce via PM if the optimizer software for version 5 takes measures to prevent the SEs from seeing duplicate content. If it doesn't, I'll have to go in and do a bunch of stuff myself using both a robots.txt file and a .htaccess file.

My background is in SEO first and ecommerce second, so I'm always very careful. Our site already gets tons of traffic as is using the non-friendly URLs, but we figure that implementing the optimizer for version 5 should help in almost doubling it over the next 3 months, as long as he includes some key search engine friendly features.

aarcmedia
08-08-06, 02:22 PM
And to elaborate on this: after researching a bunch of sites that do use the optimizer module, I see they do in fact rank well in Google, so I'm guessing that the problems you're experiencing can be fixed simply by 301 redirecting the pages that you're getting the errors for. You can do this by using a .htaccess file. Do a search in Google for "301 redirect" and it will come up with a ton of pages that show you how to do this.

The big thing about it is that just making the page disappear in Google's eyes doesn't help your situation. If you 301 redirect your OLD product pages to the new ones, the old pages' rank weight will carry over to the new product pages and you won't get the 38 errors you got in your sitemap report.
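
For anyone who wants a starting point, here is a rough sketch in Python that writes out the .htaccess lines for this kind of 301 redirect. Everything specific in it is an assumption for illustration and not taken from anyone's real store: the old script path `mm5/merchant.mvc`, the `Product_Code` query parameter, the product code `TEDDY01`, and the new page `/teddy01.html`. It also assumes Apache with mod_rewrite enabled and a .htaccess file in the site root. Adjust all of these to match your own URLs.

```python
import re

# Hypothetical old script path, relative to the site root (no leading slash,
# which is how mod_rewrite sees it from a root-level .htaccess file).
OLD_SCRIPT = "mm5/merchant.mvc"


def redirect_rules(code_to_new_path):
    """Build .htaccess lines that 301-redirect old product URLs to new ones.

    code_to_new_path maps an old product code (assumed to be passed as the
    Product_Code query parameter) to the path of the new static page.
    """
    lines = ["RewriteEngine On"]
    for code, new_path in sorted(code_to_new_path.items()):
        # Match the product code as a whole query-string parameter.
        lines.append(
            "RewriteCond %{QUERY_STRING} (^|&)Product_Code="
            + re.escape(code) + "(&|$)"
        )
        # The trailing "?" drops the old query string from the target URL.
        lines.append(
            "RewriteRule ^" + re.escape(OLD_SCRIPT) + "$ "
            + new_path + "? [R=301,L]"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up mapping; replace with your own old codes and new pages.
    print(redirect_rules({"TEDDY01": "/teddy01.html"}))
```

Paste the printed lines into your .htaccess and test a few old URLs by hand before relying on it. Each old product gets its own condition and rule pair, so a long product list produces a long file, but it is easy to check line by line.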

Pete McNamara
08-08-06, 11:13 PM
I know of no cases where sites were banned for duplicate content. Furthermore, there is no reason for search engines to do so.

The normal method of handling duplicate content is to include one page and ignore the duplicates, triplicates etc.

That is a problem because Murphy's Law generally applies and the page they take into account is one of the ones you don't want. In your case, it would probably be the original MM URLs, as they have been around the longest, i.e. your search engine friendly pages would be ignored.

As stated above, the solution is a 301 combined with a robots.txt "ban" on accessing the Merchant2 directory.
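
If you want to double-check that a robots.txt rule really blocks the Merchant2 directory and nothing else, a small Python sketch like the one below can help. It uses the standard library's `urllib.robotparser`. The robots.txt text, the `www.example.com` host, and the two test URLs are made up for illustration, and the standard parser only approximates how Googlebot itself reads the file, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content (an assumption, not anyone's real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /Merchant2/
"""


def is_allowed(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs: one dynamic Merchant page, one static page.
    for url in (
        "http://www.example.com/Merchant2/merchant.mv",
        "http://www.example.com/teddy01.html",
    ):
        print(url, "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

Note that the path match is case-sensitive, so `/Merchant2/` and `/merchant2/` are treated as different directories.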

Datagg
08-09-06, 03:23 AM
Thanks all. I discovered I had a major problem with my Google sitemap, so I dropped that module, as it was creating URLs from my own static pages that were not encoded right.

I've disabled this module and purchased a program that will read my static pages and make a Google sitemap to reflect this folder.

As for my site, I'm using Miva 4. I have so many modules right now that I'm unsure a jump to Miva 5 would be ethical at this point.

As for the Merchant optimizer, we do use an .htaccess that keeps surfers in the statics. There are a few instances when they can fall out of it, yet with the .htaccess they are thrown back into it.

William Davis
08-09-06, 03:35 PM
We also have some code to accomplish the same thing, and yes, there are a "...few instances when they can fall out...". For example, when viewing any page after the first page of a category, search results, and others, I am sure. But I do not have "code" for those instances or other similar ones. Can you please elaborate on those "codes"?