Discussion:
How to let search engine robots visit a domain but not its alias?
meucat
2011-09-05 03:25:09 UTC
Hello

People who sign up for a new domain hosting service know that the host
sometimes gives you a 'temporary' name such as www.<number>.mydomain.com
or something similar (an alias, so to speak) until you provide a real
domain name like www.yesterday999.com.

But even after you provide the real domain name, you can still access
your site using the old temporary www.<number>.mydomain.com.

The question is: how can you prevent search engines from visiting this
temporary host name BUT allow them to visit your real domain
yesterday999.com?

I ask because Google knows my temporary name, but I put a robots.txt
file there that disallows everything for all search engines.

The moment I set a real name for this host (yesterday999.com), I
will be forced to delete the robots.txt file, because I WANT Google (and
others) to visit this domain so that it appears in their indexes.

But... then Google will also index the old 'temporary' name, and it will
compete with my real domain name in Google's index.

I don't know how to solve this problem, so I would appreciate some ideas
about how I should proceed.

thanks

Mig
Richard Bonner
2011-09-05 13:59:47 UTC
Post by meucat
People who sign up for a new domain hosting service know that the host
sometimes gives you a 'temporary' name such as www.<number>.mydomain.com
or something similar (an alias, so to speak) until you provide a real
domain name like www.yesterday999.com.
But even after you provide the real domain name, you can still access
your site using the old temporary www.<number>.mydomain.com.
*** You might ask your ISP to remove or block the old domain name.
Post by meucat
The question is: how can you prevent search engines from visiting this
temporary host name BUT allow them to visit your real domain
yesterday999.com?
*** Place a "robots.txt" file in the root directory of the old domain.
Be sure the file name and extension are lower case, then place the
following lines into it:

User-agent: *
Disallow: /

Any search engine crawlers that understand and obey "robots.txt"
directives will not index that site. Be aware that some will have already
indexed that site and so it will continue to show up in Search Results
until the next crawl interval. Also realise that search engines which
cannot deal with "robots.txt" files will continue to index the old domain
until it is removed.
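
As a rough illustration of what "obeying robots.txt" means, the sketch below feeds those two directives to Python's standard urllib.robotparser module and asks whether a crawler may fetch a page. The bot name "ExampleBot" and the host temp.example.com are made up for the example; real crawlers use their own parsers, so treat this only as an approximation of their behaviour.

```python
from urllib.robotparser import RobotFileParser

# The two directives quoted above, as a crawler would download them.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-obeying crawler may request `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and temp.example.com are placeholders, not real names.
    print(may_fetch(ROBOTS_TXT, "ExampleBot", "http://temp.example.com/index.html"))
```

The check only says whether a well-behaved crawler would request the page; it has no effect on crawlers that ignore the file.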

To thwart the latter, remove all files from the old domain except for
"robots.txt" and possibly an "index.html" file (see my final suggestion
at the end). If there is nothing to index, then nothing will be cached.
Post by meucat
I ask because Google knows my temporary name, but I put a robots.txt
file there that disallows everything for all search engines.
*** Google will continue to present your website at that domain from its
cache. This will continue until the next crawl interval.
Post by meucat
The moment I set a real name for this host (yesterday999.com), I
will be forced to delete the robots.txt file, because I WANT Google (and
others) to visit this domain so that it appears in their indexes.
*** That is OK because that domain name is different from the temporary
one. You should place a "robots.txt" file in the root directory of the new
domain that reflects what you want to have happen for `yesterday999.com',
which will be to crawl that site.
Post by meucat
But... then Google will also index the old 'temporary' name, and it will
compete with my real domain name in Google's index.
Mig
*** Well, if it's that big of a problem, place an "index.html" file in
the root of the old domain with a redirector to your site at the new
domain. Use a "robots" meta tag in this file that says:

<meta name="robots" content="noindex, follow">

Remove everything from the old domain except the "index.html" and the
"robots.txt" files. Eventually all traffic will congregate at your new
domain and you can remove all the old stuff.
--
Richard Bonner
http://www.chebucto.ca/~ak621/
Scott Bryce
2011-09-08 13:52:16 UTC
Post by Richard Bonner
*** Place a "robots.txt" file in the root directory of the old
domain. Be sure the file name and extension are lower case, then
place the following lines into it:
User-agent: *
Disallow: /
I think you are missing the OP's point. Both URLs point to the exact same
files on the exact same disk on the exact same server. They are just two
different addresses that take you to the same content. If you block
search engines from one URL, you block them from both.
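
One possible way around the shared-files problem, assuming the server can generate robots.txt dynamically instead of serving a static file, is to choose the response from the Host header of the request. The sketch below is only an illustration of that idea: the host names and the function names are hypothetical, and the thread does not say whether the OP's hosting service allows this.

```python
# Hypothetical names: substitute the real domain for these placeholders.
CANONICAL_HOSTS = {"example.com", "www.example.com"}

ALLOW_ALL = "User-agent: *\nDisallow:\n"    # empty Disallow permits everything
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # blocks the whole site


def normalize_host(host_header: str) -> str:
    """Lower-case the Host header and drop any :port suffix.

    Assumes an ordinary host name, not an IPv6 literal.
    """
    return host_header.strip().lower().split(":", 1)[0]


def robots_txt_for(host_header: str) -> str:
    """Pick the robots.txt body based on the name the crawler used."""
    if normalize_host(host_header) in CANONICAL_HOSTS:
        return ALLOW_ALL
    return BLOCK_ALL
```

With this in place, a request for /robots.txt that arrives under the temporary alias gets the blocking rules, while the same request under the real domain gets the permissive ones, even though both names reach the same server.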


So I guess the question is: is there a way to say "if the URL used to get
here is X, then 301 redirect to Y"?
Richard Bonner
2011-09-14 10:14:47 UTC
Post by Scott Bryce
Post by Richard Bonner
*** Place a "robots.txt" file in the root directory of the old
domain. Be sure the file name and extension are lower case, then
place the following lines into it:
User-agent: *
Disallow: /
I think you are missing the OP's point. Both URLs point to the exact same
files on the exact same disk on the exact same server.
*** Ahh, yes. I did not clue into that. Thanks for setting me straight
on that point, Scott.
Post by Scott Bryce
It is just two different addresses that take you to the same content.
If you block search engines from one URL, you block them from both.
So I guess the question is: is there a way to say "if the URL used to get
here is X, then 301 redirect to Y"?
*** Yes. I suggest that the original poster contact his ISP to have the
unwanted URL reference removed.

I belong to a computer group that changed its URL to a new one. For a
time, both URLs pointed to the same content on the same server, but
as per an agreement, after one year the old URL was blocked from that
content by the club's ISP.
--
Richard Bonner
http://www.chebucto.ca/~ak621/
Adrienne Boswell
2011-09-05 22:12:08 UTC
Post by meucat
Hello
People who sign up for a new domain hosting service know that the host
sometimes gives you a 'temporary' name such as www.<number>.mydomain.com
or something similar (an alias, so to speak) until you provide a real
domain name like www.yesterday999.com.
But even after you provide the real domain name, you can still access
your site using the old temporary www.<number>.mydomain.com.
The question is: how can you prevent search engines from visiting this
temporary host name BUT allow them to visit your real domain
yesterday999.com?
I ask because Google knows my temporary name, but I put a robots.txt
file there that disallows everything for all search engines.
The moment I set a real name for this host (yesterday999.com), I
will be forced to delete the robots.txt file, because I WANT Google (and
others) to visit this domain so that it appears in their indexes.
But... then Google will also index the old 'temporary' name, and it will
compete with my real domain name in Google's index.
I don't know how to solve this problem, so I would appreciate some ideas
about how I should proceed.
Thanks
Mig
Please use example.com in posts. Search engine bots are likely to pick up
any domain listed in Usenet messages, thereby creating more of a problem.

In addition to what Richard said, if you have PHP available on the
server, you can send a 301 Moved Permanently header.
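
No PHP code was posted in the thread, so the following is only a sketch of the same idea written in Python as a minimal WSGI application: if the request arrived under any host name other than the real one, answer with "301 Moved Permanently" and a Location header pointing at the same path on the real domain. CANONICAL_HOST, redirect_target and app are hypothetical names chosen for the example, and the http:// scheme is an assumption.

```python
from typing import Optional

CANONICAL_HOST = "www.example.com"  # hypothetical: put the real domain here


def redirect_target(host_header: str, path: str, query: str = "") -> Optional[str]:
    """Return the URL to redirect to, or None if the host is already canonical."""
    host = host_header.strip().lower().split(":", 1)[0]  # drop any :port
    if host == CANONICAL_HOST:
        return None
    url = f"http://{CANONICAL_HOST}{path or '/'}"
    return f"{url}?{query}" if query else url


def app(environ, start_response):
    """Minimal WSGI app: 301 for alias hosts, placeholder page otherwise."""
    target = redirect_target(
        environ.get("HTTP_HOST", ""),
        environ.get("PATH_INFO", "/"),
        environ.get("QUERY_STRING", ""),
    )
    if target is not None:
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"canonical site\n"]
```

Because the redirect keeps the path and query string, a page reached through the temporary name leads to the matching page on the real domain rather than to its front page.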
--
Adrienne Boswell at Home
Arbpen Web Site Design Services
http://www.cavalcade-of-coding.info
Please respond to the group so others can share