Ship Simulator

English forum => Small talk => Topic started by: IRI5HJ4CK on March 22, 2009, 14:03:29

Title: New Search Engine- "Cuil"
Post by: IRI5HJ4CK on March 22, 2009, 14:03:29: Hi guys,

I came across this new search engine called "Cuil", and I find it to be very good!

http://www.cuil.com/

It's a little more "flash" than Google with the results. I've only just tried it out, so it may not be as good results/info-wise. But I have to say, so far it seems on par with Google as a search engine. The only thing is, it doesn't seem to search images, but then again, I'm not the technical person :lol:

I also found out something rather interesting: it was developed by former Google employees, and they claim to have the largest search index.

Try it out if you like. I thought I'd let you guys know about it, if you don't already know!

Jack :)
p.s. another thing that you may find interesting is the fact that "Cuil" means Knowledge, in Irish.
Title: Re: New Search Engine- "Cuil"
Post by: Captain Darling on March 22, 2009, 14:24:50: Cuil was built to "kill" Google. ;)

Plus it's got a bad rep among webmasters for requesting non-existent pages and using a lot of bandwidth.

Not to be all negative, but I did ask them to stop indexing my site as it uses up a lot of bandwidth. They never stopped, still haven't, and they haven't had the courtesy to even reply to me. :-\

Sorry, but I prefer Google. It's well known and trustworthy; well, more trustworthy than Cuil anyway.
Title: Re: New Search Engine- "Cuil"
Post by: IRI5HJ4CK on March 22, 2009, 14:42:23: Oh dear, that doesn't sound good :-[ :-\

Should I delete the topic?

Jack.
Title: Re: New Search Engine- "Cuil"
Post by: TerryRussell on March 22, 2009, 14:49:36: Quote from: Captain Darling on March 22, 2009, 14:24:50
Cuil was built to "kill" Google. ;)

Plus it's got a bad rep among webmasters for requesting non-existent pages and using a lot of bandwidth.

Not to be all negative, but I did ask them to stop indexing my site as it uses up a lot of bandwidth. They never stopped, still haven't, and they haven't had the courtesy to even reply to me. :-\

Sorry, but I prefer Google. It's well known and trustworthy; well, more trustworthy than Cuil anyway.

Did you put a Cuil-specific tag in the robots file?
Title: Re: New Search Engine- "Cuil"
Post by: llamalord on March 22, 2009, 15:05:55: I know that the owners of such search engines are watching what we are searching so I always Stick It To The Man by searching Google, Yahoo, Dogpile...etc... ;D
Title: Re: New Search Engine- "Cuil"
Post by: Captain Darling on March 22, 2009, 15:21:55: Quote from: TerryRussell on March 22, 2009, 14:49:36
Did you put a Cuil-specific tag in the robots file?
Tried that but they somehow got around that.
Title: Re: New Search Engine- "Cuil"
Post by: TerryRussell on March 22, 2009, 16:08:46: Quote from: Captain Darling on March 22, 2009, 15:21:55
Tried that but they somehow got around that.

Hmmm. I put "robots.txt cuil" into Google and almost drowned in the complaints. Lots of adverse publicity about this one. e.g. "If don't want to find what you're after - Use Cuil". And loads of complaints from Webmasters about their crawler.

But, I found this page on Cuil which might be of interest. They are using Twiceler to crawl over your pages. They reckon they have a blocking mechanism, so follow the links on the page.
http://www.cuil.com/info/webmaster_info/
Title: Re: New Search Engine- "Cuil"
Post by: firestar12 on March 22, 2009, 16:28:29: It looks nicer than Google.
Title: Re: New Search Engine- "Cuil"
Post by: Nathan|C on March 22, 2009, 16:58:19: Quote from: Firestar on March 22, 2009, 16:28:29
It looks nicer than Google.

Looks can be deceiving... :evil:
Title: Re: New Search Engine- "Cuil"
Post by: firestar12 on March 22, 2009, 17:02:56: Quote from: Nathan|C on March 22, 2009, 16:58:19
Looks can be deceiving... :evil:
Dun dun dun...
;D
Title: Re: New Search Engine- "Cuil"
Post by: Captain Darling on March 22, 2009, 17:09:02: Quote from: TerryRussell on March 22, 2009, 16:08:46
Hmmm. I put "robots.txt cuil" into Google and almost drowned in the complaints. Lots of adverse publicity about this one. e.g. "If don't want to find what you're after - Use Cuil". And loads of complaints from Webmasters about their crawler.

But, I found this page on Cuil which might be of interest. They are using Twiceler to crawl over your pages. They reckon they have a blocking mechanism, so follow the links on the page.
http://www.cuil.com/info/webmaster_info/
I looked at that also Terry.
They also said that if you email them asking them to stop crawling/indexing your site, they will put your site on the "not to crawl" list. But as I said, I tried that too and it failed. :-\
They also have a crawler named "cuil"; I've seen it on a list of search engines that have crawled elitefun.
Title: Re: New Search Engine- "Cuil"
Post by: Agent|Austin on March 22, 2009, 17:37:33: And it crawls every 2 hours, really annoying...
Title: Re: New Search Engine- "Cuil"
Post by: TerryRussell on March 22, 2009, 17:55:21: If you have the facility, you can perhaps block their IP addresses. They do show them on that page I linked, above. (I assume that they actually know their own IP addresses, of course).
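For anyone wondering what an IP block amounts to, here is a rough Python sketch using the standard library's ipaddress module. The network ranges below are placeholders from the reserved documentation blocks, not Cuil's real addresses, and the function name is made up for illustration. A real block would normally live in the web server or firewall configuration; this only shows the membership check.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks (RFC 5737).
# These are NOT the crawler's real addresses; substitute the published ones.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip, networks=BLOCKED_NETWORKS):
    """Return True if client_ip falls inside any of the blocked networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


print(is_blocked("192.0.2.17"))   # inside the first placeholder range
print(is_blocked("203.0.113.5"))  # outside both placeholder ranges
```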
Title: Re: New Search Engine- "Cuil"
Post by: Agent|Austin on March 22, 2009, 17:57:23: Quote from: TerryRussell on March 22, 2009, 17:55:21
If you have the facility, you can perhaps block their IP addresses. They do show them on that page I linked, above. (I assume that they actually know their own IP addresses, of course).

Wouldn't the search results for the site then read the error page in the description and title? Would rather just have the thing be gone.

I have no clue whether James's hosting has cPanel (or an equivalent).
Title: Re: New Search Engine- "Cuil"
Post by: TerryRussell on March 22, 2009, 18:00:12: Quote from: Agent|Austin on March 22, 2009, 17:57:23
Wouldn't the search results for the site then read the error page in the description and title? Would rather just have the thing be gone.

They shouldn't. Otherwise all search engines would consist mainly of "404" pages.
Title: Re: New Search Engine- "Cuil"
Post by: Agent|Austin on March 22, 2009, 18:01:41: Um...

http://www.cuil.com/search?q=404+Page+not+found
Title: Re: New Search Engine- "Cuil"
Post by: TerryRussell on March 22, 2009, 18:25:35: I think they must be somewhat "young" in their approach...
Title: Re: New Search Engine- "Cuil"
Post by: Captain Darling on March 22, 2009, 22:04:22: I'll try that Terry. Thanks!
Title: Re: New Search Engine- "Cuil"
Post by: bsm2003 on March 22, 2009, 22:11:36: Here is the robots.txt I use for the site I manage. It blocks everything but Google, Yahoo and the Internet Archive.

Google and Yahoo go through about twice a week. The Internet Archive has not gone through yet. The rest do not get through.

User-agent: Google
Disallow: /addon-modules/
Disallow: /activities/
Disallow: /dist11/
Disallow: /images/
Disallow: /activities/

User-agent: Yahoo
Disallow: /addon-modules/
Disallow: /activities/
Disallow: /dist11/
Disallow: /images/
Disallow: /activities/

User-agent: ia_archiver
Disallow: /addon-modules/
Disallow: /activities/
Disallow: /dist11/
Disallow: /images/
Disallow: /activities/

User-agent: *
Disallow: /
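A rough way to check what a file like this permits, sketched in Python with the standard library's urllib.robotparser. The rules are a shortened copy of the file above, and the test paths are made up for illustration. This only shows how a compliant parser reads the rules; it says nothing about what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the posted rules: one named record plus the catch-all.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /addon-modules/
Disallow: /images/

User-agent: *
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The paths below are hypothetical, chosen only to exercise the rules.
archive_home = parser.can_fetch("ia_archiver", "/index.html")
archive_images = parser.can_fetch("ia_archiver", "/images/logo.png")
twiceler_home = parser.can_fetch("twiceler", "/index.html")

print(archive_home, archive_images, twiceler_home)
```

The named agent gets the site minus its listed folders, while any agent without its own record falls through to the catch-all and is refused everything.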
Title: Re: New Search Engine- "Cuil"
Post by: TerryRussell on March 22, 2009, 22:20:36: That works with all "mature" search engines. But if someone writes a crawler that doesn't check the file, or that chooses to ignore the instructions, then there is nothing to force it to comply, short of an IP address ban.
Title: Re: New Search Engine- "Cuil"
Post by: bsm2003 on March 22, 2009, 22:30:06: Quote from: TerryRussell on March 22, 2009, 22:20:36
That works with all "mature" search engines. But if someone writes a crawler that doesn't check the file, or that chooses to ignore the instructions, then there is nothing to force it to comply, short of an IP address ban.

You are correct; however, their webmaster info page says their crawler supports the robots.txt file. In that case it should be possible to limit it to specific pages on your site and restrict the rest.

To do that, you have to use
User-agent: twiceler
followed by a Disallow line for each folder in the root that you want left untouched.
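If you have a lot of folders, a small Python helper can write the record for you. This is only a sketch: the function name and the folder names are placeholders, not taken from any real site.

```python
def robots_record(user_agent, folders):
    """Build one robots.txt record: a User-agent line plus one Disallow per folder."""
    lines = [f"User-agent: {user_agent}"]
    for folder in folders:
        # Normalise to the /folder/ form so the rule covers the whole directory.
        path = "/" + folder.strip("/") + "/"
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


# Folder names here are placeholders for illustration only.
record = robots_record("twiceler", ["private", "/images/"])
print(record)
```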
Title: Re: New Search Engine- "Cuil"
Post by: Stuart2007 on March 22, 2009, 22:39:43: It's quite a bit faster than Google.

BTW since Google now has intimately detailed pictures, should it be called GO OGLE?
Title: Re: New Search Engine- "Cuil"
Post by: Gloat on March 22, 2009, 23:59:42: Sorry to be a pain

What does crawling mean?
What happens when this website does it?
Title: Re: New Search Engine- "Cuil"
Post by: bsm2003 on March 23, 2009, 00:06:45: Crawling is when a search engine like Google visits a website and indexes it so the pages can be shown in its search results.

As for this particular search engine, it looks like it's hogging network resources a bit too often, making the site slow down or possibly time out.
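To make the idea concrete, here is a toy crawler in Python. It is a sketch under big assumptions: the "site" is just a dictionary of made-up pages held in memory instead of real HTTP fetches, and the class and function names are invented for this example. A real crawler also reads robots.txt and rate-limits itself, which is exactly the part people are complaining about here.

```python
from collections import deque
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(site, start="/"):
    """Breadth-first walk over `site`, a dict of path -> HTML.

    Returns an index mapping each visited path to the links found on it.
    """
    index = {}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        # Skip pages already indexed and links that lead nowhere.
        if path in index or path not in site:
            continue
        collector = LinkCollector()
        collector.feed(site[path])
        index[path] = collector.links
        queue.extend(collector.links)
    return index


# Made-up two-page site with one dead link.
SITE = {
    "/": '<a href="/a.html">A</a> <a href="/missing.html">M</a>',
    "/a.html": '<a href="/">home</a>',
}

index = crawl(SITE)
print(sorted(index))
```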
Title: Re: New Search Engine- "Cuil"
Post by: Gloat on March 23, 2009, 00:08:09: I see. Thanks, mate!
Title: Re: New Search Engine- "Cuil"
Post by: bsm2003 on March 23, 2009, 00:18:36: Here is what a crawl looks like in my HTTP access log.
ipaddress - - [21/Mar/2009:13:59:48 -0500] "GET /robots.txt HTTP/1.0" 200 603 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
ipaddress - - [21/Mar/2009:13:59:49 -0500] "GET /htj.html HTTP/1.0" 200 7552 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
ipaddress - - [21/Mar/2009:14:26:20 -0500] "GET /robots.txt HTTP/1.1" 200 603 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
ipaddress - - [21/Mar/2009:14:30:36 -0500] "GET /robots.txt HTTP/1.1" 200 603 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"

Here are two entries from a web browser:
ipaddress - - [22/Mar/2009:15:40:26 -0500] "GET /style.css HTTP/1.1" 200 3059 "http://farriscreeklodge.org/" "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.6) Gecko/2009021619 Mandriva/1.9.0.6-0.1mdv2009.0 (2009.0) Firefox/3.0.6"
ipaddress - - [22/Mar/2009:15:40:26 -0500] "GET /images/valid-xhtml10-blue.png HTTP/1.1" 200 2026 "http://farriscreeklodge.org/" "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.6) Gecko/2009021619 Mandriva/1.9.0.6-0.1mdv2009.0 (2009.0) Firefox/3.0.6"
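If you want to pull the crawlers out of a log like this automatically, something along these lines might do. It is a Python sketch that assumes the log follows the Apache "combined" format, which the lines above appear to match; the function names are made up, and treating "asked for /robots.txt" as the sign of a crawler is only a rough heuristic.

```python
import re
from collections import Counter

# Apache/NCSA "combined" log format, which the lines above appear to follow.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def summarise(lines):
    """Count hits per user agent and collect agents that requested /robots.txt."""
    hits = Counter()
    robots_readers = set()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        hits[entry["agent"]] += 1
        parts = entry["request"].split()
        if len(parts) >= 2 and parts[1] == "/robots.txt":
            robots_readers.add(entry["agent"])
    return hits, robots_readers


# Sample lines copied from the log excerpt above.
SAMPLE = [
    'ipaddress - - [21/Mar/2009:13:59:48 -0500] "GET /robots.txt HTTP/1.0" 200 603 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    'ipaddress - - [21/Mar/2009:13:59:49 -0500] "GET /htj.html HTTP/1.0" 200 7552 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    'ipaddress - - [21/Mar/2009:14:26:20 -0500] "GET /robots.txt HTTP/1.1" 200 603 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"',
    'ipaddress - - [22/Mar/2009:15:40:26 -0500] "GET /style.css HTTP/1.1" 200 3059 "http://farriscreeklodge.org/" "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.6) Gecko/2009021619 Mandriva/1.9.0.6-0.1mdv2009.0 (2009.0) Firefox/3.0.6"',
]

hits, robots_readers = summarise(SAMPLE)
for agent, count in hits.most_common():
    print(count, agent)
```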
Title: Re: New Search Engine- "Cuil"
Post by: capn_cal on March 23, 2009, 00:21:43: that is very interesting!!!!!!!!! :o :o :o
Title: Re: New Search Engine- "Cuil"
Post by: llamalord on March 23, 2009, 15:32:14: Does anyone know if Google's primary crawler is a spider or something else? I like to track my website's crawls.
Title: Re: New Search Engine- "Cuil"
Post by: bsm2003 on March 23, 2009, 16:17:36: http://www.google.com/support/webmasters/?hl=en