robots.txt

Suitable for beginners

Prerequisites

Duration

approx. 10 min.

Achieving security through robots.txt is, of course, highly questionable; we already pointed this out in the article Security by Obscurity. Still, the positive effects have held up to this day.

It therefore certainly cannot hurt to add the following entries to your robots.txt:

User-agent: 123peoplebot
Disallow: /
User-agent: AACrawler
Disallow: /
User-agent: AhrefsBot
Disallow: /
User-agent: Alexibot
Disallow: /
User-agent: Aqua_Products
Disallow: /
User-agent: asterias
Disallow: /
User-agent: Atomic_Email_Hunter
Disallow: /
User-agent: b2w/0.1
Disallow: /
User-agent: BackDoorBot/1.0
Disallow: /
User-agent: BacklinkCrawler
Disallow: /
User-agent: Baiduspider
Disallow: /
User-agent: Baiduspider+
Disallow: /
User-agent: Baiduspider-cpro
Disallow: /
User-agent: Baiduspider-favo
Disallow: /
User-agent: Baiduspider-image
Disallow: /
User-agent: Baiduspider-mobile
Disallow: /
User-agent: Baiduspider-news
Disallow: /
User-agent: Baiduspider-sfkr
Disallow: /
User-agent: Baiduspider-video
Disallow: /
User-agent: BlowFish/1.0
Disallow: /
User-agent: Bookmark search tool
Disallow: /
User-agent: BotALot
Disallow: /
User-agent: BotRightHere
Disallow: /
User-agent: BuiltBotTough
Disallow: /
User-agent: Bullseye/1.0
Disallow: /
User-agent: BunnySlippers
Disallow: /
User-agent: Check Url Bot
Disallow: /
User-agent: CheeseBot
Disallow: /
User-agent: CherryPicker
Disallow: /
User-agent: CherryPickerElite/1.0
Disallow: /
User-agent: CherryPickerSE/1.0
Disallow: /
User-agent: Cityreview
Disallow: /
User-agent: Copernic
Disallow: /
User-agent: CopyRightCheck
Disallow: /
User-agent: cosmos
Disallow: /
User-agent: Crescent
Disallow: /
User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
Disallow: /
User-agent: CyberPatrol SiteCat Webbot
Disallow: /
User-agent: DittoSpyder
Disallow: /
User-agent: DomainWatcher
Disallow: /
User-agent: EmailCollector
Disallow: /
User-agent: EmailSiphon
Disallow: /
User-agent: EmailWolf
Disallow: /
User-agent: EroCrawler
Disallow: /
User-agent: Exabot
Disallow: /
User-agent: exooba
Disallow: /
User-agent: ExtractorPro
Disallow: /
User-agent: Ezooms/1.0
Disallow: /
User-agent: FairAd Client
Disallow: /
User-agent: Fasterfox
Disallow: /
User-agent: Flaming AttackBot
Disallow: /
User-agent: Foobot
Disallow: /
User-agent: Gaisbot
Disallow: /
User-agent: GetRight/4.2
Disallow: /
User-agent: gigabot
Disallow: /
User-agent: GrapeshotCrawler/2.0
Disallow: /
User-agent: grapeFX/0.9
Disallow: /
User-agent: Harvest/1.5
Disallow: /
User-agent: heritrix
Disallow: /
User-agent: hloader
Disallow: /
User-agent: httplib
Disallow: /
User-agent: HTTrack
Disallow: /
User-agent: HTTrack 3.0
Disallow: /
User-agent: humanlinks
Disallow: /
User-agent: ia_archiver
Disallow: /
User-agent: ia_archiver/1.6
Disallow: /
User-agent: ICCrawler - ICjobs
Disallow: /
User-agent: Infohelfer/1.3.0
Disallow: /
User-agent: InfoNaviRobot
Disallow: /
User-agent: Iron33/1.0.2
Disallow: /
User-agent: JennyBot
Disallow: /
User-agent: jobs.de-Robot
Disallow: /
User-agent: Kenjin Spider
Disallow: /
User-agent: Keyword Density/0.9
Disallow: /
User-agent: larbin
Disallow: /
User-agent: LexiBot
Disallow: /
User-agent: libWeb/clsHTTP
Disallow: /
User-agent: LinkextractorPro
Disallow: /
User-agent: LinkScan/8.1a Unix
Disallow: /
User-agent: LinkWalker
Disallow: /
User-agent: LNSpiderguy
Disallow: /
User-agent: lwp-trivial
Disallow: /
User-agent: lwp-trivial/1.34
Disallow: /
User-agent: Mail.Ru
Disallow: /
User-agent: Mail.RU_Bot/2.0
Disallow: /
User-agent: magpie-crawler/1.1
Disallow: /
User-agent: Mata Hari
Disallow: /
User-agent: MIIxpc
Disallow: /
User-agent: MIIxpc/4.2
Disallow: /
User-agent: Mister PiX
Disallow: /
User-agent: MJ12bot
Disallow: /
User-agent: MJ12bot/v1.4.3
Disallow: /
User-agent: MLBot
Disallow: /
User-agent: moget
Disallow: /
User-agent: moget/2.1
Disallow: /
User-agent: NetAnts
Disallow: /
User-agent: netEstate NE Crawler
Disallow: /
User-agent: NICErsPRO
Disallow: /
User-agent: Offline Explorer
Disallow: /
User-agent: OnetSzukaj
Disallow: /
User-agent: Openbot
Disallow: /
User-agent: Openfind
Disallow: /
User-agent: Openfind data gatherer
Disallow: /
User-agent: Oracle Ultra Search
Disallow: /
User-agent: PerMan
Disallow: /
User-agent: ProPowerBot/2.14
Disallow: /
User-agent: ProWebWalker
Disallow: /
User-agent: proximic
Disallow: /
User-agent: psbot
Disallow: /
User-agent: Python-urllib
Disallow: /
User-agent: QueryN Metasearch
Disallow: /
User-agent: Radiation Retriever 1.1
Disallow: /
User-agent: RepoMonkey
Disallow: /
User-agent: RepoMonkey Bait & Tackle/v1.01
Disallow: /
User-agent: RMA
Disallow: /
User-agent: ScoutJet
Disallow: /
User-agent: SeznamBot/3.0
Disallow: /
User-agent: searchpreview
Disallow: /
User-agent: spbot/3.1
Disallow: /
User-agent: SiteSnagger
Disallow: /
User-agent: SpankBot
Disallow: /
User-agent: spanner
Disallow: /
User-agent: Sosospider/2.0
Disallow: /
User-agent: suzuran
Disallow: /
User-agent: Szukacz/1.4
Disallow: /
User-agent: Teleport
Disallow: /
User-agent: TeleportPro
Disallow: /
User-agent: Telesoft
Disallow: /
User-agent: The Intraformant
Disallow: /
User-agent: TheNomad
Disallow: /
User-agent: TightTwatBot
Disallow: /
User-agent: toCrawl/UrlDispatcher
Disallow: /
User-agent: Toplistbot
Disallow: /
User-agent: True_Robot
Disallow: /
User-agent: True_Robot/1.0
Disallow: /
User-agent: turingos
Disallow: /
User-agent: TurnitinBot
Disallow: /
User-agent: TurnitinBot/1.5
Disallow: /
User-agent: twiceler
Disallow: /
User-agent: URL Control
Disallow: /
User-agent: URLSpion
Disallow: /
User-agent: URL_Spider_Pro
Disallow: /
User-agent: URLy Warning
Disallow: /
User-agent: VCI
Disallow: /
User-agent: VCI WebViewer VCI WebViewer Win32
Disallow: /
User-agent: Web Image Collector
Disallow: /
User-agent: WebAlta Crawler
Disallow: /
User-agent: WebAuto
Disallow: /
User-agent: WebBandit
Disallow: /
User-agent: WebBandit/3.50
Disallow: /
User-agent: webbericht.com
Disallow: /
User-agent: WebCapture 2.0
Disallow: /
User-agent: WebCopier
Disallow: /
User-agent: WebCopier v.2.2
Disallow: /
User-agent: WebCopier v3.2a
Disallow: /
User-agent: WebDataCentreBot
Disallow: /
User-agent: WebEnhancer
Disallow: /
User-agent: WebSauger
Disallow: /
User-agent: Website Quester
Disallow: /
User-agent: Webster Pro
Disallow: /
User-agent: WebStripper
Disallow: /
User-agent: Webwiki
Disallow: /
User-agent: WebZip
Disallow: /
User-agent: WebZip/4.0
Disallow: /
User-agent: WebZIP/4.21
Disallow: /
User-agent: WebZIP/5.0
Disallow: /
User-agent: Wget
Disallow: /
User-agent: wget
Disallow: /
User-agent: Wget/1.5.3
Disallow: /
User-agent: Wget/1.6
Disallow: /
User-agent: WWW-Collector-E
Disallow: /
User-agent: Xenu's
Disallow: /
User-agent: Xenu's Link Sleuth 1.1c
Disallow: /
User-agent: Yandex
Disallow: /
User-agent: YandexBot/3.0
Disallow: /
User-agent: Yeti
Disallow: /
User-agent: Zeus
Disallow: /
User-agent: Zeus 32297 Webster Pro V2.9 Win32
Disallow: /
User-agent: Zeus Link Scout
Disallow: /
Yes, with this we are telling the well-known search engine "Baidu" (China) that we do not want to appear in its index. We are "arrogant" enough to claim that indexing and visiting our pages is of no benefit to Chinese users. The same applies to other search engines from that region.
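Before deploying rules like these, it can help to verify that they actually match the bots you intend to block. A minimal sketch using Python's standard urllib.robotparser, with one of the entries from the list above embedded inline (the domain example.com is a placeholder; in practice the parser would fetch your live /robots.txt):

```python
from urllib.robotparser import RobotFileParser

# One entry from the listing above, embedded inline for testing.
rules = """
User-agent: AhrefsBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# AhrefsBot matches its group and is denied everywhere; an agent
# without a matching group (and no "User-agent: *" fallback) is allowed.
print(parser.can_fetch("AhrefsBot", "https://example.com/any/page"))   # False
print(parser.can_fetch("FriendlyBot", "https://example.com/any/page")) # True
```

Note that this only checks what a *compliant* crawler would do; robots.txt is purely advisory, which is exactly why the security value discussed above is questionable.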