U ,a$@sXddlZddlZddlZdgZeddZGdddZGdddZGdd d Z dS) NRobotFileParser RequestRatezrequests secondsc@sneZdZdddZddZddZdd Zd d Zd d ZddZ ddZ ddZ ddZ ddZ ddZdS)rcCs2g|_g|_d|_d|_d|_||d|_dS)NFr)entriessitemaps default_entry disallow_all allow_allset_url last_checkedselfurlr7/opt/alt/python38/lib64/python3.8/urllib/robotparser.py__init__s zRobotFileParser.__init__cCs|jSN)r r rrrmtime%szRobotFileParser.mtimecCsddl}||_dS)Nr)timer )r rrrrmodified.szRobotFileParser.modifiedcCs&||_tj|dd\|_|_dS)N)rurllibparseurlparseZhostpathr rrrr 6szRobotFileParser.set_urlc Csztj|j}WnRtjjk rd}z0|jdkr:d|_n|jdkrT|jdkrTd|_W5d}~XYnX| }| | d dS)N)iiTiizutf-8) rZrequestZurlopenrerrorZ HTTPErrorcoderr readrdecode splitlines)r ferrrawrrrr;s zRobotFileParser.readcCs,d|jkr|jdkr(||_n |j|dSN*) useragentsrrappend)r entryrrr _add_entryHs  zRobotFileParser._add_entrycCsPd}t}||D]}|sP|dkr4t}d}n|dkrP||t}d}|d}|dkrn|d|}|}|s|q|dd}t|dkr|d|d<tj |d|d<|ddkr|dkr||t}|j |dd}q|ddkr.|dkr6|j t|ddd}q|dd krb|dkr6|j t|dd d}q|dd kr|dkr6|drt|d|_d}q|dd kr|dkr6|dd }t|dkr|dr|drtt|dt|d|_d}q|ddkr|j |dq|dkrL||dS)Nrr#:z user-agentZdisallowFZallowTz crawl-delayz request-rate/Zsitemap)Entryrr*findstripsplitlenlowerrrunquoter'r( rulelinesRuleLineisdigitintdelayrreq_rater)r linesstater)lineiZnumbersrrrrQsj                zRobotFileParser.parsecCs|jr dS|jrdS|jsdStjtj|}tjdd|j|j |j |j f}tj |}|sfd}|j D]}||rl||Sql|jr|j|SdS)NFTrr.)rr r rrrr5 urlunparserZparamsZqueryZfragmentquoter applies_to allowancer)r useragentrZ parsed_urlr)rrr can_fetchs*    zRobotFileParser.can_fetchcCs>|s dS|jD]}||r|jSq|jr:|jjSdSr)rrrBr:rr rDr)rrr crawl_delays   zRobotFileParser.crawl_delaycCs>|s dS|jD]}||r|jSq|jr:|jjSdSr)rrrBr;rrFrrr request_rates   zRobotFileParser.request_ratecCs|js dS|jSr)rrrrr site_mapsszRobotFileParser.site_mapscCs,|j}|jdk r||jg}dtt|S)Nz )rrjoinmapstr)r rrrr__str__s  zRobotFileParser.__str__N)r)__name__ __module__ __qualname__rrrr rr*rrErGrHrIrMrrrrrs    I  c@s$eZdZddZddZddZdS)r7cCs<|dkr|sd}tjtj|}tj||_||_dS)NrT)rrr@rrArrC)r rrCrrrrs  zRuleLine.__init__cCs|jdkp||jSr%)r startswith)r filenamerrrrBszRuleLine.applies_tocCs|jr dndd|jS)NZAllowZDisallowz: )rCrrrrrrMszRuleLine.__str__N)rNrOrPrrBrMrrrrr7sr7c@s,eZdZddZddZddZddZd S) r/cCsg|_g|_d|_d|_dSr)r'r6r:r;rrrrrszEntry.__init__cCsg}|jD]}|d|q |jdk r<|d|j|jdk rf|j}|d|jd|j|tt|j d |S)Nz User-agent: z Crawl-delay: zRequest-rate: r. ) r'r(r:r;ZrequestsZsecondsextendrKrLr6rJ)r ZretagentZraterrrrMs   z Entry.__str__cCsF|dd}|jD](}|dkr*dS|}||krdSqdS)Nr.rr&TF)r2r4r')r rDrUrrrrBs zEntry.applies_tocCs$|jD]}||r|jSqdS)NT)r6rBrC)r rRr>rrrrC s   zEntry.allowanceN)rNrOrPrrMrBrCrrrrr/s  r/) collectionsZ urllib.parserZurllib.request__all__ namedtuplerrr7r/rrrr s B